python - Why does LANG change the output of str.encode()?
This command returns b'?', as expected, since "α" is not in the ISO-8859-1 encoding:
LANG=en_US.UTF-8 python -c "print('α'.encode('iso-8859-1', 'replace'))"
This command returns b'\xce\xb1', which I don't understand:
LANG=en_US.ISO-8859-1 python -c "print('α'.encode('iso-8859-1', 'replace'))"
What is causing this? I am trying to remove characters that are not in the encoding (here ISO-8859-1) by replacing them with ?, which is what I think the code should do.
It's not changing the output of str.encode; it's changing the encoding of sys.stdin.
$ LANG=en_US.UTF-8 python -c "print(__import__('sys').stdin.encoding)"
utf-8
$ LANG=en_US.ISO-8859-1 python -c "print(__import__('sys').stdin.encoding)"
iso-8859-1
As a result, Python interprets the UTF-8 bytes b'\xce\xb1' that the terminal sends for "α" as two literal bytes, each decoded as its own character:
$ LANG=en_US.ISO-8859-1 python3 -c "print(len('α'))"
2
$ LANG=en_US.UTF-8 python3 -c "print(len('α'))"
1
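To see the effect without touching LANG, here is a minimal sketch that simulates both cases in one script. It assumes the terminal sends "α" as UTF-8, and the helper name roundtrip is made up for illustration. It decodes the same two bytes the way each locale would, then applies the encode call from the question.

```python
# The two bytes a UTF-8 terminal sends for the single character α (U+03B1).
raw = '\u03b1'.encode('utf-8')  # b'\xce\xb1'

# Hypothetical helper: decode the raw bytes as a given locale encoding would,
# then re-encode to ISO-8859-1 with errors='replace', as in the question.
def roundtrip(raw_bytes, locale_encoding):
    text = raw_bytes.decode(locale_encoding)
    return text, text.encode('iso-8859-1', 'replace')

utf8_text, utf8_out = roundtrip(raw, 'utf-8')
latin1_text, latin1_out = roundtrip(raw, 'iso-8859-1')

# UTF-8 locale: one character, not representable in ISO-8859-1, so it is replaced.
print(len(utf8_text), utf8_out)      # 1 b'?'

# ISO-8859-1 locale: two characters ('Î' and '±'), both valid in ISO-8859-1,
# so nothing is replaced and the original bytes come back unchanged.
print(len(latin1_text), latin1_out)  # 2 b'\xce\xb1'
```

If the goal is just to test the 'replace' behaviour, writing the character as the escape '\u03b1' in the command keeps the result independent of how the locale decodes the typed text.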