python - Why does LANG change the output of str.encode()

May 15, 2010

This command returns b'?', which is expected since "α" is not in the ISO-8859-1 encoding.

LANG=en_US.UTF-8 python3 -c "print('α'.encode('iso-8859-1', 'replace'))"

This command returns b'\xce\xb1', which I don't understand.

LANG=en_US.ISO-8859-1 python3 -c "print('α'.encode('iso-8859-1', 'replace'))"

What is causing this? I am trying to remove the characters that are not in the encoding (here ISO-8859-1) by replacing them with ?, which is what I think this code should do.

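It's not changing the output of str.encode; it's changing the encoding of sys.stdin. Both that and the way Python decodes the text you pass to -c follow the locale that LANG selects.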

$ LANG=en_US.UTF-8 python3 -c "print(__import__('sys').stdin.encoding)"
utf-8
$ LANG=en_US.ISO-8859-1 python3 -c "print(__import__('sys').stdin.encoding)"
iso-8859-1

As a result, Python interprets the UTF-8 b'\xce\xb1' from your terminal as literal bytes, that is, as two separate ISO-8859-1 characters instead of one "α":

$ LANG=en_US.ISO-8859-1 python3 -c "print(len('α'))"
2
$ LANG=en_US.UTF-8 python3 -c "print(len('α'))"
1
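
If you want to see the same effect without touching LANG, here is a rough sketch that simulates it inside one Python 3 process. It assumes the terminal sends "α" as UTF-8; encode_as_seen is a made-up helper name for illustration, not anything from the standard library.

```python
# The bytes a UTF-8 terminal sends for "α" (written as an escape so this
# file itself stays pure ASCII).
raw = '\u03b1'.encode('utf-8')
assert raw == b'\xce\xb1'


def encode_as_seen(raw_bytes, locale_encoding):
    """Hypothetical helper: decode the terminal's bytes with the given
    locale encoding, then apply the encode call from the question."""
    text = raw_bytes.decode(locale_encoding)
    return len(text), text.encode('iso-8859-1', 'replace')


# Decoded as UTF-8: one character, not representable, so it becomes "?".
print(encode_as_seen(raw, 'utf-8'))       # (1, b'?')

# Decoded as ISO-8859-1: two characters, both representable, so the
# original bytes pass straight through.
print(encode_as_seen(raw, 'iso-8859-1'))  # (2, b'\xce\xb1')

# An escape sequence does not depend on how the command text is decoded.
print('\u03b1'.encode('iso-8859-1', 'replace'))  # b'?'
```

The last line suggests one way around the problem: if the non-ASCII character is written as '\u03b1' in the command, the text Python receives is plain ASCII, so the result should be b'?' under either LANG setting.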
