Python string.letters does not include locale diacritics
I am trying to get the alphabet from the Python string module depending on a given locale, with no success (that is, with the diacritics, e.g. éèêà... for French). Here is a minimal example:
    import locale, string

    locale.setlocale(locale.LC_ALL, 'en_US.UTF-8')
    print string.letters
    # shows ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz

    locale.setlocale(locale.LC_ALL, 'fr_FR.UTF-8')
    print string.letters
    # shows ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
The Python documentation says that string.letters is locale-dependent, but it does not seem to work for me.
What am I doing wrong, and what is the right way to obtain a language-dependent alphabet?
Edit: I checked the locale with print locale.getlocale() after setting it, and it is correctly changed.
In Python 2.7 (there is no string.letters in Python 3.x), it works if you set the locale to 'fr_FR' (equivalent to 'fr_FR.ISO8859-1'), but not 'fr_FR.UTF-8'.
    >>> import locale, string
    >>> locale.setlocale(locale.LC_ALL, 'es_ES')
    'es_ES'
    >>> string.letters
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz\xaa\xb5\xba\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff'
    >>> locale.setlocale(locale.LC_ALL, 'es_ES.UTF-8')
    'es_ES.UTF-8'
    >>> string.letters
    'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz'
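So \xaa is the character "ª", \xba is "º", \xd1 is "Ñ", and so on. With the UTF-8 locale, the encoded representation is indeed broken: the accented letters are missing from string.letters.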
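To check how those bytes map to characters, one can decode a few of them as ISO-8859-1. This is a minimal Python 3 sketch (the session above is Python 2.7); the sample bytes are a hand-picked subset of the 'es_ES' output, and the variable names are only illustrative.

```python
# A hand-picked subset of the non-ASCII bytes from the 'es_ES' output above.
raw = b"\xaa\xb5\xba\xc0\xd1\xe9\xff"

# Each byte is exactly one character in ISO-8859-1 (Latin-1).
decoded = raw.decode("iso-8859-1")
print(decoded)  # ªµºÀÑéÿ

# The same bytes are not valid UTF-8, where each of these letters takes two bytes.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
print(valid_utf8)  # False
```

This probably explains the ASCII-only result under a UTF-8 locale: string.letters is a byte string, and as far as I know Python 2 builds it by testing single byte values, so a letter that needs two bytes in UTF-8 can never show up in it.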
I highly recommend reading this: https://pythonhosted.org/kitchen/unicode-frustrations.html
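Since string.letters no longer exists in Python 3, one possible substitute is to filter code points with str.isalpha(). The sketch below is an assumption about what is wanted, not a locale-aware API: the helper name latin1_letters is hypothetical, and it simply lists every letter in the Latin-1 range rather than the alphabet of one specific language.

```python
def latin1_letters():
    """Return every code point below 256 that Unicode classifies as a letter."""
    # Hypothetical helper: mirrors what the 'es_ES' session above returned,
    # but as a Unicode string instead of a locale-dependent byte string.
    return "".join(chr(i) for i in range(256) if chr(i).isalpha())


letters = latin1_letters()
print("é" in letters)  # True
print("×" in letters)  # False (multiplication sign, not a letter)
```

This covers é, è, ê, à and Ñ, but not letters outside Latin-1 such as œ, so it is only a rough stand-in for a real per-language alphabet.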