Email parsing - HeaderParseError in Python


I get a HeaderParseError if I try to parse a string with decode_header() in Python 2.6.5 (and 2.7). Here is the repr() of the string:

    '=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?='

This string comes from a MIME email that contains a JPEG picture. Thunderbird can decode the filename (which contains German umlauts).

    >>> from email.header import decode_header
    >>> decode_header('=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?=')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib64/python2.6/email/header.py", line 101, in decode_header
        raise HeaderParseError
    email.errors.HeaderParseError

It seems to be an incompatibility between Python's character set for base64-encoded strings and the mail agent's:

    >>> from email.header import decode_header
    >>> a = '=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?='
    >>> decode_header(a)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/lib/python2.7/email/header.py", line 108, in decode_header
        raise HeaderParseError
    email.errors.HeaderParseError
    >>> a1 = a.replace('_', '/')
    >>> decode_header(a1)
    [('Anmeldung Netzanschluss S\xfcdring3p.jpg', 'iso-8859-1')]
    >>> print _[0][0].decode(_[0][1])
    Anmeldung Netzanschluss Südring3p.jpg

Python uses the character set that the Wikipedia article on Base64 suggests (i.e. A-Z, a-z, 0-9, +, /). The same article lists alternatives, including the underscore that is the issue here; however, the underscore's value is ambiguous (it stands for 62 or 63, depending on the alternative).
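
As an illustration of the two alphabets, here is a minimal sketch using the standard library's base64 module. It assumes the mail agent used the URL-safe variant, in which '-' stands for 62 and '_' for 63; that is a guess about the agent, not something the header itself states. The variable names are only for illustration.

```python
import base64

# Three 0xFF bytes encode to four 6-bit groups of value 63.
print(base64.b64encode(b"\xff\xff\xff"))          # standard alphabet: 63 is '/'
print(base64.urlsafe_b64encode(b"\xff\xff\xff"))  # URL-safe alphabet: 63 is '_'

# The payload of the encoded-word from the question, without the
# "=?iso-8859-1?B?" prefix and the "?=" suffix.
payload = "QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw=="

# Assumption: '_' means 63, as in the URL-safe alphabet.
raw = base64.urlsafe_b64decode(payload)
text = raw.decode("iso-8859-1")
print(text)  # Anmeldung Netzanschluss Südring3p.jpg
```

If the agent had instead used a variant where the underscore means 62, the same bytes would decode to a different character at that position, which is why this remains a guess.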

I don't know how Python could guess the intentions of b0rken mail agents; I suggest you do some appropriate guessing yourself whenever decode_header fails.
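
One way such guessing could look is sketched below. The helper name lenient_decode_header and the regular expression are hypothetical, not part of the email package, and the fallback assumes that '-' and '_' in a base64 encoded-word were meant as 62 and 63 (the URL-safe alphabet). It only kicks in when decode_header actually raises HeaderParseError.

```python
import re
from email.errors import HeaderParseError
from email.header import decode_header

# Hypothetical pattern for one base64 ("B") encoded-word: =?charset?B?payload?=
_B_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")


def _to_standard_alphabet(match):
    """Rewrite one encoded-word so its payload uses '+' and '/'."""
    charset, payload = match.group(1), match.group(2)
    # Guess: the agent used the URL-safe alphabet ('-' = 62, '_' = 63).
    fixed = payload.replace("-", "+").replace("_", "/")
    return "=?%s?B?%s?=" % (charset, fixed)


def lenient_decode_header(value):
    """Try decode_header as-is; on failure, retry with the guessed alphabet."""
    try:
        return decode_header(value)
    except HeaderParseError:
        return decode_header(_B_WORD.sub(_to_standard_alphabet, value))


header = "=?iso-8859-1?B?QW5tZWxkdW5nIE5ldHphbnNjaGx1c3MgU_xkcmluZzNwLmpwZw==?="
data, charset = lenient_decode_header(header)[0]
print(data.decode(charset))
```

Only "B" encoded-words are rewritten, because in "Q" (quoted-printable) encoded-words an underscore legitimately stands for a space.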

I'm calling the mail agent “broken” because there is no need to escape either + or / in a message header: it's not a URL, so why not use the typical character set?

