python - lxml - parse an XML with no line breaks in it
I am using lxml iterparse in Python to loop through the elements in an XML file. It works fine for most of my XMLs, but fails for some. One of them has no line breaks in it. The error and a sample of such an XML are below. Any clues?
thanks!!
<root><person><name>"xyz"</name><age>"10"</age></person><person><name>"abc"</name><age>"20"</age></person></root>
error -
XMLSyntaxError: Document is empty, line 1, column 1
code -
from lxml import etree

def parsexml(context, elemlist):
    for event, element in context:
        if element.tag in elemlist:
            # read text and attributes
            element.clear()

def main(object):
    elemlist = ['name', 'age', 'id']
    context = etree.iterparse(fullfilepath, events=("start", "end"))
    parsexml(context, elemlist)
etree.iterparse expects a buffer (a file-like object) as its source argument. The name of the variable you are passing, "fullfilepath", tells me it's not a file, so the parser may be trying to parse the file path instead of the file content. Try passing an opened file instead:
context = etree.iterparse(open(fullfilepath, "rb"), events=("start", "end"))
Or parse from a string:
from io import BytesIO
from lxml import etree

xml = b'<root><person><name>"xyz"</name><age>"10"</age></person><person><name>"abc"</name><age>"20"</age></person></root>\n'

def parsexml(context, elemlist):
    for event, element in context:
        if element.tag in elemlist:
            # each matching tag is printed twice: once for "start", once for "end"
            print(element.tag, end=" ")
            element.clear()

def main():
    elemlist = ['name', 'age', 'id']
    # wrap the bytes in a file-like object so iterparse can read from it
    context = etree.iterparse(BytesIO(xml), events=("start", "end"))
    parsexml(context, elemlist)

main()
>>> name name age age name name age age
PS: did you really mean this?
def main(object):
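For the file-based variant, here is a minimal sketch of how the pieces might fit together. It is an illustration under assumptions, not the asker's actual code: the names `collect_text`, `demo` and `SAMPLE` are hypothetical, the temporary file stands in for "fullfilepath", and the fallback to the standard library's `xml.etree.ElementTree` (which offers a compatible `iterparse`) is only there so the snippet also runs where lxml is not installed. It listens for "end" events only, since an element's text is only guaranteed to be complete once its closing tag has been seen.

```python
import os
import tempfile

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib module, whose iterparse is compatible here.
    import xml.etree.ElementTree as etree

# The single-line sample from the question, as bytes.
SAMPLE = (
    b'<root><person><name>"xyz"</name><age>"10"</age></person>'
    b'<person><name>"abc"</name><age>"20"</age></person></root>'
)


def collect_text(path, wanted):
    """Return (tag, text) pairs for every element whose tag is in `wanted`."""
    found = []
    # Open in binary mode so the parser receives bytes, not decoded text.
    with open(path, "rb") as source:
        # Only "end" events: the element's text is complete at that point.
        for event, element in etree.iterparse(source, events=("end",)):
            if element.tag in wanted:
                found.append((element.tag, element.text))
                element.clear()  # drop the element's content to keep memory low
    return found


def demo():
    # Hypothetical stand-in for fullfilepath: write the sample to a temp file.
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(SAMPLE)
        return collect_text(path, {"name", "age", "id"})
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Calling `demo()` returns the name and age values in document order, with the literal double quotes from the sample still part of each text value. The missing line breaks make no difference to the parser.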