XML parsing


untangle

untangle is a simple library which takes an XML document and returns a Python object which mirrors the nodes and attributes in its structure.

For example, an XML file like this:

    <?xml version="1.0"?>
    <root>
        <child name="child1"/>
    </root>

can be loaded like this:

    import untangle
    obj = untangle.parse('path/to/file.xml')

and then you can get the child element’s name attribute like this:

    obj.root.child['name']

untangle also supports loading XML from a string or a URL.

xmltodict

xmltodict is another simple library that aims to make working with XML feel like working with JSON.

An XML file like this:

    <mydocument has="an attribute">
      <and>
        <many>elements</many>
        <many>more elements</many>
      </and>
      <plus a="complex">
        element as well
      </plus>
    </mydocument>

can be loaded into a Python dict like this:

    import xmltodict

    with open('path/to/file.xml') as fd:
        doc = xmltodict.parse(fd.read())

and then you can access elements, attributes, and values like this:

    doc['mydocument']['@has']           # == 'an attribute'
    doc['mydocument']['and']['many']    # == ['elements', 'more elements']
    doc['mydocument']['plus']['@a']     # == 'complex'
    doc['mydocument']['plus']['#text']  # == 'element as well'

xmltodict also lets you round-trip back to XML with the unparse function, has a streaming mode suitable for handling files that don’t fit in memory, and supports XML namespaces.
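The round trip and the streaming mode can be sketched roughly as follows. This is an illustration, not a complete recipe: both inline documents and the handle_item callback are invented for the example, and a real streaming job would pass an open file object rather than a short string.

```python
import xmltodict

# Sample document from the example above, inlined as a string.
xml_text = """<mydocument has="an attribute">
  <and>
    <many>elements</many>
    <many>more elements</many>
  </and>
  <plus a="complex">
    element as well
  </plus>
</mydocument>"""

doc = xmltodict.parse(xml_text)

# Round trip: serialize the dict back to XML text, then parse it again.
xml_again = xmltodict.unparse(doc)
round_tripped = xmltodict.parse(xml_again)

# Streaming: the callback is called once per element at item_depth,
# so the whole document never has to be held as a single dict.
streamed = []

def handle_item(path, item):
    # path is a list of (tag, attributes) pairs from the root to this item.
    streamed.append(item)
    return True  # returning a false value stops parsing early

# Hypothetical flat document; each <item> sits at depth 2.
xmltodict.parse('<items><item>a</item><item>b</item></items>',
                item_depth=2, item_callback=handle_item)
print(streamed)
```

In streaming mode the results arrive through the callback, so the work on each item belongs inside handle_item.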

Original: https://docs.python-guide.org/scenarios/xml/