Sunday, May 20, 2007

Pretty print XML with Python - indenting XML.

Here's a fairly simple way to pretty print XML with Python. By pretty printing, I mean indenting the XML nicely so the nesting is easy to read.

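# xml.dom.ext comes from the PyXML package, not the standard library
from xml.dom.ext import PrettyPrint
from xml.dom.ext.reader.Sax import FromXmlFile
import sys

# Parse the file named on the command line, then write it back out indented
doc = FromXmlFile(sys.argv[1])
PrettyPrint(doc, sys.stdout)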
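If PyXML isn't installed, roughly the same result can be had from the standard library. The following is only a sketch, not the method used above: it swaps in xml.dom.minidom, and the helper name pretty_xml and the sample input are made up for illustration.

```python
from xml.dom import minidom


def pretty_xml(xml_text, indent="  "):
    """Return xml_text re-serialized with nested elements indented.

    pretty_xml is a hypothetical helper name, not part of any library.
    """
    return minidom.parseString(xml_text).toprettyxml(indent=indent)


# Sample input invented for this example
compact = "<root><child>text</child></root>"
pretty = pretty_xml(compact)
print(pretty)
```

For a file, minidom.parse(sys.argv[1]).toprettyxml() should do the same job. One caveat: if the input is already indented, minidom keeps the existing whitespace as text nodes, so the output can end up with extra blank lines.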

Are there any other ways to pretty print XML with Python?

How do you pretty print XML with your favourite XML API? Pretty printing XML with ElementTree, anyone?
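On the ElementTree question, one possible answer is sketched here. It assumes Python 3.9 or newer, where xml.etree.ElementTree gained an indent() function; older versions don't have it. The sample document is made up.

```python
import xml.etree.ElementTree as ET

# Assumption: Python 3.9+, where ElementTree.indent() exists.
# The sample document is invented for this example.
root = ET.fromstring("<root><a><b>1</b></a><c>2</c></root>")

# indent() rewrites the whitespace between elements in place
ET.indent(root, space="  ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Because indent() modifies the tree in place, anything serialized from it afterwards comes out indented.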

UPDATE: there are a few other ways listed in the comments.



Daniel said...

This method is pretty slow; I also found xml.dom.ext to be a little hard to install (is it maintained?).

The last time I revisited this problem, I wound up using a pretty-printing XSLT with lxml as the XSLT interface. It's faster than xml.dom.ext.PrettyPrint.

Filip Salomonsson said...

The etree.tostring function in lxml has a pretty_print argument that can come in handy.

Mostly, though, I use a custom indent function for both lxml and cElementTree trees.
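The function itself isn't shown, so the following is only a guess at what such a helper might look like: a minimal sketch of a recursive in-place indent, written against the standard library's ElementTree. The name indent, its parameters and the sample document are all assumptions.

```python
import xml.etree.ElementTree as ET


def indent(elem, level=0, space="  "):
    """Adjust text/tail whitespace in place so elem serializes indented.

    Hypothetical helper; whitespace-only text and tails are overwritten,
    anything containing real text is left alone.
    """
    pad = "\n" + level * space
    if len(elem):
        # Whitespace before the first child
        if not elem.text or not elem.text.strip():
            elem.text = pad + space
        for child in elem:
            indent(child, level + 1, space)
        # The last child's tail lines up the parent's closing tag
        if not child.tail or not child.tail.strip():
            child.tail = pad
    # Whitespace after this element (the root's tail is left untouched)
    if level and (not elem.tail or not elem.tail.strip()):
        elem.tail = pad


# Sample document invented for this example
root = ET.fromstring("<root><a><b>1</b></a><c>2</c></root>")
indent(root)
result = ET.tostring(root, encoding="unicode")
print(result)
```

Since it only touches the text and tail attributes, the same approach should carry over to other ElementTree-compatible trees, though that isn't tested here.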

Fredrik said...

Matthew Strax-Haber said...

To pretty print an ElementTree:

from xml.dom.minidom import parseString
from xml.etree import ElementTree

def prettyPrint(element):
    # Serialize the element, then let minidom re-indent the result
    txt = ElementTree.tostring(element)
    print(parseString(txt).toprettyxml())

Stéphane said...

Note that the module is xml.dom.minidom, so the import is: from xml.dom.minidom import parseString

Excellent tips! Thank you so much!

Ted said...

I haven't used xml.dom.ext, but I've found that `lxml.etree.tostring(lxml.etree.parse(filename))` runs at least 3x faster than pretty printing with `xml.dom.minidom`. This is not really surprising, since lxml wraps C libraries.