Python 101 - Intro to XML Parsing with ElementTree - Mouse Vs Python (2024)

If you have followed this blog for a while, you may remember that we’ve covered several XML parsing libraries that are included with Python. In this article, we’ll be continuing that series by taking a quick look at the ElementTree library. You will learn how to create an XML file, edit XML and parse the XML. For comparison’s sake, we’ll use the same XML we used in the previous minidom article to illustrate the differences between using minidom and ElementTree. Here is the original XML:

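    <zAppointments>
        <appointment>
            <begin>1181251680</begin>
            <uid>040000008200E000</uid>
            <alarmTime>1181572063</alarmTime>
            <state></state>
            <location></location>
            <duration>1800</duration>
            <subject>Bring pizza home</subject>
        </appointment>
    </zAppointments>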

Now let’s dig into the Python!

How to Create XML with ElementTree

Creating XML with ElementTree is very simple. In this section, we will attempt to create the XML above with Python. Here’s the code:

    import xml.etree.ElementTree as ET


    def createXML(filename):
        """
        Create an example XML file
        """
        root = ET.Element("zAppointments")
        appt = ET.Element("appointment")
        root.append(appt)

        # add appointment children
        begin = ET.SubElement(appt, "begin")
        begin.text = "1181251680"

        uid = ET.SubElement(appt, "uid")
        uid.text = "040000008200E000"

        alarmTime = ET.SubElement(appt, "alarmTime")
        alarmTime.text = "1181572063"

        # these elements are left empty on purpose
        state = ET.SubElement(appt, "state")
        location = ET.SubElement(appt, "location")

        duration = ET.SubElement(appt, "duration")
        duration.text = "1800"

        subject = ET.SubElement(appt, "subject")

        tree = ET.ElementTree(root)
        # open in binary mode: ElementTree.write() emits bytes by default
        with open(filename, "wb") as fh:
            tree.write(fh)


    if __name__ == "__main__":
        createXML("appt.xml")

If you run this code, you should get something like the following (probably all on one line):

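    <zAppointments>
        <appointment>
            <begin>1181251680</begin>
            <uid>040000008200E000</uid>
            <alarmTime>1181572063</alarmTime>
            <state />
            <location />
            <duration>1800</duration>
            <subject />
        </appointment>
    </zAppointments>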

This is pretty close to the original and is certainly valid XML, but it’s not quite the same. For example, the subject element is empty because the code never sets its text. It’s close enough for our purposes, though. Let’s take a moment to review the code and make sure we understand it. First we create the root element by using ElementTree’s Element class. Then we create an appointment element and append it to the root. Next we create SubElements by passing the appointment Element object (appt) to SubElement along with a name, like “begin”. Then for each SubElement that needs a value, we set its text property; the ones we leave alone (state, location and subject) are written out as empty elements. At the end of the script, we create an ElementTree and use it to write the XML out to a file.
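If you want to check the result without writing a file, the same steps can be sketched in memory. The helper name build_appointment and the shortened field list are assumptions made for illustration; ET.tostring() with encoding="unicode" returns the markup as a regular string.

```python
import xml.etree.ElementTree as ET


def build_appointment(fields):
    """Build a zAppointments tree from (tag, text) pairs (hypothetical helper)."""
    root = ET.Element("zAppointments")
    appt = ET.SubElement(root, "appointment")
    for tag, text in fields:
        child = ET.SubElement(appt, tag)
        # Leaving text as None produces an empty element when serialized
        child.text = text
    return root


# A shortened field list, just for illustration
fields = [
    ("begin", "1181251680"),
    ("uid", "040000008200E000"),
    ("state", None),
    ("duration", "1800"),
]

root = build_appointment(fields)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

Passing the parent directly to SubElement, as done here for the appointment element, saves the separate Element() and append() calls.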

What’s annoying is that it writes out the XML all on one line instead of in a nice readable format (i.e. “pretty print”). There’s a recipe on Effbot, and since Python 3.9 ElementTree also includes an indent() function that adds the whitespace to the tree in place before you write it out. You may also want to take a look at some of the other solutions on StackOverflow. It should be noted that lxml supports “pretty print” out of the box.
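As a minimal sketch of pretty printing with the standard library alone, the snippet below uses ElementTree's indent() function, which is only available in Python 3.9 and later. The two-element appointment is a cut-down stand-in for the real file, chosen here only to keep the example short.

```python
import xml.etree.ElementTree as ET

# A cut-down appointment tree (illustrative only)
root = ET.Element("zAppointments")
appt = ET.SubElement(root, "appointment")
ET.SubElement(appt, "begin").text = "1181251680"
ET.SubElement(appt, "duration").text = "1800"

# indent() modifies the tree in place by inserting newlines and spaces
# (requires Python 3.9 or later)
ET.indent(root, space="    ")

pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

On older Python versions, the Effbot recipe or lxml remain the usual options.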

Now we’re ready to learn how to edit the file!

How to Edit XML with ElementTree

Editing XML with ElementTree is also easy. To make things a little more interesting though, we’ll add another appointment block to the XML:

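    <zAppointments>
        <appointment>
            <begin>1181251680</begin>
            <uid>040000008200E000</uid>
            <alarmTime>1181572063</alarmTime>
            <state></state>
            <location></location>
            <duration>1800</duration>
            <subject>Bring pizza home</subject>
        </appointment>
        <appointment>
            <begin>1181253977</begin>
            <uid>sdlkjlkadhdakhdfd</uid>
            <alarmTime>1181588888</alarmTime>
            <state>TX</state>
            <location>Dallas</location>
            <duration>1800</duration>
            <subject>Bring pizza home</subject>
        </appointment>
    </zAppointments>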

Now let’s write some code to change each of the begin tag’s values from seconds since the epoch to something a little more readable. We’ll use Python’s time module to facilitate this:

    import time
    # cElementTree was removed in Python 3.9; the standard module
    # uses the C implementation automatically when it is available
    import xml.etree.ElementTree as ET


    def editXML(filename):
        """
        Edit an example XML file
        """
        tree = ET.ElementTree(file=filename)
        root = tree.getroot()

        # replace each epoch value with a human readable local time
        for begin_time in root.iter("begin"):
            begin_time.text = time.ctime(int(begin_time.text))

        tree = ET.ElementTree(root)
        # open in binary mode: ElementTree.write() emits bytes by default
        with open("updated.xml", "wb") as f:
            tree.write(f)


    if __name__ == "__main__":
        editXML("original_appt.xml")

Here we create an ElementTree object (tree) and we extract the root from it. Then we use ElementTree’s iter() method to find all the tags that are labeled “begin”. Note that the iter() method was added in Python 2.7. In our for loop, we set each item’s text property to a more human readable time format via time.ctime(). You’ll note that we had to convert the string to an integer when passing it to ctime. Because ctime() converts to local time, the exact values depend on your machine’s time zone. The output should look something like the following:
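If you would rather have output that does not change with the machine's time zone, one alternative (not what the code above does) is to convert with the datetime module in UTC. This is a sketch under that assumption; the inline SAMPLE string and the output format are chosen here for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Inline sample so the snippet runs without a file (illustrative only)
SAMPLE = (
    "<zAppointments><appointment>"
    "<begin>1181251680</begin>"
    "</appointment></zAppointments>"
)

root = ET.fromstring(SAMPLE)

for begin_time in root.iter("begin"):
    # Interpret the epoch seconds as UTC instead of local time
    stamp = datetime.fromtimestamp(int(begin_time.text), tz=timezone.utc)
    begin_time.text = stamp.strftime("%Y-%m-%d %H:%M:%S UTC")

converted = root.find("appointment/begin").text
print(converted)
```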

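    <zAppointments>
        <appointment>
            <begin>Thu Jun 07 16:28:00 2007</begin>
            <uid>040000008200E000</uid>
            <alarmTime>1181572063</alarmTime>
            <state />
            <location />
            <duration>1800</duration>
            <subject>Bring pizza home</subject>
        </appointment>
        <appointment>
            <begin>Thu Jun 07 17:06:17 2007</begin>
            <uid>sdlkjlkadhdakhdfd</uid>
            <alarmTime>1181588888</alarmTime>
            <state>TX</state>
            <location>Dallas</location>
            <duration>1800</duration>
            <subject>Bring pizza home</subject>
        </appointment>
    </zAppointments>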

You can also use ElementTree’s find() or findall() methods to search for specific tags in your XML. The find() method returns only the first match, whereas findall() returns a list of all the tags with the specified label. By default both look only at the direct children of the element you call them on; use a path such as “.//begin” to search the whole subtree. These are helpful for editing purposes or for parsing, which is our next topic!
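Here is a small sketch of find() and findall() against a trimmed version of the two-appointment XML. The inline SAMPLE string is an assumption made so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed two-appointment sample (illustrative only)
SAMPLE = """<zAppointments>
    <appointment><begin>1181251680</begin><uid>040000008200E000</uid></appointment>
    <appointment><begin>1181253977</begin><uid>sdlkjlkadhdakhdfd</uid></appointment>
</zAppointments>"""

root = ET.fromstring(SAMPLE)

# find() returns the first matching direct child, or None if there is none
first = root.find("appointment")
first_uid = first.find("uid").text

# findall() returns a list of every matching direct child
appointments = root.findall("appointment")

# "begin" is not a direct child of the root, so this list is empty
direct = root.findall("begin")

# The ".//" prefix searches the whole subtree
begin_values = [elem.text for elem in root.findall(".//begin")]

print(first_uid, len(appointments), direct, begin_values)
```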

How to Parse XML with ElementTree

Now we get to learn how to do some basic parsing with ElementTree. First we’ll read through the code and then we’ll go through bit by bit so we can understand it. Note that this code is based around the original example, but it should work on the second one as well.

    import xml.etree.ElementTree as ET


    def parseXML(xml_file):
        """
        Parse XML with ElementTree
        """
        tree = ET.ElementTree(file=xml_file)
        print(tree.getroot())
        root = tree.getroot()
        print("tag=%s, attrib=%s" % (root.tag, root.attrib))

        for child in root:
            print(child.tag, child.attrib)
            if child.tag == "appointment":
                for step_child in child:
                    print(step_child.tag)

        # iterate over the entire tree
        print("-" * 40)
        print("Iterating using a tree iterator")
        print("-" * 40)
        for elem in tree.iter():
            print(elem.tag)

        # get the information via the children!
        print("-" * 40)
        print("Iterating over the children")
        print("-" * 40)
        for appointment in root:
            for appt_child in appointment:
                print("%s=%s" % (appt_child.tag, appt_child.text))

        # loop over only the tags named "begin"
        print("-" * 40)
        print("Iterating using iter() with a tag name")
        print("-" * 40)
        for begin in root.iter("begin"):
            print(begin.text)


    if __name__ == "__main__":
        parseXML("appt.xml")

You may come across older code (including earlier versions of these examples) that imports cElementTree instead of the normal ElementTree. cElementTree was the C-based implementation, so it was much faster than the Python-based one. Since Python 3.3 the regular xml.etree.ElementTree module uses the C implementation automatically whenever it is available, and cElementTree was removed in Python 3.9, so here we simply import ElementTree. Anyway, once again we create an ElementTree object and extract the root from it. You’ll note that we print out the root and the root’s tag and attributes. Next we show several ways of iterating over the tags. The first loop just iterates over the XML child by child. This would only print out the top level child (appointment) though, so we added an if statement to check for that child and iterate over its children too.

Next we grab an iterator from the tree object itself with iter() and iterate over it that way. You get the same information, but without the extra steps in the first example. The third method loops over the root’s children directly. Older code used getchildren() and getiterator() for these two approaches, but both were removed in Python 3.9; iterating over an element (or calling list() on it) and calling iter() replace them. Here again we need an inner loop to grab all the children inside each appointment tag. The last example uses the root’s iter() method to just loop over any tags that match the string “begin”.

As mentioned in the last section, you could also use find() or findall() to help you find specific tags or sets of tags respectively. Also note that each Element object has a tag and a text property that you can use to acquire that exact information.
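To illustrate the tag and text properties, here is a minimal sketch that turns each appointment into a dictionary. The function name appointments_to_dicts and the inline SAMPLE string are assumptions made for this example, not part of the code above.

```python
import xml.etree.ElementTree as ET

# Trimmed sample with one empty element (illustrative only)
SAMPLE = """<zAppointments>
    <appointment>
        <begin>1181251680</begin>
        <state />
        <duration>1800</duration>
        <subject>Bring pizza home</subject>
    </appointment>
</zAppointments>"""


def appointments_to_dicts(root):
    """Map each child's tag to its text for every appointment (hypothetical helper)."""
    results = []
    for appointment in root.findall("appointment"):
        # An empty element such as <state /> has text set to None
        results.append({child.tag: child.text for child in appointment})
    return results


records = appointments_to_dicts(ET.fromstring(SAMPLE))
print(records)
```

Note that empty elements come back with None as their text, so you may want to substitute an empty string before using the values.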

Wrapping Up

Now you know how to use ElementTree to create, edit and parse XML. You can add that information to your XML parsing toolkit and use it for fun or profit. You will find links to previous articles on some of the other XML parsing tools below as well as additional information about ElementTree itself.

Related Articles from Mouse Vs Python

  • Parsing XML with minidom
  • Python: Parsing XML with lxml
  • Parsing XML with Python using lxml.objectify

Additional Reading

Download the Source

  • ETXMLParsing.zip