Reading & Writing XML using the PHP DOM library

Reading XML using the DOM library

The easiest way to read a well-formed XML file is to use the Document Object Model (DOM) library compiled into some installations of PHP. The DOM library reads the entire XML document into memory and represents it as a tree of nodes, as illustrated in Figure 1.
Figure 1. XML DOM tree for the books XML
XML DOM tree for the books XML

The books node at the top of the tree has two child book tags. Within each book, there are authorpublisher, and titlenodes. The authorpublisher, and title nodes each have child text nodes that contain the text.

The code to read the books XML file and display the contents using the DOM is shown in Listing 2.
Listing 2. Reading books XML with the DOM

<!--?php
  $doc = new DOMDocument();
  $doc->load( 'books.xml' );

  $books = $doc->getElementsByTagName( "book" );
  foreach( $books as $book )
  {
  $authors = $book->getElementsByTagName( "author" );
  $author = $authors->item(0)->nodeValue;

  $publishers = $book->getElementsByTagName( "publisher" );
  $publisher = $publishers->item(0)->nodeValue;

  $titles = $book->getElementsByTagName( "title" );
  $title = $titles->item(0)->nodeValue;

  echo "$title - $author - $publisher\n";
  }
  ?>

The script starts by creating a new DOMdocument object and loading the books XML into that object using the load method. After that, the script uses the getElementsByName method to get a list of all of the elements with the given name.

Within the loop of the book nodes, the script uses the getElementsByName method to get the nodeValue for the author,publisher, and title tags. The nodeValue is the text within the node. The script then displays those values.

You can run the PHP script on the command line like this:

% php e1.php
PHP Hacks - Jack Herrington - O'Reilly
Podcasting Hacks - Jack Herrington - O'Reilly
%

As you can see, a line is printed for each book block. That’s a good start.

Writing XML with the DOM

Reading XML is only one part of the equation. What about writing it? The best way to write XML is to use the DOM. Listing 5 shows how the DOM builds the books XML file.
Listing 5. Writing books XML with the DOM

<!--?php
  $books = array();
  $books [] = array(
  'title' => 'PHP Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
  );
  $books [] = array(
  'title' => 'Podcasting Hacks',
  'author' => 'Jack Herrington',
  'publisher' => "O'Reilly"
  );

  $doc = new DOMDocument();
  $doc->formatOutput = true;

  $r = $doc->createElement( "books" );
  $doc->appendChild( $r );

  foreach( $books as $book )
  {
  $b = $doc->createElement( "book" );

  $author = $doc->createElement( "author" );
  $author->appendChild(
  $doc->createTextNode( $book['author'] )
  );
  $b->appendChild( $author );

  $title = $doc->createElement( "title" );
  $title->appendChild(
  $doc->createTextNode( $book['title'] )
  );
  $b->appendChild( $title );

  $publisher = $doc->createElement( "publisher" );
  $publisher->appendChild(
  $doc->createTextNode( $book['publisher'] )
  );
  $b->appendChild( $publisher );

  $r->appendChild( $b );
  }

  echo $doc->saveXML();
  ?>

At the top of the script, the books array is loaded with some example books. That data could come from the user or from a database.

After the example books are loaded, the script creates a new DOMDocument and adds the root books node to it. Then the script creates an element for the author, title, and publisher for each book and adds a text node to each of those nodes. The final step for each book node is to re-attach it to the root books node.

The end of the script dumps the XML to the console using the saveXML method. (You can also use the save method to create a file from the XML.) The output of the script is shown in Listing 6.
Listing 6. Output from the DOM build script

  % php e4.php
  <?xml version="1.0"?>
  <books>
  <book>
  <author>Jack Herrington</author>
  <title>PHP Hacks</title>
  <publisher>O'Reilly</publisher>
  </book>
  <book>
  <author>Jack Herrington</author>
  <title>Podcasting Hacks</title>
  <publisher>O'Reilly</publisher>
  </book>
  </books>
  %

The real value of using the DOM is that the XML it creates is always well formed.

[Ref: http://www.ibm.com/developerworks/opensource/library/os-xmldomphp/index.html]

One Response

Add a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.