iMarc | Interactive Media Architects

XML Pretty Printer in PHP5

by Will Bond - March 15, 2007 / 1:12pm

Today I was debugging some XML and needed a way to make it more readable. The following is a little function that should do the job for simple XML, and you can probably tweak it to your needs. It uses PHP5's SimpleXMLElement object to parse and return the XML, then indents it in a logical way.
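
For comparison, here is a minimal sketch of a different technique: letting PHP5's DOMDocument do the indenting through its formatOutput flag. This is not the SimpleXMLElement-based function the post describes, and the function name pretty_print_with_dom is made up for illustration.

```php
<?php
// Sketch only: uses DOMDocument, not the SimpleXMLElement approach from the post.
// pretty_print_with_dom is a hypothetical name chosen for this example.
function pretty_print_with_dom($xml)
{
    $dom = new DOMDocument('1.0');
    // Must be set before loading: drop whitespace-only text nodes so
    // libxml is free to re-indent the tree.
    $dom->preserveWhiteSpace = false;
    $dom->formatOutput = true;

    // loadXML() returns false (and raises a warning) on malformed input.
    if (!$dom->loadXML($xml)) {
        return false;
    }
    return $dom->saveXML();
}

$pretty = pretty_print_with_dom('<root><item id="1">one</item><empty/></root>');
echo $pretty;
```

The trade-off is control: DOMDocument picks the indent width for you, while a hand-rolled line-by-line indenter like the one discussed here can be tweaked freely.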


8 Comments

by Phoenix   #
on May 16, 2008 / 8:54am
Nice work man, pretty easy to tweak too.
I got it working just right for my job.

Cheers.
by WheeGuy   #
on July 7, 2008 / 4:38pm
Thank you for the cute function, it saved me some time. For very linear responses (i.e., no newlines) give this a shot:

1. Replace the explode's separator with '><'
2. Replace the join's separator with ">
<"
by anon   #
on October 13, 2008 / 3:33pm
In response to the above poster's comment:

Good start, but nothing seems to get indented like that. Maybe this is a better fix...

Replace line 16 (the line beginning with "$xml_lines") with the following:

$xml_lines = explode("
", str_replace("><", ">
<", $xml_obj->asXML()));

That's it =)
by anon   #
on October 13, 2008 / 3:36pm
oops, line 12, not 16. And imagine backslash-Ns where those line breaks are (either will work).
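
A small sketch of what that one-line fix does to XML that arrives on a single line. The sample document is made up, and the snippet only shows the splitting step, not the indenting.

```php
<?php
// Hypothetical single-line input, as a SOAP response might look.
$xml = '<root><item>1</item><item>2</item></root>';

// Insert a newline between every adjacent pair of tags, then split on newlines
// so that each tag (or complete one-line element) becomes its own array entry.
$lines = explode("\n", str_replace('><', ">\n<", $xml));

print_r($lines);
// $lines is now:
//   <root>
//   <item>1</item>
//   <item>2</item>
//   </root>
```

One caveat, stated as an assumption about inputs rather than a tested case: a literal "><" inside a CDATA section or a comment would be split as well, since str_replace knows nothing about XML structure.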
by Fred Trotter   #
on October 30, 2008 / 2:15pm
After much fighting... I have included both of the improvements from the comments above, plus a trick that allows your XML pretty print function to work when XML is included in other XML, which is what happens when there is an XML argument to a SOAP call.

This makes your function work well against the results of SoapClient::__getLastResponse and SoapClient::__getLastRequest.

Being able to directly read debug output from __getLastResponse and __getLastRequest will save me a lot of time.

May this help the next googler.

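function xml_pretty_printer($xml, $html_output=FALSE)
{
    $xml_obj = new SimpleXMLElement($xml);
    // Put every tag on its own line, even if the input had no newlines.
    $xml_lines = explode("\n", str_replace("><", ">\n<", $xml_obj->asXML()));
    $indent_level = 0;

    // Whitespace tokens in the patterns below are written as \s; the blog
    // stripped the backslashes when this was posted (see Ray's comment below).
    $new_xml_lines = array();
    foreach ($xml_lines as $xml_line) {
        if (preg_match('#^(<[a-z0-9_:-]+((\s+[a-z0-9_:-]+="[^"]+")*)?>.*<\s*/\s*[^>]+>)|(<[a-z0-9_:-]+((\s+[a-z0-9_:-]+="[^"]+")*)?\s*/\s*>)#i', ltrim($xml_line))) {
            // Complete element on one line (<a>text</a>) or self-closing tag (<a/>):
            // print at the current level, no indent change.
            $new_line = str_pad('', $indent_level*4) . ltrim($xml_line);
            $new_xml_lines[] = $new_line;
        } elseif (preg_match('#^<[a-z0-9_:-]+((\s+[a-z0-9_:-]+="[^"]+")*)?>#i', ltrim($xml_line))) {
            // Opening tag: print at the current level, then indent what follows.
            $new_line = str_pad('', $indent_level*4) . ltrim($xml_line);
            $indent_level++;
            $new_xml_lines[] = $new_line;
        } elseif (preg_match('#<\s*/\s*[^>/]+>#i', $xml_line)) {
            // Closing tag: outdent first.
            $indent_level--;
            if (trim($new_xml_lines[sizeof($new_xml_lines)-1]) == trim(str_replace("/", "", $xml_line))) {
                // The previous line is the matching opening tag: keep <a></a> on one line.
                $new_xml_lines[sizeof($new_xml_lines)-1] .= $xml_line;
            } else {
                $new_line = str_pad('', $indent_level*4) . $xml_line;
                $new_xml_lines[] = $new_line;
            }
        } else {
            // Anything else (XML declaration, bare text): print at the current level.
            $new_line = str_pad('', $indent_level*4) . $xml_line;
            $new_xml_lines[] = $new_line;
        }
    }

    $xml = join("\n", $new_xml_lines);
    // Called as a plain function; $this-> only works if both functions live in a class.
    return ($html_output) ? '<pre>' . xmlspecialchars($xml) . '</pre>' : $xml;
}

function xmlspecialchars($text) {
    // The final false is double_encode (PHP 5.2.3+): existing entities are left alone.
    return str_replace('&#039;', '&apos;', htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false));
}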
by Ray   #
on October 30, 2008 / 8:30pm
Hey Fred,

In your last example, the whitespace tokens in the regexes are missing backslashes, resulting in indenting errors where elements have attributes. Replace 's' with '\s' in those, and it's all good. :-)
by Ray   #
on October 30, 2008 / 8:31pm
Aaarrrggghhh... looks like backslashes are being stripped from posts (twice?), resulting in backslashes being removed from the comments.
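
To make the effect of the stripped backslashes concrete, here is a short sketch comparing the opening-tag pattern with and without them. The sample tag is made up; the two patterns are the opening-tag regex from the function above, once as posted and once with \s restored.

```php
<?php
// Hypothetical opening tag with an attribute.
$line = '<item id="1">';

// As posted: "s+" means "one or more literal letter s", not whitespace.
$broken = '#^<[a-z0-9_:-]+((s+[a-z0-9_:-]+="[^"]+")*)?>#i';
// With the backslash restored: "\s+" means "one or more whitespace characters".
$fixed  = '#^<[a-z0-9_:-]+((\s+[a-z0-9_:-]+="[^"]+")*)?>#i';

$broken_result = preg_match($broken, $line); // 0: the space before "id" cannot match
$fixed_result  = preg_match($fixed, $line);  // 1: tag name, whitespace, attribute, ">"

var_dump($broken_result, $fixed_result);
```

A tag without attributes, such as <item>, matches both patterns, which is why the stripped version only misbehaves on elements that carry attributes.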
by Kirk, the Next Googler   #
on November 26, 2008 / 4:43pm
Thanks - I've been helped! Works great after touching up the regexes.

About the Author

Will Bond, Senior Technical Architect
I’m involved with everything technical at iMarc from servers to coding and markup best practices. I’m also the unofficial resident open source advocate.

After work I spend time with my awesome wife, daughter & son in and around our home in beautiful Newbury, MA.