How can I parse a complex XML with PHP and CDATA?

Accessing a parent attribute in XML with PHP - sample code provided

  • I am having a heck of a time getting at an attribute of a parent in my XML from PHP. Here is a sample of the XML:

    ************************************* SAMPLE XML CODE START *************************************
    <results first="1" last="10" total="46486">
      <next href="http://[domain]/[page].xml?col=bcgovt+nrmweb+govdaily&amp;st=11&amp;charset=utf-8&amp;qt=+test&amp;la=en"/>
      <totaldocs>374768</totaldocs>
      <term><text>test</text><count>47033</count></term>
      <result href="http://www.bchealthguide.org/healthfiles/hfile45.stm">
        <title>Should I Get My Well Water <highlight>Tested</highlight>? - Health File #45</title>
        <score>54%</score>
        <size>14.0K</size>
        <summary>The BC HealthFiles are a series of over 150 one-page, easy to understand fact sheets about a wide range of public and environmental health and safety issues.</summary>
        <date>2005-06-10T03:18:06Z</date>
        <indexed>2005-06-10T03:18:06Z</indexed>
        <flags>2</flags>
        <nlinks>2</nlinks>
        <quality>18</quality>
        <publisher></publisher>
        <keywords>test;shall, Tested;Should, 45#shall, healthfile 45#shall, environmental health, private well, healthfiles be, healthfiles be a, the bc healthfiles, healthfiles be a series, drink water, your well, lead and copper, nitrate, sewage, drinking</keywords>
        <in_collection>bcgovt</in_collection>
        <term><text>test</text><frequency>25.0</frequency></term>
      </result>
      <result href="http://www.icbc.com/Licensing/index.html">
        <title>ICBC - Driver licensing</title>
        <score>54%</score>
        <size>33.3K</size>
        <summary>This section of icbc.com contains all necessary information about driver licensing in British Columbia.</summary>
        <date>2005-05-20T18:00:09Z</date>
        <indexed>2005-05-25T03:07:34Z</indexed>
        <flags>2</flags>
        <nlinks>1</nlinks>
        <quality>18</quality>
        <publisher></publisher>
        <keywords>best hit, test best, motor vehicle, road test, graduate license, terrain)act, vehicle all, all terrain, drive record, vehicle driver, driver service, b.c.driver</keywords>
        <in_collection>bcgovt</in_collection>
        <term><text>test</text><frequency>25.0</frequency></term>
      </result>
      ....etc....
    </results>
    ************************************* SAMPLE XML CODE END *************************************

    I have no issues accessing the children's information. FYI, here is what I am using to access the XML via PHP:

    ************************************* PHP CODE START *************************************
    $strSearchTerm = $_POST["qt"];
    $insideitem = false;
    $link_href = "";
    $tag = "";
    $title = "";
    $description = "";
    $link = "";
    $score = "";
    $size = "";
    $summary = "";
    $date = "";
    $indexed = "";
    $flags = "";
    $nlinks = "";
    $quality = "";
    $publisher = "";
    $keywords = "";
    $in_collection = "";

    function startElement($parser, $name, $attrs) {
        global $insideitem, $tag, $link_href, $title, $highlight, $description, $link, $score, $size, $summary, $date, $indexed, $flags, $nlinks, $quality, $publisher, $keywords, $in_collection, $term, $text, $frequency;
        if ($insideitem) {
            $tag = $name;
        } elseif ($name == "RESULT") {
            // ??????
            $link_href = $attrs["href"];
            $insideitem = true;
        }
    }

    function endElement($parser, $name) {
        global $insideitem, $tag, $link_href, $title, $highlight, $description, $link, $score, $size, $summary, $date, $indexed, $flags, $nlinks, $quality, $publisher, $keywords, $in_collection, $term, $text, $frequency;
        if ($name == "RESULT") {
            // This is where i want to display attribute information
            printf("<ul><li><b>URL: <a href='%s'>%s</a></b></li>",htmlspecialchars(trim($link_href)),htmlspecialchars(trim($link_href)));
            printf("<li><b>Title:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($title)));
            printf("<li><b>Score:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($score)));
            printf("<li><b>Size:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($size)));
            printf("<li><b>Summary:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($summary)));
            printf("<li><b>Date:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($date)));
            printf("<li><b>Date Indexed:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($indexed)));
            printf("<li><b>Flags:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($flags)));
            printf("<li><b>Number of Links:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($nlinks)));
            printf("<li><b>Quality:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($quality)));
            // - commented out because data contains no content - printf("<li><b>%s</li>",htmlspecialchars(trim($publisher)));
            printf("<li><b>Keywords:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($keywords)));
            printf("<li><b>Collection:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($in_collection)));
            printf("<li><b>Search Term:</b><font color=#3300CC> %s</font></li>",htmlspecialchars(trim($text)));
            printf("<li><b>Frequency:</b><font color=#3300CC> %s</font></li></ul><hr size=1 color=#999999 noshade />",htmlspecialchars(trim($frequency)));
            $link_href = "";
            $title = "";
            $description = "";
            $link = "";
            $score = "";
            $size = "";
            $summary = "";
            $date = "";
            $indexed = "";
            $flags = "";
            $nlinks = "";
            $quality = "";
            $publisher = "";
            $keywords = "";
            $in_collection = "";
            $term = "";
            $text = "";
            $frequency = "";
            $insideitem = false;
        }
    }

    function characterData($parser, $data) {
        global $insideitem, $tag, $link_href, $title, $highlight, $description, $link, $score, $size, $summary, $date, $indexed, $flags, $nlinks, $quality, $publisher, $keywords, $in_collection, $term, $text, $frequency;
        if ($insideitem) {
            switch ($tag) {
                case "TITLE": $title .= $data;
                case "HIGHLIGHT": $highlight .= $data; break;
                case "DESCRIPTION": $description .= $data; break;
                case "LINK": $link .= $data; break;
                case 'SCORE': $score .= $data; break;
                case 'SIZE': $size .= $data; break;
                case 'SUMMARY': $summary .= $data; break;
                case 'DATE': $date .= $data; break;
                case 'INDEXED': $indexed .= $data; break;
                case 'FLAGS': $flags .= $data; break;
                case 'NLINKS': $nlinks .= $data; break;
                case 'QUALITY': $quality .= $data; break;
                case 'PUBLISHER': $publisher .= $data; break;
                case 'KEYWORDS': $keywords .= $data; break;
                case 'IN_COLLECTION': $in_collection .= $data; break;
                case 'TERM': $term .= $data;
                case 'TEXT': $text .= $data; break;
                case 'FREQUENCY': $frequency .= $data; break;
                break;
            }
        }
    }

    $xml_parser = xml_parser_create();
    xml_set_element_handler($xml_parser, "startElement", "endElement");
    xml_set_character_data_handler($xml_parser, "characterData");

    // build the query for searchfeed
    // Parse type
    // XML (which is what this parser was built to read)
    $strQuery = "http://[domain]/[page].xml?";
    $strQuery .= "qt=" . $_POST["qt"] . "&";
    $strQuery .= "charset=iso-8859-1";

    $fp = fopen($strQuery, "r") or die("Error reading RSS data.");
    while ($data = fread($fp, 4096))
        xml_parse($xml_parser, $data, feof($fp))
            or die(sprintf("XML error: %s at line %d",
                xml_error_string(xml_get_error_code($xml_parser)),
                xml_get_current_line_number($xml_parser)));
    fclose($fp);
    xml_parser_free($xml_parser);
    ************************************* PHP CODE END *************************************

    It works fine as it is right now (with proper domain and page info), but I need to get at the only attribute of the parent in the XML (href). Hoping someone can help me with this one. Google / PHP forums turned up nothing, as did my PHP books.

    To clarify, I need to get at the attribute within the result item:

    --> <result href="http://www.icbc.com/Licensing/index.html">
    --> href="http://www.icbc.com/Licensing/index.html"

    I want to put it inside the "link_href" variable and display it along with all of the other variables. Thanks

  • Answer:

    Hello pcormie-ga,

    Thank you for your question. I was able to troubleshoot your script and find the solution. In your code you correctly marked where you believed the error lies:

    // ??????
    $link_href = $attrs["href"];

    This line needs to be changed to:

    $link_href = $attrs['HREF'];

    I believe the problem is now solved, as you have not indicated any further problems with the script. If you require any further assistance on this subject, please do not hesitate to ask for clarification and I will do my best to respond swiftly.
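
    Some background on why the uppercase key works: xml_parser_create() turns on case folding (XML_OPTION_CASE_FOLDING) by default, so the start-element handler receives attribute names in upper case, just as it receives "RESULT" rather than "result". The sketch below is only an illustration under that assumption, not the asker's script; the helper collectResultAttrs() and the trimmed-down sample document are made up for this example.

```php
<?php
// Illustrative sample only: a cut-down version of the <results> document.
$xml = '<results><result href="http://www.icbc.com/Licensing/index.html">'
     . '<title>ICBC - Driver licensing</title></result></results>';

// Hypothetical helper: parse $xml and return the attribute array of every
// <result> element, with case folding switched on (1) or off (0).
function collectResultAttrs(string $xml, int $caseFolding): array
{
    $found = [];
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, $caseFolding);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attrs) use (&$found) {
            // Compare case-insensitively so the helper works in both modes.
            if (strtoupper($name) === 'RESULT') {
                $found[] = $attrs;
            }
        },
        function ($parser, $name) {
            // Nothing to do on closing tags in this sketch.
        }
    );
    xml_parse($parser, $xml, true);
    return $found;
}

$folded = collectResultAttrs($xml, 1); // default behaviour: keys are upper case
$asIs   = collectResultAttrs($xml, 0); // folding off: keys keep their case

echo $folded[0]['HREF'], "\n"; // attribute is stored under 'HREF'
echo $asIs[0]['href'], "\n";   // attribute is stored under 'href'
```

    If you would rather keep element and attribute names exactly as they appear in the XML, switching case folding off right after xml_parser_create() is an alternative, but then the comparisons against "RESULT", "TITLE" and the other tag names in the handlers would need to use lower case too.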

pcormie-ga at Google Answers
