I often have to parse an XML document to pull out element text when integrating with a third-party system. I have used XPath for these queries for years, but have always cringed when the elements carried a lot of namespace declarations and prefixes. Half of the time, these prefixes weren't declared or used properly, and that would throw off my XPath query.
A few months ago I discovered a nice little XPath expression that lets me ignore the namespace prefixes. Of course, XML evangelists would hang me out to dry for ignoring namespaces, but in all honesty, namespaces have never made a difference in my SOAP responses or other XML data transfers. So, since this will probably work accurately in 99% of the use cases out there, I am making it my new favorite way to query an XML document using XPath.
For example, if you wanted to access an XML document like the following:
<soapenv:Envelope xmlns:soapenc="http://schemas.xmlsoap.org/soap/encoding/"
                  xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <soapenv:Body>
        <RetrieveContentResponse xmlns="urn:srm0">
            <returnval>
                <apiVersion>2.0</apiVersion>
                <srmApi type="SrmApi">SrmRecoveryApi1</srmApi>
                <protection type="SrmProtection">SrmProtection</protection>
                <recovery type="SrmRecovery">SrmRecovery</recovery>
                <about xmlns:vim25="urn:vim25">
                    <vim25:name>VMware vCenter Site Recovery Manager</vim25:name>
                    <vim25:fullName>
                        VMware vCenter Site Recovery Manager 5.0.0 build-474459
                    </vim25:fullName>
                    <vim25:vendor>VMware, Inc.</vim25:vendor>
                    <vim25:version>5.0.0</vim25:version>
                    <vim25:build>474459</vim25:build>
                    <vim25:localeVersion>INTL</vim25:localeVersion>
                    <vim25:localeBuild>000</vim25:localeBuild>
                    <vim25:osType>Windows</vim25:osType>
                    <vim25:productLineId>srm</vim25:productLineId>
                    <vim25:apiType>SiteRecovery</vim25:apiType>
                    <vim25:apiVersion>2.0</vim25:apiVersion>
                </about>
            </returnval>
        </RetrieveContentResponse>
    </soapenv:Body>
</soapenv:Envelope>
You may want to get the contents of the “version” field, which in this case is “5.0.0”.
Since there is only one “version” element in the document, you may suppose that you could use an XPath statement of “//version”. However, this statement doesn’t work with namespace-aware XPath libraries: the element is really vim25:version in the urn:vim25 namespace, and in XPath 1.0 an unprefixed name such as “version” only matches elements that are in no namespace at all.
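To make the failure concrete, here is a minimal sketch using Python's standard-library xml.etree.ElementTree against a trimmed copy of the response above. The choice of Python and ElementTree is mine for illustration (the post does not name a library), and ElementTree implements only a subset of XPath, but it treats unprefixed names the same way: they match only elements with no namespace.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SOAP response shown above (assumption: the
# omitted elements do not matter for this lookup).
SOAP = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <RetrieveContentResponse xmlns="urn:srm0">
      <returnval>
        <apiVersion>2.0</apiVersion>
        <about xmlns:vim25="urn:vim25">
          <vim25:version>5.0.0</vim25:version>
        </about>
      </returnval>
    </RetrieveContentResponse>
  </soapenv:Body>
</soapenv:Envelope>
"""

root = ET.fromstring(SOAP)

# An unprefixed name only matches elements that are in no namespace,
# so this lookup finds nothing.
bare = root.find(".//version")

# Binding a prefix (any prefix, "v" here) to the namespace URI makes
# the same lookup succeed.
qualified = root.find(".//v:version", {"v": "urn:vim25"})

print(bare)            # None
print(qualified.text)  # 5.0.0
```

The prefix used in the query does not have to match the one in the document; only the namespace URI matters.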
One surefire way to get an element in the document is to ignore namespaces altogether. In the grand scheme of things, ignoring namespaces is usually a bad idea with big XML documents, where two elements from different namespaces may share a local name. However, for Web Service responses, you are usually pretty safe in doing so.
In order to ignore a namespace, you can use the following expression:
*[local-name()='ELEMENT_NAME_GOES_HERE']
So, for example, if I wanted an easy way to get the version data from the XML document above, I could use any of the following XPath statements (as well as other derivatives):
- //*[local-name()='version']
- /*/*/*/*/*/*[local-name()='version']
- //*[local-name()='about']//*[local-name()='version']
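The three statements above can be approximated in Python's standard library, as a sketch rather than a definitive implementation. xml.etree.ElementTree does not support the local-name() function, so this example swaps in its {*} wildcard (available in Python 3.8 and later), which likewise matches an element by local name in any namespace. The trimmed document and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the SOAP response from the post.
SOAP = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <RetrieveContentResponse xmlns="urn:srm0">
      <returnval>
        <apiVersion>2.0</apiVersion>
        <about xmlns:vim25="urn:vim25">
          <vim25:version>5.0.0</vim25:version>
        </about>
      </returnval>
    </RetrieveContentResponse>
  </soapenv:Body>
</soapenv:Envelope>
"""

root = ET.fromstring(SOAP)

# Rough counterpart of //*[local-name()='version']
anywhere = root.find(".//{*}version")

# Rough counterpart of /*/*/*/*/*/*[local-name()='version'].
# ElementTree paths are relative to the root element (Envelope),
# so the path has one step fewer than the XPath version.
fixed_depth = root.find("*/*/*/*/{*}version")

# Rough counterpart of //*[local-name()='about']//*[local-name()='version']
under_about = root.find(".//{*}about//{*}version")

for element in (anywhere, fixed_depth, under_about):
    print(element.text)  # 5.0.0 each time
```

Note that apiVersion is not matched by any of these lookups: the comparison is against the whole local name, not a suffix of it.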
The “local-name()” function offers a very convenient way to blow past any namespace errors/weirdness in an XML response to allow you to quickly identify an XML element.
Awesome John! Thanks!
John, many thanks for the tips! Never thought this was possible.
One quick question: do you think these statements will cause any performance degradation?
Thanks again!
Regards,
Sathish
@Sathish – Great thing to consider! So far, I have not noticed any negative performance hits. Of course, web service responses are typically smaller than your average XML document anyway, so any potential performance hit should be negligible. Also, I recommend using this when there are namespace issues (like the referenced example) and not for every encounter with XML. Finally, I believe the biggest performance hit comes from using relative paths (such as those starting with “//”, which search the whole document) vs. full paths. If you want to fine-tune an XPath query, that is where you will get the biggest bang for your buck.
Is using an XML Declaration in the input absolutely necessary when receiving a SOAP Message in ServiceNow?
When I include an XML declaration in my input, the XML parsing in my workflow is thrown off. However, when I remove the declaration from the input, my workflow accepts it all very nicely.
It seems like XML Declarations are not required in all XML documents.
Would love to hear some thoughts on this. Thanks.
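For what it is worth, the XML 1.0 specification does make the declaration optional. The sketch below uses Python's standard-library xml.etree.ElementTree (not ServiceNow, whose parser may behave differently) to show that a document parses to the same tree with or without it. It also shows one common way a declaration can break parsing: it must be the very first thing in the input, so even a leading newline makes the parser reject it. Whether that is what happened in the workflow described above is only a guess.

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal payload, used only for illustration.
BODY = "<returnval><apiVersion>2.0</apiVersion></returnval>"
DECLARATION = '<?xml version="1.0"?>'

# With or without the declaration, the parsed tree is the same.
without_decl = ET.fromstring(BODY)
with_decl = ET.fromstring(DECLARATION + BODY)
same_tree = ET.tostring(without_decl) == ET.tostring(with_decl)
print(same_tree)  # True

# A declaration that is not at the very start of the input is an error.
try:
    ET.fromstring("\n" + DECLARATION + BODY)
    misplaced_accepted = True
except ET.ParseError:
    misplaced_accepted = False
print(misplaced_accepted)  # False
```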
Awesome. But I can’t figure out the syntax when you have multiple elements, e.g. two returnval objects. How do I select the second returnval object with this method?
My previous question is solved.
//*[local-name()='version'][2]
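One subtlety with the positional predicate: in XPath, a trailing [2] on a step counts among matching siblings under the same parent, not across the whole document. To get the second match in document order, XPath needs parentheses, as in (//*[local-name()='version'])[2]. The sketch below illustrates the difference with Python's standard-library xml.etree.ElementTree, whose [2] predicate is also per parent. The two-returnval document and the 5.1.0 value are made up for this example, and the {*} wildcard (Python 3.8+) stands in for local-name().

```python
import xml.etree.ElementTree as ET

# Hypothetical response with two returnval siblings.
DOC = """\
<Response xmlns="urn:srm0" xmlns:vim25="urn:vim25">
  <returnval><about><vim25:version>5.0.0</vim25:version></about></returnval>
  <returnval><about><vim25:version>5.1.0</vim25:version></about></returnval>
</Response>
"""

root = ET.fromstring(DOC)

# The two returnval elements share a parent, so [2] picks the second one.
second_returnval = root.find(".//{*}returnval[2]")
print(second_returnval.find(".//{*}version").text)  # 5.1.0

# Each version element is the only one under its parent, so [2]
# matches nothing here.
second_sibling_versions = root.findall(".//{*}version[2]")
print(len(second_sibling_versions))  # 0

# Second match in document order: collect all matches, then index.
# This plays the role of (//*[local-name()='version'])[2] in XPath.
second_version_in_doc = root.findall(".//{*}version")[1]
print(second_version_in_doc.text)  # 5.1.0
```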
99% of namespaces are broken and useless, simply because the majority of xmlns URIs point to pages that were deleted long ago.
Good job John! Perfect answer..