Talk:XPath: Difference between revisions

From MozillaZine Knowledge Base

Latest revision as of 23:10, 30 March 2005

asqueella, why did you write 'Note that you shouldn't use this function if you expect to get a long list of results from it.'? --grimholtz

That function copies query results to an array, taking additional memory and CPU cycles. I wouldn't use it at all, but for simple queries I think the simpler interface is worth the overhead. (Not that I've done any profiling.)
Related example (though not completely applicable): try opening the JS Console when it has a lot of entries. --asqueella
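
To make the trade-off above concrete, here is a minimal sketch of the alternative: visiting each result as the iterator yields it instead of copying everything into an array first. The names <code>forEachXPathNode</code> and <code>makeFakeResult</code> are hypothetical and do not appear in the article; the fake result is only a stand-in for an iterator-type <code>XPathResult</code> so the sketch runs outside a browser.

```javascript
// Hypothetical helper (not from the article): visit each node of an
// iterator-type XPathResult without building an intermediate array.
function forEachXPathNode(result, callback) {
  var count = 0;
  var node;
  // iterateNext() returns null once the iterator is exhausted.
  while ((node = result.iterateNext()) !== null) {
    callback(node, count);
    count++;
  }
  return count;
}

// Stand-in for an XPathResult (assumption: only iterateNext() is needed).
function makeFakeResult(items) {
  var i = 0;
  return {
    iterateNext: function () {
      return i < items.length ? items[i++] : null;
    }
  };
}

// Usage: process results one at a time; nothing is accumulated.
var visited = forEachXPathNode(makeFakeResult(["a", "b", "c"]), function (node, index) {
  console.log(index + ": " + node);
});
console.log("visited " + visited + " nodes");
```

Whether this is measurably cheaper than the array version is untested here, as in the comment above; it only avoids the extra copy.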


So this page is actually a bit misleading in that it suggests that you can only use XPath with XML - in fact it will work with any DOM tree. I'm also not sure that making use of a helper function actually helps people understand what's going on (but then I should finish my documentation...) --Jgraham 03:20, 20 Mar 2005 (PST)

DOM trees are in-memory representations of XML. They are one and the same, so I don't see how this can be misleading.
HTML is not XML, but HTML has a DOM and XPath works with HTML. --Jgraham
I think you are confusing the Document Object Model Core specification with the Document Object Model HTML specification. The former defines interfaces such as Document, Node, Element, Attr, etc., while the latter defines interfaces such as HTMLDocument and HTMLElement. The Document Object Model XPath, which defines interfaces such as XPathEvaluator, XPathExpression, and XPathResult (some of which are used in this article's evaluateXPath() function), maps between the DOM Core specification and the XPath specification, but not between DOM HTML and XPath. Indeed, if you try to use any of the methods in the DOM XPath interfaces (e.g., XPathEvaluator.evaluate()) with DOM HTML interfaces, you will get an exception. In other words, XPath does not work with DOM HTML trees. Try it if you don't believe me. --grimholtz
"Mozilla implements much of the DOM 3 XPath. This allows XPath expressions to be run against both HTML and XML documents." (Source: Mozilla XPath Documentation) Note that the examples running on that page work quite well, and that the page is not well-formed XML. --Unarmed 06:12, 30 Mar 2005 (PST)


The code on that page does not use the interfaces defined by the W3C for evaluating XPath expressions against a DOM Core object (Document Object Model XPath). It uses document.evaluate(), where document is an instance of HTMLDocument (not XMLDocument). I've never seen this method until now, and there doesn't appear to be any reference to Document.evaluate() in any of the W3C specifications I'm perusing. IE doesn't recognize it, nor do I see support for it in Xalan. Do you happen to know anything else about it? FWIW, the code in this article from before 16:58, 18 Mar 2005 does not work in Firefox with HTMLDocument DOMs. Save this as HTML and open it in Firefox:
<pre>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
<script>
// Evaluate an XPath expression aExpression against a given DOM node
// or Document object (aNode), returning the results as an array
// thanks wanderingstan at morethanwarm dot mail dot com for the
// initial work.
function evaluateXPath(aNode, aExpr) {
  var xpe = new XPathEvaluator();
  var nsResolver = xpe.createNSResolver(aNode.ownerDocument.documentElement);
  var result = xpe.evaluate(aExpr, aNode, nsResolver, 0, null);
  var found = [];
  while (res = result.iterateNext())
    found.push(res);
  return found;
}
</script>
</head>
<input type="edit"> <!-- note non-well-formed -->
<body onload="alert(evaluateXPath(document, '//body'));">
</body>
</html>
</pre>
It yields a JavaScript error: aNode.ownerDocument has no properties. However, I made a simple change to the article at 16:58, 18 Mar 2005 so the code works with both XMLDocuments and HTMLDocuments. It's in the second line of evaluateXPath():
<pre>
var nsResolver = xpe.createNSResolver(aNode.ownerDocument == null ?
  aNode.documentElement : aNode.ownerDocument.documentElement);
</pre>
This article uses the W3C standard DOM XPath interfaces and methods, not Document.evaluate() like the link you sent.
--grimholtz
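
The error reported above comes from the fact that a Document node's <code>ownerDocument</code> is null, so <code>aNode.ownerDocument.documentElement</code> fails when the document itself is passed in. As a rough sketch of the fix's logic in isolation, the function below picks the node that would be handed to <code>createNSResolver()</code>. The name <code>resolverRoot</code> and the plain-object stand-ins are hypothetical, used only so the logic can be run outside a browser.

```javascript
// Hypothetical helper isolating the conditional from the fix above:
// choose the element to pass to createNSResolver().
function resolverRoot(aNode) {
  // A Document node has ownerDocument == null, so use its own root element;
  // any other node reaches the root element through its owner document.
  return aNode.ownerDocument == null
    ? aNode.documentElement
    : aNode.ownerDocument.documentElement;
}

// Plain-object stand-ins for DOM nodes (assumption: only these two
// properties matter for the check).
var fakeRoot = { nodeName: "HTML" };
var fakeDocument = { ownerDocument: null, documentElement: fakeRoot };
var fakeElement = { nodeName: "BODY", ownerDocument: fakeDocument };
fakeRoot.ownerDocument = fakeDocument;

console.log(resolverRoot(fakeDocument) === fakeRoot); // document passed in
console.log(resolverRoot(fakeElement) === fakeRoot);  // element passed in
```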
Right - disclaimer: I wrote the article being referenced. It's not finished or complete (corrections or additions are welcome) and is based more on what works in Mozilla than a detailed reading of the spec. Despite that, the DOM3 XPath Spec is quite clearly a DOM spec and appears to be designed to work with all DOM trees. In particular, sentences such as "In a DOM implementation which supports the XPath 3.0 feature, as described above, the XPathEvaluator interface will be implemented on the same object which implements the Document interface" suggest that document.evaluate is legitimate.
Regardless of what the spec intends, there is an implementation of XPath in Mozilla that works with HTML and XML (almost) interchangeably. It is immensely useful; I've noticed many of the existing Greasemonkey user scripts use XPath to access particular DOM elements they want to manipulate and they're probably *all* working with HTML not XML. Since the purpose of the kb is to document Mozilla (and not the W3C specs), XPath in HTML is well worth documenting and document.evaluate provides (IMHO) the most transparent interface for that. --Jgraham
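
For reference, a minimal sketch of what a <code>document.evaluate</code> call might look like, assuming a browser that implements DOM XPath on the document object as described above. The function name <code>queryAll</code> is hypothetical. The document is passed in as a parameter, and the result type is written as a local constant equal to <code>XPathResult.ORDERED_NODE_SNAPSHOT_TYPE</code>, so the function can also be exercised with a stub outside a browser.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE in DOM Level 3 XPath.
var ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: run an XPath expression against a document via
// doc.evaluate() and return the matching nodes in document order.
function queryAll(doc, expr) {
  // A null namespace resolver is enough when the expression has no prefixes.
  var snapshot = doc.evaluate(expr, doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  var nodes = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Only runs where a DOM with document.evaluate is available.
if (typeof document !== "undefined" && typeof document.evaluate === "function") {
  console.log(queryAll(document, "//body").length);
}
```

Like the article's evaluateXPath(), this copies the results into an array, so the earlier caveat about long result lists applies here too.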


You should feel free to edit this article as you see fit since it is in the public domain. However, there's something to be said for thoroughness, in addition to the value of showing off Mozilla's capabilities in that they've actually implemented all of these interfaces, so I would just ask that if you document document.evaluate(), please don't remove the existing code. Rather, you could highlight document.evaluate() as an alternative (and certainly easier and perhaps more performant) means to evaluate XPath expressions. --grimholtz