html - XPath Expression: Select elements between A HREF="expr" tags
I haven't found an explicit way to select all the nodes that exist between two anchors (<a></a>
tag pairs) in an HTML file.
The first anchor has the following format:
<a href="file://start..."></a>
The second anchor:
<a href="file://end..."></a>
I've verified that both can be selected using starts-with (note that I'm using HTML Agility Pack):
HtmlNode n0 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://start')]");
HtmlNode n1 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://end')]");
With this in mind, and with my amateurish XPath skills, I wrote the following expression to get all the tags between the two anchors:
html.DocumentNode.SelectNodes("//*[not(following-sibling::a[starts-with(@href,'file://start0')]) and not(preceding-sibling::a[starts-with(@href,'file://end0')])]");
This seems to work, but it selects the whole HTML document!
What I need is, for example, given the following HTML fragment:
<html> ... <a href="file://start0"></a> <p>first nodes</p> <p>first nodes <span>x</span> </p> <p>first nodes</p> <a href="file://end0"></a> ... </html>
to remove both anchors and the three p elements (including, of course, the inner span).
Is there any way to do this?
I don't know whether XPath 2.0 offers better ways to achieve this.
*Edit (special case!)*
I should also handle the case where:
"Select the tags between X and X', where X is <p><a href="file://..."></a></p>"
So instead of:
<a href="file://start..."></a> <!-- xhtml extracted --> <a href="file://end..."></a>
I should also handle:
<p> <a href="file://start..."></a> </p> <!-- xhtml extracted --> <p> <a href="file://end..."></a> </p>
Thank you very much, again.
Use this XPath 1.0 expression:
//a[starts-with(@href,'file://start')]/following-sibling::node() [count(.| //a[starts-with(@href,'file://end')]/preceding-sibling::node()) = count(//a[starts-with(@href,'file://end')]/preceding-sibling::node()) ]
Or, use this XPath 2.0 expression:
//a[starts-with(@href,'file://start')]/following-sibling::node() intersect //a[starts-with(@href,'file://end')]/preceding-sibling::node()
The XPath 2.0 expression uses the XPath 2.0 intersect operator.
The XPath 1.0 expression uses the Kayessian formula (named after @Michael Kay) for the intersection of two node-sets:
$ns1[count(.|$ns2) = count($ns2)]
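To see why the formula yields an intersection: a node from $ns1 is kept only when adding it to $ns2 does not make $ns2 any bigger, which can only happen if the node is already in $ns2. Below is a minimal Python sketch of that counting argument. It is not XPath and not part of the original answer; the function name and the node labels are hypothetical stand-ins for real document nodes.

```python
def kayessian_intersection(ns1, ns2):
    """Mimic $ns1[count(.|$ns2) = count($ns2)] with plain Python sets."""
    ns2 = set(ns2)
    # {n} | ns2 plays the role of the XPath union (.|$ns2):
    # its size equals len(ns2) only when n is already a member of ns2.
    return [n for n in ns1 if len({n} | ns2) == len(ns2)]


# Hypothetical labels standing in for nodes of the sample document.
after_start = ["p1", "p2", "p3", "a_end", "footer"]   # following siblings of the start anchor
before_end = ["header", "a_start", "p1", "p2", "p3"]  # preceding siblings of the end anchor

between = kayessian_intersection(after_start, before_end)
print(between)  # ['p1', 'p2', 'p3']
```

The result keeps the order of the first list, in the same way that the XPath predicate filters $ns1 without reordering it.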
Verification with XSLT:
This XSLT 1.0 transformation:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:template match="/">
  <xsl:copy-of select=
  " //a[starts-with(@href,'file://start')]/following-sibling::node()
      [count(.| //a[starts-with(@href,'file://end')]/preceding-sibling::node())
      = count(//a[starts-with(@href,'file://end')]/preceding-sibling::node())
      ]
  "/>
 </xsl:template>
</xsl:stylesheet>
when applied to the provided XML document:
<html>... <a href="file://start0"></a> <p>first nodes</p> <p>first nodes <span>x</span> </p> <p>first nodes</p> <a href="file://end0"></a>... </html>
produces the wanted, correct result:
<p>first nodes</p> <p>first nodes <span>x</span> </p> <p>first nodes</p>
This XSLT 2.0 transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:template match="/">
  <xsl:copy-of select=
  " //a[starts-with(@href,'file://start')]/following-sibling::node()
    intersect
    //a[starts-with(@href,'file://end')]/preceding-sibling::node()
  "/>
 </xsl:template>
</xsl:stylesheet>
when applied to the same XML document (above), again produces the wanted result.
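For readers who want to experiment without an XSLT processor, here is a rough Python sketch of the same sibling-range idea, followed by the removal the question asks for. It is an illustration under stated assumptions, not the Html Agility Pack API: Python's standard xml.etree.ElementTree does not support the following-sibling and preceding-sibling axes, so the range is found by child index instead. The function name siblings_between, the SAMPLE string, and the trailing <div>kept</div> element are hypothetical. The sketch only covers the simple case where both anchors share a parent, and it sees element nodes only, whereas node() in the expressions above also matches text and comment nodes.

```python
import xml.etree.ElementTree as ET


def siblings_between(root, start_prefix, end_prefix):
    """Return (parent, start, between, end) for the first parent that holds
    a start anchor followed by an end anchor among its children, else None."""
    def is_anchor(el, prefix):
        return el.tag == "a" and el.get("href", "").startswith(prefix)

    for parent in root.iter():
        children = list(parent)
        starts = [i for i, c in enumerate(children) if is_anchor(c, start_prefix)]
        ends = [i for i, c in enumerate(children) if is_anchor(c, end_prefix)]
        if starts and ends and starts[0] < ends[0]:
            i, j = starts[0], ends[0]
            return parent, children[i], children[i + 1:j], children[j]
    return None


# Hypothetical sample modelled on the fragment from the question;
# <div>kept</div> is added only to show that later siblings survive.
SAMPLE = (
    '<html><a href="file://start0"></a>'
    '<p>first nodes</p><p>first nodes <span>x</span> </p><p>first nodes</p>'
    '<a href="file://end0"></a><div>kept</div></html>'
)

root = ET.fromstring(SAMPLE)
parent, start, between, end = siblings_between(root, "file://start", "file://end")
print([el.tag for el in between])  # ['p', 'p', 'p']

# Remove both anchors and everything between them.
for el in [start, *between, end]:
    parent.remove(el)
print(ET.tostring(root, encoding="unicode"))  # <html><div>kept</div></html>
```

Note that removing an element with ElementTree also drops the text that directly follows it (its tail), which is harmless here because the range is removed as a whole.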