html - XPath Expression: Select elements between A HREF="expr" tags


I haven't found an explicit way to select all the nodes that exist between two anchors (an <a></a> tag pair) in an HTML file.

The first anchor has the following format:

<a href="file://start..."></a> 

The second anchor:

<a href="file://end..."></a> 

I've verified that both can be selected using starts-with (note that I'm using Html Agility Pack):

HtmlNode n0 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://start')]");
HtmlNode n1 = html.DocumentNode.SelectSingleNode("//a[starts-with(@href,'file://end')]");

With this in mind, and with my amateurish XPath skills, I wrote the following expression to get all the tags between the two anchors:

html.DocumentNode.SelectNodes("//*[not(following-sibling::a[starts-with(@href,'file://start0')]) and not(preceding-sibling::a[starts-with(@href,'file://end0')])]");

This seems to work, but it selects the whole HTML document!

I need to, for example, in the following HTML fragment:

<html>
...
<a href="file://start0"></a>
<p>first nodes</p>
<p>first nodes
    <span>x</span>
</p>
<p>first nodes</p>
<a href="file://end0"></a>
...
</html>

remove both anchors and the three p elements (including, of course, the inner span).

Is there any way to do this?

I don't know if XPath 2.0 offers better ways to achieve this.

*Edit (special case!)*

I should also handle the case where:

"Select the tags between X and X', where X is <p><a href="file://..."></a></p>"

So instead of:

<a href="file://start..."></a> <!-- xhtml extracted --> <a href="file://end..."></a> 

I should also handle:

<p>
  <a href="file://start..."></a>
</p>
<!-- xhtml extracted -->
<p>
  <a href="file://end..."></a>
</p>

Thank you very much, again.

Use this XPath 1.0 expression:

//a[starts-with(@href,'file://start')]/following-sibling::node()
     [count(.| //a[starts-with(@href,'file://end')]/preceding-sibling::node())
     =
      count(//a[starts-with(@href,'file://end')]/preceding-sibling::node())
     ]

Or, use this XPath 2.0 expression:

    //a[starts-with(@href,'file://start')]/following-sibling::node()
  intersect
    //a[starts-with(@href,'file://end')]/preceding-sibling::node()

The XPath 2.0 expression uses the XPath 2.0 intersect operator.

The XPath 1.0 expression uses the Kayessian (after @Michael Kay) formula for the intersection of two node-sets:

$ns1[count(.|$ns2) = count($ns2)] 
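To see why this formula yields an intersection, here is a minimal Python sketch that models the two node-sets as plain lists. The node labels and the function name kayessian_intersection are hypothetical stand-ins, not part of XPath or Html Agility Pack. A node of $ns1 is kept exactly when adding it to $ns2 does not make $ns2 any larger, which means it was already a member of $ns2.

```python
def kayessian_intersection(ns1, ns2):
    """Mimic $ns1[count(.|$ns2) = count($ns2)] on plain Python values."""
    ns2_set = set(ns2)
    # {n} | ns2_set plays the role of the XPath union (.|$ns2).
    return [n for n in ns1 if len({n} | ns2_set) == len(ns2_set)]


# Hypothetical labels for the nodes around the two anchors in the question.
following_start = ["p1", "p2", "p3", "a_end", "trailing"]  # after the start anchor
preceding_end = ["leading", "a_start", "p1", "p2", "p3"]   # before the end anchor

between = kayessian_intersection(following_start, preceding_end)
print(between)
```

Only the labels present in both lists survive, in the order of the first list, which mirrors how the XPath 1.0 expression keeps the following siblings of the start anchor that are also preceding siblings of the end anchor.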

Verification with XSLT:

This XSLT 1.0 transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:template match="/">
  <xsl:copy-of select=
  "//a[starts-with(@href,'file://start')]/following-sibling::node()
       [count(.| //a[starts-with(@href,'file://end')]/preceding-sibling::node())
       =
        count(//a[starts-with(@href,'file://end')]/preceding-sibling::node())
       ]
  "/>
 </xsl:template>
</xsl:stylesheet>

when applied on the provided XML document:

<html>...
    <a href="file://start0"></a>
    <p>first nodes</p>
    <p>first nodes
        <span>x</span>
    </p>
    <p>first nodes</p>
    <a href="file://end0"></a>...
</html>

produces the wanted, correct result:

<p>first nodes</p>
<p>first nodes
    <span>x</span>
</p>
<p>first nodes</p>

This XSLT 2.0 transformation:

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>
 <xsl:strip-space elements="*"/>
 <xsl:template match="/">
  <xsl:copy-of select=
  "//a[starts-with(@href,'file://start')]/following-sibling::node()
  intersect
   //a[starts-with(@href,'file://end')]/preceding-sibling::node()
  "/>
 </xsl:template>
</xsl:stylesheet>

when applied on the same XML document (above), again produces the same wanted result.
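The special case from the question's edit, where each anchor is wrapped in its own p, is not covered by the expressions above, because the nodes to extract are then siblings of the wrapping p elements rather than of the anchors. As a rough sketch of one way to handle both layouts, the Python below climbs from each anchor to the child of their lowest common ancestor and returns the elements in between. It uses the standard library's xml.etree.ElementTree rather than Html Agility Pack, so it assumes well-formed XHTML input and only sees element nodes (not text or comment nodes); the function name nodes_between is hypothetical.

```python
import xml.etree.ElementTree as ET


def nodes_between(root, start_prefix, end_prefix):
    """Return the element siblings lying between the two marker anchors."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    def find_anchor(prefix):
        # First <a> in document order whose href starts with the prefix.
        for a in root.iter("a"):
            if a.get("href", "").startswith(prefix):
                return a
        return None

    def chain(node):
        # The node followed by its ancestors, up to the root.
        out = [node]
        while out[-1] in parent_of:
            out.append(parent_of[out[-1]])
        return out

    start, end = find_anchor(start_prefix), find_anchor(end_prefix)
    if start is None or end is None:
        return []
    start_chain, end_chain = chain(start), chain(end)
    # Lowest element that contains both anchors.
    common = next((n for n in start_chain[1:] if n in end_chain[1:]), None)
    if common is None:
        return []
    # Children of that element holding each anchor (the anchor or its <p> wrapper).
    first = start_chain[start_chain.index(common) - 1]
    last = end_chain[end_chain.index(common) - 1]
    siblings = list(common)
    return siblings[siblings.index(first) + 1 : siblings.index(last)]


# Assumed sample input: the question's fragment with both anchors wrapped in <p>.
doc = ET.fromstring(
    "<html><p><a href='file://start0'></a></p>"
    "<p>first nodes</p><p>first nodes <span>x</span></p><p>first nodes</p>"
    "<p><a href='file://end0'></a></p></html>"
)
selected = nodes_between(doc, "file://start", "file://end")
print([ET.tostring(e, encoding="unicode") for e in selected])
```

The same function also covers the original layout, since an unwrapped anchor is itself the child of the common parent. In XPath 1.0 terms, the equivalent idea would be to start from the p that contains the anchor instead of from the anchor itself.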

