Get second element text with XPath? [lxml]

Accepted answer
Score: 42

I tried this but it doesn't work.

t = item.findtext('.//span[@class="python"]//a[2]')

This is a FAQ about the // abbreviation.

.//a[2] means: Select all a descendants of the current node that are the second a child of their parent. So this may select more than one element or no element, depending on the concrete XML document.

To put it more simply, the [] operator has higher precedence than //.

If you want just one (the second) of all nodes returned, you have to use brackets to force your wanted precedence:

(.//a)[2]

This really selects the second a descendant of the current node.
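
The two readings can be compared in Python. This is a minimal sketch, assuming the standard library's xml.etree.ElementTree, whose find methods implement only a subset of XPath. That subset has no parenthesized expressions, so plain list indexing stands in for (.//a)[2]. The sample document and the variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (not taken from the question).
root = ET.fromstring(
    "<div>"
    "<p><a>one</a></p>"
    "<p><a>two</a><a>three</a></p>"
    "<p><a>four</a><a>five</a></p>"
    "</div>"
)

# .//a[2]: every a that is the second a child of its own parent.
per_parent = [a.text for a in root.findall(".//a[2]")]

# (.//a)[2]: collect all a descendants first, then take the second one.
# ElementTree paths have no parentheses, so list indexing stands in here.
all_links = root.findall(".//a")
second_overall = all_links[1].text

print(per_parent)      # ['three', 'five']
print(second_overall)  # two
```

With lxml, the xpath() method accepts full XPath 1.0, so the parenthesized expression can be passed to it directly as a string.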

For the actual expression used in the question, change it to:

(.//span[@class="python"]//a)[2]

or change it to:

(.//span[@class="python"]//a)[2]/text()

Score: 2

I'm not sure what the problem is...

>>> d = """<span class='python'>
...   <a>google</a>
...   <a>chrome</a>
... </span>"""
>>> from lxml import etree
>>> d = etree.HTML(d)
>>> d.xpath('.//span[@class="python"]/a[2]/text()')
['chrome']

Score: 2

From Comments:

or the simplification of the actual HTML I posted is too simple

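You are right. What is the meaning of .//span[@class="python"]//a[2]? This will be expanded to:

self::node()/descendant-or-self::node()/child::span[attribute::class="python"]/descendant-or-self::node()/child::a[fn:position()=2]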

It will finally select the second a child (fn:position() refers to the child axis). So nothing will be selected if your document is like:

<span class='python'>
  <span>
    <a>google</a><!-- This is the first "a" child of its parent -->
  </span>
  <a>chrome</a><!-- This is also the first "a" child of its parent -->
</span>
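
The empty result can be reproduced with a short check. This is a sketch only: it assumes a nested layout of the kind described above, where the inner wrapper element and the outer div are hypothetical, and it uses the standard library's xml.etree.ElementTree rather than lxml.

```python
import xml.etree.ElementTree as ET

# Nested layout: each a is the only a child of its own parent.
# The <div> root and the inner <span> wrapper are assumptions.
root = ET.fromstring(
    "<div>"
    "<span class='python'>"
    "<span><a>google</a></span>"
    "<a>chrome</a>"
    "</span>"
    "</div>"
)

# No a element is the second a child of its parent, so nothing matches
# and findtext falls back to its default of None.
no_match = root.findtext(".//span[@class='python']//a[2]")

# Listing every a descendant first and then indexing finds the element.
links = root.findall(".//span[@class='python']//a")
second = links[1].text

print(no_match)  # None
print(second)    # chrome
```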

If you want the second of all descendants, use:

descendant::span[@class="python"]/descendant::a[2]
