
Xpath: Select Current And Next Node's Text By Current Node Attributes

First of all, this is a spawn from my previous question. I have posted this again because I was advised to do so by the person whose answer I accepted in the original post as he fe

Solution 1:

The required single XPath expression to select the relevant data for all courses is quite messy, so here I am taking another approach, which can be used (if necessary at all) to produce that single XPath expression:

This simple XSLT transformation:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="p[@class='titlestyle']">
    <xsl:text>&#xA;===================&#xA;</xsl:text>
    <xsl:value-of select="text()[1]"/>
  </xsl:template>

  <xsl:template match="span/span[@class='title2'][not(position() > 1)]">
    <xsl:text>&#xA;</xsl:text>
    <xsl:value-of select="."/>
    <xsl:value-of select="following-sibling::a[1]"/>
    <xsl:if test="not(following-sibling::a)">
      <xsl:value-of select="following-sibling::text()[1]"/>
    </xsl:if>
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <xsl:template match="text()"/>
</xsl:stylesheet>

when applied on the page at: http://www.utm.utoronto.ca/regcal/WEBLISTCOURSES1.html (tidied up to become a well-formed XML document), produces the wanted result:

===================
Anthropology
===================
ANT101H5 Introduction to Biological Anthropology and Archaeology

Exclusion: ANT100Y5

===================
ANT102H5 Introduction to Sociocultural and Linguistic Anthropology

Exclusion: ANT100Y5

===================
ANT200Y5 World Archaeology and Prehistory

Prerequisite:101H5

===================
ANT203Y5 Biological Anthropology

Prerequisite:101H5

===================
ANT204Y5 Sociocultural Anthropology

Prerequisite:101H5

===================
ANT205H5 Introduction to Forensic Anthropology

Prerequisite:101H5

===================
ANT206Y5 Culture and Communication: Introduction to Linguistic Anthropology

Exclusion: ANT206H5

===================
ANT241Y5 Aboriginal Peoples of North America

===================
ANT299Y5 Research Opportunity Program

===================
ANT304H5 Anthropology and Aboriginal Peoples

Exclusion: ANT304Y5

===================
ANT306H5 Forensic Anthropology Field School

Prerequisite: ANT205H5

===================
ANT308H5 Case Studies in Archaeological Botany and Zoology

Prerequisite: ANT200Y5

===================
ANT309H5 Southeast Asian Archaeology

Prerequisite: ANT200Y5

===================
ANT310H5 Complex Societies

Prerequisite: ANT200Y5

===================
ANT312H5 Archaeological Analysis

Prerequisite: ANT200Y5

===================
ANT313H5 China, Korea and Japan in Prehistory

Prerequisite: ANT200Y5

===================
ANT314H5 Archaeological Theory

Exclusion: ANT411H5

===================
ANT316H5 South Asian Archaeology

Prerequisite: ANT200Y5

===================
ANT317H5 Archaeology of Eastern North America

Prerequisite: ANT200Y5

===================
ANT318H5 Archaeological Fieldwork

Prerequisite: ANT200Y5

===================
ANT320H5 Archaeological Approaches to Technology

Prerequisite: ANT200Y5

===================
ANT322H5 Anthropology of Youth Culture

Exclusion: ANT204Y5

===================
ANT327H5 Agricultural Origins:  The Second Revolution

Prerequisite: ANT200Y5

===================
ANT331H5 The Biology of Human Sexuality

Exclusion: ANT330H5

===================
ANT332H5 Human Origins

Exclusion: ANT332Y5

===================
ANT333H5 Human Origins II

Exclusion: ANT332Y5

===================
ANT334H5 Human Osteology

Exclusion: ANT334Y5

===================
ANT335H5 Anthropology of Gender

Exclusion: ANT331Y5

===================
ANT336H5 Molecular Anthropology

Prerequisite: ANT203Y5

===================
ANT338H5 Laboratory Methods in Biological Anthropology

Prerequisite: ANT203Y5

===================
ANT339Y5 Human Adaptation through Biological and Cultural Means

Prerequisite: ANT203Y5

===================
ANT340H5 Osteological Theory

Exclusion: ANT334Y5

===================
ANT350H5 Globalization and the Changing World of Work

Prerequisite: ANT204Y5

===================
ANT351H5 Money, Markets, Gifts: Topics in Economic Anthropology

Prerequisite: ANT204Y5

===================
ANT352H5 Power, Authority, and Legitimacy: Topics in Political Anthropology

Prerequisite: ANT204Y5

===================
ANT358H5 Ethnographic Methods

Prerequisite: ANT204Y5

===================
ANT360H5 Anthropology of Religion

Exclusion: ANT209Y5

===================
ANT361H5 Anthropology of Sub-Saharan Africa

Exclusion: ANT212Y5

===================
ANT362H5 Language in Culture and Society

Prerequisite: ANT204Y5

===================
ANT363H5 Magic, Witchcraft and Science

Prerequisite: ANT360H5

===================
ANT364H5 Lab in Social Interaction

Prerequisite: ANT206H5

===================
ANT365H5 Semiotic Anthropology

Prerequisite: ANT204Y5

===================
ANT368H5 World Religions and Ecology

Exclusion: RLG311H5

===================
ANT369H5 Religious Violence and Nonviolence

Exclusion: RLG317H5

===================
ANT397H5 Independent Study

Prerequisite: Permission of Faculty Advisor


===================
ANT398Y5 Independent Reading

Prerequisite: Permission of Faculty Advisor


===================
ANT399Y5 Research Opportunity Program

Prerequisite: P.I.


===================
ANT401H5 Vocal and Visual Communication

Prerequisite: ANT102H5

===================
ANT414H5 People and Plants in Prehistory

Prerequisite: ANT200Y5

===================
ANT415H5 Faunal Archaeo-Osteology

Exclusion: ANT415Y5

===================
ANT416H5 Advanced Archaeological Analysis

Prerequisite: ANT312H5

===================
ANT418H5 Advanced Archaeological Fieldwork

Prerequisite: ANT318H5

===================
ANT430H5 Special Problems in Biological Anthropology and Archaeology

Prerequisite: P.I


===================
ANT430Y5 Special Problems in Biological Anthropology and Archaeology

Prerequisite: P.I. 


===================
ANT431Y5 Special Problems in Sociocultural or Linguistic Anthropology

Prerequisite: P.I.


===================
ANT431H5 Special Problems in Sociocultural or Linguistic Anthropology

Prerequisite: P.I.


===================
ANT432H5 Special Seminar in Anthropology

Prerequisite: P.I.


===================
ANT433H5 Genes, Language, Artifact and Mind

Prerequisite: ANT200Y5

===================
ANT434H5 Palaeopathology

Prerequisite: ANT334Y5

===================
ANT438H5 The Development of Thought in Biological Anthropology

Prerequisite: ANT203Y5

===================
ANT439Y5 Advanced Forensic Anthropology

Prerequisite: ANT205H5

===================
ANT441H5 Advanced Bioarchaeology

Prerequisite: ANT334H5

===================
ANT457H5 Anthropology and the Environment

Prerequisite: ANT102H5

===================
ANT458H5 Anthropology of Crime, Law and Order

Exclusion: ANT204Y5

===================
ANT459H5 The Ethnography of Speaking

Prerequisite: ANT206Y5

===================
ANT460H5 Theory in Sociocultural Anthropology

Prerequisite: ANT204Y5

===================
ANT461H5 Emergent Topics in Socio-Cultural &amp;  Linguistic Anthropology

Prerequisite: ANT204Y5

===================
ANT498H5 Advanced Independent Study

Prerequisite: P.I.


===================
ANT499Y5 Advanced Independent Research

Prerequisite: P.I.
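
For readers who would rather stay in Python, the logic of the two templates can be approximated with the standard library's xml.etree.ElementTree. This is only a rough sketch: the SAMPLE fragment is hand-written and hypothetical (the real page's markup may differ), extract is an illustrative helper name, and ElementTree has no sibling axes, so following-sibling is emulated by walking the parent's children.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment; the real page's markup may differ.
SAMPLE = """<body>
  <p class="titlestyle">ANT101H5 Introduction to Biological Anthropology<span>(SCI)</span></p>
  <span><span class="title2">Exclusion: </span><a href="#">ANT100Y5</a></span>
  <p class="titlestyle">ANT397H5 Independent Study</p>
  <span><span class="title2">Prerequisite: </span>Permission of Faculty Advisor</span>
</body>"""


def extract(root):
    """Walk the tree in document order and collect course titles and notes."""
    lines = []
    for elem in root.iter():
        if elem.tag == "p" and elem.get("class") == "titlestyle":
            # Approximates text()[1] when the paragraph starts with text.
            lines.append((elem.text or "").strip())
        elif elem.tag == "span":
            kids = list(elem)
            labels = [k for k in kids
                      if k.tag == "span" and k.get("class") == "title2"]
            if not labels:
                continue
            label = labels[0]  # plays the role of [not(position() > 1)]
            later = kids[kids.index(label) + 1:]
            link = next((k for k in later if k.tag == "a"), None)
            if link is not None:
                # Plays the role of following-sibling::a[1].
                value = "".join(link.itertext())
            else:
                # Plays the role of following-sibling::text()[1]:
                # the first non-blank text after the label.
                tails = [label.tail] + [k.tail for k in later]
                value = next((t for t in tails if t and t.strip()), "")
            lines.append("".join(label.itertext()) + value.strip())
    return lines


for line in extract(ET.fromstring(SAMPLE)):
    print(line)
```

On the sample fragment this prints each course title followed by its "Exclusion:" or "Prerequisite:" note, taking the link text when a link is present and the plain text otherwise.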

Solution 2:

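Instead of a fixed index such as [<int>], try a predicate of the form [position() mod <offset> = <base>].

Here <offset> is the distance between consecutive nodes you are interested in, and <base> is the remainder that picks out the first of them. The offset may be different for @class='titlestyle' and @class='title2'.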

ites = hxs.select("(//p[@class='titlestyle'])[position() mod <offset to next to match> = 2]/text()[1] | (//span[@class='title2'])[position() mod <offset to next to match> = 2]/text() | \
                    (//span[@class='title2'])[position() mod <offset to next to match> = 2]/following-sibling::a[1]/text() | (//span[@class='title2'])[position() mod <offset to next to match> = 3]/text() | \
                    (//span[@class='title2'])[position() mod <offset to next to match> = 3]/following-sibling::a[1]/text()")
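
To see which positions such a predicate keeps, the same arithmetic can be tried in plain Python before touching the real page. This is a minimal sketch: select_by_mod and the sample list are hypothetical, and the offset of 3 is chosen arbitrarily for illustration.

```python
def select_by_mod(nodes, offset, base):
    """Keep items whose 1-based position satisfies position mod offset == base."""
    return [node
            for position, node in enumerate(nodes, start=1)
            if position % offset == base]


# Hypothetical match list: the wanted items sit at positions 2 and 5.
matches = ["skip-1", "want-A", "skip-2", "skip-3", "want-B", "skip-4"]
print(select_by_mod(matches, offset=3, base=2))
```

Note that <base> has to lie between 0 and <offset> - 1. To keep every third node starting with the third, the remainder to test for is 0, not 3.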

EDIT: As requested.

Run each individual XPath on its own, one at a time, without constraining it by position. This is a manual fact-finding exercise to determine the final values to use in the combined XPath.

Start by returning all nodes that match the first XPath:

ites = hxs.select("(//p[@class='titlestyle'])/text()[1]")

ites will contain some nodes that you want and some that you do not.

You have already determined that, for this expression, the 2nd node is the first one you want. Now count the distance from it to the next node in ites that this rule should match. That distance is what the expression above calls <offset to next to match>.

Now repeat the above for each of the remaining xpath searches.
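
The counting step can also be done with a few lines of Python once the unconstrained results are in a list. The sketch below is only an illustration: wanted_positions, infer_offset_and_base, the sample texts, and the "starts with ANT" test are all hypothetical stand-ins for whatever marks a wanted node in your data.

```python
def wanted_positions(texts, is_wanted):
    """Return the 1-based positions of the items we actually want."""
    return [i for i, text in enumerate(texts, start=1) if is_wanted(text)]


def infer_offset_and_base(positions):
    """Return (offset, base) if the wanted positions are evenly spaced, else None."""
    gaps = {later - earlier for earlier, later in zip(positions, positions[1:])}
    if len(gaps) != 1:
        return None  # fewer than two positions, or irregular spacing
    offset = gaps.pop()
    return offset, positions[0] % offset


# Hypothetical unconstrained result list.
texts = ["Anthropology", "ANT101H5", "note", "note",
         "ANT102H5", "note", "note", "ANT200Y5"]
positions = wanted_positions(texts, lambda text: text.startswith("ANT"))
print(positions, infer_offset_and_base(positions))
```

If the function returns None, the wanted nodes are not evenly spaced and a single mod predicate will not select them all.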

Think of hxs.select("") as a filter: as it walks the XML, every node that matches your XPath is returned.

Here is an example: http://zvon.org/xxl/XPathTutorial/Output/example22.html
