The fragment of XPath that we have described so far is usually called navigational or core XPath. It provides essentially all the features needed to navigate the XML tree. However, full XPath offers more, including comparison operators on node values and a library of functions.
Consider the following XML document containing an excerpt of a bibliography:
<?xml version="1.0"?>
<!DOCTYPE biblio SYSTEM "biblio.dtd">
<biblio>
  <inproceedings key="M4M05" cite="JLLI05">
    <author>M. Franceschet</author>
    <author>E. Zimuel</author>
    <title>Modal logic and navigational XPath: an experimental comparison</title>
    <booktitle>Workshop Methods for Modalities</booktitle>
    <pages>156-172</pages>
    <year>2005</year>
    <url>http://www.sci.unich.it/~francesc/pubs/m4m05.pdf</url>
  </inproceedings>
  <article key="JLLI05" cite="M4M05">
    <author>M. Franceschet</author>
    <author>B. ten Cate</author>
    <title>Guarded fragments with constants</title>
    <journal>Journal of Logic, Language and Information</journal>
    <volume>14</volume>
    <number>3</number>
    <pages>281-288</pages>
    <year>2005</year>
    <url>http://www.sci.unich.it/~francesc/pubs/jlli05.pdf</url>
    <price>15</price>
  </article>
</biblio>
A simple DTD for the above document follows:
<!ELEMENT biblio (article | inproceedings)*>
<!ELEMENT article (author+,title,journal,volume,number,pages,year,url,price)>
<!ATTLIST article key ID #REQUIRED
                  cite IDREFS #IMPLIED>
<!ELEMENT inproceedings (author+,title,booktitle,pages,year,url)>
<!ATTLIST inproceedings key ID #REQUIRED
                        cite IDREFS #IMPLIED>
<!ELEMENT author (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
<!ELEMENT pages (#PCDATA)>
<!ELEMENT year (#PCDATA)>
<!ELEMENT journal (#PCDATA)>
<!ELEMENT volume (#PCDATA)>
<!ELEMENT number (#PCDATA)>
<!ELEMENT url (#PCDATA)>
<!ELEMENT price (#PCDATA)>
Comparison operators may be used to compare the string value of a node. For instance, the query:
/child::biblio/child::*[child::author = "E. Zimuel"]
retrieves all bibliography items that have an author named E. Zimuel. String literals must be enclosed in single or double quotes. The following query selects all articles published after the year 2000:
/child::biblio/child::article[child::year > 2000]
Since 2000 is not quoted, it is treated as a number, and the > sign is interpreted as the greater-than operator on numbers. If you write child::year > "2000", then 2000 is regarded as a string, and the > sign is interpreted as the greater-than operator on strings (lexicographic order). This string comparison is the XPath 2.0 behavior; in XPath 1.0 the operators <, <=, > and >= always convert both operands to numbers.
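The two comparison queries above can be reproduced with a short Python sketch. It assumes the standard library module xml.etree.ElementTree, which implements only a small subset of XPath: the equality filter is understood directly, while the greater-than test has to be written in plain Python. The embedded document is a trimmed copy of the bibliography excerpt, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bibliography excerpt (only the elements used here).
BIBLIO = """
<biblio>
  <inproceedings key="M4M05" cite="JLLI05">
    <author>M. Franceschet</author>
    <author>E. Zimuel</author>
    <year>2005</year>
  </inproceedings>
  <article key="JLLI05" cite="M4M05">
    <author>M. Franceschet</author>
    <author>B. ten Cate</author>
    <year>2005</year>
  </article>
</biblio>
"""
root = ET.fromstring(BIBLIO)  # root is the biblio element

# /child::biblio/child::*[child::author = "E. Zimuel"]
# ElementTree supports this equality filter in its XPath subset.
by_author = [e.get("key") for e in root.findall("./*[author='E. Zimuel']")]

# /child::biblio/child::article[child::year > 2000]
# ElementTree has no '>' operator, so the numeric test is done in Python.
after_2000 = [a.get("key") for a in root.findall("article")
              if float(a.findtext("year")) > 2000]

# Numeric and lexicographic order can disagree on the same values.
numeric = 999 > 2000             # compared as numbers
lexicographic = "999" > "2000"   # compared as strings: "9" sorts after "2"
```

The last two lines show why quoting matters: the same pair of values compares differently as numbers and as strings.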
XPath defines a number of functions that you may use in filters (the usual case) or as stand-alone expressions. Each function returns a value of one of four basic types: string, number, Boolean, or node set. The only relevant Boolean function is not(), which negates its argument. Other relevant functions follow:
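position() returns the position of the context node within the node set being filtered, and last() returns the size of that node set. The following query selects the first author of the entry with key M4M05:
/child::biblio/child::*[attribute::key = "M4M05"]/child::author[position() = 1]
while the next one selects its last author:
/child::biblio/child::*[attribute::key = "M4M05"]/child::author[position() = last()]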
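As a rough illustration, positional filters are among the few predicates that Python's xml.etree.ElementTree supports, written in the abbreviated forms [1] and [last()]. The sketch below assumes a trimmed copy of the bibliography excerpt; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bibliography excerpt (authors only).
BIBLIO = """
<biblio>
  <inproceedings key="M4M05" cite="JLLI05">
    <author>M. Franceschet</author>
    <author>E. Zimuel</author>
  </inproceedings>
  <article key="JLLI05" cite="M4M05">
    <author>M. Franceschet</author>
    <author>B. ten Cate</author>
  </article>
</biblio>
"""
root = ET.fromstring(BIBLIO)  # root is the biblio element

# .../child::author[position() = 1], abbreviated as author[1]
first_author = root.find("./*[@key='M4M05']/author[1]").text

# .../child::author[position() = last()], abbreviated as author[last()]
last_author = root.find("./*[@key='M4M05']/author[last()]").text
```

Here first_author holds the text of the first author element of the entry and last_author that of the final one.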
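count() returns the number of nodes in a node set. The following query counts the authors of the entry with key M4M05:
count(/child::biblio/child::*[attribute::key = "M4M05"]/child::author)
while the next one selects the conference papers with more than 3 authors: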
/child::biblio/child::inproceedings[count(child::author) > 3]
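The counting queries can be approximated in Python as follows. This is only a sketch: xml.etree.ElementTree does not implement count(), so the length of the list returned by findall() stands in for it. The embedded document is a trimmed copy of the excerpt and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bibliography excerpt (authors only).
BIBLIO = """
<biblio>
  <inproceedings key="M4M05" cite="JLLI05">
    <author>M. Franceschet</author>
    <author>E. Zimuel</author>
  </inproceedings>
  <article key="JLLI05" cite="M4M05">
    <author>M. Franceschet</author>
    <author>B. ten Cate</author>
  </article>
</biblio>
"""
root = ET.fromstring(BIBLIO)  # root is the biblio element

# count(/child::biblio/child::*[attribute::key = "M4M05"]/child::author)
author_count = len(root.findall("./*[@key='M4M05']/author"))

# /child::biblio/child::inproceedings[count(child::author) > 3]
crowded = [p.get("key") for p in root.findall("inproceedings")
           if len(p.findall("author")) > 3]
```

On this excerpt the second selection is empty, since the only conference paper has two authors.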
id() selects elements by their unique ID. The following query selects the entry with ID M4M05:
id("M4M05")
while the next one selects the entries that are cited by the entry with ID M4M05:
id(/child::biblio/child::*[attribute::key = "M4M05"]/attribute::cite)
You may also write the last one as:
/child::biblio/child::*[attribute::key = "M4M05"]/id(attribute::cite)
Notice that the id() function works only if you have properly declared the ID and IDREF attributes in the document DTD.
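The behavior of id() can be imitated in Python with a dictionary. This sketch makes one loud assumption: xml.etree.ElementTree does not read ID declarations from a DTD, so the code simply treats the key attribute as the ID and the cite attribute as a whitespace-separated list of references. The names ids and deref are hypothetical helpers, not part of any library.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bibliography excerpt (attributes only).
BIBLIO = """
<biblio>
  <inproceedings key="M4M05" cite="JLLI05"/>
  <article key="JLLI05" cite="M4M05"/>
</biblio>
"""
root = ET.fromstring(BIBLIO)  # root is the biblio element

# Assumption: 'key' plays the role of the ID attribute declared in the DTD.
ids = {e.get("key"): e for e in root.iter() if e.get("key") is not None}

def deref(idrefs):
    """Return the elements named by a whitespace-separated list of IDs."""
    return [ids[token] for token in idrefs.split() if token in ids]

# id("M4M05")
entry = ids["M4M05"]

# id(/child::biblio/child::*[attribute::key = "M4M05"]/attribute::cite)
cited = deref(entry.get("cite"))
cited_keys = [e.get("key") for e in cited]
```

Like id(), the deref helper silently skips tokens that match no element.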
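contains() tests whether its first argument contains its second argument as a substring. The following query selects the entries whose title contains the string logic:
/child::biblio/child::*[contains(child::title,"logic")]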
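sum() adds up the numeric values of the nodes in a node set. The following query computes the total price of the articles published in 2005:
sum(/child::biblio/child::article[year = "2005"]/price)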
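The last two queries can be sketched in Python as well. Since xml.etree.ElementTree offers neither contains() nor sum(), the sketch assumes Python's in operator and built-in sum() as stand-ins. The embedded document is a trimmed copy of the excerpt and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bibliography excerpt (titles, years, prices).
BIBLIO = """
<biblio>
  <inproceedings key="M4M05" cite="JLLI05">
    <title>Modal logic and navigational XPath: an experimental comparison</title>
    <year>2005</year>
  </inproceedings>
  <article key="JLLI05" cite="M4M05">
    <title>Guarded fragments with constants</title>
    <journal>Journal of Logic, Language and Information</journal>
    <year>2005</year>
    <price>15</price>
  </article>
</biblio>
"""
root = ET.fromstring(BIBLIO)  # root is the biblio element

# /child::biblio/child::*[contains(child::title,"logic")]
# The substring test is case-sensitive, as in XPath.
with_logic = [e.get("key") for e in root
              if "logic" in e.findtext("title", default="")]

# sum(/child::biblio/child::article[year = "2005"]/price)
total = sum(float(a.findtext("price"))
            for a in root.findall("article[year='2005']"))
```

Only the title is tested, so the article is not selected even though its journal name contains the word Logic.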