In XQuery you may define your own functions to be used in queries. Here is an example of a recursive function that returns the categories that are reachable from category0 through an arbitrary path in the category graph:
    declare function local:closure($input as node()*, $result as node()*) as node()*
    {
      let $current := //edge[@from = $input]/@to
      let $new := $current except $result
      let $all := ($result, $new)
      return
        if (exists($new))
        then ($new, local:closure($new, $all))
        else ()
    };

    for $c in local:closure(id("category0")/@id, ())
    return <category id="{$c}"/>
The above definition deserves some comment:
As a second example, consider a bibliography enhanced with citations. A simple XML instance including 6 publications follows:
    <citex>
      <article cites="6"><author>M. Franceschet</author></article>
      <article cites="5"><author>M. Franceschet</author></article>
      <article cites="4"><author>M. Franceschet</author></article>
      <article cites="3"><author>M. Franceschet</author></article>
      <article cites="2"><author>M. Franceschet</author></article>
      <article cites="1"><author>M. Franceschet</author></article>
    </citex>
The h index of a set of publications is the highest number h such that h publications in the set have each received at least h citations. For instance, the h index of the above set of publications is 3, since the 3 most cited publications received at least 3 citations each, while there are no 4 publications that received at least 4 citations each. The following XQuery function computes the h index for the set of publications of a given author:
    declare function local:h($doc as node()*, $author as xs:string) as xs:integer
    {
      (: publications of the author, by decreasing number of citations :)
      let $pub :=
        for $x in $doc/citex/*[author = $author]
        order by xs:integer($x/@cites) descending
        return $x
      (: one item for each rank $n whose publication has at least $n citations :)
      let $cites :=
        for $n in (1 to count($pub))
        where xs:integer($pub[$n]/@cites) >= $n
        return $pub[$n]/@cites
      return count($cites)
    };
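The rank-by-rank test is easier to follow on plain numbers. The following is a minimal sketch, not part of the original queries: the helper name local:hOfCounts is hypothetical, and it assumes that the citation counts are already available as a sequence of integers rather than as attributes of a document.

```xquery
(: Hypothetical helper: h index of a plain sequence of citation counts. :)
declare function local:hOfCounts($counts as xs:integer*) as xs:integer
{
  (: rank the counts in decreasing order :)
  let $ranked := for $c in $counts order by $c descending return $c
  (: count the ranks $n whose $n-th count is at least $n :)
  return count(
    for $n in 1 to count($ranked)
    where $ranked[$n] >= $n
    return $n
  )
};

(: ranks 1, 2, 3 pass (6 >= 1, 5 >= 2, 4 >= 3); rank 4 fails (3 < 4) :)
local:hOfCounts((6, 5, 4, 3, 2, 1))
(: evaluates to 3 :)
```

Counting the ranks that pass gives the h index only because the counts are sorted in decreasing order: once a rank fails the test, all the following ranks fail as well.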
The g index is a variant of the h index. Given a set of articles ranked in decreasing order of the number of citations that they received, the g index of the set is the largest number g such that the top g articles together received at least g² citations. For instance, the g index of the above set of publications is 4, because the sum of the citations of the top 4 publications is 18, which is larger than 4² = 16, while the sum of the citations of the top 5 publications is 20, which is lower than 5² = 25. The next XQuery function is a first attempt to compute the g index for the set of publications of a given author:
    declare function local:g($doc as node()*, $author as xs:string) as xs:integer
    {
      let $pub :=
        for $x in $doc/citex/*[author = $author]
        order by xs:integer($x/@cites) descending
        return $x
      let $sum := 0
      let $cites :=
        for $n in (1 to count($pub))
        (: meant as a running total of the citations :)
        let $sum := $sum + $pub[$n]/@cites
        where $sum >= ($n * $n)
        return $pub[$n]/@cites
      return count($cites)
    };
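The solution is not correct. XQuery is a functional language: a for loop is not a sequential loop that updates a mutable state, and its iterations are conceptually executed in parallel, independently of each other. The inner let clause does not update the outer variable $sum; it binds a new variable $sum, local to the current iteration, whose value is 0 plus the citations of the current publication. Hence, $sum does not contain the sum of the citations up to the current publication, but only the citations of the current publication. A correct but inefficient solution follows: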
    declare function local:g($doc as node()*, $author as xs:string) as xs:integer
    {
      let $pub :=
        for $x in $doc/citex/*[author = $author]
        order by xs:integer($x/@cites) descending
        return $x
      let $cites :=
        for $n in (1 to count($pub))
        (: citations of the top $n publications, recomputed for every $n :)
        let $seq := for $i in (1 to $n) return $pub[$i]/@cites
        let $sum := sum($seq)
        where $sum >= ($n * $n)
        return $pub[$n]/@cites
      return count($cites)
    };
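The test performed at each rank can be traced on the sample citation counts. The following is a sketch under stated assumptions: the helper local:gTrace and the rank element it returns are hypothetical names introduced only for illustration, and the counts are passed as a plain sequence of integers that is already sorted in decreasing order.

```xquery
(: Hypothetical helper: one <rank> element for each rank n. :)
declare function local:gTrace($ranked as xs:integer*) as element(rank)*
{
  for $n in 1 to count($ranked)
  (: citations of the top n entries, recomputed from scratch for every n :)
  let $sum := sum(subsequence($ranked, 1, $n))
  return <rank n="{$n}" sum="{$sum}" square="{$n * $n}" ok="{$sum >= $n * $n}"/>
};

(: counts assumed to be already sorted in decreasing order :)
local:gTrace((6, 5, 4, 3, 2, 1))
```

On the sample data the ranks from 1 to 4 pass the test (sums 6, 11, 15, 18 against squares 1, 4, 9, 16), while ranks 5 and 6 fail (20 < 25 and 21 < 36), which gives a g index of 4. Since every prefix sum is recomputed from scratch, the number of additions grows quadratically with the number of publications.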
An efficient recursive solution follows:
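    declare function local:gRec($cites as xs:integer*, $g as xs:integer, $sum as xs:integer) as xs:integer
    {
      (: running total including the next publication; empty if $cites is empty :)
      let $sumCites := $sum + $cites[1]
      let $new := remove($cites, 1)
      return
        if (exists($cites))
        then (
          if ($sumCites >= $g * $g)
          then local:gRec($new, $g + 1, $sumCites)
          else $g - 1
        )
        else ($g - 1)
    };

    declare function local:g($doc as node()*, $author as xs:string) as xs:integer
    {
      let $pub :=
        for $x in $doc/citex/*[author = $author]
        order by xs:integer($x/@cites) descending
        return $x
      (: the path step $pub/@cites would return the attributes in document
         order and lose the ranking, so the counts are extracted one by one :)
      let $cites := for $p in $pub return xs:integer($p/@cites)
      return local:gRec($cites, 1, 0)
    };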
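The recursion can be followed step by step on the sample citation counts. The following is an illustrative variant rather than the function given above: the name local:gStep is hypothetical, the counts are assumed to be a plain sequence of integers sorted in decreasing order, and the test for the empty sequence is moved before the arithmetic.

```xquery
(: Hypothetical variant of the recursive function.
   $g is the rank under test, $sum the citations of the top $g - 1 entries. :)
declare function local:gStep($cites as xs:integer*, $g as xs:integer, $sum as xs:integer) as xs:integer
{
  if (empty($cites))
  then $g - 1                                   (: all the ranks passed :)
  else
    let $sumCites := $sum + $cites[1]           (: citations of the top $g entries :)
    return
      if ($sumCites >= $g * $g)
      then local:gStep(subsequence($cites, 2), $g + 1, $sumCites)
      else $g - 1                               (: rank $g fails, the answer is $g - 1 :)
};

(: calls: g=1 (6 >= 1), g=2 (11 >= 4), g=3 (15 >= 9), g=4 (18 >= 16), g=5 (20 < 25) :)
local:gStep((6, 5, 4, 3, 2, 1), 1, 0)
(: evaluates to 4 :)
```

Each call adds a single citation count to the running total passed as a parameter, so the work is linear in the number of publications; the accumulator parameter plays the role that the variable $sum could not play inside the for loop.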
We use the defined local functions as follows:
    let $doc := /.
    let $author := "M. Franceschet"
    return
      <bibliometrics author="{$author}"
                     h="{local:h($doc, $author)}"
                     g="{local:g($doc, $author)}"/>