In XQuery you can define your own functions and use them in queries. Here is an example of a recursive function that returns the categories reachable from category0 through an arbitrary path in the category graph:
declare function local:closure($input as node()*, $result as node()*) as node()*
{
  (: a function body has no context item, so start from the root of $input :)
  let $current := root($input[1])//edge[@from = $input]/@to
  let $new := $current except $result
  let $all := ($result, $new)
  return
    if (exists($new))
    then ($new, local:closure($new, $all))
    else ()
};

for $c in local:closure(id("category0")/@id, ())
return <category id="{$c}"/>
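The above definition deserves some comment. The parameter $input holds the nodes reached in the previous step, while $result accumulates all the nodes found so far. Each call follows the edges that leave $input, keeps the targets that are not yet in $result, and recurses on them; the recursion stops as soon as no new target is found. Note that except compares nodes by identity, not by value.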
As a second example, consider a bibliography enhanced with citations. A simple XML instance including 6 publications follows:
<citex>
  <article cites="6"> <author>M. Franceschet</author> </article>
  <article cites="5"> <author>M. Franceschet</author> </article>
  <article cites="4"> <author>M. Franceschet</author> </article>
  <article cites="3"> <author>M. Franceschet</author> </article>
  <article cites="2"> <author>M. Franceschet</author> </article>
  <article cites="1"> <author>M. Franceschet</author> </article>
</citex>
The h index of a set of publications is the largest number h such that h publications in the set have each received at least h citations. For instance, the h index of the above set of publications is 3, since three publications received at least 3 citations each, while there are not four publications with at least 4 citations each. The following XQuery function computes the h index for the set of publications of a given author:
declare function local:h($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $cites := for $n in (1 to count($pub))
                where xs:integer($pub[$n]/@cites) >= $n
                return $pub[$n]/@cites
  return count($cites)
};
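The comparison that drives local:h can be checked by hand on the sample instance. The following standalone sketch hard-codes the six citation counts (the literal sequence and the rank element name are illustrative assumptions, not part of the original query) and reports, for each rank, whether the publication at that rank is counted:

```xquery
(: sample citation counts, already sorted in descending order :)
let $cites := (6, 5, 4, 3, 2, 1)
for $n in 1 to count($cites)
(: a publication is counted when its citations are at least its rank :)
return <rank n="{$n}" cites="{$cites[$n]}" counted="{$cites[$n] >= $n}"/>
```

Only ranks 1 to 3 satisfy the condition, which agrees with the h index of 3 stated above.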
The g index is a variant of the h index. Given a set of articles ranked in decreasing order of the number of citations they received, the g index is the largest number g such that the top g articles together received at least g^2 citations. For instance, the g index of the above set of publications is 4, because the sum of the citations of the top 4 publications is 18, which is at least 4^2 = 16, while the sum of the citations of the top 5 publications is 20, which is lower than 5^2 = 25. The next XQuery function is a first attempt to compute the g index for the set of publications of a given author:
declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $sum := 0
  let $cites := for $n in (1 to count($pub))
                let $sum := $sum + $pub[$n]/@cites
                where $sum >= ($n * $n)
                return $pub[$n]/@cites
  return count($cites)
};
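The solution is not correct. XQuery is a functional language: a variable is bound once and cannot be updated, and the iterations of a for loop are independent of one another rather than steps of a sequential loop that share state. The inner let does not accumulate into $sum; in each iteration it binds a new $sum that shadows the outer one, whose value is always 0. Hence, the variable $sum does not contain the sum of the citations up to the current publication; it contains only the citations of the current publication. A correct but inefficient solution, which recomputes the sum of the top n citations from scratch for every n, follows: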
declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $cites := for $n in (1 to count($pub))
                let $seq := for $i in (1 to $n)
                            return $pub[$i]/@cites
                let $sum := sum($seq)
                where $sum >= ($n * $n)
                return $pub[$n]/@cites
  return count($cites)
};
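The difference between the first attempt and this solution comes down to how let behaves inside a for. A minimal sketch, using a hypothetical sequence of three numbers rather than the bibliography:

```xquery
(: hypothetical data: three numbers instead of citation counts :)
let $sum := 0
for $n in (1, 2, 3)
(: this let does not update the outer $sum; it binds a new $sum that shadows it :)
let $sum := $sum + $n
return $sum
```

The query returns 1, 2, 3 rather than the running totals 1, 3, 6, because every iteration computes its own $sum from the outer value 0. The solution above avoids the problem by rebuilding the sum of the first $n items inside each iteration, at the price of a quadratic number of additions.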
An efficient recursive solution follows:
declare function local:gRec($cites as xs:integer*,
                            $g as xs:integer,
                            $sum as xs:integer) as xs:integer
{
  let $sumCites := $sum + $cites[1]
  let $new := remove($cites, 1)
  return
    if (exists($cites))
    then (if ($sumCites >= $g * $g)
          then local:gRec($new, $g + 1, $sumCites)
          else $g - 1)
    else ($g - 1)
};
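declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  (: $pub/@cites would return the attributes in document order, losing the sort :)
  let $cites := for $p in $pub return xs:integer($p/@cites)
  return local:gRec($cites, 1, 0)
};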
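As a sanity check, the recursion can be traced by hand on the sample instance, where the sorted citation counts are 6, 5, 4, 3, 2, 1. Writing s_g for the sum of the top g counts (a symbol introduced here only for the trace), the values compared at each call are:

```latex
\begin{array}{c|ccccc}
g           & 1 & 2  & 3  & 4  & 5  \\ \hline
s_g         & 6 & 11 & 15 & 18 & 20 \\
g^2         & 1 & 4  & 9  & 16 & 25 \\
s_g \ge g^2 & \text{yes} & \text{yes} & \text{yes} & \text{yes} & \text{no}
\end{array}
```

The test first fails at g = 5, so the function returns 5 - 1 = 4, matching the g index computed earlier for this instance. Each citation count is added exactly once, which is why this version is linear in the number of publications.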
We use the defined local functions as follows:
let $doc := /.
let $author := "M. Franceschet"
return
  <bibliometrics author = "{$author}"
                 h = "{local:h($doc, $author)}"
                 g = "{local:g($doc, $author)}"/>