
User-defined functions

In XQuery you may define your own functions and use them in queries. Here is an example of a recursive function that returns the categories reachable from category0 through an arbitrary path in the category graph:

declare function local:closure($input as node()*, $result as node()*) as node()*
{
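   (: the context item is undefined inside a function body, so navigate from the root of the input nodes :)
   let $current := root($input[1])//edge[@from = $input]/@to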
   let $new := $current except $result
   let $all := ($result, $new)
   return
      if(exists($new))
      then ($new, local:closure($new,$all))
      else ()
};

for $c in local:closure(id("category0")/@id,())
return <category id="{$c}"/>

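The above definition deserves some comment. The function is declared in the predefined local namespace, and its parameters and result are typed as sequences of nodes (node()*). At each call, $current collects the @to attributes of the edges that leave the categories in $input, $new keeps those that are not yet in $result, and the recursion continues from $new until no new node is found. The accumulator $result is what guarantees termination when the category graph contains cycles.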
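
The except operator compares nodes by identity, not by value, so a category that is the target of several edges can appear more than once in the result of the function above. As a minimal sketch, assuming category identifiers handled as strings are enough, the same traversal can be written on values. The inline $graph element and the name local:reach are made up for illustration and are not part of the original document:

```xquery
(: hypothetical sample graph, containing the cycle category0 -> category1 -> category2 -> category0 :)
declare variable $graph :=
  <catgraph>
    <edge from="category0" to="category1"/>
    <edge from="category1" to="category2"/>
    <edge from="category2" to="category0"/>
    <edge from="category3" to="category0"/>
  </catgraph>;

(: $frontier: identifiers reached at the previous step; $seen: all identifiers reached so far :)
declare function local:reach($frontier as xs:string*, $seen as xs:string*) as xs:string*
{
  (: targets of the edges leaving the frontier, without duplicate values :)
  let $next := distinct-values($graph/edge[@from = $frontier]/@to/string())
  (: keep only the identifiers that were never reached before :)
  let $new := $next[not(. = $seen)]
  return
    if (exists($new))
    then ($new, local:reach($new, ($seen, $new)))
    else ()
};

for $c in local:reach("category0", ())
return <category id="{$c}"/>
```

On this sample the query yields category1, category2 and category0, in that order; category0 shows up because the cycle leads back to it, while category3 is never reached.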

As a second example, consider a bibliography enhanced with citations. A simple XML instance with six publications follows:

<citex>
<article cites="6">
  <author>M. Franceschet</author>
</article>
<article cites="5">
  <author>M. Franceschet</author>
</article>
<article cites="4">
  <author>M. Franceschet</author>
</article>
<article cites="3">
  <author>M. Franceschet</author>
</article>
<article cites="2">
  <author>M. Franceschet</author>
</article>
<article cites="1">
  <author>M. Franceschet</author>
</article>
</citex>

The h index of a set of publications is the largest number h such that h publications in the set have each received at least h citations. For instance, the h index of the above set of publications is 3, since 3 publications received at least 3 citations each, whereas only 3 publications (fewer than 4) received at least 4 citations. The following XQuery function computes the h index for the set of publications of a given author:

declare function local:h($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $cites := for $n in (1 to count($pub))
                where xs:integer($pub[$n]/@cites) >= $n
                return $pub[$n]/@cites
  return count($cites)
};
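
To see why counting the qualifying ranks gives the h index, the where condition can be evaluated by hand on the citation counts of the sample instance. The following standalone sketch uses a literal sequence in place of the document, an assumption made only to keep it self-contained:

```xquery
(: citation counts of the sample instance, already in descending order :)
let $cites := (6, 5, 4, 3, 2, 1)
for $n in 1 to count($cites)
(: counts is true when the publication at rank $n has at least $n citations :)
return <rank n="{$n}" cites="{$cites[$n]}" counts="{$cites[$n] >= $n}"/>
```

The condition holds for ranks 1 to 3 and fails from rank 4 on (3 citations against rank 4), so the count is 3.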

The g index is a variant of the h index. Given a set of articles ranked in decreasing order of the number of citations they received, the g index is the largest number g such that the top g articles together received at least g^2 citations. For instance, the g index of the above set of publications is 4, because the sum of the citations of the top 4 publications is 18, which is larger than 4^2 = 16, while the sum of the citations of the top 5 publications is 20, which is less than 5^2 = 25. The next XQuery function is a first attempt to compute the g index for the set of publications of a given author:

declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $sum := 0
  let $cites := for $n in (1 to count($pub))
                let $sum := $sum + $pub[$n]/@cites
                where $sum >= ($n * $n)
                return $pub[$n]/@cites
  return count($cites)
};
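
What this attempt computes can be checked on a reduced query, in which a literal sequence stands in for the citation counts (the values are illustrative):

```xquery
let $sum := 0
return
  for $c in (6, 5, 4)
  (: binds a new $sum for this iteration; the $sum on the right is the outer one :)
  let $sum := $sum + $c
  return $sum
```

The result is 6, 5, 4 and not the running totals 6, 11, 15: in every iteration the inner let starts again from the outer value 0.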

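The solution is not correct: XQuery variables are immutable, and the iterations of a for clause are evaluated independently of one another rather than as sequential updates of a shared state. The inner let does not update the outer $sum; it binds a new variable $sum, valid for the current iteration only, to 0 plus the citations of the current publication. Hence $sum never contains the sum of the citations up to the current publication. A correct but inefficient solution follows: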

declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
  let $cites := for $n in (1 to count($pub))
                let $seq := for $i in (1 to $n)
                            return $pub[$i]/@cites
                let $sum := sum($seq)
                where $sum >= ($n * $n)
                return $pub[$n]/@cites
  return count($cites)
};
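
The inefficiency lies in the inner loop, which recomputes the sum of the top n citation counts from scratch for every n, for a quadratic number of additions overall. The following sketch prints these sums for the sample instance, with a literal sequence in place of the document and subsequence used as a shorthand for the inner for (both are choices made here for brevity):

```xquery
(: citation counts of the sample instance, already in descending order :)
let $cites := (6, 5, 4, 3, 2, 1)
for $n in 1 to count($cites)
(: sum of the top $n counts, recomputed from the first item every time :)
let $sum := sum(subsequence($cites, 1, $n))
return <top n="{$n}" sum="{$sum}" square="{$n * $n}" ok="{$sum >= $n * $n}"/>
```

The sums are 6, 11, 15, 18, 20, 21 against the squares 1, 4, 9, 16, 25, 36, so the condition holds for n up to 4.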

An efficient recursive solution follows:

declare function local:gRec($cites as xs:integer*,
                          $g as xs:integer,
                          $sum as xs:integer) as xs:integer {
   let $sumCites := $sum + $cites[1]
   let $new := remove($cites, 1)
   return
      if (exists($cites))
      then (if ($sumCites >= $g * $g)
            then local:gRec($new, $g+1, $sumCites)
            else $g - 1
           )
      else ($g - 1)
};

declare function local:g($doc as node()*, $author as xs:string) as xs:integer
{
  let $pub := for $x in $doc/citex/*[author=$author]
              order by xs:integer($x/@cites) descending
              return $x
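  (: the path $pub/@cites would return the attributes in document order, losing the sort :)
  let $cites := for $p in $pub return xs:integer($p/@cites)
  return local:gRec($cites, 1, 0)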
};

We use the local functions defined above as follows:

let $doc := /.
let $author := "M. Franceschet"
return 
<bibliometrics author = "{$author}" 
                  h = "{local:h($doc, $author)}" 
                  g = "{local:g($doc, $author)}"/>
Caffè XML - Massimo Franceschet