Saxonica.com

saxon:for-each-group()

saxon:for-each-group($population as item()*, $key as jt:net.sf.saxon.expr.UserFunctionCall, $action as jt:net.sf.saxon.expr.UserFunctionCall) ==> item()*

This function is available only in Saxon-SA.

The action of this function is analogous to the xsl:for-each-group instruction (with a group-by attribute) in XSLT 2.0. It is provided to give XQuery users access to grouping facilities comparable to those provided in XSLT 2.0. (The function is available in XSLT also, but is unnecessary in that environment.)

The first argument defines the population, a collection of items to be grouped. These may be any items (nodes or atomic values). The second argument is a function (created using saxon:function) that is called once for each item in the population, to calculate a grouping key for that item. The third argument is another function (also created using saxon:function) that is called once to process each group of items from the population.

Two items in the population are in the same group if they have the same value for the grouping key. Strings are compared using the default collation. If the value of the grouping key is a sequence of more than one item, then an item in the population may appear in more than one group; if it is an empty sequence, then the item will appear in no group.
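
The multi-valued case can be illustrated with a minimal sketch. The data, the $books variable, and the functions f:get-tags and f:put-group are hypothetical and are not part of Saxon. The key function returns one key per space-separated tag, so by the rule above a book with two tags should appear in two groups, and a book with no tags attribute (an empty key) should appear in none.

```xquery
declare namespace f="f.uri";

(: Hypothetical sample data: a book may carry several space-separated tags :)
declare variable $books :=
  <books>
    <book title="A" tags="xml xquery"/>
    <book title="B" tags="xml"/>
    <book title="C"/>
  </books>;

(: Key function: returns zero, one, or several grouping keys for one book :)
declare function f:get-tags ($b) { tokenize($b/@tags, '\s+') };

(: Action function: called once per group; reports the group's size and members :)
declare function f:put-group ($group) {
    <group size="{count($group)}">
       {for $b in $group return <book>{ $b/@title }</book>}
    </group>
};

<out>
    {saxon:for-each-group($books/book,
                         saxon:function('f:get-tags', 1),
                         saxon:function('f:put-group', 1))}
</out>
```

The action function in this sketch receives only the group, so it reports the members rather than the key that defined the group.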

The order in which the groups are processed is subject to change: at present it is the same as the default order in xsl:for-each-group, namely order of first appearance. There is no way to change this order; if the groups need to be sorted then it is best to sort the output afterwards. Each group is passed as an argument to a call on the action function supplied as the third argument; the values returned by these calls form the result of the saxon:for-each-group call. The items within each group are in their original order (population order).
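
Sorting the output afterwards might look like the following sketch. It assumes the context item is a document whose outermost element contains city elements with name and country attributes; the functions f:get-country and f:put-country are illustrative user-written functions, not part of Saxon. The elements returned by the action function are reordered with an ordinary order by clause, so the countries should be output in alphabetical order rather than in order of first appearance.

```xquery
declare namespace f="f.uri";

(: Key function: the grouping key for a city is its country :)
declare function f:get-country ($c) { $c/@country };

(: Action function: builds one country element per group :)
declare function f:put-country ($group) {
    <country name="{$group[1]/@country}" size="{count($group)}"/>
};

<out>
    {for $c in saxon:for-each-group(/*/city,
                                   saxon:function('f:get-country', 1),
                                   saxon:function('f:put-country', 1))
     (: sort the results of the action calls, not the groups themselves :)
     order by $c/@name
     return $c}
</out>
```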

The following example groups cities by country. It takes input like this:

<doc>
<city name="Paris" country="France"/>
<city name="Madrid" country="Spain"/>
<city name="Vienna" country="Austria"/>
<city name="Barcelona" country="Spain"/>
<city name="Salzburg" country="Austria"/>
<city name="Bonn" country="Germany"/>
<city name="Lyon" country="France"/>
<city name="Hannover" country="Germany"/>
<city name="Calais" country="France"/>
<city name="Berlin" country="Germany"/>
</doc>

and produces output like this:

<out>
   <country leading="Paris" size="3" name="France">
      <city name="Calais"/>
      <city name="Lyon"/>
      <city name="Paris"/>
   </country>
   <country leading="Madrid" size="2" name="Spain">
      <city name="Barcelona"/>
      <city name="Madrid"/>
   </country>
   <country leading="Vienna" size="2" name="Austria">
      <city name="Salzburg"/>
      <city name="Vienna"/>
   </country>
   <country leading="Bonn" size="3" name="Germany">
      <city name="Berlin"/>
      <city name="Bonn"/>
      <city name="Hannover"/>
   </country>
</out>

The XQuery code to achieve this is:

declare namespace f="f.uri";

(: Test saxon:for-each-group extension function :)

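(: Key function: returns the grouping key (the country) for one city :)
declare function f:get-country ($c) { $c/@country };

(: Action function: called once per group; builds one country element
   listing the cities of the group in name order :)
declare function f:put-country ($group) {
    <country name="{$group[1]/@country}" leading="{$group[1]/@name}" size="{count($group)}">
       {for $g in $group
           order by $g/@name
           return <city>{ $g/@name }</city>
       }
    </country>
};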

<out>
    {saxon:for-each-group(/*/city, 
                         saxon:function('f:get-country', 1), 
                         saxon:function('f:put-country', 1))}
</out>
