Saxon extensions to the W3C XSLT/XQuery specifications

Extension Functions

A number of new extension functions are available:

A number of changes have been made to existing extension functions:

Extensions to Standard Functions

Various Saxon-specific options have been provided for the map:merge() function: saxon:on-duplicates provides a call-back function for handling duplicate keys; saxon:key-type allows Saxon to optimize the resulting map if the keys are all strings; saxon:final allows Saxon to optimize for the case where no further changes to the content of the map are likely; saxon:duplicates-error-code defines an error code to be used when duplicate keys are encountered.

Extension Instructions

A new instruction saxon:for-each-member is available; it iterates over the members of a supplied array.

For example:

<saxon:for-each-member select="[(1,2), (3,4)]" bind-to="m">
  <subtotal>{sum($m)}</subtotal>
</saxon:for-each-member>

outputs <subtotal>3</subtotal><subtotal>7</subtotal>

A new attribute xsl:mode/@saxon:as is available. Its value is a sequence type, with Saxon extensions permitted. This provides a default value for the as attribute of all template rules in the mode, unless they have their own required type defined in an as or saxon:as attribute. This is handy in cases where, for example, all the template rules in a particular mode are required to return a boolean value. It is particularly useful in cases where the return type is a complex tuple type, as this means that changes to the tuple type only need to be made in one place.

XPath Syntax Extensions

Tuple Types

The experimental syntax for declaring tuple types has been revised; the specification has been expanded and clarified, and the implementation is much more thoroughly tested. The colon separating field name and required type has been replaced with "as", and the notation ", *" can be used after the last field to indicate that the tuple type is extensible (that is, additional fields are permitted beyond those declared). In addition, field names that are not NCNames can now be used, written in quotes. For example, a tuple type may now be declared as tuple(key as xs:string, 'max size' as xs:numeric?, value, *).

Item Type Aliases

Where named type aliases are defined in XSLT or XQuery, the syntax for referring to them in an XPath SequenceType has changed from ~typename to type(typename).

Processing all Members of an Array

A new For-Member expression is available to process all the members of an array:

for member $x in EXPR (, $y in EXPR)* return EXPR

For example: for member $m in [(3,5,6), (8,12)] return sum($m) returns the sequence (14, 20).

This syntax is currently available only as a free-standing expression, not as a clause in a FLWOR expression; it has been designed, however, to allow integration into a FLWOR expression in the future.
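The evaluation of a For-Member expression can be modelled outside Saxon. The following Python sketch is illustrative only: an array is represented as a list whose members are tuples (sequences), and the helper name for_member is hypothetical. It assumes that a second "$y in EXPR" clause behaves like a nested For-Member expression, by analogy with the standard XPath for expression.

```python
def for_member(array, body):
    """Model of 'for member $m in ARRAY return BODY'.

    array: list of members, each member being a tuple (an XDM sequence).
    body:  function taking one member and returning a list (a sequence).
    The per-member results are concatenated, since XDM sequences are flat.
    """
    result = []
    for member in array:
        result.extend(body(member))
    return result


# for member $m in [(3,5,6), (8,12)] return sum($m)
totals = for_member([(3, 5, 6), (8, 12)], lambda m: [sum(m)])
print(totals)

# Two clauses, modelled as nesting (an assumption, see the text above):
# for member $x in [(1,2), (3)], $y in [(10), (20)] return sum($x) + sum($y)
pairs = for_member(
    [(1, 2), (3,)],
    lambda x: for_member([(10,), (20,)], lambda y: [sum(x) + sum(y)]),
)
print(pairs)
```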

Extensions to the Lookup Operator

Following a unary or binary "?" operator, Saxon now allows a string literal or variable reference to appear without surrounding parentheses: for example $map?"first name" or [1 to 10]?$i.

The otherwise Operator

The expression chapter[title='Introduction'] otherwise chapter[1] returns the chapter(s) whose title is "Introduction" if such a chapter exists, or the first chapter if not. More generally, A otherwise B returns A, unless it is an empty sequence, in which case it returns B.
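The rule "A otherwise B returns A, unless it is an empty sequence" can be modelled in a few lines of Python. This is a sketch of the semantics only, with sequences represented as lists; the function name and the sample data are hypothetical, and Python evaluates both operands eagerly, which says nothing about how Saxon evaluates the second operand.

```python
def otherwise(a, b):
    """Return sequence a unless it is empty, in which case return sequence b."""
    return a if len(a) > 0 else b


chapters = [
    {"title": "Preface"},
    {"title": "Introduction"},
    {"title": "Background"},
]

# chapter[title='Introduction'] otherwise chapter[1]
intro = [c for c in chapters if c["title"] == "Introduction"]
print(otherwise(intro, chapters[:1]))

# chapter[title='Glossary'] otherwise chapter[1]
glossary = [c for c in chapters if c["title"] == "Glossary"]
print(otherwise(glossary, chapters[:1]))
```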

KindTests

The syntax for the element() and attribute() KindTests is extended to allow constructs of the form element(*:div) or attribute(myns:*, myns:someType).

Abbreviated inline functions

The expression .{@x} (referred to as a "dot function") is an anonymous inline function that returns an attribute of the node passed as the function parameter. (This obsoletes the syntax fn{@x} introduced experimentally in Saxon 9.9, which is retained for the time being.) For example, sort(//employee, .{@lastname, @firstname}) returns employees sorted by last name then first name. A dot function has signature function(item()) as item()*: that is, it has arity one, and expects a single item as its argument.

The expression _{$1 + $2} (referred to as an "underscore function") is an anonymous inline function with two arguments, which may be of any type; it returns the sum of the two arguments. The numeric variable references $1 and $2 refer to the argument values based on their position in the argument list. The arity of the function is determined by the highest numeric variable reference. Arguments other than the last need not be referenced: for example, _{$2} is an arity-2 function that returns the value of the second argument, ignoring the first. The numeric argument references must appear directly in the body of the underscore function; they cannot be referenced in a nested inline function (whether or not this is itself an underscore function). For example, for-each-pair((1,2,3), (4,5,6), _{$1 + $2}) returns (5,7,9). The signature of the function in this example is function(item()*, item()*) as item()*.

As a special case, _{12} is a zero-arity function that always returns the value 12.
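The arity rule for underscore functions can be illustrated with a small Python sketch. It is a deliberate simplification, not Saxon's parser: the function name underscore_arity is hypothetical, and the sketch scans the body text naively, so it does not exclude references that appear inside string literals or nested inline functions.

```python
import re


def underscore_arity(body: str) -> int:
    """Return the arity implied by the highest numeric reference ($1, $2, ...).

    A body with no numeric references yields arity zero.
    """
    refs = [int(n) for n in re.findall(r"\$(\d+)", body)]
    return max(refs, default=0)


print(underscore_arity("$1 + $2"))  # body of _{$1 + $2}
print(underscore_arity("$2"))       # body of _{$2}: first argument is ignored
print(underscore_arity("12"))       # body of _{12}: no references at all

# for-each-pair((1,2,3), (4,5,6), _{$1 + $2}), with a lambda standing in
# for the underscore function
add = lambda a, b: a + b
sums = [add(x, y) for x, y in zip((1, 2, 3), (4, 5, 6))]
print(sums)
```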

XSLT extensions

The xsl:map instruction has acquired an extension attribute saxon:on-duplicates. The value is a user-supplied function which is called when map entries with duplicate keys are encountered. The function is supplied with the two conflicting values and can combine them to create a new value which is stored in the resulting map. This can be used to emulate all the options supplied on the map:merge function (use-first, use-last, combine, or fail) and to achieve other effects, for example delivering the sum, maximum, or string-join of the set of values associated with a single key, or selecting one of the values based on data such as a time-stamp attribute.
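The way a single call-back can emulate the various duplicate-handling policies can be sketched in Python. This models the idea only and is not Saxon code: the names build_map, use_first, use_last, combine, total and fail are hypothetical, values are represented as lists (sequences), and the sketch assumes the call-back receives the value already in the map first and the newly encountered value second.

```python
def build_map(entries, on_duplicates):
    """Build a map from (key, value) pairs, resolving duplicate keys
    by calling on_duplicates(existing_value, new_value)."""
    result = {}
    for key, value in entries:
        if key in result:
            result[key] = on_duplicates(result[key], value)
        else:
            result[key] = value
    return result


def fail(existing, new):
    raise ValueError("duplicate key")


use_first = lambda existing, new: existing
use_last = lambda existing, new: new
combine = lambda existing, new: existing + new            # sequence concatenation
total = lambda existing, new: [sum(existing) + sum(new)]  # sum of the values

entries = [("a", [1]), ("b", [2]), ("a", [3])]
print(build_map(entries, use_first))
print(build_map(entries, use_last))
print(build_map(entries, combine))
print(build_map(entries, total))
```

Passing fail as the call-back raises an error on the second "a" entry, which corresponds to the fail policy.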

Provided that Saxon syntax extensions are enabled, some extensions to XSLT 3.0 syntax are implemented:

Provided Saxon syntax extensions are enabled, a range of new match patterns can be defined, particularly suitable when processing JSON. These include (by example):

All of these may be followed by optional predicates. Default priorities are defined, designed to reflect the type hierarchy so that more selective types have higher priority than less selective types; the rules for allocating priorities, however, should be regarded as provisional.

The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, now allow a default namespace to be specified; this acts as a "default default" for the value of the xpath-default-namespace attribute, and it has no effect if an explicit value for xpath-default-namespace appears in the stylesheet.

The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, also allow you to specify that unprefixed element names used in path expressions and match patterns should match by local name only, ignoring the namespace entirely. For example, a path X/Y/Z is then treated as if it were written *:X/*:Y/*:Z. This option overrides the effect of xpath-default-namespace in cases where it applies.

A further option is to indicate that unprefixed element names should match elements either in the default namespace (as specified using xpath-default-namespace) or in no namespace. This option is provided primarily to reflect the XSLT/XPath variation defined in the HTML5 specification, which says that unprefixed element names should match elements in the XHTML namespace when the context item is a node "in an HTML DOM", and elements in no namespace otherwise. It is difficult to reproduce this rule precisely in Saxon, because it's not clear what being "in an HTML DOM" means when the data model is XDM (for example, does it apply to a node constructed by using xsl:copy-of applied to an HTML element?). But this option provides an approximation that ensures Saxon (in particular, Saxon-JS) will behave the same way as an XSLT 1.0 stylesheet running in the browser in most practical situations. The option applies when an NCName is used as a NameTest (on any axis other than the attribute and namespace axes) and to the element name in an ElementTest (that is, element(name)), whether in an XPath expression, a pattern, or a SequenceType. It does not apply to unprefixed type names or to names used in a SchemaElementTest (that is, schema-element(name)).
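The three ways of matching an unprefixed element name described above can be compared side by side in a short Python sketch. The policy labels used here are hypothetical names chosen for illustration and are not the option names accepted by XsltCompiler or the command line; an element in no namespace is represented by an empty namespace string.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def unprefixed_name_matches(name, element_ns, element_local, policy, default_ns=""):
    """Decide whether the unprefixed NameTest `name` matches an element.

    policy is one of (hypothetical labels):
      "default-namespace"          match only in the default namespace
      "any-namespace"              match by local name only, as if written *:name
      "default-namespace-or-none"  match in the default namespace or in no namespace
    """
    if name != element_local:
        return False
    if policy == "default-namespace":
        return element_ns == default_ns
    if policy == "any-namespace":
        return True
    if policy == "default-namespace-or-none":
        return element_ns in (default_ns, "")
    raise ValueError("unknown policy: " + policy)


# A <div> in the XHTML namespace and a <div> in no namespace,
# tested with xpath-default-namespace set to the XHTML namespace
for policy in ("default-namespace", "any-namespace", "default-namespace-or-none"):
    in_xhtml = unprefixed_name_matches("div", XHTML_NS, "div", policy, XHTML_NS)
    in_none = unprefixed_name_matches("div", "", "div", policy, XHTML_NS)
    print(policy, in_xhtml, in_none)
```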

XSD extensions

The saxon:flags attribute of the xs:pattern facet was implemented in Saxon 9.9 but not documented or tested; it has now been made official. It allows flags for regular expression matching to be specified, following the rules for XPath functions such as matches().

A new attribute xs:list/@saxon:separator is available for list types. The attribute holds a regular expression that is used to tokenize the supplied value, according to the rules of the XPath tokenize() function. For example, with <xs:list saxon:separator=","/>, a list of integers in an attribute can now be comma-separated rather than space-separated.
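The effect of the separator on tokenization can be approximated in Python. This is only a rough model: the function name tokenize_list is hypothetical, and Python's regular expression dialect differs from the XPath one in places, so the sketch should not be read as a specification of the tokenize() rules.

```python
import re


def tokenize_list(value, separator=None):
    """Split a list value into tokens.

    With no separator the value is split on whitespace, as for an ordinary
    xs:list; otherwise the separator is treated as a regular expression.
    An empty value yields no tokens.
    """
    if value == "":
        return []
    if separator is None:
        return value.split()
    return re.split(separator, value)


# A comma-separated list of integers, as allowed by saxon:separator=","
sizes = [int(token) for token in tokenize_list("10,20,30", ",")]
print(sizes)

# The same values in the conventional space-separated form
print([int(token) for token in tokenize_list("10 20 30")])
```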

A new constraining facet saxon:distinct is available for list types; if present (with the value "true"), the list must not contain duplicate values. If a list type has this property, then any types derived by restriction must also have the property.

A new constraining facet saxon:order is available for list types. The value may be ascending or descending. If present, the items in the list must be in ascending or descending order (which implies that they must all be comparable). If a list type has this property, then any types derived by restriction must have the same property.
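The checks implied by saxon:distinct and saxon:order can be sketched in Python as follows. The function name check_list is hypothetical. The sketch assumes that saxon:order on its own tolerates equal adjacent values, with duplicates being rejected only when saxon:distinct is also present; the text above does not settle that point.

```python
def check_list(items, distinct=False, order=None):
    """Check a list of comparable, hashable items against the two facets.

    distinct: if True, no value may occur more than once.
    order:    None, "ascending" or "descending".
    """
    if distinct and len(set(items)) != len(items):
        return False
    neighbours = list(zip(items, items[1:]))
    if order == "ascending":
        return all(a <= b for a, b in neighbours)
    if order == "descending":
        return all(a >= b for a, b in neighbours)
    return True


print(check_list([1, 2, 2, 3], order="ascending"))
print(check_list([1, 2, 2, 3], distinct=True, order="ascending"))
print(check_list([5, 3, 1], order="descending"))
print(check_list([3, 1, 2], order="ascending"))
```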

A new attribute xs:unique/xs:field/@saxon:order="ascending|descending" is available for use with uniqueness constraints. If the attribute is present on any field of a uniqueness constraint, it indicates that the fields participating in the constraint must not only have unique values, they must also be correctly ordered. If the attribute is present on at least one field of a uniqueness constraint, then it defaults to saxon:order="ascending" on any fields where it is not specified. Not only does this provide another kind of integrity constraint that is quite hard to express using assertions, it also provides much more efficient checking of uniqueness constraints in the case where it is known that the data is correctly sorted. This validation check can be usefully applied to a document that is to form the input of the xsl:merge instruction in XSLT 3.0.
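The efficiency gain comes from the fact that, in correctly sorted data, uniqueness can be verified by comparing each key only with its predecessor. The Python sketch below illustrates the idea; it is not Saxon's implementation, the function names are hypothetical, and it assumes that multi-field keys are compared field by field in declaration order.

```python
def field_compare(a, b, direction):
    """Three-way comparison of one field, reversed for descending order."""
    raw = (a > b) - (a < b)
    return raw if direction == "ascending" else -raw


def key_compare(prev, curr, directions):
    """Compare two key tuples field by field; negative means prev sorts first."""
    for a, b, direction in zip(prev, curr, directions):
        result = field_compare(a, b, direction)
        if result != 0:
            return result
    return 0


def unique_and_ordered(keys, directions):
    """Single pass: every key must sort strictly after its predecessor.

    Strict ordering implies uniqueness, so no set of previously seen
    keys needs to be maintained.
    """
    return all(key_compare(p, c, directions) < 0 for p, c in zip(keys, keys[1:]))


both_ascending = ("ascending", "ascending")
print(unique_and_ordered([("2020", 1), ("2020", 2), ("2021", 1)], both_ascending))
print(unique_and_ordered([("a", 1), ("a", 1)], both_ascending))  # duplicate key
print(unique_and_ordered([("b", 1), ("a", 1)], both_ascending))  # out of order
print(unique_and_ordered([("a", 2), ("a", 1), ("b", 5)], ("ascending", "descending")))
```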