Saxon extensions to the W3C XSLT/XQuery specifications

Extension Functions

A number of new extension functions are available:

saxon:analyze-uri($string)

Parses a supplied URI, returning a map containing its various components (such as the scheme, port, path, fragment, and query).

saxon:characters($string)

Splits the supplied string into a sequence of single-character strings.

saxon:EQName($string)

Given a string in the form of a lexical EQName, returns the corresponding xs:QName value.

saxon:in-scope-namespaces($element)

Returns the in-scope namespaces of an element, in the form of a map from prefixes to URIs.

saxon:index-where($sequence, $predicate)

Returns the integer positions of items in the sequence that match the supplied predicate function.

saxon:is-NaN($atomic)

Returns true if the supplied argument is the xs:float or xs:double value NaN.

saxon:items-after($input, $predicate)

Returns all items in the input sequence that follow the first item that matches the predicate.

saxon:items-before($input, $predicate)

Returns all items in the input sequence that precede the first item that matches the predicate.

saxon:items-from($input, $predicate)

Returns all items in the input sequence starting with the first item that matches the predicate.

saxon:items-until($input, $predicate)

Returns all items in the input sequence up to and including the first item that matches the predicate.

saxon:parse-dateTime($string, $format)

Parses dates and times in non-standard formats. The first argument is a string containing the value to be parsed; the second is a pattern giving the expected format, in the notation used by the Java DateTimeFormatter class. The result is an xs:dateTime, xs:date, xs:time, xs:gYear, xs:gYearMonth, xs:gMonth, xs:gMonthDay, or xs:gDay depending on the components that were actually present in the input value.

saxon:replace-with()

Similar to fn:replace, but instead of supplying a replacement string, the caller supplies a callback function that computes the replacement string from the matched substring.

saxon:tunnel-params()

Returns a map containing the values of all tunnel parameters (whether or not they are declared in the current template). The keys in the map are QNames (the parameter names); the corresponding values are arbitrary XDM values.
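
As a rough illustration of the sequence-oriented functions in this list, the following stylesheet is a minimal sketch rather than an official example. It assumes a Saxon edition in which these extension functions are available, with the saxon prefix bound to http://saxon.sf.net/. The input values, the predicate, and the result element names are invented for the example. The results noted in the comments are those implied by the function descriptions above.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all"
    expand-text="yes">

  <!-- Illustrative data and predicate; not taken from the Saxon documentation -->
  <xsl:variable name="nums" select="(10, 20, 30, 40, 50)"/>
  <xsl:variable name="at-least-30" select="function($n) { $n ge 30 }"/>

  <xsl:template name="xsl:initial-template">
    <results>
      <!-- single-character strings: a, b, c -->
      <characters>{saxon:characters('abc')}</characters>
      <!-- positions (not values) of the matching items: 3, 4, 5 -->
      <index-where>{saxon:index-where($nums, $at-least-30)}</index-where>
      <!-- items preceding the first match: 10, 20 -->
      <items-before>{saxon:items-before($nums, $at-least-30)}</items-before>
      <!-- items up to and including the first match: 10, 20, 30 -->
      <items-until>{saxon:items-until($nums, $at-least-30)}</items-until>
      <!-- items starting with the first match: 30, 40, 50 -->
      <items-from>{saxon:items-from($nums, $at-least-30)}</items-from>
      <!-- items following the first match: 40, 50 -->
      <items-after>{saxon:items-after($nums, $at-least-30)}</items-after>
    </results>
  </xsl:template>
</xsl:stylesheet>
```

The stylesheet has no source document, so it would be run from its initial template (for example with the -it option of net.sf.saxon.Transform).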

A number of changes have been made to existing extension functions:

The saxon:evaluate-node() function is dropped. The functionality is available using xsl:evaluate in XSLT 3.0.
The extension function saxon:get-pseudo-attribute() now parses the supplied input much more rigorously, applying the rules found in the W3C specification, and raising an error for invalid syntax that was previously allowed through.

Extensions to Standard Functions

Various Saxon-specific options have been provided for the map:merge() function: saxon:on-duplicates provides a call-back function for handling duplicate keys; saxon:key-type allows Saxon to optimize the resulting map if the keys are all strings; saxon:final allows Saxon to optimize for the case where no further changes to the content of the map are likely; saxon:duplicates-error-code defines an error code to be used when duplicate keys are encountered.

Extension Instructions

A new instruction saxon:for-each-member is available; it iterates over the members of a supplied array.

For example:

<saxon:for-each-member select="[(1,2), (3,4)]" bind-to="m">
  <subtotal>{sum($m)}</subtotal>
</saxon:for-each-member>

outputs <subtotal>3</subtotal><subtotal>7</subtotal>

A new attribute xsl:mode/@saxon:as is available. Its value is a sequence type, with Saxon extensions permitted. This provides a default value for the as attribute of all template rules in the mode, unless they have their own required type defined in an as or saxon:as attribute. This is handy in cases where, for example, all the template rules in a particular mode are required to return a boolean value. It is particularly useful in cases where the return type is a complex tuple type, as this means that changes to the tuple type only need to be made in one place.
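
The following is a minimal sketch of the boolean case mentioned above. The mode name is-valid, the element names price and quantity, and the result element are hypothetical, chosen only for illustration. It assumes a Saxon edition that supports extension attributes.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all">

  <!-- Every template rule in this mode is required to return xs:boolean,
       unless it declares its own as or saxon:as attribute -->
  <xsl:mode name="is-valid" saxon:as="xs:boolean"/>

  <!-- Neither rule repeats as="xs:boolean"; the mode supplies it -->
  <xsl:template match="price" mode="is-valid">
    <xsl:sequence select="number(.) ge 0"/>
  </xsl:template>

  <xsl:template match="quantity" mode="is-valid">
    <xsl:sequence select=". castable as xs:positiveInteger"/>
  </xsl:template>

  <!-- Collect one boolean per checked element and report the overall result -->
  <xsl:template match="/">
    <xsl:variable name="checks" as="xs:boolean*">
      <xsl:apply-templates select="//price | //quantity" mode="is-valid"/>
    </xsl:variable>
    <result valid="{every $c in $checks satisfies $c}"/>
  </xsl:template>
</xsl:stylesheet>
```

If the required type later changes, for example to a tuple type, only the xsl:mode declaration needs to be edited.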

XPath Syntax Extensions

Tuple Types

The experimental syntax for declaring tuple types has been revised; the specification has been expanded and clarified, and the implementation is much more thoroughly tested. The colon separating field name and required type has been replaced with "as"; and the notation ", *" can be used after the last field to indicate that the tuple type is extensible (that is, additional fields are permitted beyond those declared). In addition, field names that are not NCNames can now be used, written in quotes. For example, a tuple type may now be declared as tuple(key as xs:string, 'max size' as xs:numeric?, value, *).
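
As a sketch of how such a type might be used, the stylesheet below declares a function parameter with the tuple type from the example. The function f:describe, its namespace, and the sample maps are hypothetical. It assumes that Saxon syntax extensions are enabled, and that a field whose type permits an empty sequence (here 'max size') may be absent from the map.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="#all">

  <!-- f:describe is a hypothetical helper that accepts any map conforming to the tuple type -->
  <xsl:function name="f:describe" as="xs:string">
    <xsl:param name="entry"
               as="tuple(key as xs:string, 'max size' as xs:numeric?, value, *)"/>
    <!-- Standard XPath 3.1 lookups; a missing 'max size' falls back to 'unbounded' -->
    <xsl:sequence select="$entry?key || ' (max size: '
                          || ($entry?('max size'), 'unbounded')[1] || ')'"/>
  </xsl:function>

  <xsl:template name="xsl:initial-template">
    <descriptions>
      <d>
        <xsl:value-of select="f:describe(map{'key': 'a', 'max size': 10, 'value': 'x'})"/>
      </d>
      <!-- The extra 'note' entry is accepted because the type ends with ", *";
           the absence of 'max size' relies on the assumption stated above -->
      <d>
        <xsl:value-of select="f:describe(map{'key': 'b', 'value': 'y', 'note': 'extra'})"/>
      </d>
    </descriptions>
  </xsl:template>
</xsl:stylesheet>
```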

Item Type Aliases

Where named type aliases are defined in XSLT or XQuery, the syntax for referring to them in an XPath SequenceType has changed from ~typename to type(typename).

Processing all Members of an Array

A new for-member expression is available to process all the members of an array:

for member $x in EXPR (, $y in EXPR)* return EXPR

For example: for member $m in [(3,5,6), (8,12)] return sum($m) returns the sequence (14, 20).

This syntax is currently available only as a free-standing expression, not as a clause in a FLWOR expression; it has been designed, however, to allow integration into a FLWOR expression in the future.

Extensions to the Lookup Operator

Following a unary or binary "?" operator, Saxon now allows a string literal or variable reference to appear without surrounding parentheses: for example $map?"first name" or [1 to 10]?$i.

The `otherwise` Operator

The expression chapter[title='Introduction'] otherwise chapter[1] returns the chapter(s) whose title is "Introduction" if such a chapter exists, or the first chapter if not. More generally, A otherwise B returns A, unless it is an empty sequence, in which case it returns B.
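
One common use is supplying a default value. The sketch below assumes that Saxon syntax extensions are enabled; the element name item and the attribute name label are hypothetical.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all">

  <!-- Uses the label attribute when it is present, and the string 'untitled'
       when @label selects nothing (an empty sequence) -->
  <xsl:template match="item">
    <entry label="{@label otherwise 'untitled'}">
      <xsl:apply-templates/>
    </entry>
  </xsl:template>
</xsl:stylesheet>
```

The same effect can be had in standard XPath with if (@label) then @label else 'untitled', at the cost of writing the selection twice.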

KindTests

The syntax for the element() and attribute() KindTests is extended to allow constructs of the form element(*:div) or attribute(myns:*, myns:someType).

Abbreviated inline functions

The expression .{@x} (referred to as a "dot function") is an anonymous inline function that returns an attribute of the node passed as the function parameter. (This obsoletes the syntax fn{@x} introduced experimentally in Saxon 9.9, which is retained for the time being.) For example, sort(//employee, .{@lastname, @firstname}) returns employees sorted by last name then first name. A dot function has signature function(item()) as item()*: that is, it has arity one, and expects a single item as its argument.

The expression _{$1 + $2} (referred to as an "underscore function") is an anonymous inline function with two arguments, which may be of any type; it returns the sum of the two arguments. The numeric variable references $1 and $2 refer to the argument values based on their position in the argument list. The arity of the function is determined from the highest numeric variable reference. Arguments other than the last need not be referenced: for example, _{$2} is an arity-2 function that returns the value of the second argument, ignoring the first. The numeric argument references must appear directly in the body of the underscore function; they cannot be referenced in a nested inline function (whether or not this is itself an underscore function). For example, for-each-pair((1,2,3), (4,5,6), _{$1 + $2}) returns (5,7,9). The signature of the function in this example is function(item()*, item()*) as item()*.

As a special case, _{12} is a zero-arity function that always returns the value 12.

XSLT extensions

The xsl:map instruction has acquired an extension attribute saxon:on-duplicates. The value is a user-supplied function which is called when map entries with duplicate keys are encountered. The function is supplied with the two conflicting values and can combine them to create a new value which is stored in the resulting map. This can be used to emulate all the options supplied on the map:merge function (use-first, use-last, combine, or fail) and to achieve other effects, for example delivering the sum, maximum, or string-join of the set of values associated with a single key, or selecting one of the values based on data such as a time-stamp attribute.
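
The "sum" case could be sketched as follows. This assumes that the attribute value is written as an XPath expression evaluating to a function of two arguments, which is how the description above reads but should be checked against the Saxon reference documentation. The sales and sale elements and their region and amount attributes are hypothetical.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="#all">

  <xsl:template match="sales">
    <!-- One map entry per region; amounts for a repeated region are added together
         by the on-duplicates callback instead of raising a duplicate-key error -->
    <xsl:variable name="totals" as="map(xs:string, xs:decimal)">
      <xsl:map saxon:on-duplicates="function($a, $b) { $a + $b }">
        <xsl:for-each select="sale">
          <xsl:map-entry key="string(@region)" select="xs:decimal(@amount)"/>
        </xsl:for-each>
      </xsl:map>
    </xsl:variable>

    <!-- Report the totals in a predictable order -->
    <totals>
      <xsl:for-each select="map:keys($totals)">
        <xsl:sort select="."/>
        <region name="{.}" total="{$totals(.)}"/>
      </xsl:for-each>
    </totals>
  </xsl:template>
</xsl:stylesheet>
```

Replacing the callback body with $a would emulate use-first, and $b would emulate use-last.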

Provided that Saxon syntax extensions are enabled, some extensions to XSLT 3.0 syntax are implemented:

The xsl:when and xsl:otherwise elements can have a select attribute in place of a contained sequence constructor.
The xsl:if element can have a then attribute in place of a contained sequence constructor, and it can also have an else attribute.

For example, this function found in a W3C specification:

<xsl:function name="f:product">
  <xsl:param name="seq"/>
  <xsl:choose>
    <xsl:when test="empty($seq)">
      <xsl:sequence select="1"/>
    </xsl:when>
    <xsl:otherwise>
      <xsl:sequence select="head($seq) * f:product(tail($seq))"/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>

can now be written:

<xsl:function name="f:product">
  <xsl:param name="seq"/>
  <xsl:if test="empty($seq)" then="1" else="head($seq) * f:product(tail($seq))"/>
</xsl:function>
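
The same function could also be sketched with xsl:choose, using the select attribute on xsl:when and xsl:otherwise. The stylesheet below is a minimal illustration that assumes Saxon syntax extensions are enabled; the namespace bound to the f prefix and the initial template are invented for the example.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:f="http://example.com/functions"
    exclude-result-prefixes="#all">

  <!-- Product of a sequence of numbers; the product of the empty sequence is 1 -->
  <xsl:function name="f:product">
    <xsl:param name="seq"/>
    <xsl:choose>
      <!-- select replaces the nested xsl:sequence instruction -->
      <xsl:when test="empty($seq)" select="1"/>
      <xsl:otherwise select="head($seq) * f:product(tail($seq))"/>
    </xsl:choose>
  </xsl:function>

  <xsl:template name="xsl:initial-template">
    <!-- 2 * 3 * 4 = 24 -->
    <product><xsl:value-of select="f:product((2, 3, 4))"/></product>
  </xsl:template>
</xsl:stylesheet>
```

The xsl:choose form remains the natural choice when there are more than two branches.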

Provided Saxon syntax extensions are enabled, a range of new match patterns can be defined, particularly suitable when processing JSON. These include (by example):

atomic(xs:integer): matches an atomic value of a given atomic type
union(xs:integer, xs:date): matches an atomic value of a given union type
map(xs:integer, element()*): matches a map of a given type
array(xs:integer*): matches an array of a given type
tuple(first, middle, last, *): matches a map conforming to a given tuple type

All of these may be followed by optional predicates. Default priorities are defined, designed to reflect the type hierarchy so that more selective types have higher priority than less selective types; the rules for allocating priorities, however, should be regarded as provisional.
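
A minimal sketch of how these patterns might be used when applying templates to non-node items is shown below. It assumes Saxon syntax extensions are enabled. The input sequence and the result element names are hypothetical, and the patterns follow the forms listed above.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    expand-text="yes">

  <!-- Apply templates to an integer, an array of integers, and a map -->
  <xsl:template name="xsl:initial-template">
    <out>
      <xsl:apply-templates
          select="42, [1, 2, 3], map{'first': 'Ada', 'last': 'Lovelace'}"/>
    </out>
  </xsl:template>

  <!-- Intended to match the atomic value 42 -->
  <xsl:template match="atomic(xs:integer)">
    <int>{.}</int>
  </xsl:template>

  <!-- Intended to match the array; ?* selects all of its members -->
  <xsl:template match="array(xs:integer*)">
    <ints sum="{sum(?*)}"/>
  </xsl:template>

  <!-- Intended to match the map, treated as an extensible tuple with first and last fields -->
  <xsl:template match="tuple(first, last, *)">
    <name>{?first} {?last}</name>
  </xsl:template>
</xsl:stylesheet>
```

Because the default priorities are described as provisional, an explicit priority attribute may be a safer choice where two such patterns could match the same item.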

The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, now allow a default namespace to be specified; this acts as a "default default" for the value of the xpath-default-namespace attribute, and it has no effect if an explicit value for xpath-default-namespace appears in the stylesheet.

The s9api XsltCompiler interface, and the net.sf.saxon.Transform command line, also allow you to specify that unprefixed element names used in path expressions and match patterns should match by local name only, ignoring the namespace entirely. For example, a path X/Y/Z is then treated as if it were written *:X/*:Y/*:Z. This option overrides the effect of xpath-default-namespace in cases where it applies.

A further option is to indicate that unprefixed element names should match elements either in the default namespace (as specified using xpath-default-namespace) or in no namespace. This option is provided primarily to reflect the XSLT/XPath variation defined in the HTML5 specification, which says that unprefixed element names should match elements in the XHTML namespace when the context item is a node "in an HTML DOM", and elements in no namespace otherwise. It is difficult to reproduce this rule precisely in Saxon, because it's not clear what being "in an HTML DOM" means when the data model is XDM (for example, does it apply to a node constructed by using xsl:copy-of applied to an HTML element?). But this option provides an approximation that ensures Saxon (in particular, Saxon-JS) will behave the same way as an XSLT 1.0 stylesheet running in the browser in most practical situations. The option applies when an NCName is used as a NameTest (on any axis other than the attribute and namespace axes) and to the element name in an ElementTest (that is, element(name)), whether in an XPath expression, a pattern, or a SequenceType. It does not apply to unprefixed type names or to names used in a SchemaElementTest (that is, schema-element(name)).