Saxon extensions to the W3C XSLT/XQuery specifications

Changes to existing Saxon extensions and new extensions in Saxon 12 are outlined below. These include extension functions and instructions in the saxon namespace, as well as experimental implementations of the proposed version 4.0 extensions to XPath, XSLT, and XQuery. A W3C Community Group is working on these proposals; for more information see the QT4CG Specifications, and for documentation of the Saxon implementations see Experimental 4.0 extensions.

New Extension Functions

A number of proposed 4.0 functions are implemented (see New functions): fn:all(), fn:all-different(), fn:all-equal(), fn:characters(), fn:contains-sequence(), fn:ends-with-sequence(), fn:expanded-QName(), fn:foot(), fn:highest(), fn:identity(), fn:index-where(), fn:in-scope-namespaces(), fn:intersperse(), fn:is-NaN(), fn:items-after(), fn:items-at(), fn:items-before(), fn:items-ending-where(), fn:items-starting-where(), fn:iterate-while(), fn:lowest(), fn:op(), fn:parcel()*, fn:parse-html(), fn:parse-QName(), fn:parts()*, fn:replicate(), fn:some(), fn:starts-with-sequence(), fn:trunk(), and fn:unparcel()*. Specifications of these functions can be found in the QT4CG draft specification.

The proposed 4.0 functions map:build() and map:filter() are implemented.

The proposed 4.0 functions array:empty(), array:exists(), array:foot(), array:index-where(), and array:trunk() are implemented.

From Saxon 12.1:

From Saxon 12.2:

From Saxon 12.3:

For 12.3, there have been changes to the keywords used for arguments to built-in functions, tracking the draft specifications which have adjusted the keywords to give greater consistency.

From Saxon 12.4:

Dropped or Changed Extension Functions

The Saxon extension functions saxon:evaluate(), saxon:eval(), and saxon:expression() were dropped in Saxon 12.0. (The function saxon:evaluate-node() was dropped in Saxon 10.) The same effect (and more) can be achieved using the standard XSLT 3.0 instruction xsl:evaluate. From Saxon 12.4, the three functions are reinstated for the benefit of XQuery users, for whom the xsl:evaluate instruction is not available.

The Saxon extension function saxon:in-scope-namespaces() has been aligned with the proposed 4.0 function fn:in-scope-namespaces() in that the returned map now always includes an entry for the XML namespace.

The extension function saxon:parse-html() is now a synonym for fn:parse-html(), a new function proposed for 4.0. The function has been reimplemented on SaxonJ to use the validator.nu library in place of TagSoup. In SaxonCS, it has been reimplemented to use AngleSharp in place of HtmlAgilityPack. In both cases this gives much closer conformance to the HTML5 parsing algorithm; the function has also been much more thoroughly tested. However, a handful of the 1300 new tests are currently giving unexplained results (some of these may turn out to be correct), so the function remains a work in progress.

XPath 4.0 Syntax Extensions

Some experimental syntax extensions intended for 4.0 have been dropped. "Dot Functions" (written as .{.+1} or fn{.+1}) are now written as ->{.+1}. "Underscore functions" (written as _{$1 + $2}) are dropped entirely (write instead ->($p1, $p2){$p1 + $p2}).

Union NodeTests are implemented. This feature allows steps such as ancestor::(div1|div2|div3), @(id|name), and following-sibling::(comment()|processing-instruction()).

Static function calls may supply arguments by keyword as well as by position.
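As a minimal sketch of keyword arguments, assuming the experimental 4.0 syntax is enabled and that a keyword is written as the parameter name followed by := (the form used in the draft specification). The function local:area and its parameter names are hypothetical, chosen only for illustration:

```xquery
(: hypothetical user-defined function, for illustration only :)
declare function local:area($width as xs:decimal, $height as xs:decimal) as xs:decimal {
  $width * $height
};

(: arguments supplied by position :)
local:area(3, 4),

(: the same call with arguments supplied by keyword, in a different order :)
local:area(height := 4, width := 3)
```

Under these assumptions both calls should bind the same values to $width and $height.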

From Saxon 12.1, XPath 4.0 string templates are implemented. Example: let $message := `{$day} of {$month}, {$year}`.

From Saxon 12.1, XPath "braced if" expressions are implemented. Examples:

  1. if ($condition) {<x>It's true</x>}
  2. if ($condition) {<x>It's true</x>} else {<x>It's a lie</x>}
  3. if ($condition) {<x>It's true</x>} else if ($polite) {<x>He mis-spoke</x>} else {<x>It's a lie</x>}

From 12.2, numeric integer literals can be written in hex (0xFFFF0000) or binary (0b101010), and underscores can be used as separators (1_000_000).
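A short sketch of the new literal forms, assuming the 4.0 syntax extensions are enabled. Each line compares a new-style literal with its ordinary decimal equivalent:

```xquery
(: hexadecimal FF is 255 :)
0xFF eq 255,

(: binary 101010 is 42 :)
0b101010 eq 42,

(: underscores are separators only and do not change the value :)
1_000_000 eq 1000000
```

Each comparison should evaluate to true.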

From 12.3, the non-ASCII characters × (xD7) and ÷ (xF7) can be used in place of * and div to represent the multiplication and division operators. The ＜ and ＞ characters (xFF1C and xFF1E) can be used in place of < and > to represent the less-than and greater-than operators; these characters can also be used in place of < and > in the compound tokens <=, >=, <<, >>, =>, ->, and =!>.

From 12.3, the mapping arrow operator =!> is implemented. This is similar to the XPath 3.1 arrow operator (for example $x => abs()), but it applies the function on the right-hand side to each item delivered by the left-hand side individually. For example (-2 to +2) =!> abs() returns (2, 1, 0, 1, 2). The effect is similar to writing (-2 to +2) ! abs(.), but the operator precedences make it easier to construct a pipeline of operations without parentheses.
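The difference between the two arrow operators can be sketched as follows, assuming the 4.0 syntax extensions are enabled. The sample strings are arbitrary:

```xquery
(: => passes the whole sequence as the first argument, so count() is called once :)
("apple", "fig", "kiwi") => count(),

(: =!> calls the function once for each item on the left-hand side :)
("apple", "fig", "kiwi") =!> string-length(),

(: a pipeline of per-item steps, written without parentheses :)
(-2 to +2) =!> abs() =!> string()
```

Following the description above, the first expression counts three items, the second should deliver the lengths (5, 3, 4), and the third should deliver the absolute values converted to strings one item at a time.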

From 12.3, an inline function following the arrow operator no longer needs to be parenthesized: the expression $in => (function($x){$x+1})() can now be written $in => function($x){$x+1}(). It is also possible to use a focus function: $in => function{.+1}()

From 12.3, the function coercion rules are extended to allow a supplied function item to have lower arity than that implied by the signature of the required type. For example, map:for-each() expects a function with two arguments, which are set respectively to the key and the value of an entry in the map. But if you are only interested in the key, you can supply a function of arity 1, and your function will be called omitting the second argument. Similarly, for a function that expects a predicate (a function of arity one), you can now supply the value fn:true#0 which has arity zero: this has the effect that the predicate will always be true.
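A minimal sketch of the extended coercion rule, assuming the 4.0 extensions are enabled; the map contents are invented for illustration. Because the order of entries in a map is implementation-dependent, so is the order of the results:

```xquery
let $stock := map { "apples": 12, "pears": 7 }
return (
  (: arity 2, as the signature requires: receives both key and value :)
  map:for-each($stock, function($k, $v) { $k || "=" || $v }),

  (: arity 1, accepted from 12.3: called with the key only, the value is omitted :)
  map:for-each($stock, function($k) { upper-case($k) })
)
```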

From 12.3, the syntax for "focus functions" is changed from ->{EXPR} to function{EXPR}. At the same time, the abbreviated syntax for inline functions with named parameters (known as lambda expressions) is changed from ->($x, $y){$x + $y} to ($x, $y)->{$x + $y}. In both cases, Saxon continues to support the older format for the time being.
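The two 12.3 forms can be sketched side by side with standard higher-order functions, assuming the 4.0 syntax extensions are enabled and that the abbreviated forms are accepted in argument position as shown:

```xquery
(: focus function in the 12.3 syntax; the older form was ->{. * 2} :)
for-each(1 to 3, function{. * 2}),

(: lambda expression in the 12.3 syntax; the older form was ->($acc, $x){$acc + $x} :)
fold-left(1 to 4, 0, ($acc, $x)->{$acc + $x})
```

Under these assumptions the first call should double each item, giving (2, 4, 6), and the second should sum the sequence, giving 10.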

From 12.3, for and let expressions can use the keyword repeatedly, rather than using a comma: for $i in 1 to 10 for $j in 1 to 10 return $i * $j, or let $i := 10 let $j := 20 return $i + $j. This syntax was already valid in XQuery.

The implementation of "for member" expressions has changed in 12.3: as a result of this, any SEF files using the construct will need to be recompiled.

From 12.3, Switch and Typeswitch expressions in XQuery allow curly braces, for example: switch($x){case 1: return "a" case 2: return "b" default: return "?"}

From 12.4, element and attribute tests can use the syntax element(A|B) and attribute(A|B) to accept a union of names. Wildcards are also allowed, for example element(p:*|q:*)
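As a small sketch of a union of names in an element test, assuming the 4.0 syntax extensions are enabled. The sample document is invented for illustration:

```xquery
let $doc :=
  <doc>
    <h1>Title</h1>
    <p>Text</p>
    <h2>Sub</h2>
  </doc>
(: select child elements named either h1 or h2, skipping the p element :)
return $doc/element(h1|h2) ! string()
```

This should select the two heading elements in document order and return their string values.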

From 12.4, casting to locally-defined union types and enumeration types is supported.

From 12.4, an extensible record test with no named fields is allowed: record(*).

From 12.4, there has been a change to the for member construct. To do two nested iterations over arrays $A and $B, you now need to write for member $a in $A, member $b in $B return .... Previously the second member keyword was not needed; it was assumed to apply to subsequent clauses.
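The 12.4 form of a nested iteration over two arrays can be sketched as follows, assuming the 4.0 syntax extensions are enabled. The array contents are arbitrary:

```xquery
(: the member keyword is repeated for the second array :)
for member $a in [1, 2], member $b in [10, 20]
return $a + $b
```

The inner clause varies fastest, so the result should be (11, 21, 12, 22).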

From 12.4, the atomic types xs:hexBinary and xs:base64Binary are mutually comparable; and promotion on function calls and variable binding now ensures that either type can be supplied where the other is required.
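A minimal sketch of the new comparability rule, assuming Saxon 12.4 or later with the 4.0 extensions enabled. Both literals encode the same five bytes, the ASCII characters of "Hello":

```xquery
(: 48 65 6C 6C 6F in hexadecimal is the same octet sequence as SGVsbG8= in base64 :)
xs:hexBinary("48656C6C6F") eq xs:base64Binary("SGVsbG8=")
```

Under the rule described above the comparison should return true, where earlier releases would treat the two types as not comparable.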

XQuery 4.0 extensions

See also XPath 4.0 extensions.

Function declarations may declare some parameters as optional, with a default value. From Saxon 12.1, the default value must be either a constant (for example, (), 0, false(), or "") or the context item expression (a single dot). This restriction is imposed pending clarification of the specification.
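A sketch of an optional parameter, assuming the draft syntax in which the default follows the parameter's type after := and that the 4.0 extensions are enabled. The function local:greet is hypothetical:

```xquery
(: hypothetical function: the second parameter is optional :)
declare function local:greet(
  $name as xs:string,
  $greeting as xs:string := "Hello"   (: a constant default, as the restriction requires :)
) as xs:string {
  $greeting || ", " || $name
};

(: one argument: the default greeting is used :)
local:greet("Ada"),

(: two arguments: the default is overridden :)
local:greet("Ada", "Welcome")
```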

From 12.3, XQuery switch expressions are generalized so that each case may match multiple values, for example case 0 to 9 return "single".
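The generalized case clause can be sketched as follows, assuming the 4.0 extensions are enabled. The value of $n and the result strings are arbitrary:

```xquery
let $n := 42
return
  switch ($n)
    (: each case operand is a sequence; a match on any item selects the branch :)
    case 0 to 9 return "single"
    case 10 to 99 return "double"
    default return "many"
```

With $n bound to 42, the second case should match.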

From 12.3, some of the constructs in a FLWOR window clause become optional, for example it is no longer necessary to say when true().

From 12.3, variables (both global variable declarations and local variables bound in let, for [member] and group by clauses) are subject to type coercion in the same way as function parameters. For example it becomes possible to say let $x as xs:double := 1 because the integer 1 is now coerced to an xs:double.
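A minimal sketch of the coercion, assuming the 4.0 extensions are enabled:

```xquery
(: the integer literal 1 is coerced to xs:double by the variable's declared type :)
let $x as xs:double := 1
return $x instance of xs:double
```

This should return true; without the coercion rule the binding would fail with a type error.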

From 12.4, true() and false() are allowed as annotation values. (In previous XQuery versions, only string literals and numeric literals were allowed.)

From 12.4, the comparand expression in a switch expression can be omitted, and defaults to true().
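With the comparand omitted, a switch expression can act as a chain of conditions. The following is a sketch only, assuming the omitted comparand is written as an empty pair of parentheses and that the 4.0 extensions are enabled. The variable and result strings are invented:

```xquery
let $temp := 31
return
  switch ()
    (: each case operand is compared with the default comparand, true() :)
    case ($temp lt 0) return "freezing"
    case ($temp gt 30) return "hot"
    default return "mild"
```

Under these assumptions the first case whose condition is true is selected, here the second.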

From 12.4, the XQuery syntax for declaring named item types is changed to match the syntax in the draft specification (declare item-type my:type as union(x, y); in place of declare type my:type = union(x, y)).

XSLT 4.0 extensions

If XSLT 4.0 extensions are enabled, two new (experimental) values are available for xsl:mode/@on-no-match, namely shallow-copy-all and shallow-skip-all. These have the same effect as shallow-copy and shallow-skip respectively, except when processing maps and arrays.

The shallow-skip-all option has been dropped from the spec, and is removed in Saxon 12.4.

For shallow-copy-all:

The experimental @array and @map attributes of xsl:for-each, xsl:for-each-group, and xsl:iterate added in Saxon 11 are dropped. In their place, use select="array:members($array)" or select="map:key-value-pairs($map)".

From 12.2, the xsl:array/@use attribute is implemented. The @composite attribute is retained for the time being: composite="yes" is treated as equivalent to use="?value".

The proposed instruction xsl:match added in Saxon 11 has been dropped.

Function declarations (xsl:function) may declare some parameters as optional, with a default value. From Saxon 12.1, the default value must be either a constant (for example, (), 0, false(), or "") or the context item expression (a single dot). This restriction is imposed pending clarification of the specification.

From 12.2 the syntax for type patterns changes so that match="type(T)" may specify any item type as T (including, for example, an atomic type name: match="type(xs:integer)"). The syntax match="atomic(T)" is dropped from the draft specification, and match="type(T)" should be used instead. For now Saxon still allows match="atomic(T)" but will produce a warning; it will be removed in a future Saxon release. The priority rules for type patterns are not yet fully aligned with the evolving specification: matching of atomic types works correctly according to the type hierarchy, but if union types and/or record types are used in match patterns (for example match="record(lat, long)") they should be disambiguated using explicit priorities.

From 12.3 xsl:matching-substring and xsl:non-matching-substring may take a select attribute in place of a contained sequence constructor.

From 12.3 enclosing modes are implemented: xsl:template rules can be defined as a child of xsl:mode, and within those template rules, any xsl:apply-templates instruction defaults to the enclosing mode.

From 12.4, the xsl:accumulator-rule/@saxon:capture extension is incorporated as a standard XSLT 4.0 feature under the name xsl:accumulator-rule/@capture. The Saxon attribute name remains available for the time being as a synonym.