Functions, operators, and data types for XPath 2.0

The new XSLT function type-available has been implemented. See W3C Bugzilla entry 3165 for details.

The component extraction functions on xs:duration have been changed to return normalized values. For example seconds-from-duration(xs:duration('PT90S')) now returns 30 rather than 90. See W3C Bugzilla entry 3369 for details of the specification change.
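The arithmetic behind the normalized result can be sketched in Java as follows. This is an illustration only, not Saxon's code: java.time.Duration stands in for xs:duration (it has no months component), the class and method names are hypothetical, and only whole seconds are handled.

```java
import java.time.Duration;

// Hypothetical helper illustrating normalized component extraction.
// Not part of Saxon; java.time.Duration is only a stand-in for xs:duration.
class DurationComponents {

    // Minutes component after normalization: whole minutes, modulo 60.
    static long minutesComponent(Duration d) {
        return (d.getSeconds() / 60) % 60;
    }

    // Seconds component after normalization: whole seconds, modulo 60.
    static long secondsComponent(Duration d) {
        return d.getSeconds() % 60;
    }

    public static void main(String[] args) {
        Duration d = Duration.parse("PT90S");
        // PT90S normalizes to 1 minute 30 seconds.
        System.out.println(minutesComponent(d) + " min, " + secondsComponent(d) + " s");
    }
}
```

The unnormalized behaviour would simply have reported the seconds field as written, which is why PT90S previously produced 90.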

On JDK 1.5, and on the .NET platform, a case-blind regular expression (that is, the "i" flag) is now expanded by Saxon into the equivalent case-sensitive regular expression, rather than relying on the implementation of case-blind searching in the underlying regex engine. This change has been made because it was found that on both the Java and .NET platforms, case-blind searching did not behave according to the detailed rules defined for XPath, and in fact it was impossible to determine what the detailed rules were. On JDK 1.4, however, the underlying implementation is still used, which gives slightly different behaviour in corner cases involving character negation and subtraction.
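The general idea of expanding a case-blind expression into a case-sensitive one can be sketched as below. This is a deliberately simplified illustration, not Saxon's algorithm: the class and method names are hypothetical, it uses Java's simple per-character case mappings rather than the XPath rules, and it rejects escapes, character classes and braces instead of trying to expand them.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: rewrite a regex so that it matches case-blind
// without relying on the engine's own "i" flag. Not Saxon's implementation.
class CaseBlindExpansion {

    static String expand(String regex) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            // Escapes, character classes and {…} constructs need real parsing;
            // this sketch refuses them rather than expanding them wrongly.
            if (c == '\\' || c == '[' || c == '{') {
                throw new UnsupportedOperationException(
                        "construct not handled by this sketch: " + c);
            }
            char lower = Character.toLowerCase(c);
            char upper = Character.toUpperCase(c);
            if (lower != upper) {
                // A cased letter becomes a class containing its case variants.
                out.append('[').append(lower).append(upper);
                if (c != lower && c != upper) {
                    out.append(c); // e.g. a titlecase letter
                }
                out.append(']');
            } else {
                // Digits, punctuation and metacharacters pass through unchanged.
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String expanded = expand("ab+c");
        System.out.println(expanded);
        System.out.println(Pattern.matches(expanded, "ABBc"));
    }
}
```

The corner cases mentioned above, negated classes and class subtraction, are exactly the constructs this sketch declines to handle, which gives some idea of why they are where implementations diverge.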

The document-uri() function now returns a document URI in a wider range of circumstances; for example, it returns a value for documents returned by the collection() function. This has been achieved by ignoring the requirement that doc(document-uri($D)) should always return $D. For example, if all the documents returned by a user-written CollectionURIResolver have the SystemId "dummy", then this string will be returned as the value of the document-uri property, even though the call doc('dummy') will not deliver the same (or any) document.

A consequence of this change is that the data model now distinguishes more carefully between the document URI and the base URI. For a document node, the document URI is represented internally by the SystemId property of the DocumentInfo object, while the base URI is represented by the BaseURI property. The document node of a temporary tree has a base URI, but it has no document URI. This means that the SystemId of a temporary document should now be null, whereas it was previously set to the value of the base URI.

If a URI passed to the document() function contains a fragment identifier, the fragment identifier must now be a valid NCName; otherwise a recoverable error XTRE1160 is reported. Note that Saxon (as permitted by the letter of the specification) treats any document successfully retrieved by the document() function as having media type application/xml for the purpose of interpreting any fragment identifier.
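The kind of check involved can be sketched in Java as follows. This is only an approximation for illustration: the class and method names are hypothetical, and the pattern covers ASCII names only, whereas the real NCName production also admits a wide range of non-ASCII characters.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical sketch of validating a fragment identifier as an NCName.
// ASCII-only approximation; not Saxon's implementation.
class FragmentCheck {

    // Starts with a letter or underscore; continues with letters, digits,
    // '.', '_' or '-'. Colons are excluded, as NCName requires.
    private static final Pattern ASCII_NCNAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9._-]*");

    // Returns true if the URI reference has no fragment, or if its fragment
    // looks like an (ASCII) NCName.
    static boolean hasAcceptableFragment(String uriReference) {
        String fragment = URI.create(uriReference).getFragment();
        return fragment == null || ASCII_NCNAME.matcher(fragment).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasAcceptableFragment("doc.xml#chapter1"));
        System.out.println(hasAcceptableFragment("doc.xml#1abc"));
    }
}
```

A fragment such as xpointer(/a) fails this check because of the parentheses and slash, so under the new rule it would be reported as XTRE1160 rather than interpreted.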

The CollectionURIResolver interface has changed to allow collections to be stable (in the sense that retrieving the same collection twice will return the same nodes). The items in the SequenceIterator returned by the resolver may now be xs:anyURI values, as an alternative to returning nodes directly as NodeInfo instances. If xs:anyURI values are returned, each document is obtained by supplying its URI to the doc() function. This means Saxon first checks whether a document with this URI has already been loaded, and if so, returns that document. If there is no such document, it is loaded by calling the registered URIResolver. This mechanism is backwards compatible with existing user-written CollectionURIResolver implementations.
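The lookup-then-load behaviour that makes a collection stable can be sketched as a simple cache. This is not Saxon's code: StableDocumentPool, its loader function and the type parameter D are hypothetical stand-ins for the internal document pool, the registered URIResolver and the document node type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of "check for an already-loaded document, else load it".
// D stands in for whatever type represents a document node.
class StableDocumentPool<D> {

    private final Map<String, D> loaded = new HashMap<>();
    private final Function<String, D> loader; // stand-in for the URIResolver

    StableDocumentPool(Function<String, D> loader) {
        this.loader = loader;
    }

    // Returns the document already held for this URI, or loads it once
    // and remembers it for subsequent requests.
    D resolve(String uri) {
        return loaded.computeIfAbsent(uri, loader);
    }

    public static void main(String[] args) {
        StableDocumentPool<Object> pool =
                new StableDocumentPool<>(uri -> new Object());
        // Both calls yield the same object, so the collection is stable.
        System.out.println(pool.resolve("a.xml") == pool.resolve("a.xml"));
    }
}
```

Because the same URI always yields the same object once loaded, a collection expressed as a list of URIs yields the same nodes each time it is retrieved.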