Data Model Changes

When the source document is supplied as a pre-built tree (in any format) and Saxon strips whitespace text nodes as requested by the stylesheet, the stripping now takes account of any xml:space attributes present in the tree. Specifically, whitespace text nodes are preserved if xml:space="preserve" is in effect for the containing element. This check can be expensive, but it is required for conformance. For that reason, when supplying pre-built trees as input (whether as DOM, JDOM, or XOM trees, or as native Saxon trees), it is best not to use xsl:strip-space in the stylesheet.
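The rule can be illustrated with a small, self-contained Java sketch over a plain DOM tree. This is not Saxon's implementation: the class and method names (XmlSpaceStripSketch, preserveInEffect, strip, stripAndCount) are hypothetical, and the sketch assumes the stylesheet asks for whitespace to be stripped from every element, as with xsl:strip-space elements="*". The walk up the ancestor chain for each whitespace text node also hints at why the check can be costly on large trees.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: a hypothetical helper, not part of Saxon.
class XmlSpaceStripSketch {

    /** Parses XML text into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** True if the string consists only of XML whitespace characters. */
    static boolean isWhitespaceOnly(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
                return false;
            }
        }
        return true;
    }

    /** True if the nearest ancestor element carrying xml:space has the value "preserve". */
    static boolean preserveInEffect(Node node) {
        for (Node n = node; n != null; n = n.getParentNode()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                Element e = (Element) n;
                if (e.hasAttributeNS(XMLConstants.XML_NS_URI, "space")) {
                    return "preserve".equals(e.getAttributeNS(XMLConstants.XML_NS_URI, "space"));
                }
            }
        }
        return false;
    }

    /** Removes whitespace-only text nodes unless xml:space="preserve" applies; returns the number removed. */
    static int strip(Node parent) {
        int removed = 0;
        Node child = parent.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && isWhitespaceOnly(child.getNodeValue())
                    && !preserveInEffect(child)) {
                parent.removeChild(child);
                removed++;
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                removed += strip(child);
            }
            child = next;
        }
        return removed;
    }

    /** Convenience wrapper: parse, strip, and report how many nodes were removed. */
    static int stripAndCount(String xml) {
        return strip(parse(xml).getDocumentElement());
    }

    public static void main(String[] args) {
        String xml = "<doc> <a xml:space=\"preserve\"> </a> <b> </b> </doc>";
        System.out.println(stripAndCount(xml) + " whitespace text nodes removed");
    }
}
```

In the sample document the whitespace inside the a element survives because of its xml:space="preserve" attribute, while the whitespace inside b and directly inside doc is removed.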

When the source document is supplied as a DOM or JDOM tree, multiple adjacent text and CDATA nodes are now mapped to a single text node in the XPath model. If the XPath text node is passed to a Java extension function, the extension function sees the first node in the underlying sequence. This change has not yet been made for XOM trees.
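The mapping can be pictured with the following Java sketch, which groups adjacent DOM text and CDATA children into runs. It is a simplified illustration rather than Saxon's wrapper code: the class AdjacentTextSketch and its methods are hypothetical names. Each run corresponds to one XPath text node whose string value is the concatenation of the run, and the first DOM node of the run is the one an extension function would see.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: a hypothetical helper, not part of Saxon.
class AdjacentTextSketch {

    /** Parses XML text and returns the document element; CDATA sections stay as separate nodes. */
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setCoalescing(false); // keep CDATA sections distinct from text nodes
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** True for DOM text nodes and CDATA sections. */
    static boolean isTextLike(Node n) {
        short type = n.getNodeType();
        return type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE;
    }

    /** Groups the children of an element into runs of adjacent text and CDATA nodes. */
    static List<List<Node>> textRuns(Element parent) {
        List<List<Node>> runs = new ArrayList<>();
        List<Node> current = null;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (isTextLike(c)) {
                if (current == null) {
                    current = new ArrayList<>();
                    runs.add(current);
                }
                current.add(c);
            } else {
                current = null; // any other node kind ends the run
            }
        }
        return runs;
    }

    /** String value of each logical (XPath-style) text node under the root element. */
    static List<String> logicalTextValues(String xml) {
        List<String> values = new ArrayList<>();
        for (List<Node> run : textRuns(parseRoot(xml))) {
            StringBuilder sb = new StringBuilder();
            for (Node n : run) {
                sb.append(n.getNodeValue());
            }
            values.add(sb.toString());
        }
        return values;
    }

    /** Value of the first underlying DOM node of each run. */
    static List<String> firstUnderlyingValues(String xml) {
        List<String> values = new ArrayList<>();
        for (List<Node> run : textRuns(parseRoot(xml))) {
            values.add(run.get(0).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) {
        String xml = "<a>foo<![CDATA[bar]]>baz<b/>qux</a>";
        System.out.println(logicalTextValues(xml));
        System.out.println(firstUnderlyingValues(xml));
    }
}
```

The practical consequence for extension functions is that the underlying node may hold only part of the string value seen in XPath, so code that needs the full text should also read the following text and CDATA siblings.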

Saxon accepts URIs of the form "document.xml#id" where "id" is the value of an attribute defined in the DTD as being of type ID. It now also accepts such URIs where the fragment identifier is the value of an xml:id attribute.
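As a rough illustration of the xml:id case, the Java sketch below splits the fragment identifier from a URI and searches a DOM tree for an element whose xml:id attribute matches it. The class XmlIdFragmentSketch and its methods are hypothetical, and the document content is passed in as a string instead of being fetched from the URI. The sketch does not cover DTD-declared ID attributes or the whitespace normalization of xml:id values.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: a hypothetical helper, not part of Saxon.
class XmlIdFragmentSketch {

    /** Parses XML text into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Depth-first search for the first element whose xml:id attribute equals the given id. */
    static Element findByXmlId(Node node, String id) {
        if (node.getNodeType() == Node.ELEMENT_NODE) {
            Element e = (Element) node;
            if (id.equals(e.getAttributeNS(XMLConstants.XML_NS_URI, "id"))) {
                return e;
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            Element hit = findByXmlId(c, id);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /**
     * Returns the local name of the element selected by the fragment identifier
     * of the URI, or null if there is no fragment or no matching xml:id.
     */
    static String selectedElementName(String uri, String xml) {
        String fragment = URI.create(uri).getFragment();
        if (fragment == null || fragment.isEmpty()) {
            return null;
        }
        Element hit = findByXmlId(parse(xml), fragment);
        return hit == null ? null : hit.getLocalName();
    }

    public static void main(String[] args) {
        String xml = "<doc><chapter xml:id=\"ch1\"/><section xml:id=\"ch2\"/></doc>";
        System.out.println(selectedElementName("document.xml#ch2", xml));
    }
}
```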

Where a stylesheet is embedded in a source document, or a schema is embedded within a stylesheet, the base URI of the embedded document was previously taken to be the base URI of the containing document. It is now taken to be the base URI of the relevant element, which means that any xml:base attribute is taken into account.
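The difference between the two behaviours can be sketched in Java by computing an element's base URI from the xml:base attributes on its ancestor chain. This is an assumption-laden illustration, not Saxon code: the class XmlBaseSketch, its methods, the element names, and the example.com URIs are all hypothetical.

```java
import java.io.StringReader;
import java.net.URI;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Illustrative only: a hypothetical helper, not part of Saxon.
class XmlBaseSketch {

    /** Parses XML text into a namespace-aware DOM tree. */
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /**
     * Base URI of an element: the parent's base URI (or the document's, at the top),
     * with the element's own xml:base attribute resolved against it if present.
     */
    static String baseUriOf(Element element, String documentBase) {
        Node parent = element.getParentNode();
        String inherited = (parent != null && parent.getNodeType() == Node.ELEMENT_NODE)
                ? baseUriOf((Element) parent, documentBase)
                : documentBase;
        if (element.hasAttributeNS(XMLConstants.XML_NS_URI, "base")) {
            String xmlBase = element.getAttributeNS(XMLConstants.XML_NS_URI, "base");
            return URI.create(inherited).resolve(xmlBase).toString();
        }
        return inherited;
    }

    /** Depth-first search for the first element with the given local name. */
    static Element findByLocalName(Node node, String localName) {
        if (node.getNodeType() == Node.ELEMENT_NODE && localName.equals(node.getLocalName())) {
            return (Element) node;
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            Element hit = findByLocalName(c, localName);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Base URI of the first element with the given local name in the parsed document. */
    static String baseUriOfFirst(String xml, String localName, String documentBase) {
        Element element = findByLocalName(parse(xml), localName);
        if (element == null) {
            throw new IllegalArgumentException("no element named " + localName);
        }
        return baseUriOf(element, documentBase);
    }

    public static void main(String[] args) {
        String xml = "<stylesheet><import-schema xml:base=\"schemas/\">"
                + "<schema/></import-schema></stylesheet>";
        String docBase = "http://example.com/docs/main.xsl";
        // Old behaviour: the embedded schema would simply use docBase.
        // New behaviour: the xml:base on an ancestor element is honoured.
        System.out.println(baseUriOfFirst(xml, "schema", docBase));
    }
}
```

Relative references inside the embedded schema would therefore resolve against the schemas/ location in this example, not against the location of the containing stylesheet.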

In a previous release, following a change in the W3C specifications, Saxon was changed so that DTD-based types such as ID and IDREF did not set the type annotation on the attribute node. An unintended consequence of this change was that the idref() function stopped working when an attribute was defined in the DTD as being of type IDREF or IDREFS. This has now been fixed.

The fix required some changes to the data model. The is-id and is-idref properties defined in the W3C data model are not reflected directly in the Saxon implementation, but the information is now available in a slightly different way. When applied to an attribute node, the method getTypeAnnotation() may now return a value that contains the fingerprint code for the type xs:ID, xs:IDREF, or xs:IDREFS together with a high bit (NodeInfo.IS_DTD_TYPE) indicating that the type is DTD-derived rather than schema-derived. When this bit is set, the value should be treated as untyped atomic, but the type annotation returned still indicates whether the is-id or is-idref property is present. The same change applies to the type code passed with attributes in the Receiver and PullProvider interfaces.
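Code that consumes type annotations might decode the combined value along the following lines. This Java sketch is deliberately independent of Saxon: the constant values below are placeholders chosen for illustration and are assumptions, not the real values of NodeInfo.IS_DTD_TYPE or of the fingerprints for xs:ID, xs:IDREF, and xs:IDREFS, which should be taken from Saxon's own classes. The class and method names are likewise hypothetical.

```java
// Illustrative only: placeholder constants, not Saxon's actual values.
class DtdTypeAnnotationSketch {

    // Assumed stand-in for NodeInfo.IS_DTD_TYPE (a single high bit).
    static final int IS_DTD_TYPE = 1 << 30;

    // Assumed stand-ins for the fingerprints of xs:ID, xs:IDREF and xs:IDREFS.
    static final int FP_XS_ID = 1;
    static final int FP_XS_IDREF = 2;
    static final int FP_XS_IDREFS = 3;

    /** True if the annotation marks a DTD-derived type; the value is then untyped atomic. */
    static boolean isDtdDerived(int typeAnnotation) {
        return (typeAnnotation & IS_DTD_TYPE) != 0;
    }

    /** The type fingerprint with the DTD marker bit cleared. */
    static int fingerprint(int typeAnnotation) {
        return typeAnnotation & ~IS_DTD_TYPE;
    }

    /** Approximates the is-id property (schema types derived from xs:ID are not covered here). */
    static boolean isId(int typeAnnotation) {
        return fingerprint(typeAnnotation) == FP_XS_ID;
    }

    /** Approximates the is-idref property for xs:IDREF and xs:IDREFS. */
    static boolean isIdref(int typeAnnotation) {
        int fp = fingerprint(typeAnnotation);
        return fp == FP_XS_IDREF || fp == FP_XS_IDREFS;
    }

    /** Human-readable summary of what the annotation says about an attribute. */
    static String describe(int typeAnnotation) {
        String origin = isDtdDerived(typeAnnotation)
                ? "DTD-derived (treat value as untyped atomic)"
                : "schema-derived or untyped";
        return origin + ", is-id=" + isId(typeAnnotation) + ", is-idref=" + isIdref(typeAnnotation);
    }

    public static void main(String[] args) {
        int dtdIdref = FP_XS_IDREF | IS_DTD_TYPE; // as for an attribute declared IDREF in a DTD
        System.out.println(describe(dtdIdref));
    }
}
```

The point of the design is that a single integer carries both facts: masking off the marker bit recovers the ID or IDREF information, while testing the bit tells the caller not to treat the value as schema-typed.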