System Programming Interfaces
Sequence and SequenceIterator
The class Sequence, and its many subclasses such as GroundedValue and Item,
were changed in Saxon 9.9 to use Java Generics; the same change was also made to SequenceIterator
and its many subclasses. This change has been reverted in Saxon 10 because it added
a lot of complexity and delivered no benefits; an analysis of the motivation can be found in a
Saxon blog article.
NodeInfo
The way in which namespaces are represented in the NodeInfo interface has changed, to be closer to the
XDM model. Instead of getDeclaredNamespaces(), which returned the namespace information as a set of
declarations and undeclarations relative to the parent element, the interface now has getAllNamespaces(),
which returns all the in-scope namespaces as a NamespaceMap object.
This brings the API much closer to the way that the XDM data model
is defined, which ultimately makes it easier to use. The potential inefficiency of having namespaces defined redundantly
on every single element node is handled by sharing NamespaceMap objects (so two elements with the same
in-scope namespaces share the same NamespaceMap). The NamespaceMap object is immutable,
which makes such sharing easy to manage.
- The native tree models in Saxon (the TinyTree and LinkedTree) implement this by maintaining stored NamespaceMap objects within the internal data structure. The tree builders ensure that these objects are shared, so that a child element with no local namespace declarations will point to the same NamespaceMap object as its parent element. In many documents, namespaces are all declared on the root element, so there will only be one NamespaceMap object for the whole document.
- The external tree models (DOM, XOM, JDOM2, etc.) generally cache the NamespaceMap for an element, so computing the in-scope namespaces for an element involves getting the cached NamespaceMap for the parent element, and then making any changes needed if there are local namespace declarations or undeclarations present on the child element.
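The sharing scheme described above can be illustrated with a minimal sketch. The classes NsMap and Elem are hypothetical stand-ins written for this example (they are not Saxon's NamespaceMap or NodeInfo), and the sketch handles declarations only, not undeclarations. It assumes each element is built from its parent plus its local declarations:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical immutable prefix-to-URI map; not Saxon's NamespaceMap. */
final class NsMap {
    static final NsMap EMPTY = new NsMap(Collections.<String, String>emptyMap());
    private final Map<String, String> bindings;

    private NsMap(Map<String, String> bindings) {
        this.bindings = bindings;
    }

    /** Returns a map with the extra binding, or this same object if nothing changes. */
    NsMap bind(String prefix, String uri) {
        if (uri.equals(bindings.get(prefix))) {
            return this;
        }
        Map<String, String> copy = new HashMap<>(bindings);
        copy.put(prefix, uri);
        return new NsMap(Collections.unmodifiableMap(copy));
    }

    /** Returns the URI bound to the prefix, or null if the prefix is not in scope. */
    String getURI(String prefix) {
        return bindings.get(prefix);
    }
}

/** Hypothetical element node holding its complete set of in-scope namespaces. */
final class Elem {
    final String name;
    final NsMap namespaces;

    /** localDecls holds only the declarations written on this element itself. */
    Elem(String name, Elem parent, Map<String, String> localDecls) {
        this.name = name;
        // Start from the parent's map; with no local declarations it is reused as-is.
        NsMap map = (parent == null) ? NsMap.EMPTY : parent.namespaces;
        for (Map.Entry<String, String> decl : localDecls.entrySet()) {
            map = map.bind(decl.getKey(), decl.getValue());
        }
        this.namespaces = map;
    }
}
```

Because NsMap is immutable, a child that adds a binding gets a new object and the parent's map is never disturbed, which is what makes the sharing safe.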
To make it easier to iterate over the children of a node, the NodeInfo interface now has a method
children() returning Iterable<NodeInfo>. The method has a default implementation.
This makes it possible to use a Java "for each":
    for (NodeInfo child : node.children()) { ... }
There is also a variant of children() that takes a Java Predicate as its argument.
The NodeTest class now implements Predicate<NodeInfo>, so you can write for example:
    for (NodeInfo child : node.children(NodeKindTest.ELEMENT)) { ... }
which processes the children of a node that are themselves element nodes.
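As an illustration of how default methods allow children() to be added to an interface without burdening implementations, here is a self-contained sketch. SketchNode, SimpleNode and ChildrenDemo are hypothetical names invented for this example; they imitate the shape of the methods described above but are not Saxon classes, and the real default implementation may work quite differently:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical node interface; it only imitates the shape of children(). */
interface SketchNode {
    String getName();

    boolean isElement();

    /** The one navigation method an implementation has to supply in this sketch. */
    List<SketchNode> childList();

    /** Default implementation: all children, usable in a for-each loop. */
    default Iterable<SketchNode> children() {
        return childList();
    }

    /** Default implementation: only the children accepted by the filter. */
    default Iterable<SketchNode> children(Predicate<? super SketchNode> filter) {
        List<SketchNode> selected = new ArrayList<>();
        for (SketchNode child : children()) {
            if (filter.test(child)) {
                selected.add(child);
            }
        }
        return selected;
    }
}

/** Minimal implementation used to exercise the default methods. */
final class SimpleNode implements SketchNode {
    private final String name;
    private final boolean element;
    private final List<SketchNode> kids = new ArrayList<>();

    SimpleNode(String name, boolean element) {
        this.name = name;
        this.element = element;
    }

    SimpleNode add(SketchNode child) {
        kids.add(child);
        return this;
    }

    public String getName() { return name; }
    public boolean isElement() { return element; }
    public List<SketchNode> childList() { return kids; }
}

final class ChildrenDemo {
    /** Collects the names of the element children using the filtered for-each form. */
    static List<String> elementChildNames(SketchNode parent) {
        List<String> names = new ArrayList<>();
        for (SketchNode child : parent.children(SketchNode::isElement)) {
            names.add(child.getName());
        }
        return names;
    }
}
```

SimpleNode implements only the three abstract methods and inherits both forms of children() unchanged.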
Similarly, there is a new method attributes() which returns an AttributeMap. This class
replaces the old AttributeCollection throughout the product. The most notable differences are (a) an AttributeMap
is immutable, and (b) its primary access is as an Iterable, rather than using integer subscripts to index into it.
There are different implementations of AttributeMap depending on the number of attributes; for larger attribute sets,
a hash trie structure is used to give faster access by name, and to enable addition of attributes without copying the whole structure.
The method NodeInfo.iterateAxis(axisNumber, condition) has been generalized to take a Java Predicate as its
second argument.
Receiver
The Receiver interface has been forked into two interfaces: Receiver and Outputter.
- The new Receiver interface receives information about the start of an element in a single call containing information about the element name and type, the in-scope namespaces of the element, and the attributes of the element. This interface is used in the validation pipeline and the serialization pipeline, where it simplifies many of the operations performed by these complex pipelines.
- The Outputter interface splits the start-of-element information across a sequence of calls: startElement() for the element name and type, one call on namespace() for each namespace declaration, one on attribute() for each attribute, and finally a call on startContent() (which in some cases may be implicit). The Outputter interface is used primarily to capture the result of push-mode XSLT and XQuery instructions, notably instructions that generate elements and attributes. It's necessary in this case to be able to capture attributes and namespaces as independent events because instructions can generate them as separate events. Most commonly these instructions feed directly into a ComplexContentOutputter, which amalgamates start tag events into a single startElement() call which is then sent to a Receiver pipeline.
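The amalgamation step can be pictured with the following sketch. The types StartTagReceiver, BufferingOutputter and RecordingReceiver are hypothetical and greatly simplified (plain strings instead of node names, types and locations); they are not the Saxon interfaces, and the sketch only assumes the event order described above:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical receiver: the whole start tag arrives in one call. */
interface StartTagReceiver {
    void startElement(String name, Map<String, String> namespaces, Map<String, String> attributes);
}

/** Hypothetical outputter-style adapter that buffers the start tag until content begins. */
final class BufferingOutputter {
    private final StartTagReceiver next;
    private String pendingName;
    private Map<String, String> pendingNamespaces;
    private Map<String, String> pendingAttributes;

    BufferingOutputter(StartTagReceiver next) {
        this.next = next;
    }

    void startElement(String name) {
        startContent(); // a new element implicitly ends any start tag still open
        pendingName = name;
        pendingNamespaces = new LinkedHashMap<>();
        pendingAttributes = new LinkedHashMap<>();
    }

    void namespace(String prefix, String uri) {
        requireOpenStartTag();
        pendingNamespaces.put(prefix, uri);
    }

    void attribute(String name, String value) {
        requireOpenStartTag();
        pendingAttributes.put(name, value);
    }

    /** Flushes the buffered start tag as a single call; does nothing if none is pending. */
    void startContent() {
        if (pendingName != null) {
            next.startElement(pendingName, pendingNamespaces, pendingAttributes);
            pendingName = null;
        }
    }

    private void requireOpenStartTag() {
        if (pendingName == null) {
            throw new IllegalStateException("no start tag is open");
        }
    }
}

/** Receiver that records each start tag as a string, for demonstration only. */
final class RecordingReceiver implements StartTagReceiver {
    final List<String> events = new ArrayList<>();

    public void startElement(String name, Map<String, String> namespaces, Map<String, String> attributes) {
        events.add(name + " ns=" + namespaces + " atts=" + attributes);
    }
}
```

Nothing reaches the receiver until startContent() is called, explicitly or through the next startElement(), so downstream code never sees a partially built start tag.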
DateTimeValue
DateTimeValue and related classes such as
DateValue and TimeValue
provide additional constructors and getter methods supporting conversion to or from:
java.time.Instant, java.time.LocalDateTime,
java.time.OffsetDateTime, java.time.ZonedDateTime, and java.time.LocalDate.
Conversion of DateTimeValue to/from classes in the java.time package now retains
nanosecond precision (previously, it only retained microsecond precision).
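To show what the difference in precision means in practice, the following sketch uses only java.time classes; it does not call any Saxon method, and PrecisionDemo is a hypothetical helper written for this example. Truncating to microseconds models the earlier behavior, while a round trip through OffsetDateTime keeps the full nanosecond field:

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

final class PrecisionDemo {
    /** Models microsecond-only retention: anything below a microsecond is discarded. */
    static Instant toMicros(Instant value) {
        return value.truncatedTo(ChronoUnit.MICROS);
    }

    /** Round trip through OffsetDateTime, which carries the full nanosecond field. */
    static Instant roundTrip(Instant value) {
        OffsetDateTime odt = value.atOffset(ZoneOffset.UTC);
        return odt.toInstant();
    }
}
```

For an instant such as 2020-03-16T10:15:30.123456789Z, toMicros() leaves a nanosecond field of 123456000, whereas roundTrip() returns a value equal to the original.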
Miscellaneous
For push-mode instruction execution, the current receiver/outputter is no longer held in the XPathContext
object. Instead it is passed to the relevant methods (such as Expression.process()) as an explicit parameter.
This reduces the need to create new XPathContext objects.
The PullProvider interface has changed to use a type-safe
enum class for event types rather than integer constants.
The SequenceIterator interface has changed: the
getProperties() method now returns a type-safe EnumSet rather than an integer with
bit-significant flags.
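For code that still holds old-style integer flags, the move to an EnumSet might be bridged along the following lines. This is a sketch only: the class IteratorProperties, the flag values and the enum constant names are illustrative assumptions, not the actual Saxon definitions.

```java
import java.util.EnumSet;
import java.util.Set;

final class IteratorProperties {
    // Old style: bit-significant integer flags (values are illustrative).
    static final int GROUNDED = 1;
    static final int LAST_POSITION_FINDER = 2;
    static final int LOOKAHEAD = 4;

    /** New style: a type-safe enum (hypothetical constant names). */
    enum Property { GROUNDED, LAST_POSITION_FINDER, LOOKAHEAD }

    /** Converts an old-style bit mask to the equivalent EnumSet. */
    static Set<Property> fromFlags(int flags) {
        EnumSet<Property> result = EnumSet.noneOf(Property.class);
        if ((flags & GROUNDED) != 0) {
            result.add(Property.GROUNDED);
        }
        if ((flags & LAST_POSITION_FINDER) != 0) {
            result.add(Property.LAST_POSITION_FINDER);
        }
        if ((flags & LOOKAHEAD) != 0) {
            result.add(Property.LOOKAHEAD);
        }
        return result;
    }
}
```

With the enum form, a test such as properties.contains(Property.LOOKAHEAD) replaces a bit-mask comparison, and the compiler rejects a value that is not a member of the enum.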
Saxon has generally moved to the use of EQName notation (Q{uri}local)
in preference to Clark names ({uri}local) for internal use. Clark names
are still used where mandated by the JAXP specification, and in other stable APIs.
In lower-level system programming interfaces, however, applications may notice the
change. For example the change affects the representation of output properties such as
method and cdata-section-elements, which are exposed to
user-written serialization methods and customised serializers.
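Applications that need to bridge the two notations can do so with simple string manipulation. The helper class QNameNotation below is hypothetical (it is not part of Saxon); it assumes well-formed input and returns names that carry no braced namespace part unchanged:

```java
final class QNameNotation {
    /** Converts Clark notation "{uri}local" to EQName notation "Q{uri}local". */
    static String clarkToEQName(String clark) {
        if (clark.startsWith("{")) {
            if (clark.indexOf('}') < 0) {
                throw new IllegalArgumentException("missing '}' in " + clark);
            }
            return "Q" + clark;
        }
        return clark; // no braced namespace part: returned unchanged in this sketch
    }

    /** Converts EQName notation "Q{uri}local" to Clark notation "{uri}local". */
    static String eqNameToClark(String eqName) {
        if (eqName.startsWith("Q{")) {
            if (eqName.indexOf('}') < 0) {
                throw new IllegalArgumentException("missing '}' in " + eqName);
            }
            return eqName.substring(1);
        }
        return eqName; // no braced namespace part: returned unchanged in this sketch
    }
}
```

The two notations differ only in the leading Q, so the conversion is lossless in both directions.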
Dropped interfaces
The method StaticQueryContext.setExternalNamespaceResolver() is dropped (it has been present in the product
for many years but appears to be untested, and its intended behavior is unclear).
Some long-deprecated classes and methods have been dropped, including those listed below. The general principle is that methods deprecated since 9.7 or earlier have all been dropped; methods first deprecated in 9.8 have been dropped if they were producing no useful effect in 9.9 (for example, setter methods that did nothing).
- StaticQueryContext.buildDocument(), deprecated since Saxon 9.2 (use a s9api DocumentBuilder)
- Configuration.buildDocument(Source) and Configuration.buildDocument(Source, ParseOptions), deprecated since Saxon 9.7 (use a s9api DocumentBuilder, or Configuration.buildDocumentTree())
- net.sf.saxon.om.DocumentInfo, since 9.7 used only by deprecated methods
- AbstractStaticContext.declareCollation(), StaticQueryContext.declareCollation(), and StaticContext.getCollation(), deprecated since 9.6 (collations should now be registered at the level of the Configuration)
- XsltTransformer.setInitialContextNode(), deprecated since 9.7 (use setGlobalContextItem())
- Xslt30Transformer.setInitialContextItem(), deprecated since 9.8 (use setGlobalContextItem())
- XQueryExpression.pull(), deprecated since 9.8 (use run())
- ParseOptions.setStripSpace(), AugmentedSource.setStripSpace(), ParseOptions.getStripSpace() and AugmentedSource.getStripSpace(), deprecated since 9.8 (use setSpaceStrippingRule())
- XsltCompiler.compilePackages() and addCompilePackages(), deprecated since 9.8 (use a PackageLibrary)
- BuildingStreamWriter.setInventPrefixes() and isInventPrefixes(), deprecated since 9.7. These methods previously had no effect, so the calls can be dropped. (Note, the methods were previously marked as deprecated in the implementation class, but not in the interface.)
- CollectionURIResolver and associated methods: deprecated since 9.7. (Use a CollectionFinder. If necessary, wrap the old CollectionURIResolver in a CollectionURIResolverWrapper, whose source code can be found in Saxon 9.9.) This change also causes Feature.COLLECTION_URI_RESOLVER and Feature.COLLECTION_URI_RESOLVER_CLASS to be dropped, as well as the -cr option on the Transform and Query command line.
- CompilerInfo.setExtensionFunctionLibrary() and getExtensionFunctionLibrary(): deprecated since 9.7 (use other mechanisms, such as packages)
The methods setRecoveryPolicy and getRecoveryPolicy have been dropped from
the Configuration and CompilerInfo classes, along with the configuration feature
Feature.RECOVERY_POLICY. In recent releases the only remaining effect of this switch was
to provide a default for the xsl:mode attributes on-multiple-match="fail" and
warning-on-multiple-match="true"; this must now be controlled via the xsl:mode
declaration in the stylesheet.
Tracing
The mechanism supporting execution tracing (including the -T and -TP options on the command line) has been substantially revamped, involving some changes to interfaces:
- If a CodeInjector is nominated, it now operates as a pass over the fully-constructed expression tree, rather than being invoked by the XSLT and XPath parsers as expressions and instructions are created. This means that it now visits every node in the expression tree, not only selected nodes as before. Of course a CodeInjector can always ignore nodes that it is not interested in. The new mechanism opens the way to much more variety in the kind of instrumentation tasks that the CodeInjector can undertake.
- The object passed to the TraceListener at run-time is now a Traceable, replacing the old InstructionInfo. A Traceable is either an expression, or a component such as a UserFunction, GlobalVariable, or TemplateRule. The InstructionInfo class was something of a rag-bag, and duplicated a lot of information that in the last few releases has been maintained in the RetainedStaticContext and Location objects attached to the expression itself.
- This has enabled the dropping of a number of auxiliary classes such as InstructionDetails and LocationKind.
- The way in which expressions are identified by the StandardErrorListener is now better aligned with the way they are identified by the TraceListener.
- In the CodeInjector and TraceListener interfaces, default implementations of methods are now provided; among other things, this eliminates the need for the TraceListener2 interface.
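The idea of injecting code as a pass over a completed expression tree, rather than during parsing, can be sketched as follows. Expr and InjectionPass are hypothetical classes invented for this example and bear no relation to Saxon's Expression or CodeInjector signatures; the injector is modelled as a simple function from a node to its replacement:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical expression node; not Saxon's Expression class. */
final class Expr {
    final String label;
    final List<Expr> operands = new ArrayList<>();

    Expr(String label, Expr... operands) {
        this.label = label;
        for (Expr operand : operands) {
            this.operands.add(operand);
        }
    }
}

final class InjectionPass {
    /**
     * Visits every node bottom-up and replaces it with whatever the injector
     * returns. An injector that is not interested in a node returns it unchanged.
     */
    static Expr inject(Expr root, UnaryOperator<Expr> injector) {
        for (int i = 0; i < root.operands.size(); i++) {
            root.operands.set(i, inject(root.operands.get(i), injector));
        }
        return injector.apply(root);
    }
}
```

An injector such as e -> e.label.equals("call") ? new Expr("trace", e) : e wraps every "call" node in a tracing node and leaves all other nodes alone, even though the pass offers it every node in the tree.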
ErrorListener
The StandardErrorListener and StandardInvalidityHandler have been refactored internally (a) to make better reuse of common code, and (b) to make it easier to write customised subclasses to achieve tailored diagnostic output. To this end, some static methods have been replaced by non-static methods, and large methods have been split into smaller pieces, documented to allow them to be overridden. A consequence of these changes is that existing user-written code subclassing these two classes may need changing. If this is onerous, one way of getting around the problem is to copy the open-source implementations of these classes from the Saxon 9.9 distribution, and rename the copies as user-defined classes.
A new configuration option RETAIN_NODE_FOR_DIAGNOSTICS is provided. If set, this causes the location information associated with instructions and expressions in an XSLT stylesheet to retain a reference to the node in the stylesheet tree where the error occurred (rather than simply retaining the node name and line number). This information can be useful to IDEs that wish to highlight where in the source the error occurred. The information is retained both for static and dynamic error reporting. The option is off by default because retaining the links prevents the temporary data used during compilation being garbage-collected. Previous releases attempted to retain the information for static errors but not for dynamic errors; it is now retained for both, but only if this option is set.