System Programming Interfaces
Nodes and Fingerprints
The gradual move to reduce dependence on the
have been dropped, except for nodes that
interface. This means that implementations of
NodeInfo that wrap third-party
XML tree models no longer need to implement these methods, and no longer need to be tied to a
In earlier releases, document nodes were always represented by an object that implemented the
DocumentInfo interface (which extended
was used to hold information about the tree as a whole, for example keys and IDs. In Saxon
9.7, the class
DocumentInfo is retained to provide a measure of compatibility
for some commonly used interfaces, but it is no longer the case that every document node is represented by
an instance of
DocumentInfo; in fact
DocumentInfo is now just a wrapper around a
NodeInfo, designed to keep existing code working. Information about a tree
as a whole is now contained in a new
object; this exists for all trees, whether or not they are rooted at a document node. This provides
a place to put information about accumulators, which can exist for any tree whether or not the
root is a document node.
A number of changes have been made to the way collection URIs are handled, mainly: (a) to support the XPath 3.1 capability to return any kind of item in a collection, not only a node (for example, collections can now include maps derived from JSON files, unparsed text files, and binary objects); (b) to allow streamed processing of the documents in a collection; and (c) to conform with the rules in the specification as regards stability (that is, repeated calls returning the same results).
The CollectionURIResolver interface is superseded by a new, more flexible
CollectionFinder. The old
CollectionURIResolver is still supported, but provides less capability.
The new mechanism is described in the Javadoc documentation; for an outline, see Collections.
To handle the Saxon collection URIs with options such as
Source object that is returned can be an
AugmentedSource, which holds parser
options as well as the source information itself.
In Saxon-EE, fn:collection()
is multi-threaded, parsing multiple documents simultaneously in different threads. This
previously happened within the default collection URI resolver; it now happens within
the code of the
fn:collection() function itself, so it works even if a
user-defined collection URI resolver is in use. An additional change in this release is
that the order in which documents are returned in the result of
fn:collection() is now always the same as the order in which they are
delivered by the collection URI resolver, making the order more predictable at a slight
cost in latency.
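The trade-off between parallel parsing and a predictable result order can be illustrated with a small stand-alone sketch. It is not Saxon's implementation: the class OrderedParallelLoader and its method loadInOrder are hypothetical names, and the "parser" is just a function supplied by the caller. All tasks are submitted at once so that work overlaps, but results are collected in submission order, so a slow early document holds back later ones that have already finished.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical helper (not a Saxon class): parallel work, results in input order. */
final class OrderedParallelLoader {

    static <T, R> List<R> loadInOrder(List<T> inputs, Function<T, R> parser, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every task up front so that parsing overlaps across threads.
            List<Future<R>> pending = new ArrayList<>();
            for (T input : inputs) {
                Callable<R> task = () -> parser.apply(input);
                pending.add(pool.submit(task));
            }
            // Collect in submission order. Waiting on an early, slow task delays
            // delivery of later results that are already complete: this is the
            // latency cost of a predictable order.
            List<R> results = new ArrayList<>();
            for (Future<R> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for a result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A parsing task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting results as they complete would lower latency, but the order would then vary from run to run.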
Collections can now be stable, meaning that multiple calls with the same collection URI
are guaranteed to return the same results. Collection stability can be expensive,
because the contents of a collection have
to be held in memory in case the collection is used again; stability is therefore not the default,
even though it is required for conformance with the W3C specifications. Collection stability can be
switched on in several ways: the collection URI can include the query parameter
stable=yes; the collection finder can return a
ResourceCollection object whose
isStable() method returns true; or the Configuration property
FeatureKeys.STABLE_COLLECTION_URI can be set to true. A collection
is stable if any of these methods returns true.
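The "any of three switches" rule, and the memory cost that stability implies, can be sketched as follows. This is an illustration under stated assumptions rather than Saxon code: StableCollectionCache is a hypothetical class, the two boolean arguments stand in for the isStable() result and the Configuration property, collection contents are modelled as a list of strings, and the query parameters are assumed to be separated by ";" or "&".

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch (not Saxon's implementation) of stable collection handling. */
final class StableCollectionCache {
    // Contents of stable collections, retained for the lifetime of the cache.
    private final Map<String, List<String>> retained = new HashMap<>();
    private final boolean configStable; // stands in for the Configuration property

    StableCollectionCache(boolean configStable) {
        this.configStable = configStable;
    }

    /** A collection is stable if any one of the three switches is on. */
    static boolean isStable(String collectionUri, boolean finderSaysStable, boolean configStable) {
        return hasStableParam(collectionUri) || finderSaysStable || configStable;
    }

    /** Looks for stable=yes in the query part; the separators are an assumption. */
    static boolean hasStableParam(String uri) {
        int q = uri.indexOf('?');
        if (q < 0) {
            return false;
        }
        for (String param : uri.substring(q + 1).split("[;&]")) {
            if (param.equals("stable=yes")) {
                return true;
            }
        }
        return false;
    }

    List<String> resolve(String uri, boolean finderSaysStable,
                         Function<String, List<String>> loader) {
        if (!isStable(uri, finderSaysStable, configStable)) {
            return loader.apply(uri); // reloaded on every call, nothing retained
        }
        // Stable: load once, then keep the result in memory for later calls.
        return retained.computeIfAbsent(uri, loader);
    }
}
```

The retained map is where the cost arises: every stable collection stays reachable for as long as the cache does, whether or not it is requested again.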
unparsed=true among the query parameters of the collection URI is
no longer supported, as the functionality can now be achieved by calling fn:uri-collection()
followed by fn:unparsed-text().
A new option for the collection URI query parameters is
this is used, the items returned by the
collection() function are maps; the
entries in the map include properties of the resources within the collection, plus a
fetch() that can be called to fetch the actual content of the
resource. For further details see Collections.
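The idea of returning a map of properties together with a callable that retrieves the content lazily can be sketched in plain Java. Everything here is hypothetical: the class ResourceDescriptorSketch, the property names "name" and "content-type", and the use of a Supplier to stand in for fetch() are illustrative choices, not the actual keys or types used by Saxon.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical sketch: a resource described by a map, with content fetched on demand. */
final class ResourceDescriptorSketch {

    /** Builds the descriptor; the content loader is stored but not invoked. */
    static Map<String, Object> describe(String name, String contentType,
                                        Supplier<String> contentLoader) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("name", name);                 // illustrative property name
        item.put("content-type", contentType);  // illustrative property name
        item.put("fetch", contentLoader);       // stands in for the fetch() entry
        return item;
    }

    /** Invokes the stored loader to obtain the actual content. */
    @SuppressWarnings("unchecked")
    static String fetch(Map<String, Object> item) {
        return ((Supplier<String>) item.get("fetch")).get();
    }
}
```

A caller can filter on the properties first and fetch only the resources it needs, which avoids reading content that is never used.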
The standard URIResolver and the standard
ModuleURIResolver have been
enhanced to recognize the classpath URI scheme. For example, in XSLT it is now possible to write
<xsl:include href="classpath:utility.xsl"> which locates
utility.xsl on the Java classpath. (The classpath URI scheme was
introduced as part of the Spring framework, but Saxon's implementation is
free-standing.) On the command line, in options such as
-s, names prefixed
classpath: are now recognized (along with
file) as being URIs rather than filenames, avoiding the need to specify
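For readers who want to see what resolving a classpath URI involves, the following is a minimal sketch written against the standard JAXP URIResolver interface. It is not Saxon's implementation: the class ClasspathUriResolverSketch is hypothetical, it uses the thread context class loader as an assumption, and it makes no attempt to resolve relative references against a classpath: base URI.

```java
import java.io.InputStream;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch of a resolver that understands classpath: URIs. */
final class ClasspathUriResolverSketch implements URIResolver {
    static final String SCHEME = "classpath:";

    /** Returns the resource name for a classpath: URI, or null for any other URI. */
    static String resourceName(String href) {
        if (href == null || !href.startsWith(SCHEME)) {
            return null;
        }
        String name = href.substring(SCHEME.length());
        // ClassLoader resource names are written without a leading slash.
        while (name.startsWith("/")) {
            name = name.substring(1);
        }
        return name;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        String name = resourceName(href);
        if (name == null) {
            return null; // not a classpath URI: leave it to default resolution
        }
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        InputStream in = loader.getResourceAsStream(name);
        if (in == null) {
            throw new TransformerException("Resource not found on classpath: " + name);
        }
        // Keep the original URI as the system ID so diagnostics can refer to it.
        return new StreamSource(in, href);
    }
}
```

Such a resolver could be registered with TransformerFactory.setURIResolver; returning null for other schemes lets the processor fall back to its own resolution.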
The Receiver interface has changed, so
that location information is now passed with all events (for example,
startElement as a
Location object, rather than as an integer
locationId). This change was necessary because with independent
compilation of packages, it becomes difficult to allocate globally unique location IDs
at package compile time. The change also enables richer location information to be
maintained, enabling more precise diagnostics especially of dynamic errors.
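The difference between an integer ID and a self-describing location object can be seen in a small sketch. The types below (LocationSketch, ReceiverSketch, TracingReceiver) are hypothetical stand-ins and do not reproduce the signatures of Saxon's Receiver or Location. The point is only that an object carrying its own module URI, line, and column needs no shared lookup table, so separately compiled packages cannot collide.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a location object: self-describing, no global table. */
final class LocationSketch {
    final String systemId;
    final int line;
    final int column;

    LocationSketch(String systemId, int line, int column) {
        this.systemId = systemId;
        this.line = line;
        this.column = column;
    }

    @Override
    public String toString() {
        return systemId + " line " + line + " column " + column;
    }
}

/** Hypothetical event interface in which events carry their own location. */
interface ReceiverSketch {
    void startElement(String name, LocationSketch location);

    void characters(String text, LocationSketch location);
}

/** Records each event with its location, as a diagnostic tool might. */
final class TracingReceiver implements ReceiverSketch {
    final List<String> log = new ArrayList<>();

    @Override
    public void startElement(String name, LocationSketch location) {
        log.add("start " + name + " at " + location);
    }

    @Override
    public void characters(String text, LocationSketch location) {
        log.add("text at " + location);
    }
}
```

With integer IDs, the receiver would instead need access to whichever table allocated the number, which is awkward when that table belongs to a different package.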
The move away from integer location IDs to
Location objects is fairly
pervasive, and affects many interfaces that are important to products that integrate
closely with Saxon, for example to provide debugging support. In particular, expressions
in the expression tree now contain location information in the form of a
Location object; they no longer implement the
SourceLocator interface directly.
The Expression tree
There have been substantial changes to the internal structure of the expression tree.
These are only likely to affect applications that interface to Saxon at a very low level. Among the changes:
- The Container object has gone.
- Expressions now contain a reference to their parent expression in the tree.
- An expression now contains a reference to a RetainedStaticContext object, which holds that part of the static context that might be needed at execution time. To save space, an expression whose static context is the same as its parent or sibling expressions will generally share the same RetainedStaticContext object.
- Because expressions now hold more context information, the need to pass this information
dynamically during the type-checking and optimization processes using the
ExpressionVisitor object is diminished.
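The structural changes listed above (parent references and a shared static-context object) can be sketched as follows. The classes StaticContextSketch and ExprSketch are hypothetical and far simpler than Saxon's Expression and RetainedStaticContext; the fields chosen for the context (a base URI and a default namespace) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the retained part of the static context. */
final class StaticContextSketch {
    final String baseUri;
    final String defaultNamespace;

    StaticContextSketch(String baseUri, String defaultNamespace) {
        this.baseUri = baseUri;
        this.defaultNamespace = defaultNamespace;
    }
}

/** Hypothetical expression node with a parent link and a shareable context. */
final class ExprSketch {
    final String label;
    private ExprSketch parent;
    private final StaticContextSketch context;
    private final List<ExprSketch> children = new ArrayList<>();

    ExprSketch(String label, StaticContextSketch context) {
        this.label = label;
        this.context = context;
    }

    /** Adds a child that shares this node's context object, saving space. */
    ExprSketch addChild(String childLabel) {
        return addChild(childLabel, this.context);
    }

    /** Adds a child with its own context, for use where the static context differs. */
    ExprSketch addChild(String childLabel, StaticContextSketch childContext) {
        ExprSketch child = new ExprSketch(childLabel, childContext);
        child.parent = this;
        children.add(child);
        return child;
    }

    ExprSketch getParent() {
        return parent;
    }

    StaticContextSketch getContext() {
        return context;
    }

    /** Walks the parent links up to the root, for example to build a diagnostic path. */
    String path() {
        return parent == null ? label : parent.path() + "/" + label;
    }
}
```

Because each node can reach its parent and its context directly, less information has to be threaded through a visitor during type-checking and optimization.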