Streamed processing of input documents
Streamed execution of XQuery is now supported in Saxon-EE.
The streamability rules have been brought into line with the latest XSLT 3.0 draft. This does not in fact have a major impact on what is streamable, although there are some changes in the details.
The XSLT extension attribute saxon:read-once="yes", which was an early interface provided for streamed processing, has been dropped. In place of
<xsl:apply-templates select="doc('a.xml')//e" saxon:read-once="yes"/>, use
<xsl:stream href="a.xml"><xsl:apply-templates select=".//e"/></xsl:stream>
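For context, the replacement can be sketched as a complete stylesheet. This is a minimal illustration rather than a definitive migration recipe: the initial template name main, the unnamed streamable mode, and the template rule that copies each e element are assumptions added here, and the xsl:stream instruction is the one defined in the XSLT 3.0 draft that this release follows.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the default (unnamed) mode is declared streamable -->
  <xsl:mode streamable="yes"/>

  <!-- Assumption: the transformation starts at a named template called "main" -->
  <xsl:template name="main">
    <!-- Replaces select="doc('a.xml')//e" with saxon:read-once="yes" -->
    <xsl:stream href="a.xml">
      <xsl:apply-templates select=".//e"/>
    </xsl:stream>
  </xsl:template>

  <!-- Hypothetical rule: copy each selected e element to the result -->
  <xsl:template match="e">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```

The selection moves from an absolute path on doc('a.xml') to a path relative to the streamed document node, which is the context item inside xsl:stream.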
Many more functions and operators are now fully streamable (and more thoroughly tested for
streaming). These include
unordered(), comma expressions,
instance of, and conditional expressions.
Streaming of deep-equal() is restricted in that any nodes in the streamed input sequence are materialized (one at a time) in memory.
A number of functions and operators can cause parsing of the input document to terminate
early if no more input is required. These include deep-equal(), general comparisons, and
instance of. The parse is only terminated if the streamed argument to the function consumes
the entire document. Terminating the parse early means that well-formedness errors in the
rest of the file will not be detected.
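As an illustration of where early termination might apply, the sketch below evaluates a general comparison over a streamed document. The file name log.xml, the element name status, and the initial template name main are hypothetical. Once one matching status element has been seen, the result of the comparison is known, so the processor may stop reading; whether it actually does so depends on the conditions described above.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the transformation starts at a named template called "main" -->
  <xsl:template name="main">
    <xsl:stream href="log.xml">
      <!-- General comparison: true if any status element equals 'error' -->
      <xsl:if test=".//status = 'error'">
        <xsl:message>At least one error was logged</xsl:message>
      </xsl:if>
    </xsl:stream>
  </xsl:template>

</xsl:stylesheet>
```

If parsing does stop early, a malformed tag later in log.xml would go unreported, so a stylesheet like this should not be relied on to validate the whole input.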
Two configuration options for streaming have been added. The option
FeatureKeys.STREAMABILITY controls whether streaming is used at all, and if so,
whether Saxon streaming extensions are enabled. The option
FeatureKeys.STREAMING_FALLBACK controls what happens when non-streamable
constructs are encountered, the options being either to raise a compile-time error, or to
attempt a non-streamed evaluation (which will produce the same result as a streamed
evaluation, provided enough memory is available).
Saxon has come into line with the W3C specification by no longer allowing an
xsl:for-each instruction whose select
expression is crawling (for example
select="//*") and whose body is consuming. (A
crawling expression, informally, is one that selects elements on the descendant axis, unless
it is wrapped in a call on
outermost() to ensure that nested elements are not
selected.) There were cases where this did not work (notably when the body of
the instruction contained local variable declarations), and these proved difficult to fix.
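The restriction and its workaround can be sketched as follows, on the assumption that the instruction concerned is one with a sequence-constructor body, such as xsl:for-each. The file name book.xml, the element names section and title, and the initial template name main are hypothetical.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: the transformation starts at a named template called "main" -->
  <xsl:template name="main">
    <xsl:stream href="book.xml">
      <!-- No longer allowed: .//section is crawling and the body is consuming:
           <xsl:for-each select=".//section"> ... </xsl:for-each> -->

      <!-- outermost() ensures no selected section is nested inside another -->
      <xsl:for-each select="outermost(.//section)">
        <!-- Consuming body: reads the title child of each section -->
        <xsl:value-of select="title"/>
      </xsl:for-each>
    </xsl:stream>
  </xsl:template>

</xsl:stylesheet>
```

Wrapping the selection in outermost() changes the result when sections are nested, since inner sections are no longer visited by this loop. If nested sections must be processed, template rules applied with xsl:apply-templates are one alternative to consider.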