saxonica.com

Streaming Templates

Streaming templates allow a document to be processed hierarchically in the classical XSLT style, applying template rules to each element (or other node) in a top-down manner, while the source document is scanned in a pure streaming fashion without building the source tree in memory. From release 9.2, Saxon-SA supports this kind of streamed processing using template rules, provided the templates conform to a set of strict guidelines.

Streaming is a property of a mode; a mode can be declared to be streamable, and if it is so declared, then all template rules using that mode must obey the rules for streamability. A mode is declared to be streamable using the top-level stylesheet declaration:

<saxon:mode name="s" streamable="yes" xmlns:saxon="http://saxon.sf.net/"/>

The name attribute is optional; if omitted, the declaration applies to the default (unnamed) mode.

Streamed processing of a source document can be applied either to the principal source document of the transformation, or to a secondary source document read using the doc() or document() function.

To use streaming on the principal source document, the input to the transformation must be supplied in the form of a StreamSource or SAXSource, and the initial mode selected on entry to the transformation must be a streamable mode. In this case there must be no references to the context item in the initializer of any global variable.
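As an illustration of the principal-document case, the following is a minimal, untested sketch of a stylesheet whose default (unnamed) mode is declared streamable. The variable name run-date is hypothetical, and the page does not say how nodes without an explicit rule (such as the document node or text nodes) are treated, so that part is left to whatever the processor does by default.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <!-- No name attribute: the declaration applies to the default (unnamed) mode,
       which is then the initial mode unless another one is selected. -->
  <saxon:mode streamable="yes"/>

  <!-- Permitted global variable: its initializer does not refer to the context item.
       An initializer such as select="/*/@version" would refer to it and is ruled out. -->
  <xsl:variable name="run-date" select="current-date()"/>

  <!-- Shallow copy of each element, then a single drill-down into its children.
       xsl:copy does not copy the element's attributes. -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The source document itself would still have to be supplied as a StreamSource or SAXSource by the calling application, as described above.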

Streamed processing of a secondary document is initiated using the instruction:

<xsl:apply-templates select="doc('abc.xml')" mode="s"/>

Here the select attribute must be a simple call on the doc() or document() function, and the mode (explicit or implicit) must be declared as streamable.
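Put together, a stylesheet that streams a secondary document might look like the sketch below. It is an untested outline: the template name main and the wrapper element out are assumptions made for illustration, and abc.xml is the file name used in the instruction above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <!-- Mode s is streamable; the unnamed mode is left as an ordinary mode. -->
  <saxon:mode name="s" streamable="yes"/>

  <!-- Hypothetical entry point, run as the initial named template.
       The select attribute is a simple call on doc(), as required. -->
  <xsl:template name="main">
    <out>
      <xsl:apply-templates select="doc('abc.xml')" mode="s"/>
    </out>
  </xsl:template>

  <!-- Streamable rule: the only instruction is a drill-down that copies the element. -->
  <xsl:template match="*" mode="s">
    <xsl:copy-of select="."/>
  </xsl:template>

</xsl:stylesheet>
```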

If a mode is declared as streamable, then it can be used only for streamed processing; it is not possible to apply templates in a streamable mode if the selected nodes are ordinary non-streamed nodes. (This restriction is intended to be temporary.)

The template rules for the streamable mode must follow the rules below:

  1. The match pattern for the template rule must contain no predicates, and no call on the key() or id() functions. Examples of acceptable patterns are *, para, and para/*.

  2. Each template rule must contain at most one drill-down instruction. A drill-down instruction is one that reads the children or descendants of the context node. The following are recognized as drill-down instructions:

    • The instruction <xsl:apply-templates/>, where the select attribute is either omitted or specified as select="child::node()". In this case the mode (explicit or implicit) must be a streaming mode, but it need not be the same as the mode in which the template is invoked; there must be no parameters (xsl:with-param children) and no sorting.

    • The instruction <xsl:copy-of select="."/> or <xsl:copy-of select="child::node()"/>

    • The instruction xsl:value-of, xsl:attribute, xsl:comment, xsl:namespace, or xsl:processing-instruction with a select attribute that is either . or string(.).

    • The instruction <xsl:sequence select="string(.)"/> or <xsl:sequence select="data(.)"/>

    • The instruction <xsl:variable select="."/> with an as attribute that forces the value of the node to be atomized.

  3. Any instruction that is a parent or ancestor of the drill-down construct must be one of the following:

    • xsl:element or a literal result element

    • xsl:copy

    • xsl:attribute, xsl:comment, xsl:processing-instruction, xsl:namespace, or xsl:value-of

    • xsl:result-document

    • xsl:variable

  4. Every instruction other than the drill-down instruction and its ancestors must be one that either has no dependencies on the context item, position, or size, or one whose only dependencies are by way of subexpressions falling into one of the following categories:

    • An expression that obtains a local property of the context item. The local properties of a node are the values obtained by applying the functions name(), node-name(), local-name(), namespace-uri(), generate-id(), base-uri(), boolean(), not(), exists(), empty(), or nilled(), or the operator instance of: for example, the expression . instance of schema-element(address).

    • An expression that obtains a local property of a node reachable from the context item by means of a path expression consisting entirely of axis steps (no predicates), where the steps comprise zero or more upwards steps (parent, ancestor, or ancestor-or-self) followed optionally by a step using the attribute or namespace axes. An example is the expression exists(../@status).

    • An expression that uses the string value or typed value of attribute or namespace nodes selected by means of a path expression consisting entirely of axis steps (no predicates), where the steps comprise zero or more upwards steps (parent, ancestor, or ancestor-or-self) followed (always) by a step using the attribute or namespace axes. Such an expression must appear in a context where the namespace or attribute value is atomized, for example an operand to an arithmetic expression or a value comparison.

    • A call on position(). Calls on last() are not allowed.
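The four rules can be read against a small example. The sketch below is untested and uses hypothetical element names (section, para, note) and hypothetical output elements; it is meant only to show one way of arranging template rules so that each appears to satisfy the constraints as stated above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="saxon">

  <saxon:mode name="s" streamable="yes"/>

  <!-- Rule 1: the pattern has no predicates.
       Rules 2 and 3: a single drill-down (xsl:apply-templates with no select,
       no parameters, no sorting), nested only inside a literal result element. -->
  <xsl:template match="section" mode="s">
    <div>
      <xsl:apply-templates mode="s"/>
    </div>
  </xsl:template>

  <!-- Rule 4: the xsl:if sits beside the drill-down, not around it, and depends on
       the context only through exists() applied to an attribute of the parent.
       The drill-down here is xsl:value-of with select=".". -->
  <xsl:template match="section/para" mode="s">
    <xsl:if test="exists(../@status)">
      <xsl:comment>parent section has a status</xsl:comment>
    </xsl:if>
    <p>
      <xsl:value-of select="."/>
    </p>
  </xsl:template>

  <!-- A rule with no drill-down at all ("at most one" includes none).
       local-name() is a local property of the context item. -->
  <xsl:template match="note" mode="s">
    <omitted name="{local-name()}"/>
  </xsl:template>

</xsl:stylesheet>
```

By contrast, wrapping the xsl:value-of in an xsl:if or xsl:choose would break rule 3, because neither instruction is in the list of permitted ancestors of a drill-down construct.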

Note that the drill-down expression cannot simply consist of . (or string(.)) used in an atomizing context such as test=". = 3". It is necessary to bind the atomized value to a variable, using for example <xsl:variable select="." as="xs:string"/>, and then to refer to that variable in any subsequent computation.
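A sketch of this workaround follows. It is untested; the element name quantity, the variable name q, and the output element three are hypothetical. Because the variable is declared as xs:string, the comparison with 3 goes through number() rather than comparing a string with an integer directly.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="xs saxon">

  <saxon:mode name="s" streamable="yes"/>

  <xsl:template match="quantity" mode="s">
    <!-- The drill-down: the as attribute forces the context node to be atomized. -->
    <xsl:variable name="q" select="." as="xs:string"/>
    <!-- The test refers only to the variable, so it has no dependency on the context item. -->
    <xsl:if test="number($q) = 3">
      <three/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```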
