saxonica.com

Writing XSLT extension instructions

Saxon implements the element extensibility feature defined in the XSLT standard. This feature allows you to define your own instruction types for use in the stylesheet. These instructions can be used anywhere within a sequence constructor, for example as a child of xsl:template, xsl:if, xsl:variable, or of a literal result element.

User-written extension instructions are not currently supported on the .NET platform.

If a namespace prefix is to be used to denote extension elements, it must be declared in the extension-element-prefixes attribute on the xsl:stylesheet element, or the xsl:extension-element-prefixes attribute on any enclosing literal result element or extension element.

Saxon itself provides a number of stylesheet elements beyond those defined in the XSLT specification, including saxon:assign, saxon:entity-ref, and saxon:while. To enable these, use the standard XSLT extension mechanism: define extension-element-prefixes="saxon" on the xsl:stylesheet element, or xsl:extension-element-prefixes="saxon" on any enclosing literal result element.

Any element whose prefix matches a namespace listed in the extension-element-prefixes attribute of an enclosing element is treated as an extension element. If no class can be instantiated for the element (for example, because no ExtensionElementFactory can be loaded, or because the ExtensionElementFactory doesn't recognise the local name), then fallback action is taken as follows. If the element has one or more xsl:fallback children, they are processed. Otherwise, an error is reported. When xsl:fallback is used in any other context, it and its children are ignored.
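The fallback rule above can be restated as a small decision function. This is a minimal sketch, not Saxon code: the class FallbackSketch, the enum Action and the method choose() are hypothetical names invented for illustration, and the two inputs stand for "a class could be instantiated for the element" and "the number of xsl:fallback children".

```java
// Hypothetical illustration of the fallback rule; none of these names exist in Saxon.
class FallbackSketch {
    enum Action { RUN_EXTENSION, RUN_FALLBACK, REPORT_ERROR }

    // classInstantiated: true if an implementing class was found and instantiated.
    // fallbackChildCount: number of xsl:fallback children of the extension element.
    static Action choose(boolean classInstantiated, int fallbackChildCount) {
        if (classInstantiated) {
            // The element is implemented, so any xsl:fallback children are ignored.
            return Action.RUN_EXTENSION;
        }
        if (fallbackChildCount > 0) {
            // Not implemented, but the stylesheet supplies fallback behaviour.
            return Action.RUN_FALLBACK;
        }
        // Not implemented and no fallback: an error is reported.
        return Action.REPORT_ERROR;
    }
}
```

For example, FallbackSketch.choose(false, 1) selects the fallback branch, while FallbackSketch.choose(false, 0) selects the error branch.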

Within the stylesheet it is possible to test whether an extension element is implemented by using the system function element-available(). This returns true if the namespace of the element identifies it as an extension element (or indeed as a standard XSLT instruction) and if a class can be instantiated to represent it. If the namespace is not that of an extension element, or if no class can be instantiated, it returns false.

To invoke a user-defined set of extension elements, include the prefix in the extension-element-prefixes attribute as described above, and associate it with a namespace URI that ends in "/" followed by the fully qualified class name of a Java class that implements the net.sf.saxon.style.ExtensionElementFactory interface. This interface defines a single method, getExtensionClass(), which takes the local name of the element (that is, the name without its namespace prefix) as a parameter and returns the Java class used to implement this extension element (for example, return SQLConnect.class). The class returned must be a subclass of net.sf.saxon.style.StyleElement; the easiest way to implement it is as a subclass of net.sf.saxon.style.ExtensionInstruction.
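The two conventions in this paragraph (the class name at the end of the namespace URI, and the lookup from local name to implementing class) can be sketched without depending on Saxon itself. Everything below is a stand-in: ElementFactorySketch imitates the shape of ExtensionElementFactory but is not the real interface, the element names "connect" and "query" and the placeholder classes are hypothetical, and returning null for an unrecognised name or a URI without "/" is an assumption of this sketch.

```java
// Hypothetical stand-in for net.sf.saxon.style.ExtensionElementFactory.
interface ElementFactorySketch {
    // Returns the implementing class for a local element name,
    // or null (an assumption of this sketch) if the name is not recognised.
    Class<?> getExtensionClass(String localName);
}

// Placeholders for classes that would, in real code, extend ExtensionInstruction.
class ConnectElementSketch {}
class QueryElementSketch {}

class SqlFactorySketch implements ElementFactorySketch {
    @Override
    public Class<?> getExtensionClass(String localName) {
        switch (localName) {
            case "connect": return ConnectElementSketch.class;
            case "query":   return QueryElementSketch.class;
            default:        return null; // unrecognised local name
        }
    }
}

class NamespaceSketch {
    // Returns the text after the final "/" of the namespace URI, which by the
    // convention described above is the fully qualified factory class name.
    // Returns null if the URI contains no "/" (an assumption of this sketch).
    static String factoryClassName(String namespaceUri) {
        int slash = namespaceUri.lastIndexOf('/');
        return slash < 0 ? null : namespaceUri.substring(slash + 1);
    }
}
```

With the made-up namespace URI "http://example.com/ext/com.example.MyElementFactory", factoryClassName() yields "com.example.MyElementFactory", which is the class that would be loaded as the factory.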

It's a good idea to choose a namespace that includes the string "ElementFactory". Saxon will spot this and produce a warning if anyone uses this namespace but forgets to declare it as an extension element namespace. This is an easy mistake to make and often a hard one to diagnose.

Implementing extension instructions

The best way to see how to implement an extension element is to look at the SQL extension elements provided as an example in package net.sf.saxon.sql, and at the sample stylesheet books-sql.xsl, which uses them.

The StyleElement class represents an element node in the stylesheet document. Saxon calls methods on this class to validate and type-check the element, and to generate a node in the expression tree that is evaluated at run-time. Assuming that the class is written to extend ExtensionInstruction, the methods it should provide are:

prepareAttributes()

This is called while the stylesheet tree is still being built, so it should not attempt to navigate the tree. Its task is to validate the attributes of the stylesheet element and perform any preprocessing necessary. For example, if the attribute is an attribute value template, this includes creating an Expression that can subsequently be evaluated to get the AVT's value.

validate()

This is called once the tree has been built, and its task is to check that the stylesheet element is valid "in context": that is, it may navigate the tree and check the validity of the element in relation to other elements in the stylesheet module, or in the stylesheet as a whole. By convention, a parent element contains checks on its children, rather than the other way around: this allows child elements to be reused in a new context without changing their code. The system will automatically call the method mayContainSequenceConstructor(). If this returns true, it will automatically check that all the children are instructions (that is, that their isInstruction() method returns true).

If the extension element is not allowed to have any children, you can call checkEmpty() from the validate() method. However, users will normally expect that an extension instruction is allowed to contain an xsl:fallback child instruction, and you should design for this.

If there are any XPath expressions in attributes of the extension instruction (for example a select attribute or an attribute value template), then the validate() method should call the typeCheck() method to process these expressions: for example select = typeCheck("select", select);

compile()

This is called to create an Expression object which is added to the expression tree. See below for further details.

isInstruction()

This should return true, to ensure that the element is allowed to appear within a template body.

mayContainSequenceConstructor()

This should return true, to ensure that the element can contain instructions. Even if it can't contain anything else, an extension element should allow an xsl:fallback instruction, to provide portability between processors.
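The division of labour between these methods can be illustrated with a self-contained sketch. It deliberately does not use the Saxon classes: InstructionSketch is a hypothetical stand-in for ExtensionInstruction with simplified signatures, compile() returns a Runnable instead of an Expression, and the "name" attribute and the greeting behaviour are invented for illustration. The real method signatures should be taken from the API documentation.

```java
// Hypothetical stand-in base class; NOT net.sf.saxon.style.ExtensionInstruction.
abstract class InstructionSketch {
    abstract void prepareAttributes();   // tree still being built: look at attributes only
    abstract void validate();            // tree complete: may check the element in context
    abstract Runnable compile();         // stand-in for returning an Expression
    boolean isInstruction() { return true; }
    boolean mayContainSequenceConstructor() { return true; }
}

class GreetInstructionSketch extends InstructionSketch {
    private final String rawName;            // raw value of a hypothetical "name" attribute
    private final int nonFallbackChildren;   // children other than xsl:fallback
    private String name;
    final StringBuilder output = new StringBuilder();

    GreetInstructionSketch(String rawName, int nonFallbackChildren) {
        this.rawName = rawName;
        this.nonFallbackChildren = nonFallbackChildren;
    }

    @Override void prepareAttributes() {
        // Attribute validation and preprocessing only; no tree navigation here.
        if (rawName == null || rawName.trim().isEmpty()) {
            throw new IllegalArgumentException("name attribute is required");
        }
        name = rawName.trim();
    }

    @Override void validate() {
        // Contextual check: allow xsl:fallback children but nothing else.
        if (nonFallbackChildren > 0) {
            throw new IllegalStateException("only xsl:fallback children are allowed");
        }
    }

    @Override Runnable compile() {
        // Produce the run-time action from the preprocessed attribute.
        return () -> output.append("Hello, ").append(name);
    }
}

class LifecycleSketch {
    // Calls the methods in the order described above, then runs the compiled action.
    static String run(GreetInstructionSketch g) {
        g.prepareAttributes();
        g.validate();
        g.compile().run();
        return g.output.toString();
    }
}
```

The point of the sketch is the ordering: attribute checks happen before the tree is complete, contextual checks afterwards, and only compile() produces something that runs at evaluation time.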

The StyleElement class has access to many services supplied either via its superclasses or via the XPathContext object. For details, see the API documentation of the individual classes.

The simplest way to implement the compile() method is to return an instance of a class that is defined as a subclass of SimpleExpression. However, in principle any Expression object can be returned, either an expression class that already exists within Saxon, or a user-written implementation. A subclass of SimpleExpression should implement the methods getImplementationMethod() and getExpressionType(), and depending on the value returned by getImplementationMethod(), should implement one of the methods evaluateItem(), iterate(), or process().
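The relationship between getImplementationMethod() and the three evaluation methods can be sketched as follows. This is an illustration under stated assumptions, not the Saxon API: SimpleExpressionSketch is a hypothetical stand-in for SimpleExpression, the Mode enum replaces whatever value the real getImplementationMethod() returns, items are modelled as plain strings, and getExpressionType() simply returns a descriptive label.

```java
// Hypothetical stand-in for SimpleExpression; names and signatures are simplified.
abstract class SimpleExpressionSketch {
    enum Mode { EVALUATE, ITERATE, PROCESS }

    abstract Mode getImplementationMethod();
    abstract String getExpressionType();

    // A subclass overrides only the method matching its declared mode.
    String evaluateItem() { throw new UnsupportedOperationException("evaluateItem"); }
    java.util.Iterator<String> iterate() { throw new UnsupportedOperationException("iterate"); }
    void process(StringBuilder out) { throw new UnsupportedOperationException("process"); }
}

// Example subclass that declares EVALUATE and therefore implements evaluateItem().
class BracketExpressionSketch extends SimpleExpressionSketch {
    private final String input;

    BracketExpressionSketch(String input) { this.input = input; }

    @Override Mode getImplementationMethod() { return Mode.EVALUATE; }
    @Override String getExpressionType() { return "bracket-sketch"; }
    @Override String evaluateItem() { return "[" + input + "]"; }
}

class DispatchSketch {
    // Plays the role of the caller: picks the evaluation method the expression declared.
    static String run(SimpleExpressionSketch e) {
        StringBuilder out = new StringBuilder();
        switch (e.getImplementationMethod()) {
            case EVALUATE:
                out.append(e.evaluateItem());
                break;
            case ITERATE:
                for (java.util.Iterator<String> it = e.iterate(); it.hasNext();) {
                    out.append(it.next());
                }
                break;
            case PROCESS:
                e.process(out);
                break;
        }
        return out.toString();
    }
}
```

In this sketch a subclass that declared ITERATE or PROCESS would override iterate() or process() instead, leaving the other two methods unimplemented.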
