Saxon Documentation

Integrated extension functions

There are two ways of writing extension functions. The traditional way is to map the name of the function to a Java or .NET method: specifically, the namespace URI of the function name maps to the Java or .NET class name, and the local part of the function name maps to the Java or .NET method name. These are known as reflexive extension functions, and are described in later pages.

In Saxon 9.2, this technique is supplemented by a new mechanism, referred to as integrated extension functions. With this approach, each extension function is implemented as a pair of Java or .NET classes. The first class, the ExtensionFunctionDefinition, provides general static information about the extension function (including its name, arity, and the types of its arguments and result). The second class, an ExtensionFunctionCall, represents a specific call on the extension function, and includes the call() method that Saxon invokes to evaluate the function.

There are several advantages in this approach:

You can choose any function name you like, in any namespace.
The function signature is made explicit, in terms of XPath types for the arguments and result.
There is no ambiguity about which of several candidate Java or .NET methods is invoked.
There is less scope for configuration problems involving dynamic loading of named classes.
All conversions from XPath values to Java or .NET values are entirely under user control.
The function implementation is activated at compile time, allowing it to perform optimization based on the expressions supplied as arguments, or to save parts of the static context that it needs, such as the static base URI or the current namespace context.
The function declares its properties, for example whether it uses the context item and whether it has side-effects, making it easier for the optimizer to manipulate the function call intelligently.
Integrated extension functions are more secure, because the function must be explicitly registered by the calling application before it can be called.
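The division of labour between the two classes, and the idea of saving static context at compile time, can be pictured with a small self-contained model. This is only a sketch: `SketchDefinition`, `SketchCall`, and `ResolveDefinition` are hypothetical stand-ins invented for illustration, not Saxon's `ExtensionFunctionDefinition` and `ExtensionFunctionCall`, and strings are used in place of XPath values.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the "definition" role: static facts about the function.
// This is NOT Saxon's ExtensionFunctionDefinition.
abstract class SketchDefinition {
    abstract String namespaceUri();
    abstract String localName();
    abstract int arity();
    // Invoked once per call site, at compile time.
    abstract SketchCall makeCall();
}

// Hypothetical stand-in for the "call" role: one object per call site.
// This is NOT Saxon's ExtensionFunctionCall.
abstract class SketchCall {
    // Static-context data saved at compile time for use at run time.
    protected String staticBaseUri;

    void supplyStaticContext(String baseUri) {
        this.staticBaseUri = baseUri;
    }

    abstract String call(List<String> arguments);
}

// Illustrative function: appends a relative path to the saved static base URI.
class ResolveDefinition extends SketchDefinition {
    String namespaceUri() { return "http://example.com/ext"; }
    String localName() { return "resolve"; }
    int arity() { return 1; }

    SketchCall makeCall() {
        return new SketchCall() {
            String call(List<String> arguments) {
                return staticBaseUri + arguments.get(0);
            }
        };
    }
}

class TwoClassSketch {
    public static void main(String[] args) {
        SketchDefinition definition = new ResolveDefinition();
        SketchCall callSite = definition.makeCall();          // compile time
        callSite.supplyStaticContext("http://example.com/docs/");
        System.out.println(callSite.call(Arrays.asList("a.xml"))); // run time
    }
}
```

Because each call site gets its own call object, two calls on the same function in modules with different base URIs keep their own saved context.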

When a stylesheet or query uses integrated extension functions and is run from the command line, the classes that implement these extension functions must be registered with the Configuration. On Saxon-PE and Saxon-EE this is most conveniently done by declaring them in a configuration file. For details see The Saxon Configuration File.

The arguments passed in a call to an integrated extension function are type-checked against the declared types in the same way as for any other XPath function call, including the standard conversions such as atomization and numeric promotion. The return value is checked against the declared return type but is not converted: it is the responsibility of the function implementation to return a value of the correct type.
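The asymmetry between arguments (converted) and results (checked only) can be illustrated with a simplified model. The sketch below is an assumption-laden analogy: `TypeCheckSketch` and its methods are hypothetical, and Java's `Number` and `Class` types stand in for XPath values and sequence types.

```java
// Hypothetical model, not Saxon code: Java types stand in for XPath types.
class TypeCheckSketch {

    // Argument side: numeric promotion converts the supplied number
    // to the declared type (here, double) before the function sees it.
    static double promoteArgument(Number supplied) {
        return supplied.doubleValue();
    }

    // Result side: the value is checked against the declared type but never
    // converted; a mismatch is reported as an error.
    static <T> T checkResult(Object result, Class<T> declared) {
        if (!declared.isInstance(result)) {
            throw new ClassCastException("Result " + String.valueOf(result)
                    + " does not match declared type " + declared.getSimpleName());
        }
        return declared.cast(result);
    }

    public static void main(String[] args) {
        // An integer argument is accepted where a double is declared.
        System.out.println(promoteArgument(Integer.valueOf(3)));
        // A result of the declared type passes the check unchanged.
        System.out.println(checkResult(Long.valueOf(7), Long.class));
    }
}
```

In this model, returning a `Double` where a `Long` is declared fails the check, even though a conversion would have been possible.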

The methods that must be implemented (or that may be implemented) by an integrated extension function are listed in the two tables below. Further details are in the Javadoc for the IntegratedFunction class.

First, the ExtensionFunctionDefinition class:

Method	Effect
getFunctionQName	Returns the name of the function, as a QName (represented by the Saxon class `StructuredQName`). Like all other functions, integrated extension functions must be in a namespace. The prefix part of the QName is immaterial.
getMinimumNumberOfArguments	Indicates the minimum number of arguments that must be supplied in a call to the function. A call with fewer arguments than this will be rejected as a static error.
getMaximumNumberOfArguments	Indicates the maximum number of arguments that may be supplied in a call to the function. A call with more arguments than this will be rejected as a static error.
getArgumentTypes	Returns the static type of each argument to the function, as an array with one member per argument. The type is returned as an instance of the Saxon class `net.sf.saxon.type.SequenceType`. Some of the more commonly-used types are represented by static constants in the `SequenceType` class. If there are fewer members in the array than there are arguments in the function call, Saxon assumes that all remaining arguments have the same type as the last one that is explicitly declared; this allows for functions with a variable number of arguments, such as `concat()`.
getResultType	Returns the static type of the result of the function. The actual result returned at runtime will be checked against this declared type, but no conversion takes place. Like the argument types, the result type is returned as an instance of `net.sf.saxon.type.SequenceType`. When Saxon calls this method, it supplies an array containing the inferred static types of the actual arguments to the function call. The implementation can use this information to return a more precise result, for example in cases where the value returned by the function is of the same type as the value supplied in the first argument.
trustResultType	This method normally returns `false`. It can return `true` if the implementor of the extension function is confident that no run-time checking of the function result is needed; that is, if the method is guaranteed to return a value of the declared result type.
dependsOnFocus	This method must return true if the implementation of the function accesses the context item, context position, or context size from the dynamic evaluation context. The method does not need to be implemented otherwise, as its default value is false.
hasSideEffects	This method should be implemented, and return true, if the function has side-effects of any kind, including constructing new nodes if the identity of the nodes is significant. When this method returns true, Saxon will try to avoid moving the function call out of loops or otherwise rearranging the sequence of calls. However, functions with side-effects are still discouraged, because the optimizer cannot always detect their presence if they are deeply nested within other calls.
makeCallExpression	This method must be implemented; it is called at compile time when a call to this extension function is identified, to create an instance of the relevant `ExtensionFunctionCall` object to hold details of the function call expression.
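The arity limits and the "repeat the last declared type" rule for `getArgumentTypes` amount to two small checks. The following is a minimal sketch of that logic only; `SignatureSketch` is a hypothetical helper, not a Saxon class, and type names are plain strings rather than `SequenceType` instances.

```java
// Hypothetical helper, not part of Saxon: models the rules stated in the table.
class SignatureSketch {

    // Arity rule: a call is accepted only if min <= supplied <= max.
    static boolean arityAccepted(int min, int max, int supplied) {
        return supplied >= min && supplied <= max;
    }

    // Type rule: an argument beyond the end of the declared array
    // takes the type of the last declared member.
    static String declaredTypeFor(String[] declared, int argIndex) {
        if (declared.length == 0 || argIndex < 0) {
            throw new IllegalArgumentException("no declared type for argument " + argIndex);
        }
        return declared[Math.min(argIndex, declared.length - 1)];
    }

    public static void main(String[] args) {
        // A variadic function in the style of concat(): one declared type, many arguments.
        String[] declared = {"xs:anyAtomicType?"};
        for (int i = 0; i < 3; i++) {
            System.out.println("argument " + i + ": " + declaredTypeFor(declared, i));
        }
    }
}
```

A function modelled this way would pair a one-element type array with a minimum of two arguments and a very large maximum.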

The methods defined on the second object, the ExtensionFunctionCall, are:

Method	Effect
supplyStaticContext	Saxon calls this method fairly early on during the compilation process to supply details of the static context in which the function call appears. The method may in some circumstances be called more than once; it will always be called at least once. As well as the static context information itself, the expressions supplied as arguments are also made available. If evaluation of the function depends on information in the static context, this information should be copied into private variables for use at run-time.
rewrite	Saxon calls this method at a fairly late stage during compilation to give the implementation the opportunity to optimize itself, for example by performing partial evaluation of intermediate results, or if all the arguments are compile-time constants (instances of `net.sf.saxon.expr.Literal`) even by early evaluation of the entire function call. The method can return any `Expression` (which includes the option of returning a `Literal` to represent the final result); the returned expression will then be evaluated at run-time in place of the original. It is entirely the responsibility of the implementation to ensure that the substitute expression is equivalent in every way, including the type of its result.
copyLocalData	Saxon occasionally needs to make a copy of an expression tree. When it copies an integrated function call it will invoke this method, which is responsible for ensuring that any local data maintained within the function call objects is correctly copied.
call	Saxon calls this method at run-time to evaluate the function. The value of each argument is supplied in the form of a `SequenceIterator`, that is, an iterator over the items in the sequence that make up the value of the argument. This may use lazy evaluation, which means that a dynamic error can occur when reading the next item from the `SequenceIterator`; it also means that if the implementation does not require all the items from the value of one of the arguments, they will not necessarily be evaluated at all. It is good practice to call the `close()` method on the iterator if it is not read to completion. The implementation also delivers the result in the form of a `SequenceIterator`, which in turn means that the result may be subject to delayed evaluation: the calling code will only access items in the result as they are required, and may not always read the result to completion. To return a singleton result, use the class `net.sf.saxon.om.SingletonIterator`; to return an empty sequence, return the unique instance of `net.sf.saxon.om.EmptyIterator`.
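The consequence of lazy argument evaluation described for `call` can be shown with a toy iterator. This sketch assumes a simplified interface: `SketchIterator`, `CountingLazyIterator`, and `LazyCallSketch` are hypothetical names, and the interface only imitates the next-then-close usage pattern rather than reproducing Saxon's `SequenceIterator`.

```java
import java.util.function.IntFunction;

// Hypothetical stand-in for a pull iterator; NOT Saxon's SequenceIterator.
interface SketchIterator {
    String next();   // returns null when the sequence is exhausted
    void close();    // releases the iterator early
}

// Computes each item only when it is requested, and counts how many were computed.
class CountingLazyIterator implements SketchIterator {
    private final int size;
    private final IntFunction<String> producer;
    private int position = 0;
    int itemsEvaluated = 0;
    boolean closed = false;

    CountingLazyIterator(int size, IntFunction<String> producer) {
        this.size = size;
        this.producer = producer;
    }

    public String next() {
        if (closed || position >= size) {
            return null;
        }
        itemsEvaluated++;
        return producer.apply(position++);
    }

    public void close() {
        closed = true;
    }
}

class LazyCallSketch {
    // Models a function body that needs only the first item of its argument.
    static String firstItem(SketchIterator argument) {
        String first = argument.next();
        argument.close();   // not read to completion, so close it
        return first;
    }

    public static void main(String[] args) {
        CountingLazyIterator argument = new CountingLazyIterator(1000, i -> "item" + i);
        System.out.println(firstItem(argument));
        System.out.println("items evaluated: " + argument.itemsEvaluated);
    }
}
```

Only the items actually pulled are ever computed, which is why an error in a later item may never surface if the function stops reading early.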

Once an integrated extension function has been written, it must be registered with Saxon so that calls on the function are recognized by the parser. This is done using the registerExtensionFunction method available on the Configuration class, and also on the s9api Processor class. The function can also be registered via an entry in the configuration file. The function can be given any name, although names in the fn:, xs:, and saxon: namespaces are strongly discouraged and may not work.
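The effect of explicit registration can be modelled as a lookup table keyed by namespace URI and local name. The sketch below is illustrative only: `RegistrySketch` is a hypothetical class and does not show the signature of Saxon's `registerExtensionFunction`. It demonstrates two points made on this page: the prefix plays no part in identifying the function, and a function that has not been registered cannot be called.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical model of a function registry; not Saxon's Configuration or Processor.
class RegistrySketch {
    private final Map<String, Function<String, String>> functions = new HashMap<>();

    // The key uses the namespace URI and local name only; no prefix is involved.
    private static String key(String uri, String local) {
        return "{" + uri + "}" + local;
    }

    void register(String uri, String local, Function<String, String> implementation) {
        functions.put(key(uri, local), implementation);
    }

    String call(String uri, String local, String argument) {
        Function<String, String> implementation = functions.get(key(uri, local));
        if (implementation == null) {
            // Unregistered functions are rejected rather than loaded dynamically.
            throw new IllegalStateException("Unknown function " + key(uri, local));
        }
        return implementation.apply(argument);
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        registry.register("http://example.com/ext", "upper", String::toUpperCase);
        System.out.println(registry.call("http://example.com/ext", "upper", "saxon"));
    }
}
```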

It is also possible to register integrated extension functions under XQJ, using the SaxonXQStaticContext class which implements the XQStaticContext interface.

There are corresponding classes in the .NET API, which can be used to define an extension function written in a .NET language such as C#.