Writing input filters

Saxon can take its input from a JAXP SAXSource object, which essentially represents the stream of SAX events produced by an XML parser. A very useful technique is to interpose a filter between the parser and Saxon. The filter will typically be an implementation of the SAX2 XMLFilter interface.
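As an illustration of interposing a filter, the following is a minimal sketch that uses only standard JAXP and SAX classes. The UpperCaseFilter class and the transform() helper are hypothetical names invented for this example. TransformerFactory.newInstance() returns whichever JAXP implementation is configured; the sketch assumes that, with Saxon on the classpath, the same code would hand the filtered events to Saxon.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class UpperCaseFilterDemo {

    /** Hypothetical filter: upper-cases all character data passing through. */
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            char[] upper = new String(ch, start, length)
                    .toUpperCase(Locale.ROOT).toCharArray();
            // Forward the modified text to whatever sits downstream.
            super.characters(upper, 0, upper.length);
        }
    }

    static String transform(String xml) throws Exception {
        // A real XML parser is the start of the pipeline.
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        XMLReader parser = spf.newSAXParser().getXMLReader();

        // The filter wraps the parser and is itself presented as the XMLReader.
        XMLReader filter = new UpperCaseFilter(parser);
        SAXSource source = new SAXSource(filter, new InputSource(new StringReader(xml)));

        // An identity transformation stands in for a real stylesheet here.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        StringWriter out = new StringWriter();
        identity.transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform("<greeting>hello</greeting>"));
    }
}
```

The transformer never sees the parser directly: it calls parse() on the filter, which delegates to the parser and rewrites the events on their way through.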

There are a number of ways of using a Saxon XSLT transformation as part of a pipeline of filters. Some of these techniques also work with XQuery. The techniques include:

  • Generate the transformation as an XMLFilter using the newXMLFilter() method of the SAXTransformerFactory. This works with XSLT only. A drawback of this approach is that it is not possible to supply parameters to the transformation using standard JAXP facilities. It is possible, however, by casting the XMLFilter to a net.sf.saxon.jaxp.FilterImpl and calling its getTransformer() method, which returns a Transformer object offering the usual setParameter() method.

  • Generate the transformation as a SAX ContentHandler using the newTransformerHandler() method of the SAXTransformerFactory. The pipeline stages after the transformation can be added by giving the transformation a SAXResult as its destination. Again, this works with XSLT only.

  • Implement the pipeline step before the transformation or query as an XMLFilter, and use this as the XMLReader part of a SAXSource, so that it plays the role of an XML parser. This technique works with both XSLT and XQuery, and it can even be used from the command line, by nominating the XMLFilter as the source parser using the -x option.
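The first two techniques can be sketched with standard JAXP calls alone. The class name PipelineDemo, its helper methods, and the one-template stylesheet are assumptions made for this example, and the code relies on the configured TransformerFactory being a SAXTransformerFactory (which it checks only by the cast). The Saxon-specific cast to FilterImpl for setting parameters is not shown.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXResult;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;

class PipelineDemo {

    // Illustrative stylesheet: wraps the document's text in an <out> element.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><out><xsl:value-of select='.'/></out></xsl:template>"
      + "</xsl:stylesheet>";

    static SAXTransformerFactory factory() {
        return (SAXTransformerFactory) TransformerFactory.newInstance();
    }

    static XMLReader parser() throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        return spf.newSAXParser().getXMLReader();
    }

    /** Technique 1: the transformation is an XMLFilter sitting on top of the parser. */
    static String viaXmlFilter(String xml) throws Exception {
        SAXTransformerFactory stf = factory();
        XMLFilter stage = stf.newXMLFilter(new StreamSource(new StringReader(XSLT)));
        stage.setParent(parser());

        // An identity TransformerHandler serializes the filter's output events.
        StringWriter out = new StringWriter();
        TransformerHandler serializer = stf.newTransformerHandler();
        serializer.setResult(new StreamResult(out));
        stage.setContentHandler(serializer);

        stage.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    /** Technique 2: the transformation is a ContentHandler fed by the parser. */
    static String viaTransformerHandler(String xml) throws Exception {
        SAXTransformerFactory stf = factory();
        TransformerHandler stage =
            stf.newTransformerHandler(new StreamSource(new StringReader(XSLT)));

        StringWriter out = new StringWriter();
        TransformerHandler serializer = stf.newTransformerHandler();
        serializer.setResult(new StreamResult(out));

        // The SAXResult is where later pipeline stages are attached.
        stage.setResult(new SAXResult(serializer));

        XMLReader reader = parser();
        reader.setContentHandler(stage);
        reader.parse(new InputSource(new StringReader(xml)));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(viaXmlFilter("<doc>hello</doc>"));
        System.out.println(viaTransformerHandler("<doc>hello</doc>"));
    }
}
```

In the second method, stage.getTransformer() gives access to the underlying Transformer through standard JAXP, which is one reason to prefer a TransformerHandler when parameters are needed.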

The -x option on the Saxon command line specifies the parser that Saxon will use to process the source files. The nominated class must implement the SAX2 XMLReader interface, but it is not required to be a real XML parser; it can take its input from any kind of source file, so long as it presents that input as a stream of SAX events. When using the JAXP API, the equivalent of the -x option is to set the configuration property SOURCE_PARSER_CLASS.
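To show what an XMLReader that is not a real XML parser might look like, here is a minimal sketch that presents comma-separated text as SAX events. CsvXMLReader, the element names rows, row and cell, and the toXml() helper are all invented for this example. It extends XMLFilterImpl with no parent purely to inherit the XMLReader boilerplate, and it reads only from the InputSource's character stream; how the InputSource is populated when a processor invokes the reader from the command line is not covered here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

class CsvReaderDemo {

    /** Hypothetical XMLReader: turns CSV lines into <rows>/<row>/<cell> events. */
    static class CsvXMLReader extends XMLFilterImpl {
        private static final AttributesImpl NO_ATTRS = new AttributesImpl();

        @Override
        public void parse(InputSource input) throws SAXException, IOException {
            Reader chars = input.getCharacterStream();
            if (chars == null) {
                // Assumption of this sketch: byte streams and system IDs are not handled.
                throw new SAXException("this sketch supports character streams only");
            }
            BufferedReader lines = new BufferedReader(chars);

            // The inherited event methods forward to the registered ContentHandler.
            startDocument();
            startElement("", "rows", "rows", NO_ATTRS);
            String line;
            while ((line = lines.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                startElement("", "row", "row", NO_ATTRS);
                for (String cell : line.split(",", -1)) {
                    startElement("", "cell", "cell", NO_ATTRS);
                    characters(cell.toCharArray(), 0, cell.length());
                    endElement("", "cell", "cell");
                }
                endElement("", "row", "row");
            }
            endElement("", "rows", "rows");
            endDocument();
        }
    }

    static String toXml(String csv) throws Exception {
        // The CSV reader takes the place of the XML parser in the SAXSource.
        SAXSource source =
            new SAXSource(new CsvXMLReader(), new InputSource(new StringReader(csv)));
        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
            .transform(source, new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toXml("a,b\nc,d"));
    }
}
```

The split on commas is deliberately naive (no quoting or escaping); the point is only that the downstream transformation cannot tell these events from those of a real parser.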