Detailed Pattern-Matching Statistics

Techniques described here are experimental and only available in Saxon-EE.

Saxon-EE 9.7 has an additional experimental package (supplied in saxon9-stats.jar) that analyses template pattern matching performance, providing comparative statistics between a number of possible configurations and optimization features operating on the same input data. The results are presented as a small linked set of web pages and SVG graphs.

To run this tool, with a classpath that includes a Saxon-EE jar, saxon9-stats.jar and a suitable licence file, execute:

java com.saxonica.StatsTransform -s:source? -xsl:stylesheet? -o:output? -stats:statsfile? ...

where statsfile is an optional control file described below. Many of the more usual Saxon control options (-repeat: etc.) are relevant.

The stylesheet will be executed over the input for each of the cases given, with statistical data being written into files within suitable local directories. An internal stylesheet then generates a set of web pages, whose master index will be at the location statistics/{replace(statsfile,'\.xml$','.html')}, i.e. in a relative subdirectory statistics, with a file name that usually substitutes html for the trailing xml.

Statistics Control File

The (optional) statistics control file has the following generic form:

<stats xsl="stylesheet"? s="source"? description="description"?>
  <group name="group-name" show="details? time? call? rank?">
    <case label="case-tag ; case-label"? dir="caseWorkingDirectory/"/>?
  </group> *
</stats>

where the content is defined as follows:

stylesheet - The stylesheet to execute, relative to this file. This can be overridden through the -xsl switch. At least one of these (attribute or switch) must be present.
source - The source document to process, relative to this file. This can be overridden through the -s switch. At least one of these (attribute or switch) must be present unless an initial template has been specified via a command-line switch.
group-name - The name to use for the comparison group, which should be usable as a directory name.
show - The aspects to display, given as a whitespace-separated list of directives:
  • details - detailed comparison of individual templates in differing configurations
  • time - relative comparison of the templates with the most expensive pattern-matching time, for each mode
  • call - relative comparison of the most called templates, for each mode
  • rank - display of the rule rank distribution for each mode
case-tag - A tag to use as a class for all CSS styling of elements of that group, particularly in graphics.
case-label - A string to label that particular case, in tables and in keys on graphs.
caseWorkingDirectory - The working directory to use for this particular case. In particular, if a configuration file is required for that case, then config.xml should exist in that directory, and either a label must be defined on the case element or the top-level configuration element should have a @label attribute (e.g. label="Index ; comparison indexation") so that the results can be labelled and differentiated. This directory is used both to keep generated statistics and other helper files and to hold the output from the execution. (This implies that result-document URIs are relative to a point inside this directory.)

An example is:

file stats.simple.index.xml =

<stats>
  <group name="statsTransform.default" show="details time call rank">
    <case label="OFF ; Normal XSLT templating" dir="statsTransform/"/>
    <case dir="statsTransformIndex/"/>
  </group>
</stats>

Here two different Saxon configurations are compared, each running the stylesheet given on the command line against the source file also given on the command line. The first case is normal XSLT behaviour; the second uses Saxon as configured by the configuration file statsTransformIndex/config.xml. (It is assumed that this configuration file has a /configuration/@label description of similar form, from which suitable labels can be generated.)
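As a rough illustration of the labelling requirement, a minimal statsTransformIndex/config.xml might look like the sketch below. The label value is borrowed from the example given earlier for the @label attribute, and the namespace and edition attribute follow the standard Saxon configuration file format. The pattern-optimization settings themselves are deliberately left out, since their form is described separately.

```xml
<!-- statsTransformIndex/config.xml: illustrative sketch only -->
<configuration xmlns="http://saxon.sf.net/ns/configuration"
               edition="EE"
               label="Index ; comparison indexation">
  <!-- Experimental pattern optimization settings for this case would go here;
       they are omitted because their syntax is not covered in this section. -->
</configuration>
```

Since the label follows the "case-tag ; case-label" form, this case would presumably be tagged Index for CSS styling and shown as "comparison indexation" in tables and graph keys.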

The results will appear in the web page statistics/stats.simple.index.html.
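A control file can also name the stylesheet and source itself rather than relying on the command line. The following is only a sketch under assumptions: the file names books.xsl and books.xml, the group name, the description text and the directory names are all hypothetical, and only the time and call displays are requested.

```xml
<!-- stats.books.xml: hypothetical control file; all names are placeholders -->
<stats xsl="books.xsl" s="books.xml" description="Template matching on the books sample">
  <group name="books.compare" show="time call">
    <!-- Baseline: no config.xml is assumed in this directory, so the label is given here -->
    <case label="OFF ; Normal XSLT templating" dir="booksDefault/"/>
    <!-- Variant: booksIndex/config.xml is assumed to exist and configure Saxon for this case -->
    <case label="Index ; comparison indexation" dir="booksIndex/"/>
  </group>
</stats>
```

Run with -stats:stats.books.xml, and following the naming rule given earlier, the index page would be expected at statistics/stats.books.html.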

For details of the experimental template pattern optimization options, which can be defined in the configuration file, and for which this analysis package was developed, see The <PatternOptimization> element.