Saxonica: Publications

Publications

Books

XSLT 2.0 and XPath 2.0 Programmer's Reference, 4th edition, by Michael Kay, published by Wrox Press. This book is widely recognized as the authoritative reference on the XSLT 2.0 language, second only to the W3C specification itself. It covers every feature of the language comprehensively, while at the same time explaining the concepts behind the language design, and giving many examples of practical stylesheets to illustrate each language feature.

Michael Kay's XSLT 2.0 and XPath 2.0 (for XML, XSLT, and XPath) is some of the best money I've ever spent on XML-technology-related documentation - it is a fantastic piece of work.

— Bridger Dyson-Smith, posting on xsl-list, 2 August 2014

Find it on amazon.com

Previous editions

  • The third edition was published in two separate volumes, covering XSLT 2.0 and XPath 2.0 separately. This edition was produced before the final specifications were ratified by W3C, so there are some inaccuracies. The two-volume format was not popular with readers, particularly as many made the mistake of buying the XSLT volume on its own, without realising that it relied heavily on the reader also having access to the XPath book. Navigation in the book was also difficult because of the absence of running heads for the alphabetical chapters. The fourth edition corrects all these problems, and has received a much more enthusiastic reception.
  • The second edition remains in print, and is useful as the definitive reference to the XSLT 1.0 language (though it does include some features from the draft XSLT 1.1 specification, which W3C abandoned just before the book went to print).
  • The first edition was published in April 2000, very soon after the XSLT 1.0 specification was ratified. It quickly established itself as the definitive guide to the language and played a significant part in ensuring the rapid and successful adoption of XSLT by the user community.

Also available:

XQuery from the Experts: A Guide to the W3C XML Query Language http://www.amazon.com/exec/obidos/ASIN/0321180607

Eight chapters by members of W3C's Query Working Group provide an overview of XQuery designed to be of interest to programmers at every skill level. Coverage ranges from strictly technical subjects to historical essays on the language's ancestry and the process behind XQuery's design. The book presents its material in both tutorial and reference form.

Michael Kay's chapter provides a high-level comparison of XQuery and XSLT, looking both at the differences between the two languages and at their similarities.

Chapter Three is especially helpful for understanding the similarities and differences between XQuery, XPath and XSLT. To really understand where XQuery fits, you must understand this interrelationship. Not only does Mr. Kay do a great job explaining that, he actually makes it fun to read.

— A quote from a reader's review

Published Papers and Articles

Projection and Streaming: Compared, Contrasted, and Synthesized

Michael Kay. Presented at XML Prague 2017.

This paper describes, compares, and contrasts two techniques designed to enable an XML document to be processed without building an entire tree representation of the document in memory. Document projection analyses a query to determine which parts of the document are relevant to the query, and discards everything else during source document parsing. Streaming attempts to execute a stylesheet "on the fly" while the source document is being read. For both techniques, the paper describes the way that they are implemented in the Saxon XSLT and XQuery engine. Performance results are given for both techniques, using the queries in the XMark benchmark applied to a 118 MB source document. The paper concludes with a discussion of ideas for combining the benefits of both techniques and getting more synergy between them.

Michael Kay. "Projection and Streaming: Compared, Contrasted, and Synthesized". XML Prague 2017. http://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf

XPath 3.1 in the Browser

John Lumley, Debbie Lockett, Michael Kay. Presented at XML Prague 2017.

This paper discusses the implementation of an XPath 3.1 processor with high levels of standards compliance that runs entirely within current modern browsers. The runtime engine Saxon-JS, written in JavaScript and developed by Saxonica, used to run pre-compiled XSLT 3.0 stylesheets, is extended with a dynamic XPath parser and converter to the Saxon-JS compilation format. This is used to support both XSLT's xsl:evaluate instruction and a JavaScript API XPath.evaluate() which supports XPath outside an XSLT context.

John Lumley, Debbie Lockett, and Michael Kay. "XPath 3.1 in the Browser". XML Prague 2017. http://archive.xmlprague.cz/2017/files/xmlprague-2017-proceedings.pdf

Approximate CSS Styling in XSLT

John Lumley. Presented at Balisage 2016, Washington DC.

This paper discusses transforming a CSS stylesheet into an XSLT transform that projects an approximation of the styling from the CSS onto a target XML document. It was developed during several XSLT-based projects involving multi-dialect XML documents, where there was a need either to evaluate CSS properties for another external tool, such as in an HTML → XSL-FO → PDF pipeline, or where a document styling needed to be "fixed" for embedding in another document, such as examples in professional papers. The paper presents examples, explains the general architecture of the generated XSLT transform, discusses how that transform is itself constructed from the CSS stylesheet and outlines the strengths and weaknesses and some of the directions in which the tool could be developed. It is approximate in that it only supports some of the core CSS features, assumes the user is "skilled in the art" and is working with CSS stylesheets that are understood and visible, and that the execution speed of the CSS "projection" is not an issue. Nevertheless, in the author's experience the ability to mix CSS styling into the "XSLT researcher's toolbox" has proved to be of some utility.

Lumley, John. "Approximate CSS Styling in XSLT". Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). doi:10.4242/BalisageVol17.Lumley01.

Saxon-JS: XSLT 3.0 in the Browser

Debbie Lockett and Michael Kay. Presented at Balisage 2016, Washington DC.

We introduce Saxon-JS, an XSLT 3.0 run-time written in pure JavaScript. We've effectively split the Saxon product into its compile time and run time components. The compiler runs on the server, and generates an intermediate representation of the compiled and optimized stylesheet in a custom XML format. Saxon-JS, running on the browser, reads in the compiled stylesheet and executes it. We describe some particular features of Saxon-JS: the event-handling extensions to the XSLT language (as used for Saxon-CE), the way that XSLT and JavaScript can interwork, conformance to the W3C XSLT and XPath specifications, and some details of the internal implementation.

Lockett, Debbie, and Michael Kay. "Saxon-JS: XSLT 3.0 in the Browser." Presented at Balisage: The Markup Conference 2016, Washington, DC, August 2 - 5, 2016. In Proceedings of Balisage: The Markup Conference 2016. Balisage Series on Markup Technologies, vol. 17 (2016). doi:10.4242/BalisageVol17.Lockett01.

Transforming JSON using XSLT 3.0

Michael Kay. Presented at XML Prague 2016.

The XSLT 3.0 and XPath 3.1 specifications, now at Candidate Recommendation status, introduce capabilities for importing and exporting JSON data, either by converting it to XML, or by representing it natively using new data structures: maps and arrays. The purpose of this paper is to explore the usability of these facilities for tackling some practical transformation tasks. Two representative transformation tasks are considered, and solutions for each are provided either by converting the JSON data to XML and transforming that in the traditional way, or by transforming the native representation of JSON as maps and arrays. The exercise demonstrates that the absence of parent or ancestor axes in the native representation of JSON means that the transformation task needs to be approached in a very different way.

Kay, Michael. "Transforming JSON using XSLT 3.0". XML Prague 2016. http://archive.xmlprague.cz/2016/files/xmlprague-2016-proceedings.pdf
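
A minimal sketch of the abstract's final point, using Python dicts and lists as a stand-in for XPath 3.1 maps and arrays (an analogy, not code from the paper). A nested value holds no reference to its container, so nothing plays the role of the parent or ancestor axis; any context a leaf needs has to be carried down during the descent. The `walk` function and the sample order document are hypothetical.

```python
import json


def walk(value, path=()):
    """Yield (path, leaf) pairs, passing the ancestors' keys down explicitly."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from walk(child, path + (key,))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from walk(child, path + (index,))
    else:
        # A leaf cannot ask "who is my parent?"; it only knows the path
        # that the caller chose to hand it.
        yield path, value


doc = json.loads('{"order": {"id": 7, "items": [{"sku": "A1"}, {"sku": "B2"}]}}')
leaves = list(walk(doc))
for path, leaf in leaves:
    print(path, leaf)
```

In the XML representation of the same data, a template matching a `sku` value could simply navigate upwards to find the enclosing order; here that information must be threaded through the recursion as a parameter.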

Two from Three (in XSLT)

John Lumley. Presented at Balisage 2015, Washington DC.

This paper discusses automated methods of 'downgrading' XSLT 3.0 programs into XSLT 2.0 syntax and semantics. The stimulus was running portions of a document processing system that had been upgraded to use more coherent features of XSLT 3.0 in the environment of a browser-based standards-compliant XSLT 2.0 implementation (Saxon-CE). The work involves detailed knowledge of XSLT and is intended to automate significant sections of the 'downconversion', leaving other sections to conditional compilation directives. All conversion tools are of course written in XSLT and several aspects involve partial processing and evaluation of XSLT semantics within XSLT.

Lumley, John. "Two from Three (in XSLT)". Presented at Balisage: The Markup Conference 2015, Washington, DC, August 11 - 14, 2015. In Proceedings of Balisage: The Markup Conference 2015. Balisage Series on Markup Technologies, vol. 15 (2015). doi:10.4242/BalisageVol15.Lumley01.

Improving Pattern Matching Performance in XSLT

John Lumley and Michael Kay. Presented at XML London 2015 and again at Balisage 2015, Washington DC.

This paper discusses improving the performance of XSLT programs that use very large numbers of similar patterns in their push-mode templates. The experimentation focuses on stylesheets used for processing DITA document frameworks, where much of the document logical structure is encoded in @class attributes. The processing stylesheets, often defined in XSLT 1.0, use string-containment tests on these attributes to describe push-template applicability. In some cases this can mean that a few hundred string tests have to be performed for every element node in the input document to determine which template to evaluate, so that up to 30% of the entire processing time is sometimes taken up with such pattern matching. The paper examines methods, within XSLT implementations, to ameliorate this situation, including using sets of pattern preconditions and pretokenization of the class-describing attributes. How such optimisation may be configured for an XSLT implementation is discussed.

Dr. John Lumley and Dr. Michael Kay. "Improving Pattern Matching Performance in XSLT". Presented at XML London 2015, June 6 - 7th, 2015. doi:10.14337/XMLLondon15.Lumley01.
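
The pretokenization idea can be illustrated with a small Python sketch. This is an analogy under stated assumptions, not Saxon's implementation: the rule list, the function names, and the simplification that each template tests for exactly one space-delimited class token are all hypothetical. The naive matcher performs one substring test per rule, as a chain of contains(@class, ...) patterns would; the pretokenized matcher splits the attribute once and looks each token up in a hash index.

```python
# Hypothetical template rules in priority order; DITA-style class values
# are assumed to look like "- topic/li task/step " (space-delimited tokens).
RULES = [" topic/p ", " topic/ul ", " topic/li ", " task/step "]


def match_naive(class_attr):
    """Try every rule's substring test in turn until one succeeds."""
    for rule in RULES:
        if rule in class_attr:
            return rule.strip()
    return None


# Built once, up front: token -> position of the rule that tests for it.
RULE_INDEX = {rule.strip(): pos for pos, rule in enumerate(RULES)}


def match_pretokenized(class_attr):
    """Tokenize the attribute once, then use hash lookups instead of scans."""
    best = None
    for token in class_attr.split():
        pos = RULE_INDEX.get(token)
        if pos is not None and (best is None or pos < best):
            best = pos
    return None if best is None else RULES[best].strip()


for value in ["- topic/p ", "- topic/li task/step ", "- map/map "]:
    print(repr(value), match_naive(value), match_pretokenized(value))
```

With a few hundred rules, the naive version does work proportional to the number of rules for every element, while the indexed version does work proportional to the number of tokens in the attribute.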

Parallel Processing in the Saxon XSLT Processor

Michael Kay. Presented at XML Prague 2015.

One of the supposed benefits of using declarative languages (like XSLT) is the potential for parallel execution, taking advantage of the multi-core processors that are now available in commodity hardware. This paper describes recent developments in one popular XSLT processor, Saxon, which start to exploit this potential. It outlines the challenges in implementing parallel execution, and reports on the benefits that have been observed.

Kay, Michael. "Parallel Processing in the Saxon XSLT Processor". XML Prague 2015. http://archive.xmlprague.cz/2015/files/xmlprague-2015-proceedings.pdf

Analysing XSLT Streamability

John Lumley. Presented at Balisage 2014, Washington DC.

Determining streamability of constructs in XSLT 3.0 involves the application of a set of rules that appear to be complex. A tool that analyses these rules on a given stylesheet has been developed to help developers understand why sections which were designed with streaming might fail the required conditions. This paper discusses the structure of this analysis tool.

Lumley, John. "Analysing XSLT Streamability". Presented at Balisage: The Markup Conference 2014, Washington, DC, August 5 - 8, 2014. In Proceedings of Balisage: The Markup Conference 2014. Balisage Series on Markup Technologies, vol. 13 (2014). doi:10.4242/BalisageVol13.Lumley01.

Benchmarking XSLT Performance

Michael Kay and Debbie Lockett. Presented at XML London 2014.

This paper presents a new benchmarking framework for XSLT. The project, called XT-Speedo, is open source and we hope that it will attract a community of developers. The tangible deliverable consists of a set of test material, a set of test drivers for various XSLT processors, and tools for analyzing the test results. Underpinning these deliverables is a methodology and set of measurement objectives that influence the design and selection of material for the test suite, which are also described in this paper.

Dr. Michael Kay and Dr. Debbie Lockett. "Benchmarking XSLT Performance". Presented at XML London 2014, June 7 - 8th, 2014. doi:10.14337/XMLLondon14.Kay01.

Streaming in the Saxon XSLT Processor

Michael Kay. Presented at XML Prague 2014.

Streaming is a major new feature of the XSLT 3.0 specification, currently a Last Call Working Draft. This paper discusses streaming as defined in the W3C specification, and as implemented in Saxon. Streaming refers to the ability to transform a document that is too big to fit in memory, which depends on the transformation itself being in some sense linear, so that pieces of the output appear in the same order as the pieces of the input on which they depend. This constraint is reflected in the W3C specification by a set of streamability rules that determine statically whether a stylesheet is streamable or not. This paper gives a tutorial introduction to the streamability rules and the way they are implemented in Saxon. It then goes on to describe the implementation architecture for implementing streaming in the Saxon run-time, by means of push pipelines, and gives the rationale for this choice of architecture.

Kay, Michael. "Streaming in the Saxon XSLT Processor". XML Prague 2014. http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf

Finalising a (small) Standard

John Lumley. Presented at XML Prague 2014.

This paper discusses issues and lessons that arose during the finalisation of a standard (library) for XSLT/XPath/XQuery extension functions to manipulate binary data. This process took place during 2013 in the EXPath community, through shared (mailing-list) commenting, specification redrafting, implementation experimentation and test suite development. The purpose, form and specification of the library (which isn’t technically difficult) are described briefly. Lessons and suggestions arising from the development are presented in four broad categories: establishing policies, concurrent implementation and application, using tools and declarative approaches, and pragmatic issues. None of these lessons is new, but they bear reinforcement. This work was performed under the auspices of the EXPath community and was funded by Saxonica Ltd.

Lumley, John. "Finalising a (small) Standard". XML Prague 2014. http://archive.xmlprague.cz/2014/files/xmlprague-2014-proceedings.pdf

XML on the Web: Is it still relevant?

O'Neil Delpratt. Presented at XML London 2013.

This paper discusses what is meant by the term XML on the Web and how this relates to the browser. The success of XSLT in the browser has so far been underwhelming, and the paper examines the reasons for this and considers whether the situation might change. It describes the capabilities of the first XSLT 2.0 processor designed to run within web browsers, bringing not just the extra capabilities of a new version of XSLT, but also a new way of thinking about how XSLT can be used to create interactive client-side applications. Using this processor, the author demonstrates, as a use case, a technical documentation application which permits browsing and searching in an intuitive way, and shows its internals to illustrate how it works.

O'Neil Delpratt. "XML on the Web: Is it still relevant?". Presented at XML London 2013, June 15 - 16th, 2013. doi:10.14337/XMLLondon13.Delpratt01.

Multi-user interaction using client-side XSLT

O'Neil Delpratt and Michael Kay. Presented at XML Prague 2013.

This paper describes two use-case applications to illustrate the capabilities of the first XSLT 2.0 processor designed to run within web browsers. The first is a technical documentation application, which permits browsing and searching in an intuitive way. The second is a multi-player chess game application; although it uses the same XSLT 2.0 processor as the first application, it is very different in purpose and design, in that it provides multi-user interaction on the GUI and implements communication via a social media network, namely Twitter.

O'Neil Delpratt and Michael Kay. "Multi-user interaction using client-side XSLT". XML Prague 2013. http://archive.xmlprague.cz/2013/files/xmlprague-2013-proceedings.pdf

The Effects of Bytecode Generation in XSLT and XQuery

O'Neil Delpratt and Michael Kay. Presented at Balisage 2011, Montréal.

This paper discusses the highly efficient optimization of expressions in today's XSLT and XQuery processors and presents further speed improvements that can be gained by generating bytecode rather than interpreting queries directly. Although optimization produces the most throughput gain, the gains from optimization and bytecode generation are orthogonal, and compilation can produce about 25% gain over and above gains from optimization. Tests with two variants of a well-known XSLT/XQuery processor, one with code generation and one with optimization alone, demonstrate the effect on a range of queries.

Delpratt, O'Neil Davion, and Michael Kay. "The Effects of Bytecode Generation in XSLT and XQuery". Presented at Balisage: The Markup Conference 2011, Montréal, Canada, August 2 - 5, 2011. In Proceedings of Balisage: The Markup Conference 2011. Balisage Series on Markup Technologies, vol. 7 (2011). doi:10.4242/BalisageVol7.Delpratt01.

A Streaming XSLT Processor

XSLT transformations can refer to any information in the source document from any point in the stylesheet, without constraint; XSLT implementations typically support this freedom by building a tree representation of the entire source document in memory and in consequence can process only documents which fit in memory. But many transformations can in principle be performed without storing the entire source tree. The paper (given at Balisage 2010, Montréal) reports on the progress of the W3C XSL Working Group implementation of a new version of XSLT, designed to make streamed implementations of XSLT feasible.

Kay, Michael. "A Streaming XSLT Processor". Presented at Balisage: The Markup Conference 2010, Montréal, Canada, August 3 - 6, 2010. In Proceedings of Balisage: The Markup Conference 2010. Balisage Series on Markup Technologies, vol. 5 (2010). doi:10.4242/BalisageVol5.Kay01.

You Pull, I’ll Push: On the Polarity of Pipelines

This paper (given at Balisage 2009, Montréal) discusses the most effective way to move XML data through a processing pipeline. It draws on the concept of program inversion, originally developed to eliminate bottlenecks in magnetic-tape-based processes, and ideas derived from Jackson Structured Programming which allow processes written in a convenient pull style to be compiled into push-style code; thus potentially reducing both coordination overhead and latency.

Kay, Michael. "You Pull, I’ll Push: on the Polarity of Pipelines". Presented at Balisage: The Markup Conference 2009, Montréal, Canada, August 11 - 14, 2009. In Proceedings of Balisage: The Markup Conference 2009. Balisage Series on Markup Technologies, vol. 3 (2009). doi:10.4242/BalisageVol3.Kay01.
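
The difference in polarity can be sketched in Python. This is an illustration of the general idea only, not code from the paper: `pairs_pull` and `PairsPush` are hypothetical names, and the inversion here is done by hand rather than by a compiler. In the pull version the stage requests input when it wants it, so its state lives in the control flow; in the inverted push version the same logic is driven by its supplier, so that state has to be held in explicit fields.

```python
def pairs_pull(source):
    """Pull style: the stage asks its input for the next item on demand."""
    it = iter(source)
    for first in it:
        second = next(it, None)  # None marks a missing partner at the end
        yield (first, second)


class PairsPush:
    """Push style: the same stage inverted, driven by calls to push()."""

    def __init__(self, sink):
        self.sink = sink          # callable that receives each output pair
        self.pending = None       # first half of a pair not yet completed
        self.has_pending = False

    def push(self, item):
        if self.has_pending:
            self.sink((self.pending, item))
            self.has_pending = False
        else:
            self.pending, self.has_pending = item, True

    def close(self):
        # Flush an unpaired trailing item, mirroring next(it, None) above.
        if self.has_pending:
            self.sink((self.pending, None))
            self.has_pending = False


pulled = list(pairs_pull([1, 2, 3]))

pushed = []
stage = PairsPush(pushed.append)
for item in [1, 2, 3]:
    stage.push(item)
stage.close()

print(pulled, pushed)
```

The pull form is usually the more convenient one to write; the push form is what a producer such as a parser can call directly without buffering or a coordinating thread.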

Ten Reasons Why Saxon XQuery is Fast

A paper written for the IEEE Data Engineering Bulletin, included in a special issue published in December 2008 and devoted to papers on the state-of-the-art in XQuery implementation. Most of what the paper says is of course equally applicable to XSLT.

Writing an XSLT Optimizer in XSLT

This paper (given at Extreme Markup 2007) explores the possibility that since query optimization is an exercise in transforming expression trees, and XSLT is a language for transforming trees, it ought to be possible to write an optimizer in XSLT. (The rendition of the paper is poor because it has been only partially recovered after IDEAlliance, the conference organizers, withdrew their public archive of the conference proceedings.)

C24 White Paper: Using XQuery with Financial Messages

Back in 2006-7, Saxonica collaborated with C24 to enable Saxon to be used as the query engine within the C24 Integration Objects product. (The company was subsequently acquired by Iona, which in turn was acquired by Progress, but it is now independent again and trading under its old name. In 2013 we resumed the collaboration and hope to move the technology forward to take advantage of all the things that have happened in Saxon in the meantime.) This May 2007 paper describes how such an integration enables XQuery to be used to access non-XML data such as SWIFT financial messages, and to convert data between different formats.

Positional Grouping in XQuery

Published at the XIME-P 2006 XQuery workshop at the SIGMOD Conference in Chicago, this paper proposes an extension to XQuery to handle positional grouping problems, derived from experience with the xsl:for-each-group construct in XSLT 2.0.
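
As a rough illustration of what a positional grouping problem looks like, the following Python sketch mimics the behaviour of xsl:for-each-group with group-starting-with: a new group begins at each item that satisfies a condition, and the first item always begins a group. The function name and sample data are hypothetical, and this is not the XQuery syntax proposed in the paper.

```python
def group_starting_with(items, starts_group):
    """Split items into runs, opening a new run at each matching item."""
    groups = []
    for item in items:
        if not groups or starts_group(item):
            groups.append([item])      # open a new group
        else:
            groups[-1].append(item)    # extend the current group
    return groups


flat = ["h1", "p", "p", "h1", "p"]
sections = group_starting_with(flat, lambda name: name == "h1")
print(sections)
```

Grouping of this kind depends on the position of items in the sequence rather than on a shared key value, which is why it is awkward to express with value-based grouping alone.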

Using XSLT and XQuery for Life-Size Applications

This paper discusses the role of the XSLT 2.0 and XQuery 1.0 languages when it comes to writing real-life, sizeable applications for performing data transformations: especially factors such as error handling, debugging, performance, reuse and customization of code, relationships with XML Schema and other technologies such as XForms, and the use of pipeline-based application architectures.

Comparing XSLT and XQuery

This paper by Michael Kay was presented at XTech 2005 in Amsterdam. It compares XSLT and XQuery not just using a blow-by-blow feature comparison, but an assessment of the suitability of the languages for different tasks, and the kinds of users the two languages are aimed at.

Up-Conversion using XSLT 2.0

This paper by Michael Kay was presented at XML 2004 in Washington DC. By means of a case study, it shows how some of the new features in XSLT 2.0 (notably the grouping instructions and the facilities for handling regular expressions) make XSLT 2.0 suitable for applications such as up-conversion (creating structured XML from unstructured input) that were quite infeasible in XSLT 1.0.

XSLT and XPath Optimization

This paper by Michael Kay, presented at XML Europe 2004 in Amsterdam, looked at the techniques used inside an XSLT processor (Saxon, of course!) to optimize performance. It described some of the techniques actually used in the Saxon processor, and surveyed other ideas coming from academia.

XML Five Years On: a review of the achievements so far and the challenges ahead

Keynote address given by Michael Kay at the Document Engineering 2003 Conference in Grenoble, France.

XML & Co. - was bringt die Zukunft?

Article in ComputerWoche (in German): XML began as "SGML light" and was meant above all to be distinguished by its simplicity. In the meantime, however, a series of additional standards has considerably increased the complexity. While the core standard remains largely stable, major changes lie ahead in other areas.

Saxon: Anatomy of an XSLT Processor

This paper by Michael Kay, although published as long ago as 2001, remains a frequently cited description of how XSLT processing in a product like Saxon actually works.

What kind of a language is XSLT?

This paper by Michael Kay, published at the same time as the one above, gives an overview of the capabilities of the XSLT language.

Articles written for Stylus Studio

Saxonica has a close working relationship with the Stylus Studio team: Stylus Studio was the first XML development environment to offer Saxon-SA as a standard feature. As part of this collaboration, we wrote a regular column for their web site. The following articles have been published:

Demonstrations

In some of my tutorials and seminars I use a genealogy application to illustrate the features of XSLT 2.0. The files for this demonstration are available for download.