Saxonica.com
Why XSLT and XQuery?

Saxonica specializes in XSLT and XQuery technology. If you're new to these languages, this page briefly explains what they are and why you would want to use them. It also gives you some reasons for choosing the Saxon product in preference to other implementations.

XML is for publishing, XML is for exchanging data

XML is now widely used both as a format for maintaining published documents, and as a protocol for exchanging data between applications, often between different organizations. It became successful because it can handle information of arbitrary complexity, yet at the same time it is essentially a very simple (and therefore inexpensive) technology. The rules for particular kinds of XML message can be written and agreed by means of an XML Schema, allowing errors to be detected automatically, and because XML messages are human-readable, it is very easy to find and resolve problems when they occur.

XML is also easy to integrate into business applications, thanks to the wide availability of XML parsers on popular platforms such as Java and .NET, and the increasing number of software packages (from relational databases to office desktop applications) that are now XML-enabled "out of the box".

XSLT: Styling and Transformation

XSLT stands for Extensible Stylesheet Language Transformations. The language was designed primarily for writing stylesheets that enable XML documents to be displayed to their readers. By writing more than one stylesheet, you can display the same information in different ways to different audiences, and tailor the presentation to the capabilities of different display devices, including web browsers, conventional print media, handheld devices, and digital TV. By authoring your information in XML, and rendering it using XSLT, you separate the tasks of the content author and the graphic designer, and you ensure that a uniform design is achieved across all your content. This approach also allows you to change the visual design without rewriting the content.
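As a minimal sketch of the "more than one stylesheet" idea, the Java program below renders one small XML document twice: once as HTML and once as plain text. The document, the two stylesheets and the class name TwoStylesheetsDemo are invented for illustration. It uses the XSLT 1.0 processor bundled with the JDK through the standard JAXP interface, not Saxon-specific code.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class TwoStylesheetsDemo {

    // Hypothetical content, authored once with no presentation markup.
    static final String CONTENT =
        "<article><title>Release notes</title><para>Faster grouping.</para></article>";

    static final String XSL_OPEN =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>";

    // Stylesheet 1 (illustrative): render the article as an HTML page.
    static final String HTML_STYLE = XSL_OPEN
        + "<xsl:output method='html' indent='no'/>"
        + "<xsl:template match='/article'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1>"
        + "<p><xsl:value-of select='para'/></p></body></html>"
        + "</xsl:template></xsl:stylesheet>";

    // Stylesheet 2 (illustrative): render the same article as one line of text.
    static final String TEXT_STYLE = XSL_OPEN
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/article'>"
        + "<xsl:value-of select='title'/><xsl:text>: </xsl:text>"
        + "<xsl:value-of select='para'/>"
        + "</xsl:template></xsl:stylesheet>";

    /** Applies the given stylesheet to the given XML and returns the serialized result. */
    static String render(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render(CONTENT, HTML_STYLE));
        System.out.println(render(CONTENT, TEXT_STYLE));
    }
}
```

The content never changes; only the stylesheet passed to render() decides how it is presented.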

To see an example of this in action, compare the Saxon documentation on this site with the earlier version of the same material: the only difference is a 400-line stylesheet written in XSLT.

XSLT is also used increasingly for transforming data. For example, if you need to send electronic orders to your suppliers, then the XML message used for this purpose will have to conform to a schema agreed with the suppliers. If the data starts life in an Excel spreadsheet, you can export it in XML using the built-in capabilities of Excel, and then transform it to the format required by the agreed schema, by means of an XSLT transformation.
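The following sketch shows what such a data transformation might look like in practice. Both formats are assumptions made up for this example: the row/sku/qty input stands in for a spreadsheet export, and the order/item output stands in for a schema agreed with a supplier. Neither is Excel's real export format. The code uses the standard JAXP interface with the XSLT 1.0 processor bundled in the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class OrderTransformDemo {

    // Hypothetical export format: one <row> per spreadsheet line.
    static final String SOURCE =
        "<rows>"
        + "<row><sku>A1</sku><qty>3</qty></row>"
        + "<row><sku>B7</sku><qty>12</qty></row>"
        + "</rows>";

    // Maps each <row> to an <item> in the (equally hypothetical) agreed order format.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/rows'>"
        + "<order><xsl:apply-templates select='row'/></order>"
        + "</xsl:template>"
        + "<xsl:template match='row'>"
        + "<item code='{sku}' quantity='{qty}'/>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet over the source XML and returns the result as a string. */
    static String transform(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform(SOURCE, STYLESHEET));
    }
}
```

Each template rule handles one kind of element in the input, so supporting a new field in the agreed format usually means adding or adjusting a single rule.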

XSLT 1.0 was defined by the World Wide Web Consortium (W3C) in 1999. Saxon was one of the first implementations. Many others followed, including products from major vendors such as Microsoft, Oracle, and IBM. Despite this competition, the Saxon XSLT 1.0 processor has been downloaded nearly 200,000 times, as well as being bundled with many other software products. It features regularly in the list of the top 100 open-source software products.

XSLT 2.0 and Schema-Aware Transformation

The specifications for XSLT 2.0 are now in the final stages of the W3C development process. A Candidate Recommendation was issued on 3 November 2005. This is essentially the last stage of quality assurance, allowing any remaining errors to be corrected as a result of implementation experience before the final Recommendation is published. Saxon is the first and so far the only product to implement fully either the XSLT 2.0 or the XQuery 1.0 Candidate Recommendation, let alone to implement both in a single integrated package.

XSLT 2.0 adds many new features desperately needed by existing users. These include more powerful facilities for handling text (regular expressions) and structured data (grouping), which greatly increase the range of tasks to which XSLT can be applied, especially when converting legacy data and document formats to XML.

One of the most significant additions is that XSLT is now schema-aware. This means that the XML Schema used to define the source and result documents of a transformation can now be used to guide the compilation and execution of the stylesheet. This makes stylesheet code more robust, it speeds the debugging cycle, and it creates the potential for significant performance improvements.

Saxon was the first processor to implement the XSLT 2.0 specification and remains the only implementation that is anywhere near complete. Saxon is now available in two versions: an open source product, Saxon-B, which implements the conformance requirements for a Basic XSLT Processor, and a commercial product, Saxon-SA, which adds the extra features of a Schema-Aware XSLT Processor.

XQuery: The Query Language for XML

XQuery is another language specification from W3C, produced in parallel with XSLT 2.0. Whereas XSLT was designed primarily for document rendition, and does data transformation almost as a sideline, XQuery was conceived as a language for querying XML databases, in the same way as SQL is used for querying relational databases. All the major relational database vendors are adding XML support to their existing products, and new Native XML Databases are also appearing. These products all use XQuery to access the data.

Like XSLT, XQuery can also be used to transform XML messages from one format to another. The language is less powerful than XSLT 2.0, but usability studies have shown that it is easier for users to learn, and there are also indications that it is easier for vendors to optimize.

XQuery and XSLT have much in common. Both languages make use of XPath, a syntax for finding your way around the structure of an XML document. Both languages share the same data model and type system, and the same function library, which means that the two languages can work together well in a single application. For example, you can use XQuery to extract data from an XML database, and XSLT to present the results to users on the web.
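To give a feel for XPath, here is a small sketch that evaluates two path expressions from Java. The catalog document and the class name XPathDemo are invented for illustration. The code uses the standard javax.xml.xpath interface with the XPath 1.0 engine bundled in the JDK, so it does not show the XPath 2.0 data model or function library described above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

final class XPathDemo {

    // Hypothetical document used only for this example.
    static final String CATALOG =
        "<catalog>"
        + "<book year='1999'><title>First</title><price>10</price></book>"
        + "<book year='2005'><title>Second</title><price>25</price></book>"
        + "</catalog>";

    /** Evaluates an XPath expression against the XML text and returns the result as a string. */
    static String evaluate(String expression, String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Navigate by structure and filter on an attribute value.
        System.out.println(evaluate("/catalog/book[@year='2005']/title", CATALOG));
        // Call a function from the XPath library over a set of nodes.
        System.out.println(evaluate("sum(/catalog/book/price)", CATALOG));
    }
}
```

The same expressions could appear unchanged inside an XSLT template or an XQuery query, which is what lets the two languages share so much.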

Why Saxon?

The Saxon-SA product includes within a single package:

  • A schema-aware XSLT 2.0 processor
  • A schema-aware XQuery 1.0 processor
  • An XPath processor that can be called from Java applications
  • A free-standing XML Schema validator.

Saxon is written entirely in Java and therefore runs on any platform that has a Java runtime.

In the six years that Saxon has been available, it has established a reputation for fast performance, the highest level of conformance to the W3C specifications, excellent diagnostics, technical innovation, and responsive technical support direct from the original developer.

XSLT 2.0 enables you to tackle more complex transformation problems, and makes many routine tasks vastly simpler. This makes experienced developers more productive, and enables new users to learn the language more quickly. Although Saxon first provided XSLT 2.0 support over two years ago, it remains the only implementation that is anywhere near complete.

Saxon-SA completes the picture by offering the first schema-aware transformation engine, allowing you to use the schemas for your source and target XML documents to guide the transformation process both at compile time and at execution time.

The Saxon XSLT engine supports the standard Java API for XML transformation (JAXP), which means you only need to change a single setting, the javax.xml.transform.TransformerFactory system property, to use Saxon in place of the processor bundled with your Java Development Kit. Many users have reported that this has speeded up their applications tenfold.
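In JAXP, the switch is normally made through the system property javax.xml.transform.TransformerFactory, and the sketch below reports which implementation an application would pick up. The class name FactorySelectionDemo is invented for illustration. The Saxon class name in the comment is the factory class that Saxon is generally documented to provide; treat it as an assumption and check it against the documentation for your Saxon release.

```java
import javax.xml.transform.TransformerFactory;

final class FactorySelectionDemo {

    // Standard JAXP property consulted by TransformerFactory.newInstance().
    static final String FACTORY_PROPERTY = "javax.xml.transform.TransformerFactory";

    /** Returns the class name of the XSLT implementation JAXP selects right now. */
    static String activeFactoryClass() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // With the property unset, JAXP falls back to the processor bundled with the JDK.
        // Assumption: with the Saxon jar on the classpath, launching as
        //   java -Djavax.xml.transform.TransformerFactory=net.sf.saxon.TransformerFactoryImpl ...
        // makes the second line report the Saxon factory class instead.
        System.out.println(FACTORY_PROPERTY + " = " + System.getProperty(FACTORY_PROPERTY, "(not set)"));
        System.out.println("Active implementation: " + activeFactoryClass());
    }
}
```

Because application code only ever calls TransformerFactory.newInstance(), no source changes are needed when the implementation is swapped.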

Since mid-2003 Saxon has also included an XQuery 1.0 processor. This gives you a choice of languages supported by the same underlying engine. Saxon is the only product to offer XSLT and XQuery in an integrated package, allowing you to use each language for the things it does best. The two interfaces are complementary: for example you can write a function library in XQuery, and call the functions from your XSLT stylesheets. Saxon has established a reputation as the XQuery engine of choice, as shown by the number of vendors who are integrating it into their XML development environments.

A word of advice, though: Saxon is not an XML database. Users have successfully used Saxon to process XML datasets up to 200 MB in size, but the product does not attempt to offer the kind of facilities associated with a traditional database, such as transactions, concurrency, and recovery.

Finally, Saxon-SA offers an alternative choice of XML Schema processor. XML Schema is a notoriously complex specification, and until now there have been few implementations of the standard to choose from. Because the standard is so complex, there are often discrepancies between the results that these implementations produce; they are also notorious for error messages that are hard to understand when a document is found to be invalid. If you are dissatisfied with your existing schema processor, Saxon provides an alternative. Use it alongside your existing processor to get a second opinion on the validity of your documents, perhaps expressed in language that is easier to understand. In time, you will probably find yourself using Saxon as your first choice of schema processor.
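As a sketch of what checking a document against a schema involves, the program below validates two small documents through the standard javax.xml.validation interface and returns the processor's own error message for the invalid one. The schema, the documents and the class name SchemaCheckDemo are invented for illustration. The code runs on the schema processor bundled with the JDK and makes no assumption about how a Saxon validator would be obtained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Optional;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class SchemaCheckDemo {

    // Hypothetical schema: an order holds one quantity, which must be a positive integer.
    static final String SCHEMA =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='order'>"
        + "<xs:complexType><xs:sequence>"
        + "<xs:element name='quantity' type='xs:positiveInteger'/>"
        + "</xs:sequence></xs:complexType>"
        + "</xs:element>"
        + "</xs:schema>";

    /** Returns an empty Optional if the document is valid, otherwise the processor's message. */
    static Optional<String> validationError(String xml) {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler installed, the first validity error is thrown as a SAXException.
            validator.validate(new StreamSource(new StringReader(xml)));
            return Optional.empty();
        } catch (SAXException e) {
            return Optional.ofNullable(e.getMessage());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(validationError("<order><quantity>3</quantity></order>"));
        System.out.println(validationError("<order><quantity>-1</quantity></order>"));
    }
}
```

Running the same documents through two different processors and comparing the messages is one way to get the second opinion described above.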

What is Saxonica?

Saxonica is a company created by the developer of Saxon, Michael Kay, to bring the technology to the commercial market. The existence of the company will ensure continued investment in moving the Saxon technology forwards, and remove the risk associated with using open-source software that has no support infrastructure.

The development of Saxon has pioneered a new way of creating software: solo development. Unlike conventional development techniques used both in commercial and open-source projects, there is no development team, and therefore no time wasted in meetings and coordination. The designer is the implementor. The most important result is that there are no messy compromises in the design. The approach shares many of the characteristics popularized under the name Extreme Programming, but takes it a step further. It's a high-risk approach to software development, but when it works, the result is an unprecedented level of productivity, responsiveness to new requirements, and software quality. The success of Saxon is sufficient proof that the approach can indeed work. In software as in other creative activities, the best results are often produced by individuals working alone.

Michael Kay, the developer of Saxon, has nearly thirty years' experience in the software industry, acting as lead designer on a succession of software projects, most of which involved large development teams. He held one of the most senior technical positions in ICL, a $4bn software and services company, now part of Fujitsu. His knowledge of XSLT and XQuery is second to none: he is the editor of the XSLT 2.0 specification, a joint editor of XPath 2.0, and an invited expert on the XQuery Working Group. He is also the author of the best-selling XSLT Programmer's Reference from Wrox Press, recently published in a new edition (in two volumes) covering XSLT 2.0 and XPath 2.0.

Saxonica will preserve the unique characteristics that have made Saxon so successful, notably the speed of development, the direct relationship between developer and user, and the quality of the product. At the same time it will put this on a commercial footing, enabling users of the technology to have confidence in its stability and its future.