Saxonica: Client testimonials

Client testimonials

Saxon has been an open source product since 1999. There have been over 200,000 downloads. The product is generally in the top 1% of SourceForge products with 250-300 downloads on a typical day. It's quite impossible, therefore, to give any kind of comprehensive description of what Saxon users are actually doing with the product. But here are a few summaries provided by satisfied users.

If you would like your own application to be featured, we'll be delighted to hear from you.


University of Virginia Press for the US National Archives

The US National Archives has launched a new website containing online versions of about 120,000 documents from editions of the American "founding fathers".

David Sewell, of the University of Virginia Press, one of the two lead developers of the site, commented that:

On the back end, Saxon-PE is a crucial part of the workflow. The canonical version of the data that drives the site consists of about 200 large single XML files, one per original published volume, in TEI encoding (www.tei-c.org). Those files need to be atomized into single documents and heavily transformed for loading into the Founders Online platform, and it is all done via XSLT 2 using Saxon-PE on my desktop. (The back-end engine of the website itself is MarkLogic, so the data remains as native XML and is searched and rendered via more XSLT and XQuery.)

So thanks to Michael Kay and the Saxon team for producing a tool we couldn't do without!

Saxon and XBRL

XBRL is an XML-based set of standards for financial business reporting. In a recent article in the XML Journal, Computational XSLT for Financial Statements, Edmund Gimzewski explains how Saxon's XSLT 2.0 processor can be used to implement the intensive computational aspects of XBRL processing.

A batch of 8,000 XML instance documents (one per company), each with data for 170 input items at 15 periods, was transformed to 8,000 files with 336 calculated items for the same periods. Some formulas referred to calculated items whose formulas referred to calculated items, up to seven levels above the input data. The null evaluation and period adjustment rules were more complex than those discussed, and annualization and formatting were applied. The 45 million calculations were completed in seven hours, which easily meets our requirements and compares well with other processing routes. The conditions were as follows. Software: Saxonica 8.1 JAR files were incorporated into a lightweight Java 1.4 application; the JVM ran on Windows 2000. Hardware: a PC with a 3GHz processor and 3GB of RAM.

California Digital Library

The California Digital Library (CDL), part of the University of California system, uses Saxon as an integral component of its dynaXML online publishing engine. CDL hosts thousands of full-length books in XML format, and uses a Java servlet in combination with XSLT to present them in an easy-to-use HTML interface. Saxon provides the "glue" between system components, and stylesheets allow simple but extremely flexible configuration, without modifying any Java code. The dynaXML system minimizes staff and hardware cost, while still affording a rich, responsive experience to a large number of concurrent users. A number of CDL projects use dynaXML.

For further technical information see the eXtensible Text Framework website.

American College of Physicians

Thomas Kuhn has this to say about Saxon:

"The American College of Physicians is a large medical society with a sizeable publishing operation. We publish several journals, including the Annals of Internal Medicine, as well as other publications including books. We operate several websites and supply content to other websites and licensees. Increasingly, we are developing content directly for web and PDA.

About five years ago we began an organization-wide project to convert all of our publishing production to an XML-based workflow. At this point, four of our publications are edited in XML and these XML sources are used to drive the generation of all output formats, including print, web, PDF, PDA, and custom distributions to licensees. All of this is done using XSLTs and the Saxon processor.

The PIER clinical decision support website is re-generated from scratch each night using this method. Unfortunately, all of the content that we produce using Saxon is locked behind password access. Sorry.

In addition to generating finished output, we use this process to generate proof sites for online review and revision of content, to clean the data and make the structure consistent, and to capture subsets of the content. Also, the in-house editors run Saxon constantly to interactively preview the results of their edits and to create all sorts of reports and lists.

The excellent error messages and the ease of use of Saxon have contributed greatly to making development fast and straightforward.

Please don't stop."

NetYourWork, Inc: Web-Based Business Systems

NetYourWork delivers web-based business systems that give mid-sized enterprises enhanced visibility for improved real-time decision-making. "Our systems automate financial processes through workflow management and reporting, resulting in enhanced compliance for critical business procedures and improved company-wide control."

Jim Hopp, Chief Technology Officer, writes:

"I wanted to let you know about our use of Saxon in a production environment. We have built a suite of business applications for small- and medium-sized U.S. businesses. The applications run on our servers, and customers access them via a web-browser.

The server that implements the business logic for the applications speaks only XML. We use Saxon to transform the XML responses from the server into HTML, XML, and JavaScript for the browser. Saxon also turns the XML into XSL-FO, which we run thru FOP to generate PDF versions of reports. We invoke Saxon thru its Java interface.

We use the evaluate() and node-set() extensions. We have about 19,000 lines of XSLT, and all of it goes thru Saxon. We have one customer in production right now, and that results in about 4,000 transformations/day thru Saxon.

I started using Saxon in 2000, I think. I had started with Xalan, but found Saxon at the time to be faster and more correct. I've found no reason to evaluate any other processor.

Thanks for Saxon, and keep up the good work."
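
The kind of Java-interface invocation Jim Hopp describes can be sketched with the standard JAXP API (javax.xml.transform). This is a minimal illustration only, not NetYourWork's code: the class name ResponseRenderer, the stylesheet, and the element names report and line are all invented for the example. The stylesheet is deliberately XSLT 1.0 so the sketch runs on whichever JAXP processor is configured; which processor TransformerFactory.newInstance() returns depends on the JAXP configuration of the runtime.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: renders an XML server response as HTML using XSLT. */
final class ResponseRenderer {

    // Illustrative stylesheet; the element names (report, line) are assumptions.
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='html' indent='no'/>"
      + "<xsl:template match='/report'>"
      + "<ul><xsl:for-each select='line'><li><xsl:value-of select='.'/></li></xsl:for-each></ul>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    /** Transforms the given XML text with STYLESHEET and returns the serialized HTML. */
    static String render(String xml) throws TransformerException {
        // Obtain the configured JAXP processor and compile the stylesheet.
        TransformerFactory factory = TransformerFactory.newInstance();
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(STYLESHEET)));

        // Run the transformation, collecting the result in memory.
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String response = "<report><line>Invoices: 12</line><line>Payments: 9</line></report>";
        System.out.println(render(response));
    }
}
```

A production setup of the kind described would normally compile each stylesheet once and reuse it across requests rather than recompiling per call as this sketch does.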

Royal National Institute for the Blind

Dave Pawson wrote to explain why this large UK-based charity uses Saxon. He cites the major reasons as:

  • Compliance with W3C specifications
  • Quick turnaround in responding to problems and fixing them
  • The existence of a wide user-base to ensure that the product is well-tested
  • By far the best error reporting.

The RNIB uses Saxon to generate documents in a wide variety of output formats, including:

  • Braille
  • Large Print
  • Plain Text
  • SSML using DAISY format (www.daisy.org)

Saxon is also used for validation of DAISY talking book files.

Dave Pawson is also the maintainer of the unofficial but highly-respected XSLT FAQ site at http://www.dpawson.co.uk. He has used Saxon in the maintenance of this ever-growing resource for over four years.

Miami Systems

Miami Systems provides print, promotional and electronic document solutions utilizing the latest digital and on-demand technologies.

Dale Urig describes how they use Saxon:

"I currently use Saxon as a front end for variable print data. My specific area is with telephone booklet jobs that describe various services that a customer has ordered on his/her telephone.

The services ordered for the telephone company's customers arrive daily in a "loosely" structured XML document from their database. The "looseness" of the document is dependent on what services are being added, what promotions are running, or what new areas the phone company may be expanding into.

At the beginning of the process I write a series of style sheets to group/separate/pair-off data depending on the "look" the document will take on for the day. For example, if promos are running I'll pull off the "add" inserts. If a new service is added to a section of the country, I'll pull in text pages describing the new service. Secondly, I group all "like" services to form the core of the booklet. That is, I pull the customer's plans together, listing costs and discounts for each phone number a customer is carrying (home / business / fax / cell lines).

The texts included for each service are next. The texts are static and can be pulled in from file.

Then, the address information is sent through a "hygiene" program to make a best guess at the recipient's address. Postal codes are attached, along with the information necessary for mass mailings.

Finally, the booklet is electronically printed and mailed."

Roger Kovack

Roger Kovack is a well-known XML expert who has used Saxon on a wide variety of projects. Here are a few of them:

Wrox Case Study
I wrote a case study for Wrox, Beginning XML, 2nd Edition, where I described a servlet and stylesheet that provided an on-line phone directory. I originally wrote this application with Xalan but found Saxon to include the most recent XSLT and XPath functionality while Xalan was always behind the curve.

HomeBuildersSystems Requirements and Architecture
Since then I embarked on a private project that is a vertical niche business application to serve new home builders, an explosive industry here in the US. This is a web application entirely based on Saxon, Apache Tomcat and MySQL.

This requires a full transactional process that includes data updating, storage, indexing and querying in a single user request. The full transaction requirement is addressed with a Model-View-Controller architecture where each component is implemented in a separate execution process. The solution is implemented by transform chaining where each transform is responsible for the logic to determine the stylesheet to execute the next transform.

The Model is accessed from the stylesheet using a custom URIResolver and OutputURIResolver that recognize parameters such as DB: to read a document from the database or DBINDEX: to read a named index of a document collection, just to mention a few. This allows Saxon to be customized without hacking its source code.
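
As a rough illustration of the input side of this technique, the sketch below implements the standard JAXP URIResolver interface and intercepts hrefs that start with DB:. It is not the application's actual code: the class name DbUriResolver is invented, and an in-memory map stands in for the MySQL-backed store. Returning null for any other href tells the processor to fall back to its own resolution.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical resolver: serves hrefs of the form "DB:name" from a document store. */
final class DbUriResolver implements URIResolver {

    private static final String PREFIX = "DB:";

    // Assumption: a simple name-to-XML map stands in for the real database.
    private final Map<String, String> documents;

    DbUriResolver(Map<String, String> documents) {
        this.documents = documents;
    }

    @Override
    public Source resolve(String href, String base) throws TransformerException {
        if (href == null || !href.startsWith(PREFIX)) {
            return null; // not a DB: reference, so let the processor resolve it itself
        }
        String name = href.substring(PREFIX.length());
        String xml = documents.get(name);
        if (xml == null) {
            throw new TransformerException("No stored document named '" + name + "'");
        }
        StreamSource source = new StreamSource(new StringReader(xml));
        source.setSystemId(href); // keep the original href as the document's identifier
        return source;
    }

    public static void main(String[] args) throws TransformerException {
        URIResolver resolver =
            new DbUriResolver(Map.of("orders", "<orders><order id='1'/></orders>"));
        Source source = resolver.resolve("DB:orders", null);
        System.out.println("Resolved: " + source.getSystemId());
    }
}
```

Such a resolver would typically be registered with Transformer.setURIResolver so that document() calls in the stylesheet are routed through it. The output side (OutputURIResolver) is Saxon-specific and is not shown here.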

This application is unusual in that the schema of the source document isn't fixed in advance. This allows me to use XML as an accurate data structural representation of nested assemblies which is the hallmark of manufacturing and construction business processes.

This is a point where saxon:evaluate() provides a great convenience in stylesheet development as a couple of very compact variables can be used to always 'point' at the user's position while browsing the tree. There is no nesting or recursion or hair-raising SQL needed to navigate and manipulate the data at every actual and future location in the nested assemblies of the construction and manufacturing business.

Application Server Requirements: The comprehensive scope of this business application makes it impossible to code without some sort of application development language, not a technical language such as XSLT, SQL or Java. My response is to develop an application server on top of the servlet, XSLT transform engine and database servers. The application server itself is largely coded in Saxon although there are a number of underlying Java services that connect the transform engine to the other resources.

The resulting application server includes a comprehensive web interface, an XML database including complex indexing and query services, and document update services.

XML Database: The XML database is a hybrid of relational and document storage attempting to achieve reasonable performance, very high capacity and the robust reliability of MySQL. The SQL is entirely wrapped in an XML API that focuses the application developer on XML and XPath - never the workings of the database itself.

Document Updating: The document update services use a home brew XML based update language which does not comply with any standards as they didn't seem to exist widely at the time I was developing this component. The actual updater code uses JDOM. I have some distant fantasy that I can replace it with XQuery. The document updater, database and web services present a relatively integrated set of application APIs to the business application developer, all sourced in XML and XSLT. A good deal of the code that generates these APIs is written in Saxon.

On-line XML Editor: Using Saxon to power a high level application server language: The web interface services include an on-line XML editor that is completely written in Saxon. The result is an application driven by property sheets sourced in XML, not in XSLT, although the application developer can embed XPath in the property sheets. This is another critical feature implemented by saxon:evaluate() and saxon:call-template.

Performance: Since all this functionality could be time consuming, the application engine boosts performance by caching Saxon's compiled Templates in a global List but checks the timestamp of the stylesheet stored in the file system. If the timestamps don't match, the application engine will recompile the stylesheets. Otherwise, Saxon's compiled Templates are fetched from the hash for immediate transforms. Also, the live transactional data documents may be stored as JDOM instances in memory to avoid reparsing them for each use.
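
The caching scheme described here can be sketched with the standard JAXP Templates interface. This is an illustration under stated assumptions, not the application's code: the class name TemplatesCache is invented, and the cache is assumed to be a map keyed by the stylesheet's absolute path. A stylesheet is recompiled only when the file's last-modified timestamp differs from the one recorded at compile time.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical cache of compiled stylesheets, invalidated by file timestamp. */
final class TemplatesCache {

    /** A compiled stylesheet together with the file timestamp it was compiled from. */
    private static final class Entry {
        final Templates templates;
        final long lastModified;

        Entry(Templates templates, long lastModified) {
            this.templates = templates;
            this.lastModified = lastModified;
        }
    }

    private final TransformerFactory factory = TransformerFactory.newInstance();

    // Assumption: entries are keyed by the stylesheet's absolute path.
    private final Map<String, Entry> cache = new HashMap<>();

    /** Returns compiled Templates for the file, recompiling only if the file has changed. */
    synchronized Templates get(File stylesheet) throws TransformerConfigurationException {
        String key = stylesheet.getAbsolutePath();
        long stamp = stylesheet.lastModified();
        Entry entry = cache.get(key);
        if (entry == null || entry.lastModified != stamp) {
            // Cache miss or stale entry: compile the stylesheet and remember its timestamp.
            Templates compiled = factory.newTemplates(new StreamSource(stylesheet));
            entry = new Entry(compiled, stamp);
            cache.put(key, entry);
        }
        return entry.templates;
    }
}
```

A caller would obtain a fresh Transformer for each request with cache.get(file).newTransformer(). Templates objects are designed to be shared across threads, whereas Transformer instances are not; the method is synchronized because the factory itself is not guaranteed to be thread-safe.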

This entire suite of application services typically delivers 7 to 50 web pages per second running on an AMD Athlon 1.67 GHz Windows 2000 machine. A modest server cluster may be able to produce 1 million pages per day using Saxon.

Saxon is certainly useful to me and I could go through an embarrassingly glowing list of detailed reasons why I believe Saxon is an excellent software product.

Saxon as a Component in other XML Software

Saxon has been used as a component by many individuals and companies producing other XML software products, both open source and commercial. Here's one example, from Ari Krupnikov of the University of Edinburgh:

I'm using Saxon as the XPath core of STnG, my streaming transformation tool for XML[1][2]. Saxon proved amenable to uses it does not officially support, in my case, XPath resolution on partially-built documents. I used Xalan in my first prototype, but Saxon's much smaller source code [3] and simpler (in my mind) structure, with clearly visible entry points and internal structure, made it easier to adapt it to my use. Also, your prompt and to-the-point replies to questions on this list, even when they concerned explicitly unsupported features, made it easier for me to use Saxon for my purposes.

[1] http://sourceforge.net/projects/stng
[2] http://www.idealliance.org/papers/extreme03/html/2003/Krupnikov0/
[3] http://lists.xml.org/archives/xml-dev/200211/msg00782.html
