Third Party Source Components

These tables list components in Category B as described above: open source code that has been integrated at source level, without the involvement of the original author.

Unlike contributed code, this code was not written specifically for inclusion in Saxon, but was originally published under some other license.

B1 Generic Sorter

Origin

CERN (author Wolfgang Hoschek)

Description

Generic sort routines based on published algorithms

Approximate LOC

500

Saxon packages / modules

net.sf.saxon.sort.GenericSorter

Modifications

Minimal modifications needed to integrate the code

Availability of source

Currently available as part of the Colt project, http://dsd.lbl.gov/~hoschek/colt/, module cern.colt.GenericSorting

Source version used

Unknown. Snapshot probably taken in 2004.

License

CERN License: see below

B2 Unicode Normalization

Origin

Unicode Consortium (author Mark Davis)

Description

Routines for Unicode character normalization

Approximate LOC

3500 (including data sets)

Saxon packages / modules

net.sf.saxon.sort.codenorm.*

Modifications

Core functionality unchanged; rewrote the module that loads the data tables from the Unicode character database; removed dependencies on ICU; fixed a few bugs

Availability of source

The algorithm is specified at http://unicode.org/reports/tr15/. The code was originally published at http://www.unicode.org/reports/tr15/Normalizer.html and was withdrawn in January 2012.

Source version used

No version number. Snapshot taken in June 2005

License

Unicode license: see below
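As a rough illustration of what character normalization does, the sketch below uses the JDK's own java.text.Normalizer rather than the net.sf.saxon.sort.codenorm classes listed above. The class name NormalizationSketch and its helper methods are hypothetical and exist only for this example.

```java
import java.text.Normalizer;

// Illustrative only: uses the JDK normalizer, not Saxon's codenorm package.
public final class NormalizationSketch {
    private NormalizationSketch() {}

    /** Composes characters where possible (Normalization Form C). */
    public static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    /** Decomposes characters into base character plus combining marks (Form D). */
    public static String toNfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // "e" followed by U+0301 COMBINING ACUTE ACCENT: two code units.
        String decomposed = "e\u0301";
        // NFC composes the pair into the single character U+00E9.
        String composed = toNfc(decomposed);
        System.out.println(composed.length());            // prints 1
        System.out.println(toNfd(composed).length());     // prints 2
    }
}
```

The two strings render identically but compare unequal until both are brought to the same normalization form, which is why a processor needs such a routine.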

B3 XPath Parser

Origin

James Clark (www.jclark.com)

Description

Top-down parser and lexical tokenizer for XPath

Approximate LOC

1000 (including data sets)

Saxon packages / modules

net.sf.saxon.expr.*, modules XPathParser, Tokenizer, Token

Modifications

Almost entirely rewritten with enhancements to handle XPath 2.0/3.0 and XQuery 1.0/3.0 syntax, improved error reporting, etc.

Availability of source

Derived from James Clark's xt product, which in its original form is at http://www.jclark.com/xml/xt-old.html (package com.jclark.xsl.expr, modules ExprParser and ExprTokenizer).

Source version used

Unknown. Snapshot taken in 1999.

License

James Clark license: see below. The copyright has apparently since been transferred to the Thai Open Source Center Ltd.

B4 Apache Jakarta Regexp Engine

Saxon includes a regular expression engine derived from the Apache Jakarta Regexp project, which was originally developed by Jonathan Locke. It has been extensively modified to make the syntax and semantics conform to the W3C specifications, to fully support Unicode, to improve performance, and to integrate with Saxon.

Origin

Apache (author Jonathan Locke)

Description

Regular expression engine

Approximate LOC

2600

Saxon packages / modules

Package net.sf.saxon.regex, classes RECompiler, REProgram, REMatcher

Modifications

Substantial modifications to implement the XSD and XPath regular expression syntax and semantics, to add Unicode support, to improve performance, and to integrate with the rest of the Saxon code.

Availability of source

http://jakarta.apache.org/regexp/

Source version used

Version 1.5

License

Apache License, version 2.0

B5 Immutable Hash Trie

Saxon implements XPath maps using an immutable hash trie map. The implementation is derived from code published on GitHub by Michael Froh. The published GitHub code carries no licensing conditions, but the author has given permission for Saxonica to release the code under the Mozilla Public License 2.0.

Origin

GitHub Gist (author Michael Froh, msfroh)

Description

Implementation of immutable hash trie maps

Approximate LOC

500

Saxon packages / modules

Package com.saxonica.functions.trie, classes ImmutableMap and ImmutableHashTrieMap.

Modifications

Minor modifications to integrate with Saxon code, especially by removing dependencies on some utility classes also published by the author.

Availability of source

https://gist.github.com/msfroh

Source version used

Version dated 2012-06-02

License

Mozilla Public License, version 2.0
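For readers unfamiliar with the data structure named in B5, the following is a minimal sketch of an immutable hash trie map. It is neither Michael Froh's code nor Saxon's ImmutableHashTrieMap. The class name HashTrieSketch, the 32-way branching and the omission of removal and of bitmap compression are all assumptions made to keep the example short.

```java
// Illustrative only: a simplified persistent hash trie, not the Saxon classes.
public final class HashTrieSketch<K, V> {
    private static final int BITS = 5;            // hash bits consumed per level
    private static final int WIDTH = 1 << BITS;   // 32 slots per branch node
    private static final int MASK = WIDTH - 1;

    /** Leaf entry; 'next' chains keys that share the same full hash code. */
    private static final class Entry<K, V> {
        final int hash; final K key; final V value; final Entry<K, V> next;
        Entry(int hash, K key, V value, Entry<K, V> next) {
            this.hash = hash; this.key = key; this.value = value; this.next = next;
        }
    }

    // Each slot holds null, an Entry chain (leaf) or an Object[] (sub-branch).
    private final Object[] root;

    public HashTrieSketch() { this(new Object[WIDTH]); }
    private HashTrieSketch(Object[] root) { this.root = root; }

    /** Returns a new map; this map is left unchanged. Keys must be non-null. */
    public HashTrieSketch<K, V> put(K key, V value) {
        return new HashTrieSketch<>(put(root, 0, key.hashCode(), key, value));
    }

    @SuppressWarnings("unchecked")
    public V get(K key) {
        int hash = key.hashCode();
        Object node = root;
        for (int shift = 0; node instanceof Object[]; shift += BITS) {
            node = ((Object[]) node)[(hash >>> shift) & MASK];
        }
        for (Entry<K, V> e = (Entry<K, V>) node; e != null; e = e.next) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    @SuppressWarnings("unchecked")
    private Object[] put(Object[] node, int shift, int hash, K key, V value) {
        int idx = (hash >>> shift) & MASK;
        Object[] copy = node.clone();             // path copying: only this node is duplicated
        Object child = node[idx];
        if (child == null) {
            copy[idx] = new Entry<>(hash, key, value, null);
        } else if (child instanceof Object[]) {
            copy[idx] = put((Object[]) child, shift + BITS, hash, key, value);
        } else {
            Entry<K, V> e = (Entry<K, V>) child;
            if (e.hash == hash) {
                copy[idx] = with(e, hash, key, value);
            } else {
                // Different hashes share this slot: push the old leaf one level down.
                Object[] sub = new Object[WIDTH];
                sub[(e.hash >>> (shift + BITS)) & MASK] = e;
                copy[idx] = put(sub, shift + BITS, hash, key, value);
            }
        }
        return copy;
    }

    /** Rebuilds a same-hash chain with the key replaced or appended. */
    private Entry<K, V> with(Entry<K, V> chain, int hash, K key, V value) {
        if (chain == null) return new Entry<>(hash, key, value, null);
        if (chain.key.equals(key)) return new Entry<>(hash, key, value, chain.next);
        return new Entry<>(chain.hash, chain.key, chain.value, with(chain.next, hash, key, value));
    }
}
```

Because put copies only the nodes on the path from the root to the affected slot, earlier versions of the map stay valid and share the rest of their structure with the new one. That property is the usual reason for choosing this structure for maps that are treated as immutable values.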