Saxonica.com

Schema Processing

Saxon now recognizes an xs:import of the XML namespace, and no longer requires a schema location to be provided: the relevant schema components are constructed automatically. (However, if a schema location is supplied, then it is used.)

Durations can now be validated against the value range facets (minInclusive, minExclusive, maxInclusive, maxExclusive). In a few edge cases, the algorithm for comparing durations that mix year/month and day/time components (for example, 1 year compared with 365 days) is not precisely the same as the one in the XML Schema specification.
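
For orientation, the following is a minimal Python sketch of the order relation on durations as described in XML Schema Part 2: each duration is added to four reference instants, and two durations are ordered only if every comparison agrees. It is not Saxon's implementation (which, as noted, differs in a few edge cases). The names add_duration and compare_durations, and the (months, seconds) representation, are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Reference instants used by XML Schema Part 2 when ordering durations.
REFERENCE_INSTANTS = [
    datetime(1696, 9, 1),
    datetime(1697, 2, 1),
    datetime(1903, 3, 1),
    datetime(1903, 7, 1),
]


def add_duration(start, months, seconds):
    """Add a duration, given as whole months plus seconds, to a start instant."""
    # Months are added first, then the day/time part.
    # Every reference instant falls on day 1, so no day-of-month clamping is needed.
    total = start.year * 12 + (start.month - 1) + months
    shifted = start.replace(year=total // 12, month=total % 12 + 1)
    return shifted + timedelta(seconds=seconds)


def compare_durations(a, b):
    """Compare two (months, seconds) durations.

    Returns -1, 0 or 1 when all reference instants agree,
    or None when the order is indeterminate.
    """
    results = set()
    for start in REFERENCE_INSTANTS:
        end_a = add_duration(start, *a)
        end_b = add_duration(start, *b)
        results.add((end_a > end_b) - (end_a < end_b))
    return results.pop() if len(results) == 1 else None
```

Under this sketch, comparing 1 year with 365 days gives an indeterminate result (None), because the two are equal from some reference instants and unequal from others, whereas 1 year is less than 367 days and greater than 364 days from all four.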

A pattern facet is now checked against the canonical lexical representation of the value, as defined in XML Schema part 2. Previously it was checked against the result of casting the value to a string according to the XPath rules. This makes a difference for the types xs:decimal, xs:float, and xs:double, where the two specifications differ.
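
As an illustration of why this matters, the Python sketch below contrasts the two string forms for xs:decimal and checks each against a pattern. The helper names canonical_decimal and xpath_decimal_string are hypothetical, and re.fullmatch merely stands in for the implicitly anchored matching of a pattern facet; this is not Saxon code.

```python
import re
from decimal import Decimal


def canonical_decimal(value):
    """Canonical lexical form per XML Schema Part 2: a decimal point is always present."""
    if value == 0:
        return "0.0"
    text = format(value, "f")  # plain notation, no exponent
    sign = "-" if text.startswith("-") else ""
    int_part, _, frac_part = text.lstrip("-").partition(".")
    int_part = int_part.lstrip("0") or "0"    # no superfluous leading zeros
    frac_part = frac_part.rstrip("0") or "0"  # no superfluous trailing zeros
    return f"{sign}{int_part}.{frac_part}"


def xpath_decimal_string(value):
    """String form under the XPath casting rules: whole numbers have no decimal point."""
    if value == value.to_integral_value():
        return str(int(value))
    return canonical_decimal(value)


def satisfies_pattern(pattern, lexical):
    # A pattern facet must match the whole string.
    return re.fullmatch(pattern, lexical) is not None
```

For the xs:decimal value 1, the canonical form is "1.0" while the XPath string form is "1", so a pattern such as \d+ accepts the latter but rejects the former.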

In most cases Saxon now uses the correct schema-defined semantics when comparing atomic values, for example in evaluating facets and in testing identity constraints. Previously Saxon used the XPath semantics. This means, for example, that when handling identity constraints the xs:integer 1 and the xs:double 1 are no longer considered equal.
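
The difference can be pictured with a small Python sketch. It assumes that, under the schema-defined semantics, values belonging to different primitive types are never equal, and it simplifies the XPath semantics to numeric promotion to a common type. The AtomicValue type and both comparison functions are hypothetical and do not reflect Saxon's internal API.

```python
from decimal import Decimal
from typing import NamedTuple, Union


class AtomicValue(NamedTuple):
    primitive_type: str  # "xs:decimal" (the primitive type of xs:integer) or "xs:double"
    value: Union[Decimal, float]


def schema_equal(a, b):
    # Assumed schema semantics: different primitive types are never equal.
    return a.primitive_type == b.primitive_type and a.value == b.value


def xpath_equal(a, b):
    # Simplified XPath semantics: promote both numerics to double, then compare.
    return float(a.value) == float(b.value)


one_as_integer = AtomicValue("xs:decimal", Decimal(1))
one_as_double = AtomicValue("xs:double", 1.0)
```

Here schema_equal(one_as_integer, one_as_double) is False, while xpath_equal on the same pair is True, which mirrors the change in behaviour for identity constraints described above.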

Erratum E2-25 to XML Schema Part 2 has been implemented. This erratum changes the validation rules for the xs:language data type.

The rules for escaping hyphens in regular expressions have changed. The rules in the specification are still unclear, but Saxon was disallowing some cases which clearly should be allowed, such as the subtraction [\c-[X]]. The rules are now that, within square brackets, an unescaped hyphen is taken as representing itself (that is, it matches a hyphen in the input) if it appears as the first character, or is followed by ']', or immediately follows a character range. (Thus [A-Z-0-9] allows A-Z, 0-9, or hyphen.) It is taken as a subtraction operator only if followed by '['. The new rules affect XPath as well as XML Schema.
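
The rules stated above can be sketched in Python as a small classifier that labels each unescaped hyphen in a bracketed character class. This is an illustration only, not Saxon's regular expression parser: the function name classify_hyphens is hypothetical, the treatment of any remaining hyphen as a range operator is an assumption, and scanning simply stops at a subtraction.

```python
def classify_hyphens(char_class):
    """Label each unescaped hyphen in a class such as '[A-Z-0-9]'.

    Returns a list of 'literal', 'range' or 'subtraction', in order of appearance.
    """
    if not (char_class.startswith("[") and char_class.endswith("]")):
        raise ValueError("expected a bracketed character class")
    roles = []
    i = 1
    if char_class[i] == "^":  # skip a leading negation
        i += 1
    first = i
    after_range = False    # the previous item completed a range such as A-Z
    range_pending = False  # a range operator is waiting for its end character
    while i < len(char_class) - 1:
        ch = char_class[i]
        if ch == "-":
            following = char_class[i + 1]
            if following == "[":
                roles.append("subtraction")
                break  # the nested class is not examined in this sketch
            if i == first or following == "]" or after_range:
                roles.append("literal")
                range_pending = False
            else:
                roles.append("range")  # assumption: otherwise a range operator
                range_pending = True
            after_range = False
            i += 1
            continue
        # An escape such as \c or \- occupies two characters and counts as one item.
        i += 2 if ch == "\\" else 1
        after_range = range_pending
        range_pending = False
    return roles
```

For [A-Z-0-9] this yields a range, a literal hyphen, and another range; for [\c-[X]] it yields a single subtraction.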

Saxon now enforces the constraint defined in the XML Schema specification that in a hierarchy of types, an element or attribute cannot be dropped at one level (by restriction) and then re-introduced at a deeper level (by extension) with an incompatible type.

XPath expressions used in identity constraints in a schema are now statically type-checked; this means that most errors in defining the path (for example, incorrectly spelt element names) will now result in a warning message. (This is a warning rather than an error because such path expressions are not disallowed by the XML Schema specification.)

Running the new W3C XML Schema Test Suite (some 40,000 test cases) revealed a number of cases where Saxon was doing insufficient checks of schema or document validity. These cases have been fixed. The main ones are:

Internally, Saxon now maintains two copies of the content model of a complex type: the version that corresponds to the component model as defined in the XML Schema specification, and a simplified version in which group references are expanded and pointless particles are eliminated.
