XML Schema 1.0 implementation
Saxon now implements enumeration facets on union and list types as the authors of the specification intended. Although the spec as written has problems (bug 5328 has been raised), the intent is that the enumeration facet as written should be interpreted as an instance of the type being restricted. Previously, enumeration facets on union and list types were checked by a string comparison on the lexical value.
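The difference between the two interpretations can be illustrated with a small sketch. This is not Saxon code: the class and method names are hypothetical, and it assumes a list type whose items are decimals, so that "1.0 2.0" and "1.00 2" denote the same typed value while differing lexically.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

/** Toy comparison of lexical versus value-space matching for an enumeration on a list of decimals. */
final class EnumerationFacetSketch {

    /** Parses a whitespace-separated list of decimals (assumed item type for this sketch). */
    static List<BigDecimal> parseDecimalList(String lexical) {
        List<BigDecimal> items = new ArrayList<>();
        for (String token : lexical.trim().split("\\s+")) {
            items.add(new BigDecimal(token));
        }
        return items;
    }

    /** Earlier behaviour as described above: compare the lexical strings directly. */
    static boolean matchesLexically(String enumValue, String instanceValue) {
        return enumValue.equals(instanceValue);
    }

    /** Intended behaviour as described above: compare the typed values item by item. */
    static boolean matchesByValue(String enumValue, String instanceValue) {
        List<BigDecimal> expected = parseDecimalList(enumValue);
        List<BigDecimal> actual = parseDecimalList(instanceValue);
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            // compareTo ignores scale, so 1.0 and 1.00 count as the same decimal value
            if (expected.get(i).compareTo(actual.get(i)) != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String enumValue = "1.0 2.0";
        String instance = "1.00 2";
        System.out.println("lexical match: " + matchesLexically(enumValue, instance));
        System.out.println("value match:   " + matchesByValue(enumValue, instance));
    }
}
```

Under these assumptions the lexical comparison rejects the instance value while the value comparison accepts it, which is the kind of case the change affects.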
The reporting of keyRef validation errors has been improved. Multiple errors can now be reported in a single schema validation run, and the line number given with the error message reflects the location of the unresolved keyRef value, rather than the end of the document as before.
A new configuration option is available to control whether the schema processor takes notice of (and attempts to act on) xsi:noNamespaceSchemaLocation attributes encountered in an instance document that is being validated. This is available as a named property on the Configuration classes, via methods on the S9API and .NET SchemaValidator classes and the XQJ class SaxonXQDataSource, and via an option on the command line interfaces.
New methods have been added to class com.saxonica.schema.SchemaCompiler to allow setting of "deferred validation mode". In this mode a sequence of calls on readSchema() can be made, followed by a single closing call. The effect is to defer all generation of the finite state machines used for run-time validation until that closing call is made. This avoids repeated (and wasted) recompilation of complex types every time new elements are added to a substitution group, or every time a new complex type is derived by extension from an existing type. This facility was developed with XBRL as the primary use case, and has the effect of reducing compilation time for this collection of schema documents from 400 seconds to 560 milliseconds.
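The saving from deferring the work can be sketched with a toy model. This is not the SchemaCompiler API: the class, the read() and finish() methods, and the build counter are all hypothetical stand-ins, used only to show that the expensive step runs once per batch instead of once per schema document.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Saxon's SchemaCompiler) of eager versus deferred machine generation. */
final class DeferredCompilationSketch {
    private final boolean deferred;
    private final List<String> schemaDocuments = new ArrayList<>();
    private int machineBuilds = 0; // counts how often the expensive step runs

    DeferredCompilationSketch(boolean deferred) {
        this.deferred = deferred;
    }

    /** Hypothetical stand-in for reading one schema document. */
    void read(String documentName) {
        schemaDocuments.add(documentName);
        if (!deferred) {
            buildMachines(); // eager mode: regenerate after every document
        }
    }

    /** Hypothetical closing call that triggers the deferred work exactly once. */
    void finish() {
        if (deferred) {
            buildMachines();
        }
    }

    private void buildMachines() {
        machineBuilds++; // placeholder for generating the finite state machines
    }

    int machineBuilds() {
        return machineBuilds;
    }

    public static void main(String[] args) {
        DeferredCompilationSketch eager = new DeferredCompilationSketch(false);
        DeferredCompilationSketch lazy = new DeferredCompilationSketch(true);
        for (String doc : new String[] {"a.xsd", "b.xsd", "c.xsd"}) {
            eager.read(doc);
            lazy.read(doc);
        }
        eager.finish();
        lazy.finish();
        System.out.println("eager builds:    " + eager.machineBuilds());
        System.out.println("deferred builds: " + lazy.machineBuilds());
    }
}
```

In the toy model the eager variant does work proportional to the number of documents read, while the deferred variant does it once; the real gain depends on how costly each regeneration is.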
Where numeric minOccurs and numeric maxOccurs constraints (other than 0, 1, or unbounded) appear on an element or wildcard particle, Saxon now implements a finite state machine using simple counters to count the number of occurrences, rather than "unfolding" the FSM as previously. This removes the limits on the values of maxOccurs, as well as the cost in time and memory of handling large finite values of maxOccurs. The unfolding technique is still used when such constraints appear on other kinds of particle, specifically on sequence or choice groups, or when "vulnerable" repeated element and wildcard particles appear within a model group that can itself be repeated (a particle is vulnerable if all the other particles in the model group are optional).
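The counter-based idea can be sketched for a single repeated element particle. This is a simplified illustration and not Saxon's implementation: the class name, the UNBOUNDED constant, and the accept/isComplete methods are hypothetical. The point is that one integer tracks the occurrence count whatever the value of maxOccurs, instead of one state per permitted occurrence.

```java
/** Toy counter-based check for one repeated element particle (not Saxon's implementation). */
final class OccurrenceCounterSketch {
    static final int UNBOUNDED = -1; // hypothetical marker for maxOccurs="unbounded"

    private final String elementName;
    private final int minOccurs;
    private final int maxOccurs;
    private int count = 0; // the single counter that replaces unfolded states

    OccurrenceCounterSketch(String elementName, int minOccurs, int maxOccurs) {
        this.elementName = elementName;
        this.minOccurs = minOccurs;
        this.maxOccurs = maxOccurs;
    }

    /** Consumes one element; returns false if the name is wrong or maxOccurs would be exceeded. */
    boolean accept(String name) {
        if (!elementName.equals(name)) {
            return false;
        }
        if (maxOccurs != UNBOUNDED && count >= maxOccurs) {
            return false;
        }
        count++;
        return true;
    }

    /** True once enough occurrences have been seen to satisfy minOccurs. */
    boolean isComplete() {
        return count >= minOccurs;
    }

    public static void main(String[] args) {
        // Large bounds cost no extra memory here: only the counter changes.
        OccurrenceCounterSketch particle = new OccurrenceCounterSketch("item", 2, 100000);
        particle.accept("item");
        System.out.println("complete after 1: " + particle.isComplete());
        particle.accept("item");
        System.out.println("complete after 2: " + particle.isComplete());
    }
}
```

Because the check knows the current count and the bounds, a failure can say which limit was violated, which fits the more specific diagnostics mentioned in these notes.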
A side-effect of this change is that the diagnostics are more specific when a validation failure occurs.
Another side-effect, hopefully temporary, is that some rather artificial type derivations are no longer allowed: specifically those where a wildcard with maxOccurs in the base type is specialized to a sequence of specific element particles in the derived type.