Saxonica.com

Changes to Schema Processing

In both XQuery and XSLT, Saxon now does much more static checking of the stylesheet or query against the schema. Provided that constructed elements are validated, and that the names of elements and attributes are known statically, Saxon checks at compile time that the elements and attributes used to construct the content of an element of known type are permitted by the content model of that element. If the value of an element or attribute is given as a constant string, Saxon also checks that the string is valid for the type of that element or attribute. This is by no means a complete static check (for example, there is no check that elements are output in the correct order, or that mandatory elements and attributes are present), but it allows many errors to be detected at compile time that would otherwise be found only by validating the result tree during or after construction.

Note that these static checks are performed only if the type of the constructed elements is declared, using the validate expression in XQuery or the [xsl:]validation and [xsl:]type attributes in XSLT.
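
As an illustration of the kind of construct these checks apply to, here is a minimal XSLT 2.0 sketch. The schema file invoice.xsd and the element and attribute names are hypothetical: the schema is assumed to declare a global element invoice with an xs:date attribute named issued and a single child element total of type xs:decimal. Which of the two problems a given Saxon release actually reports at compile time is not stated here, so treat the comments as describing candidates for static detection only.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical no-namespace schema declaring <invoice> (assumption) -->
  <xsl:import-schema schema-location="invoice.xsd"/>

  <xsl:template match="/">
    <!-- xsl:validation="strict" declares that the constructed element
         is to be validated against the declaration of <invoice>,
         so its type is known statically -->
    <invoice xsl:validation="strict" issued="2005-13-01">
      <!-- "2005-13-01" is a constant string that is not a valid xs:date -->
      <total>12.50</total>
      <!-- <discount> is assumed not to appear in the content model
           of <invoice>, and its name is known statically -->
      <discount>1.00</discount>
    </invoice>
  </xsl:template>

</xsl:stylesheet>
```

Without the xsl:validation attribute (or an xsl:type attribute), the type of the constructed element is not declared and no such checking applies.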

Restrictions on the use of namespace-sensitive values (QNames and NOTATIONs) have been removed. These values may now be freely used as fixed values, default values, and enumeration values, and they may appear in lists and unions.

Note that if a QName is used as a default value, the value that is added to a source document is the lexical value specified in the schema document. No attempt is made to ensure that the relevant namespace prefix is declared in the source document. The XML Schema specification does not make it clear what is supposed to happen here, so this usage is best avoided.
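
The following schema fragment is a small sketch of the situation described above. The element name, attribute name, and namespace URI are made up for illustration.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:c="http://example.com/codes">

  <xs:element name="item">
    <xs:complexType>
      <!-- A QName used as a default value. The prefix "c" is bound
           in this schema document only. -->
      <xs:attribute name="kind" type="xs:QName" default="c:standard"/>
    </xs:complexType>
  </xs:element>

  <!-- When a source document containing just <item/> is validated,
       the attribute kind="c:standard" is added using the lexical
       value above. Nothing ensures that the prefix "c" is declared
       in the source document. -->

</xs:schema>
```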

The implementation actually defers the checking of namespace prefixes until the full namespace context in the result tree is known. A consequence of this is that if a QName-validated attribute is added to an element that specifies validate="strip", the type annotation may be removed before the prefix of the QName is ever checked.

There has been a significant reorganisation of the class hierarchy for classes holding type information. There is now a hierarchy of interfaces representing the upper levels of the XML Schema type hierarchy (as modified by XPath): this includes SchemaType with subtypes ComplexType and SimpleType, and various other interfaces such as AtomicType. There are separate implementations of these interfaces for built-in types (available in Saxon-B) and for user-defined types (available only in Saxon-SA). This has given the two packages a cleaner structure, with less tendency for schema-aware code to clutter the non-schema-aware product, and less need for artificial marker interfaces.

In the schema data model, an AttributeUse is now distinguished from an AttributeDecl. For a global attribute declaration, all the information is in an AttributeDecl; for an attribute reference of the form <xs:attribute ref="x"> all the information is in an AttributeUse; for a locally-declared attribute (the most common case) the information is divided between an AttributeDecl and an AttributeUse. This brings the Saxon model closer to the schema component model described in the W3C specification.
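
The three cases can be seen side by side in a schema fragment such as the following sketch. The names are illustrative only. The comments on which property lives where follow the W3C component model, and it is an assumption that Saxon divides the information in exactly the same way.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global attribute declaration: represented by an AttributeDecl -->
  <xs:attribute name="lang" type="xs:language"/>

  <xs:element name="para">
    <xs:complexType mixed="true">

      <!-- Attribute reference: represented by an AttributeUse -->
      <xs:attribute ref="lang" use="required"/>

      <!-- Local declaration: split between an AttributeDecl
           (in the W3C model: name and type) and an AttributeUse
           (in the W3C model: use and default value) -->
      <xs:attribute name="indent" type="xs:integer"
                    use="optional" default="0"/>

    </xs:complexType>
  </xs:element>

</xs:schema>
```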

Similarly, an ElementParticle is now distinguished from an ElementDecl. For a locally-declared element, there will be one ElementParticle and one ElementDecl.
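
A corresponding sketch for elements, again with made-up names. In the W3C component model the particle carries the occurrence limits and the declaration carries the name and type. That Saxon's ElementParticle and ElementDecl divide the information this way is an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <!-- Locally-declared element: one ElementDecl (name "line",
             type xs:string) and one ElementParticle (minOccurs and
             maxOccurs) -->
        <xs:element name="line" type="xs:string"
                    minOccurs="1" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```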
