Validation from Python

Schema validation can be controlled using the SaxonC Python interface. XML Schema validation requires SaxonC-EE, so you will need to install the saxoncee package.

The Python interface allows schemas to be loaded into a PySchemaValidator and then used for validating instance documents, or for schema-aware XSLT and XQuery processing.

The main steps are:

  1. Create a PySaxonProcessor using the constructor proc = PySaxonProcessor(license=True) (schema processing requires SaxonC-EE). Then call the method new_schema_validator() to create a new PySchemaValidator.

  2. Set any options required on the PySchemaValidator to control the way in which schema documents will be loaded, and the way a validation episode is invoked and performed (for example, the source document, current working directory and schema parameters).

  3. Register and load a schema document by calling the register_schema() method, using the keyword xsd_text, xsd_node or xsd_file to load a schema from a lexical string, a PyXdmNode object or a file, respectively. When loading from a file on disk, the file can be either a schema document in source XSD format or a compiled schema in the Saxon-defined SCM format (as produced using the export_schema() method).

  4. To validate an instance document, call the validate() or validate_to_node() methods on the PySchemaValidator object.

  5. Validation errors are reported to the standard error listener. It is also possible to read the validation_report property to obtain a validation report as a PyXdmNode object.
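
The steps above can be sketched as follows. This is a minimal sketch, not a definitive recipe: it assumes the saxoncee package and a valid SaxonC-EE license are available. The helper name validate_file and its arguments are hypothetical, and the "report-node" property is an assumption carried over from the SaxonC samples that may not be needed in every release. The import sits inside the function so that the module can be loaded on a machine without SaxonC installed.

```python
def validate_file(xsd_path, xml_path):
    """Validate xml_path against the schema at xsd_path (hypothetical helper).

    Returns the validation report serialized as a string, or None if the
    validator did not produce a report.
    """
    # Imported lazily so this module loads even where saxoncee is absent.
    from saxoncee import PySaxonProcessor

    # Step 1: schema processing needs a licensed SaxonC-EE processor.
    with PySaxonProcessor(license=True) as proc:
        validator = proc.new_schema_validator()

        # Step 2: set options. "report-node" is an assumption taken from the
        # SaxonC samples; it asks for the report to be built as a node.
        validator.set_property("report-node", "true")

        # Step 3: load the schema (source XSD or compiled SCM file).
        validator.register_schema(xsd_file=xsd_path)

        # Step 4: validate the instance. Depending on the release, an invalid
        # document may raise an exception, so it is caught broadly here.
        try:
            validator.validate(file_name=xml_path)
        except Exception as err:
            print(f"Validation failed: {err}")

        # Step 5: read the report while the processor is still open.
        report = validator.validation_report
        return None if report is None else str(report)
```

A call such as validate_file("books.xsd", "books.xml") (both file names are placeholders) would return the serialized report, if one was produced.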

Note that additional schemas referenced from xsi:schemaLocation attributes within the source documents will be loaded as necessary. By default, a schema reference for a target namespace is ignored if a schema for that namespace has already been loaded; Saxon makes no attempt to load multiple schemas for the same namespace and check them for consistency. This behaviour can be changed using the configuration option MULTIPLE_SCHEMA_IMPORTS.
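
As a hedged illustration, the option could be switched on through the processor's set_configuration_property() method. The feature URI and the value "on" below are assumptions based on the Saxon configuration feature naming, and the helper name is hypothetical; check the configuration reference for the release in use.

```python
# Assumed URI for the MULTIPLE_SCHEMA_IMPORTS configuration feature.
MULTIPLE_SCHEMA_IMPORTS = "http://saxon.sf.net/feature/multipleSchemaImports"


def new_processor_with_multiple_imports():
    """Create a licensed processor with MULTIPLE_SCHEMA_IMPORTS enabled
    (hypothetical helper)."""
    # Imported lazily so this module loads even where saxoncee is absent.
    from saxoncee import PySaxonProcessor

    proc = PySaxonProcessor(license=True)
    # Configuration values are passed as strings; "on" is assumed to be
    # accepted as a boolean true.
    proc.set_configuration_property(MULTIPLE_SCHEMA_IMPORTS, "on")
    return proc
```

The property is set before any schema validator is created, so that it applies to every schema loaded through the processor.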

Although the API is defined in such a way that a PySchemaValidator is created for a particular schema, in the Saxon implementation the schema components that are available to the validator are not only the components within that schema, but all the components that form part of any schema registered with the PySaxonProcessor.

The PySchemaValidator can be used with the PyDocumentBuilder class to parse and validate XML documents.
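
A minimal sketch of that combination, assuming the saxoncee package and a valid license. The helper name parse_with_validation and its arguments are hypothetical, and the behaviour on an invalid document (an exception or an error report) may vary between releases.

```python
def parse_with_validation(xsd_path, xml_path):
    """Parse xml_path with schema validation applied during document
    building (hypothetical helper). Returns the document serialized as a
    string."""
    # Imported lazily so this module loads even where saxoncee is absent.
    from saxoncee import PySaxonProcessor

    with PySaxonProcessor(license=True) as proc:
        # Load the schema into a validator first.
        validator = proc.new_schema_validator()
        validator.register_schema(xsd_file=xsd_path)

        # Attach the validator to a document builder, then parse.
        builder = proc.new_document_builder()
        builder.set_schema_validator(validator)
        doc = builder.parse_xml(xml_file_name=xml_path)

        # Serialize while the processor is still open.
        return str(doc)
```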