Validation from Python

The SaxonC Python interface allows schemas to be loaded into a PySaxonProcessor, and then to be used for validating instances, or for schema-aware XSLT and XQuery processing.

The main steps are:

  1. Create a PySaxonProcessor using the constructor proc = PySaxonProcessor(license=True). (Note that schema processing requires SaxonC-EE.)

  2. Call the new_xsd_compiler() method to get a PyXsdCompiler. It is possible to get multiple schema compilers, or to use a single schema compiler to compile multiple schemas.

  3. If required, set options on the PyXsdCompiler to control the way in which schema documents will be loaded. It is possible to set the XSD version (to 1.0 or 1.1).

  4. Construct a PyXsdSchema object, representing a compiled schema, by calling the compile() method (using one of the keywords xsd_text, xsd_node or xsd_file to compile a schema from a lexical string, a PyXdmNode object or a file, respectively). The compiled schema is a distinct object; compiling it does not make it globally available to other applications using the same PySaxonProcessor.

  5. To validate an instance document, first call the new_schema_validator() method on the PyXsdSchema object. This returns a PySchemaValidator.

  6. Set options on the PySchemaValidator to control the way in which a particular validation episode is performed, and then invoke its validate() or validate_to_node() method to validate an instance document. Options available include the ability to control whether additional schema components can be referenced from the instance document using xsi:schemaLocation attributes.

  7. Validation errors are reported to the standard error listener. It is also possible to read the validation_report property of the PySchemaValidator to obtain a validation report as a PyXdmNode object.

  8. It is possible to save a compiled schema in an SCM (schema component model) file using the export_components() method on the PyXsdSchema object, and to import a previously-saved schema by calling import_components() on the PyXsdCompiler object.

  9. If the schema is to be imported into a schema-aware XSLT stylesheet, XQuery, or XPath expression, it can be supplied to the relevant PyXslt30Processor, PyXQueryProcessor, or PyXPathProcessor via its useSchema() method.
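
Steps 1 to 7 can be sketched roughly as follows. This is an illustration under stated assumptions rather than a definitive recipe: the module name saxoncee and the file_name keyword of validate() are assumptions that the text above does not confirm, and the processor_cls parameter is a hypothetical hook added only so that the sketch can be exercised without a licensed SaxonC-EE installation.

```python
def validate_instance(xsd_path, xml_path, processor_cls=None):
    """Compile a schema, validate one instance, and return the validation report."""
    if processor_cls is None:
        # Assumption: the SaxonC-EE Python module is importable as `saxoncee`.
        from saxoncee import PySaxonProcessor
        processor_cls = PySaxonProcessor

    proc = processor_cls(license=True)            # step 1: licensed processor
    compiler = proc.new_xsd_compiler()            # step 2: schema compiler
    schema = compiler.compile(xsd_file=xsd_path)  # step 4: compiled schema
    validator = schema.new_schema_validator()     # step 5: validator for this schema
    # Assumption: validate() accepts the instance location as `file_name`.
    validator.validate(file_name=xml_path)        # step 6: run the validation
    return validator.validation_report            # step 7: report as a node
```

A call such as validate_instance("books.xsd", "books.xml") (both file names are hypothetical) would then return the report node, while any validation errors are still sent to the standard error listener.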

Note that additional schemas referenced from the xsi:schemaLocation attributes within the source documents will be loaded as necessary. By default a target namespace is ignored if there is already a loaded schema for that namespace; Saxon makes no attempt to load multiple schemas for the same namespace and check them for consistency. This behaviour can be changed using the configuration option MULTIPLE_SCHEMA_IMPORTS.

Unlike Saxon versions prior to Saxon 13, compiling a schema does not make it globally available to all applications using a given PySaxonProcessor. Each compiled schema is distinct and can be used individually without interfering with other schemas. However, all the schemas used within a PySaxonProcessor must be consistent. Consistency is defined in the version 4.0 language specifications: in simplified terms, it means that the same name must not be used in different schemas to refer to different element or type declarations.
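
To make the simplified consistency rule concrete, the following toy check uses only the Python standard library. It is not how Saxon performs the check and it does not implement the full definition in the specifications: it merely collects the top-level element and type declarations of each schema, takes a crude structural fingerprint of each one, and reports names that are declared differently in different schemas. The function names and the two sample schemas are made up for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

def _shape(elem):
    # Crude fingerprint: tag, sorted attributes, and the shapes of the children.
    return (elem.tag, tuple(sorted(elem.attrib.items())),
            tuple(_shape(child) for child in elem))

def global_declarations(xsd_text):
    """Map (kind, expanded name) to the fingerprint of each top-level declaration."""
    root = ET.fromstring(xsd_text)
    tns = root.get("targetNamespace", "")
    decls = {}
    for child in root:
        if child.tag == XS + "element":
            kind = "element"
        elif child.tag in (XS + "complexType", XS + "simpleType"):
            kind = "type"
        else:
            continue  # ignore imports, annotations, attributes, groups, ...
        decls[(kind, "{%s}%s" % (tns, child.get("name")))] = _shape(child)
    return decls

def inconsistent_names(*schemas):
    """Return the names that are declared differently in different schemas."""
    seen = {}
    clashes = set()
    for text in schemas:
        for key, shape in global_declarations(text).items():
            # setdefault records the first declaration and returns it thereafter.
            if seen.setdefault(key, shape) != shape:
                clashes.add(key)
    return sorted(clashes)

SCHEMA_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item" type="xs:string"/>
</xs:schema>"""

SCHEMA_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item" type="xs:integer"/>
</xs:schema>"""

print(inconsistent_names(SCHEMA_A, SCHEMA_B))  # [('element', '{}item')]
```

Here both schemas declare a no-namespace element named item, but with different types, so in the simplified sense described above they could not be used together within one PySaxonProcessor.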

The PySchemaValidator can be used with the PyDocumentBuilder class to parse and validate XML documents.