saxonica.com

Importing and Exporting Schema Component Models

From version 9.0, Saxon provides the ability to export or import a compiled schema. The export format is an XML file, known as an SCM file (for schema component model). Exporting a schema in SCM format makes it quicker to reload when the same schema is used again. The format is also easier for programs to analyze than raw XSD schema documents.
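
Because an SCM file is ordinary XML, it can be inspected with any XML parser. The following is a minimal sketch, using only the standard Java DOM API, that counts the elements in an SCM file by local name to give a rough inventory of its contents. The class name ScmSummary and the method countElements are illustrative assumptions, not part of Saxon, and the code assumes nothing about the SCM vocabulary itself.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper (not part of Saxon): summarizes the contents of an SCM file.
public class ScmSummary {

    /** Returns a map from element local name to the number of occurrences in the file. */
    public static Map<String, Integer> countElements(File scmFile) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true); // needed so that getLocalName() is populated
        Document doc = factory.newDocumentBuilder().parse(scmFile);

        Map<String, Integer> counts = new TreeMap<>();
        // "*" as namespace and local name selects every element in the document
        NodeList all = doc.getElementsByTagNameNS("*", "*");
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            counts.merge(e.getLocalName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        // Usage: java ScmSummary path/to/file.scm
        countElements(new File(args[0]))
                .forEach((name, n) -> System.out.println(name + ": " + n));
    }
}
```

The meaning of each element and attribute is given by the annotated schema for the SCM format, so a more serious analysis tool would be written against that schema rather than against bare element counts.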

The simplest way to create an SCM file is from the command line, using the com.saxonica.Validate command with the -scmout option, as described in the documentation for that command. Alternatively, an SCM file can be generated programmatically using the exportComponents() method of the com.saxonica.EnterpriseConfiguration class, which is described in the Javadoc. The serializer is unselective: it outputs an SCM file containing all the schema components that have been loaded into the Configuration, other than built-in schema components.

An SCM file can be imported using the -scmin option of the com.saxonica.Validate command. It can also be loaded programmatically using the SchemaModelLoader class. For example:


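// config is the existing schema-aware Configuration that will own the components
// (StreamSource is javax.xml.transform.stream.StreamSource; File is java.io.File)
SchemaModelLoader loader = new SchemaModelLoader(config);
loader.load(new StreamSource(new File("input.scm")));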

A schema loaded in this way is then available for all tasks performed using this Configuration, including validation of source documents and compiling of schema-aware queries and stylesheets. In particular, it can be used when compiled queries are run under this Configuration.

Schema Component Models can also be imported and exported using the importComponents() and exportComponents() methods of the SchemaManager in the s9api interface.
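
As an illustration of the s9api route, the following sketch loads an XSD schema, exports it as an SCM file, and then imports that file into a fresh Processor. It assumes a licensed Saxon-EE installation on the classpath; the class name ScmRoundTrip and its method names are hypothetical, chosen only for this example.

```java
import java.io.File;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.s9api.Processor;
import net.sf.saxon.s9api.SaxonApiException;
import net.sf.saxon.s9api.SchemaManager;

// Hypothetical example class: exports and re-imports a schema component model.
public class ScmRoundTrip {

    /** Compiles the schema in xsd and writes all loaded components to scm. */
    public static void exportScm(File xsd, File scm) throws SaxonApiException {
        Processor processor = new Processor(true); // true requests a licensed (schema-aware) configuration
        SchemaManager manager = requireSchemaManager(processor);
        manager.load(new StreamSource(xsd));
        // The destination is a Serializer writing to the target file
        manager.exportComponents(processor.newSerializer(scm));
    }

    /** Loads a previously exported SCM file into a new Processor. */
    public static SchemaManager importScm(File scm) throws SaxonApiException {
        Processor processor = new Processor(true);
        SchemaManager manager = requireSchemaManager(processor);
        manager.importComponents(new StreamSource(scm));
        return manager;
    }

    private static SchemaManager requireSchemaManager(Processor processor) {
        // getSchemaManager() returns null when the Processor is not schema-aware
        SchemaManager manager = processor.getSchemaManager();
        if (manager == null) {
            throw new IllegalStateException("Schema processing is not available in this configuration");
        }
        return manager;
    }
}
```

After importScm() returns, the imported components belong to the Processor that owns the returned SchemaManager, so validators and schema-aware compilers should be created from that same Processor.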

The structure of an SCM file is defined in the schema scmschema.xsd which is available in the directory samples/scm/ in the saxon-resources download file. This is annotated to explain the mappings between elements and attributes in the SCM file and components and properties as defined in the W3C XML Schema Specification. The same directory contains a file scmschema.scm which contains the schema for SCM in SCM format.

The SCM file includes a representation of the finite state machines (FSMs) used to validate instances against a complex type. This means that the FSMs do not need to be regenerated when a schema is loaded from an SCM file, which makes loading considerably faster. However, it also means that the SCM format is not currently suitable as a target format for software-generated schemas, because the generating software would have to construct the finite state machines itself. A variant of SCM in which the finite state machines can be omitted may be provided in a future release.
