Running Validation from the Command Line

The Java class com.saxonica.Validate allows you to validate a source XML document against a given schema, or simply to check a schema for internal correctness.

To validate one or more source documents, using the Java platform, write:

java com.saxonica.Validate [options] source.xml...

The equivalent on the .NET platform is:

Validate [options] source.xml...

It is possible to use glob syntax to process multiple files, for example Validate *.xml.

In the above form, the command relies on the use of xsi:schemaLocation attributes within the instance document to identify the schema to be loaded. As an alternative, the schema can be specified on the command line:

[java com.saxonica.Validate | Validate] -xsd:schema.xsd -s:instance.xml

In this form of the command, it is possible to specify multiple schema documents and/or multiple instance documents, in both cases as a semicolon-separated list. Glob syntax (such as *.xml) is available only if the -s: prefix is omitted, because the shell has to recognize the argument as a filename.

Thus, source files to be validated can be listed either using the -s option, or in any argument that is not prefixed with "-". This allows the standard wildcard expansion facilities of the shell interpreter to be used, for example *.xml validates all files in the current directory with extension "xml".
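
One practical point when using the semicolon-separated form: in most Unix shells an unquoted semicolon ends the command, so the list generally needs quoting. The following is a minimal shell sketch, not part of the Saxon distribution; the helper join_semicolons and the file names order1.xml and order2.xml are hypothetical. It prints the command rather than running it.

```shell
# Hypothetical helper: join all arguments with ";" (a.xml b.xml -> a.xml;b.xml).
# The function body runs in a subshell, so the IFS change does not leak out.
join_semicolons() (
  IFS=';'
  printf '%s\n' "$*"
)

# Illustrative file names only.
sources=$(join_semicolons order1.xml order2.xml)

# Quote the whole option so the shell does not treat ";" as a command separator.
# "echo" only prints the command; remove it to run the validation.
echo java com.saxonica.Validate -xsd:schema.xsd "-s:$sources"
```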

If no instance documents are supplied, the effect of the command is simply to check a schema for internal correctness. So a schema can be verified using the command:

[java com.saxonica.Validate | Validate] -xsd:schema.xsd

More generally the syntax of the command is:

[java com.saxonica.Validate | Validate] [options] [params] [filenames]

where options generally take the form -code:value and params take the form keyword=value.

Command line options

The options are as follows (in any order):

-catalog:filenames

filenames is either a file name or a list of file names separated by semicolons; the files are OASIS XML catalogs used to define how public identifiers and system identifiers (URIs) used in a source document or schema are to be redirected, typically to resources available locally. For more details see Using XML Catalogs.

-config:filename

Loads options from a configuration file. This must describe a schema-aware configuration.

-dtd:(on|off|recover)

Setting -dtd:on requests DTD-based validation of the source files. Requires an XML parser that supports validation. The setting -dtd:off (which is the default) suppresses DTD validation. The setting -dtd:recover performs DTD validation but treats the error as non-fatal if it fails. Note that any external DTD is likely to be read even if not used for validation, because DTDs can contain definitions of entities.

-export:filename

Writes a copy of the compiled schema (provided it is valid) to the specified XML file, in the form of a schema component model (SCM). This file will contain schema components corresponding to all the loaded schema documents. This option may be combined with other options: the SCM file is written after all document instance validation has been carried out.

-ext:(on|off)

If -ext:off is specified, calls on dynamically-loaded external Java functions are suppressed. This does not affect calls on integrated extension functions, including Saxon and EXSLT extension functions. This option is useful when loading an untrusted schema, perhaps from a remote site using an http:// URL; it ensures that the schema cannot call arbitrary Java methods and thereby gain privileged access to resources on your machine.

-init:initializer

The value is the name of a user-supplied class that implements the interface net.sf.saxon.lib.Initializer; this initializer will be called during the initialization process, and may be used to set any options required on the Configuration programmatically.

-limits:min,max

Sets upper limits on the values of minOccurs and maxOccurs allowed in a schema content model, in cases where Saxon is not able to implement the rules using a finite state machine with counters. For further details see Handling minOccurs and maxOccurs.

-opt:0...10

Sets the optimization level. The value is an integer in the range 0 (no optimization) to 10 (full optimization); currently all values other than 0 result in full optimization, but this is likely to change in future. The default is full optimization; this feature allows optimization to be suppressed in cases where reducing compile time is important, or where optimization gets in the way of debugging, or causes extension functions with side-effects to behave unpredictably. (Note, however, that even with no optimization, lazy evaluation may still cause the evaluation order to differ from what is expected.)

-quit:(on|off)

With the default setting, on, the command will quit the Java VM and return an exit code if a failure occurs. This is useful when running from an operating system shell. With the setting -quit:off the command instead throws a RuntimeException, which is more useful when the command is invoked from another Java application such as Ant.
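
When the command is run from a shell script with the default -quit:on, the exit code can be tested in the usual way. The following is a minimal sketch, assuming only that a failure produces a non-zero exit code (the specific values are not listed here); the wrapper name run_validation is hypothetical.

```shell
# Hypothetical wrapper: run the command given as arguments and report the outcome.
run_validation() {
  if "$@"; then
    echo "validation succeeded"
  else
    status=$?   # exit code of the command that was just run
    echo "validation failed (exit code $status)"
    return "$status"
  fi
}
```

It might be called as: run_validation java com.saxonica.Validate -xsd:schema.xsd -s:instance.xml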

-r:classname

Use the specified URIResolver to process the URIs of all schema documents and source documents. The URIResolver is a user-defined class that implements the URIResolver interface defined in JAXP, whose function is to take a URI supplied as a string and return a JAXP Source object. It is invoked to process URIs found in xs:include and xs:import schemaLocation attributes of schema documents, the URIs found in xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the source document, and (if -u is also specified) to process the URI of the source file provided on the command line. Specifying -r:org.apache.xml.resolver.tools.CatalogResolver selects the Apache XML resolver (part of the Apache Commons project, which must be on the classpath) and enables URIs to be resolved via a catalog, allowing references to external websites to be redirected to local copies.

-report:filename

This option switches on the capture of validation reporting. Here filename specifies the file to which the validation report is written. The report is in XML format, defined by a schema that is available in the saxon-resources download file (see validation-reports.xsd).

-s:file;file...

Supplies a list of source documents to be validated. Each document is validated using the same options. The value is a list of filenames separated by semicolons. It is also possible to specify the names of source documents as arguments without any preceding option flag; in this case shell wildcards can be used. A filename can be specified as "-" to read the source document from standard input.

By default, multiple source documents are validated concurrently, in parallel threads. The number of threads used is set to the number of processors available on the machine. If the Configuration option FeatureKeys.ALLOW_MULTITHREADING is set to false, the source documents are validated sequentially in a single thread.

-scmin:filename

Loads a precompiled schema component model from the given file. The file should be generated in a previous run using the -export option. When this option is used, the -xsd option should not be present. Schemas loaded from an SCM file are assumed to be valid, without checking.

This option is retained for compatibility. From Saxon 9.7, SCM files can also be supplied in the -xsd option.

-scmout:filename

Synonym of -export:filename, retained for compatibility.

-stats:filename

Requests creation of an XML document containing statistics showing which schema components were used during the validation episode, and how often (coverage data). This data can be used as input to further processes to produce user-readable reports; for example the data could be combined with the output of -scmout to show which components were not used at all during the validation.

-t

Requests display of version and timing information to the standard error output. This also shows all the schema documents that have been loaded.

-top:element-name

Requires that the outermost element of the instance being validated has the required name. This is written in Clark notation format {uri}local.
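
As an illustration of the Clark notation, the shell sketch below builds the option value from a namespace URI and a local name. The helper clark_name, the namespace http://example.com/ns, and the element name invoice are hypothetical; the command is printed rather than run.

```shell
# Hypothetical helper: format a namespace URI and local name as {uri}local.
clark_name() {
  printf '{%s}%s\n' "$1" "$2"
}

top=$(clark_name "http://example.com/ns" "invoice")

# Quote the option so the braces reach the command unchanged.
# "echo" only prints the command; remove it to run the validation.
echo java com.saxonica.Validate -xsd:schema.xsd "-top:$top" instance.xml
```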

-u

Indicates that the names of the source document and schema document are supplied as URIs; otherwise they are taken as filenames, unless they start with "http:" or "file:", in which case they are taken as URLs.

-val:(strict|lax)

Invokes strict or lax validation (default is strict). Lax validation validates elements only if there is an element declaration to validate them against, or if they have an xsi:type attribute.

-x:classname

Requests use of the specified SAX parser for parsing the source file. The classname must be the fully-qualified name of a Java class that implements the org.xml.sax.XMLReader interface. In the absence of this argument, the standard JAXP facilities are used to locate an XML parser. Note that the XML parser performs the raw XML parsing only; Saxon always does the schema validation itself. Selecting -x:org.apache.xml.resolver.tools.ResolvingXMLReader selects a parser configured to use the Apache entity resolver, so that DTD and other external references in source documents are resolved via a catalog. The parser (part of the Apache Commons project) must be on the classpath.

-xi:(on|off)

Apply XInclude processing to all source XML documents (but not to schema documents). This currently only works when documents are parsed using the Xerces parser, which is the default in JDK 1.5 and later.

-xmlversion:(1.0|1.1)

If set to 1.1, allows XML 1.1 and XML Namespaces 1.1 constructs. This option must be set if source documents using XML 1.1 are to be validated, or if the schema itself is an XML 1.1 document. This option causes types such as xs:Name, xs:QName, and xs:ID to use the XML 1.1 definitions of these constructs.

-xsd:file;file...

Supplies a list of schema documents to be used for validation. The value is a list of filenames separated by semicolons. If no source documents are supplied, the schema documents will be processed and any errors in the schema will be reported. This option must not be used when -scmin is specified. The option may be omitted, in which case the schema to be used for validation will be located using the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the source document. A filename can be specified as "-" to read the schema from standard input.

The documents may either be source XSD schema documents, or compiled SCM files generated previously using the -export option. Loading precompiled schemas in SCM format is substantially faster. In addition, an SCM file may contain an embedded license key, in which case it is possible to use it for validation using a Saxon-EE configuration that does not have its own license.

-xsdversion:(1.0|1.1)

Indicates whether the schema processor is to act as an XSD 1.0 or XSD 1.1 processor. The default is XSD 1.0. New features in XSD 1.1 are not permitted unless -xsdversion:1.1 is specified.

-xsiloc:(on|off)

If set to on (the default) the schema processor attempts to load any schema documents referenced in xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes in the instance document, unless a schema for the specified namespace (or non-namespace) is already available. If set to off, these attributes are ignored.

-y:classname

Use the specified SAX parser for schema documents. The supplied classname must be the fully-qualified name of a Java class that either implements the org.xml.sax.XMLReader interface or extends the javax.xml.parsers.SAXParserFactory class, and it must be instantiable using a zero-argument public constructor.

--feature:value

Set a feature defined in the Configuration interface. The names of features are defined in the Javadoc for class FeatureKeys: the value used here is the part of the name after the last "/", for example --allow-external-functions:off. Only features accepting a string or boolean may be set; for booleans the values true/false or on/off are recognized.

-?

Display command syntax.

Command line parameters

Parameters on the command line can be used to supply values for any saxon:param declarations in the schema. See Parameterizing Schemas for details. The format of parameters is the same as for the XSLT and XQuery command lines: name=value to supply a simple value; +name=filename to supply the contents of an XML document as the parameter value; or ?name=expression to supply the result of evaluating an XPath expression (for example, ?date=current-date()).
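
The three parameter forms might be combined as in the following shell sketch. The function name print_validate_command, the parameter names (limit, config, date), and the file names are hypothetical. Quoting matters in a shell, because ? is a wildcard character and parentheses are shell syntax. The command is printed rather than run.

```shell
# Hypothetical helper: print a Validate command with the given parameters
# placed after the options and before the source file name.
# "echo" only prints the command; remove it to run the validation.
print_validate_command() {
  echo java com.saxonica.Validate -xsd:schema.xsd "$@" instance.xml
}

# One parameter per form:
#   name=value        a simple value
#   +name=filename    the contents of an XML document
#   ?name=expression  the result of evaluating an XPath expression
print_validate_command 'limit=100' '+config=settings.xml' '?date=current-date()'
```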

The results of processing the schema, and of validating the source document against the schema, are written to the standard error output. Unless the -t option is used, successful processing of the source document and schema results in no output.