Saxon.Api

Class DocumentBuilder


public class DocumentBuilder

The DocumentBuilder class enables XDM documents to be built from various sources. The class is always instantiated using the NewDocumentBuilder method on the Processor object.

Property Summary

 Uri BaseUri

The base URI of a document loaded using this DocumentBuilder. This is used for resolving any relative URIs appearing within the document, for example in references to DTDs and external entities.

 XQueryExecutable DocumentProjectionQuery

Set a compiled query to be used for implementing document projection.

 bool DtdValidation

Determines whether DTD validation is applied to documents loaded using this DocumentBuilder.

 bool IsLineNumbering

Determines whether line numbering is enabled for documents loaded using this DocumentBuilder.

 SchemaValidationMode SchemaValidationMode

Determines whether schema validation is applied to documents loaded using this DocumentBuilder, and if so, whether it is strict or lax.

 SchemaValidator SchemaValidator

Gets or sets the SchemaValidator to be used. This determines whether schema validation is applied to an input document and whether type annotations in a supplied document are retained. If no SchemaValidator is supplied, schema validation does not take place.

 QName TopLevelElementName

The required name of the top level element in a document instance being validated against a schema.

 TreeModel TreeModel

The Tree Model implementation to be used for the constructed document. By default the TinyTree is used. The main reason to use the LinkedTree alternative is to support updating (the TinyTree is not updateable).

 WhitespacePolicy WhitespacePolicy

Determines the whitespace stripping policy applied when loading a document using this DocumentBuilder.

 XmlResolver XmlResolver

An XmlResolver, which will be used to resolve URIs of documents being loaded and of references to external entities within those documents (including any external DTD).


Method Summary

 XdmNode Build(Uri uri)

Load an XML document, retrieving it via a URI.

 XdmNode Build(Stream input)

Load an XML document supplied as raw (lexical) XML on a Stream.

 XdmNode Build(TextReader input)

Load an XML document supplied using a TextReader.

 XdmNode Build(XmlReader reader)

Load an XML document, delivered using an XmlReader.

 XdmNode Build(XmlNode source)

Load an XML DOM document, supplied as an XmlNode, into a Saxon XdmNode.

 XdmNode Wrap(XmlDocument doc)

Wrap an XML DOM document, supplied as an XmlDocument, as a Saxon XdmNode.


Property Detail

BaseUri

public Uri BaseUri {get; set; }

The base URI of a document loaded using this DocumentBuilder. This is used for resolving any relative URIs appearing within the document, for example in references to DTDs and external entities.

This information is required when the document is loaded from a source that does not provide an intrinsic URI, notably when loading from a Stream or a TextReader.
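As an illustration of the point above, the following minimal sketch builds a document from a TextReader after setting BaseUri. The class name BaseUriExample, the helper BuildFromString, and the example.com URI are hypothetical and are not part of Saxon.Api.

```csharp
using System;
using System.IO;
using Saxon.Api;

public static class BaseUriExample
{
    // Parses XML held in a string. A string has no intrinsic URI,
    // so the caller supplies the base URI explicitly.
    public static XdmNode BuildFromString(string xml, Uri baseUri)
    {
        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.BaseUri = baseUri;
        using (TextReader reader = new StringReader(xml))
        {
            return builder.Build(reader);
        }
    }

    public static void Main()
    {
        // The URI below is a made-up example value.
        XdmNode doc = BuildFromString(
            "<greeting>hello</greeting>",
            new Uri("http://example.com/input/greeting.xml"));
        Console.WriteLine(doc.BaseUri);
        Console.WriteLine(doc.OuterXml);
    }
}
```

Relative references inside the document, such as a DTD system identifier, would then be resolved against the supplied URI.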

DocumentProjectionQuery

public XQueryExecutable DocumentProjectionQuery {get; set; }

Set a compiled query to be used for implementing document projection.

The effect of using this option is that the tree constructed by the DocumentBuilder contains only those parts of the source document that are needed to answer this query. Running this query against the projected document should give the same results as against the raw document, but the projected document typically occupies significantly less memory. It is permissible to run other queries against the projected document, but unless they are carefully chosen, they will give the wrong answer, because the document being used is different from the original.

The query should be written to use the projected document as its initial context item. For example, if the query is //ITEM[COLOR='blue'], then only ITEM elements and their COLOR children will be retained in the projected document.

This facility is only available in Saxon-EE; if the facility is not available, setting this property has no effect.
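A minimal sketch of document projection, assuming a Saxon-EE installation with a valid license. The class ProjectionExample, the helper CountMatches, the sample XML, and the example.com URI are hypothetical. The query is compiled once, used to project the document while it is built, and then run against the projected tree.

```csharp
using System;
using System.IO;
using Saxon.Api;

public static class ProjectionExample
{
    // Builds a document projected for the given query, then runs that same
    // query against it and returns the number of items selected.
    public static int CountMatches(Processor processor, string queryText, string xml)
    {
        XQueryCompiler compiler = processor.NewXQueryCompiler();
        XQueryExecutable query = compiler.Compile(queryText);

        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.DocumentProjectionQuery = query;
        builder.BaseUri = new Uri("http://example.com/items.xml"); // made-up URI
        XdmNode projected = builder.Build(new StringReader(xml));

        XQueryEvaluator evaluator = query.Load();
        evaluator.ContextItem = projected; // projected document is the initial context item
        return evaluator.Evaluate().Count;
    }

    public static void Main()
    {
        // 'true' requests a licensed configuration; projection needs Saxon-EE.
        Processor processor = new Processor(true);
        string xml = "<ITEMS>"
                   + "<ITEM><COLOR>blue</COLOR><SIZE>L</SIZE></ITEM>"
                   + "<ITEM><COLOR>red</COLOR><SIZE>M</SIZE></ITEM>"
                   + "</ITEMS>";
        Console.WriteLine(CountMatches(processor, "//ITEM[COLOR='blue']", xml));
    }
}
```

Only the query used for projection should be trusted against the projected tree; a query that inspects SIZE, for example, might see nothing because those elements need not be retained.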

DtdValidation

public bool DtdValidation {get; set; }

Determines whether DTD validation is applied to documents loaded using this DocumentBuilder.

By default, no DTD validation takes place.

IsLineNumbering

public bool IsLineNumbering {get; set; }

Determines whether line numbering is enabled for documents loaded using this DocumentBuilder.

By default, line numbering is disabled.

Line numbering is not available for all kinds of source: in particular, it is not available when loading from an existing XmlDocument.

The resulting line numbers are accessible to applications using the extension function saxon:line-number() applied to a node.

Line numbers are maintained only for element nodes; the line number returned for any other node will be that of the most recent element.

SchemaValidationMode

public SchemaValidationMode SchemaValidationMode {get; set; }

Determines whether schema validation is applied to documents loaded using this DocumentBuilder, and if so, whether it is strict or lax.

By default, no schema validation takes place.

This option requires Saxon Enterprise Edition (Saxon-EE).

SchemaValidator

public SchemaValidator SchemaValidator {get; set; }

Gets or sets the SchemaValidator to be used. This determines whether schema validation is applied to an input document and whether type annotations in a supplied document are retained. If no SchemaValidator is supplied, schema validation does not take place.

TopLevelElementName

public QName TopLevelElementName {get; set; }

The required name of the top level element in a document instance being validated against a schema.

If this property is set, and if schema validation is requested, then validation will fail unless the outermost element of the document has the required name.

This option requires Saxon Enterprise Edition (Saxon-EE).

TreeModel

public TreeModel TreeModel {get; set; }

The Tree Model implementation to be used for the constructed document. By default the TinyTree is used. The main reason to use the LinkedTree alternative is to support updating (the TinyTree is not updateable).

WhitespacePolicy

public WhitespacePolicy WhitespacePolicy {get; set; }

Determines the whitespace stripping policy applied when loading a document using this DocumentBuilder.

By default, whitespace text nodes appearing in element-only content are stripped, and all other whitespace text nodes are retained.
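A minimal sketch of overriding the default policy, assuming the StripAll member of WhitespacePolicy. The class WhitespaceExample, the helper BuildStripped, and the example.com URI are hypothetical.

```csharp
using System;
using System.IO;
using Saxon.Api;

public static class WhitespaceExample
{
    // Builds a document with all whitespace-only text nodes removed.
    public static XdmNode BuildStripped(string xml)
    {
        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.WhitespacePolicy = WhitespacePolicy.StripAll;
        builder.BaseUri = new Uri("http://example.com/whitespace.xml"); // made-up URI
        return builder.Build(new StringReader(xml));
    }

    public static void Main()
    {
        // The spaces and line breaks between the tags are whitespace-only text nodes.
        XdmNode doc = BuildStripped("<a>\n  <b>x</b>\n</a>");
        Console.WriteLine(doc.OuterXml);
    }
}
```

The policy has to be set before Build is called, since stripping is applied while the tree is constructed.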

XmlResolver

public XmlResolver XmlResolver {get; set; }

An XmlResolver, which will be used to resolve URIs of documents being loaded and of references to external entities within those documents (including any external DTD).

By default an XmlUrlResolver is used. This means that the responsibility for resolving and dereferencing URIs rests with the .NET platform (and not with the GNU Classpath).

When Saxon invokes a user-written XmlResolver, the GetEntity method may return any of: a System.IO.Stream; a System.IO.TextReader; or a javax.xml.transform.Source. However, if the XmlResolver is called by the XML parser to resolve external entity references, then it must return an instance of System.IO.Stream.
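A minimal sketch of a user-written resolver, assuming it is acceptable to subclass XmlUrlResolver and fall back to its default behaviour for unknown URIs. The classes InMemoryResolver and ResolverExample and the example.com URI are hypothetical. The resolver returns a System.IO.Stream, which satisfies both cases described above.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using Saxon.Api;

// Serves one known URI from memory and delegates everything else
// to the standard XmlUrlResolver behaviour.
public class InMemoryResolver : XmlUrlResolver
{
    private readonly Uri knownUri;
    private readonly string content;

    public InMemoryResolver(Uri knownUri, string content)
    {
        this.knownUri = knownUri;
        this.content = content;
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        if (knownUri.Equals(absoluteUri))
        {
            // A Stream is acceptable whether Saxon or the XML parser is the caller.
            return new MemoryStream(Encoding.UTF8.GetBytes(content));
        }
        return base.GetEntity(absoluteUri, role, ofObjectToReturn);
    }
}

public static class ResolverExample
{
    // Loads the document at 'uri', dereferencing it through the supplied resolver.
    public static XdmNode Load(Uri uri, XmlResolver resolver)
    {
        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.XmlResolver = resolver;
        return builder.Build(uri);
    }

    public static void Main()
    {
        Uri uri = new Uri("http://example.com/catalog.xml"); // made-up URI
        XdmNode doc = Load(uri, new InMemoryResolver(uri, "<catalog><entry>one</entry></catalog>"));
        Console.WriteLine(doc.OuterXml);
    }
}
```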

Method Detail

Build

public XdmNode Build(Uri uri)

Load an XML document, retrieving it via a URI.

Note that the type Uri requires an absolute URI.

The URI is dereferenced using the registered XmlResolver.

This method takes no account of any fragment part in the URI.

The role passed to the GetEntity method of the XmlResolver is "application/xml", and the required return type is System.IO.Stream.

The document located via the URI is parsed using the System.Xml parser.

Note that the Microsoft System.Xml parser does not report whether attributes are defined in the DTD as being of type ID and IDREF. This is true whether or not DTD-based validation is enabled. This means that such attributes are not accessible to the id() and idref() functions.

Parameters:

uri - The URI identifying the location where the document can be found. This will also be used as the base URI of the document (regardless of the setting of the BaseUri property).

Returns:

An XdmNode, the document node at the root of the tree of the resulting in-memory document.

Build

public XdmNode Build(Stream input)

Load an XML document supplied as raw (lexical) XML on a Stream.

The document is parsed using the Microsoft System.Xml parser if the "http://saxon.sf.net/feature/preferJaxpParser" property on the Processor is set to false; otherwise it is parsed using the Apache Xerces XML parser.

Before calling this method, the BaseUri property must be set to identify the base URI of this document, used for resolving any relative URIs contained within it.

Note that the Microsoft System.Xml parser does not report whether attributes are defined in the DTD as being of type ID and IDREF. This is true whether or not DTD-based validation is enabled. This means that such attributes are not accessible to the id() and idref() functions.

Parameters:

input - The Stream containing the XML source to be parsed. Closing this stream on completion is the responsibility of the caller.

Returns:

An XdmNode, the document node at the root of the tree of the resulting in-memory document.

Build

public XdmNode Build(TextReader input)

Load an XML document supplied using a TextReader.

The document is parsed using the Microsoft System.Xml parser if the "http://saxon.sf.net/feature/preferJaxpParser" property on the Processor is set to false; otherwise it is parsed using the Apache Xerces XML parser.

Before calling this method, the BaseUri property must be set to identify the base URI of this document, used for resolving any relative URIs contained within it.

Note that the Microsoft System.Xml parser does not report whether attributes are defined in the DTD as being of type ID and IDREF. This is true whether or not DTD-based validation is enabled. This means that such attributes are not accessible to the id() and idref() functions.

Parameters:

input - The TextReader containing the XML source to be parsed.

Returns:

An XdmNode, the document node at the root of the tree of the resulting in-memory document.

Build

public XdmNode Build(XmlReader reader)

Load an XML document, delivered using an XmlReader.

The XmlReader is responsible for parsing the document; this method builds a tree representation of the document (in an internal Saxon format) and returns its document node. The XmlReader is not required to perform validation but it must expand any entity references. Saxon uses the properties of the XmlReader as supplied.

Use of a plain XmlTextReader is discouraged, because it does not expand entity references. An XmlTextReader should only be used if you know in advance that the document contains no entity references (or perhaps if your query or stylesheet is not interested in the content of text and attribute nodes). Instead, with .NET 1.1 use an XmlValidatingReader (with ValidationType set to None). The constructor for XmlValidatingReader is obsolete in .NET 2.0, but the same effect can be achieved by using the Create method of XmlReader with appropriate XmlReaderSettings.

Conformance with the W3C specifications requires that the Normalization property of an XmlTextReader should be set to true. However, Saxon does not insist on this.

If the XmlReader performs schema validation, Saxon will ignore any resulting type information. Type information can only be obtained by using Saxon's own schema validator, which will be run if the SchemaValidationMode property is set to Strict or Lax.

Note that the Microsoft System.Xml parser does not report whether attributes are defined in the DTD as being of type ID and IDREF. This is true whether or not DTD-based validation is enabled. This means that such attributes are not accessible to the id() and idref() functions.

Note that setting the XmlResolver property of the DocumentBuilder has no effect when this method is used; if an XmlResolver is required, it must be set on the XmlReader itself.

Parameters:

reader - The XmlReader that supplies the parsed XML source.

Returns:

An XdmNode, the document node at the root of the tree of the resulting in-memory document.
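A minimal sketch of supplying an XmlReader created with XmlReader.Create, so that entity references are expanded before Saxon sees the content. It assumes .NET 4.0 or later, where XmlReaderSettings.DtdProcessing is available. The class XmlReaderExample, the helper ReadText, and the example.com URI are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using Saxon.Api;

public static class XmlReaderExample
{
    // Parses the XML with a System.Xml reader that processes the DTD,
    // and returns the string value of the resulting document.
    public static string ReadText(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // process the DTD so entities can be expanded
        settings.ValidationType = ValidationType.None;  // no validation is required by Saxon

        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();

        string baseUri = "http://example.com/entities.xml"; // made-up URI
        builder.BaseUri = new Uri(baseUri);

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings, baseUri))
        {
            XdmNode doc = builder.Build(reader);
            return doc.StringValue;
        }
    }

    public static void Main()
    {
        string xml = "<!DOCTYPE doc [<!ENTITY who \"world\">]><doc>Hello &who;</doc>";
        Console.WriteLine(ReadText(xml));
    }
}
```

Any XmlResolver needed for external entities would be assigned to settings.XmlResolver here, since the DocumentBuilder's own XmlResolver property is not consulted by this method.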

Build

public XdmNode Build(XmlNode source)

Load an XML DOM document, supplied as an XmlNode, into a Saxon XdmNode.

The returned document will contain only the subtree rooted at the supplied node.

This method copies the DOM tree to create a Saxon tree. See the Wrap method for an alternative that creates a wrapper around the DOM tree, allowing it to be modified in situ.

Parameters:

source - The DOM Node to be copied to form a Saxon tree.

Returns:

An XdmNode, the document node at the root of the tree of the resulting in-memory document.

Wrap

public XdmNode Wrap(XmlDocument doc)

Wrap an XML DOM document, supplied as an XmlDocument, as a Saxon XdmNode.

This method must be applied at the level of the Document Node. Unlike the Build method, the original DOM is not copied. This saves memory and time, but it also means that it is not possible to perform operations such as whitespace stripping and schema validation.

Parameters:

doc - The DOM document node to be wrapped.

Returns:

An XdmNode, the Saxon document node at the root of the tree of the resulting in-memory document.
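A minimal sketch contrasting Build(XmlNode), which copies the DOM, with Wrap, which leaves it in place. The class DomExample, its helper methods, and the example.com URI are hypothetical; setting BaseUri here is a precaution rather than a documented requirement.

```csharp
using System;
using System.Xml;
using Saxon.Api;

public static class DomExample
{
    private static DocumentBuilder NewBuilder()
    {
        Processor processor = new Processor();
        DocumentBuilder builder = processor.NewDocumentBuilder();
        builder.BaseUri = new Uri("http://example.com/order.xml"); // made-up URI
        return builder;
    }

    // Copies the DOM into a new Saxon tree.
    public static XdmNode Copy(XmlDocument dom)
    {
        return NewBuilder().Build(dom);
    }

    // Wraps the DOM without copying it; must be given the document node.
    public static XdmNode WrapDom(XmlDocument dom)
    {
        return NewBuilder().Wrap(dom);
    }

    public static void Main()
    {
        XmlDocument dom = new XmlDocument();
        dom.LoadXml("<order><item>pen</item></order>");

        XdmNode copied = Copy(dom);
        XdmNode wrapped = WrapDom(dom);

        Console.WriteLine(copied.OuterXml);
        Console.WriteLine(wrapped.OuterXml);
    }
}
```

Build suits cases that need whitespace stripping or schema validation; Wrap suits large DOM trees where the cost of a copy matters more than those options.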