net.sf.saxon.pull
Class StaxBridge

java.lang.Object
  extended by net.sf.saxon.pull.StaxBridge
All Implemented Interfaces:
SourceLocator, LocationProvider, SaxonLocator, SourceLocationProvider, PullProvider, Locator

public class StaxBridge
extends Object
implements PullProvider, SaxonLocator, SourceLocationProvider

This class implements the Saxon PullProvider API on top of a standard StAX parser (or any other StAX XMLStreamReader implementation).


Nested Class Summary
 class StaxBridge.StaxNamespaces
           
 
Field Summary
 
Fields inherited from interface net.sf.saxon.pull.PullProvider
ATOMIC_VALUE, ATTRIBUTE, COMMENT, END_DOCUMENT, END_ELEMENT, END_OF_INPUT, NAMESPACE, PROCESSING_INSTRUCTION, START_DOCUMENT, START_ELEMENT, START_OF_INPUT, TEXT
 
Constructor Summary
StaxBridge()
          Create a new instance of the class
 
Method Summary
 void close()
          Close the event reader.
 int current()
          Get the event most recently returned by next(), or by other calls that change the position, for example getStringValue() and skipToMatchingEnd().
 AtomicValue getAtomicValue()
          Get an atomic value.
 AttributeCollection getAttributes()
          Get the attributes associated with the current element.
 int getColumnNumber()
          Return the column number where the current document event ends.
 int getColumnNumber(long locationId)
          Get the column number within the document, entity, or module containing a particular location
 int getFingerprint()
          Get the fingerprint of the name of the element.
 int getLineNumber()
          Return the line number where the current document event ends.
 int getLineNumber(long locationId)
          Get the line number within the document, entity or module containing a particular location
 int getLocationId()
          Get the location of the current event.
 int getNameCode()
          Get the nameCode identifying the name of the current node.
 NamePool getNamePool()
          Get the name pool
 NamespaceDeclarations getNamespaceDeclarations()
          Get the namespace declarations associated with the current element.
 PipelineConfiguration getPipelineConfiguration()
          Get configuration information.
 String getPublicId()
          Return the public identifier for the current document event.
 SourceLocator getSourceLocator()
          Get the location of the current event.
 CharSequence getStringValue()
          Get the string value of the current element, text node, processing-instruction, or top-level attribute or namespace node, or atomic value.
 String getSystemId()
          Return the system identifier for the current document event.
 String getSystemId(long locationId)
          Get the URI of the document, entity, or module containing a particular location
 int getTypeAnnotation()
          Get the type annotation of the current attribute or element node, or atomic value.
 List getUnparsedEntities()
          Get a list of unparsed entities.
 XMLStreamReader getXMLStreamReader()
          Get the XMLStreamReader used by this StaxBridge.
static void main(String[] args)
          Simple test program. Usage: java StaxBridge in.xml [out.xml]
 int next()
          Get the next event
 void setInputStream(String systemId, InputStream inputStream)
          Supply an input stream containing XML to be parsed.
 void setPipelineConfiguration(PipelineConfiguration pipe)
          Set configuration information.
 void setXMLStreamReader(XMLStreamReader reader)
          Supply an XMLStreamReader: the events reported by this XMLStreamReader will be translated into PullProvider events
 int skipToMatchingEnd()
          Skip the current subtree.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

StaxBridge

public StaxBridge()
Create a new instance of the class

Method Detail

setInputStream

public void setInputStream(String systemId,
                           InputStream inputStream)
                    throws XPathException
Supply an input stream containing XML to be parsed. A StAX parser is created using the JAXP XMLInputFactory.

Parameters:
systemId - The Base URI of the input document
inputStream - the stream containing the XML to be parsed
Throws:
XPathException - if an error occurs creating the StAX parser
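
As a rough illustration of the step described above, the following sketch creates a StAX parser from an input stream through the JAXP XMLInputFactory, using only JDK classes. It is not the StaxBridge source: the class name StaxReaderFromStreamSketch, the helper methods, and the system identifier file:///example/in.xml are assumptions made for the example, and an unchecked exception stands in for the XPathException that StaxBridge reports.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxReaderFromStreamSketch {

    /** Creates a StAX reader over the stream; systemId is passed along as the document's system identifier. */
    static XMLStreamReader open(String systemId, InputStream in) {
        try {
            return XMLInputFactory.newInstance().createXMLStreamReader(systemId, in);
        } catch (XMLStreamException e) {
            // Stand-in for the XPathException raised by StaxBridge.setInputStream()
            throw new IllegalStateException("Cannot create StAX parser", e);
        }
    }

    /** Returns the local name of the document element, or null if none is found. */
    static String rootName(String systemId, String xml) {
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        XMLStreamReader reader = open(systemId, in);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getLocalName();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        } finally {
            try { reader.close(); } catch (XMLStreamException ignored) { }
        }
    }

    public static void main(String[] args) {
        // Hypothetical system identifier and document
        System.out.println(rootName("file:///example/in.xml", "<doc><a/></doc>"));
    }
}
```

A reader obtained this way is the kind of object that could alternatively be handed to setXMLStreamReader().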

setXMLStreamReader

public void setXMLStreamReader(XMLStreamReader reader)
Supply an XMLStreamReader: the events reported by this XMLStreamReader will be translated into PullProvider events

Parameters:
reader - the supplier of XML events, typically an XML parser
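
To make the idea of translating events more concrete, here is a minimal sketch of one plausible mapping from StAX event codes to pull-style events. The event names are borrowed from the PullProvider field list; the mapping itself, the SKIPPED value, and the class StaxEventTranslationSketch are assumptions for illustration, and the real StaxBridge may treat some events differently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxEventTranslationSketch {

    /** Names taken from the PullProvider fields; SKIPPED is a hypothetical placeholder. */
    enum PullEvent {
        START_DOCUMENT, START_ELEMENT, TEXT, COMMENT,
        PROCESSING_INSTRUCTION, END_ELEMENT, END_DOCUMENT, SKIPPED
    }

    /** Assumed mapping from a StAX event code to a pull-style event. */
    static PullEvent translate(int staxEvent) {
        switch (staxEvent) {
            case XMLStreamConstants.START_DOCUMENT: return PullEvent.START_DOCUMENT;
            case XMLStreamConstants.START_ELEMENT: return PullEvent.START_ELEMENT;
            case XMLStreamConstants.CHARACTERS:
            case XMLStreamConstants.CDATA:
            case XMLStreamConstants.SPACE: return PullEvent.TEXT;
            case XMLStreamConstants.COMMENT: return PullEvent.COMMENT;
            case XMLStreamConstants.PROCESSING_INSTRUCTION: return PullEvent.PROCESSING_INSTRUCTION;
            case XMLStreamConstants.END_ELEMENT: return PullEvent.END_ELEMENT;
            case XMLStreamConstants.END_DOCUMENT: return PullEvent.END_DOCUMENT;
            default: return PullEvent.SKIPPED; // DTD, entity references, and so on
        }
    }

    /** Reads the whole document and returns the translated event sequence. */
    static List<PullEvent> events(String xml) {
        List<PullEvent> out = new ArrayList<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Merge adjacent character data so each text run is a single event
            factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            out.add(translate(reader.getEventType())); // a new reader is positioned at START_DOCUMENT
            while (reader.hasNext()) {
                out.add(translate(reader.next()));
            }
            reader.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(events("<a>hi<!--c--></a>"));
    }
}
```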

setPipelineConfiguration

public void setPipelineConfiguration(PipelineConfiguration pipe)
Set configuration information. This must only be called before any events have been read.

Specified by:
setPipelineConfiguration in interface PullProvider
Parameters:
pipe - the pipeline configuration

getPipelineConfiguration

public PipelineConfiguration getPipelineConfiguration()
Get configuration information.

Specified by:
getPipelineConfiguration in interface PullProvider
Returns:
the pipeline configuration

getXMLStreamReader

public XMLStreamReader getXMLStreamReader()
Get the XMLStreamReader used by this StaxBridge. This is available only after setInputStream() or setXMLStreamReader() has been called.

Returns:
the instance of XMLStreamReader allocated when setInputStream() was called, or the instance supplied directly to setXMLStreamReader()

getNamePool

public NamePool getNamePool()
Get the name pool

Returns:
the name pool

next

public int next()
         throws XPathException
Get the next event

Specified by:
next in interface PullProvider
Returns:
an integer code indicating the type of event. The code PullProvider.END_OF_INPUT is returned at the end of the sequence.
Throws:
XPathException

current

public int current()
Get the event most recently returned by next(), or by other calls that change the position, for example getStringValue() and skipToMatchingEnd(). This method does not change the position of the PullProvider.

Specified by:
current in interface PullProvider
Returns:
the current event
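
The contract of next() and current() can be sketched with a toy event source. Everything here is hypothetical: the class PullLoopSketch, its ToySource, and the integer values of the event codes are invented for the example (the real codes are the PullProvider constants), and treating START_OF_INPUT as the state before the first call to next() is an assumption.

```java
class PullLoopSketch {

    // Hypothetical event codes; the real values are defined by PullProvider
    static final int START_OF_INPUT = 0;
    static final int START_ELEMENT = 1;
    static final int TEXT = 2;
    static final int END_ELEMENT = 3;
    static final int END_OF_INPUT = -1;

    /** Toy event source that replays a fixed sequence of event codes. */
    static final class ToySource {
        private final int[] events;
        private int position = 0;
        private int current = START_OF_INPUT;

        ToySource(int... events) {
            this.events = events;
        }

        /** Advances and returns the next event, or END_OF_INPUT when exhausted. */
        int next() {
            current = position < events.length ? events[position++] : END_OF_INPUT;
            return current;
        }

        /** Returns the most recent event without changing the position. */
        int current() {
            return current;
        }
    }

    /** Typical consumer loop: pull events until END_OF_INPUT is reported. */
    static int countEvents(ToySource source) {
        int count = 0;
        while (source.next() != END_OF_INPUT) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        ToySource source = new ToySource(START_ELEMENT, TEXT, END_ELEMENT);
        System.out.println(countEvents(source));
    }
}
```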

getAttributes

public AttributeCollection getAttributes()
                                  throws XPathException
Get the attributes associated with the current element. This method must be called only after a START_ELEMENT event has been notified. The contents of the returned AttributeCollection are guaranteed to remain unchanged until the next START_ELEMENT event, but may be modified thereafter. The object should not be modified by the client.

Attributes may be read before or after reading the namespaces of an element, but must not be read after the first child node has been read, or after calling one of the methods skipToMatchingEnd(), getStringValue(), or getTypedValue().

Specified by:
getAttributes in interface PullProvider
Returns:
an AttributeCollection representing the attributes of the element that has just been notified.
Throws:
XPathException
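
The underlying StAX reader imposes a similar rule: attributes are available only while the reader is positioned on a START_ELEMENT event. The sketch below reads them at that point using the standard XMLStreamReader accessors. It illustrates the timing constraint only and does not use the Saxon AttributeCollection type; the class and method names are invented for the example.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class AttributesAtStartElementSketch {

    /** Returns the attributes of the first element as a name-to-value map sorted by name. */
    static Map<String, String> attributesOfFirstElement(String xml) {
        Map<String, String> attributes = new TreeMap<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        // Read the attributes now, before moving on to any child node
                        for (int i = 0; i < reader.getAttributeCount(); i++) {
                            attributes.put(reader.getAttributeLocalName(i),
                                           reader.getAttributeValue(i));
                        }
                        break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return attributes;
    }

    public static void main(String[] args) {
        System.out.println(attributesOfFirstElement("<a y=\"2\" x=\"1\">text</a>"));
    }
}
```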

getNamespaceDeclarations

public NamespaceDeclarations getNamespaceDeclarations()
                                               throws XPathException
Get the namespace declarations associated with the current element. This method must be called only after a START_ELEMENT event has been notified. In the case of a top-level START_ELEMENT event (that is, an element that either has no parent node, or whose parent is not included in the sequence being read), the NamespaceDeclarations object returned will contain a namespace declaration for each namespace that is in-scope for this element node. In the case of a non-top-level element, the NamespaceDeclarations will contain a set of namespace declarations and undeclarations, representing the differences between this element and its parent.

It is permissible for this method to return namespace declarations that are redundant.

The NamespaceDeclarations object is guaranteed to remain unchanged until the next START_ELEMENT event, but may then be overwritten. The object should not be modified by the client.

Namespaces may be read before or after reading the attributes of an element, but must not be read after the first child node has been read, or after calling one of the methods skipToMatchingEnd(), getStringValue(), or getTypedValue().

Specified by:
getNamespaceDeclarations in interface PullProvider
Returns:
the namespace declarations associated with the current START_ELEMENT event.
Throws:
XPathException
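
For a non-top-level element, the declarations reported are those that differ from the parent. A StAX reader exposes comparable information: at a START_ELEMENT event it reports the namespaces declared on that element itself. The sketch below collects them with standard XMLStreamReader calls; it does not use the Saxon NamespaceDeclarations type, and the class name, method name, and sample document are assumptions for the example.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class NamespaceDeclarationsSketch {

    /** Returns prefix-to-URI declarations made on the first element with the given local name. */
    static Map<String, String> declarationsOn(String xml, String localName) {
        Map<String, String> declarations = new TreeMap<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(localName)) {
                        for (int i = 0; i < reader.getNamespaceCount(); i++) {
                            String prefix = reader.getNamespacePrefix(i);
                            // StAX reports the default namespace with a null prefix
                            declarations.put(prefix == null ? "" : prefix,
                                             reader.getNamespaceURI(i));
                        }
                        break;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return declarations;
    }

    public static void main(String[] args) {
        String xml = "<a xmlns=\"urn:d\" xmlns:p=\"urn:p\"><b xmlns:q=\"urn:q\"/></a>";
        System.out.println(declarationsOn(xml, "a"));
        System.out.println(declarationsOn(xml, "b"));
    }
}
```

In this sample, the inner element reports only its own declaration, not the ones inherited from its parent.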

skipToMatchingEnd

public int skipToMatchingEnd()
                      throws XPathException
Skip the current subtree. This method may be called only immediately after a START_DOCUMENT or START_ELEMENT event. This call returns the matching END_DOCUMENT or END_ELEMENT event; the next call on next() will return the event following the END_DOCUMENT or END_ELEMENT.

Specified by:
skipToMatchingEnd in interface PullProvider
Returns:
the matching END_DOCUMENT or END_ELEMENT event
Throws:
XPathException
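
One common way to skip a subtree on top of a StAX reader is to count element depth until the matching end tag is reached. The sketch below shows that technique for the START_ELEMENT case only; it is an assumption about how such skipping can be done, not the StaxBridge implementation, and the class and helper names are invented.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class SkipSubtreeSketch {

    /** Advances from a START_ELEMENT to its matching END_ELEMENT and returns that event code. */
    static int skipToMatchingEnd(XMLStreamReader reader) throws XMLStreamException {
        if (reader.getEventType() != XMLStreamConstants.START_ELEMENT) {
            throw new IllegalStateException("Not positioned at START_ELEMENT");
        }
        int depth = 1;
        while (depth > 0) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                depth++;
            } else if (event == XMLStreamConstants.END_ELEMENT) {
                depth--;
            }
        }
        return reader.getEventType();
    }

    /** Skips the first element named skipName and returns the name of the next element started. */
    static String nameAfterSkipping(String xml, String skipName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(skipName)) {
                        skipToMatchingEnd(reader);
                        // The next event read follows the matching END_ELEMENT
                        while (reader.hasNext()) {
                            if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                                return reader.getLocalName();
                            }
                        }
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(nameAfterSkipping("<r><skip><x/>text</skip><keep/></r>", "skip"));
    }
}
```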

close

public void close()
Close the event reader. This indicates that no further events are required. It is not necessary to close an event reader after PullProvider.END_OF_INPUT has been reported, but it is recommended to close it if reading terminates prematurely. Once an event reader has been closed, the effect of further calls on next() is undefined.

Specified by:
close in interface PullProvider
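
The recommendation to close the reader when reading stops early can be followed with a try/finally block. The sketch below applies the pattern to a plain StAX XMLStreamReader, which has its own close() method; the class name and the search helper are invented for the example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class CloseOnEarlyExitSketch {

    /** Returns true as soon as an element with the given local name is seen. */
    static boolean containsElement(String xml, String localName) {
        XMLStreamReader reader = null;
        try {
            reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(localName)) {
                    return true; // reading stops before the end of the input
                }
            }
            return false;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        } finally {
            // Runs on both the early return and the normal end of input
            if (reader != null) {
                try { reader.close(); } catch (XMLStreamException ignored) { }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(containsElement("<a><b/><c/></a>", "b"));
    }
}
```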

getNameCode

public int getNameCode()
Get the nameCode identifying the name of the current node. This method can be used after the PullProvider.START_ELEMENT, PullProvider.PROCESSING_INSTRUCTION, PullProvider.ATTRIBUTE, or PullProvider.NAMESPACE events. With some PullProvider implementations, including this one, it can also be used after PullProvider.END_ELEMENT. If called at other times, the result is undefined and may result in an IllegalStateException. If called when the current node is an unnamed namespace node (a node representing the default namespace) the returned value is -1.

Specified by:
getNameCode in interface PullProvider
Returns:
the nameCode. The nameCode can be used to obtain the prefix, local name, and namespace URI from the name pool.

getFingerprint

public int getFingerprint()
Get the fingerprint of the name of the element. This is similar to the nameCode, except that it does not contain any information about the prefix: so two elements with the same fingerprint have the same name, excluding prefix. This method can be used after the PullProvider.START_ELEMENT, PullProvider.PROCESSING_INSTRUCTION, PullProvider.ATTRIBUTE, or PullProvider.NAMESPACE events. If called at other times, the result is undefined and may result in an IllegalStateException. If called when the current node is an unnamed namespace node (a node representing the default namespace) the returned value is -1.

Specified by:
getFingerprint in interface PullProvider
Returns:
the fingerprint. The fingerprint can be used to obtain the local name and namespace URI from the name pool.
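
The relationship between a nameCode and a fingerprint can be illustrated with a toy encoding in which the low bits identify the name without its prefix and the high bits identify the prefix. The bit layout, the mask value, and the class NameCodeSketch are assumptions made purely for illustration; consult the NamePool class for the encoding Saxon actually uses.

```java
class NameCodeSketch {

    // Assumed layout, for illustration only: low 20 bits hold the fingerprint
    static final int FINGERPRINT_MASK = 0xfffff;

    /** Combines a prefix index with a fingerprint into a toy name code. */
    static int nameCode(int prefixIndex, int fingerprint) {
        return (prefixIndex << 20) | (fingerprint & FINGERPRINT_MASK);
    }

    /** Strips the prefix information, leaving only the fingerprint. */
    static int fingerprintOf(int nameCode) {
        return nameCode & FINGERPRINT_MASK;
    }

    /** Two names are equal, ignoring prefix, when their fingerprints match. */
    static boolean sameNameIgnoringPrefix(int nameCodeA, int nameCodeB) {
        return fingerprintOf(nameCodeA) == fingerprintOf(nameCodeB);
    }

    public static void main(String[] args) {
        int a = nameCode(1, 1234); // hypothetical: same name, one prefix
        int b = nameCode(2, 1234); // hypothetical: same name, another prefix
        System.out.println(a != b && sameNameIgnoringPrefix(a, b));
    }
}
```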

getStringValue

public CharSequence getStringValue()
                            throws XPathException
Get the string value of the current element, text node, processing-instruction, or top-level attribute or namespace node, or atomic value.

In other situations the result is undefined and may result in an IllegalStateException.

If the most recent event was a PullProvider.START_ELEMENT, this method causes the content of the element to be read. The current event on completion of this method will be the corresponding PullProvider.END_ELEMENT. The next call of next() will return the event following the END_ELEMENT event.

Specified by:
getStringValue in interface PullProvider
Returns:
the String Value of the node in question, defined according to the rules in the XPath data model.
Throws:
XPathException
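
The behaviour described for START_ELEMENT, reading the element's content and finishing on the matching END_ELEMENT, can be sketched over a StAX reader by concatenating the descendant text while tracking depth. This is one possible approach under stated assumptions, not the StaxBridge code; the class and method names are invented.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class ElementStringValueSketch {

    /** From a START_ELEMENT, concatenates descendant text and stops on the matching END_ELEMENT. */
    static String stringValue(XMLStreamReader reader) throws XMLStreamException {
        StringBuilder value = new StringBuilder();
        int depth = 1;
        while (depth > 0) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    depth++;
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    depth--;
                    break;
                case XMLStreamConstants.CHARACTERS:
                case XMLStreamConstants.CDATA:
                case XMLStreamConstants.SPACE:
                    value.append(reader.getText());
                    break;
                default:
                    break; // comments and processing instructions do not contribute
            }
        }
        return value.toString();
    }

    /** Returns the string value of the document element. */
    static String stringValueOfRoot(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        return stringValue(reader);
                    }
                }
                return "";
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(stringValueOfRoot("<p>Hello <b>big</b> world<!--note--></p>"));
    }
}
```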

getAtomicValue

public AtomicValue getAtomicValue()
Get an atomic value. This call may be used only when the last event reported was ATOMIC_VALUE. This indicates that the PullProvider is reading a sequence that contains a free-standing atomic value; it is never used when reading the content of a node.

Specified by:
getAtomicValue in interface PullProvider
Returns:
the atomic value

getLocationId

public int getLocationId()
Get the location of the current event. The location is returned as an integer. This is a value that can be passed to the LocationProvider held by the PipelineConfiguration to get real location information (line number, system Id, etc). For an event stream representing a real document, the location information should identify the location in the lexical XML source. For a constructed document, it should identify the location in the query or stylesheet that caused the node to be created. A value of zero can be returned if no location information is available.

Returns:
the location ID

getTypeAnnotation

public int getTypeAnnotation()
Get the type annotation of the current attribute or element node, or atomic value. The result of this method is undefined unless the most recent event was START_ELEMENT, ATTRIBUTE, or ATOMIC_VALUE.

Specified by:
getTypeAnnotation in interface PullProvider
Returns:
the type annotation. This code is the fingerprint of a type name, which may be resolved to a SchemaType by access to the Configuration.

getSourceLocator

public SourceLocator getSourceLocator()
Get the location of the current event. For an event stream representing a real document, the location information should identify the location in the lexical XML source. For a constructed document, it should identify the location in the query or stylesheet that caused the node to be created. A value of null can be returned if no location information is available.

Specified by:
getSourceLocator in interface PullProvider
Returns:
the SourceLocator giving the location of the current event, or null if no location information is available

getPublicId

public String getPublicId()
Return the public identifier for the current document event.

The return value is the public identifier of the document entity or of the external parsed entity in which the markup triggering the event appears.

Specified by:
getPublicId in interface SourceLocator
Specified by:
getPublicId in interface Locator
Returns:
A string containing the public identifier, or null if none is available.
See Also:
getSystemId()

getSystemId

public String getSystemId()
Return the system identifier for the current document event.

The return value is the system identifier of the document entity or of the external parsed entity in which the markup triggering the event appears.

If the system identifier is a URL, the parser must resolve it fully before passing it to the application. For example, a file name must always be provided as a file:... URL, and other kinds of relative URI are also resolved against their bases.

Specified by:
getSystemId in interface SourceLocator
Specified by:
getSystemId in interface Locator
Returns:
A string containing the system identifier, or null if none is available.
See Also:
getPublicId()

getLineNumber

public int getLineNumber()
Return the line number where the current document event ends. Lines are delimited by line ends, which are defined in the XML specification.

Warning: The return value from the method is intended only as an approximation for the sake of diagnostics; it is not intended to provide sufficient information to edit the character content of the original XML document. In some cases, these "line" numbers match what would be displayed as columns, and in others they may not match the source text due to internal entity expansion.

The return value is an approximation of the line number in the document entity or external parsed entity where the markup triggering the event appears.

If possible, the SAX driver should provide the line position of the first character after the text associated with the document event. The first line is line 1.

Specified by:
getLineNumber in interface SourceLocator
Specified by:
getLineNumber in interface Locator
Returns:
The line number, or -1 if none is available.
See Also:
getColumnNumber()

getColumnNumber

public int getColumnNumber()
Return the column number where the current document event ends. This is the one-based number of Java char values since the last line end.

Warning: The return value from the method is intended only as an approximation for the sake of diagnostics; it is not intended to provide sufficient information to edit the character content of the original XML document. For example, when lines contain combining character sequences, wide characters, surrogate pairs, or bi-directional text, the value may not correspond to the column in a text editor's display.

The return value is an approximation of the column number in the document entity or external parsed entity where the markup triggering the event appears.

If possible, the SAX driver should provide the column position of the first character after the text associated with the document event. The first column in each line is column 1.

Specified by:
getColumnNumber in interface SourceLocator
Specified by:
getColumnNumber in interface Locator
Returns:
The column number, or -1 if none is available.
See Also:
getLineNumber()
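
A StAX reader offers line and column information through its getLocation() method, which is a plausible source for the values described above, though this page does not say how StaxBridge obtains them. The sketch below reads both numbers at a start tag; the class name, method name, and sample document are assumptions, and the exact values depend on the parser in use.

```java
import java.io.StringReader;
import javax.xml.stream.Location;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxLocationSketch {

    /** Returns {line, column} reported at the named element's start tag, or {-1, -1} if absent. */
    static int[] locationOf(String xml, String localName) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(localName)) {
                        // The Location is valid only until the next call to next()
                        Location location = reader.getLocation();
                        return new int[] { location.getLineNumber(), location.getColumnNumber() };
                    }
                }
                return new int[] { -1, -1 };
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] location = locationOf("<a>\n  <b/>\n</a>", "b");
        System.out.println("line " + location[0] + ", column " + location[1]);
    }
}
```

As the warnings above note, such values are best treated as approximate diagnostics rather than exact editing positions.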

getSystemId

public String getSystemId(long locationId)
Description copied from interface: LocationProvider
Get the URI of the document, entity, or module containing a particular location

Specified by:
getSystemId in interface LocationProvider
Parameters:
locationId - identifier of the location in question (as passed down the Receiver pipeline)
Returns:
the URI of the document, XML entity or module. For a SourceLocationProvider this will be the URI of the document or entity (the URI that would be the base URI if there were no xml:base attributes). In other cases it may identify the query or stylesheet module currently being executed.

getLineNumber

public int getLineNumber(long locationId)
Description copied from interface: LocationProvider
Get the line number within the document, entity or module containing a particular location

Specified by:
getLineNumber in interface LocationProvider
Parameters:
locationId - identifier of the location in question (as passed down the Receiver pipeline)
Returns:
the line number within the document, entity or module, or -1 if no information is available.

getColumnNumber

public int getColumnNumber(long locationId)
Description copied from interface: LocationProvider
Get the column number within the document, entity, or module containing a particular location

Specified by:
getColumnNumber in interface LocationProvider
Parameters:
locationId - identifier of the location in question (as passed down the Receiver pipeline)
Returns:
the column number within the document, entity, or module, or -1 if this is not available

getUnparsedEntities

public List getUnparsedEntities()
Get a list of unparsed entities.

Specified by:
getUnparsedEntities in interface PullProvider
Returns:
a list of unparsed entities, or null if the information is not available, or an empty list if there are no unparsed entities. Each item in the list will be an instance of UnparsedEntity

main

public static void main(String[] args)
                 throws Exception
Simple test program. Usage: java StaxBridge in.xml [out.xml]

Parameters:
args - command line arguments
Throws:
Exception


Copyright (c) 2004-2010 Saxonica Limited. All rights reserved.