com.saxonica.fsa
Class FiniteStateMachine

java.lang.Object
  extended by com.saxonica.fsa.FiniteStateMachine

public class FiniteStateMachine
extends Object

Class representing a finite state machine, as a collection of states with transitions between them


Constructor Summary
FiniteStateMachine(boolean determinized)
          Create a finite state machine
 
Method Summary
 void allocateStateNumber(AutomatonState state)
          Allocate a unique number to a state, and index the state from the FSM
static NonDeterminizedState compileParticle(SchemaCompiler compiler, Particle particle, NonDeterminizedState endState, UserComplexType subjectType, FiniteStateMachine machine)
          Static method to translate a particle to a Finite State Automaton, returning the start state of the FSA.
static FiniteStateMachine determinize(FiniteStateMachine nfsa)
          Determinize the finite state machine: that is, ensure that there are no ambiguous transitions from this state.
 void display()
          Display the finite state machine.
 AutomatonState getInitialState()
          Get the initial state of this finite state machine
 int getNumberOfStates()
          Determine the number of states in this finite state machine
 AutomatonState getState(int number)
          Get the state with a given unique state number
 void serialize(SchemaModelSerializer serializer)
          Output a reppresentation of this finite state machine
 void setInitialState(AutomatonState initialState)
          Set the initial state of this finite state machine
static String subsumesMachine(FiniteStateMachine base, FiniteStateMachine sub, SchemaCompiler compiler)
          Test whether one finite state machine subsumes another FSM: that is, whether for each path through the second FSM, there is a corresponding path through the first FSM.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FiniteStateMachine

public FiniteStateMachine(boolean determinized)
Create a finite state machine

Parameters:
determinized - true if this machine is determinized
Method Detail

allocateStateNumber

public void allocateStateNumber(AutomatonState state)
Allocate a unique number to a state, and index the state from the FSM

Parameters:
state - the state whose number is to be allocated

getNumberOfStates

public int getNumberOfStates()
Determine the number of states in this finite state machine

Returns:
the number of states

getInitialState

public AutomatonState getInitialState()
Get the initial state of this finite state machine

Returns:
the initial state

setInitialState

public void setInitialState(AutomatonState initialState)
Set the initial state of this finite state machine

Parameters:
initialState - the initial state

getState

public AutomatonState getState(int number)
Get the state with a given unique state number

Parameters:
number - the number of the state
Returns:
the state with this unique state number

compileParticle

public static NonDeterminizedState compileParticle(SchemaCompiler compiler,
                                                   Particle particle,
                                                   NonDeterminizedState endState,
                                                   UserComplexType subjectType,
                                                   FiniteStateMachine machine)
                                            throws SchemaException,
                                                   UnresolvedReferenceException
Static method to translate a particle to a Finite State Automaton, returning the start state of the FSA. This precisely follows Henry Thompson's and Richard Tobin's algorithm, published here

Parameters:
compiler - user for error reporting
particle - the particle to be compiled, which is either an ElementDeclaration, a WildCard, a choice, or a sequence
endState - the State representing the final state of the automaton
subjectType - the complex type to which this particle belongs
machine - the finite state machine to which the state belongs
Returns:
the initial state of the new Automaton
Throws:
SchemaException
UnresolvedReferenceException

display

public void display()
Display the finite state machine. This is a diagnostic method for internal use. Output is written to System.err.


subsumesMachine

public static String subsumesMachine(FiniteStateMachine base,
                                     FiniteStateMachine sub,
                                     SchemaCompiler compiler)
Test whether one finite state machine subsumes another FSM: that is, whether for each path through the second FSM, there is a corresponding path through the first FSM. The algorithm used is as published by Thompson and Tobin, XML Europe 2003.

Parameters:
base - the first FSM (representing the base type)
sub - the other FSM (representing the type that is derived by restriction, validly or otherwise)
compiler - used for error reporting
Returns:
null indicating that the first machine does indeed subsume the other; or a string indicating why it doesn't.

determinize

public static FiniteStateMachine determinize(FiniteStateMachine nfsa)
                                      throws SchemaException
Determinize the finite state machine: that is, ensure that there are no ambiguous transitions from this state. When presented with an input token, the FSA can then examine the specific transitions from this state and either choose one of them, or throw an error.

This is a revised implementation, based more closely on Aho and Ullman p93, to try to eliminate the rare problems that occur when a state has two transitions for the same symbol. (Given UPA, this should occur only with nested loops, e.g. (a{10,11}){1,3} which allows a sequence of 10, 20, 30, 11, 22, or 33 a's.)

Parameters:
nfsa - the finite state machine to be determinized
Returns:
the new deterministic finite state machine
Throws:
SchemaException

serialize

public void serialize(SchemaModelSerializer serializer)
               throws XPathException
Output a reppresentation of this finite state machine

Parameters:
serializer - the schema model serializer to which this representation is to be written
Throws:
XPathException - if any error occurs during serialization


Copyright (C) Michael H. Kay. All rights reserved.