Class FiniteStateMachine

java.lang.Object
com.saxonica.ee.schema.fsa.FiniteStateMachine

public class FiniteStateMachine extends Object
Class representing a finite state machine, as a collection of states with transitions between them.

The finite state machine allows states with counters. The static machine created by the schema compiler contains states of type AutomatonState; these may have a minOccurs and maxOccurs value. When such a state is entered at validation time, a CountingState is dynamically created that contains a pointer to the fixed AutomatonState plus an integer counter. Any transition that goes from the state to itself causes a new CountingState to be created with an incremented value of the counter (provided it does not exceed the maxOccurs()); any transition to a different state causes the counter to be checked against the minOccurs value.

  • Constructor Details

    • FiniteStateMachine

      public FiniteStateMachine(UserComplexType type, boolean determinized)
      Create a finite state machine
      Parameters:
      type - the type whose content model is implemented by this finite state machine
      determinized - true if this machine is determinized
  • Method Details

    • getComplexType

      public UserComplexType getComplexType()
    • getNumberOfStates

      public int getNumberOfStates()
      Determine the number of states in this finite state machine
      Returns:
      the number of states
    • getInitialState

      public AutomatonState getInitialState()
      Get the initial state of this finite state machine
      Returns:
      the initial state
    • setInitialState

      public void setInitialState(AutomatonState initialState)
      Set the initial state of this finite state machine
      Parameters:
      initialState - the initial state
    • getState

      public AutomatonState getState(int number)
      Get the state with a given unique state number
      Parameters:
      number - the number of the state
      Returns:
      the state with this unique state number
    • setOpenContentWildcard

      public void setOpenContentWildcard(Wildcard wildcard, boolean interleaved)
      Set the open content wildcard for this machine
      Parameters:
      wildcard - the open content wildcard (may be null)
      interleaved - true if open content may be interleaved, false if it must be suffixed
    • compileParticle

      public static NonDeterminizedState compileParticle(SchemaCompiler compiler, Particle particle, NonDeterminizedState endState, UserComplexType subjectType, FiniteStateMachine machine) throws SchemaException, MissingComponentException
      Static method to translate a particle to a Finite State Automaton, returning the start state of the FSA. This precisely follows Henry Thompson's and Richard Tobin's algorithm, published here
      Parameters:
      compiler - user for error reporting
      particle - the particle to be compiled, which is either an ElementDeclaration, a WildCard, a choice, or a sequence
      endState - the State representing the final state of the automaton
      subjectType - the complex type to which this particle belongs
      machine - the finite state machine to which the state belongs
      Returns:
      the initial state of the new Automaton
      Throws:
      SchemaException - if XSD constraints are violated
      MissingComponentException - if unresolved references in the schema are found
    • display

      public void display(Logger err)
      Display the finite state machine. This is a diagnostic method for internal use.
      Parameters:
      err - the output destination
    • subsumesMachine

      public static String subsumesMachine(FiniteStateMachine base, FiniteStateMachine sub, SchemaCompiler compiler)
      Test whether one finite state machine subsumes another FSM: that is, whether for each path through the second FSM, there is a corresponding path through the first FSM. The algorithm used is as published by Thompson and Tobin, XML Europe 2003.
      Parameters:
      base - the first FSM (representing the base type)
      sub - the other FSM (representing the type that is derived by restriction, validly or otherwise)
      compiler - used for error reporting
      Returns:
      null indicating that the first machine does indeed subsume the other; or a string indicating why it doesn't.
    • determinize

      public static FiniteStateMachine determinize(SchemaCompiler compiler, FiniteStateMachine nfsa) throws SchemaException
      Determinize the finite state machine: that is, ensure that there are no ambiguous transitions from this state. When presented with an input token, the FSA can then examine the specific transitions from this state and either choose one of them, or throw an error.

      This is a revised implementation, based more closely on Aho and Ullman p93, to try to eliminate the rare problems that occur when a state has two transitions for the same symbol. (Given UPA, this should occur only with nested loops, e.g. (a{10,11}){1,3} which allows a sequence of 10, 20, 30, 11, 22, or 33 a's.)

      Parameters:
      compiler - the schema compiler
      nfsa - the finite state machine to be determinized
      Returns:
      the new deterministic finite state machine
      Throws:
      SchemaException - if an XSD constraint is violated
    • getOpenContentWildcard

      public Wildcard getOpenContentWildcard()
      Get the open content wildcard, if any
      Returns:
      the open content wildcard for this finite state machine, if one is defined
    • isOpenContentInterleaved

      public boolean isOpenContentInterleaved()
      Ask whether the open content wildcard for this finite state machine (assuming there is one) permits interleaved content (as opposed to suffixed content only)
      Returns:
      true if the open content wildcard permits interleaved content.
    • serialize

      public void serialize(SchemaModelSerializer serializer) throws XPathException
      Output a representation of this finite state machine
      Parameters:
      serializer - the schema model serializer to which this representation is to be written
      Throws:
      XPathException - if any error occurs during serialization