Class ARegexIterator

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, LastPositionFinder, SequenceIterator, RegexIterator

    public class ARegexIterator
    extends java.lang.Object
    implements RegexIterator, LastPositionFinder
    Provides an iterator over the matched and unmatched substrings of a string. This implementation of RegexIterator uses the modified Jakarta regular expression engine.
    • Constructor Detail

      • ARegexIterator

        public ARegexIterator(UnicodeString string,
                              UnicodeString regex,
                              REMatcher matcher)
        Construct a RegexIterator. The underlying matcher.find() method is called once to obtain each matching substring. The iterator also returns any non-matching substrings that appear between the matching substrings.
        Parameters:
        string - the string to be analysed
        regex - the regular expression
        matcher - a matcher for the regular expression
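The interleaving of matching and non-matching substrings described for the constructor can be illustrated with a minimal sketch. It uses java.util.regex as a stand-in for the modified Jakarta engine, and the names MatchSplitSketch, Segment and split are hypothetical, not part of this API. It assumes the regular expression does not match a zero-length string.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: this is not the ARegexIterator implementation.
class MatchSplitSketch {

    /** One piece of the input, flagged as matching or non-matching. */
    static final class Segment {
        final String text;
        final boolean matching;

        Segment(String text, boolean matching) {
            this.text = text;
            this.matching = matching;
        }
    }

    /** Splits the input into alternating non-matching and matching pieces. */
    static List<Segment> split(String input, String regex) {
        List<Segment> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int prevEnd = 0;
        while (m.find()) {                       // one find() call per match
            if (m.start() > prevEnd) {
                // Text between the previous match and this one does not match.
                out.add(new Segment(input.substring(prevEnd, m.start()), false));
            }
            out.add(new Segment(m.group(), true));
            prevEnd = m.end();
        }
        if (prevEnd < input.length()) {
            // Trailing text after the last match.
            out.add(new Segment(input.substring(prevEnd), false));
        }
        return out;
    }

    public static void main(String[] args) {
        for (Segment s : split("ab12cd345", "[0-9]+")) {
            System.out.println((s.matching ? "match:    " : "no match: ") + s.text);
        }
    }
}
```

The matching flag on each Segment plays the role that isMatching() plays for the item most recently returned by next().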
    • Method Detail

      • getLength

        public int getLength()
                      throws XPathException
        Description copied from interface: LastPositionFinder
        Get the last position (that is, the number of items in the sequence). This method is non-destructive: it does not change the state of the iterator. The result is undefined if the next() method of the iterator has already returned null. This method must not be called unless the result of getProperties() on the iterator includes the bit setting SequenceIterator.Property.LAST_POSITION_FINDER.
        Specified by:
        getLength in interface LastPositionFinder
        Returns:
        the number of items in the sequence
        Throws:
        XPathException - if an error occurs evaluating the sequence in order to determine the number of items
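One way to keep such a count non-destructive is to count with an independent matcher, so that the position of the matcher already in use is left alone. The sketch below shows that idea with java.util.regex. It is an assumption made for illustration, not a statement of how getLength() is implemented here, and SegmentCountSketch and countSegments are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: counts items without disturbing a matcher in use.
class SegmentCountSketch {

    /** Counts matching and non-matching pieces using a fresh Matcher. */
    static int countSegments(String input, Pattern pattern) {
        Matcher m = pattern.matcher(input);      // independent state
        int count = 0;
        int prevEnd = 0;
        while (m.find()) {
            if (m.start() > prevEnd) {
                count++;                         // non-matching gap before this match
            }
            count++;                             // the match itself
            prevEnd = m.end();
        }
        if (prevEnd < input.length()) {
            count++;                             // trailing non-matching text
        }
        return count;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("[0-9]+");
        String input = "ab12cd345";
        Matcher inUse = p.matcher(input);
        inUse.find();                            // now positioned on the first match
        int n = countSegments(input, p);
        System.out.println(n + " items; matcher in use is still on " + inUse.group());
    }
}
```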
      • isMatching

        public boolean isMatching()
        Determine whether the current item is a matching item or a non-matching item.
        Specified by:
        isMatching in interface RegexIterator
        Returns:
        true if the current item (the one most recently returned by next()) is an item that matches the regular expression, or false if it is an item that does not match
      • getRegexGroup

        public java.lang.String getRegexGroup(int number)
        Get a substring that matches a parenthesised group within the regular expression.
        Specified by:
        getRegexGroup in interface RegexIterator
        Parameters:
        number - the number of the group to be obtained
        Returns:
        the substring of the current item that matches the parenthesised group with the given number within the regular expression
      • processMatchingSubstring

        public void processMatchingSubstring(RegexIterator.MatchHandler action)
                                      throws XPathException
        Process a matching substring, performing specified actions at the start and end of each captured subgroup. This method is always called when operating in "push" mode; it writes its result to context.getReceiver(). The full text of the matching substring is written to the receiver, interspersed with calls to the RegexIterator.MatchHandler methods onGroupStart() and onGroupEnd().
        Specified by:
        processMatchingSubstring in interface RegexIterator
        Parameters:
        action - defines the processing to be performed at the start and end of a group
        Throws:
        XPathException
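The interspersing of text with group start and end events can be sketched as follows, again with java.util.regex standing in for the actual engine. The Handler interface here is a hypothetical stand-in, not RegexIterator.MatchHandler, and the sketch deliberately skips empty groups, whose ordering needs extra nesting information.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: replays one match as text plus group events.
class GroupEventsSketch {

    /** Hypothetical stand-in for a match handler. */
    interface Handler {
        void characters(String s);
        void onGroupStart(int groupNumber);
        void onGroupEnd(int groupNumber);
    }

    /** Emits the current match of m as characters interleaved with group events. */
    static void process(Matcher m, Handler h) {
        int groups = m.groupCount();
        StringBuilder pending = new StringBuilder();
        for (int pos = m.start(); pos <= m.end(); pos++) {
            // Close inner groups (higher numbers) before outer ones.
            for (int g = groups; g >= 1; g--) {
                if (usable(m, g) && m.end(g) == pos) {
                    flush(pending, h);
                    h.onGroupEnd(g);
                }
            }
            // Open outer groups (lower numbers) before inner ones.
            for (int g = 1; g <= groups; g++) {
                if (usable(m, g) && m.start(g) == pos) {
                    flush(pending, h);
                    h.onGroupStart(g);
                }
            }
            if (pos < m.end()) {
                pending.append(m.group().charAt(pos - m.start()));
            }
        }
        flush(pending, h);
    }

    /** True for groups that took part in the match, are non-empty, and lie inside it. */
    private static boolean usable(Matcher m, int g) {
        return m.start(g) != -1 && m.end(g) > m.start(g)
                && m.start(g) >= m.start() && m.end(g) <= m.end();
    }

    private static void flush(StringBuilder pending, Handler h) {
        if (pending.length() > 0) {
            h.characters(pending.toString());
            pending.setLength(0);
        }
    }

    /** Renders the first match with tag-like markers around each group. */
    static String render(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "";
        }
        StringBuilder out = new StringBuilder();
        process(m, new Handler() {
            public void characters(String s) { out.append(s); }
            public void onGroupStart(int g) { out.append('<').append(g).append('>'); }
            public void onGroupEnd(int g) { out.append("</").append(g).append('>'); }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("see 10-20! now", "(([0-9]+)-([0-9]+))!"));
    }
}
```

For the input in main, the rendered result is `<1><2>10</2>-<3>20</3></1>!`: group 2 opens immediately after group 1 because both start at the same character position, and group 3 closes before group 1 because both end at the same position.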
      • computeNestingTable

        public static IntToIntHashMap computeNestingTable(UnicodeString regex)
        Compute a table showing, for each captured group number (numbered by opening parenthesis in the regex), the number of its parent group. This is done by reparsing the source of the regular expression. The table is needed when the result of a match includes an empty group, to determine its position relative to other groups finishing at the same character position.
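The idea of reparsing the regex source to find each group's parent can be sketched as below. This is a simplified illustration under stated assumptions, not the method's actual code: it uses java.util.Map instead of IntToIntHashMap, treats any "(" followed by "?" as non-capturing, and uses 0 to mean "no parent group". NestingTableSketch is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: derives group nesting from the regex source.
class NestingTableSketch {

    /** Maps each capturing group number to its parent group number (0 = top level). */
    static Map<Integer, Integer> computeNestingTable(String regex) {
        Map<Integer, Integer> parents = new LinkedHashMap<>();
        Deque<Integer> open = new ArrayDeque<>();   // -1 marks a non-capturing group
        int nextGroup = 1;
        int bracketDepth = 0;                       // depth inside [...] classes
        for (int i = 0; i < regex.length(); i++) {
            char ch = regex.charAt(i);
            if (ch == '\\') {
                i++;                                // skip the escaped character
            } else if (ch == '[') {
                bracketDepth++;
            } else if (ch == ']') {
                if (bracketDepth > 0) {
                    bracketDepth--;
                }
            } else if (bracketDepth == 0 && ch == '(') {
                boolean capturing = i + 1 >= regex.length() || regex.charAt(i + 1) != '?';
                if (capturing) {
                    parents.put(nextGroup, nearestCapturing(open));
                    open.push(nextGroup++);
                } else {
                    open.push(-1);
                }
            } else if (bracketDepth == 0 && ch == ')') {
                if (!open.isEmpty()) {
                    open.pop();
                }
            }
        }
        return parents;
    }

    /** Finds the innermost capturing group that is still open, or 0 if none. */
    private static int nearestCapturing(Deque<Integer> open) {
        for (int g : open) {                        // iterates from the top of the stack
            if (g > 0) {
                return g;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(computeNestingTable("(a(b)?)(?:c(d))"));
    }
}
```

For `(a(b)?)(?:c(d))` the sketch reports that group 2 has parent 1, while groups 1 and 3 are at the top level. With that information, an empty group 2 ending at the same position as group 1 can be placed inside group 1 rather than after it.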