Class JavaRegularExpression

  • All Implemented Interfaces:
    RegularExpression

    public class JavaRegularExpression
    extends java.lang.Object
    implements RegularExpression
    An implementation of RegularExpression that calls the JDK regular expression library directly. This can be invoked by appending ";j" to the flags attribute/argument.

    Note that in SaxonCS, this class is emulated by code that invokes the .NET regex engine, which has different rules. In this case the ";n" flag activates this option.

    • Constructor Summary

      Constructors 
      Constructor Description
      JavaRegularExpression​(UnicodeString javaRegex, java.lang.String flags)
      Create a regular expression, starting with an already-translated Java regex.
    • Method Summary

      Modifier and Type Method Description
      RegexIterator analyze​(UnicodeString input)
      Use this regular expression to analyze an input string, in support of the XSLT analyze-string instruction.
      boolean containsMatch​(UnicodeString input)
      Determine whether a given string contains a match for the regular expression
      int getFlagBits()
      Get the flag bits as used by the Java regular expression engine
      java.lang.String getFlags()
      Get the flags used at the time the regular expression was compiled.
      java.lang.String getJavaRegularExpression()
      Get the Java regular expression (after translation from an XPath regex, but before compilation)
      boolean isPlatformNative()
      Ask whether the regular expression is using platform-native syntax (Java or .NET), or XPath syntax
      boolean matches​(UnicodeString input)
      Determine whether the regular expression matches a given string in its entirety
      UnicodeString replace​(UnicodeString input, UnicodeString replacement)
      Replace all substrings of a supplied input string that match the regular expression with a replacement string.
      UnicodeString replaceWith​(UnicodeString input, java.util.function.BiFunction<UnicodeString,​UnicodeString[],​UnicodeString> replacement)
      Replace all substrings of a supplied input string that match the regular expression with the result of calling a replacement function.
      static int setFlags​(java.lang.CharSequence inFlags)
      Compute the Java flag bits from the supplied flags string.
      AtomicIterator tokenize​(UnicodeString input)
      Use this regular expression to tokenize an input string.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • JavaRegularExpression

        public JavaRegularExpression​(UnicodeString javaRegex,
                                     java.lang.String flags)
                              throws XPathException
        Create a regular expression, starting with an already-translated Java regex. Note: this constructor is called from compiled XQuery code.
        Parameters:
        javaRegex - the regular expression after translation to Java notation
        flags - the user-specified flags (prior to any semicolon)
        Throws:
        XPathException
    • Method Detail

      • getJavaRegularExpression

        public java.lang.String getJavaRegularExpression()
        Get the Java regular expression (after translation from an XPath regex, but before compilation)
        Returns:
        the regular expression in Java notation
      • getFlagBits

        public int getFlagBits()
        Get the flag bits as used by the Java regular expression engine
        Returns:
        the flag bits
      • analyze

        public RegexIterator analyze​(UnicodeString input)
        Use this regular expression to analyze an input string, in support of the XSLT analyze-string instruction. The resulting RegexIterator provides both the matching and non-matching substrings, and allows them to be distinguished. It also provides access to matched subgroups.
        Specified by:
        analyze in interface RegularExpression
        Parameters:
        input - the string to which the regular expression is to be applied
        Returns:
        an iterator over matched and unmatched substrings
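The split into matching and non-matching substrings can be illustrated with the JDK regex library alone. The following is a minimal sketch, not Saxon's implementation: the class `AnalyzeSketch`, its nested `Segment` type, and the use of `String` in place of `UnicodeString` are all assumptions made for illustration. It also assumes that zero-length matches are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Saxon API.
final class AnalyzeSketch {

    // One piece of the input, flagged as matching or non-matching.
    static final class Segment {
        final String text;
        final boolean matching;

        Segment(String text, boolean matching) {
            this.text = text;
            this.matching = matching;
        }
    }

    // Walk the input once, emitting the text between matches as
    // non-matching segments and each match as a matching segment.
    static List<Segment> analyze(Pattern pattern, String input) {
        List<Segment> out = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        int pos = 0;
        while (m.find()) {
            if (m.end() == m.start()) {
                continue; // assumption: ignore zero-length matches
            }
            if (m.start() > pos) {
                out.add(new Segment(input.substring(pos, m.start()), false));
            }
            out.add(new Segment(m.group(), true));
            pos = m.end();
        }
        if (pos < input.length()) {
            out.add(new Segment(input.substring(pos), false));
        }
        return out;
    }
}
```

With the pattern `[0-9]+` and the input `ab12cd3`, this sketch yields four segments: `ab` (non-matching), `12` (matching), `cd` (non-matching), and `3` (matching). Captured subgroups would be read from the `Matcher` at the point each matching segment is created.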
      • containsMatch

        public boolean containsMatch​(UnicodeString input)
        Determine whether a given string contains a match for the regular expression
        Specified by:
        containsMatch in interface RegularExpression
        Parameters:
        input - the string to match
        Returns:
        true if the string contains a match, false otherwise
      • matches

        public boolean matches​(UnicodeString input)
        Determine whether the regular expression matches a given string in its entirety
        Specified by:
        matches in interface RegularExpression
        Parameters:
        input - the string to match
        Returns:
        true if the entire string matches, false otherwise
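The difference between containsMatch and matches mirrors the difference between `Matcher.find()` and `Matcher.matches()` in the JDK. The sketch below shows that distinction using only `java.util.regex`; the class `MatchKindsSketch` and its method names are hypothetical, and it is an assumption (not confirmed by this page) that the class delegates to these JDK calls in this way.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Saxon API.
final class MatchKindsSketch {

    // True if some substring of the input matches the pattern.
    static boolean containsMatch(Pattern pattern, CharSequence input) {
        return pattern.matcher(input).find();
    }

    // True only if the whole input matches the pattern.
    static boolean matchesEntirely(Pattern pattern, CharSequence input) {
        return pattern.matcher(input).matches();
    }
}
```

For the pattern `[0-9]+`, the input `abc123` contains a match but does not match in its entirety, whereas the input `123` satisfies both tests.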
      • replace

        public UnicodeString replace​(UnicodeString input,
                                     UnicodeString replacement)
                              throws XPathException
        Replace all substrings of a supplied input string that match the regular expression with a replacement string.
        Specified by:
        replace in interface RegularExpression
        Parameters:
        input - the input string on which replacements are to be performed
        replacement - the replacement string in the format of the XPath replace() function
        Returns:
        the result of performing the replacement
        Throws:
        XPathException - if the replacement string is invalid
      • replaceWith

        public UnicodeString replaceWith​(UnicodeString input,
                                         java.util.function.BiFunction<UnicodeString,​UnicodeString[],​UnicodeString> replacement)
                                  throws XPathException
        Replace all substrings of a supplied input string that match the regular expression with the result of calling a replacement function.
        Specified by:
        replaceWith in interface RegularExpression
        Parameters:
        input - the input string on which replacements are to be performed
        replacement - a function that is called once for each matching substring, and that returns a replacement for that substring
        Returns:
        the result of performing the replacement
        Throws:
        XPathException - if the replacement string is invalid
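Function-based replacement can be sketched with the JDK alone. In this minimal example the class `ReplaceWithSketch` is hypothetical and `String` stands in for `UnicodeString`. It is an assumption that the array argument carries the captured groups (here with index 0 holding the whole match); this page does not state what the array contains.

```java
import java.util.function.BiFunction;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Saxon API.
final class ReplaceWithSketch {

    // Call the replacement function once per match and splice its
    // result into the output in place of the matched substring.
    static String replaceWith(Pattern pattern, String input,
                              BiFunction<String, String[], String> replacement) {
        Matcher m = pattern.matcher(input);
        StringBuilder sb = new StringBuilder();
        int pos = 0;
        while (m.find()) {
            // Copy the unmatched text preceding this match.
            sb.append(input, pos, m.start());
            // Assumption: groups[0] is the whole match, groups[n] is group n.
            String[] groups = new String[m.groupCount() + 1];
            for (int g = 0; g <= m.groupCount(); g++) {
                groups[g] = m.group(g);
            }
            sb.append(replacement.apply(m.group(), groups));
            pos = m.end();
        }
        // Copy whatever follows the last match.
        sb.append(input, pos, input.length());
        return sb.toString();
    }
}
```

For example, calling the sketch with the pattern `([a-z])([0-9])`, the input `a1-b2`, and the function `(match, groups) -> groups[2] + groups[1]` swaps each letter and digit, giving `1a-2b`.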
      • tokenize

        public AtomicIterator tokenize​(UnicodeString input)
        Use this regular expression to tokenize an input string.
        Specified by:
        tokenize in interface RegularExpression
        Parameters:
        input - the string to be tokenized
        Returns:
        an AtomicIterator over the resulting tokens, as objects of type StringValue
      • setFlags

        public static int setFlags​(java.lang.CharSequence inFlags)
                            throws XPathException
        Compute the Java flag bits from the supplied flags string. The flags recognized have their Java-defined meanings rather than their XPath-defined meanings. The available flags are:

        d - UNIX_LINES

        m - MULTILINE

        i - CASE_INSENSITIVE

        s - DOTALL

        x - COMMENTS

        u - UNICODE_CASE

        q - LITERAL

        c - CANON_EQ

        Parameters:
        inFlags - the flags as a string, e.g. "im"
        Returns:
        the flags as a bit-significant integer
        Throws:
        XPathException - if the supplied value contains an unrecognized flag character
        See Also:
        Pattern
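The flag table above maps each letter to a constant of `java.util.regex.Pattern`. The following sketch shows one way such a mapping could be written; it is not Saxon's source. The class `FlagBitsSketch` and method `toFlagBits` are hypothetical names, and the sketch throws `IllegalArgumentException` where the real method is documented to throw `XPathException`.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Saxon API.
final class FlagBitsSketch {

    // OR together the Pattern constant for each flag letter.
    static int toFlagBits(CharSequence inFlags) {
        int bits = 0;
        for (int i = 0; i < inFlags.length(); i++) {
            char c = inFlags.charAt(i);
            switch (c) {
                case 'd': bits |= Pattern.UNIX_LINES; break;
                case 'm': bits |= Pattern.MULTILINE; break;
                case 'i': bits |= Pattern.CASE_INSENSITIVE; break;
                case 's': bits |= Pattern.DOTALL; break;
                case 'x': bits |= Pattern.COMMENTS; break;
                case 'u': bits |= Pattern.UNICODE_CASE; break;
                case 'q': bits |= Pattern.LITERAL; break;
                case 'c': bits |= Pattern.CANON_EQ; break;
                default:
                    // Stand-in for the documented XPathException.
                    throw new IllegalArgumentException("Unrecognized flag: " + c);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        // "im" combines CASE_INSENSITIVE and MULTILINE.
        Pattern p = Pattern.compile("^b$", toFlagBits("im"));
        System.out.println(p.matcher("a\nB\nc").find()); // prints "true"
    }
}
```

The resulting integer can be passed directly as the second argument of `Pattern.compile(String, int)`, as the `main` method shows.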
      • getFlags

        public java.lang.String getFlags()
        Get the flags used at the time the regular expression was compiled.
        Specified by:
        getFlags in interface RegularExpression
        Returns:
        a string containing the flags
      • isPlatformNative

        public boolean isPlatformNative()
        Ask whether the regular expression is using platform-native syntax (Java or .NET), or XPath syntax
        Specified by:
        isPlatformNative in interface RegularExpression
        Returns:
        true if using platform-native syntax