Class JavaRegularExpression

java.lang.Object
net.sf.saxon.regex.JavaRegularExpression
All Implemented Interfaces:
RegularExpression

public class JavaRegularExpression extends Object implements RegularExpression
An implementation of RegularExpression that calls the JDK regular expression library directly. This can be invoked by appending ";j" to the flags attribute/argument.

Note that in SaxonCS this class is emulated by code that invokes the .NET regex engine, whose rules differ; there, the ";n" flag activates this option.

  • Constructor Details

    • JavaRegularExpression

      public JavaRegularExpression(UnicodeString javaRegex, String flags) throws XPathException
      Create a regular expression, starting with an already-translated Java regex. Note: this constructor is called from compiled XQuery code.
      Parameters:
      javaRegex - the regular expression after translation to Java notation
      flags - the user-specified flags (prior to any semicolon)
      Throws:
      XPathException
  • Method Details

    • getJavaRegularExpression

      public String getJavaRegularExpression()
      Get the Java regular expression (after translation from an XPath regex, but before compilation)
      Returns:
      the regular expression in Java notation
    • getFlagBits

      public int getFlagBits()
      Get the flag bits as used by the Java regular expression engine
      Returns:
      the flag bits
    • analyze

      public RegexIterator analyze(UnicodeString input)
      Use this regular expression to analyze an input string, in support of the XSLT analyze-string instruction. The resulting RegexIterator provides both the matching and non-matching substrings, and allows them to be distinguished. It also provides access to matched subgroups.
      Specified by:
      analyze in interface RegularExpression
      Parameters:
      input - the string to which the regular expression is to be applied
      Returns:
      an iterator over matched and unmatched substrings
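      As an illustration of what "matching and non-matching substrings" means here, the following is a minimal sketch that uses only java.util.regex. The class AnalyzeSketch, the method segments and the "M:"/"N:" prefixes are hypothetical and not part of Saxon; the real method returns a RegexIterator, and zero-length matches are not given any special treatment in this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class AnalyzeSketch {

    // Splits the input into consecutive segments, each tagged "M:" (matched
    // by the regex) or "N:" (lying between or around matches).
    static List<String> segments(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        int pos = 0; // end of the previous match
        while (m.find()) {
            if (m.start() > pos) {
                out.add("N:" + input.substring(pos, m.start()));
            }
            out.add("M:" + m.group());
            pos = m.end();
        }
        if (pos < input.length()) {
            out.add("N:" + input.substring(pos)); // trailing non-matching text
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints [N:ab, M:12, N:cd, M:345]
        System.out.println(segments("[0-9]+", "ab12cd345"));
    }
}
```

      Within each match, captured subgroups would be available through Matcher.group(int), which corresponds loosely to the subgroup access described above.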
    • containsMatch

      public boolean containsMatch(UnicodeString input)
      Determine whether a given string contains a match for the regular expression
      Specified by:
      containsMatch in interface RegularExpression
      Parameters:
      input - the string to match
      Returns:
      true if the string contains a match, false otherwise
    • matches

      public boolean matches(UnicodeString input)
      Determine whether the regular expression matches a given string in its entirety
      Specified by:
      matches in interface RegularExpression
      Parameters:
      input - the string to match
      Returns:
      true if the entire string matches, false otherwise
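      The difference between containsMatch and matches mirrors the difference between Matcher.find() and Matcher.matches() in the JDK. The sketch below shows that distinction with java.util.regex alone; the class MatchModes and its methods are hypothetical, and it is an assumption that this class delegates to those exact JDK calls.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class MatchModes {

    // Substring search: true if any part of the input matches the regex.
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    // Whole-string match: true only if the entire input matches the regex.
    static boolean matchesEntirely(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("b+", "abbbc"));     // true
        System.out.println(matchesEntirely("b+", "abbbc"));   // false
        System.out.println(matchesEntirely("ab+c", "abbbc")); // true
    }
}
```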
    • replace

      public UnicodeString replace(UnicodeString input, UnicodeString replacement) throws XPathException
      Replace all substrings of a supplied input string that match the regular expression with a replacement string.
      Specified by:
      replace in interface RegularExpression
      Parameters:
      input - the input string on which replacements are to be performed
      replacement - the replacement string in the format of the XPath replace() function
      Returns:
      the result of performing the replacement
      Throws:
      XPathException - if the replacement string is invalid
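      For orientation, the JDK's own replacement-string syntax also uses $N for captured groups and a backslash to escape a literal dollar sign. The sketch below shows that JDK behaviour only; the class ReplaceSketch is hypothetical, and whether this class passes the XPath replacement string to the JDK unchanged is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class ReplaceSketch {

    // Replaces every match of regex in input, interpreting the replacement
    // with the JDK rules: $1, $2, ... are group references, \$ is a literal $.
    static String replaceAll(String regex, String input, String replacement) {
        return Pattern.compile(regex).matcher(input).replaceAll(replacement);
    }

    public static void main(String[] args) {
        // Reorders the captured groups: prints 15/01/2024
        System.out.println(replaceAll("(\\d+)-(\\d+)-(\\d+)", "2024-01-15", "$3/$2/$1"));
        // Escaped dollar sign in the replacement: prints 5 $
        System.out.println(replaceAll("USD", "5 USD", "\\$"));
    }
}
```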
    • replaceWith

      Replace all substrings of a supplied input string that match the regular expression with the result of calling a replacement function.
      Specified by:
      replaceWith in interface RegularExpression
      Parameters:
      input - the input string on which replacements are to be performed
      replacement - a function that is called once for each matching substring, and that returns a replacement for that substring
      Returns:
      the result of performing the replacement
      Throws:
      XPathException - if the replacement string is invalid
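      The idea of a per-match replacement function can be sketched with plain JDK classes as follows. The signature of replaceWith is not shown above, so the function type used here (Function<String, String>) and the class ReplaceWithSketch are assumptions made purely for illustration.

```java
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class ReplaceWithSketch {

    // Calls replacer once per match and splices its result into the output.
    // The result is inserted literally: $ and \ have no special meaning.
    static String replaceWith(String regex, String input,
                              Function<String, String> replacer) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuilder sb = new StringBuilder();
        int pos = 0; // end of the previous match
        while (m.find()) {
            sb.append(input, pos, m.start());     // text before this match
            sb.append(replacer.apply(m.group())); // computed replacement
            pos = m.end();
        }
        sb.append(input, pos, input.length());    // text after the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints AB 12 CD
        System.out.println(replaceWith("[a-z]+", "ab 12 cd", s -> s.toUpperCase()));
    }
}
```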
    • tokenize

      public AtomicIterator tokenize(UnicodeString input)
      Use this regular expression to tokenize an input string.
      Specified by:
      tokenize in interface RegularExpression
      Parameters:
      input - the string to be tokenized
      Returns:
      an AtomicIterator over the resulting tokens, as objects of type StringValue
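      Tokenizing treats each match of the regular expression as a separator. A minimal sketch of that idea with java.util.regex is shown below; the class TokenizeSketch is hypothetical, it returns a List<String> rather than an AtomicIterator, and it is an assumption that the real method handles empty input and adjacent separators in the same way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class TokenizeSketch {

    // Returns the substrings of input that lie between matches of regex.
    // Adjacent separators yield an empty token; empty input yields no tokens.
    static List<String> tokenize(String regex, String input) {
        List<String> tokens = new ArrayList<>();
        if (input.isEmpty()) {
            return tokens;
        }
        Matcher m = Pattern.compile(regex).matcher(input);
        int pos = 0; // end of the previous separator
        while (m.find()) {
            tokens.add(input.substring(pos, m.start()));
            pos = m.end();
        }
        tokens.add(input.substring(pos)); // text after the last separator
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [a, b, , c]
        System.out.println(tokenize(",", "a,b,,c"));
    }
}
```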
    • setFlags

      public static int setFlags(CharSequence inFlags) throws XPathException
      Set the Java flags from the supplied XPath flags. The flags recognized have their Java-defined meanings rather than their XPath-defined meanings. The available flags are:

      d - UNIX_LINES
      m - MULTILINE
      i - CASE_INSENSITIVE
      s - DOTALL
      x - COMMENTS
      u - UNICODE_CASE
      q - LITERAL
      c - CANON_EQ

      Parameters:
      inFlags - the flags as a string, e.g. "im"
      Returns:
      the flags as a bit-significant integer
      Throws:
      XPathException - if the supplied value contains an unrecognized flag character
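      The flag letters listed above correspond to constants of java.util.regex.Pattern. The following sketch shows one way such a mapping could be written; the class FlagBitsSketch is hypothetical, and it throws IllegalArgumentException where the real method throws XPathException.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the Saxon API.
final class FlagBitsSketch {

    // Converts flag letters such as "im" into the bit mask accepted by
    // Pattern.compile(String, int).
    static int toFlagBits(CharSequence inFlags) {
        int bits = 0;
        for (int i = 0; i < inFlags.length(); i++) {
            char c = inFlags.charAt(i);
            switch (c) {
                case 'd': bits |= Pattern.UNIX_LINES; break;
                case 'm': bits |= Pattern.MULTILINE; break;
                case 'i': bits |= Pattern.CASE_INSENSITIVE; break;
                case 's': bits |= Pattern.DOTALL; break;
                case 'x': bits |= Pattern.COMMENTS; break;
                case 'u': bits |= Pattern.UNICODE_CASE; break;
                case 'q': bits |= Pattern.LITERAL; break;
                case 'c': bits |= Pattern.CANON_EQ; break;
                default:
                    throw new IllegalArgumentException("Unrecognized flag: " + c);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        int bits = toFlagBits("im");
        // With CASE_INSENSITIVE set, "ABC" matches "abc": prints true
        System.out.println(Pattern.compile("abc", bits).matcher("ABC").matches());
    }
}
```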
    • getFlags

      public String getFlags()
      Get the flags used at the time the regular expression was compiled.
      Specified by:
      getFlags in interface RegularExpression
      Returns:
      a string containing the flags
    • isPlatformNative

      public boolean isPlatformNative()
      Ask whether the regular expression is using platform-native syntax (Java or .NET), or XPath syntax
      Specified by:
      isPlatformNative in interface RegularExpression
      Returns:
      true if using platform-native syntax