Class BMPString

All Implemented Interfaces:
Comparable<UnicodeString>, AtomicMatchKey

public class BMPString extends UnicodeString
An implementation of UnicodeString that wraps a Java string which is known to contain no surrogates. That is, all the characters in the string are in the Basic Multilingual Plane (their codepoints are in the range 0-65535 and not in the surrogate range).
  • Constructor Details

    • BMPString

      protected BMPString(String baseString)
      Protected constructor
      Parameters:
      baseString - the string to be wrapped: the caller is responsible for ensuring this contains no surrogates
  • Method Details

    • of

      public static UnicodeString of(String base)
      Wrap a String, which must contain no surrogates
      Parameters:
      base - the string. The caller warrants that this string contains no surrogates; this condition is checked only if Java assertions are enabled.
      Returns:
      the wrapped string.
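
Because the no-surrogates condition is only checked when Java assertions are enabled, a caller may want to verify it before wrapping a string. The following is a minimal sketch using only `java.lang`; the class `SurrogateCheck` and the method `containsNoSurrogates` are hypothetical names introduced for illustration and are not part of this API.

```java
final class SurrogateCheck {

    /**
     * Hypothetical helper: returns true if no char in the string is a
     * high or low surrogate, i.e. every character lies in the BMP and
     * outside the surrogate range.
     */
    static boolean containsNoSurrogates(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsNoSurrogates("caf\u00e9"));      // true
        System.out.println(containsNoSurrogates("\uD83D\uDE00"));   // false
    }
}
```

A string that passes this check has as many code points as it has UTF-16 chars, which is the property the wrapper relies on.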
    • length

      public long length()
      Description copied from class: UnicodeString
      Get the length of the string
      Specified by:
      length in class UnicodeString
      Returns:
      the number of code points in the string
    • isEmpty

      public boolean isEmpty()
      Description copied from class: UnicodeString
      Ask whether the string is empty
      Overrides:
      isEmpty in class UnicodeString
      Returns:
      true if the length of the string is zero
    • getWidth

      public int getWidth()
      Description copied from class: UnicodeString
      Get the number of bits needed to hold all the characters in this string
      Specified by:
      getWidth in class UnicodeString
      Returns:
      7 for ASCII characters (this value may be unused), 8 for Latin-1, 16 for BMP, 24 for general Unicode.
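
To make the width values concrete, the sketch below derives a width from the largest code point in an ordinary Java string. The class `WidthSketch`, the method `widthOf`, and the thresholds (127 for ASCII, 255 for Latin-1, 65535 for BMP) are assumptions made for illustration; the library may compute or cache the width differently.

```java
final class WidthSketch {

    /**
     * Hypothetical helper: the number of bits needed for the largest
     * code point in the string, using the four widths listed above.
     */
    static int widthOf(String s) {
        // An empty string has no code points; treat its maximum as 0.
        int max = s.codePoints().max().orElse(0);
        if (max <= 0x7F) {
            return 7;    // ASCII (assumed threshold)
        }
        if (max <= 0xFF) {
            return 8;    // Latin-1 (assumed threshold)
        }
        if (max <= 0xFFFF) {
            return 16;   // Basic Multilingual Plane
        }
        return 24;       // general Unicode
    }
}
```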
    • codePoints

      public IntIterator codePoints()
      Description copied from class: UnicodeString
      Get an iterator over the code points present in the string.
      Specified by:
      codePoints in class UnicodeString
      Returns:
      an iterator that delivers the individual code points
    • indexOf

      public long indexOf(int codePoint)
      Description copied from class: UnicodeString
      Get the position of the first occurrence of the specified codepoint, starting the search at the beginning
      Overrides:
      indexOf in class UnicodeString
      Parameters:
      codePoint - the sought codePoint
      Returns:
      the position (0-based) of the first occurrence found, or -1 if not found, counting code points rather than UTF-16 chars.
    • indexOf

      public long indexOf(int codePoint, long from)
      Description copied from class: UnicodeString
      Get the position of the first occurrence of the specified codepoint, starting the search at a given position in the string
      Specified by:
      indexOf in class UnicodeString
      Parameters:
      codePoint - the sought codePoint
      from - the position from which the search should start (0-based). A negative value is treated as zero. A position beyond the end of the string results in a return value of -1 (meaning not found).
      Returns:
      the position (0-based) of the first occurrence found, or -1 if not found
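
For a string with no surrogates, code point positions coincide with UTF-16 char positions, so the search can in principle be delegated to `java.lang.String`. The sketch below shows the documented handling of `from` under that assumption; `IndexOfSketch` is a hypothetical name and this is not necessarily how the class is implemented.

```java
final class IndexOfSketch {

    /**
     * Hypothetical helper: search a surrogate-free string for a code point,
     * starting at a 0-based position given as a long.
     */
    static long indexOf(String bmp, int codePoint, long from) {
        // A start position beyond the end of the string means "not found".
        if (from >= bmp.length()) {
            return -1;
        }
        // A negative start position is treated as zero. The cast is safe
        // because from is now known to be less than bmp.length().
        int start = (int) Math.max(from, 0L);
        return bmp.indexOf(codePoint, start);
    }
}
```

With the input "banana", searching for 'n' from position 3 skips the first 'n' at position 2 and finds the one at position 4.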
    • indexWhere

      public long indexWhere(IntPredicate predicate, long from)
      Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string
      Specified by:
      indexWhere in class UnicodeString
      Parameters:
      predicate - condition that the codepoint must satisfy
      from - the position from which the search should start (0-based). A negative value is treated as zero. A position beyond the end of the string results in a return value of -1 (meaning not found).
      Returns:
      the position (0-based) of the first codepoint to match the predicate, or -1 if not found
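
A predicate-based search over a surrogate-free string can be sketched as a simple scan, since each char is a complete code point. The class `IndexWhereSketch` is a hypothetical illustration built on `java.util.function.IntPredicate`; it mirrors the documented treatment of `from` but is not taken from the library source.

```java
final class IndexWhereSketch {

    /**
     * Hypothetical helper: position of the first char (equivalently, code
     * point) in a surrogate-free string that satisfies the predicate.
     */
    static long indexWhere(String bmp,
                           java.util.function.IntPredicate predicate,
                           long from) {
        // A start position beyond the end of the string means "not found".
        if (from >= bmp.length()) {
            return -1;
        }
        // A negative start position is treated as zero.
        for (int i = (int) Math.max(from, 0L); i < bmp.length(); i++) {
            if (predicate.test(bmp.charAt(i))) {
                return i;
            }
        }
        return -1;
    }
}
```

For example, `IndexWhereSketch.indexWhere("abc123", Character::isDigit, 0)` locates the first digit at position 3.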
    • codePointAt

      public int codePointAt(long index)
      Description copied from class: UnicodeString
      Get the code point at a given position in the string
      Specified by:
      codePointAt in class UnicodeString
      Parameters:
      index - the given position (0-based)
      Returns:
      the code point at the given position
    • substring

      public UnicodeString substring(long start, long end)
      Description copied from class: UnicodeString
      Get a substring of this string, with a given start and end position
      Specified by:
      substring in class UnicodeString
      Parameters:
      start - the start position (0-based): that is, the position of the first code point to be included
      end - the end position (0-based): specifically, the position of the first code point not to be included
      Returns:
      the requested substring
    • concat

      public UnicodeString concat(UnicodeString other)
      Description copied from class: UnicodeString
      Concatenate with another string, returning a new string
      Overrides:
      concat in class UnicodeString
      Parameters:
      other - the string to be appended
      Returns:
      the result of concatenating this string followed by the other
    • compareTo

      public int compareTo(UnicodeString other)
      Description copied from class: UnicodeString
      Compare this string to another using codepoint comparison
      Specified by:
      compareTo in interface Comparable<UnicodeString>
      Overrides:
      compareTo in class UnicodeString
      Parameters:
      other - the other string
      Returns:
      -1 if this string comes first, 0 if they are equal, +1 if the other string comes first
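
Codepoint comparison is not the same as `String.compareTo`, which compares UTF-16 code units: a BMP character such as U+FFFD sorts after any surrogate pair in code unit order, but before any astral character in code point order. The sketch below illustrates codepoint comparison on plain Java strings; `CodePointCompare` is a hypothetical name and the code is not the library's implementation.

```java
final class CodePointCompare {

    /**
     * Hypothetical helper: compare two strings code point by code point,
     * returning -1, 0, or +1.
     */
    static int compare(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return ca < cb ? -1 : 1;
            }
            // Advance by one char for BMP code points, two for astral ones.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one comes first.
        if (i < a.length()) {
            return 1;
        }
        if (j < b.length()) {
            return -1;
        }
        return 0;
    }
}
```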
    • equals

      public boolean equals(Object obj)
      Overrides:
      equals in class UnicodeString
    • hashCode

      public int hashCode()
      Description copied from class: UnicodeString
      Compute a hash code. All implementations of UnicodeString use compatible hash codes, so the hashing algorithm is identical to that of java.lang.String. For strings containing astral characters (code points above 65535), this means the hash code must be computed by decomposing each astral character into a surrogate pair.
      Overrides:
      hashCode in class UnicodeString
      Returns:
      the hash code
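
The compatibility requirement can be illustrated by hashing an array of code points so that the result matches `java.lang.String.hashCode()` for the equivalent string. This is a minimal sketch; `HashSketch` and `hashOfCodePoints` are hypothetical names, and for a surrogate-free string the decomposition branch is never taken.

```java
final class HashSketch {

    /**
     * Hypothetical helper: hash a sequence of code points using the same
     * 31-based polynomial as java.lang.String, applied to UTF-16 code units.
     */
    static int hashOfCodePoints(int[] codePoints) {
        int h = 0;
        for (int cp : codePoints) {
            if (Character.isSupplementaryCodePoint(cp)) {
                // Decompose an astral code point into its surrogate pair.
                h = 31 * h + Character.highSurrogate(cp);
                h = 31 * h + Character.lowSurrogate(cp);
            } else {
                h = 31 * h + cp;
            }
        }
        return h;
    }
}
```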
    • toString

      public String toString()
      Overrides:
      toString in class Object