Class Twine8

All Implemented Interfaces:
Comparable<UnicodeString>, AtomicMatchKey

public class Twine8 extends UnicodeString
Twine8 is a Unicode string whose codepoints are all in the range 0-255 (that is, Latin-1). The codepoints are held in an array of bytes, one byte per character. The length of the string is limited to 2^31-1 codepoints.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    protected byte[]
    bytes
    protected int
    cachedHash
  • Constructor Summary

    Constructors
    Constructor
    Description
    Twine8(byte[] bytes)
    Create a Twine8 from a byte array holding characters in the range 0-255
    Twine8(char[] chars, int start, int len)
    Create a Twine8 from an array of characters that are known to be single-byte characters
    Twine8(String str)
    Create a Twine8 from a string whose characters are known to be single-byte characters
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    codePointAt(long index)
    Get the code point at a given position in the string
    IntIterator
    codePoints()
    Get an iterator over the Unicode codepoints in the value.
    int
    compareTo(UnicodeString other)
    Compare this string to another using codepoint comparison
    String
    details()
     
    boolean
    equals(Object o)
    Test whether this string is equal to another under the rules of the codepoint collation.
    byte[]
    getByteArray()
    Get an array of bytes holding the characters of the string in their Latin-1 encoding
    int
    getWidth()
    Get the number of bits needed to hold all the characters in this string
    int
    hashCode()
    Compute a hashCode.
    long
    indexOf(int codePoint, long from)
    Get the first position, at or beyond from, where a given codepoint appears in this string.
    long
    indexOf(UnicodeString other, long from)
    Get the first position, at or beyond from, where another string appears as a substring of this string, comparing codepoints.
    long
    indexWhere(IntPredicate predicate, long from)
    Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string
    boolean
    isEmpty()
    Determine whether the string is a zero-length string.
    long
    length()
    Get the length of this string, in codepoints
    int
    length32()
    Get the length of the string, provided it is less than 2^31 characters
    UnicodeString
    substring(long start, long end)
    Get a substring of this string (following the rules of String.substring(int, int), but measuring Unicode codepoints rather than 16-bit code units)
    String
    toString()
    Display as a string.

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Field Details

    • bytes

      protected byte[] bytes
    • cachedHash

      protected int cachedHash
  • Constructor Details

    • Twine8

      public Twine8(byte[] bytes)
      Create a Twine8 from a byte array holding characters in the range 0-255
      Parameters:
      bytes - the byte array containing the characters in the range 0-255. The caller must ensure that this array is never modified after it is passed in.
    • Twine8

      public Twine8(char[] chars, int start, int len)
      Create a Twine8 from an array of characters that are known to be single byte chars
      Parameters:
      chars - character array; all characters in the selected range must be less than or equal to 255
      start - offset of first character to be used
      len - number of characters to be used
    • Twine8

      public Twine8(String str)
      Create a Twine8 from a string whose characters are known to be single byte chars
      Parameters:
      str - the value; all characters must be less than or equal to 255
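The precondition on str can be illustrated with a standalone sketch that does not use Saxon. The class `Latin1Packing` and its method `pack` are hypothetical names, and the explicit range check is an assumption: the documentation only says the characters are "known" to be single-byte, so the real constructor may not validate its input.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Saxon: packs a String into Latin-1 bytes. */
final class Latin1Packing {

    /** Returns one byte per char; rejects any char above 255 (an assumed policy). */
    static byte[] pack(String str) {
        byte[] bytes = new byte[str.length()];
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c > 255) {
                throw new IllegalArgumentException("Not a Latin-1 character at index " + i);
            }
            bytes[i] = (byte) c; // values 128..255 are stored as negative bytes
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] packed = pack("caf\u00e9");
        // Decoding with ISO-8859-1 restores the original string.
        System.out.println(new String(packed, StandardCharsets.ISO_8859_1).equals("caf\u00e9"));
    }
}
```

A byte array produced this way has the same content as `str.getBytes(StandardCharsets.ISO_8859_1)` whenever every character is in range; the manual loop is used here only to make the range check visible.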
  • Method Details

    • getByteArray

      public byte[] getByteArray()
      Get an array of bytes holding the characters of the string in their Latin-1 encoding
      Returns:
      the bytes making up the string
    • length

      public long length()
      Get the length of this string, in codepoints
      Specified by:
      length in class UnicodeString
      Returns:
      the length of the string in Unicode code points
    • length32

      public int length32()
      Description copied from class: UnicodeString
      Get the length of the string, provided it is less than 2^31 characters
      Overrides:
      length32 in class UnicodeString
      Returns:
      the length of the string if it fits within a Java int
    • substring

      public UnicodeString substring(long start, long end)
      Get a substring of this string (following the rules of String.substring(int, int), but measuring Unicode codepoints rather than 16-bit code units)
      Specified by:
      substring in class UnicodeString
      Parameters:
      start - the offset of the first character to be included in the result, counting Unicode codepoints
      end - the offset of the first character to be excluded from the result, counting Unicode codepoints
      Returns:
      the substring
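With one byte per codepoint, codepoint offsets and array offsets coincide, so a substring amounts to copying a range of the byte array. The sketch below is standalone and is not Saxon's implementation; `Latin1Slice` is a hypothetical name, and the bounds checks mirror those of String.substring(int, int) as an assumption.

```java
import java.util.Arrays;

/** Hypothetical helper, not part of Saxon: slices a Latin-1 byte array. */
final class Latin1Slice {

    /** Copies bytes in [start, end); offsets count codepoints, which equal bytes here. */
    static byte[] substring(byte[] bytes, long start, long end) {
        if (start < 0 || end > bytes.length || start > end) {
            throw new IndexOutOfBoundsException("start " + start + ", end " + end
                    + ", length " + bytes.length);
        }
        return Arrays.copyOfRange(bytes, (int) start, (int) end);
    }

    public static void main(String[] args) {
        byte[] hello = {'h', 'e', 'l', 'l', 'o'};
        System.out.println(substring(hello, 1, 3).length); // two bytes: 'e', 'l'
    }
}
```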
    • codePointAt

      public int codePointAt(long index) throws IndexOutOfBoundsException
      Description copied from class: UnicodeString
      Get the code point at a given position in the string
      Specified by:
      codePointAt in class UnicodeString
      Parameters:
      index - the given position (0-based)
      Returns:
      the code point at the given position
      Throws:
      IndexOutOfBoundsException - if the index is out of range
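Because Java bytes are signed, a codepoint in the range 128-255 is stored as a negative value and has to be masked on the way out. A minimal standalone sketch of that lookup follows; `Latin1Access` is a hypothetical name and the code is not taken from Saxon.

```java
/** Hypothetical helper, not part of Saxon: reads codepoints from Latin-1 bytes. */
final class Latin1Access {

    static int codePointAt(byte[] bytes, long index) {
        if (index < 0 || index >= bytes.length) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + bytes.length);
        }
        // Mask with 0xFF to recover the unsigned 0..255 codepoint.
        return bytes[(int) index] & 0xFF;
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 'A', (byte) 0xE9};
        System.out.println(codePointAt(bytes, 1)); // prints 233
    }
}
```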
    • indexOf

      public long indexOf(int codePoint, long from)
      Get the first position, at or beyond from, where a given codepoint appears in this string.
      Specified by:
      indexOf in class UnicodeString
      Parameters:
      codePoint - the sought codepoint
      from - the position (0-based) where searching is to start (counting in codepoints)
      Returns:
      the first position where the codepoint is found, or -1 if it is not found
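A search for a single codepoint over Latin-1 bytes can be sketched as a linear scan. The code below is standalone and hypothetical (`Latin1Search` is not a Saxon class). Two details are assumptions rather than documented behaviour: a codepoint above 255 is rejected immediately because it cannot occur in such a string, and an out-of-range from is clamped instead of raising an exception.

```java
/** Hypothetical helper, not part of Saxon: codepoint search over Latin-1 bytes. */
final class Latin1Search {

    static long indexOf(byte[] bytes, int codePoint, long from) {
        if (codePoint < 0 || codePoint > 255) {
            return -1; // cannot occur in a Latin-1 string
        }
        // Clamp the start position into [0, length] (assumed policy).
        int start = (int) Math.max(0, Math.min(from, bytes.length));
        for (int i = start; i < bytes.length; i++) {
            if ((bytes[i] & 0xFF) == codePoint) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] banana = {'b', 'a', 'n', 'a', 'n', 'a'};
        System.out.println(indexOf(banana, 'a', 2)); // first 'a' at or beyond position 2
    }
}
```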
    • indexOf

      public long indexOf(UnicodeString other, long from)
      Get the first position, at or beyond from, where another string appears as a substring of this string, comparing codepoints.
      Overrides:
      indexOf in class UnicodeString
      Parameters:
      other - the other (sought) string
      from - the position (0-based) where searching is to start (counting in codepoints)
      Returns:
      the first position where the substring is found, or -1 if it is not found
    • isEmpty

      public boolean isEmpty()
      Determine whether the string is a zero-length string. This may be more efficient than testing whether the length is equal to zero.
      Overrides:
      isEmpty in class UnicodeString
      Returns:
      true if the string is zero length
    • getWidth

      public int getWidth()
      Description copied from class: UnicodeString
      Get the number of bits needed to hold all the characters in this string
      Specified by:
      getWidth in class UnicodeString
      Returns:
      7 for ASCII characters (this value may not be used), 8 for Latin-1, 16 for BMP, 24 for general Unicode.
    • codePoints

      public IntIterator codePoints()
      Get an iterator over the Unicode codepoints in the value. These will always be full codepoints, never surrogates (surrogate pairs are combined where necessary).
      Specified by:
      codePoints in class UnicodeString
      Returns:
      a sequence of Unicode codepoints
    • hashCode

      public int hashCode()
      Compute a hashCode. All implementations of UnicodeString use compatible hash codes and the hashing algorithm is therefore identical to that for java.lang.String. This means that for strings containing Astral characters, the hash code needs to be computed by decomposing an Astral character into a surrogate pair.
      Overrides:
      hashCode in class UnicodeString
      Returns:
      the hash code
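For a string whose codepoints are all below 256, no surrogate decomposition arises, so a hash compatible with java.lang.String reduces to the usual 31-based polynomial over the unsigned byte values. The following standalone sketch shows that reduction; `Latin1Hash` is a hypothetical name, the code is not Saxon's, and it omits any caching of the result.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Saxon: String-compatible hash of Latin-1 bytes. */
final class Latin1Hash {

    static int hash(byte[] bytes) {
        int h = 0;
        for (byte b : bytes) {
            // Same recurrence as String.hashCode, applied to the unsigned byte value.
            h = 31 * h + (b & 0xFF);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "caf\u00e9";
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(hash(bytes) == s.hashCode());
    }
}
```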
    • equals

      public boolean equals(Object o)
      Test whether this string is equal to another under the rules of the codepoint collation. The type annotation is ignored.
      Overrides:
      equals in class UnicodeString
      Parameters:
      o - the value to be compared with this value
      Returns:
      true if the strings are equal on a codepoint-by-codepoint basis
    • compareTo

      public int compareTo(UnicodeString other)
      Description copied from class: UnicodeString
      Compare this string to another using codepoint comparison
      Specified by:
      compareTo in interface Comparable<UnicodeString>
      Overrides:
      compareTo in class UnicodeString
      Parameters:
      other - the other string
      Returns:
      -1 if this string comes first, 0 if they are equal, +1 if the other string comes first
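When both operands hold Latin-1 bytes, codepoint comparison has to treat each byte as unsigned; comparing raw Java bytes would sort codepoints 128-255 before ASCII. The sketch below covers only that two-array case, is standalone, and uses the hypothetical name `Latin1Compare`. How the real method handles other UnicodeString implementations is not shown here.

```java
/** Hypothetical helper, not part of Saxon: codepoint order for two Latin-1 arrays. */
final class Latin1Compare {

    /** Returns -1, 0 or +1, comparing bytes as unsigned values, then by length. */
    static int compare(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff < 0 ? -1 : 1;
            }
        }
        // All shared positions are equal: the shorter string comes first.
        return Integer.signum(a.length - b.length);
    }

    public static void main(String[] args) {
        byte[] z = {(byte) 'z'};
        byte[] eAcute = {(byte) 0xE9};
        System.out.println(compare(z, eAcute)); // 'z' (122) sorts before U+00E9 (233)
    }
}
```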
    • toString

      public String toString()
      Display as a string.
      Overrides:
      toString in class Object
    • indexWhere

      public long indexWhere(IntPredicate predicate, long from)
      Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string
      Specified by:
      indexWhere in class UnicodeString
      Parameters:
      predicate - condition that the codepoint must satisfy
      from - the position from which the search should start (0-based)
      Returns:
      the position (0-based) of the first codepoint to match the predicate, or -1 if not found
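A predicate-driven search can be sketched in the same way as a codepoint search, handing each unsigned byte value to a java.util.function.IntPredicate. The code is standalone and hypothetical (`Latin1Where` is not a Saxon class), and clamping an out-of-range from is an assumption.

```java
import java.util.function.IntPredicate;

/** Hypothetical helper, not part of Saxon: predicate search over Latin-1 bytes. */
final class Latin1Where {

    static long indexWhere(byte[] bytes, IntPredicate predicate, long from) {
        // Clamp the start position into [0, length] (assumed policy).
        int start = (int) Math.max(0, Math.min(from, bytes.length));
        for (int i = start; i < bytes.length; i++) {
            if (predicate.test(bytes[i] & 0xFF)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] bytes = {'a', 'b', '1', '2', (byte) 0xE9};
        // Position of the first decimal digit, then of the first non-ASCII codepoint.
        System.out.println(indexWhere(bytes, cp -> Character.isDigit(cp), 0));
        System.out.println(indexWhere(bytes, cp -> cp > 127, 0));
    }
}
```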
    • details

      public String details()