Package net.sf.saxon.str
Class BMPString
java.lang.Object
net.sf.saxon.str.UnicodeString
net.sf.saxon.str.BMPString
- All Implemented Interfaces:
Comparable<UnicodeString>, AtomicMatchKey
An implementation of UnicodeString that wraps a Java string which is known to contain
no surrogates. That is, all the characters in the string are in the Basic Multilingual
Plane (their codepoints are in the range 0-65535 and not in the surrogate range).
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
int codePointAt(long index)
    Get the code point at a given position in the string
IntIterator codePoints()
    Get an iterator over the code points present in the string.
int compareTo(UnicodeString other)
    Compare this string to another using codepoint comparison
UnicodeString concat(UnicodeString other)
    Concatenate with another string, returning a new string
boolean equals(Object o)
int getWidth()
    Get the number of bits needed to hold all the characters in this string
int hashCode()
    Compute a hashCode.
long indexOf(int codePoint)
    Get the position of the first occurrence of the specified codepoint, starting the search at the beginning
long indexOf(int codePoint, long from)
    Get the position of the first occurrence of the specified codepoint, starting the search at a given position in the string
long indexWhere(IntPredicate predicate, long from)
    Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string
boolean isEmpty()
    Ask whether the string is empty
long length()
    Get the length of the string
static UnicodeString of(String base)
    Wrap a String, which must contain no surrogates
UnicodeString substring(long start, long end)
    Get a substring of this string, with a given start and end position
String toString()
Methods inherited from class net.sf.saxon.str.UnicodeString
asAtomic, checkSubstringBounds, economize, estimatedLength, hasSubstring, indexOf, length32, prefix, requireInt, requireNonNegativeInt, substring, tidy, verifyCharacters
-
Constructor Details
-
BMPString
Protected constructor
- Parameters:
baseString - the string to be wrapped: the caller is responsible for ensuring this contains no surrogates
-
-
Method Details
-
of
Wrap a String, which must contain no surrogates
- Parameters:
base - the string. The caller warrants that this string contains no surrogates; this condition is checked only if Java assertions are enabled.
- Returns:
the wrapped string.
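The contract described for of can be illustrated with a small stand-alone sketch. The class name BmpWrapperSketch and the helper hasNoSurrogates are hypothetical and are not part of Saxon. The sketch only assumes, as stated above, that the no-surrogates condition is verified with a Java assert, so the check is skipped unless assertions are enabled (for example with java -ea).

```java
/** Hypothetical stand-in for a wrapper like BMPString; not Saxon's actual code. */
final class BmpWrapperSketch {
    private final String base;

    // Non-public constructor: callers go through the factory method.
    private BmpWrapperSketch(String base) {
        this.base = base;
    }

    /** Returns true if no UTF-16 code unit in s is a surrogate. */
    static boolean hasNoSurrogates(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (Character.isSurrogate(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    /** Wraps s; the precondition is only verified when assertions are enabled. */
    static BmpWrapperSketch of(String s) {
        assert hasNoSurrogates(s) : "string contains surrogates";
        return new BmpWrapperSketch(s);
    }

    /** With no surrogates, the code point count equals the UTF-16 length. */
    long length() {
        return base.length();
    }
}
```

A caller that cannot guarantee the precondition could run a check such as hasNoSurrogates first and fall back to a more general representation when it fails.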
-
length
public long length()
Description copied from class: UnicodeString
Get the length of the string
- Specified by:
length in class UnicodeString
- Returns:
the number of code points in the string
-
isEmpty
public boolean isEmpty()
Description copied from class: UnicodeString
Ask whether the string is empty
- Overrides:
isEmpty in class UnicodeString
- Returns:
true if the length of the string is zero
-
getWidth
public int getWidth()
Description copied from class: UnicodeString
Get the number of bits needed to hold all the characters in this string
- Specified by:
getWidth in class UnicodeString
- Returns:
7 for ASCII characters (possibly not used), 8 for Latin-1, 16 for BMP, 24 for general Unicode.
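The width categories listed above can be sketched as a classification by the largest code point in a string. This is only an illustration in plain Java: the class WidthSketch and the method widthOf are hypothetical names, the result for an empty string is an assumption, and nothing here is taken from Saxon's implementation.

```java
/** Hypothetical illustration of the 7/8/16/24 width categories. */
final class WidthSketch {
    /** Classifies a string by the largest code point it contains. */
    static int widthOf(String s) {
        // Assumption: an empty string is treated as having maximum code point 0.
        int max = s.codePoints().max().orElse(0);
        if (max < 0x80) {
            return 7;   // ASCII
        }
        if (max < 0x100) {
            return 8;   // Latin-1
        }
        if (max < 0x10000) {
            return 16;  // Basic Multilingual Plane
        }
        return 24;      // general Unicode
    }
}
```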
-
codePoints
Description copied from class: UnicodeString
Get an iterator over the code points present in the string.
- Specified by:
codePoints in class UnicodeString
- Returns:
an iterator that delivers the individual code points
-
indexOf
public long indexOf(int codePoint)
Description copied from class: UnicodeString
Get the position of the first occurrence of the specified codepoint, starting the search at the beginning
- Overrides:
indexOf in class UnicodeString
- Parameters:
codePoint - the sought codePoint
- Returns:
the position (0-based) of the first occurrence found, or -1 if not found, counting codePoints rather than UTF-16 chars.
-
indexOf
public long indexOf(int codePoint, long from)
Description copied from class: UnicodeString
Get the position of the first occurrence of the specified codepoint, starting the search at a given position in the string
- Specified by:
indexOf in class UnicodeString
- Parameters:
codePoint - the sought codePoint
from - the position from which the search should start (0-based). A negative value is treated as zero. A position beyond the end of the string results in a return value of -1 (meaning not found).
- Returns:
the position (0-based) of the first occurrence found, or -1 if not found
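The rules for the from argument can be made concrete with a plain-Java sketch. It assumes the input string contains no surrogates, so that a code point position is the same as a UTF-16 char position. IndexOfSketch is a hypothetical name and the code is not Saxon's implementation.

```java
/** Hypothetical illustration of the documented "from" semantics. */
final class IndexOfSketch {
    /**
     * Finds codePoint in bmp, starting at position from.
     * Assumes bmp contains no surrogates, so char index == code point index.
     */
    static long indexOf(String bmp, int codePoint, long from) {
        if (from >= bmp.length()) {
            return -1;                         // start is past the end: not found
        }
        int start = (int) Math.max(from, 0L);  // negative start is treated as zero
        return bmp.indexOf(codePoint, start);  // returns -1 when absent
    }
}
```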
-
indexWhere
Get the position of the first codepoint that satisfies the given predicate, starting the search at a given position in the string
- Specified by:
indexWhere in class UnicodeString
- Parameters:
predicate - condition that the codepoint must satisfy
from - the position from which the search should start (0-based). A negative value is treated as zero. A position beyond the end of the string results in a return value of -1 (meaning not found).
- Returns:
the position (0-based) of the first codepoint to match the predicate, or -1 if not found
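A predicate-based search with the same from rules might look like the following sketch. It uses java.util.function.IntPredicate as an assumption about the predicate type, works on a plain String that is assumed to contain no surrogates, and uses the hypothetical name IndexWhereSketch. It is not Saxon's code.

```java
import java.util.function.IntPredicate;

/** Hypothetical illustration of a predicate-based position search. */
final class IndexWhereSketch {
    /**
     * Returns the position of the first character at or after from that
     * satisfies the predicate, or -1 if there is none.
     * Assumes bmp contains no surrogates.
     */
    static long indexWhere(String bmp, IntPredicate predicate, long from) {
        if (from >= bmp.length()) {
            return -1;                              // start is past the end
        }
        for (int i = (int) Math.max(from, 0L); i < bmp.length(); i++) {
            if (predicate.test(bmp.charAt(i))) {
                return i;
            }
        }
        return -1;
    }
}
```

For example, passing Character::isDigit as the predicate locates the first digit at or after the start position.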
-
codePointAt
public int codePointAt(long index)
Description copied from class: UnicodeString
Get the code point at a given position in the string
- Specified by:
codePointAt in class UnicodeString
- Parameters:
index - the given position (0-based)
- Returns:
the code point at the given position
-
substring
Description copied from class: UnicodeString
Get a substring of this string, with a given start and end position
- Specified by:
substring in class UnicodeString
- Parameters:
start - the start position (0-based): that is, the position of the first code point to be included
end - the end position (0-based): specifically, the position of the first code point not to be included
- Returns:
the requested substring
-
concat
Description copied from class: UnicodeString
Concatenate with another string, returning a new string
- Overrides:
concat in class UnicodeString
- Parameters:
other - the string to be appended
- Returns:
the result of concatenating this string followed by the other
-
compareTo
Description copied from class: UnicodeString
Compare this string to another using codepoint comparison
- Specified by:
compareTo in interface Comparable<UnicodeString>
- Overrides:
compareTo in class UnicodeString
- Parameters:
other - the other string
- Returns:
-1 if this string comes first, 0 if they are equal, +1 if the other string comes first
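Codepoint comparison differs from java.lang.String.compareTo, which compares UTF-16 code units, as soon as one of the strings contains characters outside the BMP. The following plain-Java sketch shows a codepoint comparison that returns -1, 0 or +1. CodepointCompareSketch is a hypothetical name, and the code operates on ordinary Java strings rather than on Saxon classes.

```java
/** Hypothetical illustration of comparison by code point value. */
final class CodepointCompareSketch {
    /** Compares a and b code point by code point, returning -1, 0 or +1. */
    static int compare(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return ca < cb ? -1 : 1;
            }
            // Advance by one or two chars depending on the code point.
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        if (i < a.length()) {
            return 1;    // a has code points left, so b is a proper prefix of a
        }
        if (j < b.length()) {
            return -1;   // a is a proper prefix of b
        }
        return 0;
    }
}
```

U+FFFF sorts before U+10000 under this comparison, whereas String.compareTo orders them the other way round because the high surrogate 0xD800 is less than 0xFFFF.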
-
equals
- Overrides:
equals in class UnicodeString
-
hashCode
public int hashCode()
Description copied from class: UnicodeString
Compute a hashCode. All implementations of UnicodeString use compatible hash codes and the hashing algorithm is therefore identical to that for java.lang.String. This means that for strings containing astral characters, the hash code needs to be computed by decomposing an astral character into a surrogate pair.
- Overrides:
hashCode in class UnicodeString
- Returns:
the hash code
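The compatibility requirement can be sketched in plain Java as a hash over an array of code points that reproduces the java.lang.String result by splitting each astral code point into its surrogate pair. HashSketch and hashOf are hypothetical names, and the int[] representation is an assumption made for the example, not a description of how Saxon stores its strings.

```java
/** Hypothetical illustration of a String-compatible hash over code points. */
final class HashSketch {
    /** Computes the same value as String.hashCode() for the equivalent string. */
    static int hashOf(int[] codePoints) {
        int h = 0;
        for (int cp : codePoints) {
            if (Character.isBmpCodePoint(cp)) {
                h = 31 * h + cp;
            } else {
                // Astral code point: hash its two UTF-16 surrogates in order.
                h = 31 * h + Character.highSurrogate(cp);
                h = 31 * h + Character.lowSurrogate(cp);
            }
        }
        return h;
    }
}
```

For a string with no surrogates the second branch is never taken, so the computation reduces to the usual 31-based polynomial over the characters.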
-
toString
-