Interface | Description |
---|---|
TwineConsumer |
Interface that accepts a sequence of Unicode codepoints.
|
UnicodeWriter |
Interface that accepts strings in the form of
UnicodeString objects,
which are written to some destination. |
UniStringConsumer |
Interface that accepts a string in the form of a sequence of CharSequences,
which are conceptually concatenated (though in some implementations, the final
string may never be materialized in memory)
|
Class | Description |
---|---|
AbstractUniStringConsumer |
This abstract implementation of UniStringConsumer exists largely for C#, as a place to
capture the default methods defined in the interface, and avoid them proliferating into
multiple subclasses
|
BMPString |
An implementation of
UnicodeString that wraps a Java string which is known to contain
no surrogates. |
CodepointIterator |
Iterator over a string to produce a sequence of single-character strings
|
CompressedWhitespace |
This class provides a compressed representation of a sequence of whitespace characters.
|
EmptyUnicodeString |
A zero-length Unicode string
|
IndentWhitespace |
This class provides a compressed representation of a string used to represent indentation: specifically,
an integer number of newlines followed by an integer number of spaces.
|
LargeTextBuffer |
The segments (other than the last) have a fixed size of 65536 codepoints,
which may use one byte per codepoint, two bytes per codepoint, or three bytes per
codepoint, depending on the largest codepoint present in the segment.
|
Slice16 |
A Unicode string consisting entirely of 16-bit BMP characters, implemented as a range
of an underlying byte array
|
Slice24 |
A Unicode string consisting of 24-bit characters, implemented as a range
of an underlying byte array holding three bytes per codepoint
|
Slice8 |
A Unicode string consisting entirely of 8-bit characters, implemented as a range
of an underlying byte array
|
StringConstants |
Contains constants representing some frequently used strings, either as a
UnicodeString
or in some cases as a byte array. |
StringTool | |
StringView |
An implementation of the CodePoints interface that wraps an ordinary Java string.
|
ToLower |
Class to perform lowercase conversion.
|
ToUpper |
Class to perform uppercase conversion.
|
Twine16 |
Twine16 is a Unicode string consisting entirely of codepoints in the range 0-65535
(that is, the basic multilingual plane), excluding surrogates. |
Twine24 |
Twine24 is a Unicode string that accommodates any codepoint value up to 24 bits. |
Twine8 |
Twine8 is a Unicode string whose codepoints are all in the range 0-255 (that is, Latin-1). |
UnicodeBuilder |
Builder class to construct a UnicodeString by appending text incrementally
|
UnicodeChar |
A UnicodeString containing a single codepoint
|
UnicodeString |
A UnicodeString is a sequence of Unicode codepoints that supports codepoint addressing.
|
UnicodeWriterToWriter |
Implementation of
UnicodeWriter that converts Unicode strings to ordinary
Java strings and sends them to a supplied Writer |
WhitespaceString |
This abstract class represents a couple of different implementations of strings
containing whitespace only.
|
ZenoString |
A ZenoString is an implementation of UnicodeString that comprises a list
of segments representing substrings of the total string.
|
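Several entries in the table above (Slice8/Twine8, Slice16/Twine16, Slice24/Twine24, and LargeTextBuffer) differ in whether they store one, two, or three bytes per codepoint, depending on the largest codepoint present. The following is a minimal sketch of that idea in plain Java. It does not use or reproduce any class in this package: the class name `WidthSketch`, its method names, and the big-endian byte order are all assumptions made for illustration.

```java
// Hypothetical illustration only; not part of the package described here.
class WidthSketch {

    // Returns the number of bytes needed per codepoint (1, 2, or 3),
    // decided by the largest codepoint in the array.
    static int bytesPerCodepoint(int[] codepoints) {
        int max = 0;
        for (int cp : codepoints) {
            max = Math.max(max, cp);
        }
        if (max <= 0xFF) {
            return 1;   // all codepoints fit in 8 bits (Latin-1 range)
        }
        if (max <= 0xFFFF) {
            return 2;   // all codepoints fit in 16 bits (BMP range)
        }
        return 3;       // Unicode codepoints never exceed 0x10FFFF, so 24 bits suffice
    }

    // Packs each codepoint into three bytes, most significant byte first
    // (the byte order is an assumption of this sketch).
    static byte[] pack24(int[] codepoints) {
        byte[] out = new byte[codepoints.length * 3];
        for (int i = 0; i < codepoints.length; i++) {
            int cp = codepoints[i];
            out[3 * i] = (byte) (cp >> 16);
            out[3 * i + 1] = (byte) (cp >> 8);
            out[3 * i + 2] = (byte) cp;
        }
        return out;
    }

    // Reads the codepoint at a given codepoint index directly,
    // without scanning the preceding content.
    static int codePointAt24(byte[] bytes, int index) {
        int o = index * 3;
        return ((bytes[o] & 0xFF) << 16)
                | ((bytes[o + 1] & 0xFF) << 8)
                | (bytes[o + 2] & 0xFF);
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, which lies outside the BMP
        int[] text = "a\uD83D\uDE00".codePoints().toArray();
        System.out.println(bytesPerCodepoint(text));
        byte[] packed = pack24(text);
        System.out.println(Integer.toHexString(codePointAt24(packed, 1)));
    }
}
```

With a fixed width per codepoint, the byte offset of the n-th codepoint is simply n times the width, which is what makes direct addressing possible in this kind of layout.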
This package contains classes used to handle Unicode strings: notably implementations of the
UnicodeString
interface, which represents a string as a sequence of directly addressable
Unicode codepoints (without relying on surrogate pairs).
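To illustrate what "without relying on surrogate pairs" avoids: in an ordinary Java String, a codepoint outside the basic multilingual plane occupies two char values, so char positions and codepoint positions diverge. The sketch below uses only standard JDK methods and does not call any class in this package; the class name `SurrogateDemo` and the helper `toCodepoints` are hypothetical.

```java
// Hypothetical illustration only; uses java.lang.String, not this package.
class SurrogateDemo {

    // Expands a Java string into an array with one entry per codepoint,
    // so that index i addresses the i-th codepoint directly.
    static int[] toCodepoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (stored as the surrogate pair D83D DE00), 'b'
        String s = "a\uD83D\uDE00b";

        System.out.println(s.length());                         // 4 (char units)
        System.out.println(s.codePointCount(0, s.length()));    // 3 (codepoints)

        // Char index 2 lands on the low surrogate, not on 'b'
        System.out.println(Character.isSurrogate(s.charAt(2))); // true

        // Codepoint index 2 addresses 'b' directly
        int[] cps = toCodepoints(s);
        System.out.println(Integer.toHexString(cps[1]));        // 1f600
        System.out.println((char) cps[2]);                      // b
    }
}
```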
Copyright (c) 2004-2022 Saxonica Limited. All rights reserved.