This package contains classes used to handle Unicode strings, notably implementations of the
UnicodeString interface, which represents a string as a sequence of directly addressable
Unicode codepoints (without relying on surrogate pairs).
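To illustrate what codepoint addressing means in practice, the following sketch contrasts an ordinary Java string, which is indexed by UTF-16 code unit, with an array holding one entry per codepoint. It uses only standard JDK calls. The class and method names are hypothetical, and it is not meant to show how the classes in this package are implemented.

```java
// Illustrative sketch only: hypothetical names, not this package's implementation.
final class CodepointAddressingSketch {

    /** Expands a Java (UTF-16) string into one int per Unicode codepoint. */
    static int[] toCodepoints(String s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        // 'a', then U+1D11E (a non-BMP character, stored as a surrogate pair), then 'b'
        String s = "a\uD834\uDD1Eb";
        int[] cps = toCodepoints(s);

        System.out.println(s.length());                  // 4 (UTF-16 code units)
        System.out.println(cps.length);                  // 3 (codepoints)
        System.out.println(Integer.toHexString(cps[1])); // 1d11e
        System.out.println((char) cps[2]);               // b
    }
}
```

In the Java string, index 1 and index 2 each hold half of a surrogate pair. In the codepoint array, index 1 holds the whole character and index 2 holds 'b'.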
Interface Summary

TwineConsumer
    Interface that accepts a sequence of Unicode codepoints.

UnicodeWriter
    Interface that accepts strings in the form of UnicodeString objects, which are written to some destination.

UniStringConsumer
    Interface that accepts a string in the form of a sequence of CharSequences, which are conceptually concatenated (though in some implementations, the final string may never be materialized in memory).
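The idea of a consumer that receives a string as a sequence of CharSequences, without ever materializing the concatenation, can be sketched as follows. All names here (ChunkConsumerSketch, CountingConsumerSketch, accept, total) are hypothetical and are not the signatures of the interfaces listed above. The sketch also assumes that no surrogate pair is split across two chunks.

```java
// Hypothetical names throughout; not the actual interfaces of this package.
interface ChunkConsumerSketch {
    /** Accepts the next chunk of the conceptually concatenated string. */
    void accept(CharSequence chunk);
}

/** Counts codepoints across all chunks without building the full string. */
final class CountingConsumerSketch implements ChunkConsumerSketch {
    private long codepoints = 0;

    @Override
    public void accept(CharSequence chunk) {
        // codePoints() combines surrogate pairs found inside this chunk.
        // Assumption: a pair is never split across two chunks.
        codepoints += chunk.codePoints().count();
    }

    long total() {
        return codepoints;
    }
}

final class ConsumerDemoSketch {
    public static void main(String[] args) {
        CountingConsumerSketch consumer = new CountingConsumerSketch();
        consumer.accept("Hello, ");
        consumer.accept(new StringBuilder("world"));
        System.out.println(consumer.total()); // 12
    }
}
```

The consumer keeps only a running count, so memory use does not grow with the length of the conceptual string.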
Class Summary

AbstractUniStringConsumer
    This abstract implementation of UniStringConsumer exists largely for C#, as a place to capture the default methods defined in the interface, and avoid them proliferating into multiple subclasses.

BMPString
    An implementation of UnicodeString that wraps a Java string which is known to contain no surrogates.

CodepointIterator
    Iterator over a string to produce a sequence of single-character strings.

CompressedWhitespace
    This class provides a compressed representation of a sequence of whitespace characters.

EmptyUnicodeString
    A zero-length Unicode string.

IndentWhitespace
    This class provides a compressed representation of a string used to represent indentation: specifically, an integer number of newlines followed by an integer number of spaces.

LargeTextBuffer
    The segments (other than the last) have a fixed size of 65536 codepoints, which may use one byte per codepoint, two bytes per codepoint, or three bytes per codepoint, depending on the largest codepoint present in the segment.

Slice16
    A Unicode string consisting entirely of 16-bit BMP characters, implemented as a range of an underlying byte array.

Slice24
    A Unicode string consisting of 24-bit characters, implemented as a range of an underlying byte array holding three bytes per codepoint.

Slice8
    A Unicode string consisting entirely of 8-bit characters, implemented as a range of an underlying byte array.

StringConstants
    Contains constants representing some frequently used strings, either as a UnicodeString or in some cases as a byte array.

StringTool

StringView
    An implementation of the CodePoints interface that wraps an ordinary Java string.

ToLower
    Class to perform lowercase conversion.

ToUpper
    Class to perform uppercase conversion.

Twine16
    Twine16 is a Unicode string consisting entirely of codepoints in the range 0-65535 (that is, the basic multilingual plane), excluding surrogates.

Twine24
    Twine24 is a Unicode string that accommodates any codepoint value up to 24 bits.

Twine8
    Twine8 is a Unicode string whose codepoints are all in the range 0-255 (that is, Latin-1).

UnicodeBuilder
    Builder class to construct a UnicodeString by appending text incrementally.

UnicodeChar
    A UnicodeString containing a single codepoint.

UnicodeString
    A UnicodeString is a sequence of Unicode codepoints that supports codepoint addressing.

UnicodeWriterToWriter
    Implementation of UnicodeWriter that converts Unicode strings to ordinary Java strings and sends them to a supplied Writer.

WhitespaceString
    This abstract class represents a couple of different implementations of strings containing whitespace only.

ZenoString
    A ZenoString is an implementation of UnicodeString that comprises a list of segments representing substrings of the total string.
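Several of the classes above (the Slice and Twine families, and the segments of LargeTextBuffer) store one, two, or three bytes per codepoint, depending on the largest codepoint present. A minimal sketch of that general technique follows. The class name, method names, and big-endian byte order are assumptions made for illustration; the actual classes may choose and lay out their storage differently.

```java
// Illustrative sketch only: hypothetical names, assumed big-endian layout.
final class WidthSelectionSketch {

    /** Returns 1, 2 or 3: the smallest byte width that fits every codepoint. */
    static int bytesPerCodepoint(int[] codepoints) {
        int max = 0;
        for (int cp : codepoints) {
            max = Math.max(max, cp);
        }
        if (max <= 0xFF) {
            return 1;   // all codepoints fit in 8 bits
        }
        if (max <= 0xFFFF) {
            return 2;   // all codepoints fit in 16 bits
        }
        return 3;       // Unicode codepoints never exceed 0x10FFFF, so 24 bits suffice
    }

    /** Packs the codepoints into a byte array using a fixed width per codepoint. */
    static byte[] pack(int[] codepoints, int width) {
        byte[] out = new byte[codepoints.length * width];
        for (int i = 0; i < codepoints.length; i++) {
            for (int b = 0; b < width; b++) {
                int shift = 8 * (width - 1 - b);
                out[i * width + b] = (byte) (codepoints[i] >>> shift);
            }
        }
        return out;
    }

    /** Reads the codepoint at the given index; cost does not depend on the index. */
    static int codepointAt(byte[] packed, int width, int index) {
        int cp = 0;
        for (int b = 0; b < width; b++) {
            cp = (cp << 8) | (packed[index * width + b] & 0xFF);
        }
        return cp;
    }
}
```

Because every codepoint occupies the same number of bytes, the codepoint at position n starts at byte offset n multiplied by the width, which is what makes direct addressing possible without scanning for surrogate pairs.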