Package net.sf.saxon.str
Class UnicodeBuilder
java.lang.Object
net.sf.saxon.str.UnicodeBuilder
All Implemented Interfaces:
UnicodeWriter, UniStringConsumer
Builder class to construct a UnicodeString by appending text incrementally
-
Constructor Summary
UnicodeBuilder()
  Create a Unicode builder with an initial allocation of 256 codepoints
UnicodeBuilder(int allocate)
  Create a Unicode builder with an initial space allocation
-
Method Summary
accept(UnicodeString chars)
  Process a supplied string
append(char ch)
  Append a character, which must not be a surrogate.
append(int codePoint)
  Append a single unicode character to the content
append(CharSequence str)
  Append a Java CharSequence to the content.
append(UnicodeString str)
  Append a UnicodeString object to the content.
append(IntIterator codePoints)
  Append multiple unicode characters to the content
appendAll(SequenceIterator iter)
  Append the string values of all the items in a sequence, with no separator
appendLatin(String str)
  Append a Java string to the content.
void clear()
  Reset the contents of this builder to be empty
void close()
  Complete the writing of characters to the result.
static byte[] expand(byte[] in, int start, int end, int oldWidth, int newWidth, int allocate)
  Expand the width of the characters in a byte array
static byte[] expand1to2(byte[] in, int start, int used, int allocate)
  Expand a byte array from 1-byte-per-character to 2-bytes-per-character
static byte[] expand1to3(byte[] in, int start, int used, int allocate)
  Expand a byte array from 1-byte-per-character to 3-bytes-per-character
static byte[] expand2to3(byte[] in, int start, int used, int allocate)
  Expand a byte array from 2-bytes-per-character to 3-bytes-per-character
static char[] expandBytesToChars(byte[] in, int start, int end)
boolean isEmpty()
  Ask whether the content of the builder is empty
long length()
  Get the number of codepoints currently in the builder
toString()
  Return a string containing the character content of this builder
toStringItem(AtomicType type)
  Construct a StringValue whose value is formed from the contents of this builder
toUnicodeString()
  Construct a UnicodeString whose value is formed from the contents of this builder
void trimToSize()
void write
  Process a supplied string
void write(UnicodeString chars)
  Process a supplied string
void writeAscii(byte[] content)
  Write a supplied string known to consist entirely of ASCII characters, supplied as a byte array
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface net.sf.saxon.str.UnicodeWriter
flush, writeCodePoint, writeRepeatedAscii
Methods inherited from interface net.sf.saxon.str.UniStringConsumer
open
-
Constructor Details
-
UnicodeBuilder
public UnicodeBuilder()
Create a Unicode builder with an initial allocation of 256 codepoints
-
UnicodeBuilder
public UnicodeBuilder(int allocate)
Create a Unicode builder with an initial space allocation
Parameters:
allocate - the initial space allocation, in codepoints (32-bit integers)
-
-
Method Details
-
append
Append a character, which must not be a surrogate. (Method needed for C#, because implicit conversion of char to int isn't supported.)
Parameters:
ch - the character
Returns:
this builder, with the new character added
-
append
Append a single Unicode character to the content
Parameters:
codePoint - the Unicode codepoint. The caller is responsible for ensuring that this is not a surrogate
Returns:
this builder, with the new character added
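Because the caller is responsible for keeping surrogates out, a pre-check before calling append may be useful. The following is a minimal sketch using only the Java standard library; the class name CodePointCheck and the method isAppendable are hypothetical and are not part of Saxon.

```java
final class CodePointCheck {

    /**
     * Hypothetical helper: returns true if the value lies in the Unicode
     * range (0 to 0x10FFFF) and is not a surrogate code unit.
     */
    static boolean isAppendable(int codePoint) {
        return Character.isValidCodePoint(codePoint)
                && (codePoint < Character.MIN_SURROGATE
                    || codePoint > Character.MAX_SURROGATE);
    }

    public static void main(String[] args) {
        System.out.println(isAppendable('A'));      // an ordinary BMP character
        System.out.println(isAppendable(0x1F600));  // a supplementary-plane codepoint
        System.out.println(isAppendable(0xD83D));   // a high surrogate on its own
    }
}
```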
-
append
Append multiple Unicode characters to the content
Parameters:
codePoints - an iterator delivering the codepoints to be added.
Returns:
this builder, with the new characters added
-
appendLatin
Append a Java string to the content. The caller is responsible for ensuring that this consists entirely of characters in the Latin-1 character set.
Parameters:
str - the string to be appended
Returns:
this builder, with the new string added
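Since appendLatin leaves the Latin-1 guarantee to the caller, a guard of the following kind could be applied first. This is a sketch under the assumption that "Latin-1" means every UTF-16 code unit is at most 0xFF; the class Latin1Check and its method isLatin1 are hypothetical names, not Saxon API.

```java
final class Latin1Check {

    /**
     * Hypothetical helper: returns true if every char in the sequence is in
     * the range 0x00 to 0xFF (the ISO 8859-1 / Latin-1 repertoire).
     */
    static boolean isLatin1(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isLatin1("caf\u00E9"));  // e-acute is U+00E9, within Latin-1
        System.out.println(isLatin1("\u20AC5"));    // the euro sign U+20AC is outside Latin-1
    }
}
```

A caller that cannot establish the guarantee would presumably fall back to the general append(CharSequence) method.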
-
appendAll
Append the string values of all the items in a sequence, with no separator
Parameters:
iter - the sequence of items
Returns:
this builder, with the new items added
-
append
Append a Java CharSequence to the content. This may contain arbitrary characters, including well-formed surrogate pairs.
Parameters:
str - the string to be appended
Returns:
this builder, with the new string added
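To illustrate what "well-formed surrogate pairs" means for a Java CharSequence, and why a length measured in codepoints can differ from String.length(), here is a standard-library sketch. The class SurrogatePairs and its methods are hypothetical helpers, not Saxon API.

```java
final class SurrogatePairs {

    /** Hypothetical helper: true if every surrogate is part of a high+low pair. */
    static boolean isWellFormed(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed immediately by a low surrogate
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i++; // skip the low surrogate just validated
            } else if (Character.isLowSurrogate(c)) {
                return false; // a low surrogate with no preceding high surrogate
            }
        }
        return true;
    }

    /** Decode the sequence into codepoints, combining each surrogate pair into one value. */
    static int[] toCodePoints(CharSequence s) {
        return s.codePoints().toArray();
    }

    public static void main(String[] args) {
        String text = "a\uD83D\uDE00b"; // 'a', U+1F600 as a surrogate pair, 'b'
        System.out.println(text.length());              // UTF-16 code units
        System.out.println(toCodePoints(text).length);  // codepoints
        System.out.println(isWellFormed(text));
    }
}
```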
-
append
Append a UnicodeString object to the content.
Parameters:
str - the string to be appended. The length is currently restricted to 2^31.
Returns:
this builder, with the new string added
-
length
public long length()
Get the number of codepoints currently in the builder
Returns:
the size in codepoints
-
isEmpty
public boolean isEmpty()
Ask whether the content of the builder is empty
Returns:
true if the size is zero
-
toUnicodeString
Construct a UnicodeString whose value is formed from the contents of this builder
Returns:
the constructed UnicodeString
-
toStringItem
Construct a StringValue whose value is formed from the contents of this builder
Parameters:
type - the required type, for example BuiltInAtomicType.STRING or BuiltInAtomicType.UNTYPED_ATOMIC. The caller warrants that the value is a valid instance of this type. No validation or whitespace normalization is carried out.
Returns:
the constructed StringValue
-
toString
Return a string containing the character content of this builder
-
clear
public void clear()
Reset the contents of this builder to be empty
-
expand1to2
public static byte[] expand1to2(byte[] in, int start, int used, int allocate)
Expand a byte array from 1-byte-per-character to 2-bytes-per-character
Parameters:
in - the input byte array
start - the start offset in bytes
used - the end offset in bytes
allocate - the number of code points to allow for in the output byte array
Returns:
the new byte array
-
expandBytesToChars
public static char[] expandBytesToChars(byte[] in, int start, int end)
-
expand1to3
public static byte[] expand1to3(byte[] in, int start, int used, int allocate)
Expand a byte array from 1-byte-per-character to 3-bytes-per-character
Parameters:
in - the input byte array
start - the start offset in bytes
used - the end offset in bytes
allocate - the number of code points to allow for in the output byte array
Returns:
the new byte array
-
expand2to3
public static byte[] expand2to3(byte[] in, int start, int used, int allocate)
Expand a byte array from 2-bytes-per-character to 3-bytes-per-character
Parameters:
in - the input byte array
start - the start offset in bytes
used - the end offset in bytes
allocate - the number of code points to allow for in the output byte array
Returns:
the new byte array
-
expand
public static byte[] expand(byte[] in, int start, int end, int oldWidth, int newWidth, int allocate)
Expand the width of the characters in a byte array
Parameters:
in - the input byte array
start - the start offset in bytes
end - the end offset in bytes
oldWidth - the width of the characters (number of bytes per character) in the input array
newWidth - the width of the characters (number of bytes per character) in the output array. If newWidth is less than or equal to oldWidth, the input array is copied; the width is never reduced
allocate - the number of code points to allow for in the output byte array; if zero (or insufficient), the output array will have no spare space for expansion
Returns:
the new byte array
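The width expansion described above can be pictured with a small standalone sketch. It is not Saxon's implementation: the class WidthExpansion and method widen are hypothetical, and the sketch assumes that each character is stored most-significant byte first and that the output capacity is the larger of allocate and the number of characters copied.

```java
final class WidthExpansion {

    /**
     * Hypothetical sketch: copy fixed-width characters from in[start..end)
     * into a new array using newWidth bytes per character.
     * Assumption: bytes within a character are big-endian, so widening
     * pads each character with leading zero bytes.
     */
    static byte[] widen(byte[] in, int start, int end,
                        int oldWidth, int newWidth, int allocate) {
        int chars = (end - start) / oldWidth;
        // If allocate is zero or too small, leave no spare room
        int capacity = Math.max(allocate, chars);
        if (newWidth <= oldWidth) {
            // The width is never reduced: plain copy at the old width
            byte[] copy = new byte[capacity * oldWidth];
            System.arraycopy(in, start, copy, 0, end - start);
            return copy;
        }
        byte[] out = new byte[capacity * newWidth];
        int pad = newWidth - oldWidth; // leading zero bytes per character
        for (int i = 0; i < chars; i++) {
            System.arraycopy(in, start + i * oldWidth, out, i * newWidth + pad, oldWidth);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] oneByte = {0x41, 0x42}; // "AB" at one byte per character
        byte[] twoByte = widen(oneByte, 0, 2, 1, 2, 0);
        System.out.println(java.util.Arrays.toString(twoByte));
    }
}
```

In this sketch a non-zero allocate simply yields trailing zero bytes that later appends could fill without reallocating.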
-
accept
Process a supplied string
Specified by:
accept in interface UniStringConsumer
Parameters:
chars - the characters to be processed
Returns:
this consumer (to allow method chaining)
-
write
Description copied from interface: UnicodeWriter
Process a supplied string
Specified by:
write in interface UnicodeWriter
Parameters:
chars - the characters to be processed
-
writeAscii
Write a supplied string known to consist entirely of ASCII characters, supplied as a byte array
Specified by:
writeAscii in interface UnicodeWriter
Parameters:
content - byte array holding ASCII characters only
Throws:
IOException - if processing fails for any reason
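The ASCII-only precondition on the byte array can be checked cheaply, because every ASCII byte has its high bit clear and is therefore non-negative as a Java byte. The sketch below uses only the standard library; the class AsciiBytes and its methods are hypothetical and not part of Saxon.

```java
import java.nio.charset.StandardCharsets;

final class AsciiBytes {

    /** Hypothetical helper: true if every byte is in the range 0x00 to 0x7F. */
    static boolean isAscii(byte[] content) {
        for (byte b : content) {
            if (b < 0) { // high bit set, so not ASCII
                return false;
            }
        }
        return true;
    }

    /** Decode ASCII bytes to a String, rejecting anything outside ASCII. */
    static String decode(byte[] content) {
        if (!isAscii(content)) {
            throw new IllegalArgumentException("content contains non-ASCII bytes");
        }
        return new String(content, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] ok = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(isAscii(ok));
        System.out.println(decode(ok));
        System.out.println(isAscii(new byte[]{(byte) 0xE9})); // 0xE9 has the high bit set
    }
}
```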
-
write
Process a supplied string
Specified by:
write in interface UnicodeWriter
Parameters:
chars - the characters to be processed
Throws:
IOException - if processing fails for any reason
-
trimToSize
public void trimToSize()
-
close
public void close()
Complete the writing of characters to the result. The default implementation does nothing.
Specified by:
close in interface UnicodeWriter
Specified by:
close in interface UniStringConsumer
-