Package net.sf.saxon.value
Class Whitespace
java.lang.Object
net.sf.saxon.value.Whitespace
This class provides helper methods and constants for handling whitespace
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
static class - An iterator that splits a string on whitespace boundaries, corresponding to the XPath 3.1 function tokenize#1
-
Field Summary
Fields
Modifier and Type, Field, Description
static final int ALL
static final int COLLAPSE
static final int IGNORABLE
static final int NONE - The values NONE, IGNORABLE, and ALL identify which kinds of whitespace text node should be stripped when building a source tree.
static final int PRESERVE - The values PRESERVE, REPLACE, and COLLAPSE represent the three options for whitespace normalization.
static final int REPLACE
static final int TRIM
static final int UNSPECIFIED
static final int XSLT
-
Method Summary
Modifier and Type, Method, Description
static UnicodeString applyWhitespaceNormalization(int action, UnicodeString value) - Apply schema-defined whitespace normalization to a string
static String collapse(CharSequence in)
static UnicodeString collapse
static String collapseWhitespace - Collapse whitespace as defined in XML Schema.
static UnicodeString collapseWhitespace - Collapse whitespace as defined in XML Schema.
static boolean containsWhitespace(IntIterator codePoints) - Determine if a string contains any whitespace
static boolean isAllWhite(UnicodeString content) - Determine if a string is all-whitespace
static boolean isWhite(int c) - Determine if a character is whitespace
static String normalize
static UnicodeString normalize
static UnicodeString normalizeWhitespace(UnicodeString input) - Normalize whitespace as defined in XML Schema.
static String removeAllWhitespace(String value) - Remove all whitespace characters from a string
static UnicodeString removeLeadingWhitespace - Remove leading whitespace characters from a string
static String trim - Trim whitespace: return the input string with leading and trailing whitespace removed.
static UnicodeString trim(UnicodeString in) - Trim whitespace: return the input string with leading and trailing whitespace removed
static long trimmedEnd - Get the codepoint offset of the first whitespace character in trailing whitespace in the string
static long trimmedStart - Get the codepoint offset of the first non-whitespace character in the string
-
Field Details
-
PRESERVE
public static final int PRESERVE
The values PRESERVE, REPLACE, and COLLAPSE represent the three options for whitespace normalization. They are deliberately chosen in ascending strength order; given a number of whitespace facets, only the strongest needs to be carried out. The option TRIM is used instead of COLLAPSE when all valid values have no interior whitespace; trimming leading and trailing whitespace is then equivalent to the action of COLLAPSE, but faster.
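Because the options are ordered by strength, combining several whitespace facets reduces to taking the largest code. The following is a minimal standalone sketch of that idea, not Saxon code: the numeric values 0, 1 and 2 are assumed purely for illustration (the actual constant values are not given here), and the helper name strongest is hypothetical.

```java
// Standalone sketch; it does not use Saxon. The numeric values below are
// assumed for illustration and are only required to be in ascending strength.
class FacetStrengthSketch {
    static final int PRESERVE = 0; // assumed value
    static final int REPLACE = 1;  // assumed value
    static final int COLLAPSE = 2; // assumed value

    // Hypothetical helper: choose the strongest of several whitespace actions.
    // With no actions supplied, the weakest option (PRESERVE) is returned.
    static int strongest(int... actions) {
        int result = PRESERVE;
        for (int action : actions) {
            result = Math.max(result, action);
        }
        return result;
    }
}
```

For example, strongest(PRESERVE, REPLACE, COLLAPSE) yields COLLAPSE, so only the collapse step would need to be carried out.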
-
REPLACE
public static final int REPLACE
-
COLLAPSE
public static final int COLLAPSE
-
TRIM
public static final int TRIM
-
NONE
public static final int NONE
The values NONE, IGNORABLE, and ALL identify which kinds of whitespace text node should be stripped when building a source tree. UNSPECIFIED indicates that no particular request has been made. XSLT indicates that whitespace should be stripped as defined by the xsl:strip-space and xsl:preserve-space declarations in the stylesheet.
-
IGNORABLE
public static final int IGNORABLE
-
ALL
public static final int ALL
-
UNSPECIFIED
public static final int UNSPECIFIED
-
XSLT
public static final int XSLT
-
-
Method Details
-
applyWhitespaceNormalization
Apply schema-defined whitespace normalization to a string.
Parameters:
action - the action to be applied: one of PRESERVE, REPLACE, or COLLAPSE
value - the value to be normalized
Returns:
the value after normalization
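The three actions can be pictured as a simple dispatch. This is a rough standalone sketch, not the Saxon implementation: it works on java.lang.String rather than UnicodeString, the constant values are assumed for illustration, and the class and method names are hypothetical.

```java
// Standalone sketch of the three documented actions; it does not call Saxon.
class NormalizationSketch {
    static final int PRESERVE = 0; // assumed value
    static final int REPLACE = 1;  // assumed value
    static final int COLLAPSE = 2; // assumed value

    // Hypothetical stand-in for applyWhitespaceNormalization, using String.
    static String apply(int action, String value) {
        switch (action) {
            case PRESERVE:
                // leave the value untouched
                return value;
            case REPLACE:
                // each tab, newline or carriage return becomes one space
                return value.replaceAll("[\\t\\n\\r]", " ");
            case COLLAPSE:
                // runs of whitespace become one space, then at most one
                // leading and one trailing space remain to be removed
                return value.replaceAll("[\\t\\n\\r ]+", " ").replaceAll("^ | $", "");
            default:
                throw new IllegalArgumentException("Unknown action: " + action);
        }
    }
}
```

Regular expressions keep the sketch short; a production implementation would more likely scan the characters directly.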
-
removeAllWhitespace
Remove all whitespace characters from a string.
Parameters:
value - the string from which whitespace is to be removed
Returns:
the string without its whitespace
-
removeLeadingWhitespace
Remove leading whitespace characters from a string.
Parameters:
value - the string whose leading whitespace is to be removed
Returns:
the string with leading whitespace removed. This may be the original string if there was no leading whitespace
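The two removal operations can be sketched as follows. This is an illustrative standalone version under stated assumptions: it takes whitespace to mean the four XML whitespace characters (tab, newline, carriage return, space), it uses java.lang.String in place of UnicodeString, and the class and method names are hypothetical rather than Saxon's.

```java
// Standalone sketch; it does not call Saxon.
class RemovalSketch {
    // XML whitespace: space, tab, newline, carriage return.
    static boolean isXmlWhite(int c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Drop every whitespace character, wherever it occurs.
    static String removeAll(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        value.codePoints()
             .filter(c -> !isXmlWhite(c))
             .forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // Drop only the leading run of whitespace; the input object itself is
    // returned when there is nothing to remove.
    static String removeLeading(String value) {
        int start = 0;
        while (start < value.length() && isXmlWhite(value.charAt(start))) {
            start++;
        }
        return start == 0 ? value : value.substring(start);
    }
}
```

Returning the input unchanged when no leading whitespace is present avoids an unnecessary copy, which matches the note in the Returns clause above.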
-
containsWhitespace
Determine if a string contains any whitespace.
Parameters:
codePoints - the string to be tested, as a codepoint iterator
Returns:
true if the string contains a character that is XML whitespace, that is tab, newline, carriage return, or space
-
isAllWhite
Determine if a string is all-whitespace.
Parameters:
content - the string to be tested
Returns:
true if the supplied string contains no non-whitespace characters. (So the result is true for a zero-length string.)
-
isWhite
public static boolean isWhite(int c)
Determine if a character is whitespace.
Parameters:
c - the character or codepoint to be tested
Returns:
true if the character is a whitespace character
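The three predicates above fit together naturally. The sketch below is a standalone illustration, assuming the four-character definition of XML whitespace quoted for containsWhitespace; it uses java.util.stream.IntStream as a stand-in for Saxon's IntIterator and String as a stand-in for UnicodeString, and the class name is hypothetical.

```java
// Standalone sketch; it does not call Saxon.
class WhitespacePredicatesSketch {
    // XML whitespace: tab, newline, carriage return, or space.
    static boolean isWhite(int c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    // True if at least one codepoint in the stream is whitespace.
    static boolean containsWhitespace(java.util.stream.IntStream codePoints) {
        return codePoints.anyMatch(WhitespacePredicatesSketch::isWhite);
    }

    // True if no codepoint is non-whitespace; allMatch on an empty stream
    // is true, which gives the documented result for a zero-length string.
    static boolean isAllWhite(String content) {
        return content.codePoints().allMatch(WhitespacePredicatesSketch::isWhite);
    }
}
```

Under this definition a non-breaking space (U+00A0) is not whitespace, so isWhite(0xA0) in the sketch returns false.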
-
normalizeWhitespace
Normalize whitespace as defined in XML Schema. Note that this is not the same as the XPath normalize-space() function, which is supported by the collapseWhitespace(net.sf.saxon.str.UnicodeString) method.
Parameters:
input - the string to be normalized
Returns:
a copy of the string in which any whitespace character is replaced by a single space character
-
collapseWhitespace
Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
Parameters:
in - the string whose whitespace is to be collapsed
Returns:
the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character
-
collapseWhitespace
Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
Parameters:
in - the string whose whitespace is to be collapsed
Returns:
the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character
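The collapse rule described in the Returns clause can be implemented in a single pass over the input. The following standalone sketch shows one way to do it; it is not the Saxon implementation, it operates on java.lang.String, and the names CollapseSketch and collapse are hypothetical.

```java
// Standalone sketch; it does not call Saxon.
class CollapseSketch {
    // XML whitespace: space, tab, newline, carriage return.
    static boolean isXmlWhite(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    static String collapse(String in) {
        StringBuilder sb = new StringBuilder(in.length());
        boolean pendingSpace = false;
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (isXmlWhite(c)) {
                // Remember the gap, but only if some text precedes it,
                // so that leading whitespace is dropped.
                pendingSpace = sb.length() > 0;
            } else {
                // Emit one space for the whole preceding run of whitespace.
                if (pendingSpace) {
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(c);
            }
        }
        // A space still pending at the end is trailing whitespace: drop it.
        return sb.toString();
    }
}
```

Deferring the space until the next non-whitespace character is what removes trailing whitespace without a second pass.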
-
trimmedStart
Get the codepoint offset of the first non-whitespace character in the string.
Parameters:
in - the input string
Returns:
the index of the first non-whitespace character; or -1 if the string consists entirely of whitespace (including the case where the string is zero-length)
-
trimmedEnd
Get the codepoint offset of the first character of the trailing whitespace in the string.
Parameters:
in - the input string
Returns:
the index of the last non-whitespace character plus one; or zero if the string consists entirely of whitespace
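The return conventions of trimmedStart and trimmedEnd (-1 and 0 respectively for an all-whitespace string) are easy to misread, so a small standalone sketch may help. It is an illustration only: it takes a java.lang.String where the type of the Saxon parameter is not shown here, counts offsets in codepoints rather than UTF-16 units, and uses a hypothetical class name.

```java
// Standalone sketch; it does not call Saxon.
class TrimOffsetsSketch {
    // XML whitespace: space, tab, newline, carriage return.
    static boolean isXmlWhite(int c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Codepoint offset of the first non-whitespace character, or -1 if none.
    static long trimmedStart(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = 0; i < cps.length; i++) {
            if (!isXmlWhite(cps[i])) {
                return i;
            }
        }
        return -1;
    }

    // Codepoint offset just past the last non-whitespace character, or 0 if none.
    static long trimmedEnd(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            if (!isXmlWhite(cps[i])) {
                return i + 1;
            }
        }
        return 0;
    }
}
```

For the input "  ab " the sketch gives a start of 2 and an end of 4, so the trimmed content occupies the half-open codepoint range from 2 to 4.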
-
trim
Trim whitespace: return the input string with leading and trailing whitespace removed.
Parameters:
in - the input string
Returns:
the input string with leading and trailing whitespace removed
-
trim
Trim whitespace: return the input string with leading and trailing whitespace removed. Note that this differs from String.trim() because the definition of whitespace is different.
Parameters:
in - the input string
Returns:
the input string with leading and trailing whitespace removed
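The difference from String.trim() shows up with control characters: String.trim() strips every character up to and including U+0020, whereas XML whitespace covers only tab, newline, carriage return and space. The standalone sketch below illustrates this with a form feed; it assumes that four-character definition applies to trim, and xmlTrim is a hypothetical name, not the Saxon method.

```java
// Standalone sketch; it does not call Saxon.
class XmlTrimSketch {
    // XML whitespace: space, tab, newline, carriage return.
    static boolean isXmlWhite(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Remove leading and trailing XML whitespace only.
    static String xmlTrim(String in) {
        int start = 0;
        int end = in.length();
        while (start < end && isXmlWhite(in.charAt(start))) {
            start++;
        }
        while (end > start && isXmlWhite(in.charAt(end - 1))) {
            end--;
        }
        return in.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "\f value \f"; // form feed (U+000C) is not XML whitespace
        System.out.println(s.trim().length());   // prints 5: form feeds and spaces are stripped
        System.out.println(xmlTrim(s).length()); // prints 9: the form feeds at both ends stop the trim
    }
}
```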
-
collapse
-
collapse
-
normalize
-
normalize
-