Class Whitespace


  • public class Whitespace
    extends java.lang.Object
    This class provides helper methods and constants for handling whitespace.
    • Field Detail

      • PRESERVE

        public static final int PRESERVE
        The values PRESERVE, REPLACE, and COLLAPSE represent the three options for whitespace normalization. They are deliberately chosen in ascending strength order; given a number of whitespace facets, only the strongest needs to be carried out. The option TRIM is used instead of COLLAPSE when all valid values have no interior whitespace; trimming leading and trailing whitespace is then equivalent to the action of COLLAPSE, but faster.
        See Also:
        Constant Field Values
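Because the options are ordered by ascending strength, combining several whitespace facets reduces to taking a maximum. The following is a minimal sketch of that idea, not the library's code. The numeric values 0, 1, and 2 and the class name StrongestFacetSketch are assumptions made for illustration; the real values are those listed under Constant Field Values.

```java
final class StrongestFacetSketch {
    // Hypothetical values: only their ascending order matters for this sketch.
    static final int PRESERVE = 0;
    static final int REPLACE = 1;
    static final int COLLAPSE = 2;

    /** Returns the strongest action among the supplied facet values. */
    static int strongest(int... actions) {
        int result = PRESERVE; // weakest option is the starting point
        for (int action : actions) {
            result = Math.max(result, action);
        }
        return result;
    }

    public static void main(String[] args) {
        // Only the strongest facet needs to be carried out.
        System.out.println(strongest(PRESERVE, REPLACE, COLLAPSE) == COLLAPSE); // prints true
    }
}
```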
      • NONE

        public static final int NONE
        The values NONE, IGNORABLE, and ALL identify which kinds of whitespace text nodes should be stripped when building a source tree. UNSPECIFIED indicates that no particular request has been made. XSLT indicates that whitespace should be stripped as defined by the xsl:strip-space and xsl:preserve-space declarations in the stylesheet.
        See Also:
        Constant Field Values
    • Method Detail

      • applyWhitespaceNormalization

        public static UnicodeString applyWhitespaceNormalization(int action,
                                                                 UnicodeString value)
        Apply schema-defined whitespace normalization to a string
        Parameters:
        action - the action to be applied: one of PRESERVE, REPLACE, or COLLAPSE
        value - the value to be normalized
        Returns:
        the value after normalization
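The three actions can be illustrated on plain java.lang.String values. This is a sketch under stated assumptions rather than the library's implementation: the constant values, the class name ApplyNormalizationSketch, the helper names, and the choice to reject unknown actions with an exception are all hypothetical. Whitespace is taken to be tab, newline, carriage return, and space, as described under containsWhitespace.

```java
final class ApplyNormalizationSketch {
    // Hypothetical values: only their ascending order matters for this sketch.
    static final int PRESERVE = 0;
    static final int REPLACE = 1;
    static final int COLLAPSE = 2;

    static boolean isXmlWhite(int c) {
        return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
    }

    /** REPLACE: every whitespace character becomes a single space. */
    static String replace(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            sb.append(isXmlWhite(ch) ? ' ' : ch);
        }
        return sb.toString();
    }

    /** COLLAPSE: drop leading and trailing whitespace, squeeze inner runs to one space. */
    static String collapse(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        boolean pendingSpace = false;
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            if (isXmlWhite(ch)) {
                // Remember the gap only if some content already precedes it.
                pendingSpace = sb.length() > 0;
            } else {
                if (pendingSpace) {
                    sb.append(' ');
                    pendingSpace = false;
                }
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    /** Dispatches on the action; rejecting unknown actions is this sketch's own choice. */
    static String apply(int action, String value) {
        switch (action) {
            case PRESERVE: return value;
            case REPLACE: return replace(value);
            case COLLAPSE: return collapse(value);
            default: throw new IllegalArgumentException("Unknown action: " + action);
        }
    }

    public static void main(String[] args) {
        String input = " a \t b ";
        System.out.println("[" + apply(REPLACE, input) + "]");  // prints [ a   b ]
        System.out.println("[" + apply(COLLAPSE, input) + "]"); // prints [a b]
    }
}
```

REPLACE keeps the length of the string unchanged, whereas COLLAPSE can shorten it.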
      • removeAllWhitespace

        public static java.lang.String removeAllWhitespace(java.lang.String value)
        Remove all whitespace characters from a string
        Parameters:
        value - the string from which whitespace is to be removed
        Returns:
        the string without its whitespace.
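A minimal sketch of the same operation on a plain string, assuming that "whitespace" means the four XML whitespace characters listed under containsWhitespace. The class and method names are hypothetical.

```java
final class RemoveAllWhitespaceSketch {
    /** Copies the input, skipping tab, newline, carriage return, and space. */
    static String removeAll(String value) {
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char ch = value.charAt(i);
            if (ch != '\t' && ch != '\n' && ch != '\r' && ch != ' ') {
                sb.append(ch);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeAll(" 0F A3\r\n\t7C ")); // prints 0FA37C
    }
}
```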
      • removeLeadingWhitespace

        public static UnicodeString removeLeadingWhitespace(UnicodeString value)
        Remove leading whitespace characters from a string
        Parameters:
        value - the string whose leading whitespace is to be removed
        Returns:
        the string with leading whitespace removed. This may be the original string if there was no leading whitespace.
      • containsWhitespace

        public static boolean containsWhitespace(IntIterator codePoints)
        Determine if a string contains any whitespace
        Parameters:
        codePoints - the string to be tested, as a codepoint iterator
        Returns:
        true if the string contains a character that is XML whitespace, that is tab, newline, carriage return, or space
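The test can be sketched with the JDK's own codepoint iterator. Here java.util.PrimitiveIterator.OfInt stands in for the IntIterator type of the real signature, and the class name is hypothetical.

```java
import java.util.PrimitiveIterator;

final class ContainsWhitespaceSketch {
    /** XML whitespace: tab, newline, carriage return, or space. */
    static boolean isXmlWhite(int c) {
        return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
    }

    /** Consumes the iterator until the first whitespace codepoint is found. */
    static boolean containsWhitespace(PrimitiveIterator.OfInt codePoints) {
        while (codePoints.hasNext()) {
            if (isXmlWhite(codePoints.nextInt())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsWhitespace("a b".codePoints().iterator()));      // prints true
        // U+00A0 (no-break space) is not one of the four XML whitespace characters.
        System.out.println(containsWhitespace("a\u00A0b".codePoints().iterator())); // prints false
    }
}
```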
      • isAllWhite

        public static boolean isAllWhite(UnicodeString content)
        Determine if a string is all-whitespace
        Parameters:
        content - the string to be tested
        Returns:
        true if the supplied string contains no non-whitespace characters. (So the result is true for a zero-length string.)
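The zero-length case follows from the definition: a check that "every character is whitespace" holds vacuously for an empty string. A minimal sketch on java.lang.String, assuming the XML definition of whitespace; the class name is hypothetical.

```java
final class IsAllWhiteSketch {
    /** True when no character falls outside tab, newline, carriage return, space. */
    static boolean isAllWhite(String content) {
        // allMatch on an empty stream returns true, which covers the zero-length case.
        return content.chars().allMatch(
                c -> c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20);
    }

    public static void main(String[] args) {
        System.out.println(isAllWhite(""));       // prints true
        System.out.println(isAllWhite(" \t\r\n")); // prints true
        System.out.println(isAllWhite(" x "));    // prints false
    }
}
```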
      • isWhite

        public static boolean isWhite(int c)
        Determine if a character is whitespace
        Parameters:
        c - the character or codepoint to be tested
        Returns:
        true if the character is a whitespace character
      • normalizeWhitespace

        public static UnicodeString normalizeWhitespace(UnicodeString input)
        Normalize whitespace as defined in XML Schema. Note that this is not the same as the XPath normalize-space() function, which is supported by the collapseWhitespace(net.sf.saxon.str.UnicodeString) method.
        Parameters:
        input - the string to be normalized
        Returns:
        a copy of the string in which each whitespace character is replaced by a single space character
      • collapseWhitespace

        public static UnicodeString collapseWhitespace(UnicodeString in)
        Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
        Parameters:
        in - the string whose whitespace is to be collapsed
        Returns:
        the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character.
      • collapseWhitespace

        public static java.lang.String collapseWhitespace(java.lang.String in)
        Collapse whitespace as defined in XML Schema. This is equivalent to the XPath normalize-space() function.
        Parameters:
        in - the string whose whitespace is to be collapsed
        Returns:
        the string with any leading or trailing whitespace removed, and any internal sequence of whitespace characters replaced with a single space character.
      • trimmedStart

        public static long trimmedStart(UnicodeString in)
        Get the codepoint offset of the first non-whitespace character in the string
        Parameters:
        in - the input string
        Returns:
        the index of the first non-whitespace character; or -1 if the string consists entirely of whitespace (including the case where the string is zero-length)
      • trimmedEnd

        public static long trimmedEnd(UnicodeString in)
        Get the codepoint offset at which the string's trailing whitespace begins, that is, the offset immediately after the last non-whitespace character
        Parameters:
        in - the input string
        Returns:
        the index of the last non-whitespace character plus one; or zero if the string consists entirely of whitespace
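The two offsets and their edge cases can be sketched on a plain string by working on an array of codepoints, so that a surrogate pair counts as one position. This is an illustration under the assumption that whitespace means the four XML whitespace characters. The class name is hypothetical and the real methods operate on UnicodeString.

```java
final class TrimmedOffsetsSketch {
    static boolean isXmlWhite(int c) {
        return c == 0x09 || c == 0x0A || c == 0x0D || c == 0x20;
    }

    /** Offset of the first non-whitespace codepoint, or -1 if there is none. */
    static long trimmedStart(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = 0; i < cps.length; i++) {
            if (!isXmlWhite(cps[i])) {
                return i;
            }
        }
        return -1; // all whitespace, including the zero-length string
    }

    /** Offset of the last non-whitespace codepoint plus one, or 0 if there is none. */
    static long trimmedEnd(String in) {
        int[] cps = in.codePoints().toArray();
        for (int i = cps.length - 1; i >= 0; i--) {
            if (!isXmlWhite(cps[i])) {
                return i + 1;
            }
        }
        return 0; // all whitespace
    }

    public static void main(String[] args) {
        System.out.println(trimmedStart("  ab ")); // prints 2
        System.out.println(trimmedEnd("  ab "));   // prints 4
        System.out.println(trimmedStart("   "));   // prints -1
        System.out.println(trimmedEnd("   "));     // prints 0
    }
}
```

When the string is not entirely whitespace, the pair (trimmedStart, trimmedEnd) is a half-open range covering the trimmed content.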
      • trim

        public static UnicodeString trim(UnicodeString in)
        Trim whitespace: return the input string with leading and trailing whitespace removed
        Parameters:
        in - the input string
        Returns:
        the input string with leading and trailing whitespace removed
      • trim

        public static java.lang.String trim(java.lang.String in)
        Trim whitespace: return the input string with leading and trailing whitespace removed. Note that this differs from String.trim() because the definition of whitespace is different.
        Parameters:
        in - the input string
        Returns:
        the input string with leading and trailing whitespace removed
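The difference from String.trim() shows up with control characters. String.trim() strips every character up to and including U+0020, so it also removes a form feed (U+000C). A trim limited to the four XML whitespace characters leaves it in place. The sketch below assumes that XML definition; the class and method names are hypothetical.

```java
final class XmlTrimSketch {
    static boolean isXmlWhite(char c) {
        return c == '\t' || c == '\n' || c == '\r' || c == ' ';
    }

    /** Strips only tab, newline, carriage return, and space from both ends. */
    static String xmlTrim(String in) {
        int start = 0;
        int end = in.length();
        while (start < end && isXmlWhite(in.charAt(start))) {
            start++;
        }
        while (end > start && isXmlWhite(in.charAt(end - 1))) {
            end--;
        }
        return in.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "\f value \n";
        System.out.println(s.trim().length());   // prints 5: the form feed is stripped
        System.out.println(xmlTrim(s).length()); // prints 7: the form feed and the space after it remain
    }
}
```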
      • collapse

        public static java.lang.String collapse(java.lang.CharSequence in)
      • normalize

        public static java.lang.String normalize(java.lang.CharSequence in)