net.sf.saxon.serialize.charcode
Class UTF16CharacterSet

java.lang.Object
  extended by net.sf.saxon.serialize.charcode.UTF16CharacterSet
All Implemented Interfaces:
CharacterSet

public class UTF16CharacterSet
extends Object
implements CharacterSet

A class holding static constants and methods for processing UTF-16 and surrogate pairs.


Field Summary
static int NONBMP_MAX
           
static int NONBMP_MIN
           
static char SURROGATE1_MAX
           
static char SURROGATE1_MIN
           
static char SURROGATE2_MAX
           
static char SURROGATE2_MIN
           
 
Method Summary
static int combinePair(char high, char low)
          Return the non-BMP character corresponding to a given surrogate pair.
static boolean containsSurrogates(CharSequence s)
          Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).
 String getCanonicalName()
          Get the preferred Java name of the character set.
static UTF16CharacterSet getInstance()
          Get the singular instance of this class
static char highSurrogate(int ch)
          Return the high surrogate of a non-BMP character
 boolean inCharset(int c)
          Determine if a character is present in the character set
static boolean isHighSurrogate(int ch)
          Test whether the given character is a high surrogate
static boolean isLowSurrogate(int ch)
          Test whether the given character is a low surrogate
static boolean isSurrogate(int c)
          Test whether a given character is a surrogate (high or low)
static char lowSurrogate(int ch)
          Return the low surrogate of a non-BMP character
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

NONBMP_MIN

public static final int NONBMP_MIN
See Also:
Constant Field Values

NONBMP_MAX

public static final int NONBMP_MAX
See Also:
Constant Field Values

SURROGATE1_MIN

public static final char SURROGATE1_MIN
See Also:
Constant Field Values

SURROGATE1_MAX

public static final char SURROGATE1_MAX
See Also:
Constant Field Values

SURROGATE2_MIN

public static final char SURROGATE2_MIN
See Also:
Constant Field Values

SURROGATE2_MAX

public static final char SURROGATE2_MAX
See Also:
Constant Field Values
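
This page does not list the constant values themselves. As a rough orientation, the sketch below uses the corresponding limits from java.lang.Character, on the assumption that the six fields above mirror the ranges defined by the Unicode standard. The class name Utf16ConstantsSketch and the helper pairCount are hypothetical and not part of Saxon.

```java
// Standalone sketch: does not reference Saxon. The values are ASSUMED to match
// the Saxon fields of the same name; they come from java.lang.Character.
final class Utf16ConstantsSketch {
    static final int NONBMP_MIN = Character.MIN_SUPPLEMENTARY_CODE_POINT; // 0x10000
    static final int NONBMP_MAX = Character.MAX_CODE_POINT;               // 0x10FFFF
    static final char SURROGATE1_MIN = Character.MIN_HIGH_SURROGATE;      // 0xD800
    static final char SURROGATE1_MAX = Character.MAX_HIGH_SURROGATE;      // 0xDBFF
    static final char SURROGATE2_MIN = Character.MIN_LOW_SURROGATE;       // 0xDC00
    static final char SURROGATE2_MAX = Character.MAX_LOW_SURROGATE;       // 0xDFFF

    // Number of distinct (high, low) pairs: 1024 high values times 1024 low values.
    static int pairCount() {
        int highs = SURROGATE1_MAX - SURROGATE1_MIN + 1;
        int lows = SURROGATE2_MAX - SURROGATE2_MIN + 1;
        return highs * lows;
    }

    public static void main(String[] args) {
        // Every non-BMP code point corresponds to exactly one surrogate pair.
        System.out.println(pairCount() == NONBMP_MAX - NONBMP_MIN + 1); // true
    }
}
```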
Method Detail

getInstance

public static UTF16CharacterSet getInstance()
Get the singular instance of this class

Returns:
the singular instance of this class

inCharset

public boolean inCharset(int c)
Description copied from interface: CharacterSet
Determine if a character is present in the character set

Specified by:
inCharset in interface CharacterSet

getCanonicalName

public String getCanonicalName()
Description copied from interface: CharacterSet
Get the preferred Java name of the character set. Note that Java in many cases also supports a "historic name".

Specified by:
getCanonicalName in interface CharacterSet

combinePair

public static int combinePair(char high,
                              char low)
Return the non-BMP character corresponding to a given surrogate pair.

Parameters:
high - The high surrogate.
low - The low surrogate.
Returns:
the Unicode codepoint represented by the surrogate pair
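
To illustrate what combining a pair involves, here is a minimal standalone sketch of the standard UTF-16 decoding formula. It does not call Saxon, and it is not necessarily how Saxon implements combinePair; the class name CombinePairSketch and the constant values are assumptions taken from the Unicode standard.

```java
// Standalone sketch of UTF-16 pair decoding; not Saxon's actual source.
final class CombinePairSketch {
    // Assumed values (Unicode standard), not read from the Saxon class.
    static final char SURROGATE1_MIN = 0xD800;
    static final char SURROGATE2_MIN = 0xDC00;
    static final int NONBMP_MIN = 0x10000;

    // Each surrogate carries 10 bits: the high surrogate supplies the upper
    // 10 bits, the low surrogate the lower 10 bits, offset by 0x10000.
    static int combinePair(char high, char low) {
        return (high - SURROGATE1_MIN) * 0x400 + (low - SURROGATE2_MIN) + NONBMP_MIN;
    }

    public static void main(String[] args) {
        int cp = combinePair('\uD83D', '\uDE00');
        System.out.println(Integer.toHexString(cp)); // 1f600
        // Cross-check against the JDK's own implementation.
        System.out.println(cp == Character.toCodePoint('\uD83D', '\uDE00')); // true
    }
}
```

The sketch performs no validation: passing characters that are not a valid high/low pair yields a meaningless result, so callers would normally check the inputs first.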

highSurrogate

public static char highSurrogate(int ch)
Return the high surrogate of a non-BMP character

Parameters:
ch - The Unicode codepoint of the non-BMP character to be divided.
Returns:
the first character in the surrogate pair

lowSurrogate

public static char lowSurrogate(int ch)
Return the low surrogate of a non-BMP character

Parameters:
ch - The Unicode codepoint of the non-BMP character to be divided.
Returns:
the second character in the surrogate pair
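
The reverse direction, splitting a non-BMP code point into its two surrogates, can be sketched as follows. This is a standalone illustration of the standard UTF-16 encoding arithmetic, not Saxon's source; the class name SplitSketch is hypothetical, and the numeric constants are assumed from the Unicode standard.

```java
// Standalone sketch of UTF-16 pair encoding; not Saxon's actual source.
final class SplitSketch {
    // Upper 10 bits of (ch - 0x10000), offset into the high-surrogate range.
    static char highSurrogate(int ch) {
        return (char) (((ch - 0x10000) >> 10) + 0xD800);
    }

    // Lower 10 bits of (ch - 0x10000), offset into the low-surrogate range.
    static char lowSurrogate(int ch) {
        return (char) (((ch - 0x10000) & 0x3FF) + 0xDC00);
    }

    public static void main(String[] args) {
        int cp = 0x1F600;
        char hi = highSurrogate(cp);
        char lo = lowSurrogate(cp);
        System.out.printf("%04X %04X%n", (int) hi, (int) lo); // D83D DE00
        // Cross-check against the JDK equivalents (available since Java 7).
        System.out.println(hi == Character.highSurrogate(cp)); // true
        System.out.println(lo == Character.lowSurrogate(cp));  // true
    }
}
```

Both helpers assume the argument is in the non-BMP range (0x10000 to 0x10FFFF); for a BMP character the result is not meaningful.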

isSurrogate

public static boolean isSurrogate(int c)
Test whether a given character is a surrogate (high or low)

Parameters:
c - the character to test
Returns:
true if the character is the high or low half of a surrogate pair

isHighSurrogate

public static boolean isHighSurrogate(int ch)
Test whether the given character is a high surrogate

Parameters:
ch - The character to test.
Returns:
true if the character is the first character in a surrogate pair

isLowSurrogate

public static boolean isLowSurrogate(int ch)
Test whether the given character is a low surrogate

Parameters:
ch - The character to test.
Returns:
true if the character is the second character in a surrogate pair
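
The three surrogate tests above amount to range checks. The sketch below shows one plausible formulation, assuming the standard Unicode ranges (high: 0xD800 to 0xDBFF, low: 0xDC00 to 0xDFFF); it does not call Saxon, and the class name SurrogateTestSketch is hypothetical.

```java
// Standalone sketch of surrogate range checks; not Saxon's actual source.
final class SurrogateTestSketch {
    // First half of a pair: 0xD800..0xDBFF.
    static boolean isHighSurrogate(int ch) {
        return ch >= 0xD800 && ch <= 0xDBFF;
    }

    // Second half of a pair: 0xDC00..0xDFFF.
    static boolean isLowSurrogate(int ch) {
        return ch >= 0xDC00 && ch <= 0xDFFF;
    }

    // Either half: the two ranges are contiguous.
    static boolean isSurrogate(int c) {
        return c >= 0xD800 && c <= 0xDFFF;
    }

    public static void main(String[] args) {
        System.out.println(isHighSurrogate(0xD83D)); // true
        System.out.println(isLowSurrogate(0xDE00));  // true
        System.out.println(isSurrogate('A'));        // false
    }
}
```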

containsSurrogates

public static boolean containsSurrogates(CharSequence s)
Test whether a CharSequence contains any surrogates (i.e. any non-BMP characters).

Parameters:
s - the string to be tested
Returns:
true if the string contains any surrogates
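
A scan of this kind can be sketched as a single pass over the UTF-16 code units. The example below is a standalone illustration under the assumption that any code unit in 0xD800 to 0xDFFF counts as a surrogate; it does not call Saxon, and the class name ContainsSurrogatesSketch is hypothetical.

```java
// Standalone sketch of a surrogate scan; not Saxon's actual source.
final class ContainsSurrogatesSketch {
    // Returns true as soon as any code unit falls in the surrogate range.
    static boolean containsSurrogates(CharSequence s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0xD800 && c <= 0xDFFF) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsSurrogates("plain ASCII"));        // false
        System.out.println(containsSurrogates("smile \uD83D\uDE00")); // true
    }
}
```

A check like this is typically useful as a fast path: when it returns false, every char is a complete character and per-codepoint handling can be skipped.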


Copyright (c) 2004-2010 Saxonica Limited. All rights reserved.