The xsl:character-map declaration defines a named character map for use
during serialization. The
name attribute gives the name of the character map, which can be
referenced from the
use-character-maps attribute of
xsl:output. The xsl:character-map element contains a set of
xsl:output-character elements, each
of which defines the output representation of a given Unicode character. The character is specified using the
character attribute; the string that is to replace this character on serialization is
specified using the
string attribute. Both attributes are mandatory.
The replacement string is output as is, even if it contains special (markup) characters. So, for
example, you can define <xsl:output-character character="&#160;" string="&amp;nbsp;"/> to ensure that
NBSP characters are output using the entity reference &nbsp;.
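To show where the declaration sits, here is a minimal sketch of a complete stylesheet built around the NBSP mapping above. The map name "nbsp" and the literal <p> element are illustrative assumptions, not taken from the Saxon documentation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Ask the serializer to apply the character map named "nbsp"
       (the name is an arbitrary choice for this sketch). -->
  <xsl:output method="xml" use-character-maps="nbsp"/>

  <xsl:character-map name="nbsp">
    <!-- U+00A0 (no-break space) is replaced by the six characters
         of the entity reference, written without further escaping. -->
    <xsl:output-character character="&#160;" string="&amp;nbsp;"/>
  </xsl:character-map>

  <xsl:template match="/">
    <!-- The result tree holds a real U+00A0 between "10" and "km". -->
    <p>10&#160;km</p>
  </xsl:template>

</xsl:stylesheet>
```

With this map in force the serialized paragraph should read <p>10&nbsp;km</p>. Note that an XML parser reading that output will only accept it if the nbsp entity is declared somewhere, for example in a DTD referenced by the output document.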
Character maps allow you to produce output that is not well-formed XML, and they thus provide a replacement
for disable-output-escaping. A useful technique is to use characters in the Unicode
private use area (xE000 to xF8FF) as characters which, if present in the result tree, will be mapped to
special strings on output. For example, if you want to generate a proprietary XML-like format that uses
tags such as <!IF>, <!THEN>, and <!ELSE>, then you could map these to the three characters
xE000, xE001, xE002 (which you could in turn define as entities so they can be written symbolically in your
stylesheet or source document).
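The private-use technique might look like the following sketch. The entity names IF, THEN and ELSE, the map name "tags", and the <rule> element with its content are assumptions chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE xsl:stylesheet [
  <!-- Symbolic names for three private-use characters. -->
  <!ENTITY IF   "&#xE000;">
  <!ENTITY THEN "&#xE001;">
  <!ENTITY ELSE "&#xE002;">
]>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" use-character-maps="tags"/>

  <xsl:character-map name="tags">
    <!-- Each private-use character is written out as a pseudo-tag.
         The angle brackets are escaped here in the stylesheet, but
         the serializer emits the replacement string unescaped. -->
    <xsl:output-character character="&IF;"   string="&lt;!IF&gt;"/>
    <xsl:output-character character="&THEN;" string="&lt;!THEN&gt;"/>
    <xsl:output-character character="&ELSE;" string="&lt;!ELSE&gt;"/>
  </xsl:character-map>

  <xsl:template match="/">
    <!-- In the result tree these are ordinary characters in a text node. -->
    <rule>&IF;ready&THEN;go&ELSE;wait</rule>
  </xsl:template>

</xsl:stylesheet>
```

The serializer should then write <rule><!IF>ready<!THEN>go<!ELSE>wait</rule>, which is not well-formed XML but matches the proprietary format described above.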
Character maps are preferred to
disable-output-escaping because they do not rely on an
intimate interface between the transformation engine and the serializer, and they do not distort the data model. The
special characters can happily be stored in a DOM, passed across the SAX interface, or manipulated in any
other way, before finally being rendered by the serializer.
Character maps may be assembled from other character maps using the
use-character-maps attribute. This contains a space-separated list of the names of other character maps that are to be
included in this character map.
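As a sketch of how such composition could be written, the stylesheet below combines two small maps into a third and refers only to the combined map from xsl:output. All three map names ("nbsp", "tags" and "all") and the template content are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Only the combined map is named here; it pulls in the others. -->
  <xsl:output method="xml" use-character-maps="all"/>

  <xsl:character-map name="nbsp">
    <xsl:output-character character="&#160;" string="&amp;nbsp;"/>
  </xsl:character-map>

  <xsl:character-map name="tags">
    <xsl:output-character character="&#xE000;" string="&lt;!IF&gt;"/>
  </xsl:character-map>

  <!-- A map with no mappings of its own, assembled from the two
       maps above by listing their names separated by spaces. -->
  <xsl:character-map name="all" use-character-maps="nbsp tags"/>

  <xsl:template match="/">
    <out>&#xE000;a&#160;b</out>
  </xsl:template>

</xsl:stylesheet>
```

Keeping each concern in its own small map and combining them in one place means a stylesheet can reuse, say, the entity mappings without also enabling the pseudo-tag mappings.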
Using character maps may be expensive at run-time. I have not measured the effect. Saxon currently makes no
special attempts to optimize their use: if character maps are used, then every character that is output
will be looked up in a hash table to see if there is a replacement string.