The saxon:character-representation attribute
This attribute allows greater control
over how non-ASCII characters will be represented on output.
With method="xml", two values are supported: "decimal" and "hex". These control whether
numeric character references are output in decimal or hexadecimal when the character
is not available in the selected encoding.
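As a minimal sketch of the XML case, the stylesheet below asks for hexadecimal references. It assumes the saxon prefix is bound to http://saxon.sf.net/ (the namespace used by XSLT 2.0 era Saxon releases; check the one your version expects), and the price element is purely illustrative.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <!-- Assumption: the saxon namespace URI above matches your Saxon release. -->
  <!-- Characters missing from iso-8859-1 should be written as hex references. -->
  <xsl:output method="xml" encoding="iso-8859-1"
      saxon:character-representation="hex"/>

  <xsl:template match="/">
    <!-- U+20AC (euro sign) is not available in iso-8859-1 -->
    <price>&#x20AC;100</price>
  </xsl:template>

</xsl:stylesheet>
```

With this setting the euro sign would be expected to appear in the output as a hexadecimal character reference; with "decimal" it would be written as a decimal reference instead.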
With HTML, the value
may hold two strings, separated by a semicolon. The first string defines how non-ASCII
characters within the character encoding will be represented; the permitted values are "native",
"entity", "decimal", or "hex". The second string defines how characters outside the
encoding will be represented; the permitted values are "entity", "decimal", or "hex". Here "native"
means output the character as itself; "entity" means use a defined entity reference (such
as "&eacute;") if known; "decimal" and "hex" refer to numeric character references.
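The two-part value can be illustrated with a short sketch. It assumes the saxon prefix is bound to http://saxon.sf.net/ (verify this against your Saxon version), and the sample paragraph is hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:saxon="http://saxon.sf.net/">

  <!-- Assumption: the saxon namespace URI above matches your Saxon release. -->
  <!-- First string ("native"): non-ASCII characters inside iso-8859-1 are written as themselves. -->
  <!-- Second string ("hex"): characters outside iso-8859-1 become hex character references. -->
  <xsl:output method="html" encoding="iso-8859-1"
      saxon:character-representation="native;hex"/>

  <xsl:template match="/">
    <!-- U+00E9 is inside iso-8859-1; U+0142 is outside it -->
    <p>caf&#xE9; and z&#x142;oty</p>
  </xsl:template>

</xsl:stylesheet>
```

Under these settings, U+00E9 should be emitted as the character itself and U+0142 as a hexadecimal character reference.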
For example, "entity;decimal" (the default) means that with encoding="iso-8859-1",
characters in the range 160-255 will be represented using standard HTML entity
references, while Unicode characters above 255 will be represented as decimal character
references.
This attribute is retained for the time being in the interests of backwards
compatibility. However, the latest XSLT 2.0 specification makes it technically non-conformant
to provide attributes that change serialization behavior except in cases where the behavior
is implementation-defined, and this is not such a case: the specification, at least in the case
of the XML output method, does not allow a character to be substituted with a character reference
in cases where the character is present in the chosen encoding. The best way of ensuring that non-ASCII
characters are output using character references is to use