Syntax of path patterns
This section describes the structure of path patterns informally. It covers the most commonly used forms of pattern, but is not exhaustive.
Generally the most important part of a path pattern is a node test that matches the relevant node. This can be qualified by a path that constrains where in the document the node must appear, by one or more predicates (usually used to place constraints on the node's content), or by both.
A simple but very common example: p is a node test that matches elements named p.
This might be qualified by a path such as /doc/appendix//p indicating that the p
element must be contained in an appendix which is itself part of a doc
element at the outermost level of a document; it might also be qualified by a predicate such as
p[@class='note'] which indicates that the element must have a class attribute
whose value is "note".
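As a rough illustration, the three forms above might appear as match patterns in template rules like the following. This is a minimal sketch assuming an XSLT 3.0 processor; the output element names (para, note, appendix-para) are invented for the example, and the explicit priority is an assumption added so that a p matching both qualified patterns is handled unambiguously.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy anything not matched by a more specific rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Node test alone: any element named p. -->
  <xsl:template match="p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

  <!-- Node test with a predicate: p elements whose class attribute is "note". -->
  <xsl:template match="p[@class='note']">
    <note><xsl:apply-templates/></note>
  </xsl:template>

  <!-- Node test with a qualifying path: p anywhere inside an appendix
       that is a child of the outermost doc element.
       The priority is set explicitly (an assumption of this sketch) because
       this pattern and the previous one have the same default priority. -->
  <xsl:template match="/doc/appendix//p" priority="1">
    <appendix-para><xsl:apply-templates/></appendix-para>
  </xsl:template>

</xsl:stylesheet>
```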
Node Tests
Technically, the node test p can be considered as
an abbreviation for child::element(p, *). This has four parts:
p is the name of the node: here, the requirement is that the node has local name p and that the namespace is the default namespace for elements and types. The stylesheet might contain the attribute xpath-default-namespace="http://xx/" to change the default. XSLT 4.0 also offers the option xpath-default-namespace="##any" to indicate that the name should be matched regardless of its namespace.
XSLT 3.0 generalized the syntax for matching names, so *:para matches a para element in any namespace, and ns:* matches any element in the ns namespace (strictly speaking, the namespace URI bound in the stylesheet to the prefix ns). The namespace can also be spelled out explicitly, for example Q{http://my.ns/}para or Q{http://my.ns/}*.
XSLT 4.0 generalizes the syntax further so that element(*:a|*:b) matches an element in any namespace whose local name is either a or b.
* in this example is the type annotation of the node: here, the pattern will match regardless of the type annotation. Supplying a type is useful with schema-aware stylesheets, particularly those using schemas like FpML (Financial products Markup Language) where the schema type annotation is more useful than the element name for distinguishing different elements.
For example, the element <FpML version="4-0" xmlns="http://www.fpml.org/2002/FpML-4-0" xsi:type="TradeConfirmationRequest">...</FpML> might be matched using the node test element(fpml:FpML, fpml:TradeConfirmationRequest).
The node kind, here element, is useful when matching nodes other than elements. When matching attributes, the syntax attribute(code) is usually abbreviated as @code. To match text nodes, use text(); similarly comment() or processing-instruction() can be used.
As a special case, the pattern / matches a document node.
The axis, here child, is rarely needed. The default is the child axis except when the node kind is attribute or namespace, so the default is almost always right. It is also possible to use the descendant or descendant-or-self axes, but in practice a descendant selection is implicit when the node test is preceded by //.
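The name tests and node kind tests described above could be exercised with a stylesheet along these lines. It is only a sketch, restricted to XSLT 3.0 syntax: the namespace URIs are the placeholder ones used in the text, the output names (result, matched-p, any-para, in-ns, ns-title, ref) are hypothetical, and the explicit priority on ns:* is an assumption added to avoid a tie with *:para.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns="http://my.ns/"
    xpath-default-namespace="http://xx/"
    exclude-result-prefixes="ns">

  <!-- Copy anything not matched by a more specific rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Special case: / matches a document node. -->
  <xsl:template match="/">
    <result><xsl:apply-templates/></result>
  </xsl:template>

  <!-- Unprefixed name: local name p in the namespace given by
       xpath-default-namespace, here http://xx/. -->
  <xsl:template match="p">
    <matched-p><xsl:apply-templates/></matched-p>
  </xsl:template>

  <!-- Local name para in any namespace, or in no namespace. -->
  <xsl:template match="*:para">
    <any-para><xsl:apply-templates/></any-para>
  </xsl:template>

  <!-- Any element in the namespace bound to the prefix ns.
       Low explicit priority so that ns:para goes to the *:para rule. -->
  <xsl:template match="ns:*" priority="-1">
    <in-ns><xsl:apply-templates/></in-ns>
  </xsl:template>

  <!-- Namespace spelled out explicitly; same meaning as ns:title here. -->
  <xsl:template match="Q{http://my.ns/}title">
    <ns-title><xsl:apply-templates/></ns-title>
  </xsl:template>

  <!-- @code abbreviates attribute(code); rename the attribute to ref. -->
  <xsl:template match="@code">
    <xsl:attribute name="ref" select="."/>
  </xsl:template>

  <!-- Node kind tests for non-element nodes. -->
  <xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
  </xsl:template>

  <xsl:template match="comment() | processing-instruction()"/>

</xsl:stylesheet>
```

The @code rule fires only where attributes are actually selected for processing, which in this sketch happens for elements handled by the shallow-copy fallback.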
Qualifying Path
The node test that matches the target nodes can be preceded by a qualifying path.
For example, li/p matches a p element only if its parent
is an li element, while table//p matches a p element only
if it is contained within a table.
More generally, in the pattern P/Q, the node must match Q
and it must have a parent that matches P, while for a pattern P//Q
the node must match Q and must have an ancestor that matches P.
In these examples P can be any pattern, so it can itself have qualifiers
and predicates. Thus chap[@nr='1']//para matches any para
element within the chap element having the attribute nr="1".
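The qualifying paths just described might be used as follows. This is an illustrative sketch for an XSLT 3.0 processor: the output element names are made up, and the explicit priority is an assumption added because a p inside an li that is itself inside a table would otherwise match two rules of equal default priority.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy anything not matched by a more specific rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- P/Q: the p must have a parent that is an li. -->
  <xsl:template match="li/p" priority="1">
    <item-para><xsl:apply-templates/></item-para>
  </xsl:template>

  <!-- P//Q: the p must have an ancestor that is a table. -->
  <xsl:template match="table//p">
    <cell-para><xsl:apply-templates/></cell-para>
  </xsl:template>

  <!-- P may itself carry a predicate: para anywhere inside
       a chap element whose nr attribute is "1". -->
  <xsl:template match="chap[@nr='1']//para">
    <first-chapter-para><xsl:apply-templates/></first-chapter-para>
  </xsl:template>

</xsl:stylesheet>
```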
Less commonly, the qualifier for a pattern can be a call on the id()
function, or a call on doc() to indicate that the match must occur within a specific
document.
A leading / or // at the start of the pattern indicates that the
match must occur within a tree rooted at a document node. Think carefully before using this:
usually it adds nothing to the meaning of the pattern, and occasionally it has unintended
consequences.
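One situation where a leading // changes the outcome is a tree that is not rooted at a document node. The sketch below assumes an XSLT 3.0 processor and invokes the stylesheet through its initial template; the variable name fragment and the output element names are hypothetical. Because the variable is declared as="element()", the doc element it holds has no parent, so the p inside it should be handled by the second rule rather than the first.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A parentless element: as="element()" means no document node
       is created around doc. -->
  <xsl:variable name="fragment" as="element()">
    <doc><p>one</p></doc>
  </xsl:variable>

  <!-- Entry point when no source document is supplied. -->
  <xsl:template name="xsl:initial-template">
    <out>
      <xsl:apply-templates select="$fragment//p"/>
    </out>
  </xsl:template>

  <!-- Matches p only in a tree rooted at a document node. -->
  <xsl:template match="//p">
    <rooted/>
  </xsl:template>

  <!-- Matches p in any tree. -->
  <xsl:template match="p">
    <unrooted/>
  </xsl:template>

</xsl:stylesheet>
```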
Predicates
A predicate is simply an additional condition that the pattern must satisfy, expressed
as an XPath expression within square brackets. The predicate is evaluated with the candidate
node as the context node. A simple example is para[@class='note'] which matches
only those para elements having the requested attribute value.
Numeric predicates are allowed, but can cause some confusion. The pattern para[1]
matches a para element that is the first para child of its parent element;
by contrast, *[1][self::para] matches the first child of any element provided its name
is para. The pattern (section//figure)[1] (alternatively, section/descendant::figure[1])
matches the first figure within a section (at any depth). Writing
section//figure[1] is probably a mistake: it matches any figure element
within a section provided it is the first figure child of its parent element.
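To see the difference between the two figure patterns, they can be placed in separate modes and run over the same input. This is a sketch only, assuming an XSLT 3.0 processor; the mode names (per-section, per-parent) and the output element names are invented for the example.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- In both modes, unmatched nodes are skipped but their children
       are still visited. -->
  <xsl:mode name="per-section" on-no-match="shallow-skip"/>
  <xsl:mode name="per-parent" on-no-match="shallow-skip"/>

  <!-- Run both modes over the same document and report side by side. -->
  <xsl:template match="/">
    <report>
      <per-section><xsl:apply-templates mode="per-section"/></per-section>
      <per-parent><xsl:apply-templates mode="per-parent"/></per-parent>
    </report>
  </xsl:template>

  <!-- The first figure descendant of a section, at any depth. -->
  <xsl:template match="section/descendant::figure[1]" mode="per-section">
    <first-figure><xsl:copy-of select="@*"/></first-figure>
  </xsl:template>

  <!-- Any figure inside a section that is the first figure child
       of its own parent; usually more nodes than intended. -->
  <xsl:template match="section//figure[1]" mode="per-parent">
    <first-figure><xsl:copy-of select="@*"/></first-figure>
  </xsl:template>

</xsl:stylesheet>
```

For a section containing several div elements that each hold figures, the second mode would be expected to report one figure per div, while the first reports one per section.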
Union, intersection, and difference
The pattern P|Q matches a node if the node matches either of the operand patterns
P or Q. The | operator can also be written union.
Some complications arise in XSLT template matching if a node matches both branches of the union,
and this case is best avoided.
In simple cases, the pattern P except Q matches a node if it matches P
and does not match Q. For example, @* except (@code | @status) matches all
attributes except code and status. In more complex cases, particularly
where the descendant axis is involved, the semantics of except can be unexpected.
For example, para except ancestor//para does not have the intuitive meaning, and
should be avoided. XSLT 4.0 proposes making an incompatible change in this area.
The form P intersect Q is allowed (a matching node must satisfy both patterns),
but it is very rarely useful in practice.
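The simple cases of union and except could look like this in practice. It is a minimal sketch assuming an XSLT 3.0 processor; the element names note, warning, h1, and h2 and the output names aside and heading are hypothetical, and the branches of each union are chosen so that no node matches both.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy anything not matched by a more specific rule. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Union with |: one rule for either element. -->
  <xsl:template match="note | warning">
    <aside kind="{local-name()}"><xsl:apply-templates/></aside>
  </xsl:template>

  <!-- The same operator written as the keyword union. -->
  <xsl:template match="h1 union h2">
    <heading><xsl:apply-templates/></heading>
  </xsl:template>

  <!-- Except: matches every attribute other than code and status.
       The empty body drops them, so only code and status are copied. -->
  <xsl:template match="@* except (@code | @status)"/>

</xsl:stylesheet>
```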