Syntax of path patterns

This section describes the structure of path patterns informally. It covers the most commonly used forms of pattern, but is not exhaustive.

Generally the most important part of a path pattern is a node test that matches the relevant node. This can be qualified by a path that constrains where in the document the node must appear, by one or more predicates that usually place constraints on the node's content, or by both.

A simple but very common example: p is a node test that matches elements named p. This might be qualified by a path such as /doc/appendix//p indicating that the p element must be contained in an appendix which is itself part of a doc element at the outermost level of a document; it might also be qualified by a predicate such as p[@class='note'] which indicates that the element must have a class attribute whose value is "note".
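As a hedged illustration, the three patterns above might appear as the match attributes of template rules, as in the minimal XSLT 3.0 sketch below. The output element names (para, appendix-para, note) and the explicit priority are assumptions made for the example; the patterns themselves do not require them.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Node test only: matches any element named p -->
  <xsl:template match="p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

  <!-- Qualifying path: a p inside an appendix of the outermost doc element -->
  <xsl:template match="/doc/appendix//p">
    <appendix-para><xsl:apply-templates/></appendix-para>
  </xsl:template>

  <!-- Predicate: a p whose class attribute is "note".
       The explicit priority is an assumption of this sketch: it avoids a
       conflict with the rule above when a note appears inside an appendix. -->
  <xsl:template match="p[@class='note']" priority="1">
    <note><xsl:apply-templates/></note>
  </xsl:template>

</xsl:stylesheet>
```

Without the explicit priority, the second and third rules would have the same default priority, so a p element that satisfies both would match ambiguously.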

Node Tests

Technically, the node test p can be considered as an abbreviation for child::element(p, *). This has four parts:

p is the name of the node: here, the requirement is that the node has local name p and that the namespace is the default namespace for elements and types. The stylesheet might contain the attribute xpath-default-namespace="http://xx/" to change the default. XSLT 4.0 also offers the option xpath-default-namespace="##any" to indicate that the name should be matched regardless of its namespace.

XSLT 3.0 generalized the syntax for matching names, so *:para matches a para element in any namespace, and ns:* matches any element in the ns namespace (strictly speaking, the namespace URI bound in the stylesheet to the prefix ns). The namespace can also be spelled out explicitly, for example Q{http://my.ns/}para or Q{http://my.ns/}*.

XSLT 4.0 generalizes the syntax further so that element(*:a|*:b) matches an element in any namespace whose local name is either a or b.

The * in this example is the type annotation of the node: here, the pattern will match regardless of the type annotation. Supplying a type is useful with schema-aware stylesheets, particularly those using schemas like FpML (Financial products Markup Language) where the schema type annotation is more useful than the element name for distinguishing different elements.

For example, the element <FpML version="4-0" xmlns="http://www.fpml.org/2002/FpML-4-0" xsi:type="TradeConfirmationRequest">...</FpML> might be matched using the node test element(fpml:FpML, fpml:TradeConfirmationRequest), assuming the prefix fpml is bound in the stylesheet to the namespace http://www.fpml.org/2002/FpML-4-0.

The node kind, here element, is useful when matching nodes other than elements. When matching attributes, the syntax attribute(code) is usually abbreviated as @code. To match text nodes, use text(); similarly comment() or processing-instruction() can be used.

As a special case, the pattern / matches a document node.

The axis, here child, is rarely needed. The default is the child axis except when the node kind is attribute or namespace, so the default is almost always right. It is also possible to use the descendant or descendant-or-self axes, but in practice a descendant selection is implicit when the node test is preceded by //.
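The name tests and node kinds described above can be sketched in a single XSLT 3.0 stylesheet. This is only an illustration: the report and hit output elements are invented for the example, and the prefix ns is assumed to be bound to http://my.ns/. No explicit priorities are needed here because the default priorities already rank the more specific name tests above the wildcards.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ns="http://my.ns/"
    exclude-result-prefixes="ns">

  <!-- The pattern / matches the document node -->
  <xsl:template match="/">
    <report><xsl:apply-templates/></report>
  </xsl:template>

  <!-- Any other element: visit its attributes and children -->
  <xsl:template match="*">
    <xsl:apply-templates select="@* | node()"/>
  </xsl:template>

  <!-- para in the namespace http://my.ns/, spelled out explicitly -->
  <xsl:template match="Q{http://my.ns/}para">
    <hit kind="para in my.ns"/>
  </xsl:template>

  <!-- para in any other namespace, or in no namespace -->
  <xsl:template match="*:para">
    <hit kind="para elsewhere"/>
  </xsl:template>

  <!-- Any other element in the namespace bound to the prefix ns -->
  <xsl:template match="ns:*">
    <hit kind="other my.ns element" name="{local-name()}"/>
  </xsl:template>

  <!-- Attribute node: @code abbreviates attribute(code) -->
  <xsl:template match="@code">
    <hit kind="code attribute" value="{.}"/>
  </xsl:template>

  <!-- Comment and processing-instruction nodes -->
  <xsl:template match="comment()">
    <hit kind="comment"/>
  </xsl:template>
  <xsl:template match="processing-instruction()">
    <hit kind="processing instruction" name="{name()}"/>
  </xsl:template>

  <!-- Suppress text nodes and all remaining attributes -->
  <xsl:template match="text() | @*"/>

</xsl:stylesheet>
```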

Qualifying Path

The node test that matches the target nodes can be preceded by a qualifying path. For example, li/p matches a p element only if its parent is an li element, while table//p matches a p element only if it is contained within a table.

More generally, in the pattern P/Q, the node must match Q and it must have a parent that matches P, while for a pattern P//Q the node must match Q and must have an ancestor that matches P. In these examples P can be any pattern, so it can itself have qualifiers and predicates. Thus chap[@nr='1']//para matches any para element within a chap element having the attribute nr="1".
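The qualifying paths discussed above might be used as follows. This is a minimal sketch: the output element names are invented, and the explicit priorities are an assumption added so that a p element which is both a child of li and a descendant of table is handled by exactly one rule.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- P/Q: a p whose parent is an li -->
  <xsl:template match="li/p" priority="2">
    <item-para><xsl:apply-templates/></item-para>
  </xsl:template>

  <!-- P//Q: a p that has a table among its ancestors -->
  <xsl:template match="table//p" priority="1">
    <cell-para><xsl:apply-templates/></cell-para>
  </xsl:template>

  <!-- P may itself carry a predicate -->
  <xsl:template match="chap[@nr='1']//para">
    <first-chapter-para><xsl:apply-templates/></first-chapter-para>
  </xsl:template>

</xsl:stylesheet>
```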

Less commonly, the qualifier for a pattern can be a call on the id() function, or a call on doc() to indicate that the match must occur within a specific document.

A leading / or // at the start of the pattern indicates that the match must occur within a tree rooted at a document node. Think carefully before using this: usually it adds nothing to the meaning of the pattern, and occasionally it has unintended consequences.

Predicates

A predicate is simply an additional condition that the pattern must satisfy, expressed as an XPath expression within square brackets. The predicate is evaluated with the candidate node as the context node. A simple example is para[@class='note'] which matches only those para elements having the requested attribute value.

Numeric predicates are allowed, but can cause some confusion. The pattern para[1] matches a para element that is the first para child of its parent element; by contrast, *[1][self::para] matches the first child of any element provided its name is para. The pattern (section//figure)[1] (alternatively, section/descendant::figure[1]) matches the first figure within a section (at any depth). Writing section//figure[1] is probably a mistake: it matches any figure element within a section provided it is the first figure child of its parent element.
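The contrast between these numeric predicates can be sketched as a set of template rules. The output element names are invented for the example, and the explicit priorities are an assumption: without them, a node that satisfies two of these patterns would match two rules of equal default priority.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- First para child of its parent; other elements may precede it -->
  <xsl:template match="para[1]">
    <first-para><xsl:apply-templates/></first-para>
  </xsl:template>

  <!-- First element child of its parent, provided it is a para -->
  <xsl:template match="*[1][self::para]" priority="1">
    <leading-para><xsl:apply-templates/></leading-para>
  </xsl:template>

  <!-- Any figure in a section that is the first figure child of its parent
       (the form described above as probably a mistake) -->
  <xsl:template match="section//figure[1]">
    <first-figure-of-parent><xsl:apply-templates/></first-figure-of-parent>
  </xsl:template>

  <!-- First figure at any depth within a section -->
  <xsl:template match="section/descendant::figure[1]" priority="1">
    <first-figure-in-section><xsl:apply-templates/></first-figure-in-section>
  </xsl:template>

</xsl:stylesheet>
```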

Union, intersection, and difference

The pattern P|Q matches a node if the node matches either of the operand patterns P or Q. The | operator can also be written union. Some complications arise in XSLT template matching if a node matches both branches of the union, and this case is best avoided.

In simple cases, the pattern P except Q matches a node if it matches P and does not match Q. For example, @* except (@code | @status) matches all attributes except code and status. In more complex cases, particularly where the descendant axis is involved, the semantics of except can be unexpected. For example, para except ancestor//para does not have the intuitive meaning, and should be avoided. XSLT 4.0 proposes making an incompatible change in this area.
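A minimal XSLT 3.0 sketch of the union and except operators follows. The element names h1 and h2 and the heading output element are hypothetical; the attribute names code and status come from the example above. The two operands of each union are chosen so that no node can match both branches.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Default handling: copy each element, then process its attributes
       and children -->
  <xsl:template match="*">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Union: one rule shared by two element names -->
  <xsl:template match="h1 | h2">
    <heading><xsl:apply-templates/></heading>
  </xsl:template>

  <!-- Difference: copy every attribute other than code and status -->
  <xsl:template match="@* except (@code | @status)">
    <xsl:copy/>
  </xsl:template>

  <!-- Union of attribute tests: drop code and status -->
  <xsl:template match="@code | @status"/>

</xsl:stylesheet>
```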

The form P intersect Q is allowed (a matching node must satisfy both patterns), but it is very rarely useful in practice.