Syntax of path patterns

This section describes the structure of path patterns informally. It covers the most commonly used forms of pattern, but is not exhaustive.

Generally, the most important part of a path pattern is a node test that matches the relevant node. This can be qualified by a path that constrains where in the document the node must appear, by one or more predicates that usually constrain the node's content, or by both.

A simple but very common example: p is a node test that matches elements named p. This might be qualified by a path such as /doc/appendix//p indicating that the p element must be contained in an appendix which is itself part of a doc element at the outermost level of a document; it might also be qualified by a predicate such as p[@class='note'] which indicates that the element must have a class attribute whose value is "note".
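As a minimal sketch, the three forms above might appear in template rules as follows. The output element names (para, appendix-para, note) are hypothetical. The explicit priority is an assumption added for this illustration: a p element could match both of the last two rules, which otherwise have the same default priority.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Node test only: matches every p element. -->
  <xsl:template match="p">
    <para><xsl:apply-templates/></para>
  </xsl:template>

  <!-- Node test qualified by a path: only p elements inside an appendix
       that is a child of the outermost doc element of a document. -->
  <xsl:template match="/doc/appendix//p">
    <appendix-para><xsl:apply-templates/></appendix-para>
  </xsl:template>

  <!-- Node test qualified by a predicate: only p elements with class="note".
       priority="1" (an assumption) makes this rule win for a note that is
       also inside an appendix. -->
  <xsl:template match="p[@class='note']" priority="1">
    <note><xsl:apply-templates/></note>
  </xsl:template>

</xsl:stylesheet>
```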

Node Tests

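Technically, the node test p can be considered as an abbreviation for child::element(p, *). This has four parts: the axis (child), the node kind (element), the node name (p), and the type annotation (*, meaning any type).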

Qualifying Path

The node test that matches the target nodes can be preceded by a qualifying path. For example, li/p matches a p element only if its parent is an li element, while table//p matches a p element only if it is contained within a table.

More generally, in the pattern P/Q, the node must match Q and must have a parent that matches P, while for a pattern P//Q the node must match Q and must have an ancestor that matches P. In these examples P can be any pattern, so it can itself have qualifiers and predicates. Thus chap[@nr='1']//para matches any para element within a chap element whose nr attribute has the value "1".
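The qualifying paths described above can be sketched as template rules like these. The output element names are hypothetical. The explicit priority on the first rule is an assumption, added only to settle the case of a p element that has an li parent and also a table ancestor.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- P/Q: a p element whose parent is an li element. -->
  <xsl:template match="li/p" priority="1">
    <item-para><xsl:apply-templates/></item-para>
  </xsl:template>

  <!-- P//Q: a p element with a table ancestor at any depth. -->
  <xsl:template match="table//p">
    <cell-para><xsl:apply-templates/></cell-para>
  </xsl:template>

  <!-- P may carry its own predicate: a para anywhere inside a chap
       element whose nr attribute is "1". -->
  <xsl:template match="chap[@nr='1']//para">
    <first-chapter-para><xsl:apply-templates/></first-chapter-para>
  </xsl:template>

</xsl:stylesheet>
```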

Less commonly, the qualifier for a pattern can be a call on the id() function, or a call on doc() to indicate that the match must occur within a specific document.
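A sketch of these less common qualifiers, assuming an XSLT 3.0 processor. The ID value intro, the file name glossary.xml, and the output element names are hypothetical. The id() form works only if the source document has an attribute that is recognized as an ID, for example xml:id or an attribute declared as an ID in a DTD or schema.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A para anywhere inside the element whose ID is "intro". -->
  <xsl:template match="id('intro')//para">
    <intro-para><xsl:apply-templates/></intro-para>
  </xsl:template>

  <!-- A term element, but only when it belongs to the document
       glossary.xml. -->
  <xsl:template match="doc('glossary.xml')//term">
    <glossary-term><xsl:apply-templates/></glossary-term>
  </xsl:template>

</xsl:stylesheet>
```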

A leading / or // at the start of the pattern indicates that the match must occur within a tree rooted at a document node. Think carefully before using this: usually it adds nothing to the meaning of the pattern, and occasionally it has unintended consequences.
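One plausible example of such an unintended consequence is a temporary tree that has no document node. In the sketch below, the variable name temp and the output element names are hypothetical. The as attribute on the variable means its value is a parentless doc element, not a document node.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- A temporary tree rooted at a parentless doc element: because of the
       as attribute, no document node is created. -->
  <xsl:variable name="temp" as="element(doc)">
    <doc><p>text</p></doc>
  </xsl:variable>

  <xsl:template name="xsl:initial-template">
    <result><xsl:apply-templates select="$temp/p"/></result>
  </xsl:template>

  <!-- Not chosen for $temp/p: the leading / requires the doc element to be
       a child of a document node. -->
  <xsl:template match="/doc/p" priority="2">
    <rooted/>
  </xsl:template>

  <!-- Chosen for $temp/p: nothing is required above the doc element. -->
  <xsl:template match="doc/p">
    <unrooted/>
  </xsl:template>

</xsl:stylesheet>
```

Under these assumptions the p element in $temp is handled by the second rule, even though the first has the higher priority. The same p element in a tree rooted at a document node would be handled by the first rule.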

Predicates

A predicate is simply an additional condition that the pattern must satisfy, expressed as an XPath expression within square brackets. The predicate is evaluated with the candidate node as the context node. A simple example is para[@class='note'] which matches only those para elements having the requested attribute value.

Numeric predicates are allowed, but can cause some confusion. The pattern para[1] matches a para element that is the first para child of its parent element; by contrast, *[1][self::para] matches the first child of any element provided its name is para. The pattern (section//figure)[1] (alternatively, section/descendant::figure[1]) matches the first figure within a section (at any depth). Writing section//figure[1] is probably a mistake: it matches any figure element within a section provided it is the first figure child of its parent element.
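The difference between these numeric predicates can be sketched as follows. The output element names are hypothetical, and the explicit priority is an assumption added because the first figure in a section matches both of the last two rules.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy anything not matched by a rule below. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- The first para child of its parent, whatever other children
       come before it. -->
  <xsl:template match="para[1]">
    <lead-para><xsl:apply-templates/></lead-para>
  </xsl:template>

  <!-- The first figure at any depth within a section. -->
  <xsl:template match="section/descendant::figure[1]" priority="1">
    <first-figure><xsl:copy-of select="@*"/></first-figure>
  </xsl:template>

  <!-- Probably not what was intended: every figure within a section that
       is the first figure child of its own parent. -->
  <xsl:template match="section//figure[1]">
    <first-among-siblings><xsl:copy-of select="@*"/></first-among-siblings>
  </xsl:template>

</xsl:stylesheet>
```

For example, suppose a section has two div children, the first containing figures f1 and f2 and the second containing figure f3. The second rule matches only f1, while the pattern of the third rule matches both f1 and f3. With the priorities shown, f1 is handled by the second rule, f3 by the third, and f2 is simply copied.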

Union, intersection, and difference

The pattern P|Q matches a node if the node matches either of the operand patterns P or Q. The | operator can also be written union. Some complications arise in XSLT template matching if a node matches both branches of the union, and this case is best avoided.
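A minimal sketch of a union pattern whose branches cannot overlap. The element names para, note, and block are hypothetical.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- One rule for two disjoint alternatives: no node can be both a
       para element and a note element. -->
  <xsl:template match="para | note">
    <block><xsl:apply-templates/></block>
  </xsl:template>

</xsl:stylesheet>
```

Where no priority attribute is given, each branch of a top-level union gets its own default priority, as if the rule had been written as two separate rules with match="para" and match="note". This is one reason why overlapping branches are harder to reason about.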

In simple cases, the pattern P except Q matches a node if it matches P and does not match Q. For example, @* except (@code | @status) matches all attributes except code and status. In more complex cases, particularly where the descendant axis is involved, the semantics of except can be unexpected. For example, para except ancestor//para does not have the intuitive meaning and should be avoided. XSLT 4.0 proposes making an incompatible change in this area.
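The simple case can be sketched as a stylesheet that keeps only the code and status attributes. This assumes an XSLT 3.0 processor; dropping the other attributes is just one illustrative use of the pattern.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Copy anything not matched by the rule below, including the
       code and status attributes. -->
  <xsl:mode on-no-match="shallow-copy"/>

  <!-- Matches every attribute other than code and status. The empty
       body means the matched attributes are not copied to the output. -->
  <xsl:template match="@* except (@code | @status)"/>

</xsl:stylesheet>
```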

The form P intersect Q is allowed (a matching node must satisfy both patterns), but it is very rarely useful in practice.