Writing your own Collection Finder

Since Saxon 9.7, the CollectionFinder interface replaces the CollectionURIResolver interface used in earlier releases. It is considerably more flexible; in particular, it can deliver non-XML resources. The old CollectionURIResolver interface was dropped in Saxon 10.

Details of the interface can be found in the Javadoc. The basic steps are:

  1. Write a class that implements CollectionFinder. The interface has a single method, which accepts the evaluation context and an absolute collection URI, and returns an object that implements ResourceCollection. Register an instance of your CollectionFinder with the Saxon Configuration.

    For example, a CollectionFinder written to handle collection URIs using the scheme name "sql" might be supplied as:

    config.setCollectionFinder((context, uri) ->
        uri.startsWith("sql:")
            ? sqlCollection(uri)
            : config.getStandardCollectionFinder().findCollection(context, uri));

    where sqlCollection(uri) returns some user-defined implementation of ResourceCollection, perhaps one that retrieves XML documents from a relational database.

  2. You can either reuse the existing implementations of ResourceCollection, namely CatalogCollection, DirectoryCollection, and JarCollection, or you can write your own, for example by subclassing one of the existing collection classes. A ResourceCollection has two key methods that you need to implement: getResources(), which returns a sequence of Resource objects, and getResourceURIs(), which returns a sequence of URIs. These are invoked by the fn:collection() and fn:uri-collection() functions respectively.

  3. Again, you can either reuse existing implementations of Resource (such as XmlResource, JSONResource, UnparsedTextResource, BinaryResource, and MetadataResource), or you can create your own, perhaps by subclassing. The key method that the Resource object must provide is getItem(), which returns the resource in the form of an XDM item. It is good practice to delay any expensive work, such as parsing, until the getItem() method is called: this reduces the memory footprint, and allows the resources to be processed in parallel on multiple threads (Saxon-EE only).
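
The three steps above can be sketched in plain Java as follows. This is only an illustration of the pattern, not Saxon code: SketchFinder, SketchCollection, and SketchResource are hypothetical, simplified stand-ins for CollectionFinder, ResourceCollection, and Resource. The real interfaces also receive an XPathContext and return XDM items, so consult the Javadoc for the actual signatures. The loader function is likewise an assumption, standing in for whatever fetch-and-parse step your collection needs.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for Saxon's Resource, ResourceCollection and
// CollectionFinder. The real interfaces have richer signatures.
interface SketchResource {
    String getResourceURI();
    Object getItem();                      // stands in for Resource.getItem()
}

interface SketchCollection {
    Iterator<String> getResourceURIs();    // what fn:uri-collection() would use
    Iterator<SketchResource> getResources(); // what fn:collection() would use
}

interface SketchFinder {
    SketchCollection findCollection(String collectionURI);
}

/** Step 3: a resource that defers its expensive work until getItem() is called. */
final class LazyResource implements SketchResource {
    private final String uri;
    private final Function<String, Object> loader; // assumed fetch/parse step
    private Object item;                           // cached result of the loader
    private boolean loaded;

    LazyResource(String uri, Function<String, Object> loader) {
        this.uri = uri;
        this.loader = loader;
    }

    @Override
    public String getResourceURI() {
        return uri;
    }

    @Override
    public synchronized Object getItem() {
        if (!loaded) {                     // run the loader at most once
            item = loader.apply(uri);
            loaded = true;
        }
        return item;
    }
}

/** Step 2: a collection over a fixed list of URIs that hands out lazy resources. */
final class ListCollection implements SketchCollection {
    private final List<String> uris;
    private final Function<String, Object> loader;

    ListCollection(List<String> uris, Function<String, Object> loader) {
        this.uris = uris;
        this.loader = loader;
    }

    @Override
    public Iterator<String> getResourceURIs() {
        return uris.iterator();
    }

    @Override
    public Iterator<SketchResource> getResources() {
        List<SketchResource> resources = new ArrayList<>();
        for (String uri : uris) {
            resources.add(new LazyResource(uri, loader)); // nothing is loaded yet
        }
        return resources.iterator();
    }
}

/** Step 1: a finder that handles "sql:" URIs itself and delegates everything else. */
final class SchemeFinder implements SketchFinder {
    private final Function<String, SketchCollection> sqlCollection;
    private final SketchFinder fallback;

    SchemeFinder(Function<String, SketchCollection> sqlCollection, SketchFinder fallback) {
        this.sqlCollection = sqlCollection;
        this.fallback = fallback;
    }

    @Override
    public SketchCollection findCollection(String collectionURI) {
        return collectionURI.startsWith("sql:")
                ? sqlCollection.apply(collectionURI)
                : fallback.findCollection(collectionURI);
    }
}
```

In this sketch, calling findCollection("sql:orders") on a SchemeFinder returns whatever the supplied sqlCollection function builds, while any other URI goes to the fallback finder. Iterating getResources() creates only lightweight LazyResource objects; the loader runs the first time getItem() is called on each one, and the result is cached for later calls.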