Stable identifiers for specimens – A CETAF ISTC initiative supported by pro-iBiosphere
Iliyana Kuzmova

A recent initiative of the CETAF ISTC (Consortium of European Taxonomic Facilities – Information Science and Technology Committee) aims to implement a consistent identifier system for biological collections. This could be an important contribution to the formation of an international system of stable identifiers for the realm of biodiversity data.

The desire to participate in the Semantic Web and Linked Open Data has renewed interest in modern alternative identifiers for natural history collection specimens. In 2012, the Royal Botanic Garden Edinburgh (RBGE) published a paper (Hyam et al.[1]; see also "Stable Citations for Herbarium Specimens on the internet") on using Linked Data principles to issue HTTP URIs (URLs) for their specimens. This paper triggered the CETAF ISTC initiative. Other important proponents include P. DeVries (see e.g. his 2012 presentation on Linked Open Data).

The pro-iBiosphere project was instrumental in furthering this discussion by addressing the issue in depth during both the Leiden (2013-02) and Berlin (2013-05) workshops. Significant progress was made towards a better understanding of the problems of LSIDs, DOIs, and Semantic Web stable HTTP URIs; see the pro-iBiosphere report on "Towards Best Practices Guide on Editorial Policies". Additionally, a pro-iBiosphere best practices document for choosing stable URI patterns was developed.

The proposed system allows the implementation of a consistent identifier system across collections, for example those held by CETAF institutions. It is based on HTTP URIs and allows a clear distinction between physical collection objects and their associated metadata. It can provide both human-readable and machine-readable representations of an object.
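
One common Linked Data pattern for keeping a physical object apart from the documents that describe it is HTTP content negotiation: the specimen URI identifies the object itself, and a request for it is answered with a "303 See Other" redirect to either a web page or a machine-readable record, depending on the Accept header sent by the client. The following Python sketch illustrates that idea only; it is not the CETAF implementation. The host name, the `/object/` and `/data/` path layout, the file extensions and all function names are assumptions made for the example.

```python
# Illustrative sketch only: the URL layout and all function names are assumptions.

HTML = "text/html"
RDF_XML = "application/rdf+xml"


def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


def document_uri(specimen_uri, extension):
    """Derive the URI of a metadata document from the URI of the physical object."""
    base, separator, identifier = specimen_uri.rpartition("/object/")
    if not separator:
        raise ValueError("not a specimen URI: " + specimen_uri)
    return f"{base}/data/{identifier}.{extension}"


def redirect_for(specimen_uri, accept_header=""):
    """Answer a request for a specimen URI with a 303 redirect to a document."""
    for media_type in parse_accept(accept_header):
        if media_type == RDF_XML:
            return 303, document_uri(specimen_uri, "rdf")
        if media_type == HTML:
            return 303, document_uri(specimen_uri, "html")
    # Fall back to the human-readable page for browsers and unknown clients.
    return 303, document_uri(specimen_uri, "html")
```

Under these assumptions, a client requesting `http://collection.example.org/object/E00001234` with `Accept: application/rdf+xml` would be redirected to `http://collection.example.org/data/E00001234.rdf`, while a web browser would be sent to the corresponding `.html` page. The specimen URI itself never changes, which is what makes it suitable for citation.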

At a "stable identifier hackathon" held in Edinburgh in June 2013, five CETAF institutions (Royal Botanic Garden Edinburgh, Museum für Naturkunde in Berlin, Royal Botanic Gardens, Kew, National Museum of Natural History Paris and Botanical Museum Berlin-Dahlem) committed to a rapid pilot implementation of the system. Naturalis Biodiversity Center in the Netherlands also plans to join this effort.

The identifier implementations of the five institutions will be reviewed during the planned pro-iBiosphere workshop on "Improving technical cooperation and interoperability at the e-infrastructure level", to be held from 8 to 11 October 2013. The workshop will also discuss the application of the technology to object types other than specimens. In addition, a demonstration of applications will be organised to show how the system can be used.

As we demonstrate the value of this way of working, the pro-iBiosphere consortium aims to encourage other institutions to adopt HTTP URIs for their specimens. The annual meeting of Biodiversity Information Standards (TDWG), which will take place in Florence in November 2013, and future pro-iBiosphere workshops will be used for this purpose.

The URI approach will be extended to other kinds of objects, such as treatments or images, within the pro-iBiosphere pilot studies (Task 4.2). Separately, a dialogue between DataCite and our community on creating a subdomain of DOIs for observation data is being pursued.

ZooBank has already committed to offering this approach as an alternative to its present LSID-based system. We expect other institutions in the biodiversity domain to follow and hope that the system can serve as a blueprint for additional biodiversity informatics object types beyond collections.


Article by Anton Güntsch and Gregor Hagedorn

[1] Hyam, R., Drinkwater, R. E. & Harris, D. J. (2012) Stable citations for herbarium specimens on the internet: an illustration from a taxonomic revision of Duboscia (Malvaceae). Phytotaxa 73: 17–30.

