A metadata schema for documenting material samples from multiple domains

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Stephen Richard, Dave Vieglais, Andrea Thomer, Sarah Hyunju Song, Neil Davies, Quan Gan, Eric Kansa, Sarah Kansa, John Kunze, Kerstin Lehnert, Danny Mandel, Chris Meyer, Rebecca Snyder, Ramona Walls

Abstract

This paper documents a metadata schema, implementation, and associated vocabularies developed for the Internet of Samples (iSamples) project to integrate geoscience, archaeology/anthropology, biology and genomics sample descriptions in a single cross-domain catalog. To develop the sample description scheme for sample discovery across these disparate domains, we reviewed the metadata schema and example metadata from each project partner, as well as other existing schemes. Top-level classes in the schema include MaterialSampleRecord, Curation, SamplingEvent, SamplingSite and Agent. By factoring sample type classification into material type, material sample object type, and sampled feature type, it has been possible to classify the approximately 6,000,000 samples in the combined corpus. Category vocabularies for these classifications were developed based on unique value summaries from related fields in the source sample metadata, and were tested using a card sorting exercise and by developing code for automated mapping from source metadata. Each vocabulary has on the order of 20 categories with some hierarchy; the category concepts are intended to be covering, but might overlap. These vocabularies are implemented in SKOS, and published with the ARDC Research Vocabularies Australia (RVA) vocabulary service. The metadata schema is defined using a LinkML YAML file, and implemented as a JSON schema used to validate instance documents. To support interoperability, mappings from the iSamples metadata schema to several other schemes are provided in the project GitHub repository.
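
The three-way factoring of sample type described in the abstract can be illustrated with a minimal Python sketch. The class names are taken from the abstract; every field name, the `find_samples` helper, and the category labels in the example data are hypothetical placeholders chosen for illustration, not the published iSamples schema or vocabularies. Each classification axis holds a list of labels because the abstract notes that categories might overlap.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Agent:
    name: str  # hypothetical field


@dataclass
class SamplingSite:
    label: str  # hypothetical field


@dataclass
class SamplingEvent:
    site: Optional[SamplingSite] = None   # hypothetical field
    collector: Optional[Agent] = None     # hypothetical field


@dataclass
class Curation:
    custodian: Optional[Agent] = None     # hypothetical field


@dataclass
class MaterialSampleRecord:
    sample_id: str
    # The three independent classification axes named in the abstract.
    # Lists allow more than one category per axis.
    material_types: List[str] = field(default_factory=list)
    object_types: List[str] = field(default_factory=list)
    sampled_feature_types: List[str] = field(default_factory=list)
    produced_by: Optional[SamplingEvent] = None
    curation: Optional[Curation] = None


def find_samples(records, material=None, object_type=None, feature=None):
    """Return records that match every axis given; None means 'any value'."""
    def matches(wanted, values):
        return wanted is None or wanted in values

    return [
        r for r in records
        if matches(material, r.material_types)
        and matches(object_type, r.object_types)
        and matches(feature, r.sampled_feature_types)
    ]


# Placeholder records from three notional domains (labels are invented).
records = [
    MaterialSampleRecord("ex:1", ["rock"], ["core"], ["earth interior"]),
    MaterialSampleRecord("ex:2", ["organic material"], ["whole organism"],
                         ["marine environment"]),
    MaterialSampleRecord("ex:3", ["rock"], ["artifact"],
                         ["site of past human activity"]),
]

rock_ids = [r.sample_id for r in find_samples(records, material="rock")]
print(rock_ids)
```

Because the axes are independent, a single query on one axis (here, material) returns both a geoscience-style and an archaeology-style record, which is the kind of cross-domain discovery the factoring is meant to support.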

DOI

https://doi.org/10.31223/X5KT9C

Subjects

Biodiversity, Databases and Information Systems, Earth Sciences, Environmental Sciences, Library and Information Science

Keywords

Physical Samples, natural history collections, ontology, metadata standard, Material Samples

Dates

Published: 2026-02-04 17:43

Last Updated: 2026-02-04 17:43

License

No Creative Commons license

Additional Metadata

Data Availability (Reason not available):
n/a
