palaeoverse: a community-driven R package to support palaeobiological analysis

This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1111/2041-210X.14099. This is version 2 of this Preprint.

Authors

Lewis Alan Jones, William Gearty, Bethany Allen, Kilian Eichenseer, Christopher D. Dean, Sofía Galván, Miranta Kouvari, Pedro L. Godoy, Cecily Nicholl, Lucas Buffan, Joseph T. Flannery-Sutherland, Erin M. Dillon, Alfio Alessandro Chiarenza

Abstract

1. The open-source programming language ‘R’ has become a standard tool in the palaeobiologist’s toolkit. Its popularity within the palaeobiology community continues to grow, with published articles increasingly citing the use of R and R packages. However, there is currently a lack of agreed standards for data preparation and of frameworks to support the implementation of such standards. Consequently, data preparation workflows are often unclear and not reproducible, even when code is provided. Moreover, due to a lack of code accessibility and documentation, palaeobiologists are often forced to ‘reinvent the wheel’ to find solutions to issues already solved by other members of the community.
2. Here, we introduce palaeoverse, a community-driven R package to aid data preparation and exploration for quantitative palaeobiological research. The package is freely available and has three core aims: (1) to streamline data preparation and analyses; (2) to enhance code readability; and (3) to improve the reproducibility of results. To develop these aims, we assessed the analytical needs of the broader palaeobiological community using an online survey, in addition to incorporating our own experiences.
3. In this work, we first report the findings of the survey, which shaped the development of the package. Subsequently, we describe and demonstrate the functionality available in palaeoverse and provide usage examples. Finally, we discuss the resources we have made available for the community and the future plans for the broader palaeoverse project.
4. palaeoverse is the first community-driven R package in palaeobiology, developed with the intention of bringing palaeobiologists together to establish agreed standards for high-quality quantitative research. The package provides a user-friendly platform for preparing data for analysis with well-documented open-source code to enhance transparency. The functionality available in palaeoverse improves code reproducibility and accessibility, which is beneficial for both the review process and future research.

DOI

https://doi.org/10.31223/X5Z94Q

Subjects

Geology, Paleobiology, Paleontology

Keywords

Analytical Palaeobiology, Computational Palaeobiology, R Programming, Readable, Reusable, Reproducible

Dates

Published: 2022-10-26 08:06

Last Updated: 2023-04-18 09:37

License

CC BY Attribution 4.0 International

Additional Metadata

Conflict of interest statement:
None.

Data Availability:
The palaeoverse R package is hosted on CRAN (TBC) and is available on GitHub (https://github.com/palaeoverse-community/palaeoverse). All example datasets are bundled with the R package. All code is released under a GPL (>= 3) license.