This is a Preprint and has not been peer reviewed. The published version of this Preprint is available: https://doi.org/10.1111/2041-210X.14099. This is version 2 of this Preprint.
Abstract
1. The open-source programming language ‘R’ has become a standard tool in the palaeobiologist’s toolkit. Its popularity within the palaeobiology community continues to grow, with published articles increasingly citing the use of R and R packages. However, there is currently a lack of agreed standards for data preparation and of available frameworks to support the implementation of such standards. Consequently, data preparation workflows are often unclear and not reproducible, even when code is provided. Moreover, due to a lack of code accessibility and documentation, palaeobiologists are often forced to ‘reinvent the wheel’ to find solutions to issues already solved by other members of the community.
2. Here, we introduce palaeoverse, a community-driven R package to aid data preparation and exploration for quantitative palaeobiological research. The package is freely available and has three core aims: (1) streamline data preparation and analyses; (2) enhance code readability; and (3) improve reproducibility of results. To develop these aims, we assessed the analytical needs of the broader palaeobiological community using an online survey, in addition to incorporating our own experiences.
3. In this work, we first report the findings of the survey which shaped the development of the package. Subsequently, we describe and demonstrate the functionality available in palaeoverse and provide usage examples. Finally, we discuss the resources we have made available for the community and the future plans for the broader palaeoverse project.
4. palaeoverse is the first community-driven R package in palaeobiology, developed with the intention of bringing palaeobiologists together to establish agreed standards for high-quality quantitative research. The package provides a user-friendly platform for preparing data for analysis with well-documented open-source code to enhance transparency. The functionality available in palaeoverse improves code reproducibility and accessibility, which is beneficial for both the review process and future research.
DOI
https://doi.org/10.31223/X5Z94Q
Subjects
Geology, Paleobiology, Paleontology
Keywords
Analytical Palaeobiology, Computational Palaeobiology, R Programming, Readable, Reusable, Reproducible
Dates
Published: 2022-10-26 08:06
Last Updated: 2023-04-18 09:37
License
CC BY Attribution 4.0 International
Additional Metadata
Conflict of interest statement:
None.
Data Availability (Reason not available):
The palaeoverse R package is hosted on CRAN (TBC) and is available on GitHub (https://github.com/palaeoverse-community/palaeoverse). All example datasets are bundled with the R package. All code is released under a GPL (>= 3) license.