This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Reframing natural organic matter research through compositional data analysis
Authors
Abstract
Compositional data (CoDa) are prevalent in environmental research. They represent parts of a whole, such as percentages, proportions, and relative or absolute abundances. They are arrays of positive data that convey relevant information in the ratios between their components. Standard statistical techniques developed for real random observations often yield spurious results and are therefore unsuitable for CoDa, which have unique geometric properties. CoDa analysis is now widely acknowledged across various research fields, ranging from geoscience to social science, with a recent surge in popularity in microbial genomics. However, its adoption remains limited in natural organic matter (NOM) research, even though NOM data from key analytical tools such as mass spectrometry, fluorescence spectroscopy, and nuclear magnetic resonance spectroscopy are all compositional. Given the structural similarity between NOM data and high-throughput sequencing data, for which CoDa analysis has been successfully adopted, we argue that CoDa analysis should also be consistently integrated into NOM research to prevent analytical pitfalls and misleading inferences. A few pioneering studies have applied CoDa analysis to NOM data, and a wide array of useful open-source tools is already available. This paper discusses, step by step, the application of CoDa analysis to NOM research, using ultrahigh-resolution mass spectrometry data as an illustrative example. The goal of the study is to provide the community with an overview of CoDa analysis and with guidance on how to use it in practice.
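As a minimal illustration of the statement that the information in compositional data sits in the ratios between components, the sketch below closes two rows of positive values to proportions and applies the centered log-ratio (clr) transform, one of the standard log-ratio transforms in CoDa analysis. The intensity values are invented, and the function names `closure` and `clr` are chosen here for illustration; none of this is taken from the paper's own workflow.

```python
import numpy as np


def closure(x):
    """Rescale each row of positive parts so that it sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def clr(x):
    """Centered log-ratio: log of each part minus the row mean of the logs
    (equivalently, log of each part over the row's geometric mean)."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        # Zeros need separate treatment (e.g. replacement) before log-ratios.
        raise ValueError("clr requires strictly positive parts")
    log_x = np.log(x)
    return log_x - log_x.mean(axis=-1, keepdims=True)


# Hypothetical peak intensities for two samples (assumed values).
# The second row is the first row scaled by 0.5, so both rows
# have identical ratios between their parts.
raw = np.array([[120.0, 30.0, 50.0],
                [60.0, 15.0, 25.0]])

comp = closure(raw)  # proportions; each row sums to 1
z = clr(raw)         # log-ratio coordinates; each row sums to 0

print(comp)
print(z)
```

Because the two rows differ only by a constant factor, they close to the same proportions and map to the same clr coordinates, and applying `clr` to `raw` or to `comp` gives the same result. The total carries no information once the ratios are fixed.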
DOI
https://doi.org/10.31223/X51X7P
Subjects
Life Sciences
Keywords
CoDa, FT-ICR MS, NMR, Orbitrap mass spectrometry, EEM, PARAFAC, sum constraint
Dates
Published: 2025-10-13 16:51
Last Updated: 2025-10-13 16:51
License
CC-BY Attribution-NonCommercial 4.0 International
Additional Metadata
Conflict of interest statement:
None
Data Availability (Reason not available):
https://doi.org/10.6084/m9.figshare.30344623.v2