Reframing natural organic matter research through compositional data analysis

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Morimaru Kida, Julian Merder, Thorsten Dittmar, Vera Pawlowsky-Glahn, Juan Jose Egozcue

Abstract

Compositional data (CoDa) are prevalent in environmental research. They represent parts of a whole, such as percentages, proportions, and relative or absolute abundance. They are arrays of positive data that convey relevant information in the ratios between their components. Standard statistical techniques developed for real random observations often yield spurious results and are therefore unsuitable for CoDa, which have unique geometric properties. CoDa analysis is now widely acknowledged across various research fields, ranging from geoscience to social science, with a recent surge in popularity in microbial genomics. However, its adoption remains limited in natural organic matter (NOM) research, even though NOM data from key analytical tools such as mass spectrometry, fluorescence spectroscopy, and nuclear magnetic resonance spectroscopy are all compositional. Given the structural similarity between NOM data and high-throughput sequencing data, for which CoDa analysis has been successfully adopted, we argue that CoDa analysis should also be consistently integrated into NOM research to prevent analytical pitfalls and misleading inferences. A few pioneering studies have applied CoDa analysis to NOM data, and a wide array of useful open-source tools is already available. This paper walks step by step through the application of CoDa analysis to NOM research, using ultrahigh-resolution mass spectrometry data as an illustrative example. The goal of the study is to give the community an overview of CoDa analysis and practical guidance on how to use it.
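
The statement that CoDa carry their information in the ratios between components can be illustrated with a minimal sketch. It uses the centered log-ratio (clr) transformation, one commonly used log-ratio transformation in CoDa analysis. The abstract does not say which transformation the paper applies, so this choice is an assumption, and the peak intensities and function names below are hypothetical.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1 (the sum constraint)."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centered log-ratio: log of each part minus the mean of the logs,
    i.e. the log of each part divided by the geometric mean of all parts."""
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Hypothetical peak intensities of three molecular formulae in one sample.
raw = [200.0, 50.0, 750.0]

# The same sample measured with ten times the total signal:
# every ratio between parts is unchanged.
scaled = [10.0 * x for x in raw]

clr_raw = clr(closure(raw))
clr_scaled = clr(closure(scaled))

# Both samples map to the same clr coordinates, because clr depends
# only on the ratios between parts and not on the total.
same = all(math.isclose(a, b, abs_tol=1e-12) for a, b in zip(clr_raw, clr_scaled))
print("clr identical after rescaling:", same)
```

Only positive parts can be log-transformed, so zeros (for example, undetected peaks) would need separate treatment before a step like this.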

DOI

https://doi.org/10.31223/X51X7P

Subjects

Life Sciences

Keywords

CoDa, FT-ICR MS, NMR, Orbitrap mass spectrometry, EEM, PARAFAC, sum constraint

Dates

Published: 2025-10-13 16:51

Last Updated: 2025-10-13 16:51

License

CC-BY Attribution-NonCommercial 4.0 International

Additional Metadata

Conflict of interest statement:
None

Data Availability (Reason not available):
https://doi.org/10.6084/m9.figshare.30344623.v2