Positive Matrix Factorization of Large Aerosol Mass Spectrometry Datasets Using Error-Weighted Randomized Hierarchical Alternating Least Squares

This is a Preprint and has not been peer reviewed. This is version 1 of this Preprint.

Authors

Benjamin Sapper, Daven Henze, Manjula Canagaratna, Harald Stark

Abstract

Weighted positive matrix factorization (PMF) has been used to find small sets of underlying factors in environmental data. However, as datasets have grown, rising computational costs have made traditional factorization methods impractical. In this paper, we present a new weighting method that dramatically decreases the computational cost of these traditional algorithms. We then apply this weighting method, together with the Randomized Hierarchical Alternating Least Squares (RHALS) algorithm, to a large environmental dataset and show that interpretable factors can be reproduced. The resulting algorithm achieves speedups by factors of 38, 67, and 634 over the Multiplicative Update (MU), deterministic Hierarchical Alternating Least Squares (HALS), and non-negative Alternating Least Squares (ALS) algorithms, respectively. We also investigate rotational ambiguity in the solution and present a simple "pulling" method to rotate a set of factors. This method is shown to find alternative solutions and, in some cases, to lower the weighted residual error of the algorithm.
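
For readers unfamiliar with the baseline algorithms named in the abstract, the following is a minimal sketch of standard deterministic HALS for non-negative matrix factorization, along with the usual error-weighted PMF residual Q. It is not the authors' error-weighted randomized variant, whose details are not given on this page. The function names `hals_nmf` and `weighted_q`, the parameter defaults, and the random initialization are all illustrative assumptions.

```python
import numpy as np


def hals_nmf(X, k, n_iter=200, eps=1e-12, seed=0):
    """Plain (unweighted, deterministic) HALS for X ~= W @ H with W, H >= 0.

    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(n_iter):
        # Update each row of H in turn, holding W and the other rows fixed.
        WtX = W.T @ X  # shape (k, n)
        WtW = W.T @ W  # shape (k, k)
        for j in range(k):
            step = (WtX[j] - WtW[j] @ H) / max(WtW[j, j], eps)
            H[j] = np.maximum(eps, H[j] + step)
        # Update each column of W in turn, holding H and the other columns fixed.
        XHt = X @ H.T  # shape (m, k)
        HHt = H @ H.T  # shape (k, k)
        for j in range(k):
            step = (XHt[:, j] - W @ HHt[:, j]) / max(HHt[j, j], eps)
            W[:, j] = np.maximum(eps, W[:, j] + step)
    return W, H


def weighted_q(X, W, H, sigma):
    """Error-weighted residual: sum over entries of ((X - W H) / sigma)^2."""
    return float(np.sum(((X - W @ H) / sigma) ** 2))
```

A typical call would be `W, H = hals_nmf(X, k=3)` followed by `weighted_q(X, W, H, sigma)`, where `sigma` holds the per-entry measurement uncertainties. Note that this sketch minimizes the unweighted residual; it only evaluates the weighted one afterwards.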

DOI

https://doi.org/10.31223/X5C94F

Subjects

Physical Sciences and Mathematics

Keywords

matrix factorization, algorithm, atmospheric science, positive matrix factorization, randomized algorithm

Dates

Published: 2022-11-17 06:51

Last Updated: 2022-11-17 11:51

License

CC BY Attribution 4.0 International
