An adaptive auto-reduction solver for speeding up integration of chemical 1 kinetics in atmospheric chemistry models: implementation and evaluation in 2 the Kinetic Pre-Processor (KPP) version 3.0.0

Abstract


Introduction
Modeling atmospheric chemistry is a grand computational challenge.Current global 3-D models of oxidantaerosol chemistry use chemical mechanisms that may involve hundreds of coupled chemical species with lifetimes ranging from less than a second to many years.Chemical evolution in such a mechanism is computed by solving a large, stiff system of coupled non-linear ordinary differential equations (ODEs) expressing the chemical kinetics of individual species.Implicit solvers are required to accommodate the coexistence of short and long time constants but are computationally expensive because of the need for repeated construction and inversion of the Jacobian matrix (Brasseur & Jacob, 2017).Chemical integration often dominates the overall computational cost of global 3-D atmospheric chemistry model simulations, even in massively parallel environments or using graphics processing units (GPUs) (Alvanos & Christoudias, 2017;Eastham et al., 2018;Zhuang et al., 2020;Dawson et al., 2022).
Considerable research has gone into devising algorithms to speed up chemistry solvers.A common strategy is to split the mechanism species by time scales in order to decrease the stiffness of the system (Young & Boris, 1977;Gong & Cho, 1993;Djouad & Sportisse, 2002), but this tends to be mechanism-specific and is difficult to apply in global models because of the wide range of conditions that may be experienced.
Wholesale reduction of the mechanism, such as in the Super-Fast mechanism used in some climate models (Brown-Steiner et al., 2018), may lead to large errors (Kelp et al., 2022) and incorrect chemical responses to perturbations.Machine learning methods can in principle speed up chemical integration by orders of magnitude but have met with little success because of the large dimensionality of the problem resulting in error growth (Keller & Evans, 2019;Kelp et al., 2020Kelp et al., , 2022)).
A promising approach for global models is to recognize that the full complexity of the mechanism is not needed everywhere.For example, reactions involving volatile organic compounds (VOCs) and their shortlived products typically account for much of the complexity but may be unimportant outside of continental boundary layers where they are emitted.Jacobson (1995) thus applied separate mechanisms in a global model for the urban boundary layer, the global troposphere, and the stratosphere.However, this approach can result in errors and inefficiencies by not accounting for the interactions at chemical boundaries between domains (Rastigejev et al., 2007) and not allowing for a continuum of chemical regimes from source regions to the remote atmosphere.Santillana et al. (2010) developed an adaptive mechanism reduction method in which the size of the mechanism is adjusted at each grid cell and time step, classifying species as fast (coupled) or slow (uncoupled) on the basis of their total production and loss rates.However, the overhead involved in local definition of the reduced mechanism offset the computational gains.Sander et al. (2019) and Shen et al. (2020Shen et al. ( , 2022) ) improved the method by pre-compiling a limited ensemble of chemical submechanisms.Sub-mechanisms were selected based on local conditions at each timestep, thus avoiding the overhead required to define sub-mechanisms.Shen et al. achieved a 30%-50% reduction in computational cost compared to the full parent mechanism in a global simulation of the troposphere and stratosphere, but the selection of sub-mechanisms had to be customized to the parent mechanism.
Here we develop a mechanism-agnostic, ready-to-use method for adaptive auto-reduction of any chemical mechanism and implement it as an option in a new version 3.0.0 of the Kinetic PreProcessor (KPP).KPP, originally developed by Damian et al. (2002) and Sandu and Sander (2006), is a software tool that automatically generates code to efficiently integrate chemical mechanisms.KPP takes in a set of humanreadable input files describing the mechanism species and reactions and generates Fortran 90, C, or MATLAB code solving the corresponding system of ODEs using one of a suite of integration methods.
KPP is used to generate the chemistry solver source code in several atmospheric chemistry models including MECCA within MESSy (Sander et al., 2019;Jöckel et al., 2010), WRF-Chem (Grell et al., 2005;Fast et al., 2006), the forward and adjoint GEOS-Chem models (Henze et al., 2007), and the adjoint for the CMAQ model (Hakami et al., 2007;Zhao et al., 2020).Our new version KPP 3.0.0incorporates several performance and diagnostic updates over the previous version KPP 2.1 (Sandu & Sander, 2006), in addition to the adaptive solver option presented below.

Adaptive solver for chemical kinetics
Atmospheric chemistry models alternate chemical integration and transport calculations through operator splitting (Brasseur & Jacob, 2017), The chemistry solver is called for a time interval of length Δ, referred to as the external time step, and returns a vector of updated concentrations C at the end of that time step to be operated on by transport.The kinetic integration of the mechanism by the chemistry solver is done over internal time steps δ Δ to reach the desired accuracy.
The chemistry solver integrates a system of N coupled nonlinear first-order ODEs of the form where N is the number of coupled species in the mechanism, C is the vector of species concentrations of dimension N, and P i (C) and L i (C) are the production and loss rates of species i that depend on the concentrations of other species in the mechanism through the reaction rate expressions.
The external time step in a global model is typically ~10 3 s, but many species in the mechanism have lifetimes ~1 s or shorter.An explicit solver would require internal time steps shorter than the lifetime of the shortest-lived species in order to achieve stability, but this is not computationally practical.An implicit solver is required.The simplest such solver is the first-order backward Euler method, which approximates the solution to (1) over the internal time step δ with   δ      δ δ 2 where we define the net source term    as a vector of functions (P i -L i ).Solving (2) for the unknown quantity   δ using the Newton-Raphson method requires the repeated construction and inversion of the N×N Jacobian matrix J: The construction and inversion of this Jacobian matrix is computationally expensive.Higher-order solvers generally used in atmospheric chemistry models, such as Rosenbrock (Rosenbrock, 1963;Hairer & Wanner, 1991;Sandu et al., 1997) or Gear (Jacobson & Turco, 1994), similarly require the repeated construction and inversion of the Jacobian over internal time steps.The Jacobian is typically ~90% sparse allowing for efficient sparse-matrix inversion methods (Sandu et al., 1997), so that the overall cost of construction and inverting the Jacobian scales as ~ N rather than a higher power.
A way to reduce the dimensionality N of the problem is to split the mechanism into "fast" species for which the coupled implicit solution is necessary and "slow" species that may be solved independently over the external time step using a fast explicit method.Young and Boris (1977) and Gong and Cho (2003) separate species into fast and slow based on their lifetimes compared with the integration time step.However, the separation results in non-conservation of mass because the reaction rates are not computed consistently.
This may not be of consequence in a regional model (as used in those applications) where the domain is ventilated by the boundary conditions, so that errors do not accumulate, but it is more problematic in a global model.Santillana et al. (2010) separated instead "fast" and "slow" species on the basis of their production and loss rates, with the slow species having sufficiently low rates that their influences on other species would be negligible and the non-conservation of mass would be inconsequential.This is more relevant for global models where concentrations and rates of VOCs become very small outside of their source regions.Shen et al. (2020Shen et al. ( , 2022) ) used the same approach to partition species between fast and slow.
Here we also follow the partitioning method of Santillana et al. (2010).At the beginning of each external time step we calculate P and L and classify species  as fast if max  ,  δ and as slow otherwise, where δ is a user-selected partitioning threshold.Fast species are assigned to the coupled implicit solver as a subset (sub-mechanism) of the full mechanism, while the evolution of slow species over the external time step is calculated using an explicit first-order approximation with first-order loss rate constant k i (t 0 ) = L i (t 0 )/C i (t 0 ): In the Santillana et al. (2010) implementation, the Jacobian had to be reconstructed locally at every external time step for the identified subset of fast species and that incurred large overhead, canceling the benefit of the method.Here we avoid the overhead by taking advantage of the pre-computed Jacobian matrix terms for the full mechanism in the KPP solver to simply remove rows and columns corresponding to the slow species.This is explained in Section 3.2 and is the key new development to make the method computationally practical.
The partitioning threshold δ is set prior to integration and is tuned to balance performance and accuracy.
Previous work (Santillana et al., 2010;Shen et al, 2020Shen et al, , 2022) ) considered that in a typical tropospheric chemistry mechanism, much of the coupling is associated directly or indirectly with cycling of the hydroxyl radical (OH).OH has a daytime concentration of ~10 6 molecules cm -3 and a lifetime ~1 s, so its production and loss rates are ~10 6 molecules cm -3 s -1 .A species with production and loss rates that are several orders of magnitude smaller would not be expected to contribute significantly to the coupling.They found δ ~10 10 molecules cm -3 s -1 to be adequate after testing for performance and accuracy.
Here we include the option to dynamically define δ instead of specifying a uniform value over the domain, to account for rates varying with local conditions.This is done by identifying a target species which is considered central to the mechanism and scaling its production and loss rates to define a local partitioning threshold: where  and  are the local production and loss rates of the target species, and α ≪ 1 is a user-selected coefficient that depends on the target species but is otherwise fixed for the model domain.
For example, a model may use OH as target species for daytime and NO 2 for nighttime.When using OH as a target species and with max  ,  ~10 molecules cm -3 s -1 , a value α ~10 10 would correspond to the criteria for δ used by Santillana et al. (2010).But max  ,  can in fact vary over orders of magnitude depending on pressure, UV flux, and other factors, and our local specification of δ accounts for this variability.
We also include two new options in the algorithm.First is to force individual species to remain in the coupled implicit solver even if max  ,  ≪ δ.As we will see, this may be helpful for inorganic halogen species that cycle between radical and non-radical forms across sunrise/sunset.Second is to include an 'append' functionality in the algorithm so that a species initially diagnosed as slow at the beginning of the external time step can be transferred into the coupled implicit solver if it becomes fast over the course of the integration.This increases accuracy with minimum overhead.

KPP 3.0.0 overview
The Kinetic Pre-Processor (KPP, Damian et al., 2002;Sandu & Sander, 2006) generates code for solving the chemical kinetics for a given chemical mechanism defined by a list of species, reactions, and rate constants.It is designed for speed by exploiting sparse matrix algebra and pre-computation of terms.We have made several improvements to KPP relative to the previous version 2.1 (Sandu & Sander, 2006), including the adaptive solver option.We present these improvements as KPP version 3.0.0,available for download from https://github.com/KineticPreProcessor/KPP/(DOI: 10.5281/zenodo.6828026)with detailed documentation at https://kpp.readthedocs.io.
KPP takes in as input one or more text files, with an example shown in Figure 1.The input is not necessarily in one single file, and may be split into multiple files for readability.The files describe the chemical mechanism, the choice of the numerical solver (e.g., Rosenbrock), target language (e.g., Fortran 90, C, or MATLAB), floating point type (single-or double-precision), production and loss diagnostics for selected chemical families (optional), and whether to use the adaptive solver option (optional).Some inputs are not part of the KPP code generation and are instead left for users to adjust at runtime, including convergence criteria (absolute and relative tolerance), numerical order (e.g., order of the Rosenbrock solver), and adaptive solver options, in order to enable the user to experiment with different thresholds for performance and accuracy.Reaction rate constants can be specified as expressions in the target language (in this case Fortran 90), here showing calls to functions calculating photolysis and Arrhenius rate expressions.
Based on the specifications in the KPP input file(s), KPP creates files in the target language containing a description of the system of ODEs (number of species, reactions, and a numbered list of species) code to calculate the time derivatives, Jacobian matrix, and solution by back-substitution, along with a copy of the numerical solver (such as Rosenbrock) and supporting routines for sparse linear algebra.This set of files can be either run standalone as a box model or can be included in a 3-D model to update concentrations locally over external time steps.
Auto-reduction of the mechanism with the adaptive solver described in Section 2 is an option in KPP 3.0.0using the Fortran 90 language, enabled in the configuration file by #AUTOREDUCE on and specifying the corresponding integrator (e.g., #INTEGRATOR rosenbrock_autoreduce).This sets up the capability for the user to reduce the mechanism locally through specification of the partitioning threshold δ between fast and slow species, listing any species for which this partitioning should not be applied.These specifications are done at runtime for flexibility.Even when the auto-reduction solver is used, mechanism auto-reduction can be disabled at runtime, defaulting to the behavior of the original integrator.A test case box model for auto-reduction is included as part of KPP 3.0.0documentation.
In addition to the option for adaptive mechanism auto-reduction, several additional features and improvements were added to KPP 3.0.0.These include: a. Redeployment of KPP source code, continuous integration, and documentation for community development.KPP 3.0.0source code has been redeployed on GitHub for community development.
The GitHub repository incorporates continuous integration (CI) tests which automatically compile the KPP source code to build and run combinations of sample chemistry mechanisms and integrators into box models at every code revision.This helps to ensure that new features and updates added to KPP do not break existing functionality.The documentation has also been relocated to https://kpp.readthedocs.iowhere it is automatically built with each code revision on GitHub.c. Addition of new solvers.Several solvers have been added as options for integrating chemical kinetics including VODE (Brown et al., 1989), SDIRK (Hairer & Wanner, 1991)  Thread safety for generated code.The code generated is now thread safe, so that calls for updating rate constants and running the integrator can be placed in an OpenMP parallel loop for parallelization.
 Improved expressions for vector and array functions.Several functions for basic vector and array operations originally used reference BLAS (Basic Linear Algebra Subprograms) implementations.These have been replaced with Fortran 90 expressions to allow for compilers to better optimize the code.
f. Miscellaneous code improvements in KPP.The C source code of KPP has been improved so that no compiler warnings are generated.A more consistent memory allocation helps to avoid buffer overflow problems.The KPP language is now parsed by bison instead of yacc.

Adaptive solver implementation in KPP 3.0.0
We implemented the adaptive solver as an option within KPP's Fortran 90 version of the Rosenbrock solver using sparse matrix algebra.KPP is computationally efficient because the functions to compute the time derivatives for each species and the Jacobian matrix (expressed in terms of reaction rates and species concentrations), along with all sparse matrix algebra routines, are pre-generated and use fixed indices to access species vectors and matrices.This means that the problem size and memory space is fixed at compile time, avoiding expensive memory allocation operations.However, this also yields a fixed problem structure that is difficult to manipulate.One major source of overhead in locally defined sub-mechanisms as in   The separation between fast and slow species is controlled by the initial conditions at the beginning of the external time step, but an optional "append" function is added to account for an initially slow species becoming fast over the course of the internal time steps.This function appends new species to the fast submechanism and adjusts the logical control vectors if these species are initially partitioned as slow but their production or loss rate exceed the partitioning threshold over the course of the internal time stepping.
Diagnosing this has little overhead, because the production and loss rates of all species are already computed at every internal time step.
We used a box model integration of the GEOS-Chem chemical mechanism (described below) to analyze the overhead of the adaptive solver implemented within KPP.By forcing the adaptive solver routines to run with a threshold of 0, we determined that the KPP overhead added by the auto-reduction is 10-16%, depending on the initial condition provided to the box model.The main source of overhead is the copying of data between the full and reduced sub-mechanism (Figure 2), where the worst-case scenario is when all species are partitioned as fast and all species' data need to be copied between the full to the "reduced" data structures.Profiling tests show that other steps such as the partitioning of species between fast and slow, or the first-order approximation for slow species, contribute negligible overhead.The chemical mechanism includes comprehensive oxidant-aerosol chemistry in the troposphere and the stratosphere with 291 chemical species and 913 reactions.Recent updates to the mechanism include Cl-Br-I tropospheric halogen chemistry (Wang et al., 2021), isoprene chemistry (Bates & Jacob, 2019), aromatic chemistry (Bates et al., 2021), hydroxymethanesulfonate cloud chemistry (Moch et al., 2020), and NO y cloud and aerosol chemistry (Holmes et al., 2019).Heterogeneous sulfur chemistry that was previously simulated with a separate module was brought into KPP in version 13.4.0 with the addition of new rate functions.
GEOS-Chem has been structured to interact with the KPP-generated solver code through the FlexChem interface which prepares data for the solver code, executes the code, and retrieves concentrations at the end of the time step.FlexChem allows GEOS-Chem to use modules outside of KPP for computing reaction rates, for example interfacing with Fast-JX for photolysis rates (Bian & Prather, 2002;Mao et al., 2010).
FlexChem also includes a derived type object, State_Het, which passes state variables from GEOS-Chem model for calculating heterogeneous chemistry reaction rates including cloud liquid water content, aerosol size distribution, pH, and/or alkalinity.This derived type object holds common intermediate quantities necessary for heterogeneous reaction rate computations, such as aerosol area, avoiding repeated computation and memory use.FlexChem also allows GEOS-Chem users to modify the chemical mechanism input into KPP without modifying the GEOS-Chem source code.
We evaluate the accuracy and computational performance of the adaptive solver in KPP version 3.0.0using global GEOS-Chem simulations at 2°×2.5° resolution with 72 vertical levels extending up to 0.1 hPa.The We select as the standard configuration of the adaptive solver within the GEOS-Chem model a dynamically defined threshold with OH as target species during daytime, and NO 2 during nighttime, with coefficients  5 10 and  1 10 .We do not force any species to remain in the implicit KPP solver as fast and we do not use the append functionality.We find that we achieve a net 32% reduction in integration time in this standard configuration.Figure 3 shows the percentage of species retained in the fast sub-mechanism using the adaptive solver's standard configuration.Over 60% are retained in surface air over land, reflecting VOC sources, whereas only 10-50% are needed over the ocean.The fraction of retained species decreases with altitude, and fewer than 40% are needed in the stratosphere where VOC chemistry is mainly limited to methane.Fewer species are needed at night than in daytime.These results are consistent with Shen et al. (2020).
We evaluate the accuracy of the adaptive solver (AS) relative to the full mechanism solver over the global GEOS-Chem domain using the relative root mean squared (RRMS) error metric.For each species i, the RRMS error is: where C i,j,full , C i,j,AS are the concentrations of species i in grid box j for simulations without and with the adaptive solver.The RRMS is computed over the ensemble N i of ordered grid boxes that account for 99% of the total mass of species i in the boundary layer (surface to PBL height from MERRA2), free troposphere (boundary layer height to tropopause), and stratosphere, respectively, and where C i,j,full is greater than 10 molecules cm -3 .
Figure 4(a) shows the mean errors at the end of a 1-year simulation for species in different chemical categories (Table S1) and Figure 4(b)-(d) shows the time evolution of errors over the 1-year simulation for all species in the standard configuration of the adaptive solver within the GEOS-Chem model.The mean error for most categories is below 1% within the boundary layer and the free troposphere.There is no error growth in the troposphere but there is some in the stratosphere.Figure 5 shows the geographical distribution of errors for ozone, OH, and NO 2 at the surface and 500 hPa for the end of the 1-year simulation.Errors are generally lower than 1% except for OH at high latitudes and NO 2 over the Southern Ocean, where errors are up to 3% for OH and 6~10% for NO 2 .Errors over land for these species are minimal.
The largest errors in Figure 4 are found for inorganic halogen radicals and their reservoirs.Halogens cycle through species with lifetimes spanning several orders of magnitude, and include several highly reactive photochemical forms.As a result, partitioning between fast and slow species at sunrise/sunset leads to large errors and mass balance issues associated with the use of the first-order approximation (equation ( 4)).
Halogen errors are compounded due to their cycling through gas and aerosol phases.These problems were previously identified by Shen et al. (2020Shen et al. ( , 2022)).Halogen errors compounded with long residence times are the cause of the slow error growth for other species in the stratosphere.The impact is much less in the troposphere.
Our results in Figure 4 indicate that halogen species should be kept in the implicit solver as fast species if the primary interest of the simulation is the stratosphere.Keeping the halogen radicals and their reservoir species as fast prevent large errors for halogen species in the stratosphere, in addition to avoiding error growth and spikes in errors as shown in Figure 4(d).Similarly, atomic oxygen (O) and HO 2 radicals are the cause of high sulfate errors in the stratosphere (exceeding 10% above 85 hPa), and may be kept as fast to reduce this error.The performance improvement of the adaptive mechanism is then 23%, as compared to 32% in the standard configuration.Such ad hoc adjustments to the fast/slow species partitioning would be mechanism-specific and can be experimented with by users but we find in GEOS-Chem that they are not needed for standard applications focusing on oxidant-aerosol tropospheric chemistry.
We find in this application that the append functionality (allowing species to switch from slow to fast over internal time steps) does not provide significant error reduction, in particular for the halogen species, and degrades the performance improvement of the adaptive mechanism to 24% instead of 32%.We retain it as an option in the adaptive solver code as it may be helpful for longer external time steps (here 20 minutes) or other chemical mechanisms.S1).Panels (b)-(d) show the time evolution of RRMS errors for all species in the boundary layer, free troposphere, and stratosphere, respectively, with colored lines for species of particular interest.
The adaptive solver uses a dynamically defined threshold (equation ( 5)) with target species OH in daytime (α 5 10 ) and NO 2 at night (α 1 10 ), and does not force any species below that threshold to remain as fast.

Conclusions
We presented an updated version of the Kinetic Pre-Processor (KPP 3.0.0) to integrate stiff chemical mechanism kinetics.KPP was originally designed for flexibility and speed.KPP 3.0.0features several improvements for performance, diagnostics, choice of solvers, and code openness.It includes an adaptive solver capability for mechanism auto-reduction where and when the full mechanism is not needed.
The adaptive solver performs auto-reduction of the chemical mechanism locally and on-the-fly at runtime, by comparing the local production and loss rates of each species with a partitioning threshold (δ).Species with production and loss rates higher than the threshold are considered fast and are solved as a coupled submechanism within KPP, while other species are considered slow and solved individually by an explicit method.Previous application of this adaptive solver method suffered from large overhead due to the need for local reconstruction of the reduced Jacobian matrix and the associated memory allocation and deallocation.We solved this problem here by using pre-computed Jacobian terms for the full mechanism with a mapping operation to crop rows and columns corresponding to the slow species without changing the memory allocation.
KPP 3.0.0features additional improvements for performance, diagnostics, versatility, and openness.
Improved performance includes more efficient calculation of reaction rates from the KPP rate functions and thread safety for parallelization.New diagnostics include individual reaction rates, production and loss rates for chemical families, and stoichiometric numbers.Improved versatility includes expanded choice of chemical solvers.KPP 3.0 is now hosted on GitHub (https://github.com/KineticPreProcessor/KPP) to enable community access, development, and testing.
We evaluated the adaptive solver implemented in KPP 3.0.0 in a 1-year simulation with the global 3-D GEOS-Chem atmospheric chemistry model including 291 species in the full mechanism for oxidant-aerosol chemistry in the troposphere and stratosphere.Results show that a 32% performance improvement in the solver can be achieved with a target error of 1% for key species in the troposphere.Errors in the stratosphere can be larger, driven by halogen chemistry.Lower errors especially in halogen species can be achieved by keeping these species within the fast sub-mechanism but reduces the performance improvement to 23% and is mainly beneficial in the stratosphere.Our standard configuration of KPP and the adaptive solver within the GEOS-Chem model is mechanism-independent and can be readily applied to other models using KPP.
The release of KPP 3.0.0introduces improvements in development infrastructure, diagnostics, and performance, particularly in Fortran 90 applications.However, one of the strengths of the KPP software is the capability to generate code for different programming languages.Development directions for future versions include (1) adding support for modern languages such as Python and Julia; (2) refactoring of the generated code to avoid global data structures for easier parallelization; (3) streamlining inputs and outputs of all integrators for consistency; (4) supporting the adaptive solver option in other integrators and programming languages; and (5) Improve interaction and compatibility with the Master Chemical Mechanism (http://mcm.york.ac.uk).These improvements will allow KPP to better serve the community as a versatile tool for solving chemical kinetics.

Figure 1 .
Figure 1.Example KPP input file.The KPP input file includes options for code generation, adaptive autoreduction, diagnostics for production and loss rates of chemical families, and a list of species and reactions.
b. Diagnostic improvements.Diagnostic features have been added to KPP 3.0.0: Production and loss rates for chemical families.KPP 3.0.0allows for the definition of families of chemical species for computing production and loss for that family, ignoring interconversion reactions within the family.Families are defined in the #FAMILIES section of the KPP input file. Stoichiometric numbers.The stoichiometric numbers of all reactions in the mechanism are now available in the CalcStoichNum subroutine in the KPP-generated code.This feature is used to calculate the importance of chemical species in a mechanism for the skeletal mechanism reduction in Sander et al., 2019. Individual reaction rates and time derivatives.The reaction rates and time derivatives are now available as optional outputs from the KPP-generated code.
Santillana et al. (2010) is associated with repeatedly re-allocating and de-allocating memory to accommodate changing problem sizes for each sub-mechanism(Shen et al., 2020).Our adaptive solver implementation in KPP 3.0.0uses a mapping operation to project the full mechanism into sub-mechanisms, thus reusing the same memory space to avoid expensive resizing operations.

Figure 2 .
Figure 2. Mechanism auto-reduction in KPP.Panel (a) shows the data structure of the sparse Jacobian data within KPP for a sample 3-species mechanism.If species B is diagnosed as slow, then the corresponding rows and columns of the Jacobian are no longer calculated (panel (b)), and the indices pointing to the sparse Jacobian data (LU_IROW, LU_ICOL) are adjusted to remove the slow species (cLU_IROW, cLU_ICOL) through a mapping array (JVS_MAP) while preserving the sparse matrix in row-compressed form.The result is a reduced mechanism with the same data structure as the original one, but with smaller dimensions.

Figure 2
Figure2(a) shows the sparse Jacobian data as stored within KPP.In this example mechanism, there are 3 species (NVAR) and 8 non-zero entries in the Jacobian (LU_NONZERO) of the full mechanism.The row and column indices of these 8 non-zero entries in Jacobian matrix are correspondingly specified in LU_IROW, LU_ICOL in row-compressed form.At the beginning of every external time step, the production and loss rates of each species are calculated and compared to the partitioning threshold to separate species into fast and slow.Figure2(b) shows an example where species B is slow.The entries in the Jacobian corresponding to B no longer need to be computed, and a mapping operation is performed: JVS_MAP corresponds to the non-zero Jacobian matrix entries still present in the sub-mechanism consisting of fast species.The smaller sub-mechanism can now be described by 2 species (rNVAR) and 4 non-zero entries in the reduced Jacobian (cNONZERO) with indices described by cLU_IROW and cLU_ICOL.The data structure of the smaller submechanism is identical to the full mechanism, and the same routines are used to solve it, without the need to generate extra code, or resizing memory.
model is driven by the Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2) meteorological fields from the NASA Global Modeling and Assimilation Office (GMAO).The external time step for the chemistry solver is 20 minutes.All simulations are conducted on the same single-node hardware with 24 Intel Cascade Lake physical cores (Intel(R) Xeon(TM) Platinum 8268 CPUs Manuscript submitted to Journal of Advances in Modeling Earth Systems 14 with a base clock speed of 2.90GHz, no hyper-threading logical cores), 100 GB of RAM, and a highperformance Lustre parallel file system.The model was compiled using Intel(R) Fortran Compiler (ifort) version 2021.2.0.

Figure 3 .
Figure 3. Percentage of GEOS-Chem species retained in the fast sub-mechanism when using the adaptive solver.The full GEOS-Chem mechanism has 291 species to describe tropospheric and stratospheric oxidant-aerosol chemistry.Only a fraction of species is retained as fast in the KPP solver, while the other slow species are solved individually using equation (4).Results are shown for a snapshot in time on July 1, 2014, 12:00 UTC at different altitudes.The adaptive solver uses a dynamically defined threshold (equation (5)) with target species OH in daytime (α 5 10 ) and NO 2 at night (α 1 10 ).

Figure 4 .
Figure 4. Accuracy of the adaptive solver in a 1-year GEOS-Chem simulation.Panel (a) shows the mean RRMS errors for species in different categories at the end of the 1-year simulation starting on July 1, 2014.The categories are as defined in the standard GEOS-Chem benchmarking output diagnostics (Table

Figure 5 .
Figure 5. Accuracy of the adaptive solver for ozone, OH, and NO 2 global distributions.The Figure shows the errors relative to the full solver for the daily mean concentrations on July 1, 2015, the last day of the 1-year simulation.The relative error can be up to 3% at high latitudes for OH and up to 10% over the Southern Ocean for surface NO 2 .
Rate law functions for three-body reactions using the formulas proposed by JPL (https://jpldataeval.jpl.nasa.gov)and IUPAC (https://iupac.aeris-data.fr/)have been added to the built-in rate laws in KPP 3.0.0.Rate law functions are not limited to those built in KPP and can be added by including extra source code files in the KPP input.Previous inputs to KPP used both single-and double-precision numbers.KPP input files and code now do not mix number precision to avoid conversion, which loses precision and costs computational time.Previously, KPP called subroutines to update reaction rate constants at every internal time step, but this is computationally expensive and may not be needed in most model applications, because the variables affecting rate constants are not updated between internal time steps in these models.An optional switch has been added so rate constants can be updated only at the beginning of the external time step.Rate law functions have been split to avoid computation of unnecessary terms.For example, previously a single function with Arrhenius temperature dependence was used for all reactions:  , ,   * exp / * 300/ .However, certain reactions may have  0 or  0, but in this case the expressions 300/ or exp 0 are still computed, leading to the waste of CPU cycles.Separate rate functions where coefficients lead to constants, such as  ,   * exp / have been added to skip computations that , 3-stage Runge-Kutta, and forward and backward Euler methods.d.Addition of new rate law functions.e. Miscellaneous performance improvements in Fortran 90.KPP 3.0.0optimizes for Fortran 90 performance by applying several guidelines in coding, particularly in reaction rate calculations that are computed repeatedly in loops:  Unifying number precision. Switch to control update of reaction rate constants. Optimized rate law functions.can be pre-evaluated to constants, leading to a 44% performance improvement of reaction rate computations in a full-chemistry GEOS-Chem model run. Avoiding conditionals and optional arguments.Conditional clauses such as IF, ELSE, and SELECT CASE, and testing for optional arguments add significant computational cost if called thousands of times.If a conditional clause or optional argument is present in a frequently called subroutine, the subroutine is split into different functions for each case.