An interpreted language implementation of the Vaganov-Shashkin tree-ring proxy system model

We describe the implementation of the Vaganov-Shashkin tree-ring growth model (VSM) in MATLAB. VSM, originally written in Fortran, mimics subdaily and daily resolution processes of cambial growth as a function of soil moisture, air temperature, and insolation, with environmental forcing modeled as the principle of limiting factors. The re-implementation in a high level interpreted language, while sacriﬁcing speed, provides opportunities to systematically evaluate model parameters, generate large ensembles of simulated tree-ring chronologies, and embed proxy system modeling within data assimilation approaches to climate reconstruction. We provide a versioned code repository and examples of model applications which permit process-level understanding of tree ring width variations in response to environmental variations and boundary conditions.


Introduction
Interpretations of climate influences on tree rings and the application of these relationships to the reconstruction of past climate typically use empirical statistical models calibrated from the overlapping periods of observed climate data and the tree-ring proxy measurements (Fritts et al., 1971;Hughes, 2011).There are many strengths to this approach to dendroclimatology: these methods are simple to use and interpret, they are usually and sufficiently linear (Hughes, 2002), and they require no a priori knowledge of the biological response of any given species to climate in any particular region, and as discoverable models, can be applied regardless of species, location, or climate regime.Many thousands of tree-ring chronologies have a biologically reasonable association between local monthly or seasonal climate variability and tree growth (e.g.Meko et al., 1993;Hughes, 2002;Breitenmoser et al., 2014;D'Arrigo et al., 2014;St. George, 2014;Zhao et al., 2019) that can be used to estimate past climate variability.Nevertheless, the empirical statistical approach has its limitations.Most models of the association between ring width and climate that are used for paleoclimate reconstructions assume a univariate, linear, and stationary relationship (e.g.Cook, 1987;Hughes, 2002Hughes, , 2011)).However, any of these assumptions might be violated if there is a significant non-climatic or secondary environmental control on tree-ring formation, if the relationship displays threshold or nonlinear behavior, or if the association between tree growth and climate changes over time (Vaganov et al., 1999;Anchukaitis et al., 2006;Evans et al., 2006) .
A complementary approach to empirical statistical methods is the use of forward models that simulate the process of tree-ring formation as a function of multivariate and potentially nonlinear responses to climate.Such a model can also permit better temporal resolution, as cellular-scale processes that ultimately give a tree-ring proxy its characteristics can be simulated as they respond to daily weather and other environmental influences and not constrained to monthly statistical associations.Formalized as a proxy systems model (Evans et al., 2013), they can be inverted or used in a data assimilation framework to reconstruct multiple climate variables.
These prior applications of the model have used computer code written in the FORTRAN language (Evans et al., 2006).Another version of the model has been coded in Pascal (Shishov et al., 2016).Both of these are compiled languages, where the program source code is first translated into machine-specific instructions in an executable file.The primary advantage of this is speed -the computational overhead of preparing the executable file from the source code is incurred only once and compiled programs tend to be faster than interpreted languages, where the source code is parsed and executed during while the program runs.MATLAB, R, and Python are all examples of interpreted languages in broad use in the earth and environmental sciences that execute program code directly (or using a 'Just In Time' approach) and do not require prior compilation.While the primary disadvantage of this is that execution is slower than compiled code, among the benefits are that programs are platform-independent and readily modified and integrated within other code bases, workflows, or environments.Unlike R or Python, MATLAB requires an individual or institutional license.Nevertheless, MATLAB is widely used in scientific and engineering fields, including paleoclimatology, and broadly available in universities and industry.
Here, we introduce an interpreted language version of the full Vaganov-Shashkin cambial growth model (VSM) of tree-ring formation in MATLAB.Our core implementation of VSM also works in the free Octave software.We briefly describe the model itself and its execution.
Three practical applications that demonstrate the utility of our interpreted language version of the model program are demonstrated for Pinus longaeva in California (USA), Tsuga canadensis in southern New York (USA), and Larix cajanderi in northern Yakutia (Russia).The model code is freely available to use and can be adopted for additional development, platform-migration, and integration in related applications.

Model Description
Complete details on development of the model and observations that support environmental control on xylogenesis are available in Vaganov et al. (2006) and Vaganov et al. (2011).The original FORTRAN code included three related modules: (1) a module that calculates environmental conditions and determines a daily relative growth rate; (2) a module that simulates cambial activity based on the daily growth rate and ultimately determined the number cells in an annual ring, (3) a cell size module that determines the characteristics of the individual cells in the ring and can be compared directly to tracheidograms and measures of quantitative wood anatomy.Because of the paucity of cellular-level measurements, nearly all applications of the Vaganov-Shashkin model thus far have focused on the first two modules, primarily to investigate the climate controls on the width of the annual rings.
Our implementation of the Vaganov-Shashkin model (VSM) therefore consists of the first two modules, or blocks.The Growth (or Environmental) Block calculates daily climate and environmental conditions including temperature, solar irradiance, snow depth, soil moisture, and evapotranspiration and determines a daily relative growth rate between 0 and 1 based on those conditions.The Cambial Block then uses this growth rate to simulate the rate and timing of growth and division of cells in the cambium.In this way, the kinetics of xylem formation are explicitly modeled as a function of climate variability modified by environmental and cambial processes.Our MATLAB version is identical to the original FORTRAN code, with two exceptions, also modified by Shishov et al. (2016): we impose a maximum depth of soil thaw in permafrost environments no greater than the rooting depth itself.We also include additional error catching to prevent unrealistic environmental parameters.In MATLAB the model is implemented in double precision as opposed to single precision, which in some instances can create non-trivial differences in model output between MATLAB and FORTRAN versions.

Growth (Environmental) Block
Daily growth rates are calculated by comparing daily (t) temperature (T ) and soil moisture (W) to piecewise linear growth functions (Figure 1, inset).Soil moisture (in units of v/v) is updated daily by the model as a function of precipitation, snowmelt, evaporation (as a function of temperature), and runoff (Thornthwaite and Mather, 1955).Solar irradiance is calculated from the latitude corresponding to the tree-ring site or the origin of the temperature and precipitation input data.Four parameters define the shape of the trapezoidal growth functions a minimum (where G(t) = 0), lower and upper optimal bounds (where G(t) = 1), and a maximum (where G(t) = 0).Between the minimum (or maximum) and the lower (or upper) bounds of the optimal values for the climate parameter (temperature, sunlight, or soil moisture), growth rates will be between 0 and 1.Relative growth rates are calculated for soil moisture (gW(t)), temperature (gT (t)), and sunlight (gE(t)).The determination of the overall daily growth rate G(t) is calculated as: Because of the minimization term and the piecewise approximation of the nonlinear growth function, the model behaves stoichiometrically and rates of cambial growth and division are therefore controlled by the most limiting factor (Fritts, 1966(Fritts, , 1976) at a daily resolution.

Cambial Block
The Cambial Block uses the output G(t) from the Growth Block to determine the rate at which cambial cells grow and divide (Figure 1).Each cell in the Cambial Block is characterized by two variables at each daily step: its position ( j) in the cellular file and its diameter.The growth rate G(t) calculated in the prior block is used to derive a specific growth rate, V( j, t), for each cell based on its position.For cambial cells, diameter increases until a maximum size when division occurs, or until the cell loses the ability to divide as its growth rate falls below a minimum threshold for the cell's position in the radial file.Cells that lose the ability to divide pass out of the cambium and complete the cell cycle.Daily cellular growth rates below a critical minimum threshold send the cambium into dormancy.The cells in the cambium at the end of one simulated growing season will therefore be those which first grow and divide in the subsequent year, and can influence the cambial dynamics and tree-ring structure of the following year.Activity in the cambium is initiated each year when the sum of temperatures above a certain threshold over a specified period of time (growing degree days) reaches a critical threshold.VSM therefore integrates the essential features of cambial dynamics as described above: Annual xylem cell production is related to the number of cells in the cambial zone, the size of which varies over the course of the year in response to environmental variability.Specific cellular growth rates are positional and depend on the distance of the simulated cell from the cambial initial.Ring width is determined in VSM by the number of cells in each annual ring (Figure 2; Gregory and Wilson, 1968;Camarero et al., 1998;Vaganov et al., 2006Vaganov et al., , 2011;;Martin-Benito et al., 2017;Zhang et al., 2018) and reported as normalized (index) values by dividing the number of cells in each ring by the average number of annual cells over the simulation.

Input Data, Parameters, and Model Output
The model uses daily precipitation and temperature from meteorological stations, gridded data, or climate model output as required input data.The 28 primary model parameters are based on empirical and experimental data, whose selection is discussed in detail by Vaganov et al. (2006).Different parameter sets may be applied for different environments (Vaganov et al., 2006;Evans et al., 2006;Anchukaitis et al., 2006;Shi et al., 2008;Vaganov et al., 2011), but in practice we have found that model output is sensitive to relatively few of the parameters (Anchukaitis et al., 2006;Vaganov et al., 2011) and that the simulations are dominated by the climate variability and the parameters that specify the thresholds of the piecewise growth functions.Model output includes the normalized tree-ring width chronology, the overall and component simulated growth rates, and environmental variables.The model does not simulate additional biological or ecological influences on patterns of tree-ring formation, including those caused by tree age and geometry, carbon storage, canopy and root activity, or stand-level competition and disturbance.
The simulations can therefore be considered as perfectly detrended tree-ring chronologies.

Pinus longaeva in the White Mountains, California
Bristlecone pines (Pinus longaeva) in eastern California and Nevada are the longest living non-clonal organisms on Earth, likely achieving ages in excess of 5000 years (Schulman, 1958;Currey, 1965;Hallman et al., 2006).The climate signal embedded in the rings of these ancient trees is complicated, however, empirically estimated as a mix of temperature and precipitation sensitivity as a function of elevation and position on the landscape (LaMarche, 1974b,a;Hughes andFunkhouser, 1998, 2003;Salzer et al., 2009;Bunn et al., 2011;Tran et al., 2017;Bunn et al., 2018).Here we use the daily climate data from National Weather Service Cooperative Network weather station 049632 (WHITE MTN 1, 1955to 1977, 37.50 o N, 118.18 o W, 3094 m elevation) to simulate bristlecone growth at the Methuselah Walk site (Hughes and Graumlich, 1996;Salzer et al., 2009).We modify the temperature data for the difference between the elevation of the station and chronology location using a simple lapse rate correction.Bunn et al. (2018) also recently used VSM with a higher elevation composite meteorological record to investigate landscapescale topographic influences on the climate response of high elevation bristlecone pines in the White Mountains.
We use VSM here to demonstrate how the model can be run iteratively to investigate the sensitivity of the model to parameter choices, specifically the rate of soil water drainage from the simulated soil column.Anchukaitis et al. (2006) found this parameter was important for correctly capturing interannual growth variability at tree-ring sites in the southeastern United States, but that soil type alone provided only a weak constraint on the parameter value itself.We run the model iteratively using the White Mountains parameter set developed by Vaganov et al. (2006) and a varying coefficient of drainage (dimensionless) from 0.001 to 0.02 by increments of 0.0001.This generates 191 simulated chronologies which we then compare against the actual Methuselah Walk chronology (Figure 3a).Correlations between the ensemble members and the actual chronology range from r = 0.37 (p = 0.09) to r = 0.79 (p < 0.001), with a broad range of drainage values yielding significant (p < 0.05) correlations between simulated and actual chronologies (Figure 3b,c).
We can also use the model output to examine the environment controls on ring width formation (Figure 4).The best simulation (Figure 4a) shows that simulated mean annual snow depths peak in April and decline through the middle of June and that this snowmelt recharges soil moisture until a maximum in June (Figure 4b).Soil moisture then declines through the remainder of the summer and autumn as rising temperatures lead to a maximum in evapotranspiration and minimum soil moisture values are reached in September (Figure 4b).These environmental patterns are reflected in the components of the daily growth rate as well.Starting in July, growth rates due to soil moisture begin to decline and by early August are below the growth rate due to the direct influence of temperature (Figure 4c).Growth limitation due to soil moisture continues until early September, when overall growth rates decline with the arrival of shorter days, lower temperatures, and accumulating snowpack.The model output therefore confirms that the moisture-sensitivity of bristlecone pines at the lower elevation Methuselah Walk site (LaMarche, 1974b,a;Hughes and Graumlich, 1996;Salzer et al., 2009;Bunn et al., 2018) is therefore due to summer soil moisture deficits driven by progressive drying following the end of snowmelt, limited summer precipitation, and high rates of growing season evapotranspiration at these semi-arid sites.

Tsuga canadensis at Shawangunk Ridge, New York
We previously investigated (Vaganov et al., 2011) the ability of the VSM to reproduce the annual growth patterns and multivariate climate response of eastern hemlock (Tsuga canadensis) trees growing on a talus slope in the Shawangunk Mountains near Mohonk Lake in the Hudson Valley of New York state (Cook and Jacoby, 1977;Cook and Jacoby Jr, 1979;Cook et al., 2010;Cook and Pederson, 2011;Cook, 2014).We demonstrated that our simulation captured the temporal variability in ring width at this steep and well-drained site, as well as mimicking the multivariate climate response -a positive response to spring temperature and summer precipitation, and negative response to summer temperature for a broad range of the soil moisture drainage rate parameter (Vaganov et al., 2011;Cook and Pederson, 2011).Here, we return to this site to evaluate two additional parameters, the minimum temperature for growth (T minimum ) and the lower end of range of optimal temperatures (T lower optimal ), which define the rising portion of the piecewise growth function (Figure 1, inset).We use a Latin Hypercube sampling design (Stein, 1987) to generate a set of 1000 coupled parameter values drawn from uniform distributions U(0, 10) and U(11, 20) for these temperature growth parameters, respectively.We use the long continuous meteorological data from the nearby Mohonk Mountain House (Cooperative Station 305426; Cook et al., 2010) to simulate growth.We then evaluated the correlation between the ensemble of simulations and the actual Mohonk hemlock tree-ring chronology (Cook and Jacoby, 1977) during their period of overlap, 1925 to 1973.Use of an interpreted language here permits us to easily use an iterative approach to sample from the parameter value space, run VSM, generate a large ensemble of simulations, evaluate the resulting simulated tree-ring chronologies, and plot and analyze the results and the detailed model output.
Consistent with our previous findings (Vaganov et al., 2011), we can successfully simulate the Mohonk chronology (r = 0.60, p < 0.01) and capture the mixed temperature and summer precipitation signal in the actual tree-ring chronology (Figure 5a).Interestingly, significant correlations between 1000 simulation ensemble members and Mohonk Lake tree-ring width are observed for a wide range of minimum and optimal temperature parameters (Figure 5b).Minimum temperature even in the wide range between 0 and 10C still result in a positive response to March or April temperatures that mirrors that of the actual chronology (Figure 5c).Parameters where the minimum and lower bound optimal temperature are both high have the lowest correlations with actual tree-ring widths.This appears to be due to a reduction in sensitivity in these runs to temperature in the spring, as growth both starts later in the year and the shallow slope of the rising limb of the growth function reduces the magnitude of the model's response to temperature until soil moisture exerts control on growth in the summer.Both simulated and real chronologies for Mohonk Lake have a negative response to May temperatures, which in model simulations corresponds to a decline in soil moisture that makes it the most limiting growth factor (Figure 5c).The model simulations indicate that peak summer temperatures are sufficient to drive growth rates due to temperature beyond the upper optimal bounds and onto the declining limb of the piecewise growth function; however, decreased soil moisture remains even more limiting for growth during the summer, and drives a positive response to May through July rainfall.

Larch in northern Yakutia, Russia
Large volcanic eruptions can cause anomalously cold summer temperatures for a year or more following an event (Robock, 2000).Climate models and climate reconstructions, however, disagree on the magnitude and duration of this cooling over the last millennium (Ammann et al., 2007;D'Arrigo et al., 2013).Mann et al. (2012) adopted a simplified form of the Vaganov-Shashkin model and used it to suggest that the cause of this discrepancy was the existence of undetected missing rings in temperature-sensitive chronologies.Anchukaitis et al. (2012) showed that the conclusions of Mann et al. (2012) were based on using unrealistic model parameter choices, in particular a choice for temperature below which growth cannot occur of 10 o C, which is much higher than observed in nature (∼5 o C; Anchukaitis et al. (2012) and references therein).Here, we use VSM to simulate a high latitude temperature-sensitive tree-ring Larix cajanderi chronology from the Indigirka region (northern Yakutia, Russia) (Hughes et al., 1999;Kirdyanov et al., 2003;Sidorova et al., 2005;Guillet et al., 2017) as a demonstration of the influence of parameter selection on modeling tree-ring width in response to temperature changes.We use daily meteorological data from the Global Historical Climatology Network weather station at Chokurdakh, Russia (RSM00021946, 70.60 o N, 147.86 o E, 40m elevation; Vaganov et al. (2006); Evans et al. (2006)).We run the model iteratively with a range of minimum growth temperatures from 0 o C to 10 o C in 1 o C increments and compare the simulated chronology against the actual tree-ring width chronology from Indigirka (Guillet et al., 2017).We also record the number of years with no simulated growth under these conditions.
The model overall produces statistically significant (p < 0.05) simulations for minimum growth temperature values up to 8 o C (Figure 6a,b).However, for a minimum growth temperature parameter of 5 o C and higher, the model produces one or more years without growth in the chronology (Figure 6c) -an entirely missing ring.Although there can be individual locallyabsent rings in the individual trees comprising the Indigirka chronology, there are no years for which every ring in every tree is missing, evidence that these parameter values are inappropriate for accurately simulating the actual chronology.Above a minimum temperature parameter of 6 o C, multiple non-growth years degrade the correlation between the simulated and actual chronologies, and for a minimum growth temperature parameter above 8 o C the frequency of missing growth years causes the relationship between simulated chronology to become nonsignificant (p > 0.05, Figure 6b).A similar result was found by Anchukaitis et al. (2012) and here demonstrates the application of our MATLAB version of VSM for evaluating and identifying realistic and appropriate model parameterization.More generally, sampling over reasonable parameter space and observing the range of simulations as we have done here may be a way to capture the range of stochastic growth variations -including locally absent rings -in a forest stand.We note that the mismatch here in simulated vs. observed tree-ring width in 1991 could also arise from heightened uncertainty in weather observations associated with the collapse of the Soviet Union (Jones and Moberg, 2003;Dell et al., 2014).

Summary
We have described here an interpreted language version of the Vaganov-Shashkin model (VSM) in MATLAB.In addition to the applications described above, the code can now be modified for additional functionality.For instance, the model might be modified to accept measured photosynthetically active radiation or soil moisture from observations or models in place of the simple internal calculations of these metrics.VSM could also be deployed as a proxy system model in support of data assimilation approaches to climate reconstruction (Evans et al., 2013;Hakim et al., 2016).Integration of tree-ring modeling with forest growth simulation could potentially improve tree growth, forest dynamics and carbon models (e.g.Mina et al., 2016;Evans et al., 2016;Babst et al., 2018).Addition of a cell growth module could produce cell anatomical simulations that could be compared to emerging observations (c.f.Fonti et al., 2010;Cuny et al., 2015;von Arx et al., 2016;Ziaco et al., 2016).Finally, the model may also be ported to other open source languages, as has been done for the monthly resolution 'lite' version of the Vaganov-Shashkin model (VSL;Tolwinski-Ward et al., 2011, 2013).2011).Daily growth rates due to the environment are determined by comparing daily temperature and soil moisture (calculated from precipitation, transpiration, and soil drainage) to piecewise linear approximations of parabolic growth functions (see inset) in the Growth (Environmental) Block.This growth rate is then used in the Cambial Block to calculate the cellular growth rate V( j, G(t)), which is a function of this environmental growth rate and the position of the cell in the radial file.Each cell is permitted to be dormant, differentiate, grow, or divide on an intraday time interval.When a non-differentiated cell reaches a critical size, it enters and completes the mitotic cycle, continuing its subsequent growth at a constant, environmentally independent growth rate until division occurs, resulting in two cells, each half the size of the original mother cell.Once differentiated, cells can no longer divide.

Figure 1 :
Figure1: Vaganov-Shashkin growth and cambial model block processes fromVaganov et al. (2011).Daily growth rates due to the environment are determined by comparing daily temperature and soil moisture (calculated from precipitation, transpiration, and soil drainage) to piecewise linear approximations of parabolic growth functions (see inset) in the Growth (Environmental) Block.This growth rate is then used in the Cambial Block to calculate the cellular growth rate V( j, G(t)), which is a function of this environmental growth rate and the position of the cell in the radial file.Each cell is permitted to be dormant, differentiate, grow, or divide on an intraday time interval.When a non-differentiated cell reaches a critical size, it enters and completes the mitotic cycle, continuing its subsequent growth at a constant, environmentally independent growth rate until division occurs, resulting in two cells, each half the size of the original mother cell.Once differentiated, cells can no longer divide.Figure used with permission from Springer Nature.

FigureFigure 2 :Figure 3 :Figure 4 :
Figure1: Vaganov-Shashkin growth and cambial model block processes fromVaganov et al. (2011).Daily growth rates due to the environment are determined by comparing daily temperature and soil moisture (calculated from precipitation, transpiration, and soil drainage) to piecewise linear approximations of parabolic growth functions (see inset) in the Growth (Environmental) Block.This growth rate is then used in the Cambial Block to calculate the cellular growth rate V( j, G(t)), which is a function of this environmental growth rate and the position of the cell in the radial file.Each cell is permitted to be dormant, differentiate, grow, or divide on an intraday time interval.When a non-differentiated cell reaches a critical size, it enters and completes the mitotic cycle, continuing its subsequent growth at a constant, environmentally independent growth rate until division occurs, resulting in two cells, each half the size of the original mother cell.Once differentiated, cells can no longer divide.Figure used with permission from Springer Nature.

Figure 5 :Figure 6 :
Figure5: Simulations of the Mohonk Lake Tsuga canadensis tree-ring width chronology at Shawangunk Ridge, New York.(A) Ensemble simulations including the best match (r = 0.60, p < 0.01) with the observed Mohonk chronology(Cook and Jacoby, 1977).(B) Correlations (colored circles) between simulation ensemble members at the real chronology as a function of the minimum temperature and lower optimal temperature used for the piecewise growth function.(C) Mean daily growth rates due to temperature (g T (t)), soil moisture (g W (t)), and overall (G(t)).Light colored lines for temperature and overall growth rates are the individual ensemble members, while the dark heavy lines are for the simulation with the highest correlation to the actual chronology.