Technical Note: Explaining Variance of Pleistocene Climate Sensitivity – Not Peer Reviewed 1 Explaining Variance of Pleistocene Climate Sensitivity: Path, Epochal and CO2 dependence

Improved high-resolution paleo records of atmospheric carbon dioxide (CO2) concentrations and reconstructions of Earth’s surface temperature are available. We analyse one authoritative Pleistocene dataset to explore how the climate sensitivity parameter S varies under different system states, using linear regression of mean annual surface temperature changes against CO2 forcing changes. On the whole data set, S = 2.04K/Wm -2 and CO2 forcing explains 64% of the variance in temperature. Partitioning the data by path (glaciation / deglaciation); during deglaciation episodes, S = 2.34K/Wm, explaining 75% of the temperature variance; during glaciations, S = 1.59K/Wm explaining 48% of the temperature variance. Further partitioning into epochs before and after the Marine Isotope Stage 11 424 kaBP (thousand years before present); for deglaciation paths after 424 kaBP, S = 2.63 K/Wm explaining 83% of the variance. Partitioning into levels of CO2 concentration has much less explanatory power; 20% for concentrations above 210 ppmv and 2% for concentrations below 210 ppmv. These partitions conflate glaciation and deglaciation episodes and pre/post 424 kaBP epochs. Possible process-related explanations for the path-related differences are conjectured. The common assumption that changes in temperature are proportional to changes in CO2 forcing constrains the regression intercept to pass through the origin and inflates the regression error to the extent that the regression model becomes a worse predictor of temperature changes than simply predicting the population mean for all values of CO2 forcing. The consequences of this assumption are substantive in the context of estimating future temperature trajectories and should be carefully weighed.


Introduction
The climate sensitivity parameter S is somewhat loosely defined in the literature (Myhre et al 2013) as the change with respect to present in mean annual global surface temperature (T) per unit change in forcing with respect to present (F). S is sometimes estimated from data by linear regression: with intercept B and error e. F is [Wm 2 ] so that S is [K/(Wm 2 )]. The variance of the error e is estimated as (SS  SSR)/(N  df  1) where N is the number of observations, df is the number of parameters in the model (here always =1), SS is the sum square deviations from the population mean and SSR is the sum square deviations of the regression estimates from the population mean. SSR/SS is the fraction of variance of T explained by the regression model, termed R 2 . The lower is R 2 the more of the variation in T must be attributed to factors other than FCO 2 . If SSR = SS then the regression model explains all the variance in the data. If SSR > SS then the regression predicts T with a variance that is greater than the population variance; in other words, the model predictions are worse than simply predicting the population mean of T for all values of F.
It is widely believed that "… the change in surface temperature is directly proportional to the radiative forcing. Hence, this becomes the simplest way of quantifying the effect in a perturbation in greenhouse gas inventory" (Byrne and Goldblatt 2014). Others argue that the regression line "needs to pass through the origin to avoid any biases" (Köhler et al 2017). Note that T can be proportional to F only if the intercept B is zero. Assuming B = 0 forces the trend line to pass through the origin (F= 0, T= 0). This can inflate and skew the errors and can cause the regression model to be a worse predictor of T than simply predicting the mean of T for each value of F. This actually happens with the dataset under analysis (see Figure 1 and associated discussion). Different statistical packages have different interpretations of R 2 when B = 0 is stipulated, but none has the interpretation as fraction of explained variance. Indeed, if SS < SSR, then variance estimate of e as (SS  SSR)/(N  df  1) would be negative. In what follows, we refrain from stating R 2 values if B is set to zero. Of course, the placement of the origin effects the value of S if B = 0 is stipulated. For example, Snyder (2019) considers T relative to the average temperature over the last five thousand years, whereas Martinez-Boti et al (2015) define temperature change relative to pre-industrial temperature. If on the other hand the intercept is estimated this choice is immaterial. In any event, the consequences of stipulating B = 0 are substantial and should be carefully weighed.
Here, we utilize the dataset (Martinez-Boti et al 2015) for the Pleistocene (1096 records from 0.14 to 798.51 kaBP (thousands of years before present). A second Pleistocene dataset (Snyder, 2019) yields very similar results and is discussed briefly. In recent years, several researchers have studied paleoclimate data for evidence of "state dependence" in the climate sensitivity parameter (e.g. Meraner et al 2013). It has been observed that the linear approximation breaks down in the long tail of high climate sensitivity commonly seen in observational studies (Bloch-Johnson et al 2015). Transient behavior of climate sensitivity has been explored using an energy balance model combined with observational and modeling CMIP5 constraints (Goodwin 2018). Background state dependence and tipping points in Earth system sensitivity with millennial timescales (von der Heydt et al 2016) motivate the introduction of the more general "climate sensitivity parameter". Others find that climate sensitivity strongly depends on the climate background state, with significantly larger values attained during warm phases (Meraner et al 2013. Friedrich et al 2016. However, some authors voice concerns over the simple concepts underlying climate sensitivity and radiative forcing (Knutti et al 2010). Non-linear dependence of land ice albedo forcing is found (Köhler et al 2010), while non-linearity of CO 2 forcing is said to depend on the CO 2 data set. Global mean temperature and CO 2 diverge during intervals of pronounced land ice growth (Köhler et al 2018). The need to distinguish actuo-and paleoclimate sensitivity over different time scales is emphasized (Rohling et al 2018). Averaged glacial and interglacial climate sensitivities are estimated  using Earth system model simulations of the Last Glacial Cycle.
The present study applies standard regression analysis to study variations in the climate sensitivity parameter, using Pleistocene data from Martínez-Botí et al (2015). Rather than estimating non-linear and/or multivariate regression functions, we partition the data into periods of increasing versus periods of decreasing paths of CO 2 concentration corresponding to deglaciation versus glaciation respectively. Table 1 also looks at partitions into epochs before and after 424 kaBP and periods of low, intermediate and high CO 2 concentration. These partitions retain ample statistical power. Path dependence is the most important followed by epochal dependence. Dependence on background CO 2 concentration has low explanatory power but does interact with the other two.
To compute the fraction of explained variance it is necessary to estimate the intercept term. Comparison with regressions in which the intercept B is set to zero shows that the latter have higher values of the climate sensitivity parameter with much less variation over the partition elements. While this data may be subject to errors, random errors would depress the R 2 values. The wide variation of R 2 values across the partition elements argues against a strong random or systematic error across the entire data set.
Given the wide variations in S, projections of regression equations to estimate CO 2 concentrations out of sample cannot be considered predictions unless the conditions going forward are very similar to the conditions on which such regressions are based.

Analysis of Pleistocene data
A deterministic functional relation between changes in CO 2 concentrations and changes in the induced forcing (FCO 2 ) is inferred from radiative transfer codes, where 278 ppmv is taken as the pre-industrial concentration of CO 2 : FCO 2 (CO 2 ) = 5.3515  ln(CO 2 /278) [Wm 2 ]. (2) Note that the forcing change depends only on the ratio of two CO 2 concentrations and not on their actual values. This assumption is pervasive in the literature. Doubling CO 2 with respect to pre-industrial: (3) Regressing mean annual surface temperature (MAT in Martinez-Boti et al 2015) on FCO 2 gives the following equation which explains 64% of the variance in T: The climate sensitivity parameter S = 2.037[K/Wm 2 ]. The values of S and B in eqn. (4) are chosen to minimize the squared distance between the values of T and the linear trend.
To project the effect on temperature of doubling CO 2 above pre-industrial, based on the whole Pleistocene data, substitute FCO 2 = 3.71 in (4)  = +7.56˚K. This, of course, is incorrect: if we wish to constrain the intercept to be zero, we must find the line passing through the origin minimizing squared distance to T. In that case, S = 2.531K/Wm 2 , T(556) = +9.391˚K (Figure 1).
Eqn. (2) is just a positive affine transformation of ln(CO 2 ). If we regress T on ln(CO 2 ), we would also explain 64% of the variance and also find T(556) = +6.85˚K. The only difference would be that the units of the linear coefficient would be The latter dimensioning suggests physical agency; indeed, a GHG induced radiative imbalance at the top of the atmosphere causes warming according to the Stefan Boltzmann law. At the same time, a rise in temperature can raise atmospheric CO 2 concentrations through temperature mediated feedbacks as, for example, when higher ocean temperatures reduce CO 2 uptake in the oceans. We retain the familiar dimension of K/Wm 2 for the climate sensitivity parameter while recognizing that it reflects a choice of units rather than physical agency.
For the Pleistocene data of Martinez-Boti et al (2015), the mean T = 2.816˚K. Letting i index the 1096 data points, summing the squared deviations from the mean of the black regression line (B = 0) in Figure 1 gives SSR =  i (2.531 FCO 2 (i) +2.816) 2 = 2969 > 2942 =  i (T(i) + 2.816) 2 = SS. Using the black line as a predictor of T is a bit worse than predicting the population mean of T for each value of FCO 2 (i). For the red regression line (B estimated), SSR =  i (2.037 FCO 2 (i)  0.709 +2.816) 2 = 1896, which explains 1896/2942 = 0.644 (64%) of the variance of T.
Neglecting to clarify whether the intercept is inferred or stipulated invites confusion. With this dataset, setting B = 0 inflates the S values and suppresses the differences over different partition elements. Table 1 compares results with and without the intercept assumption. Figure 1: Regression of T on FCO 2 for all Pleistocene data; with (red) and without (black) intercept. Out of sample values for doubling pre-industrial CO 2 , T(556) are also shown. Figure 2 shows CO 2 as a function of age in kaBP, with local minima and maxima identified with blue resp red arrows. This enables us to distinguish episodes of increasing (deglaciation) and decreasing (glaciation) CO 2 concentrations. Note that increasing CO 2 episodes generally transpire more quickly than decreasing CO 2 episodes. Dividing the data into two subsets consisting of increasing or decreasing episodes allows us to determine whether the climate sensitivity parameter is CO 2path dependent. Figure 2 also suggests that the Earth's climate system changed around the inception of Marine Isotope Stage 11 in 424 kaBP, midway between a glacial maximum and a glacial minimum. Table 1 compares sensitivity in the periods before and after 424 kaBP.

Partitioning the data
The results show evidence of path dependence. Note that if the intercept is estimated the differences in the climate sensitivity parameter between glaciation and deglaciation (1.5885, 2.3352) are larger than if the intercept is stipulated to be zero (2.5051, 2.552). The same is true for the projected temperature at 522ppmv CO 2 : with the intercept estimated these are (4.72, 8.32) for glaciation and deglaciation but with intercept stipulated these are (9. 16, 9.47). The right panel of Figure 3 showing glaciation also illustrates how stipulating the intercept can lead to inferior predictions. Summing over the 583 data points in the right panel, the square differences between T and the prediction with intercept = 0 (red line) is 674.1, whereas this sum is (blue line) 478.7 when the intercept is estimated. Note also the difference in explanatory power in the two panels. During deglaciation the blue regression line explains three fourths of the variance in T whereas during glaciation less than half of the variance is explained. The interpretation is that during glaciation factors other than CO 2 account for the variation in T.     Table 1: Summary of data subset regressions. R 2 is the fraction of variance of T explained by the regression, S is the climate sensitivity parameter, B is the intercept, SE is the standard deviation of the difference between predictions and true values, and T(556) is the projected T corresponding to a doubling of the pre-industrial CO2 concentration (278 ppmv).
With the exception of "All" and "Pre 424" all S values are outside the 90% confidence bands of the other values within each group. The number of samples is in parentheses.

Summary and discussion
Extending the analysis of the Pleistocene climate data of Martinez-Boti et al (2015) and of Snyder (2019), one key finding is that imposing a zero intercept on the regression of mean annual surface temperature changes against CO 2 forcing changes tends to mask differences between the climate sensitivity parameter in different subsets of the Pleistocene data. In terms of variance explained, the zero intercept assumption produces inferior predictions relative to regressions in which estimates of intercept are exploited. Ultimately, this constraint prevents us from appreciating explanatory power when comparing climate sensitivity parameter estimates under different physical situations.
Explaining why the climate sensitivity parameter is higher during deglaciation than during glaciation is a challenge which deserves attention. One obvious feature is the fact that deglaciation generally transpires faster than glaciation. It may be that certain negative feedbacks with intermediate time scales get played out substantially during glaciation but are less effective during rapid deglaciation.
For example, carbon fertilization would draw down atmospheric CO 2 . During glaciation the retreat of plants would tend to retard the cooling and slow the glaciation process. During deglaciation the advance of plants would tend to retard the warming process. However, if the retreat of land and sea ice happened quickly, other changes, such as growth of the Hadley cells and desertification, might overwhelm the advance of plants and disable this negative feedback. Oceanic CO 2 outgassing during the last deglaciation has been proposed (Shao et al 2019) as a potential disequilibrium process influencing atmospheric CO 2 concentrations. Aerosols, and glacial aerosols specifically, can interact with clouds and influence radiative forcing; paleo-aerosol concentrations are likely to have varied with glaciation pathways. However, the lack of robust reconstructions of glacial aerosol forcing is a key source of uncertainty in paleo-based estimates of climate sensitivity (Friedrich, and Timmermann 2020). As a further possibility, Xie (2020) posits that non-uniform regional uptake of heat by oceans militates against equilibration with radiative forcing, with spatio-temporal variations in ocean state and currents affecting global climate sensitivity.
These are just some conceptual examples of how the interplay of process-related feedbacks with different time scales might alter climate sensitivity in different physical situations. The climate sensitivity parameter is higher after 424KaBP, than before. Such effects have been attributed to changes in orbital forcing, though a detailed explanation of why heightened forcing might raise climate sensitivity has not, to our knowledge, been found.
If the Earth's surface temperature response to CO 2 forcing on millennial time scales does indeed depend on the physical circumstances at the time of the forcing, predictions for the Earth's future response to heightened CO 2 forcing require knowledge of these complex non-linear interactions, an understanding of how they will evolve and on what timescales. Scrutiny and clarification of assumptions regarding the intercept issue and further analytical investigation with stochastic uncertainty techniques should provide additional insights into the properties of the climate sensitivity parameter and inform discussion of contingent implications.