Causal and Predictive Analysis of Climate Change Using Granger Causality

Current climate simulation models provide valuable insights but are highly complex, with numerous parameters, which makes it difficult to use them to assess the causal impact of anthropogenic and natural factors on global temperatures. We applied multivariate Granger causality to investigate how combinations of forcings affect Earth's surface and ocean temperatures. Clear causal impact was found due to combinations of human factors including well-mixed greenhouse gases, tropospheric aerosols, ozone, and land use, while non-anthropogenic factors showed a non-trivial but lesser role.

20th century was largely caused by man-made factors rather than solely by natural ones.
Determination of whether climate change in the past decade has been caused by natural or anthropogenic factors is usually conducted using global climate models. Hegerl and Zwiers (2011) reviewed the use of global climate models in attribution studies. More recently, Hausfather et al. (2019) analyzed a number of climate models from prior research papers, exploring how their underlying models would have fared as predictors if fed continually with accurate data regarding relevant input factors. They found that many models would have been fairly accurate had they been applied year-on-year for prediction based on then-current data, but that more recent models generally would have been less accurate predictors than simpler models from the earlier climate science literature. This result may be attributed to the greater subtlety and complexity of more recent models, leading to greater overfitting.
Granger Causality is another method used to determine causality in climate time-series data. Pasini, Triacca, and Attanasio (2015) used Granger Causality to examine the link between increased aerial sulfates and global warming, and Smirnov and Mokhov (2009) concluded that the rise of global surface temperature can be explained by CO2, implying that anthropogenic factors may cause global warming. Moreover, McGraw and Barnes (2018) reported that Granger Causality analyses outperform lagged linear regression for causality detection in climate data.
Inspired by this prior work, we tested for climate causality using multivariate Granger causality, investigating how combinations of forcings affect Earth's surface and ocean temperatures. Clear causal impact was found due to combinations of factors including well-mixed greenhouse gases, tropospheric aerosols, ozone, and land use, while non-anthropogenic factors showed a non-trivial, but lesser, role.

Input data sources
We used data from the following sources:
- Forcing data: All data for global forcings, including well-mixed greenhouse gases (WMGHG), ozone, stratospheric aerosols, tropospheric aerosols, land use, orbital forcings and solar forcings, was downloaded from http://data.giss.nasa.gov on September 15, 2019. The data is the version from the CMIP5 Climate Simulations. The methodology for those forcings can be found in Hansen et al. (2005) for CMIP3 and Miller et al. (2014) for CMIP5. All data is for the years between (and including) 1850 and 2012.
- Temperature data: We have two sets of temperature data; one is for average surface and ocean temperature and the other is for average ocean temperature only. Note that ocean temperature is measured at the ocean surface only and not in the depths of the ocean. The data was retrieved from NOAA National Centers for Environmental Information, Climate at a Glance: Global Mapping, published on September 15th 2019 from https://www.ncdc.noaa.gov/cag/. The data was retrieved for years between (and including) 1880 and 2018.

Short description of the forcings
We considered three natural influences:
- Solar Irradiance (Solar): How much heat the Earth receives from the sun in the form of electromagnetic radiation.
- Stratospheric Aerosols (StratAer): The particles that exist in the stratosphere region of the Earth's atmosphere. They usually consist of a mixture of sulfuric acid and water and are usually created naturally. For example, volcanic eruptions account for a large part of the forcing that these aerosols add to the atmosphere during certain time periods.
- Orbital Forcings (Orbital): The effect on climate of changes in the tilt of the Earth's axis and in the shape of its orbit. Both of these factors change the total amount of sunlight reaching Earth. They might have been responsible for some ice ages.
And four anthropogenic (human-caused) influences:
- Well-Mixed Greenhouse Gases (WMGHG): The effect that greenhouse gases, such as CO2, CH4 and N2O, have on Earth's warming (like all other forcings, this one is estimated in Watts per square meter). Those gases mainly absorb and emit radiant energy and therefore affect the warming and cooling of the planet.
- Changes in Land Use (Land Use): Mainly the conversion of land from natural vegetation, such as forest, to farmland or pastureland.
- Tropospheric Aerosols: The influences, both directly and through cloud cover, of sulfates, nitrates, sea salts, black carbon, etc. We distinguish between direct effects (absorbing solar and terrestrial radiation, denoted by TropAerDir) and indirect effects on climate (through the formation of clouds, denoted by TropAerInd).
- Ozone (O3): Ozone occurs in the stratosphere and troposphere both naturally and through man-made production. Tropospheric ozone is usually caused through a combination of sunlight and anthropogenic emissions.
Miller et al. (2014) constructed figures 1 and 2, which display forcings and net forcings, respectively, of these factors.
We carefully managed both our data collection and causality testing steps to obtain the most accurate results we could. As cited in Energy education Climate forcing (2019), "It is important to note that it is difficult to measure these forcings, and thus forcings are not reflected perfectly." To check robustness, and since recent forcings and temperature estimates are more accurate than older ones, we split the dataset into two time periods: the entire period 1880 through 2012, and a shorter time period covering 1958 through 2012.
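The alignment of the two sources and the split into two analysis periods described above can be sketched as follows. This is a minimal illustration, assuming the series are held in pandas objects indexed by year; the column names and the random values are placeholders, not the NASA GISS or NOAA data.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the two sources (random values, NOT real data).
rng = np.random.default_rng(0)
forcing_years = np.arange(1850, 2013)  # forcings cover 1850..2012 inclusive
temp_years = np.arange(1880, 2019)     # temperatures cover 1880..2018 inclusive

forcings = pd.DataFrame(
    rng.normal(size=(forcing_years.size, 2)),
    index=pd.Index(forcing_years, name="year"),
    columns=["WMGHG", "Solar"],  # hypothetical column names
)
temperature = pd.Series(
    rng.normal(size=temp_years.size),
    index=pd.Index(temp_years, name="year"),
    name="temp_anomaly",  # hypothetical series name
)

# An inner join keeps only the years present in both sources.
merged = forcings.join(temperature, how="inner")

# The two analysis periods; label-based slicing includes both end years.
full_period = merged.loc[1880:2012]
recent_period = merged.loc[1958:2012]

print(full_period.index.min(), full_period.index.max(), len(full_period))
print(recent_period.index.min(), recent_period.index.max(), len(recent_period))
```

With yearly data the overlap of the two sources is 1880 through 2012, which is why the longer analysis period starts in 1880 even though the forcing series begin in 1850.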

Granger causality testing
Granger causality is a hypothesis test applied to two time series (or more time series for multivariate Granger causality) to determine whether one series aids in forecasting a different series. Proposed by Granger (1969), the test measures the ability to predict future values of a time series using prior values of another one, hence testing for "predictive causality".

Mathematical specification of single variate Granger causality
Given two time series variables x and y, we say that x Granger-causes y if predictions of the value of y based on its own past values and on the past values of x are better than predictions of y based only on its own past values. Assuming x and y are stationary processes, we begin with an autoregression on y:
$$y_t = \hat{y}_t + \epsilon_t, \qquad \hat{y}_t = a_0 + a_1 y_{t-1} + \cdots + a_m y_{t-m} \tag{1}$$
Here, $y_t$ represents the actual value of time series y at time t; $\hat{y}_t$ represents the predicted value of y at time t; $a_0, a_1, \ldots, a_m$ are constants for m time lags; and $\epsilon_t$ is an error term which is normally distributed with distribution $N(0, \xi)$. We next add the x variable to the equation:
$$y_t = a_0 + a_1 y_{t-1} + \cdots + a_m y_{t-m} + b_1 x_{t-1} + \cdots + b_m x_{t-m} + u_t \tag{2}$$
where $b_1, \ldots, b_m$ are constants for the m time lags and $u_t$ follows the normal distribution $N(0, \xi)$. We then retrieve the sums of squared residual errors from the first and second processes, respectively:
$$RSS_1 = \sum_t \epsilon_t^2, \qquad RSS_2 = \sum_t u_t^2$$
The improvement of model 2's prediction compared to that of model 1 can be found by testing for joint significance with the statistic
$$\omega = \frac{(RSS_1 - RSS_2)/m}{RSS_2/(T - k)}$$
where T is the number of observations and k is the number of explanatory variables. For equation 2, $k = 2m + 1$. In all cases, $\omega$ follows the Fisher distribution $F(m, T - k)$. We test the null hypothesis
$$H_0: b_1 = b_2 = \cdots = b_m = 0$$
If $\omega$ lies below the 95% critical value of this distribution, we do not reject $H_0$, in which case x does not Granger-cause y. Otherwise, we conclude that x does Granger-cause y.
We define multivariate Granger causality similarly, by adding additional variables and testing them against our y variable. Suppose we test w variables to determine whether or not they Granger-cause y. The model then looks like:
$$y_t = a_0 + a_1 y_{t-1} + \cdots + a_m y_{t-m} + b_{1,1} x_{1,t-1} + \cdots + b_{1,m} x_{1,t-m} + \cdots + b_{w,1} x_{w,t-1} + \cdots + b_{w,m} x_{w,t-m} + u^*_t \tag{3}$$
Analogously to our prior procedure we define
$$\omega^* = \frac{(RSS_1 - RSS_3)/m}{RSS_3/(T - k^*)}, \qquad RSS_3 = \sum_t (u^*_t)^2$$
where $k^* = wm + m + 1$. We then check the value against the Fisher distribution $F(m, T - k^*)$ to determine the significance of sets of variables.
The standard Granger causality tests we considered above include only the constant $a_0$, but additional tests can be performed. We can include both the constant $a_0$ and a trend via an added term $b \cdot t$; we can include a trend only or a constant only; or we can include neither trend nor constant. The model chosen depends on the time series and how they change over time. The goal is to satisfy the criterion that $\epsilon_t$ is independently and identically distributed (IID).
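The test described in this section can be sketched with ordinary least squares as below. This is an illustrative implementation, not the code used for the reported results: the function names, the `kind` codes for the four deterministic specifications, and the synthetic data are all assumptions. The sketch fits on the observations that remain after dropping the first m, and it uses the number of excluded lag coefficients as the numerator degrees of freedom, which equals m when a single series x is tested.

```python
import numpy as np
from scipy import stats


def lagged(series, m):
    """Columns series[t-1], ..., series[t-m] for t = m, ..., T-1."""
    T = len(series)
    return np.column_stack([series[m - j : T - j] for j in range(1, m + 1)])


def deterministic(n_obs, kind):
    """Deterministic terms: 'c' constant, 't' trend, 'ct' both, 'n' neither."""
    cols = []
    if "c" in kind:
        cols.append(np.ones(n_obs))
    if "t" in kind:
        cols.append(np.arange(1, n_obs + 1, dtype=float))
    return np.column_stack(cols) if cols else np.empty((n_obs, 0))


def rss(y, X):
    """Residual sum of squares of the least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_f_test(y, xs, m, kind="c"):
    """F-test that all m lags of every series in xs can be dropped."""
    y = np.asarray(y, dtype=float)
    xs = [np.asarray(x, dtype=float) for x in xs]
    target = y[m:]
    det = deterministic(len(target), kind)
    restricted = np.column_stack([det, lagged(y, m)])
    unrestricted = np.column_stack([restricted] + [lagged(x, m) for x in xs])
    q = m * len(xs)  # number of lag coefficients set to zero under H0
    dof = len(target) - unrestricted.shape[1]
    rss_r, rss_u = rss(target, restricted), rss(target, unrestricted)
    f_stat = ((rss_r - rss_u) / q) / (rss_u / dof)
    return f_stat, float(stats.f.sf(f_stat, q, dof))


# Synthetic example: x drives y with a one-step delay, noise is unrelated.
rng = np.random.default_rng(42)
T = 300
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.5)
noise = rng.normal(size=T)

f_xy, p_xy = granger_f_test(y, [x], m=3)
f_ny, p_ny = granger_f_test(y, [noise], m=3)
print(f"x -> y:     F = {f_xy:.2f}, p = {p_xy:.3g}")
print(f"noise -> y: F = {f_ny:.2f}, p = {p_ny:.3g}")
```

Passing several series in `xs` gives the multivariate variant, and `kind` switches between the constant-only, trend-only, constant-and-trend, and neither specifications.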

Granger causality results
After ensuring quality data, we tested all possible combinations of forcings for Granger causality against both combined ocean and surface temperatures as well as on ocean temperatures alone. In all models, three lags optimized the Akaike Information Criterion (AIC) values when tested against changes in ocean temperature, so we set m = 3 in all cases.
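The two steps in this paragraph, enumerating every combination of forcings and choosing the lag order by AIC, might look like the sketch below. The text does not state how AIC was computed, so the Gaussian least-squares form, the common estimation sample, the maximum lag of 6, and all function names are assumptions; the data are synthetic.

```python
from itertools import combinations

import numpy as np


def all_subsets(names):
    """Every non-empty subset of names, smallest sets first."""
    return [c for r in range(1, len(names) + 1) for c in combinations(names, r)]


def design(y, xs, m, start):
    """Constant plus m lags of y and of each x, for targets y[start:]."""
    T = len(y)
    cols = [np.ones(T - start)]
    for s in [y, *xs]:
        cols += [s[start - j : T - j] for j in range(1, m + 1)]
    return np.column_stack(cols)


def aic(y, xs, m, max_lag):
    """Gaussian AIC (up to an additive constant) on a common sample."""
    target = y[max_lag:]
    X = design(y, xs, m, max_lag)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    n, k = X.shape
    return n * np.log(resid @ resid / n) + 2 * k


def best_lag(y, xs, max_lag=6):
    """Lag order in 1..max_lag with the smallest AIC."""
    return min(range(1, max_lag + 1), key=lambda m: aic(y, xs, m, max_lag))


# Labels follow the forcing abbreviations used in the text.
natural = ["Solar", "StratAer", "Orbital"]
anthropogenic = ["WMGHG", "LandUse", "TropAerDir", "TropAerInd", "Ozone"]
print(len(all_subsets(natural)), len(all_subsets(anthropogenic)))

# Synthetic series in which x affects y with a two-step delay.
rng = np.random.default_rng(1)
T = 400
x = rng.normal(size=T)
y = np.zeros(T)
for t in range(2, T):
    y[t] = 0.4 * y[t - 1] + 0.9 * x[t - 2] + rng.normal(scale=0.5)
chosen = best_lag(y, [x])
print("AIC-selected lag:", chosen)
```

Every candidate lag is scored on the same observations (those after the first `max_lag`), so the AIC values are comparable across lag orders.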
Special attention was given to natural factors compared with anthropogenic ones. While Granger causality does not determine the exact strength of the signal, it does determine the significance of different sets of factors. Our Granger causality testing spanned multiple parameter dimensions: temperatures (joint ocean and surface versus ocean alone); time periods (1880-2012 versus 1958-2012); and models (constant only, trend only, both constant and trend, and neither constant nor trend). We display the results of our experiments in tables 1-8 below, with the results for sets of natural forcings shown in tables 1-4 and the results for sets of anthropogenic forcings shown in tables 5-8. In each table, we display p-values for multivariate Granger Causality test results upon specific sets of factors for each of the four models. P-values larger than 5% are displayed in white, between 2.5% and 5% in red, between 1% and 2.5% in orange, between 0.1% and 1% in yellow, and less than 0.1% in green.
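As a small illustration of the experimental grid and the color coding just described, the sketch below enumerates the settings and maps a p-value to its display color. The labels and the function name are illustrative, and the handling of p-values that fall exactly on a bin boundary is an assumption, since the text does not specify it.

```python
from itertools import product

GROUPS = ["natural", "anthropogenic"]
TEMPERATURES = ["surface and ocean", "ocean only"]
PERIODS = [(1880, 2012), (1958, 2012)]
MODELS = ["constant only", "trend only", "constant and trend", "neither"]

# One table per (forcing group, temperature series, period): 2 * 2 * 2 = 8.
tables = list(product(GROUPS, TEMPERATURES, PERIODS))
# Each factor set is tested under every (temperature, period, model) setting.
settings = list(product(TEMPERATURES, PERIODS, MODELS))


def significance_color(p):
    """Map a p-value to the color bin used in the results tables."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    if p < 0.001:
        return "green"
    if p < 0.01:
        return "yellow"
    if p < 0.025:
        return "orange"
    if p <= 0.05:  # assumption: exactly 5% counts as red
        return "red"
    return "white"


print(len(tables), len(settings))  # 8 16
print([significance_color(p) for p in (0.2, 0.03, 0.02, 0.005, 0.0001)])
# ['white', 'red', 'orange', 'yellow', 'green']
```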
The results differed somewhat depending on the model used. In a model with no constant and no trend, none of the natural factors showed significance either alone or in conjunction with other factors. With regard to man-made factors, the picture is a bit different. In the longer time period, from 1880 until 2012, several sets of factors display statistical significance related to surface and ocean temperatures, though fewer significant factors appeared with respect to ocean temperatures only. Only two sets, namely {WMGHG, Ozone} and {WMGHG, Ozone, TropAerInd}, were significant at the 2.5% significance level, though the number of sets of variables increased when considering a significance level of 5%. For the shorter time period, there were more [...]
For the model with a trend but no constant, we find that, during the longer time period, of the natural factors only orbital forcings were significant at the 2.5% level, and the set {Orbital, StratAer} at the 5% level. For the shorter time period, significance can only be found for joint surface and ocean temperature, where there is a 2.5% significance on {StratAer} and a 5% significance on the sets {Orbital, StratAer} and {Solar, StratAer}. When it comes to man-made factors and the longer time period, many sets of factors show statistical significance [...]
For the model containing a constant and no trend, during the full time period several significant natural causalities exist, all of which include orbital forcing as a factor. For the shorter time period, orbital forcings appear to play a somewhat diminished role, while the roles of other natural factors become more prominent. For man-made factors, almost all of the sets of factors display significant causalities on both surface and ocean temperatures as well as on ocean only temperatures. This holds true for both time spans.
We finally analyze the model with both a trend and a constant. For natural factors, the situation is similar to the case in which we include neither trend nor constant. There are no significant causalities at the 2.5% or 5% level in either time span. We obtain many sets of man-made factors that significantly cause surface and ocean temperature change over the longer time period, but for the ocean temperature data only the two sets {Ozone, TropAerInd} and {WMGHG, Ozone, TropAerInd} are significant at the 2.5% level, with one more, {Land Use, Ozone, TropAerInd}, at the 5% level. In the shorter time interval, we have a similar situation with numerous significant causalities for joint surface and ocean temperature. For ocean temperature alone, there is only one causality at the 2.5% level of significance: {WMGHG, Ozone, TropAerInd}. Overall, in this model we have no significant causalities for any of the natural sets of factors and significant causalities for a few man-made sets of factors, though those causalities are not necessarily confirmed when looking at different temperature datasets (ocean and joint ocean and surface temperatures).
When considering all the Granger causality models and tests, we obtain a better overview. We sometimes detect temperature causalities from some sets of natural factors, usually including orbital forcings. These causalities only appear in specific cases and do not seem to persist in all models. While orbital forcing has some small trend, it is very weak, and with the inclusion of other factors the signal becomes harder to detect. When it comes to man-made sets of factors, we have a different story. Some sets of factors are significant more often than others. {TropAerInd} is almost always significant in the longer time interval; the only exception is ocean temperature changes in the model including both trend and constant. In both time spans, the set {WMGHG, Ozone, TropAerInd} was found to always significantly cause temperature changes, be it ocean temperatures or joint ocean and surface temperatures, across all tested models. Other joint sets of factors found to cause significant temperature changes in almost all models and time frames include {WMGHG, Land Use, Ozone, TropAerInd} and {WMGHG, TropAerDir, Ozone, TropAerInd}, and to a lesser degree {WMGHG, Ozone}.
Table 6: Anthropogenic forcings results for surface and ocean temperature data for years [...]
Table 7: Anthropogenic forcings results for ocean temperature data for years 1880-2012
All the above mentioned sets of variables display significant causalities in both time frames and in both temperature datasets, while causalities of natural factors appear only sporadically and even then with a considerably lower level of confidence. We conclude that while there appear to be obvious causalities from anthropogenic factors, the same cannot be said for the natural ones.

Conclusions and Future Work
In an effort to better understand the role of man-made causes on global climate change, we applied Granger Causality tests to determine how different sets of variables affect either average surface and ocean temperatures or average ocean temperatures alone. While we were unable to obtain consistent significant causalities for any set of natural factors, we did find sets of anthropogenic factors that caused temperature change at statistically significant levels. The most likely cause was joint interference of well-mixed greenhouse gases, ozone changes and tropospheric aerosols. Changes in land use may have also added to global warming. While there might be indications that some natural factors, primarily orbital forcings, also contributed to global warming, such indications could not be determined with confidence using multivariate Granger causality. At a minimum, multivariate Granger causality confirmed that anthropogenic factors are very likely to causally impact temperature change within 3 years from their occurrence.
We are ultimately interested in a more thorough understanding of how forcings interact with each other. To achieve such understanding we plan to apply nonlinear tools, including evolutionary and symbolic regression methods. From some very preliminary investigations we undertook in this direction, there appears to be at least some overlap between the results found from such approaches and our Granger Causality analyses.