Globally resolved surface temperatures since the Last Glacial Maximum

Climate changes across the past 24,000 years provide key insights into Earth system responses to external forcing. Climate model simulations 1,2 and proxy data 3-8 have independently allowed for study of this crucial interval; however, they have at times yielded disparate conclusions. Here, we leverage both types of information using paleoclimate data assimilation 9,10 to produce the first proxy-constrained, full-field reanalysis of surface temperature change spanning the Last Glacial Maximum to present at 200-year resolution. We demonstrate that temperature variability across the past 24 thousand years was linked to two primary climatic mechanisms: radiative forcing from ice sheets and greenhouse gases; and a superposition of changes in the ocean overturning circulation and seasonal insolation. In contrast with previous proxy-based reconstructions 6,7, our results show that global mean temperature has slightly but steadily warmed, by ~0.5 °C, since the early Holocene (around 9 thousand years ago). When compared with recent temperature changes 11, our reanalysis indicates that both the rate and magnitude of modern warming are unusual relative to the changes of the past 24 thousand years.

The interval of time spanning the Last Glacial Maximum (LGM; 21-18 thousand years ago, ka) to the preindustrial era represents the most recent large-scale reorganization of the climate system, over which the Earth rapidly transitioned out of a cold, glaciated state with vast Northern Hemisphere ice sheets into a warm interglacial. Constraining the evolution of global surface temperatures during this critical time period provides an excellent opportunity to better understand the mechanisms of large-scale climate change, including Earth system interactions and responses to various forcings (for example, greenhouse gases, albedo/ice-sheet and orbital changes).
A number of prior studies have sought to characterize the global surface temperature evolution from the LGM to present 3-7. Of particular note, Shakun et al. 3 and Marcott et al. 6 established a global mean surface temperature (GMST) estimate spanning the deglacial and Holocene periods using ~80 marine and terrestrial temperature proxies (hereafter, the Shakun-Marcott curve; SMC). However, subsequent comparisons of SMC to other global temperature reconstructions and transient LGM-to-present model simulations revealed discrepancies surrounding the timing, magnitude, and rapidity of deglacial warming and of millennial-scale cooling events 2,4. One of the most prominent differences between SMC and climate model simulations is the direction of global temperature change across the Holocene. Whereas SMC shows a cooling trend, modelling results indicate there should be a warming, a phenomenon termed the Holocene temperature conundrum 2. More recent work has sought to reconcile these differences by using either independent 12,13 or additional 7,12 proxies, and by correcting for possible proxy seasonal biases 2,8,12. Nonetheless, all of these approaches have a fundamental limitation in that none provide a dynamically constrained full-field view of climate evolution since the LGM. Conversely, although climate models provide a self-consistent and spatially complete representation of the climate system, they are known to have biases due to inaccurate representation of climate processes 10,14. Moreover, the fidelity of paleoclimate simulations of the LGM and Holocene depends on accurate knowledge of paleoclimate boundary conditions, which are known with varying levels of certainty and may not be independent from proxies 2,15,16.

The Last Glacial Maximum reanalysis
Here, we revisit the evolution of global temperatures from the LGM to present using an offline paleoclimate data assimilation approach that formally combines proxy and independent model information 9,10,17 . The resulting 'Last Glacial Maximum reanalysis' (LGMR) product offers the first proxy-constrained, dynamically consistent and spatiotemporally complete view of climate change for the past 24 kyr. The LGMR enables us to diagnose the major modes of climate variability, refine our understanding of global temperature changes across the Holocene, and compare current anthropogenic global warming with the rate and magnitude of change seen in the recent geological record.
Following ref. 10, we focus on assimilating geochemical proxies for sea surface temperature (SST) with established Bayesian proxy forward models 18-21. To ensure that the proxy data have sufficient temporal resolution and length to inform our reconstruction, we required that records be at least 4,000 years long, have a median time resolution of 1,000 years or less, and contain a radiocarbon-based age model (Methods). The difference between the actual and the forward-modelled proxy value (the 'innovation') is weighted by the Kalman gain, which considers the covariance between the proxy location and the climate fields as well as uncertainties in the proxies and the prior, producing an 'update' that is then added to the model prior state. For our final reconstruction, we generated a posterior ensemble of 500 realizations, based on random sampling of 60 prior states for each time interval, with 20% of proxy records withheld for error quantification and validation testing. We also sampled age uncertainty to ensure that this source of error was propagated into our assimilated fields. As in proxy-only analyses 3,4,6,7,12, this results in some temporal smoothing of our LGMR ensemble mean but does not impact the fidelity of millennial-scale trends or features (Methods).
The LGMR highlights the exceptional and spatially heterogeneous nature of deglacial climate change (Fig. 2). Reconstructed GMST reveals a distinct three-part sequence across the past 24 kyr. From 24-17 ka, the Earth is in a ubiquitously cold glacial state. The thermal imprints of the North American and Eurasian ice sheets are near their maximum extent, with terrestrial cooling relative to the preindustrial below −20 °C across the glaciated high northern (>45°N) and southern (>45°S) latitudes (Fig. 2). At 16.9 ka (median; 95% confidence interval (CI) = 18.5-16.0 ka; Supplementary Information), global-scale deglaciation (the second stage) abruptly begins. Deglacial global warming shows a familiar two-step rise 3 that is punctuated by the millennial-scale Bølling-Allerød (14.8-12.8 ka) to Younger Dryas (12.8-11.7 ka) events. Following the Younger Dryas cooling event, the Earth enters its final transition towards the present interglacial. In the third part of the GMST sequence, early Holocene (11 ka onward) warming stabilizes by 9.5 ka (11.2-8.7 ka) and is followed by a small (~0.5 °C) but significant (>99% probability from 9.5-0 ka) global warming until preindustrial times. A vestigial cold imprint over northeastern North America is all that remains of the once-great Northern Hemisphere ice sheets at 9 ka as mild, albeit widespread, high-latitude warming ensues; Antarctica shows a notable east-west thermal dipole next to a relatively warm Southern Ocean; whereas mild cooling persists across much of the tropics (Fig. 2). All told, we estimate a global warming of 7.0 ± 1.0 °C (2σ) from the deglaciation onset to preindustrial, which is larger than the value reported in ref. 10 (6.1 °C), with ~90% of the warming occurring between 16.9 ka and 9.5 ka. The greater warming found here reflects the LGM period referenced (ref. 10 uses 23-19 kyr, which corresponds to 6.8 ± 1.0 °C in the LGMR) as well as differences in iCESM model priors, proxy data distribution, and the degree of covariance localization used (Methods).

Validating the LGMR
Offline data assimilation products are strongly dependent on the covariance structure of the model prior 23. A limitation of the LGMR is that it is based on priors from a single model (iCESM), which are inevitably biased by model deficiencies, resolution and uncertainties in boundary conditions. However, we can objectively test the veracity of the LGMR, including its spatial representation, using two independent methods of statistical validation. First, we use our posterior LGMR fields to reconstruct withheld proxy time series (for example, ref. 17). Across the ensemble, the majority of records are skilfully reconstructed (Methods) with no obvious signs of regional biasing (Extended Data Fig. 2). Second, following ref. 10, we compare posterior δ 18 O of precipitation (δ 18 O p ) to independent ice core- and speleothem-derived δ 18 O p time series (Extended Data Table 2). On a global scale, we find notable improvement in the posterior comparison of δ 18 O p over the modelled state, with a ~30% reduction of error and a large increase in variance explained (Extended Data Fig. 3). The LGMR recovers 65-90% of the ice core δ 18 O p variance (n = 13 records; Extended Data Table 2), including divergent Holocene trends in east versus west Antarctica δ 18 O p (Extended Data Fig. 4). Both tests suggest that our posterior assimilation is robust, but the close correspondence between LGMR δ 18 O p and ice core proxy records in particular emphasizes that the LGMR is producing a realistic climate state.
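The two validation metrics used here can be illustrated schematically. The following Python sketch (with synthetic stand-in series, not LGMR output) computes the fractional error reduction and the variance explained by a posterior reconstruction relative to a prior, evaluated against an independent record:

```python
import numpy as np

def validation_skill(prior, posterior, observed):
    """Fractional RMSE reduction of the posterior over the prior, and
    variance explained (R^2) by the posterior, against an independent
    (withheld) record sampled at the same time steps."""
    def rmse(a, b):
        return np.sqrt(np.mean((a - b) ** 2))
    err_reduction = 1.0 - rmse(posterior, observed) / rmse(prior, observed)
    r = np.corrcoef(posterior, observed)[0, 1]
    return err_reduction, r ** 2

# toy series: the posterior tracks the withheld record more closely
t = np.linspace(0, 24, 121)                      # age axis, ka
obs = np.sin(t / 4.0)                            # stand-in d18Op record
prior = 0.3 * np.sin(t / 4.0) + 0.2              # weak, offset prior
post = 0.9 * np.sin(t / 4.0) + 0.05 * np.cos(t)  # closer posterior
red, r2 = validation_skill(prior, post, obs)
```

Both metrics improve when the posterior tracks the withheld record more closely than the prior, which is the behaviour reported above for δ 18 O p.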

Drivers of global temperature change
To gain further insight into the drivers of global surface temperature change during the past 24 ka, we decompose our LGMR temperature fields into spatiotemporal modes of variability using empirical orthogonal function (EOF) analysis (Supplementary Information). As expected, the first spatial mode, EOF1, exhibits positive loading across the globe and explains the majority (>90%) of the surface temperature covariance during the past 24 ka (Fig. 3a). This mode is clearly associated with deglaciation, with the strongest amplitude concentrated atop the North American and Fennoscandian ice sheets. The uniform nature of EOF1 implies an association with changes in greenhouse gas (GHG) radiative forcing and ice sheet albedo. Consistent with the monotonic nature of the associated principal component time series, PC1, GHG forcing 24 can explain 92% of the variance in PC1 (Fig. 3b). However, there are notable differences between the two time series: during the early to mid-Holocene, GHG radiative forcing increases at around 12 ka and then gradually decreases, while PC1 steadily increases. This implies that GHG forcing alone is not sufficient to explain the leading mode of global temperature variability.
Modelling experiments indicate that the magnitude of ice sheet albedo forcing is comparable to (if not greater than) GHG forcing across the deglacial transition 10,13,25 . By considering GHG and ice sheet forcing together, we account for 98% of the variance in PC1 as well as the observed warming during the Holocene (Fig. 3c). The inclusion of ice sheet albedo forcing also explains the strong EOF1 loading atop North America and Fennoscandia (Fig. 3a). Although other radiative forcings, such as vegetation and dust, probably also impacted LGM-to-present temperature change 10 our EOF results imply that these were of lesser importance in terms of their global footprint, particularly during deglaciation.
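The EOF decomposition used above can be sketched as follows; the sqrt(cos latitude) area weighting and the synthetic single-mode field are illustrative assumptions, not the LGMR configuration:

```python
import numpy as np

def eof_analysis(field, lats):
    """EOF decomposition of a (time x lat x lon) field via SVD.

    Grid cells are weighted by sqrt(cos(latitude)) so that the
    variance fractions respect true surface area."""
    nt, nlat, nlon = field.shape
    anom = field - field.mean(axis=0)                 # remove time mean
    w = np.sqrt(np.cos(np.deg2rad(lats)))[None, :, None]
    X = (anom * w).reshape(nt, nlat * nlon)           # time x space
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    pcs = U * s                                       # principal components
    eofs = Vt.reshape(-1, nlat, nlon)                 # spatial patterns
    var_frac = s ** 2 / np.sum(s ** 2)                # variance explained
    return pcs, eofs, var_frac

# toy field dominated by a single, globally uniform warming mode
lats = np.linspace(-89, 89, 10)
trend = np.linspace(-1, 1, 120)[:, None, None]
pattern = np.ones((10, 18))
rng = np.random.default_rng(0)
field = trend * pattern + 0.05 * rng.standard_normal((120, 10, 18))
pcs, eofs, var_frac = eof_analysis(field, lats)
```

For a field dominated by one global pattern, the first mode captures nearly all of the variance, analogous to EOF1 above.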
The second mode of global temperature variability, EOF2, explains only 3.5% of the variance. However, it is distinct from its neighbouring trailing modes and physically interpretable (Supplementary Information). This mode is a hemispheric dipole, with strong positive loading across the Southern Ocean and negative loading spanning much of the North Pacific, North America and the North Atlantic (Fig. 3b). Its associated time series, PC2, consists of both long-term trends as well as millennial-scale peaks during the deglaciation. We interpret this mode to represent a superposition of two sources of climate variability: changes in Atlantic Meridional Overturning Circulation (AMOC; the millennial-scale features) and orbitally induced shifts in high-latitude seasonality (the long-term trends). To illustrate this, we decompose PC2 into its long-term 'trend' (Fig. 3c, purple) and millennial-scale 'residual' components (Fig. 3c, yellow).
The trend component of PC2 represents a precession cycle, with a peak at around 11 ka. Both summer insolation intensity at 65°N 26 and Southern Hemisphere summer duration at 65°S 27 offer good approximations of this long-term change (Fig. 3c). However, we interpret the latter as the likely driver: under conventional Milankovitch orbital theory 28, enhanced summer insolation in the Northern Hemisphere would not be expected to cause the mean annual cooling implied by this mode. In addition, spatial correlation analyses (of either orbital series) with surface temperatures indicate that the strongest coupling occurs in the Southern Hemisphere (Extended Data Fig. 5b). The strong loading of EOF2 in the Southern Ocean in particular could point towards a feedback with regional sea ice; a longer summer (and shorter winter) would increase the extent of summertime sea ice retreat while decreasing its growth during wintertime, resulting in an increase in mean annual surface temperatures 27.

The residual component of PC2 closely follows (R 2 = 0.80) 231 Pa/ 230 Th proxy records of AMOC from the Bermuda Rise 29-31 (Fig. 3c). Prior studies have also identified this 'bipolar seesaw' mode 32,33, which represents the millennial-scale events that occurred during the last deglaciation (Heinrich event 1, the Bølling-Allerød, and the Younger Dryas). Correlation analysis shows that Northern Hemisphere surface temperatures in the LGMR are strongly related to AMOC changes (Extended Data Fig. 5c). A decrease in Atlantic heat transport would also lead to compensating warmth in the Southern Hemisphere, similar to the loading pattern of EOF2. However, the particularly strong loading found across the Indian and Pacific Ocean sectors of the Southern Ocean does not match the classic fingerprint of the oceanic bipolar seesaw 34. Similarly, the strong loading in the eastern North Pacific is not typical of a modelled response to an AMOC slowdown 1,35,36. It does, however, reflect the underlying proxy records from this region, which show a strong response of SST to North Atlantic climate variability 37. Columbia River megaflood meltwater forcing may have contributed to the severe cooling observed in deglacial SST records from the Gulf of Alaska 37; however, step-wise deglacial cooling might also be explained by dynamic changes in the subpolar gyre boundary 38.
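The decomposition of PC2 into a long-term trend and a millennial-scale residual can be sketched as follows; the exact smoother used in the analysis is not specified here, so a low-order polynomial fit stands in, and the series is purely synthetic:

```python
import numpy as np

def split_trend_residual(age_ka, pc, order=4):
    """Split a principal-component series into a smooth long-term
    'trend' (low-order polynomial; an assumed stand-in for the paper's
    smoother) and a millennial-scale 'residual'."""
    coeffs = np.polyfit(age_ka, pc, order)
    trend = np.polyval(coeffs, age_ka)
    return trend, pc - trend

# toy PC2: a precession-like hump peaking near 11 ka plus a narrow
# millennial-scale excursion near 12.8 ka (Younger Dryas-like)
age = np.arange(0, 24.2, 0.2)
hump = np.exp(-((age - 11.0) / 6.0) ** 2)
pulse = 0.6 * np.exp(-((age - 12.8) / 0.4) ** 2)
trend, resid = split_trend_residual(age, hump + pulse)
```

The trend component recovers the broad orbital-scale hump, while the residual isolates the millennial-scale excursion, mirroring the purple/yellow decomposition in Fig. 3c.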

Comparison to proxy-only insights
LGMR GMST shows several notable differences when compared to the proxy-only SMC reconstruction. Focusing first on pre-Holocene differences, the LGMR has a more abrupt onset of deglaciation at ~17 ka, and a more muted Bølling-Allerød-Younger Dryas transition (Fig. 4a). The LGMR also indicates nearly twice as much glacial cooling, but this can be explained by the fact that the SMC is based mostly on SST proxies and was not scaled to infer GMST; we scale it here for comparison (Fig. 4a). To diagnose the origin of the other differences, we generated a proxy-only GMST reconstruction from our SST compilation (Methods). Even though our compilation has many more proxy SST records (and no terrestrial records), it is strongly correlated with SMC (R 2 = 0.98).
The similarity of the proxy-only reconstruction and the SMC illuminates at least two shortcomings that are effectively mitigated by our data assimilation approach. First, proxy-specific GMST reconstructions suggest that the gradual deglacial onset is most likely to be linked to the Mg/Ca data, which show early deglacial SST increases relative to U 37 K' and δ 18 O c (Extended Data Fig. 6a and Supplementary Information).
Such differences may reflect proxy-specific spatial bias (Fig. 1); data assimilation will mitigate these differences by balancing signals from other nearby proxies. Second, data assimilation allows us to overcome problems associated with spatial aliasing in the proxy distribution. Unlike the enhanced Younger Dryas cooling shown by the proxy-only curves (Fig. 4a), LGMR reveals that Younger Dryas cooling was in fact confined to the Northern Hemisphere (and, specifically, the North Atlantic and North Pacific sectors; Extended Data Fig. 7). Thus, the stronger expression of the Younger Dryas in the proxy-only GMST curves could reflect Northern Hemisphere bias in the proxy distribution during the deglaciation (Fig. 1a).

Holocene global temperature trends
The LGMR provides an updated view of the Holocene temperature conundrum 2. All of the proxy-only reconstructions (including SMC, Temp12K 7 and ours) show a cooling trend that begins at ~7 ka BP and continues through the Holocene (Fig. 4b). In contrast, LGMR shows a small (0.25 °C) but significant warming (P > 0.9, based on ensemble analysis) since 7 ka (Fig. 4b). The Holocene trend in LGMR does not come from the model prior; in fact, the model suggests a warmer mid-Holocene due to a prescribed 'Green Sahara' (Methods; Extended Data Table 1). Rather, it is a feature of the assimilation: the early-mid Holocene warming in Mg/Ca (and, to a lesser extent, U 37 K' and δ 18 O c ) that underlies the conundrum is nearly eliminated after assimilating each proxy type into iCESM (Extended Data Fig. 6). Such consistency implies that a warming trend through the Holocene is a robust solution. This solution is generally similar to the temperature evolution simulated by TraCE-21k (Fig. 4b), indicating that the Holocene conundrum is effectively resolved by data assimilation. Most likely, this is related to how data assimilation weights the proxies to compute a global average. Data assimilation weights proxies based on their uncertainties and the model-proxy covariance structure and uses this information to update the surface air temperature field. In contrast, proxy-only reconstructions rely on simple latitudinal binning and weighting. This renders the latter approach particularly sensitive to latitudinal bands with sparse proxy coverage or outliers. Sensitivity tests (Supplementary Information) suggest that the limited number of proxies in the Southern Ocean latitude band (45-60°S) can account for about half of the early Holocene warmth in our proxy-only GMST curve (Extended Data Fig. 6b). This implies the Holocene conundrum may be, in part, an artifact of poor spatial averaging.
More broadly, given that proxies are unevenly distributed, proxy-only reconstructions do not represent a true global average. In contrast, LGMR-based GMST is based on a spatially complete field, and thus is truly a global mean air temperature. This is a clear strength of the LGMR over existing reconstructions.
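The sensitivity of latitude-binned averaging to sparse bands can be illustrated with a minimal sketch; the band edges and toy values here are assumptions for illustration, not the actual proxy compilation:

```python
import numpy as np

def binned_gmst(lat, values, bin_edges):
    """Latitude-binned, area-weighted global mean, as in proxy-only
    reconstructions: average proxies within each latitude band, then
    weight each band by the cosine of its midpoint latitude. Empty
    bands are skipped, which is exactly what makes the estimate
    sensitive to sparse coverage."""
    means, weights = [], []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        in_band = (lat >= lo) & (lat < hi)
        if in_band.any():
            means.append(values[in_band].mean())
            weights.append(np.cos(np.deg2rad(0.5 * (lo + hi))))
    return np.average(means, weights=weights)

# toy case: a single warm record in an otherwise empty Southern Ocean
# band shifts the 'global' mean noticeably
lat = np.array([-55.0, -20.0, 0.0, 20.0, 50.0])
sst = np.array([2.0, 0.0, 0.0, 0.0, 0.0])   # one anomalous band
edges = np.arange(-60, 61, 15)
with_outlier = binned_gmst(lat, sst, edges)
without = binned_gmst(lat[1:], sst[1:], edges)
```

A single anomalous record in a sparsely sampled band moves the binned global mean, whereas covariance-based updating weighs that record against neighbouring information.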
Proxy seasonal bias may also play a role 2,8 . The LGMR uses proxy forward models that account for seasonal plankton growth 19-21 and allow δ 18 O c and Mg/Ca seasonality to change through time. Our proxy-only curves use inversions of the same models but require that seasonality be temporally fixed. Thus, the proxy-only reconstructions could be more affected by seasonal bias. However, analyses exploring the impact of seasonally biased records, as well as the 'dynamic' seasonality in LGMR, indicate that seasonality has a minimal influence on the Holocene GMST trajectory in both proxy-only reconstructions and data assimilation (Extended Data Fig. 6a-c and Supplementary Information). Within the confines of our forward modelling assumptions, seasonal bias is a less prominent contributor to the conundrum than spatial weighting.
Finally, the LGMR allows us to directly assess 20th and 21st century warming from the broader vantage point of the past 24 kyr. When juxtaposed alongside the last millennium reanalysis (also a paleoclimate data assimilation product 17) and observational HadCRUT5 11 (Fig. 2), we find that 2010-2019 mean GMST exceeds the upper bound (>99.9th percentile) of decadal-estimated values from the LGMR by a considerable margin: >0.5 °C, or +1.5 °C above mean Holocene GMST. These findings differ from those of Marcott et al. 6, who suggested that early 21st century temperatures (2000-2009) had not yet exceeded early Holocene values, and reflect increased confidence over ref. 7, who find that 2010-2019 warming is at the ~80% CI of mid-Holocene centennial-scale values. Similarly, we find that the HadCRUT5-observed rate of 20th to 21st century warming (0.96 °C per century) registers near the upper bound of LGMR deglacial warming rates (that is, >99th percentile; Fig. 5). A similar conclusion is reached when comparing HadCRUT5 warming rates to the monthly resolved TraCE-21k simulation scaled to match the larger magnitude of deglacial warming shown by the LGMR (Fig. 5 and Supplementary Information) 1,2. The LGMR underscores the dramatic nature of anthropogenic warming, whose magnitude and rate appear unusual in the context of the past 24 kyr.

[Fig. 4 caption: a, The proxy-only GMST reconstruction (this study; red) overlain with the GMST-scaled SMC curve (dotted dark red) and LGMR GMST (this study; blue). Uncertainty ranges denote ±1σ (dark) and 95% confidence intervals (light). All ∆GMST curves are shown as deviations relative to the past two millennia. b, Holocene temperature trends from a alongside the reconstruction of ref. 7 ('Temp12K'; red dotted-dashed) and the TraCE-21k simulation 1. Fig. 5 legend: LGMR deglacial, decadal resample (n = 10,000); LGMR deglacial, ensemble; TraCE-21k deglacial (LGMR scaling).]
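The percentile comparison of warming rates can be sketched as follows; the logistic toy curve is illustrative, not the LGMR ensemble:

```python
import numpy as np

def warming_rate_percentile(age_ka, gmst, obs_rate=0.96):
    """Percentile rank of an observed warming rate (deg C per century)
    within the distribution of rates in a reconstruction given on a
    uniform age grid, ordered oldest to youngest (ages in ka)."""
    dt_centuries = (age_ka[:-1] - age_ka[1:]) * 10.0   # ka -> centuries
    rates = (gmst[1:] - gmst[:-1]) / dt_centuries      # warming toward present
    return 100.0 * np.mean(rates < obs_rate)

# toy deglacial curve: ~7 deg C of logistic warming centred at 14 ka,
# sampled at 200-year resolution from 24 ka to present
age = np.arange(24.0, -0.001, -0.2)
gmst = -7.0 / (1.0 + np.exp((14.0 - age) / 2.0))
pct = warming_rate_percentile(age, gmst)
```

In this toy case the observed modern rate exceeds every 200-year deglacial rate, mirroring the >99th percentile result reported above.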

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-021-03984-4.

Proxy compilation and screening
We collated a globally dispersed set of 573 SST proxy records spanning the past 24 kyr. Following ref. 10, we focus on geochemical proxies for SST including alkenone U 37 K' (146 records), the TetraEther indeX of 86 carbons (TEX 86 ; 28 records), the elemental ratio of Mg to Ca in planktic foraminifera (Mg/Ca; 129 records), and the oxygen isotopic composition of planktic foraminifera (δ 18 O c ; 270 records). As in ref. 10, we limit our analyses to these proxies because we have already developed Bayesian forward models for each of them 18-21 that we can use in our paleoclimate data assimilation scheme (see the 'Paleoclimate data assimilation' section below). These data tend to cluster along coasts where sedimentation rates are high, and in regions where sampling efforts have historically focused (for example, the Atlantic sector and Northern Hemisphere). By comparison, data coverage across ocean interiors-in particular, the Pacific and (to a lesser extent) Southern Oceans-is sparse. For consistency, we recalibrated all age models using the Marine13 radiocarbon calibration curve 39 with the BACON age model program 40 and local estimates of deviations from the global marine radiocarbon reservoir age (∆R). This procedure also allowed us to generate ensembles (n = 1,000) of possible age models for each record that were used to propagate dating uncertainties into our data assimilation product (see sections 'Paleoclimate data assimilation' and 'Proxy-only global mean temperature' below). Some screening of our proxy compilation was necessary to remove low-resolution, short and adversely situated proxy records. Generally speaking, we removed records whose median age resolution was coarser than 1,000 years or that were less than 4,000 years long (Extended Data Fig. 1). However, the former constraint was relaxed for records situated in or near the Southern Ocean, where data coverage is sparse, so as to retain as many time series as possible from this undersampled region.
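The screening rules above can be expressed compactly; the −45° latitude cut-off used here to represent 'in or near the Southern Ocean' is an assumption for illustration:

```python
import numpy as np

def passes_screening(ages_yr, lat, relax_southern_ocean=True):
    """Record-screening rules from the text: at least 4,000 years long
    and a median age spacing of 1,000 years or finer; the resolution
    rule is relaxed for Southern Ocean sites (approximated here, as an
    assumption, by latitudes south of 45 S)."""
    length = ages_yr.max() - ages_yr.min()
    median_res = np.median(np.diff(np.sort(ages_yr)))
    if length < 4000:
        return False
    if median_res > 1000:
        return relax_southern_ocean and lat <= -45.0
    return True

# a 10-kyr record at 500-yr spacing passes anywhere; a 1,500-yr-spacing
# record passes only at a Southern Ocean latitude
fine = np.arange(0, 10001, 500)
coarse = np.arange(0, 12001, 1500)
ok_fine = passes_screening(fine, lat=10.0)
ok_coarse_tropics = passes_screening(coarse, lat=10.0)
ok_coarse_so = passes_screening(coarse, lat=-55.0)
```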
To remove anomalous influences of sea ice on our proxy estimates (in particular, the influence of sea ice on the δ 18 O of seawater 20 ), we removed records situated at locations where preindustrial mean annual SSTs were less than 0 °C (a value assumed to roughly approximate the perennial sea ice edge), as estimated from the World Ocean Atlas 2013 product 41. This resulted in the removal of four δ 18 O c records, all from locations north of 80°N. Following ref. 19, we also omitted all U 37 K' records situated north of 70°N or within the modern Arctic sea ice zone, owing to known biases in the alkenone temperature proxy that are likely to arise from lipid contributions from Isochrysidales species living in sea ice 42. We also removed two western Atlantic sites, OCE326-GGC26 (43°29′N, 54°52′W) and OCE326-GGC30 (43°53′N, 62°48′W; ref. 43). While these U 37 K' records have been featured in prior global mean Holocene temperature reconstructions 6, they show an extremely large (up to 10 °C) cooling over the Holocene that most likely reflects a shift in the Gulf Stream/Labrador Current boundary 43. This poses a problem for our data assimilation technique, because CESM1.2 does not place this sharp boundary in the same location as observations; assimilation of these sites thus tends to cause a large regional bias in SSTs. Although similar issues arising in part from coarse model resolution probably afflict other frontal regions, no sufficient cause was found to warrant the removal of any additional records. All told, our selection criteria resulted in the removal of 34 records.

Proxy-only global mean temperature reconstruction
To provide a point of comparison for our data assimilation results, we generated a reconstruction of global mean temperature change using only the proxy data, broadly following the methodology of ref. 44. This was done by first estimating a 'reference' preindustrial proxy value for each site and then appending each value at the top of its respective N × 1 proxy record. This produced an (N i + 1) × 1 vector of proxy values for each site i, where the +1 denotes the appended preindustrial reference value. For sites with value(s) overlapping the preindustrial (that is, 0-4 kyr BP; see ref. 10), the preindustrial reference was computed as the 0-4 kyr mean proxy value. For sites without preindustrial overlap, reference proxy values were estimated by using the nearest core-top value 18-21. As in ref. 44, if no core-top locations existed within a threshold 300 km radius, an observational preindustrial SST estimate was taken from the HadISST product 45 and forward modelled to a proxy estimate. All (N i + 1) × 1 vectors were then calibrated to SSTs using the Bayesian inverse models 18-21. Inverting δ 18 O c additionally requires an estimate of the δ 18 O of seawater (δ 18 O sw ). For this, we first scaled the benthic stack of ref. 55 to an estimated change in global δ 18 O sw (arising from changes in global ice volume) of +1‰ at the LGM (18 ka) relative to the preindustrial, following ref. 56. This scaled curve was then added to the modern mean annual δ 18 O sw value 57 and interpolated in time for each site. The posterior SST estimates produced by the Bayesian inverse models are a matrix of dimension (N i + 1) × M, where M contains 1,000 possible SST histories and core-top reference values for each time entry N i + 1 of each site i. These matrices were sorted from least to greatest along dimension M, which preserves the 'shape' of the time series, after which a normally distributed analytical uncertainty of N(0, 0.5) was added back to the sorted ensembles to account for laboratory precision (see also refs. 10,44).
Finally, we converted each of our records to SST anomaly units relative to preindustrial values by subtracting the first row of the (N i + 1) × M matrix (the preindustrial core-top estimate) from the remaining rows to generate an N i × M matrix of SST anomalies.
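The anomaly procedure of the preceding two paragraphs can be sketched as follows; the ensembles are synthetic, and the N(0, 0.5) analytical noise follows the text:

```python
import numpy as np

def sst_anomalies(sst_ensemble, preindustrial_ref, sigma=0.5, rng=None):
    """Anomaly procedure described above: append the preindustrial
    reference as a first row, sort along the ensemble dimension M to
    preserve the time-series 'shape', add back N(0, sigma) analytical
    noise, then subtract the reference row.

    sst_ensemble: N x M posterior SST draws; preindustrial_ref: M draws."""
    rng = np.random.default_rng() if rng is None else rng
    stacked = np.vstack([preindustrial_ref[None, :], sst_ensemble])  # (N+1) x M
    stacked = np.sort(stacked, axis=1)            # sort along M
    stacked = stacked + rng.normal(0.0, sigma, stacked.shape)
    return stacked[1:] - stacked[0]               # N x M anomalies

rng = np.random.default_rng(42)
ens = 18.0 + rng.normal(0.0, 1.0, (5, 1000))   # 5 time bins x 1,000 draws
ref = 20.0 + rng.normal(0.0, 1.0, 1000)        # preindustrial reference
anom = sst_anomalies(ens, ref, rng=rng)
```

In this toy case, a site that averaged 18 °C against a 20 °C preindustrial reference yields anomalies centred near −2 °C.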
To produce a GMST anomaly curve, SST anomaly values and associated ages were randomly drawn from our ensemble of M posterior values and our ensemble of 1,000 age models, respectively, and then sorted into contiguous 200-year bins spanning back to 24 ka. If more than one data point per record occurred in a given 200-year bin, those SST data points were averaged, to ensure that higher-resolution records did not bias the bin. Following refs. 4,10,58, the data within each time bin were binned by latitude, with the bin size randomly selected between 2.5° and 20°, and then global average SST (GSST) was computed as the latitudinally weighted zonal average between 60°S and 60°N. Following refs. 4,10, GSST was then scaled by a value randomly chosen between 1.5 and 2.3 to transform the values to GMST. This Monte Carlo process was repeated 10,000 times, to propagate errors arising from the SST estimation, age modelling, latitudinal weighting, and GSST-to-GMST scaling.

Climate model simulations
In addition to existing simulations (Extended Data Table 1), we developed new time-slice simulations using iCESM1.2 of 16, 14, 12, 9 and 6 ka (Extended Data Table 1). For each time slice, the greenhouse gases (CO 2 , CH 4 and N 2 O) were set to 200-year averages centred around the corresponding time from ice core reconstructions 63-65. Orbital parameters followed ref. 26. Ice sheet forcing was prescribed according to the ICE-6G reconstruction 66, including effects from changes in land elevation and surface properties and the land-sea mask due to sea-level variations. For each time-slice simulation, ocean temperature and salinity were initialized from published CESM1.2 simulations when available 67. δ 18 O sw was initialized from the slice before; for example, δ 18 O sw of 18 ka branched from 21 ka. A spatially uniform correction was applied to salinity and δ 18 O sw to account for the ice-volume effect.
The correction terms were derived by scaling changes in the global volume-mean salinity and δ 18 O sw between 21 and 0 ka by the corresponding change in the global mean sea level 47. Global volume-mean salinity was 34.7 g kg −1 in the 0 ka simulation and 35.7 g kg −1 at 21 ka, and global volume-mean δ 18 O sw was 0.05‰ at 0 ka and 1.05‰ at 21 ka 68. The iCESM1.2 time-slice simulations used preindustrial aerosol emissions because of the lack of reliable global reconstructions 69. For a similar reason, the simulations used preindustrial vegetation cover except for the 9 and 6 ka slices (see description below). All these time-slice simulations were run for 900 years.

A Green Sahara was implemented in both the 9 and 6 ka simulations by prescribing a 100% spatial coverage of shrub and C4 grass at 10-25°N and 25-35°N, respectively. In addition, C3 grass over the Northern Hemisphere high latitude regions (northward of 50°N) was replaced with boreal trees in the 6 ka simulation. These vegetation changes were developed following recommendations from the Paleoclimate Modeling Intercomparison Project and represent maximum possible vegetation expansion over the Sahara and the Northern Hemisphere according to the pollen and macro-fossil evidence 70 . To sample the uncertainty from vegetation, an additional 6 ka simulation was performed for 400 years with the preindustrial vegetation cover, as another end-member of the mid-Holocene vegetation forcing. All the iCESM1.2 time-slice simulations were run with a prescribed satellite phenology in the land model due to the overall poor simulation of vegetation processes with a prognostic phenology 71 . The satellite observation derived vegetation phenology included leaf area and stem area indices, and vegetation heights.
In addition, two water-hosing experiments were performed within the 16 and 12 ka slices, respectively, to provide prior climate states for the millennial-scale events of the last deglaciation (that is, Heinrich 1 and the Younger Dryas). In the hosing experiments, 0.25 Sv (1 Sv = 10⁶ m³ s⁻¹) of freshwater with a δ¹⁸O composition of −30‰ (VSMOW) was applied over the northern North Atlantic (50-70°N). These experiments were run for 200 years. The AMOC largely shuts down after 100 years in the water-hosing simulations, with the maximum overturning transport at 34°S reduced to ~3 Sv from a background value of ~18 Sv.
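For scale, a 0.25 Sv freshwater flux corresponds to roughly 2 cm per year of global sea-level equivalent. A quick check, using an assumed modern ocean surface area of ~3.6 × 10¹⁴ m² (an illustrative figure, not from the text):

```python
FLUX_SV = 0.25             # hosing flux in sverdrups
OCEAN_AREA_M2 = 3.6e14     # assumed modern ocean surface area (illustrative)
SECONDS_PER_YEAR = 3.156e7

flux_m3_yr = FLUX_SV * 1e6 * SECONDS_PER_YEAR    # 1 Sv = 1e6 m^3/s
slr_mm_yr = flux_m3_yr / OCEAN_AREA_M2 * 1000.0  # sea-level equivalent, mm/yr
# slr_mm_yr is roughly 22 mm of sea-level equivalent per year
```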
Prior to using the simulations in our data assimilation, a paleoclimate calendar adjustment was applied to the monthly model output for all time slices to account for the effect of changing months on seasonal climatic expressions 72 .

Paleoclimate data assimilation
The data assimilation method incorporates an offline ensemble square root Kalman filter approach, following the methodology of ref. 10 using the data assimilation MATLAB code package DASH version 3.6.1 (source code available at https://github.com/JonKing93/DASH). We refer the reader to this previous work for a full mathematical description. Briefly, the method combines a set of prior climate states from our model simulations (X_prior) with new information from the proxy observations (the 'innovation', y_obs − Ȳ_est) to compute a 'posterior' matrix of assimilated past climate states, X_post. The posterior mean and deviations from the mean are each computed separately (see refs. 10,73); the Kalman filter mean 'update' equation is:

X̄_post = X̄_prior + K(y_obs − Ȳ_est)    (1)

X_prior is an N × M matrix of prior climate states from iCESM, where dimension N contains the model grid-point data for SST and SSS (both at monthly and mean-annual resolution), and mean-annual surface air temperature (SAT), δ¹⁸Osw, precipitation amount-weighted δ¹⁸O (δ¹⁸Op) and mean-annual precipitation rate, collapsed into a concatenated vertical 'state vector'; dimension M represents the number of state vector ensemble members. The overbar in all cases denotes averaging across the ensemble dimension (producing, in this case, a vectorized ensemble 'mean' update 73).
The P × 1 vector y_obs consists of P globally dispersed δ¹⁸Oc, Mg/Ca, UK'37 and TEX86 proxy observations. The P × M matrix Y_est contains the corresponding set of P proxy estimates, generated from the model output for each of the M states using our Bayesian forward models. For details concerning the Bayesian models, the reader is referred to the original publications 18–21. In brief, the forward model for δ¹⁸Oc requires monthly SST and mean-annual δ¹⁸Osw. These δ¹⁸Oc values are computed on a species- and growing-season-specific basis 20 that allows us to explicitly account for foraminiferal seasonal preferences in our forward-modelled proxy estimates. Both the UK'37 and TEX86 models require only SST as input, with the former requiring monthly SST owing to the seasonal response of UK'37 production in the North Pacific, the North Atlantic and the Mediterranean 19, and the latter only mean-annual SST 18. Finally, the forward model for Mg/Ca requires both monthly SST and SSS to compute species-specific growing-season Mg/Ca values, in addition to sea surface pH, bottom-water calcite saturation state (Ω), and the laboratory cleaning method. The cleaning method is provided in the original publications, and SST and SSS are drawn from iCESM output. For pH and Ω, we follow the same procedure as the proxy-only reconstruction (described above). The innovation (y_obs − Ȳ_est) represents the new information from the observations not already provided by the prior estimates. As shown in equation (1), these values are weighted by the N × P matrix K, the Kalman gain, which takes the general form:

K = cov(X_prior, Y_est)[cov(Y_est, Y_est) + R]⁻¹    (2)

where 'cov' denotes the covariance expectation (approximated from the ensemble deviations, that is, with the ensemble mean removed). The P × P matrix R prescribes the error covariance associated with each proxy observation.
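Equations (1) and (2) can be illustrated with a toy ensemble. This sketch (not the DASH code itself) approximates the covariances from ensemble deviations and assumes a diagonal R:

```python
import numpy as np

def kalman_mean_update(X_prior, Y_est, y_obs, R_diag):
    """Offline ensemble Kalman filter mean update, eqs. (1)-(2).
    X_prior: N x M prior states; Y_est: P x M forward-modelled proxies;
    y_obs: P observations; R_diag: P observation-error variances."""
    M = X_prior.shape[1]
    Xdev = X_prior - X_prior.mean(axis=1, keepdims=True)  # remove ensemble mean
    Ydev = Y_est - Y_est.mean(axis=1, keepdims=True)
    cov_xy = Xdev @ Ydev.T / (M - 1)                      # cov(X_prior, Y_est)
    cov_yy = Ydev @ Ydev.T / (M - 1) + np.diag(R_diag)    # cov(Y_est, Y_est) + R
    K = cov_xy @ np.linalg.inv(cov_yy)                    # Kalman gain, eq. (2)
    innovation = y_obs - Y_est.mean(axis=1)               # y_obs - mean(Y_est)
    return X_prior.mean(axis=1) + K @ innovation          # posterior mean, eq. (1)

# Toy example: directly observe the first two of three state elements.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 60))
post = kalman_mean_update(X, X[:2], np.array([1.0, -1.0]), np.full(2, 1e-6))
# With near-perfect observations, the observed elements move to ~1.0 and ~-1.0.
```

Note how the unobserved third state element is also updated, through its ensemble covariance with the observed elements; that covariance spreading is what lets sparse proxies constrain the full field.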
Thus, the Kalman gain weights the innovation by the covariance of the forward-modelled proxy estimates with the prior climate states, and by the uncertainties of the prior-estimated proxy ensemble and the proxy observations. In our case, R is diagonal, that is, the errors are presumed to be independent. R is user-defined, but ideally based on an estimate of 'true' proxy uncertainties. Following ref. 10, in which the impact of different values of R on the posterior was systematically tested, we use the error values output from our Bayesian forward models scaled by 1/5, but further refine this by specifying a slightly different scaling factor for each proxy type. To determine these proxy-specific factors, for each record we performed jack-knife (leave-one-record-out) and 'only-one-record' assimilation experiments (with no R scalings applied) to assess the ability of any particular record to predict all others when that record was either removed or solely retained, respectively. From these experiments, we then ranked each record by validating the only-one and all-but-one reconstructions against the non-assimilated proxies. This allowed assessment of the percentage of tests for which a given proxy resulted in 'improvement' (as denoted by the ratio of the posterior to prior squared error of all predicted, independent proxies, where a ratio less than unity indicates improvement). Using these rankings for each proxy type, we then weighted each proxy-specific scaling factor by the improvement factor, and subsequently weighted these rankings by total record count to maintain an average R scaling of 1/5 across all available proxy records. The specific scaling factors that we calculated were r_UK = 3.13⁻¹, r_TEX = 1.36⁻¹, r_MgCa = 2.86⁻¹ and r_δ18O = 7.27⁻¹, indicating δ¹⁸Oc to be the most reliable (and most numerous) proxy type. Following refs. 10,17, we applied covariance localization to the assimilation to limit spurious relationships between proxies and far-field regions.
Validation testing suggested that a 24,000 km localization radius provided optimal posterior results for our dataset (see the 'Internal and external validation testing' section below). This differs from ref. 10 , which used a narrower 12,000 km localization. The improvement we find using broader localization is likely to relate to the fact that fewer proxies are assimilated here per time step than in ref. 10 .
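Covariance localization tapers the gain with distance between each proxy site and each model grid point. The Gaspari-Cohn fifth-order taper shown here is a standard choice, assumed for illustration only (the text specifies the 24,000 km radius but not the taper form):

```python
import numpy as np

def gaspari_cohn(dist_km, radius_km=24000.0):
    """Gaspari-Cohn localization weight: 1 at zero distance,
    decaying smoothly to 0 at and beyond radius_km."""
    r = np.abs(np.asarray(dist_km, dtype=float)) / (radius_km / 2.0)
    w = np.zeros_like(r)
    near = r <= 1.0
    far = (r > 1.0) & (r < 2.0)
    rn, rf = r[near], r[far]
    w[near] = (((-0.25 * rn + 0.5) * rn + 0.625) * rn - 5.0 / 3.0) * rn**2 + 1.0
    w[far] = ((((rf / 12.0 - 0.5) * rf + 0.625) * rf + 5.0 / 3.0) * rf - 5.0) * rf \
             + 4.0 - 2.0 / (3.0 * rf)
    return w

weights = gaspari_cohn(np.array([0.0, 6000.0, 12000.0, 24000.0]))
# weights fall monotonically from 1 at the proxy site to 0 at the 24,000 km cut-off
```

In practice, the resulting weights multiply cov(X_prior, Y_est) (and, correspondingly, cov(Y_est, Y_est)) elementwise before the Kalman gain is computed, so distant grid points receive no update from a given proxy.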
For computing our full 24 kyr LGMR product, we calculated X_post at 200-year increments using the following approach. First, we selected 80% of our proxy records at random for inclusion in our assimilation, with the remaining 20% of records withheld for statistical validation (see the 'Internal and external validation testing' section below). For each record, we randomly prescribed an age scale by drawing from the 1,000 viable posterior BACON-derived age models. Second, for each 200-year interval, y_obs was compiled as all of the available proxy data points whose ages fall within the bounds of the current reconstruction age interval. When multiple data points from a single record occurred within a given 200-year age interval, these values were averaged. We then randomly selected M = 60 state vector ensemble members from the iCESM output using a transient 'evolving prior' approach (see below) and used the Bayesian forward models to produce the matrix Y_est. X_post was then computed from y_obs and Y_est (equation (1)), with R in the Kalman gain (equation (2)) scaled according to the proxy type. Finally, this process was repeated a total of 500 times for each time interval, to create a 500-member LGM-to-present ensemble of posterior states. This Monte Carlo procedure ensures that proxy, age-model and model-prior uncertainties are included in the assimilated product. Since the proxy age-model uncertainties in particular can be on the order of centuries (interquartile range of ~320-770 years across all data points), this sampling procedure has the effect of smoothing the posterior ensemble mean time series on sub-millennial timescales, as in prior proxy-only analyses 3,4,6.
Assimilation of the LGM-to-present climate evolution at 200-year intervals directly reflects our underlying proxy data compilation: ~96% of the proxy records have a median resolution finer than 200 years (Extended Data Fig. 1), and if all >60,000 compiled data points are considered together, >90% of the paleoclimate data have sample resolutions of ≤200 years. Ideally, the amount of time represented by the model prior would also equal 200 years, but this would have considerably limited the number of model priors available (a maximum of 58 prior states across all of our iCESM time-slice simulations, and as few as four priors for a given interval; Extended Data Table 1). To increase the number of iCESM priors available for assimilating our marine proxies while still roughly adhering to our reconstruction interval, we instead used 50-year average priors, following ref. 10. Experimentation in ref. 10 showed only marginal differences in LGM and preindustrial posteriors once time-averaging of the iCESM prior fields exceeds interannual time periods, justifying this choice.
Assimilating Earth's transient climate evolution between two fundamentally different glacial versus interglacial states presents a unique obstacle for offline paleoclimate data assimilation (which has largely focused on reconstructing the climate evolution of the Common Era 17 , a relatively stable background climate state 9 ). In terms of Bayesian inference, the challenge is adequately assigning a collection of iCESM priors at each LGM-to-present reconstruction interval that reflects a reasonable prior belief in their viability. For example, a time interval in the late Holocene should not include glacial prior states that contain a Laurentide ice sheet, as the latter induces fundamental changes in spatial covariance that are not realistic for a deglaciated climate state. Conversely, deglacial prior states might include a range of possible Laurentide configurations.
To address this issue, we developed an 'evolving prior' approach. For each 200-year interval, we defined a normal probability density function (PDF) with a 1σ range of 4,000 years and a maximum cut-off range of 3σ (±12,000 years). The PDF is truncated to the range of our target time interval (24-0 ka), such that towards the tail ends of the reconstruction interval the PDF becomes half-normal. We then sampled 60 prior ages from this PDF and rounded them to 0, 3, 6, 9, 12, 14, 16, 18 or 21 ka, the discrete time-slice intervals at which iCESM simulations are available (Extended Data Table 1). For each randomly drawn and rounded age, a model prior was selected (with replacement) from the corresponding iCESM time-slice simulation.
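The evolving-prior sampling can be sketched as follows. Rounding each draw to the nearest available time slice is our reading of 'rounded'; this is a minimal sketch, not the production code:

```python
import numpy as np

SLICE_KA = np.array([0, 3, 6, 9, 12, 14, 16, 18, 21])  # available iCESM slices

def sample_evolving_prior(target_ka, n=60, sigma=4.0, seed=None):
    """Draw n prior ages from a normal PDF centred on target_ka
    (1-sigma = 4 kyr, truncated at 3-sigma and to 0-24 ka), then round
    each draw to the nearest discrete iCESM time slice (with replacement)."""
    rng = np.random.default_rng(seed)
    ages = []
    while len(ages) < n:
        d = rng.normal(target_ka, sigma)
        if abs(d - target_ka) <= 3.0 * sigma and 0.0 <= d <= 24.0:
            ages.append(d)
    ages = np.asarray(ages)
    nearest = np.abs(ages[:, None] - SLICE_KA[None, :]).argmin(axis=1)
    return SLICE_KA[nearest]

prior_ages = sample_evolving_prior(21.0, seed=42)
# each member is one of the discrete slices, weighted toward 21 ka
```

For a target near 0 ka the truncation makes the sampling half-normal, so Holocene intervals never draw glacial priors, as intended.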
The 1σ range of 4,000 years was chosen to balance the need to include adequate variability in the prior against excluding model priors that are not physically justified (that is, the inclusion of LGM priors when assimilating mid-to-late Holocene climatic states, and vice versa). As in ref. 10, rank histogram analysis of our withheld validation proxies 74 suggested minimal mean bias of our model priors using this 1σ length scale, but an apparent lack of structural variance (as suggested by a U-shaped rank histogram; see Extended Data Fig. 2 of ref. 10). While increasing the length scale to include a broader range of priors would increase prior variance, validation testing indicated that the 4,000-year length scale was near-optimal, and it also resulted in substantial improvement over an 'agnostic' prior sampling scheme (for example, one that assigns equal probability to including a prior from any given iCESM time slice; see the 'Internal and external validation testing' section below). We note that the variance in our model prior is fundamentally restricted by the use of a single model (iCESM). Further work is needed to test the sensitivity of the LGMR reconstruction to the use of different isotope-enabled model priors (once available).

Internal and external validation testing
Statistical validation and tuning of our LGMR product was conducted in two ways, referred to hereafter as 'internal' and 'external' validation. The first approach ('internal' validation) involves withholding 20% of the marine proxies per iteration (see the 'Paleoclimate data assimilation' section above) and then using the posterior SST, SSS and δ¹⁸Osw fields to forward model these withheld proxy records. The predicted proxy records were then compared with the actual proxy records using standard skill diagnostics: the coefficient of efficiency (CE; a value between −∞ and 1, where a value >0 is conventionally taken to represent skill over climatology), the squared Pearson product-moment coefficient (R²) and the root mean square error of prediction (RMSEP). The computation of multiple posterior ensembles (that is, N = 500), each with 20% withholding, implies that each proxy record was randomly withheld and internally validated 100 times on average. These tests yield, on average, CE values that are greater than 0 with no obvious signs of systematic spatial biasing, indicative of skill in our posterior assimilation above our evolving iCESM prior fields. On a global basis, all posterior-predicted proxies exhibit a strong correspondence to observed values, with R² > 0.95 and slopes within 5% of their respective 1:1 lines (Extended Data Fig. 2), indicating a lack of systematic bias in the LGMR oceanic climatologies.
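The three internal-validation diagnostics can be computed as follows (a straightforward sketch; CE is evaluated against the observed record's own mean, following the usual convention):

```python
import numpy as np

def validation_stats(obs, pred):
    """Coefficient of efficiency (CE), squared Pearson correlation (R2),
    and root mean square error of prediction (RMSEP) between an observed
    proxy record and its posterior-predicted counterpart."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    resid = obs - pred
    ce = 1.0 - np.sum(resid**2) / np.sum((obs - obs.mean())**2)
    r2 = np.corrcoef(obs, pred)[0, 1] ** 2
    rmsep = np.sqrt(np.mean(resid**2))
    return ce, r2, rmsep

# A prediction offset by a constant keeps R2 = 1 but lowers CE:
ce, r2, rmsep = validation_stats([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
# ce ~ 0.2, r2 ~ 1.0, rmsep = 1.0
```

The example illustrates why CE is the more stringent of the covariance-style diagnostics: unlike R², it penalizes mean bias and amplitude errors, not just a loss of correlation.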
Following ref. 10, we also use independent ice core and speleothem records of δ¹⁸Op to externally validate the LGMR. In this more stringent analysis, we compare posterior δ¹⁸Op to published ice core δ¹⁸O (which is taken as a direct indicator of precipitation-weighted mean-annual δ¹⁸Op, given that post-depositional processes such as isotopic diffusion 75 and sublimation 76 do not typically impact ice core record integrity across centennial and longer timescales) and speleothem δ¹⁸O, which is first converted to δ¹⁸Op via the methodology of ref. 77 (see also ref. 10). For the speleothem data, we used the SISAL version 1b database 78. Records were included in our compilation solely on the basis that they span at least 18 kyr, that is, at least three-quarters of the LGMR reconstruction interval, ensuring overlap with the deglacial period (around 17-9 ka; Fig. 2). Record-specific details are provided in Extended Data Table 2. Following ref. 10, we focus on δ¹⁸Op deviations (Δδ¹⁸Op), which we generate by differencing all δ¹⁸Op values at each time-slice interval relative to the 0 ka baseline. This approach is premised on the expectation that δ¹⁸Op deviations should be adequately captured by the LGMR 10 despite known mean δ¹⁸Op biases in iCESM 22. We then compare both prior and posterior Δδ¹⁸Op with observed Δδ¹⁸Op at the iCESM time-slice intervals (3, 6, 9, 12, 14, 16, 18 and 21 ka) using our statistical diagnostics of covariance and prediction error (R² and RMSEP). A positive ΔR² (that is, a stronger relationship with observed values in the LGMR versus the prior) and a negative ΔRMSEP (that is, reduced prediction error in the LGMR versus the prior) imply improvement in our LGMR posterior relative to the iCESM priors.
Overall, this external validation test indicates that the LGMR substantially improves over the prior, with a nearly 30% error reduction (RMSEP_prior = 2.60‰; RMSEP_posterior = 1.92‰) and nearly two times greater variance explained by our posterior-predicted values relative to the prior (R²_prior = 0.37; R²_posterior = 0.62). Although much of the improvement is driven by ice core Δδ¹⁸Op estimates (Extended Data Figs. 3 and 4), offsets with speleothem Δδ¹⁸Op observations are also strongly reduced in the LGMR relative to iCESM. The comparably poor temporal covariance shown by global speleothem Δδ¹⁸Op values relative to ice cores (Extended Data Fig. 4; Extended Data Table 2) may reflect local-scale influences on speleothem Δδ¹⁸Op records, such as groundwater storage, mixing, recharge and residence-time variations; subgrid-scale topographic and/or precipitation influences; and uncertainties arising from indirectly inferring δ¹⁸Op from calcite or aragonite δ¹⁸O measurements 77. In addition, the iCESM prior range of δ¹⁸Op across the LGM to present in the tropics is considerably smaller than in the high latitudes (for example, Extended Data Fig. 4), which might restrict the posterior solutions for the speleothems (see ref. 10).
We used external validation testing to choose both the covariance localization radius and the evolving-prior 1σ range (see the 'Paleoclimate data assimilation' section for a description of each). Between the two, our tests show that the LGMR is most sensitive to the choice of localization radius. We tested values between 6,000 km and infinite (that is, no localization) and found that a relatively broad localization cut-off (24,000 km) is near-optimal (Extended Data Table 3). In contrast, the LGMR shows comparably little sensitivity to the choice of 1σ range for sampling iCESM priors, with acceptable external validation scores for values between 1σ = 2,000 and 6,000 years (Extended Data Table 3). For our final LGMR product we chose a value of 1σ = 4,000 years, as this was shown to provide near-optimal validation scores (Extended Data Table 3), while also constituting a reasonable middle ground between enabling adequate variance amongst iCESM model priors throughout the past 24 kyr and excluding physically unjustifiable states (see discussion above).

Data availability
All LGMR and associated proxy data are publicly available via the National Oceanic and Atmospheric Administration (NOAA) Paleoclimatology Data Archive (https://www.ncdc.noaa.gov/paleo/study/33112). Source data are provided with this paper.