Alkenone isotopes show evidence of active carbon concentrating mechanisms in coccolithophores as aqueous carbon dioxide concentrations fall below 7 µmol L−1

Coccolithophores and other haptophyte algae acquire the carbon required for metabolic processes from the water in which they live. Whether carbon is actively moved across the cell membrane via a carbon concentrating mechanism, or passively through diffusion, is important for haptophyte biochemistry. The possible utilisation of carbon concentrating mechanisms also has the potential to overprint one proxy method by which ancient atmospheric CO2 is reconstructed using alkenone isotopes. Here I show that carbon concentrating mechanisms are likely used when aqueous carbon dioxide concentrations are below 7 µmol L−1. I use published alkenone-based CO2 reconstructions from multiple sites over the Pleistocene, which allows comparison to be made with ice core CO2 records. Interrogating these records reveals that the relationship between proxy and ice core CO2 breaks down when local aqueous CO2 concentration falls below 7 µmol L−1. The recognition of this threshold explains why many alkenone-based CO2 records fail to accurately replicate ice core CO2 records, and suggests the alkenone proxy is likely robust for much of the Cenozoic, when this threshold was unlikely to be reached in much of the global ocean.

Copyright statement. This work is distributed under the Creative Commons Attribution 4.0 License.


Introduction
Alkenones are long-chain (C37−39) ethyl- and methyl-ketones (Figure 1; Brassell et al., 1986; Rechka and Maxwell, 1987) produced by a restricted group of photosynthetic haptophyte algae (Conte et al., 1994). Because they are produced by a narrow group of organisms which live exclusively in the photic zone, alkenones allow probing of algal biogeochemistry, and as they are often preserved in the sedimentary record, they can also provide information about past environmental conditions. Two main proxy systems based on alkenone geochemistry exist: (1) for sea surface temperature (SST), which relies on the changing degree of unsaturation of the C37 alkenone (UK37), and (2) for atmospheric CO2, based on reconstructing the isotopic fractionation which takes place during photosynthesis (εp) using the carbon isotopic composition of the preserved alkenones.

Figure 1. Alkenones are C37 unsaturated methyl ketones (Rechka and Maxwell, 1987).
In the modern ocean alkenones are produced primarily by two dominant coccolithophore species: Emiliania huxleyi and Gephyrocapsa oceanica. E. huxleyi first appeared 290 kyr ago, and began to dominate over G. oceanica around 82 kyr ago (Gradstein et al., 2012; Raffi et al., 2006). However, alkenones are commonly found in sediments throughout the Cenozoic, with the oldest reported detections from mid-Albian aged black shales (Farrimond et al., 1986). Prior to the evolution of G. oceanica, alkenones were most likely produced by other closely related species from the Noelaerhabdaceae family (Marlowe et al., 1990; Volkman, 2000).
One plausible reason for these discrepancies is the action of active carbon concentrating mechanisms (CCMs) in the haptophytes. These are potentially important as CO2(εp−alk) assumes purely passive uptake of carbon into the haptophyte cell (Laws et al., 1995; Bidigare et al., 1997). The potential for CCMs to be active and to affect CO2(εp−alk) has long been known (Laws et al., 2002; Cassar et al., 2006) and recent work has refocussed efforts on understanding CCMs in CO2(εp−alk) (Bolton et al., 2012; Bolton and Stoll, 2013; Stoll et al., 2019; Zhang et al., 2019, 2020). In this study I use the now large number of published CO2(εp−alk) records which overlap with ice core records of atmospheric CO2 (Tables 1 & 2) to explore the relationship between CO2(εp−alk) and CCMs in the Pleistocene, where our understanding of atmospheric CO2 is best.

Materials and Methods

Calculating CO2 from alkenone δ13C: the CO2(εp−alk) proxy

Multiple records of CO2(εp−alk) have been published for the Pleistocene (Figure 2, Table 1), allowing direct comparison with ice core based CO2 records (Table 2). These records are globally distributed in longitude, but are concentrated at low latitude sites, largely as there is a general preference for sites which have (in the modern ocean) surface waters close to equilibrium with the atmosphere (Figure 2, Table 1). In longer term palaeoclimate studies there has also been a preference for low latitude, gyre sites in the belief that these sites are more likely to be oceanographically stable over long time intervals (Pagani et al., 1999). Most of the records included here (Table 1, Figure 2) were generated with the aim of reconstructing atmospheric CO2; however one, the MANOP C Site of Jasper et al. (1994), was used explicitly to reconstruct changing disequilibrium due to oceanographic frontal changes over time, and so is excluded from the following analysis.
Whilst these sites span only a relatively small latitudinal extent, the diversity of settings does allow for investigation of any secondary controls on alkenone δ13C (δ13Calkenone). In particular, differences in oceanographic setting and SST can be used to test the hypothesis that low [CO2](aq) breaks the relationship between δ13Calkenone and atmospheric CO2, as might be expected if haptophytes are able to actively take up carbon from seawater to meet metabolic demand, i.e. activate CCMs.
To facilitate fair comparison between sites and consistent comparison with the ice core records, I recalculated all CO2(εp−alk) records using a consistent approach. The approach is based on Bidigare et al. (1997), which updated the initial approach of Jasper and Hayes (1990) to CO2(εp−alk).
This approach removes some additional corrections used in the original publication of the records (such as the growth rate adjustment for NIOP 464 (Palmer et al., 2010)) but does allow direct comparison to be made.
An overview of how CO2(εp−alk) data are typically generated is given in Badger et al. (2013b).
Briefly, calculating εp requires the carbon isotopic composition of the dissolved CO2 (δ13CCO2(aq)) and of haptophyte biomass (δ13Corg). The isotopic fractionation between δ13Calkenone and δ13Corg is first corrected assuming a constant fractionation (εalkenone) of 4.2 ‰ (Popp et al., 1998; Bidigare et al., 1997). The isotopic composition of dissolved inorganic carbon (DIC) is estimated using (ideally) the δ13C of planktic foraminifera and the temperature-dependent fractionation between calcite and CO2(g) experimentally determined by Romanek et al. (1992), where T is sea surface temperature (SST) in degrees Celsius. The value of the carbon isotopic composition of CO2(g) (δ13CCO2(g)) can then be calculated. From this, δ13CCO2(aq) can be calculated using the relationship experimentally determined by Mook et al. (1974). Finally, εp can be calculated, and from that [CO2](aq) is calculated using the isotopic fractionation during carbon fixation (εf) and 'b', which represents the summation of physiological factors. Here εf is assumed to be a constant 25 ‰ (Zhang et al., 2020) but for the purposes of this analysis is assumed to hold. This is discussed further below. Values for SST, δ13Calkenone, δ13Ccarbonate, salinity and [PO4 3−] are either taken from the original publications or estimated from modern ocean estimates (Takahashi et al., 2009; Antonov et al., 2010; Garcia et al., 2013; Locarnini et al., 2013).
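The chain of calculations described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in this study: the function name and arguments are hypothetical, the explicit forms of the two temperature-dependent fractionations are the ones commonly quoted from Romanek et al. (1992) and Mook et al. (1974) and are assumptions here, and the value of 'b' must be supplied by the user.

```python
def co2_aq_from_alkenone(d13c_alkenone, d13c_carbonate, sst_c, b,
                         eps_alkenone=4.2, eps_f=25.0):
    """Return (eps_p, [CO2](aq)) from alkenone and carbonate d13C (per mil).

    sst_c is SST in degrees Celsius; b carries the units of the result
    (e.g. per mil umol L-1 gives [CO2](aq) in umol L-1).
    """
    # Alkenone -> haptophyte biomass, constant 4.2 per mil fractionation
    d13c_org = (d13c_alkenone + 1000.0) * (eps_alkenone / 1000.0 + 1.0) - 1000.0

    # Calcite - CO2(g) fractionation (assumed form, after Romanek et al., 1992)
    eps_calcite_co2g = 11.98 - 0.12 * sst_c
    d13c_co2g = ((d13c_carbonate + 1000.0)
                 / (eps_calcite_co2g / 1000.0 + 1.0) - 1000.0)

    # CO2(aq) - CO2(g) fractionation (assumed form, after Mook et al., 1974),
    # with temperature converted to kelvin
    eps_co2aq_co2g = -373.0 / (sst_c + 273.15) + 0.19
    d13c_co2aq = ((eps_co2aq_co2g / 1000.0 + 1.0)
                  * (d13c_co2g + 1000.0) - 1000.0)

    # Photosynthetic fractionation, then purely diffusive uptake model
    eps_p = ((d13c_co2aq + 1000.0) / (d13c_org + 1000.0) - 1.0) * 1000.0
    co2_aq = b / (eps_f - eps_p)
    return eps_p, co2_aq


# Illustrative inputs only (not data from Table 1)
eps_p, co2_aq = co2_aq_from_alkenone(-24.0, 1.5, sst_c=25.0, b=150.0)
```

Because the final step divides by (εf − εp), the estimate becomes very sensitive to small errors in εp as εp approaches εf.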
Atmospheric CO2 is then calculated from [CO2](aq) (and vice versa if atmospheric CO2 is known) using Henry's law. The solubility coefficient (KH) is dependent on salinity and SST, and here is calculated following the parameterization of Weiss (1970, 1974).
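As a rough sketch of the Henry's law step, the conversion in both directions might look as follows. The function names are hypothetical, the coefficients are those commonly quoted for the Weiss (1974) solubility fit in mol L−1 atm−1 and should be checked against the original before use, and the sketch treats µatm and ppm as interchangeable (no water vapour or fugacity correction).

```python
import math


def weiss_k0(sst_c, salinity):
    """CO2 solubility in mol L-1 atm-1 (coefficients assumed, after Weiss 1974)."""
    t = (sst_c + 273.15) / 100.0  # absolute temperature divided by 100
    ln_k0 = (-58.0931 + 90.5069 / t + 22.2940 * math.log(t)
             + salinity * (0.027766 - 0.025888 * t + 0.0050578 * t * t))
    return math.exp(ln_k0)


def atm_co2_from_aq(co2_aq_umol_l, sst_c, salinity):
    """Henry's law: pCO2 (uatm) = [CO2](aq) (umol L-1) / K0."""
    return co2_aq_umol_l / weiss_k0(sst_c, salinity)


def aq_co2_from_atm(pco2_uatm, sst_c, salinity):
    """Inverse: predicted [CO2](aq) (umol L-1) for a known atmospheric CO2."""
    return pco2_uatm * weiss_k0(sst_c, salinity)
```

The inverse function is the one needed to predict [CO2](aq) at a site from an ice core CO2 value, given local SST and salinity.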

Multi-site comparisons between CO2(εp−alk) and the ice core records

Across the six sites included in this analysis, there are 217 CO2(εp−alk)-based estimates of atmospheric CO2 over the past 260 ka for comparison with the ice core records (Table 2; Bereiter et al., 2015). When all CO2(εp−alk) estimates are considered together over 260 ka, this compilation of proxy-based records fails to replicate the ice core record (Figure 3). This has already been noted at specific sites (e.g. Site 999 in the Caribbean; Badger et al., 2019) but this is the first time that all available records coincident with the Pleistocene ice core records have been compiled using a common methodology. Notably, the CO2(εp−alk)-based estimates are rarely lower than the time-equivalent ice core estimate, but frequently higher. Given that haptophytes require carbon to satisfy metabolic demand, this is perhaps unsurprising; if at times of low carbon availability haptophytes can switch from passive to active uptake to satisfy metabolic demand, it would be at times of low atmospheric CO2 (and so lower [CO2](aq)) when the active uptake is most likely to be needed. As CO2(εp−alk)-based estimates of atmospheric CO2 rely on the assumption of a purely diffusive uptake of carbon, it is therefore likely that the proxy would perform least well at times of low atmospheric CO2.

The haptophytes do not directly interact with the atmosphere, obtaining their carbon from dissolved carbon. As it is not only atmospheric CO2 which controls the concentration of dissolved carbon ([CO2](aq)), but also temperature, alkalinity and other oceanographic factors which control the equilibrium state between surface waters and the atmosphere (Figure 2), the multiple sites in different settings now give the opportunity to test whether other factors are important in controlling the accuracy of CO2(εp−alk).
To produce time-equivalent estimates of atmospheric CO2 for comparison with the ice core records, a simple linear interpolation of the Bereiter et al. (2015) compilation was initially used (Figure 4). This assumes that both the age model of the ice core and the published age models of the sites are correct and equivalent. This is almost certainly not the case, and so for the calculations below, a ±3000 year uncertainty is included for ages of both the ice core and CO2(εp−alk) values. Figure 4 shows that CO2(εp−alk)-based atmospheric CO2 agrees with ice core CO2 at some sites and at some times, but not throughout. Sites such as 05-PC21 (Bae et al., 2015) and DSDP Site 619 (Jasper and Hayes, 1990) perform quite well throughout, whilst others only appear to agree at higher values of CO2, such as ODP Site 999 (Badger et al., 2019) and NIOP 464 (Palmer et al., 2010). This suggests that the fidelity of CO2(εp−alk) depends on the concentration of [CO2](aq), improving at higher levels of [CO2](aq).

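The sampling of the ice core record at each proxy sample age, with and without an age uncertainty, can be sketched as below. This is an illustration under stated assumptions, not the code used here: the function names are hypothetical, ages are assumed to be in years, and the 3000 year uncertainty is treated as a normally distributed 1σ value applied to the sample ages.

```python
import numpy as np


def ice_core_at(sample_ages, ice_ages, ice_co2):
    """Linearly interpolate an ice core CO2 record at the given sample ages."""
    ice_ages = np.asarray(ice_ages, dtype=float)
    ice_co2 = np.asarray(ice_co2, dtype=float)
    order = np.argsort(ice_ages)  # np.interp needs increasing x values
    ages = np.asarray(sample_ages, dtype=float)
    out = np.interp(ages.ravel(), ice_ages[order], ice_co2[order])
    return out.reshape(ages.shape)


def ice_core_at_perturbed(sample_ages, ice_ages, ice_co2,
                          age_sigma=3000.0, n_draws=1000, seed=0):
    """Sample the ice core record at ages perturbed by N(0, age_sigma).

    Returns an array of shape (n_draws, n_samples).
    """
    rng = np.random.default_rng(seed)
    ages = np.asarray(sample_ages, dtype=float)
    draws = rng.normal(ages, age_sigma, size=(n_draws, ages.size))
    return ice_core_at(draws, ice_ages, ice_co2)
```

Note that `np.interp` holds the end values constant outside the range of the ice core record, so perturbed ages that fall beyond either end of the record are not extrapolated.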
To further investigate this potential relationship, I progressively exclude samples based on [CO2](aq)−predicted with a step size of 0.05 µmol L−1, again calculating Pearson correlation coefficients between ice core and CO2(εp−alk) for each subsample of the population. The result is shown in Figure 6. Here the analysis shows, similar to Figure 5, that as the samples with lowest [CO2](aq)−predicted are progressively removed, the correlation between ice core and CO2(εp−alk) increases. Furthermore, this increase continues only until [CO2](aq)−predicted reaches 7 µmol L−1. Above this, the correlation coefficient plateaus, until the subsample reaches such a small size that spurious correlations become important (Figure 6b).
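The progressive exclusion described above amounts to a simple threshold scan. The following sketch assumes three equal-length arrays (predicted [CO2](aq), proxy CO2 and time-equivalent ice core CO2); the function name and the `min_n` cut-off are hypothetical choices for illustration.

```python
import numpy as np


def threshold_scan(co2_aq_pred, proxy_co2, ice_co2, step=0.05, min_n=3):
    """Pearson r between ice core and proxy CO2 as low-[CO2](aq) samples are dropped.

    Returns rows of (threshold, n, r), keeping samples with
    predicted [CO2](aq) >= threshold at each step.
    """
    aq = np.asarray(co2_aq_pred, dtype=float)
    proxy = np.asarray(proxy_co2, dtype=float)
    ice = np.asarray(ice_co2, dtype=float)
    rows = []
    for threshold in np.arange(aq.min(), aq.max() + step, step):
        keep = aq >= threshold
        n = int(keep.sum())
        if n < min_n:  # too few points for a meaningful correlation
            break
        r = np.corrcoef(ice[keep], proxy[keep])[0, 1]
        rows.append((threshold, n, r))
    return np.array(rows)
```

Plotting the third column against the first gives a curve of the kind described for Figure 6a, and the second column against the first gives the sample size of Figure 6b.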

Sensitivity and Uncertainty Tests
As it is not impossible that a similar pattern could emerge if the dataset were shaped so that there was increased density surrounding the 1:1 correlation line, I ran a series of sensitivity experiments. In these, rather than reducing the sample by filtering by [CO2](aq)−predicted, the whole dataset (Table 1) was randomly ordered, and then stepwise subsampled so that the number of samples equalled the number of values for each value of [CO2](aq)−predicted (i.e. for each point in Figure 6, an equivalently sized but randomly selected sample was made, such that for any equivalent value of [CO2](aq)−predicted the randomly ordered sample had an equivalent n as shown in Figure 6b). Pearson correlation coefficients were calculated for each subsample as above. To allow for possible age model uncertainties, a 3000 year (1σ) uncertainty was also applied to each sample. This uncertainty was applied to the age of each sample prior to sampling of the ice core record, and is applied as a normally distributed uncertainty.

Uncertainty in CO2(εp−alk) measurements is typically calculated using Monte Carlo modelling of all the parameters (e.g. Pagani et al., 1999; Badger et al., 2013a, b); however, this was not done in all the published work (Table 1), and some differences in approach were found across the published work. Therefore, to create CO2(εp−alk) uncertainty estimates for each value in this study, I emulate the uncertainties based on the CO2(εp−alk) value.

Figure 5. Crossplots of CO2(εp−alk)-based atmospheric CO2 (Table 1; y-axes) vs the time-equivalent estimate from ice core records (x-axes; Bereiter et al., 2015; Table 2).
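The random-subsampling sensitivity test described above can be sketched as follows. The function name is hypothetical, and the sketch covers only the random ordering and size-matched subsampling; the age and CO2(εp−alk) uncertainties would be added by perturbing the inputs before each draw.

```python
import numpy as np


def random_subsample_r(proxy_co2, ice_co2, sizes, n_draws=1000, seed=0):
    """Null test: Pearson r for randomly ordered subsamples of matching size.

    sizes should be the sequence of n values from the threshold filtering
    (each at least 3). Returns the mean and standard deviation of r
    across draws for each size.
    """
    rng = np.random.default_rng(seed)
    proxy = np.asarray(proxy_co2, dtype=float)
    ice = np.asarray(ice_co2, dtype=float)
    out = np.empty((n_draws, len(sizes)))
    for i in range(n_draws):
        order = rng.permutation(proxy.size)  # random ordering of the dataset
        for j, n in enumerate(sizes):
            idx = order[:n]                  # stepwise subsample of size n
            out[i, j] = np.corrcoef(ice[idx], proxy[idx])[0, 1]
    return out.mean(axis=0), out.std(axis=0)
```

If the rise in correlation seen when filtering by [CO2](aq)−predicted were only an effect of shrinking sample size, the mean r from this null test would be expected to rise in the same way.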
I built a simple emulator (Figure 7) by running Monte Carlo uncertainty estimates for all of the included datasets (Table 1) using the same estimates of uncertainty for each variable in the CO2(εp−alk) calculation as applied in Badger et al. (2013a, b).
This then allows the uncertainty to be included in the [CO2](aq)−predicted calculation as well as CO2(εp−alk), and allows uncertainty estimates to be site-independent.
The result is shown in Figure 8, and suggests that the 7 µmol L−1 break point remains valid. The absolute value of r² is reduced, even at higher [CO2](aq)−predicted, but this would be expected given the addition of uncertainty in the age model, as the published age is most likely to align with the ice core. Given the rapid rate of change at deglaciations, this effect is likely to be particularly pronounced in this dataset, as many records have high temporal resolution around deglaciations in order to attempt to resolve them. Any small age model offset introduced by the error modelling in these intervals also clearly has the potential to induce large differences between the CO2(εp−alk) and ice core values. Figure 8 clearly demonstrates that it is the filtering by [CO2](aq)−predicted, rather than any spurious correlations, which determines the shape of the data in Figures 6 and 8.

Figure 6. Correlation of CO2(εp−alk)-based atmospheric CO2 (Table 1) vs the time-equivalent estimate from ice core records (Bereiter et al., 2015; Table 2). The sample reduces stepwise by 0.05 µmol L−1, and the number of records in each subsample is shown in panel b.

https://doi.org/10.5194/bg-2020-356. Preprint. Discussion started: 2 October 2020. © Author(s) 2020. CC BY 4.0 License.

Discussion
The plateau in r² in Figures 6a and 8a suggests that below a [CO2](aq)−predicted of ∼7 µmol L−1, CO2(εp−alk) is no longer as good a predictor of ice core CO2 as when [CO2](aq)−predicted > 7 µmol L−1. This is clear from comparing the relationship between samples where [CO2](aq)−predicted < 7 µmol L−1 with those where [CO2](aq)−predicted > 7 µmol L−1 in Figure 9. Here the r² for the former of 0.15 is substantially less than the latter of 0.55. I suggest that this is because below this threshold, the fundamental assumption of CO2(εp−alk), that carbon is passively taken up by haptophytes, no longer holds true. One obvious explanation for why this would be the case is that at low levels of [CO2](aq) haptophytes have to actively take up carbon in order to satisfy metabolic demand.

Figure 7. Emulated CO2(εp−alk) uncertainty, based on Monte Carlo uncertainty estimates for the datasets in Table 1 applying the same approach to uncertainty as Badger et al. (2013a, b). Estimates used in this study are highlighted in blue.
Similar behaviour has been recognised in some culture studies (Laws et al., 2002; Cassar et al., 2006), with some evidence that the diatom Phaeodactylum tricornutum even has a similar CCM threshold, but this study is the clearest evidence of the behaviour in alkenone-based studies of the environment.
By applying a threshold value for [CO2](aq)−predicted of 7 µmol L−1 to the published records (Table 1)

Figure 8. Correlation of CO2(εp−alk)-based atmospheric CO2 (Table 1) vs the time-equivalent estimate from ice core records (Bereiter et al., 2015; Table 2). As in Figure 6, the sample reduces stepwise by 0.05 µmol L−1. Panel a shows a 1000 member Monte Carlo analysis, whereby uncertainty in CO2(εp−alk) and age is considered, as detailed in the text. Panel b shows a similar 1000 member Monte Carlo analysis, but with random sampling of the whole CO2(εp−alk) population so that the number of samples is equivalent to the dataset shown in panel a, i.e. the size of the sample follows that shown in Figure 6b. Means and one σ uncertainties are shown as the bold lines.
of atmospheric CO2 and above (Martínez-Botí et al., 2015) at all but the warmest surface ocean temperatures, CO2(εp−alk) is likely to be a reliable system for most of the Cenozoic. It is only in the Pleistocene that atmospheric CO2 is low enough for CCMs to be widely active across the surface ocean, with the low CO2 glacials providing the most difficulty (Badger et al., 2019). This finding aligns well with evidence that CCMs developed in coccolithophores as a response to declining atmospheric CO2 through the Cenozoic, and were developing in [CO2](aq)-limited parts of the ocean in the late Miocene at the earliest, and were likely not widespread until the Plio-Pleistocene (Bolton et al., 2012; Bolton and Stoll, 2013).

Figure 9. Correlations between CO2(εp−alk) and ice core CO2 where [CO2](aq)−predicted > 7 µmol L−1 (black symbols) and [CO2](aq)−predicted < 7 µmol L−1 (red symbols).

Recent work has attempted to correct for the existence of CCMs in palaeo-records of atmospheric CO2 (Stoll et al., 2019; Zhang et al., 2020). However, these assume that CCMs are always active, and that Pleistocene records can be used to correct for them throughout the Cenozoic. If, as suggested by the analyses presented here, CCMs only act at low [CO2](aq), and largely only in conditions prevalent throughout the late Pliocene and Pleistocene, it is plausible that corrections based on Pleistocene records could over-compensate for CCMs in the rest of the Cenozoic, when the assumption of passive carbon uptake inherent in CO2(εp−alk) as traditionally applied may still be valid.

Conclusions
Reconstructions of past atmospheric CO2 with proxy tools like CO2(εp−alk) are critical to understanding how the Earth's climate system operates, so long as the tools used can be relied upon to be accurate and precise. This re-analysis of existing Pleistocene CO2(εp−alk) records reveals that below a critical threshold of [CO2](aq) of 7 µmol L−1 the relationship between δ13Calkenone and atmospheric CO2 breaks down, plausibly because below this threshold haptophytes are able to actively take up carbon using CCMs in order to satisfy metabolic demand.
Although reconstructing the low levels of atmospheric CO2 in the Pleistocene glacials and in areas of the global ocean where [CO2](aq) is less than 7 µmol L−1 will be impossible, for much of the Cenozoic the CO2(εp−alk) proxy retains utility. If care is taken to avoid regions and oceanographic settings where [CO2](aq) may be expected to be abnormally low, CO2(εp−alk) remains an important and useful proxy to understand the Earth system.
Author contributions. MPSB conceived the study, designed the methodology, analysed the data, prepared the figures and wrote the manuscript (conceptualization, formal analysis, investigation, methodology, visualization, writing - original draft, review and editing).

Competing interests. MPSB declares that he has no conflict of interest.

Acknowledgements. I am grateful to Gavin Foster and Tom Chalk for frequent and stimulating discussions on alkenone paleobarometry. I thank all authors who made full datasets available online. I thank Kirsty Edgar for comments on various drafts that greatly improved this manuscript.