Crop residues are a key feedstock to bioeconomy but available methods for their estimation are highly uncertain

Crop residues are acknowledged as a key biomass resource to feed tomorrow’s sustainable bioeconomy. Yet, the quantification of these residues at large geographical scales is primarily reliant upon generic statistical estimations based on empirical functions linking the residues production to the primary crop yield. These useful yet unquestioned functions are developed either using direct evidence from experimental results or literature. In the present study, analytical evidence is presented to demonstrate that these methods generate imprecise and likely inaccurate estimates of the actual biophysical crop residue potential. In this endeavor, we applied five of the most used functions to a national case study. France was selected, being the country with the largest agricultural output in Europe. Our spatially-explicit assessment of crop residues production was performed with a spatial resolution corresponding to the level of an administrative department (96 departments in total), also the finest division of the European Union’s hierarchical system of nomenclature for territorial units (NUTS), and included 17 different crop residues. The theoretical potential of crop residues for the whole of France was found to vary from 987 PJ Y-1 to 1369 PJ Y-1, using different estimation functions. The difference observed is more than the entire annual electricity consumption of Belgium, Latvia, and Estonia combined. Perturbation analyses revealed that some of the functions are overly sensitive to a fluctuation in primary crop yield, while analytical techniques such as the null hypothesis statistical test indicated that the crop residues estimates stemming from all functions were all significantly different from one another.


Introduction
Terrestrial lignocellulosic biomass from crop residues (CR) (e.g. cereal straw) is a significant carbon feedstock for supplying a well-below 2°C economy with the non-fossil carbon it requires (Bentsen et al., 2014; Hamelin et al., 2019; Williams et al., 2016). In fact, fossil fuel carbon dioxide (CO2) emissions are the leading cause of human-induced climate change, accounting for ca. 69% of global greenhouse gas (GHG) emissions (WRI, 2020). Substituting, to the extent possible, fossil carbon with biogenic carbon from residual sources like CR furthermore supplies a carbon source decoupled from the demand for additional arable land.
CR have been subjected to scientific scrutiny for many years, in particular over the last two decades, and are typically defined as an agrarian by-product. CR mainly consist of the dry stalks of cereal and oilseed crops after the product of interest (i.e., grain or seed) is harvested, and of the top stem and leaves of root and tuber crops (e.g., potato or beetroot). Figure 1 shows the typical representation of a cereal and oilseed crop in terms of the above-ground and below-ground partitioning of the biomass. The above-ground biomass is partitioned into primary crop yield (to be harvested, also referred to as economic yield), harvestable residues (which may or may not be collected), and non-harvestable residues (which the machinery or specific farm management does not permit to be harvested in most cases) (Hamelin et al., 2012). In the case of tubers, the primary crop yield is below ground, and the harvestable residues are above ground.

Figure 1: Generic repartition of the above- and below-ground biomass for cereal and oilseed crops. Although the whole above-ground biomass could be harvested, a portion of it often remains unharvested and is considered unharvestable due to the specific farm management or harvester used.
Although CR and, in particular, cereal straw represent an important non-fossil carbon source in terms of the quantity generated worldwide, only the amount generated in surplus of current uses can be directly available for the bioeconomy, at least if market reactions caused by a change in supply are to be avoided. Apart from bioenergy production (e.g. straw-fired heat plants), CR already serve several competing demands, including fodder and bedding in animal husbandry, substrate for mushroom cultivation, and mulch on farms, among other applications (Haase et al., 2016; Scarlat et al., 2019; Tonini et al., 2016b). Furthermore, one essential function of CR is their role as a vital source of organic matter for soils, supplying carbon, nitrogen, and other nutrients. CR are also known for their ecosystem functions, such as acting as a protective layer against erosion (Haase et al., 2016) or enhancing soil water retention (Blanco-Canqui, 2013). Hence, the excessive removal of these residues from agricultural fields can decrease the long-term productivity of soils (Blanco-Canqui, 2013; FAO, 2017). Therefore, the economic and environmental sustainability of removing CR from fields requires careful, site-specific evaluation before any massive investment in CR-based bioeconomy solutions takes place. This challenge was first acknowledged by Scarlat et al. (2010), who presented a comprehensive assessment of the availability of CR in the European Union. Based on a literature review, the authors proposed sustainable removal rates varying between 40% and 50% according to the CR type, rates that allow soil organic matter to be maintained. The sustainable removal rates published by Scarlat et al. (2010) have been widely used in bioenergy and bioeconomy studies (Daioglou et al., 2016; Monforti et al., 2013; Searle and Malins, 2015). Apart from the study of Scarlat et al. (2010), several studies at scales varying from regional to global have proposed a variety of indicators to quantify the sustainable CR removal rates (Hansen et al., 2020; Muth et al., 2013; Ronzon and Piotrowski, 2017; Scarlat et al., 2019).
Yet, when it comes to bioeconomy planning, the starting point is to ascertain the total annual biophysical quantity of these residues, i.e., prior to applying any restrictions, whether related to sustainability or feasibility. This quantity is typically referred to as the theoretical potential (THP) (Bentsen and Felby, 2012). Providing THP estimates, although these carry their share of uncertainty, has the merit of supplying a transparent quantitative basis for decision-making. Scalar multipliers may subsequently be applied to the THP estimates by the stakeholders in charge of planning, to reflect techno-economic or environmental constraints (Ericsson and Nilsson, 2006; Haberl et al., 2010; Kadam and McMillan, 2003). Thus, in this study, we focus on the methods for estimating the THP of CR.
Actual field measurements would probably be the most accurate method for quantifying CR THP in a given plot. Yet, because CR are seldom traded as a market commodity, and because of the time and cost constraints associated with measuring unharvested CR, such measurements are rarely performed or available. To derive THP estimates at global, national, or even regional levels, statistical and empirical estimation methods have typically been used (Bentsen et al., 2014; García-Condado et al., 2019; Scarlat et al., 2010). Usually, CR production is estimated based on assumptions about the mathematical relationship between the primary crop yield and the residue yield. This relationship is generally expressed as a factor, the ratio of the residue yield to the primary crop yield, commonly referred to as the residue-to-product ratio (RPR). Some studies also use harvest indexes (HI) for estimating CR (e.g., Sommer et al., 2016). HI is defined as the primary crop yield expressed as a fraction of the total above-ground biomass produced.
Several studies suggest that RPR is better represented as a function of primary crop yield rather than as a fixed value (Bentsen et al., 2014; Scarlat et al., 2010). As reported by Ronzon and Piotrowski (2017), the functions proposed so far for estimating the residue yield are somewhat diverse, including linear (Fischer et al., 2007), logarithmic (Scarlat et al., 2010), hyperbolic (Bodirsky et al., 2012), inverse tangential (Edwards et al., 2005) or exponential (Bentsen et al., 2014). In reality, the quantity of CR generated over large geographic regions can vary significantly due to a plethora of factors such as soil type, prevailing meteorological conditions, harvesting practices, and primary crop yield, among other things. Some studies also reported that drought has an impact on the residue-to-product ratio, which may either decrease or increase if drought occurs at earlier or later growth stages, respectively (McCartney et al., 2006). Because of this diversity in the factors affecting the residue yield, there is no clear standard or set of rules for the quantification of crop residue THP at large geographical scales. Yet, it appears that despite the heavy focus on quantifying sustainable removal rates, studies have never challenged or addressed the potential significance of the choice of the initial THP estimation method in the first place, whether based upon HI or RPR functions.
Hence, the overall goal of this study is to evaluate the magnitude of possible differences in CR THP estimates resulting from the use of the most commonly reported functions for CR estimation. This is illustrated with a national case study for Metropolitan France, the European Union country with the largest agricultural output in economic terms (European Commission, 2020). We further address three specific sub-questions: (i) how variations in primary crop yield affect the estimation of CR yield for the assessed functions; (ii) how uncertainties in primary crop yield overshadow the differences observed in the estimated CR stemming from the functions assessed herein; and (iii) whether there are any significant differences in the RPRs estimated from the different estimation functions.

Scoping
The assessment considers all major crops grown in France and reported in the national statistics, here grouped into four categories (cereal crops, roots and tubers, protein crops, and oil crops), which comprise sixteen crops in total (Table 1). These represent ca. 20% of the overall land cover. The annual data on their production and surface area were obtained from the national agricultural statistics at the French departmental administrative level (corresponding to the NUTS-3 division in Eurostat's Nomenclature of Territorial Units for Statistics; Eurostat, 2015). For each department, average yields were calculated from 19 years of production and surface area data (2000-2018), as shown in Eq. 1:

$$\text{Primary crop yield}_{i,j} = \frac{\text{Production}_{i,j}}{\text{Surface area}_{i,j}}$$    Eq. (1)

where Primary crop yieldi,j is the economic (cereal) yield for crop i in department j, Productioni,j is the production of crop i in department j, and Surface areai,j is the corresponding agricultural surface for crop i in department j.
As detailed in the Supplementary Material 1 (SM1), the minimum and maximum records of crop production and surface area were identified for each crop and department in order to incorporate the range of annual variability in crop yield.
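As a minimal sketch of the yield calculation described above, the Python snippet below divides production by surface area for each year and then averages the annual yields of one crop-department pair. The averaging order (mean of annual ratios rather than ratio of multi-year totals), the function name `average_yield`, and the sample numbers are assumptions made for illustration; the text does not specify them.

```python
from statistics import mean


def average_yield(production_t, area_ha):
    """Mean of annual yields (t/ha) for one crop in one department.

    Assumption: each annual yield is production / surface area, and the
    multi-year average is the mean of those annual yields.
    """
    if len(production_t) != len(area_ha):
        raise ValueError("production and area series must have the same length")
    # Skip years without cultivated area to avoid dividing by zero
    annual_yields = [p / a for p, a in zip(production_t, area_ha) if a > 0]
    if not annual_yields:
        raise ValueError("no year with a positive surface area")
    return mean(annual_yields)


# Hypothetical three-year record for one crop in one department
production = [700_000.0, 760_000.0, 640_000.0]  # tonnes
area = [100_000.0, 100_000.0, 100_000.0]        # hectares
print(average_yield(production, area))
```

The minimum and maximum annual records could be extracted from the same series to represent the range of inter-annual variability.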

Estimation of crop residues using empirical functions
RPR is mathematically defined as the ratio of the above-ground harvestable biomass residue, here defined as residue yield, R, to the primary crop yield, Y (García-Condado et al., 2019), as shown in Eq. (2), which also presents the correspondence between RPR and HI:

$$\mathrm{RPR} = \frac{R}{Y} = \frac{1 - \mathrm{HI}}{\mathrm{HI}}$$    Eq. (2)

It should be noted that Eq. (2) was also presented in García-Condado et al. (2019), and is only valid to the extent that R refers to the overall generated residue (harvestable and non-harvestable; Figure 1).
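The correspondence between RPR and HI follows from the definition of HI as the primary crop yield divided by the total above-ground biomass, HI = Y / (Y + R), which rearranges to RPR = R / Y = (1 - HI) / HI. The short Python sketch below illustrates this conversion in both directions; the function names are hypothetical, and the relation assumes that R covers all generated residues (harvestable and non-harvestable).

```python
def rpr_from_hi(hi):
    """Residue-to-product ratio implied by a harvest index (0 < HI < 1)."""
    if not 0.0 < hi < 1.0:
        raise ValueError("harvest index must lie strictly between 0 and 1")
    return (1.0 - hi) / hi


def hi_from_rpr(rpr):
    """Harvest index implied by a residue-to-product ratio (RPR >= 0)."""
    if rpr < 0.0:
        raise ValueError("RPR cannot be negative")
    return 1.0 / (1.0 + rpr)


# A harvest index of 0.5 means residues and primary product are equal in mass
print(rpr_from_hi(0.5))
```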
The rationale for selecting different empirical functions for RPR varies between studies. Still, the essential notion behind most functions is that the residue yield is directly proportional to the primary crop yield (Scarlat et al., 2010). Based on this, Bentsen et al. (2014) as well as Ronzon and Piotrowski (2017) proposed an exponential relation between the crop and the residue yields. Scarlat et al. (2010), on the other hand, derived best-fit logarithmic function curves for RPR by plotting the values for RPR and primary crop yield based on data available in the literature. Edwards et al. (2005) derived RPR functions for wheat and barley, based on grain yields and empirical ranges of harvest indexes taken from de Vries (1999). The study of Fischer et al. (2007) proposed negative linear RPR functions, which do not limit the production of crop residue to a threshold. This, however, also implies that residue yields may decrease at very high levels of primary crop yields, as highlighted by Ronzon and Piotrowski (2017). On the other hand, Bentsen et al. (2014) argue that plant breeding has led to an increase in the HI without changing the overall plant biomass (Hay, 1995), indicating an asymptotic development of residue yield to a theoretical threshold only limited by physiological constraints. Thus, they considered piecewise continuous functions to derive RPR estimates. García-Condado et al. (2019) used empirical models to predict crop residues from annual yield statistics. Their models were developed based on experimental data from scientific literature. The functions mentioned above are summarized in Table 1. It can also be noted from Table 1 that although RPR functions typically differ from one crop to the other, there are also cases where exactly the same functions are proposed (e.g. the wheat and barley RPR functions of Edwards et al., 2005).
To maintain consistency with the terms used in the present study, the terminology in the functions has been changed.
c As per , other oil crops include flax, castor and oeillette.
The RPR functions presented in Table 1 were used to estimate spatially-explicit residue yields considering, for each administrative department, the primary crop yield and surface area data for each of the 16 crops included in this case study (Eq. 3).

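$$RP_{i,j} = RPR_{i,j} \times \text{Primary crop yield}_{i,j} \times \text{Surface area}_{i,j}$$    Eq. (3)

where RPi,j (residue production) is the amount of residue produced for crop i in department j, and RPRi,j is the residue-to-product ratio of crop i in department j.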
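The calculation of residue production per department can be sketched as follows, assuming that it is the product of the RPR, the primary crop yield, and the surface area. The linear RPR form used here (RPR = -0.14 × yield + 1.96) is the general structure quoted in the Results for the linear function; the department labels, yields, and areas are hypothetical.

```python
def linear_rpr(yield_t_ha, slope=-0.14, intercept=1.96):
    """Linear RPR function; coefficients follow the general structure
    quoted in the text and serve only as an illustration."""
    return slope * yield_t_ha + intercept


def residue_production(yield_t_ha, area_ha, rpr_function=linear_rpr):
    """Residue production (t) = RPR x primary crop yield (t/ha) x area (ha).

    Assumption: this product form stands in for the residue production
    equation, whose exact layout is not recoverable from the text.
    """
    rpr = rpr_function(yield_t_ha)
    return rpr * yield_t_ha * area_ha


# Hypothetical departments: (primary crop yield in t/ha, surface area in ha)
departments = {"A": (7.0, 1000.0), "B": (5.0, 2000.0)}
totals = {
    name: residue_production(crop_yield, area)
    for name, (crop_yield, area) in departments.items()
}
print(totals)
```

Swapping `rpr_function` for another estimation function while keeping the yield and area fixed isolates the effect of the function choice.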

Uncertainty assessment
Uncertainty assessment was used to address our three specific research sub-questions. Three tests were performed by considering wheat as a case example, as it represents a significant share of the generated CR (39% by production volume in France). In the first test, the extent to which the variation in primary crop yield affected the estimated residue yield (i.e. the sensitivity) was evaluated by performing a one-at-a-time (OAT) perturbation analysis (Bisinella et al., 2016). In the OAT analysis, primary crop yield values were changed by ±10% and ±50% of the original values, and residue yields were recalculated accordingly, using all the functions presented in Table 1.
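The OAT procedure can be illustrated with a short Python sketch that perturbs the primary crop yield and recomputes the residue yield as RPR × yield. The linear RPR form (RPR = -0.14 × yield + 1.96) is the general structure quoted in the Results; the base yield of 7 t/ha and all function names are assumptions chosen for illustration, not values from the study.

```python
def linear_rpr(yield_t_ha):
    """Illustrative linear RPR function (general structure quoted in the text)."""
    return -0.14 * yield_t_ha + 1.96


def residue_yield(yield_t_ha, rpr_function):
    """Residue yield (t/ha) as RPR multiplied by primary crop yield."""
    return rpr_function(yield_t_ha) * yield_t_ha


def oat_perturbation(base_yield, rpr_function, shifts=(-0.5, -0.1, 0.1, 0.5)):
    """Relative change in residue yield for each relative shift in crop yield."""
    baseline = residue_yield(base_yield, rpr_function)
    changes = {}
    for shift in shifts:
        perturbed = residue_yield(base_yield * (1.0 + shift), rpr_function)
        changes[shift] = (perturbed - baseline) / baseline
    return changes


# Hypothetical base yield of 7 t/ha
for shift, change in oat_perturbation(7.0, linear_rpr).items():
    print(f"yield {shift:+.0%} -> residue yield {change:+.1%}")
```

With this linear form, the residue yield (-0.14 × yield² + 1.96 × yield) peaks at 7 t/ha, so at the illustrative base yield both increases and decreases in crop yield lower the estimate: by about 1% for ±10% and by about 25% for ±50%.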
In the second test, we evaluated how the actual uncertainty in primary crop yield overshadows the differences we observe in the estimated residues using the functions listed in Table 1. For this test, each of the 96 French departments was considered as an individual sample, and the mean and standard deviation (SD) of primary crop yield were calculated on a year-by-year basis for the period considered here (2000-2018). To incorporate this uncertainty in the estimated annual results, residue yields were recalculated with the original primary crop yield ±SD values for all 19 years of data, and a chart was plotted showing the resulting differences with confidence intervals based on Student's t distribution (Supplementary Material 2: SM2).
Finally, in the third test, we evaluated, through a two-tailed t-test, whether there are any significant differences in the RPRs obtained using the different estimation functions presented in Table 1. The hypotheses were as follows:
H0: There is no significant difference in the estimated RPRs of wheat using different functions.
H1: There is a significant difference in the estimated RPRs of wheat using different functions.
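A pairwise comparison of this kind can be sketched in Python as below. The text does not state which variant of the t-test was applied, so the unequal-variance (Welch) form is used here purely as an assumption; the function name and the sample RPR values are hypothetical. The snippet returns the t statistic and the approximate degrees of freedom, which would then be compared against a two-tailed critical value.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Assumption: unequal-variance two-sample test; the study may have
    used a different variant (e.g. paired or pooled-variance).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t_stat, df


# Hypothetical RPR values from two estimation functions
rpr_function_1 = [0.95, 1.00, 1.05, 0.98, 1.02]
rpr_function_2 = [1.20, 1.25, 1.18, 1.22, 1.30]
t_stat, df = welch_t(rpr_function_1, rpr_function_2)
print(t_stat, df)
```

H0 would be rejected for a given pair when the absolute value of the t statistic exceeds the two-tailed critical value for the computed degrees of freedom.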

Results and Discussion
In this study, we examined, through a national case study for France, the use of different estimation methods for quantifying CR. For each crop, the CR THP of a given spatially-explicit unit was separated into two ranges, i.e., (i) a higher range, which includes the maximum CR estimate for the given crop, and (ii) a lower range, which includes the lowest CR estimate for the given crop. The aggregated spatially-explicit results (i.e. for all crops) are shown in Figure 2 in terms of energy units, both at the French departmental (NUTS-3) and regional (NUTS-2) level. The THP of CR considering the selected sixteen crops varied from 987 PJ Y-1 to 1369 PJ Y-1. These estimates are considerable, being equivalent to about 60%-80% of the annual French electricity consumption (Eurostat, 2020). The THP, by definition, does not consider any competing use (animal feed, bedding, etc.). The competing uses of CR can be substantial; for example, Monforti et al. (2013) estimated that about 16% of the collectible CR is needed as animal bedding. Furthermore, in reality, not all of the estimated residues are collectible, and their removal from fields is not always suitable. Several studies have reported that about 40%-70% of these residues should not be collected, considering a variety of sustainability goals and premises (Einarsson and Persson, 2017; Hansen et al., 2020; Scarlat et al., 2010, 2019). Consequently, it should be kept in mind that the ranges presented in Figure 2 are higher than what can actually be used as a replacement for fossil carbon. However, mobilizing even just 20% of the potentials presented in Figure 2 could substitute about 3%-5% of the annual French electricity consumption, considering an electrical conversion efficiency of 27% (Tonini et al., 2016a).
From Figure 2, it can be observed that the CR production is mainly concentrated in the Centre-Val de Loire, Hauts-de-France, Grand Est, and Nouvelle-Aquitaine regions of France, which are also the primary cereal-producing regions. The overall THP of CR at the regional (NUTS-2) level is shown in Table 2, while THPs at the department (NUTS-3) level and crop-specific maps of the estimated THP using different functions are available in SM1. The results presented in Table 2 reveal high variability. At the national scale, this corresponds to a difference of about 39% (987-1,369 PJ Y-1). This 382 PJ Y-1 difference is equal to about 22% of the overall annual electricity consumption in France, and is more than the overall electricity consumption of Belgium, Latvia, and Estonia combined (Eurostat, 2020). At the regional level, the maximum difference was observed in the region of Grand Est with nearly 61 PJ Y-1, which itself is nearly twice the entire electricity consumption of a small country like Estonia. These considerable differences isolate the effect of the RPR function alone, as the primary crop yield considered for a given crop-department combination remains constant.
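The orders of magnitude quoted above can be retraced with a few lines of arithmetic. The Python sketch below uses only figures stated in the text (987 and 1369 PJ Y-1, a 20% mobilization share, a 27% electrical conversion efficiency, and the 22% share of French electricity consumption); the implied consumption is back-calculated from that last figure and is therefore an approximation, not a value reported by the authors.

```python
LOW_THP_PJ = 987.0    # lowest national THP estimate (PJ per year)
HIGH_THP_PJ = 1369.0  # highest national THP estimate (PJ per year)


def spread(low, high):
    """Absolute and relative difference between two estimates."""
    absolute = high - low
    return absolute, absolute / low


def electricity_output(thp_pj, mobilized_share=0.20, efficiency=0.27):
    """Electricity (PJ) obtainable from a mobilized share of the THP."""
    return thp_pj * mobilized_share * efficiency


absolute_pj, relative = spread(LOW_THP_PJ, HIGH_THP_PJ)

# Assumption: consumption back-calculated from "382 PJ is about 22%"
implied_consumption_pj = absolute_pj / 0.22

low_share = electricity_output(LOW_THP_PJ) / implied_consumption_pj
high_share = electricity_output(HIGH_THP_PJ) / implied_consumption_pj
print(f"difference: {absolute_pj:.0f} PJ ({relative:.1%})")
print(f"substitution share: {low_share:.1%} to {high_share:.1%}")
```

Under these assumptions, the substitution share falls within the 3%-5% range stated in the text.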
The THP of CR estimated in our study is in line with a recent study by Scarlat et al. (2019), where an average THP of 1067.5 PJ Y-1 was estimated for France, considering an LHV of 17.5 MJ kg-1 DM. However, their study only considered eight crops, namely wheat, rye, barley, oats, maize, rice, rapeseed, and sunflower. Our estimates (62,182-86,178 kt Y-1) are 4-44% higher than the 59,569 kt Y-1 presented in Monforti et al. (2013).
The average residue production (Mt) and the residue yield (t/ha) range of the crops selected in this study are shown in Table 3, based on the RPR function used. In terms of absolute volume, the maximum difference in the residue production was observed for wheat straw between the functions proposed by Fischer et al. (2007) and García-Condado et al. (2019), with a difference of 9.3 Mt Y-1 of wheat straw.

a Empty cells mean that a given study did not supply RPR functions for the crop under consideration.

Figure 3 (a-e) shows the spatial distribution of wheat straw estimated using different empirical functions. Wheat straw is used here as a representative example since it contributes ca. 40% of the THP-energy (385.1 PJ Y-1 to 525.8 PJ Y-1), but the details for all other CR can be found in SM1-CR (DM and Energy). Figure 3 (f) highlights the departments which are associated with two or more ranges of wheat straw potential, according to the RPR function used for the estimation. In total, 29 out of the 96 French departments have different ranges of wheat straw potential associated with them.

Figure 3: Department (NUTS-3) spatial distribution of wheat straw THP using different empirical functions (a-e), and (f) departments associated with two or more ranges of wheat straw potential.
In order to evaluate the sensitivity of the empirical functions to fluctuations in primary crop yield, an OAT perturbation analysis was performed by changing the primary crop yield value by ±10% and ±50% of the original. For three out of the five functions (Edwards et al., 2005; García-Condado et al., 2019; Scarlat et al., 2010), a proportional increasing or decreasing trend was observed, i.e., with the increase in primary crop yield, the estimated residues also increased and vice versa (Figure 4). For the function by Bentsen et al. (2014), when the primary crop yield values were changed by ±10%, the estimated results remained close to the results estimated using the original primary crop yield values. However, when the primary crop yield values were changed by ±50%, disproportionate changes were observed in the estimated straw, reflecting the very nature of the piecewise functions proposed by the authors, which limit the CR production (and indirectly possible yield increases) to a certain threshold. Similarly, yield variations generated rather erratic results when using the RPR function of Fischer et al. (2007), especially with a ±50% yield variation. Mathematically, the linear function has a general structure of RPR = -0.14 × yield + 1.96 (Table 1); hence, as the primary crop yield increases, the RPR decreases, and the estimated residues can decrease at high yields.

The chart shown in Figure 5 highlights the inter-annual variability of residues estimated using different functions, along with the 95% confidence interval shown as error bars. From the figure, it can be observed that the confidence intervals of the results obtained using the functions from Edwards et al. (2005) and García-Condado et al. (2019) mostly overlap. This might be because both functions use HI directly or indirectly to estimate the residues. In terms of inter-annual variation of estimated residues, sharp decreases were observed for the years 2001, 2003, and 2016. These decreases followed the sharp decreasing trend observed in the primary crop yield values (highlighted in the black dotted series). However, this trend is not general; for example, the primary crop yield value increased in the year 2009, but the estimated residues for that year show a decreasing trend using all the functions (SM2: Effect).

The results of the null-hypothesis test are shown as pairwise comparisons in Table 4 (SM2: T-test RPR). The results of the t-test revealed that for each pair compared, the CR estimates were significantly different at the 5% significance level (two-tailed critical value of 1.96). Thus, the null hypothesis (H0: there is no significant difference in the estimated RPR using different functions) was rejected, and the alternative hypothesis (H1) was accepted. In other words, the results obtained with each RPR function presented in Table 1 cannot be considered equivalent, meaning that the choice of function for estimating CR is not without consequences. This is further clarified in Figure 6, where it can be noticed, among other things, that no two boxes overlap with each other. Figure 6 also illustrates that results from the functions of Bentsen et al. (2014) and Fischer et al. (2007) have broader ranges, indicating a wider distribution and more scattered output results. Conversely, the short boxes for the functions of Edwards et al. (2005), García-Condado et al. (2019) and Scarlat et al. (2010) indicate that the RPR results consistently remain close to the central values. The RPR functions developed by Bentsen et al. (2014) and Scarlat et al. (2010) are also accompanied by their coefficient of determination (R2) values (Table 1), which at best reach 0.52. This implies that, at best, approximately half of the observed variation in the estimated residues can be explained by the function's variable, here the yield. This makes the estimation functions highly uncertain.
Furthermore, it is not clear whether these functions capture the entire generated residual biomass or just the portion that is harvestable (Figure 1). None of the studies considered here was explicit on this point; yet, based on how certain studies relate to Eq. 3, it is here hypothesized that most studies consider that RPR provides an estimation of the overall amount of generated residues (harvestable + non-harvestable). According to Kristensen Fløjgård (2012), the non-harvestable portion (or loss) can represent 10-15% of the overall CR in the case of cereals.
When carrying out such resource assessment studies at large geographic scales (country, continental, global), empirical or statistical functions such as those used here remain the most convenient tool for CR estimation. However, as shown in this study, the functions available at present appear to be of limited reliability, and additional experimental research to improve them would be beneficial for bioeconomy action plans.

Conclusions
A comprehensive assessment of the theoretical potential of crop residues was performed for metropolitan France, considering 16 major crops. The spatially-explicit estimation of crop residues was performed at the French departmental (NUTS-3) and regional (NUTS-2) levels. Empirical functions commonly reported in the literature were used to estimate the CR by considering a ratio (RPR), which partitions the total above-ground biomass into primary crop yield (the main cereal component of the crop) and CR. The results and uncertainties obtained with the different empirical functions were thoroughly analyzed.
The key conclusion of this study is that existing RPR functions, albeit rather unquestioned, are highly unreliable and would greatly benefit from additional experimental research. In fact, we showed, with a case study on wheat produced in France in the period 2000-2018, that no two of the assessed functions produced CR estimates that can be considered statistically equivalent.