On using flood-excess volume to assess natural flood management, exemplified for extreme 2007 and 2015 floods in Yorkshire

This paper offers a protocol for conducting a quantified assessment of the relative merits of both existing and proposed methods of Natural Flood Management (NFM). Assessment is based on the rarely used concept of flood-excess volume (FEV), which approximately quantifies the volume of water one wishes to eliminate via flood-mitigation schemes, and is exemplified using publicly available river-gauge data for recent well-known extreme-flood events in Yorkshire, UK. The following question motivates the study: what fraction of the FEV is reduced, and at what cost, by a particular (suite of) flood-mitigation measure(s)? The approach presented admits juxtaposed cost assessments, of disparate and popular NFM measures, that are neither available nor considered in existing flood-mitigation policy. With promulgation of a societally useful protocol in mind, quantification and interpretation of alternative cost scenarios are facilitated using the authors’ novel visualisation of flood-alleviation basins as partially filled, realistically constructible, two-metre-deep square lakes of side-length approximately one-to-two kilometres. In the first case study, a hypothetical flood-alleviation scheme for the River Calder at Mytholmroyd, comprising flow-attenuation features, tree planting and peat restoration, and reservoir storage, is critically assessed alongside reasonable cost estimates. The clear quantification and visual representation of the analysis indicate that the fractions of FEV reduced by the NFM measures, while not insignificant, are dwarfed by the careful draw-down of reservoirs. The second case study, of the River Don at Sheffield, extends the analysis via a range of scenarios that attempt to sample realistic seasonal rainfall distributions across a catchment, and in doing so elucidates both the potential and uncertainty of numerous mitigation schemes under different conditions.
We corroborate the growing consensus that, while NFM measures can reduce low-level flooding locally, the spatial scale at which they are effective is limited and may not upscale to the catchment level. Our FEV analysis and protocol thus not only offer a concise quantification of the effectiveness of disparate and in-tandem flood-mitigation measures but also further highlight the issue of NFM scalability.

Figure 1. Example of a woody-debris or leaky dam in the River Calder catchment from the Stroud project. Photo courtesy: Robin Gray.

Introduction
There is an increasing interest in using Natural Flood Management (NFM) to alleviate current and future risks of river and coastal flooding. A broader term is Nature Based Solutions (NBS), defined as: "Working with natural processes means taking action to manage fluvial and coastal flood and coastal erosion risk by protecting, restoring and emulating the natural regulating function of catchments, rivers, floodplains and coasts" †. Examples of NFM are run-off attenuation features such as (small-scale, woody) leaky-debris dams to enhance ponding in a catchment, see Fig. 1, peat restoration and tree planting to absorb more rainwater, the re-meandering of brooks and rivers to slow down the flow and to increase flood-plain storage, and the planting of trees and bushes along river banks to slow down rivers in flood by increasing the roughness of the river banks; for a comprehensive overview see, e.g., the report on NFM by the Environment Agency (2017a).
The idea in NFM and NBS is to slow down the flow by increasing flood depths, in places where this is deemed acceptable, in order to reduce and spread the flood peak further downstream at critical locations. Critical locations usually concern the river levels near or through villages and cities further downstream. The space-for-water (the Dutch and Flemish "Ruimte-voor-water") program of enhancing the flood-plain volume (cf. IJssel (2017)) also fits within the wider remit of NBS. It uses naturally suitable river locations to create extra floodwater storage, sometimes by the construction of fixed or moveable weirs and sometimes by widening and deepening a flood plain, for example by reopening old river meanders. The evidence for the effectiveness of NFM or NBS depends on the case considered.
Collective slow-down of flood peaks in tributaries may not necessarily lower the main flood peak since synchronisation of flood peaks can lead to adverse effects with increased flood peaks and flooding further downstream (Cabaneros et al. 2018). Tree and peat planting can lead to adverse effects when the soil becomes more compacted due to, for example, grazing by animals or gully formation. Moreover, in wet periods often seen before large flood events, the ground can become oversaturated such that the run-off of extra rainfall is nearly instantaneous, irrespective of the vegetation type. Such near-instantaneous run-off was generally believed to have occurred in the Boxing Day floods of 2015 in Yorkshire, UK.
A neglected aspect of NFM is that the proposed methods are difficult to scale up in a robust manner. For example, attenuation features such as leaky dams tend to be small individually, implying that a large number of such features is required before effects are significant for large and extreme floods on a catchment scale. The large number of such features required to effectuate flood mitigation also increases and complicates maintenance. Similarly, tree and peat planting, even when it leads to more water absorption, will require enormous areas before mitigation effects become significant for larger flood events. Besides showing beneficial effects in specific, small-scale, pilot studies of NFM (e.g., Environment Agency (2017a)), it is therefore important to obtain first estimates of the effectiveness and reliability of NFM on grander catchment scales for extreme-flood events. In contrast, flood-plain and reservoir-storage projects, such as the "space-for-water" program mentioned above, generally concern larger volumes. It is one of our aims here to provide a rational approach to obtain such first estimates, which in particular have been prominently absent or largely neglected in public and policy debates about flood protection by NFM (as evidenced in the reports from the Environment Agency (2017a), Leeds City Council (2018), Leeds Executive Board (2017) and the article of Puttock et al. (2017)).
For both NFM and NBS, a first assessment of the effectiveness of flood-mitigation strategies can be quantified in a quite straightforward manner using flood-excess volume (FEV, see Bokhove et al. (2018a)) estimates for extreme-flood events. FEV concerns the fraction of the total volume, of river discharge over a certain time duration, which caused flooding at a certain location along a river during an extreme event. It implies that, at that location, the river levels have exceeded a certain threshold river level $h_T$ during the flood event, yielding associated excess discharge rates. FEV is the flood volume one wishes to reduce to zero in flood-mitigation approaches in order to avoid flood damage. This can be done either by raising flood-defence walls along a river, thus effectively raising the threshold level $h_T$, or by holding back water or slowing down the flow by various upstream flood-mitigation measures, thus lowering the FEV either by a significant fraction per mitigation approach or entirely by the cumulative effect of several mitigation measures.
The main aim of our work is to use FEV to analyse flood-mitigation approaches based on NFM or NBS, using FEVs for the 2007 flood of the River Don in Sheffield and the 2015 Boxing Day flood of the River Calder in Mytholmroyd, both of which are rivers in Yorkshire, UK, that flow into the North Sea via the Humber estuary. The analysis presented herein will constitute an insightful and accessible approach to assessing flood mitigation, aimed at increasing public understanding and assisting policy makers in their decision making. In Bokhove et al. (2018a), we advocated a similar analysis but for the River Aire based on the FEV for the Boxing Day 2015 flood and regarding a partially hypothetical Leeds flood-alleviation scheme (FASII+), which was based on the information available to date on the actual flood-alleviation scheme proposed (Leeds City Council 2018; Leeds Executive Board 2017) and augmented by additional common-sense estimates. In contrast to the NFM measures analysed later, the (hypothetical) FASII+ consists of building higher defence walls in Leeds (relative to the current situation) and several enhanced flood-storage plains.
The outline of the paper is as follows. In §2, the concepts of FEV and available flood-storage volume are briefly reviewed to allow their use in subsequent sections. In §3, two specific FEVs are calculated for the River Calder flood of 2015 and River Don flood of 2007. Based on these FEVs, several flood-mitigation measures proposed within the River Calder and River Don catchments are extended, augmented and analysed in §4. Conclusions are drawn in §5.

Flood-excess volume
The flood-excess volume (FEV) of a river in flood is defined as the water volume causing flood damage at a certain spatial location due to river levels exceeding a relevant threshold $h_T$. This threshold is chosen in such a way that, for any river level $\bar{h}$ above $h_T$, some or major flooding occurs. The concept of FEV offers a relatively cheap, simple and comprehensible means of quantifying arguments underpinning flood-mitigation strategies, in which (see, e.g., Leeds City Council (2018); Leeds Flood Alleviation Scheme (2017)) the concept of quantification is predominantly minimal and/or heuristic.

FEV approximations
Three approximations of FEV (all in units of m$^3$) are introduced in Bokhove et al. (2018a) as follows. Given an in-situ rating curve $Q = Q(t)$ explicitly as a function of time $t$ over a flood duration $T_f$, or implicitly as a function $Q = Q(\bar{h})$ of the in-situ river level $\bar{h} = \bar{h}(t)$, the most accurate approximation to the FEV is

$$V_e \approx \sum_{k=1}^{N_m} \left( Q(\bar{h}_k) - Q_T \right) \Delta t, \qquad (2.1a)$$

in which $Q_T = Q(h_T)$, and which is the hatched area in Fig. 2 in the limit of an infinite number of river-level measurement data $\bar{h}_k$, i.e., when $\Delta t = T_f/N_m \to 0$ as $N_m \to \infty$. In that limit, the approximation of FEV becomes exact insofar as the rating curve considered is exact, which is not the case in reality due to errors in the relation $Q(\bar{h})$ in addition to measurement errors in $\bar{h}_k$, cf. Environment Agency (2016b). A cruder (mean) approximation to the FEV (2.1a) is

$$V_e \approx V_{e_1} = \left( \bar{Q}(h_T) - Q_T \right) T_f, \qquad (2.1b)$$

which uses a mean discharge $\bar{Q}(h_T)$ over the flood duration $T_f$ and is the rectangular area indicated in Fig. 2. Finally, the crudest approximation to the FEV is

$$V_e \approx V_{e_2} = T_f \, \frac{Q_{\max}}{h_{\max}} \, (h_m - h_T), \qquad (2.1c)$$

which is obtained using estimates of the discharge $Q_{\max}$ (see, e.g., Bokhove et al. (2018a)), the mean discharge estimate $\bar{Q}(h_T) \approx h_m Q_{\max}/h_{\max}$ and the threshold discharge estimate $Q_T \approx h_T Q_{\max}/h_{\max}$, which are approximations based upon, in the absence of a rating curve, only the known mean $h_m$, threshold $h_T$ and maximum $h_{\max}$ river levels.
In general, river levels $\bar{h}$ are measured and, via either a theoretical or phenomenological rating curve, an in-situ river level $\bar{h}$ is converted into an in-situ discharge rate $Q(t) = Q(\bar{h}(t))$. FEV estimate $V_{e_2}$ will be used later along with $V_e$. For further discussion on rating curves and the FEV, see Bokhove et al. (2018a) and the report by the Environment Agency (2016b).
Figure 2. Discharge $Q(t) = Q(\bar{h}(t))$, displayed on the vertical as a function of time $t$ on the horizontal, and a chosen threshold discharge $Q_T = Q(h_T)$. It involves the in-situ river level $\bar{h} = \bar{h}(t)$ as a function of time $t$ with corresponding discharge threshold $Q_T = Q(h_T)$. The FEV estimate $V_e$ in (2.1a) approximates the hatched area as the discrete sum of $N_m$ rectangular slices, cf. rules for integration. The rectangle indicated represents a mean approximation of the FEV: the product of mean minus threshold discharges times flood duration $T_f = 8.25$hr, defined as $V_{e_1} = (\bar{Q}(h_T) - Q_T) T_f$ in (2.1b).
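As a minimal sketch of how the estimates referred to as (2.1a) and (2.1c) might be evaluated in practice, the Python fragment below sums rectangular slices of excess discharge and, separately, applies the level-based estimate built from $\bar{Q}(h_T) \approx h_m Q_{\max}/h_{\max}$ and $Q_T \approx h_T Q_{\max}/h_{\max}$. The function names and the sample values are illustrative assumptions, not gauge data and not the authors' code.

```python
from typing import Sequence


def fev_sum(discharge: Sequence[float], q_threshold: float, dt: float) -> float:
    """Rectangle-rule FEV: sum (Q_k - Q_T) * dt over samples with Q_k > Q_T."""
    return sum((q - q_threshold) * dt for q in discharge if q > q_threshold)


def fev_crude(t_f: float, q_max: float, h_max: float, h_m: float, h_t: float) -> float:
    """Level-based FEV estimate: T_f * (Q_max / h_max) * (h_m - h_T)."""
    return t_f * (q_max / h_max) * (h_m - h_t)


# Synthetic 15-minute discharge samples in m^3/s (illustrative, not gauge data).
samples = [100.0, 150.0, 200.0, 150.0, 100.0]
v_sum = fev_sum(samples, q_threshold=120.0, dt=900.0)  # volume in m^3

# Synthetic levels in m and duration in s (again purely illustrative).
v_crude = fev_crude(t_f=3600.0, q_max=200.0, h_max=4.0, h_m=3.5, h_t=3.0)
print(v_sum, v_crude)
```

When the mean discharge in (2.1b) is computed from the same above-threshold samples, the rectangle and the sum of slices enclose the same volume, so the sketch omits a separate function for it.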

Available flood-storage volume
Available flood-storage volume is the extra flood-storage volume gained above the flood capacity that a flood-storage site has for a flood of a particular return period †. This available flood-storage volume changes as a function of the return period of the flood event; for floods with higher river levels and longer duration, it will be less than for floods with lower river levels and shorter duration. Leaky dams slow down the flow and increase the upstream water level. Their flood-storage capacity is time-dependent because the upstream water levels will increase rapidly during a flood event and then slowly decrease afterwards. We will ignore this time dependence of the flood volume stored in the preliminary estimates provided hereafter, thereby implicitly assuming that this time dependence is sufficiently small within the flood duration $T_f$. The concept of available flood volume is related to the concept of available potential energy (APE), regularly used in atmospheric science and oceanography (e.g., see Lorenz (1955); Shepherd (1993)); in short, APE is the potential and internal energy available for conversion into kinetic energy. We explicitly introduced available flood-storage volume in Bokhove et al. (2018a) as an essential notion in analysing flood-mitigation approaches.

FEV for Yorkshire rivers in extreme flood
To determine FEVs, river-level data from the River Calder at Mytholmroyd for the Boxing Day Flood in 2015 and from the River Don at Sheffield Hadfields for the June 2007 floods are analysed. On 26th and 27th December 2015, an extreme-flood event was recorded for both the River Aire and River Calder in Yorkshire, UK (the so-called "Boxing Day Flood"), which was so severe as to merit high-profile coverage in diverse national media (see, e.g., The Guardian (2015)). The extreme rainfall that caused the 2015 Boxing Day Flood fell north of Sheffield, which therefore did not experience the aforementioned extreme flooding of the Aire and Calder valleys. However, the River Don did experience record-high levels on 25th and 26th June 2007, as evidenced by measurements at the Sheffield Hadfields gauge station: the resultant flooding in Sheffield caused widespread damage that led to circa 1000 people being evacuated and, tragically, the deaths of three people (The Guardian 2007). The flooding event in Sheffield had a 1:200-year return period and the Boxing Day 2015 events had approximately a 1:200+ year return period for the River Aire and a 1:100+ year return period for the River Calder (Environment Agency 2016a). Both events were extreme and fell outside the range of the data records. Their return period could nonetheless be estimated using extreme-value theory (Coles 2001). Extreme events falling in the tail of a probability distribution of flood peaks versus return period can be estimated using, for example, Generalised Pareto distributions or by combining Gamma and Generalised Pareto distributions for the entire distribution (Wong et al. 2014). For both the Mytholmroyd and Sheffield Hadfields sites, station descriptions that include the rating curve used are available in reports by the Environment Agency (2016b).

River Calder Boxing Day 2015 flood
The aforementioned relative simplicity of the concept of FEV is first exemplified for the River Calder Boxing Day 2015 flood. We first consider the River Calder gauge-level data at Mytholmroyd. The report of the Environment Agency (2016a) gives a peak discharge of $Q_{\max} = 276$m$^3$/s with peak height $h_{\max} = 5.65$m. From Gaugemap at Mytholmroyd, see www.gaugemap.co.uk/#!Detail/1970, the threshold level above which flooding is possible lies at about 2.85m, while the threshold for property flooding to be possible lies at around 4.0m. We therefore take $h_T = 4.0$m and a simple arithmetic mean to estimate $h_m = (h_{\max} + h_T)/2 = 4.825$m. Using the slider at Gaugemap, for example, the flood duration is found to be $T_f \approx 8.25$hrs, from 07:15 till 16:30 on 26-12-2015. Hence, from (2.1c) one finds the FEV estimate $V_{e_2}$ of (3.2).

Table 1. The coefficients $C_j$, $a_j$, $b_j$ as well as the limb thresholds $h_0 = 0$ and $h_j$ for the rating curve $Q = C_j (h - a_j)^{b_j}$ with $N = 3$ and $j = 1, \ldots, N$, with stages or limbs $h_{j-1} < \bar{h} < h_j$ for $1 \le j < N$, at the river-level gauge station for the River Calder in Mytholmroyd, see Bokhove et al. (2018a) and Environment Agency (2016b).
and measurements of the (surface) flood velocity and the river's cross-section. Moreover, even when a rating curve is available, such estimates form a quick way to obtain FEV estimates without having to analyse the data in detail. Alternatively, and more precisely, the 15min river-level data and rating-curve data from the Environment Agency, for which coefficients are given in Table 1, are analysed. The river level and discharge data are given in Fig. 3, which shows both an overview and exploded views around the Boxing Day peak, with an indicated threshold level chosen to be $h_T = 4.5$m.
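To illustrate how a multi-limb rating curve of the form $Q = C_j (h - a_j)^{b_j}$ converts a river level into a discharge, a minimal Python sketch follows. The coefficients below are hypothetical placeholders chosen only for illustration; they are not the Table 1 values, which should be taken from Environment Agency (2016b). Extrapolating the top limb above its threshold is likewise an assumption of this sketch.

```python
from typing import List, Tuple

Limb = Tuple[float, float, float, float]  # (upper level h_j, C_j, a_j, b_j)

# Hypothetical three-limb coefficients (NOT the Table 1 values).
HYPOTHETICAL_LIMBS: List[Limb] = [
    (1.0, 10.0, 0.0, 2.0),
    (3.0, 12.0, 0.1, 1.8),
    (6.0, 15.0, 0.2, 1.6),
]


def rating_curve(h: float, limbs: List[Limb]) -> float:
    """Return Q = C_j * (h - a_j) ** b_j for the limb containing level h."""
    for h_upper, c, a, b in limbs:
        if h <= h_upper:
            return c * max(h - a, 0.0) ** b
    # Above the top threshold: extrapolate with the last limb (an assumption).
    _, c, a, b = limbs[-1]
    return c * max(h - a, 0.0) ** b


# Convert a short series of levels (m) into discharges (m^3/s).
levels = [0.5, 2.1, 7.0]
discharges = [rating_curve(h, HYPOTHETICAL_LIMBS) for h in levels]
print(discharges)
```

Discharges obtained this way from 15min level data can then be summed above the threshold discharge $Q_T = Q(h_T)$ to give an FEV.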
The dependence of the FEV calculation on the threshold level is given in Fig. 4. For a range of thresholds $h_T \in [4.0, 5.65]$m the FEVs lie in the range $V_e \in [0, 2.7]$Mm$^3$. Equivalently, we can express these FEVs as the capacities of a 2m-deep square lake with side lengths in the range $[0, 1200]$m or circa $[0, 0.75]$mi. For a threshold of $h_T = 4.5$m, the calculated excess volume is

$$V_e(h_T = 4.5\,\mathrm{m}) \approx (1.65 \pm 0.22)\,\mathrm{Mm}^3, \qquad (3.3)$$

or the capacity of a square lake of depth 2m and side length 908m or 0.55mi. Hence, the estimate $V_{e_2} = 0.784$Mm$^3$ using (3.2) is approximately 48% of the more precisely calculated FEV (3.3), the less-accurate estimate yielding an equivalent capacity of a 2m-deep square lake of side length 626m or 0.37mi. For the first stage of the rating curve, error bars of 84.9% are reported, for the second stage 13.6%, and for the last stage and beyond error bars are not available (Environment Agency 2016b). The error estimate in (3.3) uses the 13.6% of the second stage as a placeholder. In the next section, we consider another exemplification of the concept of FEV, for the River Don 2007 flood, and show how FEVs can be used to contrast the two different flood scenarios in a straightforward manner.

River Don 2007 flood
Excessive flooding of the River Don occurred in June 2007 in Sheffield. To estimate and calculate the corresponding FEV, data from the gauge station Sheffield Hadfields are considered next. Gaugemap indicates that flooding is possible above 2.63m. In contrast to the River Aire (at Armley) and River Calder cases, less in-situ information is known to the authors. Given that the Environment Agency threshold level is quite low, as in the River Aire case for the Boxing Day 2015 floods (see Bokhove et al. (2018a)), the threshold level is increased to $h_T = 2.9$m heuristically to obtain an initial estimate of the FEV; the corresponding time duration is $T_f = 13.75$hrs with a concomitant $Q_{\max} = 259$m$^3$/s. This follows from the data provided by the Environment Agency because the (online) Gaugemap data do not go back far enough in time. The resulting excess-volume estimate becomes

$$V_e \approx V_{e_2} \approx 2.434\,\mathrm{Mm}^3. \qquad (3.4)$$

Figure 3. (a) Flow rate and river-level data of the River Calder at Mytholmroyd, from May 2015 till the end of March 2016, as well as (b) the rating curve (and its linear approximation) and exploded views of river level and discharge data around Boxing Day. The horizontal dashed lines indicate a chosen threshold level and the corresponding discharge at that level, obtained via the rating curve, as well as the maximum river level.

Table 2. The coefficients $C_j$, $a_j$, $b_j$ as well as the limb thresholds $h_0 = 0$ and $h_j$ for $j = 1, 2, 3, 4$ for the rating curve at the river-level gauge station for the River Don at Sheffield Hadfields (Environment Agency 2016b).
A more detailed analysis of the River Don 2007 flood results in Figs. 5 and 6 with the coefficients given in Table 2. The calculated excess volume for the threshold of $h_T = 2.9$m yields

$$V_e(h_T = 2.9\,\mathrm{m}) \approx (3.00 \pm 0.24)\,\mathrm{Mm}^3, \qquad (3.5)$$

or the capacity of a 2m-deep square lake of side length 1225m, or 0.74mi. By comparison with the sloping dashed line in the central subfigure in Fig. 5, the rating curve in Fig. 5 is seen to be pseudo-linear for moderate to high depths so the quick estimate (3.4) of the FEV (3.5) is roughly 2.434/3.00 ≈ 81% accurate. In contrast, the rating curves for the gauge stations at Armley (for the River Aire see Bokhove et al. (2018a)) and Mytholmroyd (see Fig. 3) are more nonlinear, which explains the larger discrepancies, approximately 65% and 50%, between the estimated and calculated flood-excess volumes for the chosen indicative river-level thresholds. Despite exhibiting quasi-linearity above $h_3 = 1.436$m, large error bars (circa 8%) at low depths (and even up to 20% below $h_2 = 0.931$m) are reported to be a common issue at the Sheffield Hadfields gauge (Environment Agency 2016b), while no standard error has been calculated for the highest stage ($h_4$). In the

Summary
The FEV for the River Aire at Armley in Leeds, UK, was analysed in Bokhove et al. (2018a). By analysing the flood data for three different Yorkshire rivers for two different flooding events, the following FEVs $V_e \approx (9.34 \pm 0.51,\ 1.65 \pm 0.22,\ 3.00 \pm 0.24)$Mm$^3$ were found for chosen threshold levels of $h_T = (3.9, 4.5, 2.9)$m, respectively for the River Aire and River Calder in the Boxing Day flood of 2015 and the River Don for the 2007 flood in Sheffield. These volumes can also be expressed as the capacities of 2m-deep square lakes with side lengths of (2161, 908, 1225)m respectively. Even between these rivers there is nearly an order of magnitude difference in the FEVs, though we stress that the FEV is a function of the chosen threshold $h_T$ (cf. panels of Figs. 4 and 6).
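The square-lake visualisation amounts to taking the side length as $\sqrt{V_e/d}$ for a lake of depth $d = 2$m. A short Python check of the three side lengths quoted above, using only the FEVs stated in the text (the function name is an illustrative choice):

```python
import math


def lake_side(volume_m3: float, depth_m: float = 2.0) -> float:
    """Side length (m) of a square lake of given depth holding volume_m3."""
    return math.sqrt(volume_m3 / depth_m)


# FEVs in m^3 as quoted in the text for the chosen thresholds.
fevs = {"Aire": 9.34e6, "Calder": 1.65e6, "Don": 3.00e6}
sides = {river: round(lake_side(v)) for river, v in fevs.items()}
print(sides)
```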
Flood data of the Houston flood in 2017 were not available (to us) but if we assume that 20% of the total flood volume of 2mi$^3$ mentioned in Business Insider (2017) was the FEV, then this excess volume was about $V_e = 7319$Mm$^3$, which is equivalent to the capacity of a 2m-deep square lake with side lengths of 36.4 miles, i.e. two to three orders of magnitude larger than the lake-side lengths and FEVs calculated for the Yorkshire rivers. The area of $36.4^2$mi$^2$ = 1325mi$^2$ of this flood-excess lake appears reasonable when we make a visual estimate of the flooded areas based on (online) satellite images of the Houston floods.
Such a perspective and visualisation of flood-excess volumes as square lakes with realistic lake depths is useful when considering and analysing flood-mitigation strategies. Given the size of the square lake and the width and length of the river valley, as well as the size of the catchment, one can start to consider whether multiple NFM techniques and NBS features such as flood-storage reservoirs or flood-plain enhancements are feasible. The River Calder valley is not only distinctly narrower than the River Aire valley but also quite urbanised, which readily follows by direct inspection, (online) photographs and from browsing maps with contour lines. It is therefore more difficult to find substantial flood-storage sites in the Calder valley than in the Aire valley by enhancing the flood-storage capacity of existing flood plains, for example via controllable weirs. While flood-alleviation schemes with (substantial) flood-storage sites within the valley have been explored for the Aire valley (Bokhove et al. 2018a; Leeds Executive Board 2017), given the extensive green zones in the Aire valley with valley widths ranging from circa 200 to 600m, this approach has limited scope in the Calder valley. Accordingly, it has not been explored successfully, thereby motivating the flood-mitigation assessment addressed in the next section.

Flood-mitigation assessment using FEV
Several NFM and NBS approaches will now be reviewed with regard to their capacity for large-scale flood mitigation, and these will be exemplified for the River Calder and River Don. There have been a plethora of NFM measures reported (in newspapers, online, by flood-action-group websites and elsewhere), often with either very vague or excessive claims of their effectiveness towards flood mitigation. NFM is currently popular in the UK (Environment Agency 2017a; Puttock et al. 2017; BBC 2017) yet this popularity has not been substantiated with proof or bona fide estimates that NFM can actually be beneficial on larger catchment scales for more extreme flood events, a crucial requirement for integrated catchment-flood management. Our analysis using FEV aims to put these claims into context and will reveal some of the pitfalls and merits of different NFM flood-mitigation strategies.

River Calder Boxing Day 2015 flood
For the River Calder, we will consider three different types of NFM separately, using its FEV of $V_e(h_T = 4.5\,\mathrm{m}) = 1.65$Mm$^3$ given in (3.3), for the 1:100-year return period Boxing Day flood, before putting their contributions into a joint perspective.

NFM flow-attenuation features
NFM using flow-attenuation features such as leaky dams has received a lot of attention since the successful collaborative project between the Environment Agency and citizens of Pickering (Pickering 2012). What is less well known is that 10% of that "NFM" scheme consists of woody, leaky-debris dams, while 90% of the enhanced storage is created behind a large leaky-cement dam (Pickering 2012). We therefore introduce the small-scale pilot project of the citizen action group Slow-the-Flow-Calderdale, cf. Fig. 1, the upscaling of which to the River Calder catchment scale we consider later. The pilot consists of creating and maintaining run-off-attenuation features and restoring old mill ponds to slow down the flow so as to increase water-storage capacity (Slow the Flow Calderdale 2017). An estimate was made of the attenuation volumes obtained by these interventions, which included circa 120 plate weirs, leaky (small- and large-wood-debris) dams and strategically placed logs as well as restoring plantation on ancient woodland sites. Using two approaches to be able to make estimates, an available flood-storage volume of circa 7000m$^3$ was foreseen at a project cost of circa £50000 to £72000 (see Slow the Flow Calderdale (2017)). Here we have assumed that the aforementioned cumulative volume of 7000m$^3$ concerns the available flood-storage volume since this distinction was not made in the available information (cf. Slow the Flow Calderdale (2017) and personal communication with the flood-action group). Long-term maintenance is not included within the cited costs. In addition, the building and maintenance in the field will provide and disseminate valuable insights into flood management and biodiversity within the local area. However, the contribution of these NFM measures in the context of preventing or mitigating an extreme flood is minute when compared with the total FEV of the Boxing Day 2015 flood of $V_e(h_T = 4.5\,\mathrm{m}) = 1.65$Mm$^3$.
Under the assumption that the extreme rainfall is uniform and that the attenuation features function optimally, the cumulative storage would lead to at best a $7000/(1.65 \times 10^6) \approx 0.0042$ or 0.42% reduction of the required volume.
Furthermore, it remains unclear whether these attenuation features can attain their full capacity in very wet periods, in which most of the required extra capacity may already have been used up by the increased level of sustained rainfall. The justification of the above assumptions requires more field tests and monitoring combined with mathematical and fluid-dynamical modelling, including the optimisation of the placement of these leaky dams; for example, of the kind undertaken in Cabaneros et al. (2018). Nonetheless, this estimate of the effectiveness of these run-off attenuation features shows that upscaling (to cover a larger fraction of the FEV) to a ten- or hundredfold increase in flood storage, even for floods a tenth of the size of the Boxing Day 2015 floods, and then involving circa 1200 or 12000 attenuation features, is required to bring the flood-storage contribution up to more reasonable flood-mitigation levels of 4.2% or 42%. The costs increase correspondingly to £[0.5, 0.7]M or £[5, 7]M: this calibration/comparison with real data seems hitherto to have been neither recognised nor conducted.
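The upscaling arithmetic above can be reproduced with a few lines of Python. The sketch assumes, as the text implicitly does, that storage, feature count and cost all scale linearly with the upscaling factor; the function and variable names are illustrative.

```python
FEV_M3 = 1.65e6                         # Calder FEV at h_T = 4.5 m
PILOT_STORAGE_M3 = 7000.0               # pilot available flood-storage volume
PILOT_FEATURES = 120                    # circa 120 attenuation features
PILOT_COST_GBP = (50_000.0, 72_000.0)   # quoted project cost range


def upscale(factor: int) -> dict:
    """Linearly scale the pilot scheme and report its share of the FEV."""
    low, high = PILOT_COST_GBP
    return {
        "features": PILOT_FEATURES * factor,
        "fev_fraction": PILOT_STORAGE_M3 * factor / FEV_M3,
        "cost_gbp": (low * factor, high * factor),
    }


for factor in (1, 10, 100):
    scheme = upscale(factor)
    print(factor, scheme["features"], f"{100 * scheme['fev_fraction']:.2f}%", scheme["cost_gbp"])
```

Linear scaling is itself optimistic, since it ignores site availability and any interaction between features in series.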

Floodwater storage in reservoirs
Both Yorkshire Water and the Environment Agency have started to explore a flood-storage project in which levels in drinking-water reservoirs in the upper catchment of the River Calder will be lowered before extreme-rainfall events in order to provide floodwater storage †. These reservoirs lie mostly upstream of Mytholmroyd, which makes our analysis of the Mytholmroyd flood hydrograph relevant. Both static draw-down and dynamic control of the draw-down in anticipation of extreme rainfall are under investigation at the Environment Agency. The static reservoir storage is achieved by drawing down six reservoirs by 10% in volume. The total volume of floodwater storage in these reservoirs is estimated to be $V_r = 0.88$Mm$^3$ which, when compared with the target volume of $V_e = 1.65$Mm$^3$, would be a significant contribution of circa 53.3% of the Boxing Day 2015 excess volume for the River Calder. This estimate of its contribution is again based on the assumption that the extreme rainfall is spatially uniform, such that the capacity of all reservoirs can be reached; this was roughly the case for the Boxing Day 2015 flood. Conversely, spatial non-uniformity in the rainfall will change the storage capacity, generally lowering it; but it can also increase the capacity when the rainfall happens to be very localised near the reservoirs. Despite these reservations, it is clear that flood storage in reservoirs does constitute a voluminous flood-mitigation approach.

NFM tree planting and peat restoration
Yorkshire Water recently advertised a number of flood-mitigation measures, including tree planting and peat restoration, as well as the above-mentioned flow-attenuation features and the use of reservoirs for floodwater storage (Yorkshire Water 2017). Yorkshire Water mentions 43ha of blanket-bog restoration and 60ha of "environmental improvements such as leaky dams, fascines and wetlands to slow the flow of the water", thereby totalling an area of $A_b = 103$ha. To make a reasonable first estimate of what is attainable by these proposed NFM measures of tree planting and peat restoration on an area of $A_b = 103$ha $= 1.03$Mm$^2$, consider first the total catchment area of the River Calder, which is $A_t = 369$mi$^2$ $= 1017$Mm$^2$, i.e., roughly $(20\,\mathrm{mi})^2$ (see the Wikipedia page on the "River Calder"). Assuming in a favourable way that all water is held back on this area of 103ha and that the rainfall is again uniform over the catchment, the contribution of these NFM measures can at best be about $A_b/A_t = 1.03/1017 \approx 0.1\%$ of the total water volume. These assumptions are of course excessive, but perfect storage yields an upper bound on the estimated flood mitigation. It is not straightforward to obtain an estimate of the total volume because it is somewhat ambiguous to define flood duration for a total volume associated with a flood. Perhaps a reasonable estimate follows by calculating the excess volume for a low (non-flooding) threshold of $h_T = 1.5$m, thus enforcing that a larger time interval is taken into account. Via (2.1a), that choice yields an "excess" volume of $V_{et}(h_T = 1.5\,\mathrm{m}) = 12.879$Mm$^3$. Hence, by taking the above fraction $A_b/A_t \approx 0.1\%$ thereof, we obtain an estimate of the available flood-storage volume of circa $V_t = 0.001 V_{et} = 12879$m$^3$, which is about $V_t/V_e = 12879/(1.65 \times 10^6) \approx 0.8\%$ of the FEV.
Given that Mytholmroyd lies further upstream in the River Calder catchment, the relevant area $A_t$ considered should be smaller and concern only the fraction of the total catchment area draining water into Mytholmroyd. On the one hand this would lead to a larger percentage than the 0.8% estimate above, but on the other hand that increase is likely to be offset by the absorption being less than the assumed 100%. Again, the contribution of this type of NFM is both small (unless serious upscaling is undertaken) and difficult to quantify with any degree of certainty. In summary, the reservoir flood-water storage strategy remains by far the most favourable flood-mitigation strategy, with its estimated contribution of 53%.
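The three Calder contributions can be set side by side with a short Python sketch that uses only the volumes and areas quoted in the text. It keeps the unrounded area fraction $A_b/A_t$, so the tree-planting and peat volume comes out marginally above the 12879m$^3$ obtained with the rounded factor 0.001; the variable names are illustrative.

```python
A_B_M2 = 1.03e6      # restored / improved area A_b (103 ha)
A_T_M2 = 1017e6      # total catchment area A_t as quoted in the text
V_ET_M3 = 12.879e6   # "excess" volume for the low threshold h_T = 1.5 m
V_E_M3 = 1.65e6      # FEV for h_T = 4.5 m
V_R_M3 = 0.88e6      # reservoir draw-down storage V_r
V_PILOT_M3 = 7000.0  # pilot attenuation-feature storage

area_fraction = A_B_M2 / A_T_M2        # upper bound: all water held on A_b
v_t = area_fraction * V_ET_M3          # tree-planting and peat storage estimate

shares = {
    "attenuation pilot": V_PILOT_M3 / V_E_M3,
    "tree planting and peat": v_t / V_E_M3,
    "reservoir draw-down": V_R_M3 / V_E_M3,
}
for measure, share in shares.items():
    print(f"{measure}: {100 * share:.2f}% of FEV")
```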

Cost-benefit analysis for hypothetical flood-alleviation scheme
A cost-benefit analysis for a hypothetical River Calder flood-alleviation scheme is now presented, resulting in a novel graphical illustration of potential use to policy makers. The hypothetical scheme consists of partially upscaled versions of the NFM measures discussed separately above.
We consider flood mitigation by flood-attenuation features resulting in an available flood-storage volume of $V_a = 140000\,\mathrm{m}^3$ at a base cost of £1.44M for circa 2400 attenuation features, spread out relatively evenly over the upper River Calder catchment. This is a deliberately chosen twenty-fold upscaling of the Slow-the-Flow-Calderdale case discussed above and offers a $V_a/V_e = 140000/(1.65 \times 10^6) = 0.0848 = 8.48\%$ reduction of the FEV. Leaky-woody-debris dams degrade over time and can therefore break down: we assume that the features have an average life span of 25 years so, over 50 years, these need to be either constructed or repaired twice, leading to a doubling of the base costs to £2.88M excluding maintenance. These 2400 features should not all be replaced at once but via a smart staggered-replacement scheme, in order to reduce the risk of dams in series failing at the same time, which can lead to devastating flood-wave damage, cf. Cabaneros et al. (2018). Designing such a maintenance scheme is an interesting mathematical optimisation problem in itself, which we will not consider here. Over 50 years the 2400 features are replaced on average every 25 years, so 4800/50 = 96 features need to be replaced per year. We employ one person at an estimated £50k/yr (including benefits) to carry out continued maintenance, resulting in £2.5M employment costs over 50 years. This yields a total cost of £(2.88 + 2.5)M = £5.38M over 50 years, which is £0.634M per 1% of flood protection. Since it is clear neither whether the full available flood-storage volume is reached nor to what extent this capacity is reached under rainfall with a varying spatial distribution, we introduce an ad hoc sliding scale of coverage between 50% and 100% of the capacity estimated above.
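The cost estimate for the flow-attenuation features can be written out as a short calculation. The sketch below is a minimal illustration using only the figures quoted above; the function name and its parameters are our own choices, and a different life span or salary can be substituted to test the sensitivity of the result.

```python
def attenuation_costs(base_cost_m=1.44, n_features=2400, lifespan_yr=25,
                      horizon_yr=50, salary_m_per_yr=0.05,
                      storage_m3=140_000, fev_m3=1.65e6):
    """Return 50-year costs (in GBP millions) and FEV coverage for the features."""
    builds = horizon_yr / lifespan_yr             # features built or repaired twice
    construction_m = base_cost_m * builds         # base cost doubled
    employment_m = salary_m_per_yr * horizon_yr   # one maintenance employee
    total_m = construction_m + employment_m
    fev_percent = 100 * storage_m3 / fev_m3       # share of the FEV at full capacity
    return {
        "replacements_per_year": n_features * builds / horizon_yr,
        "total_cost_m": total_m,
        # ad hoc sliding scale: between 50 % and 100 % of capacity is reached
        "fev_percent_range": (0.5 * fev_percent, fev_percent),
        "cost_per_percent_m": total_m / fev_percent,
    }


summary = attenuation_costs()
for key, value in summary.items():
    print(key, value)
```

Note that the cost per 1% of flood protection is computed here at full capacity; at the lower end of the sliding scale the same cost buys half the protection.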
In phase II of the Leeds flood-alleviation scheme (Leeds City Council 2018), it is proposed to increase the tree coverage in the River Aire valley by 8% to a total of 15%, which is above the national average of 12%. We therefore assume another type of NFM by increasing the area of tree and peat coverage to 5% instead of the 1% estimated in the example given above. Again using a sliding-scale approach, this yields an increase of flood-storage volume by $[0.04125, 0.0825]\,\mathrm{Mm}^3$, so [2.5, 5]% of the FEV, at an estimated cost of £5M including maintenance costs over 50 years; that gives a window of £[1, 2]M per 1% of flood mitigation offered.
Let us assume that the above NFM and NBS measures are distributed relatively evenly across the part of the catchment influencing the river flow in Mytholmroyd. Without further information on the spatial and temporal distribution of rainfall during extreme rainfall and flood events, the only option we have is to assume that the flood mitigation offered varies linearly between the most adverse case, with a minimum of 33.41% flood-mitigation coverage offered by the combined measures, and a maximum of 66.81%, with a mean of 50.11% at a cost of £40.38M, i.e., £40.38M/50.11 = £0.8058M per 1% of flood mitigation offered. The above is visualised in Fig. 7. The remaining (i.e. unmitigated) part of the excess volume needs to be covered by other mitigation measures, such as building higher flood-defence walls than are currently in place or further increasing the flood-storage volume in the reservoirs by using dynamic control, cf. Breckpot (2013) and Breckpot et al. (2013). Alternatively, more attenuation features can be built, or more trees planted and more peat restored in combination with controlled ponding. Furthermore, additional benefits of tree planting and peat coverage, such as carbon sequestration and recreational value, can be taken into account in a cost assessment. These can all be measures that are good for society as well as important for flood mitigation, yet our main point is that all these measures and decisions should be quantified more clearly than is presently conventional in the public domain. We conclude that our novel graphical presentation aids in making such decisions in a more quantifiable, visual and, most importantly, rational manner.
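The combined figures can be cross-checked as follows. This Python sketch is a rough illustration: the NFM and tree ranges and costs are those quoted above, whereas the reservoir range and cost are not quoted in this section and are inferred here by subtraction from the combined totals, which is our assumption.

```python
# (minimum %, maximum %) of the FEV and 50-year cost in GBP millions, as quoted.
QUOTED = {
    "NFM":   {"fev_percent": (4.24, 8.48), "cost_m": 5.38},
    "trees": {"fev_percent": (2.5, 5.0),   "cost_m": 5.0},
}
TOTAL_PERCENT = (33.41, 66.81)   # combined minimum and maximum coverage
TOTAL_COST_M = 40.38             # combined cost over 50 years


def reservoir_share():
    """Reservoir range and cost inferred by subtraction (an assumption)."""
    low = TOTAL_PERCENT[0] - sum(m["fev_percent"][0] for m in QUOTED.values())
    high = TOTAL_PERCENT[1] - sum(m["fev_percent"][1] for m in QUOTED.values())
    cost = TOTAL_COST_M - sum(m["cost_m"] for m in QUOTED.values())
    return low, high, cost


def combined_summary():
    """Mean coverage on the linear sliding scale and cost per 1 % of mitigation."""
    mean_percent = sum(TOTAL_PERCENT) / 2
    return mean_percent, TOTAL_COST_M / mean_percent


low, high, cost = reservoir_share()
mean_percent, cost_per_percent = combined_summary()
print(f"reservoirs: {low:.2f}% to {high:.2f}% of the FEV, GBP {cost:.2f}M")
print(f"combined mean: {mean_percent:.2f}%, GBP {cost_per_percent:.4f}M per 1%")
```

The inferred reservoir maximum is consistent with the contribution of about 53% quoted earlier for the reservoir-storage strategy.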

River Don 2007 flood
Sheffield City Council (Sheffield CC) aims to increase the flood protection against events such as the 2007 flood, which had a 1:200-year return period. This protection consists of a hybrid set of flood-mitigation measures, most of which have been completed. To offer further protection against enhanced flooding due to climate change, Sheffield CC is exploring NFM and NBS, in the form of 1521 attenuation features such as leaky dams as well as floodwater storage by drawing down the water levels of drinking-water reservoirs in the upper parts of the River Don catchment. It is currently not clear what extra flood protection is required to mitigate against climate-change effects. Several studies, including Hodgkins et al. (2017), indicate that, for flood events with a return period larger than 1:100 years, there is no statistical evidence for increased flood intensity and volume due to climate change, while there is evidence that it increases for floods with return periods of less than 1:100 years. These findings may seem to contrast with the climate predictions in Sanderson (2010); however, the latter predictions contain considerable uncertainties for the largest return period considered, i.e., the 1:100-year one. That is, since flood protection against the 1:200-year return-period flood is already in place, no further flood mitigation against climate change may be required as a consequence of this information. Nonetheless, based on the FEV of the 2007 flood event, we analyse the extra flood-mitigation capacity offered by these proposed NFM and NBS measures, but now for an idealised yet representative set of rainfall scenarios. This leads to the following more complex, yet more realistic, analysis.
† We used a cost estimate here that yields a relative cost similar to that of the flow-attenuation features.
Figure 7. A graphical overview of the fraction of the FEV captured by the three NFM and NBS flood-storage measures, as well as the associated costs, for the River Calder. The overall flood mitigation by "reservoirs", "NFM" and "trees" ranges from 33.41% to 66.81% at a cost of £40.38M, and the mean of each flood-mitigation measure is represented by the corresponding area of the quadrilateral strips, indicated with the words "reservoirs", "NFM" and "trees", as part of the overall area of the square lake with the same capacity as the FEV that requires mitigation. The sloping lines reflect the sliding scale between the quoted ranges.

NFM via circa 1500 leaky dams
Sheffield CC has performed a study of flood mitigation via NFM by using over 1500 flow-attenuation features (Sheffield City Council 2017). It reveals that the available flood-storage volume $V_d$ in 1521 flow-attenuation features such as leaky dams is
$$V_d = 0.567\,\mathrm{Mm}^3. \tag{4.1}$$
Available flood-storage volume is denoted as "volume of 'new' NFM storage" in the report from Sheffield City Council (2017); here "'new'" is presumably relative to a basic storage volume for a 1:200-year return-period flood, even though that is not explicitly stated. Hence, assuming that this volume can be attained during such high-risk and high-volume flood events, including floods such as the 2007 one, it captures $V_d/V_e(h_T = 2.9\,\mathrm{m}) = 0.567/3.00 \approx 18.9\%$ of the FEV given in (3.5). For higher threshold values of $h_T$, this fraction is seen to be much higher, cf. Fig. 6. These estimates are of course upper bounds because the full volume of $V_d = 0.567\,\mathrm{Mm}^3$ may not be attainable, either because part of this volume is already filled prior to the flood or because the rainfall is not uniform across the area covered by the flow-attenuation features. In addition, maintenance of the leaky dams and cascade failure need to be addressed, as in Cabaneros et al. (2018). Despite these caveats, this upscaling of NFM features to 1521 dams shows that the cumulative effect on flood mitigation could be substantial. It warrants a pilot field study with hundreds of leaky dams, in order to monitor not only their efficiency and durability but also the validity of accompanying modelling approaches.

Flood storage in reservoirs
To date, the Environment Agency and Sheffield CC are exploring the potential of $V_r = 2.8\,\mathrm{Mm}^3$ of floodwater storage by drawing down various reservoirs. Summarised simply, contributions to flows in the Lower Don Valley (from downstream of the River Sheaf) are split approximately into thirds: one from the catchment area with the reservoirs; one from the Sheaf area, which has no reservoirs; and one from the rest of the Upper Don catchment, excluding the reservoir catchment †. The Sheffield Hadfields gauge concerns the Lower Don Valley: hence, under both uniform rainfall and equal run-off times, the part of the catchment with reservoirs contributes only $\tfrac{1}{3}$ of the FEV of $V_e(h_T = 2.9\,\mathrm{m}) = 3.00\,\mathrm{Mm}^3$ given in (3.5), i.e., $V_e/3 = 1.00\,\mathrm{Mm}^3$, such that only part of the potential storage $V_r$ can then be used. Reservoir storage under uniform rainfall is therefore in principle larger than the $V_d = 0.567\,\mathrm{Mm}^3$ offered by the 1521 leaky dams. Whether it is possible to harness this flood-storage volume $V_r$ in advance of an extreme-rainfall event depends on a series of factors, including the spatial rainfall distribution and the ability to draw down the reservoirs safely far enough in advance of rainfall predictions, while also balancing the need to keep these reservoirs sufficiently filled to maintain the drinking-water supply. This requires further detailed modelling and pilot studies, cf. Breckpot (2013) and Vermuyten et al. (2018).
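The reasoning above amounts to taking, per area, the smaller of the available storage and the share of the FEV arriving in that area. A minimal Python sketch of this rule, under the stated assumptions of uniform rainfall and equal run-off times (the function and variable names are ours), reads:

```python
def captured_volume(storage_mm3, fev_mm3, rainfall_fraction):
    """Volume (Mm^3) held back in one area: limited both by the storage
    available there and by the share of the FEV falling in that area."""
    return min(storage_mm3, rainfall_fraction * fev_mm3)


V_E = 3.00    # FEV for h_T = 2.9 m, in Mm^3
V_R = 2.8     # potential reservoir storage, in Mm^3
V_D = 0.567   # storage behind the 1521 leaky dams, in Mm^3

# Uniform rainfall: each of the three areas receives a third of the FEV.
reservoirs = captured_volume(V_R, V_E, 1 / 3)
leaky_dams = captured_volume(V_D, V_E, 1 / 3)
print(f"reservoirs: {reservoirs:.3f} Mm^3, leaky dams: {leaky_dams:.3f} Mm^3")
```

Under uniform rainfall the reservoirs are limited by the inflow to their area, whereas the leaky dams are limited by their own capacity.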

Flood-mitigation analysis
To illustrate a more complex flood-mitigation analysis, we hypothesise several idealised precipitation scenarios, divided seasonally over the autumn and winter (hereafter abbreviated as "winter") as well as spring and summer (hereafter abbreviated as "summer") seasons, and distributed over the three catchment areas. Each precipitation scenario is assumed to lead to a resulting FEV of $V_e = 3.00\,\mathrm{Mm}^3$ further downstream on the River Don at Sheffield Hadfields, and we will state how the fractions of the FEV are distributed across these three roughly equal-sized areas. The three catchment areas introduced above will be denoted by a "reservoir" area with a cumulative flood-storage volume of $V_r = 2.8\,\mathrm{Mm}^3$, a "Sheaf" area with no additional flood-storage volume and an "upper Don" area with a cumulative available flood-storage volume of $V_d = 0.567\,\mathrm{Mm}^3$ behind leaky dams. See Fig. 8 for a map of the catchment. In order to describe and quantify the scenarios, a "rainfall fraction" vector (RFV) and corresponding notation $(\alpha, \beta, 1 - \alpha - \beta)$ is introduced, with $0 \le \alpha, \beta \le 1$ and $\alpha + \beta \le 1$, to indicate weights of the relative rainfall in the above-defined (reservoir, Sheaf, upper Don) areas. For example, the rainfall vector $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$ signifies that rain has fallen uniformly across all three areas, each catching a third of the FEV, while $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$ means that rainfall was evenly distributed across only the reservoir and Sheaf areas but not the upper Don area; similarly, $(0, 0, 1)$ means that all rain fell in only the upper Don area.
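The RFV notation translates directly into code. The following Python sketch is illustrative only: the helper names are hypothetical, exact fractions are used to avoid rounding in the weights, and the check that the third component is non-negative is our reading of the constraint on $\alpha$ and $\beta$.

```python
from fractions import Fraction

AREAS = ("reservoir", "Sheaf", "upper Don")


def rainfall_fraction_vector(alpha, beta):
    """Return the RFV (alpha, beta, 1 - alpha - beta) as exact fractions."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    gamma = 1 - alpha - beta
    if min(alpha, beta, gamma) < 0:
        raise ValueError("all three rainfall fractions must be non-negative")
    return (alpha, beta, gamma)


def volumes_per_area(rfv, fev_mm3=3.0):
    """Split the FEV (in Mm^3) over the three areas according to the RFV."""
    return {area: float(weight) * fev_mm3 for area, weight in zip(AREAS, rfv)}


uniform = rainfall_fraction_vector(Fraction(1, 3), Fraction(1, 3))
two_areas = rainfall_fraction_vector(Fraction(1, 2), Fraction(1, 2))
print(volumes_per_area(uniform))
print(volumes_per_area(two_areas))
```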
The various rainfall scenarios considered below (and summarised in Table 3) are somewhat arbitrary but are chosen such that, in winter, more coherent larger-scale rainfall patterns are favoured relative to the summer, which is itself prone to having more isolated rainfall patterns. Seven different spatially distributed scenarios for an extreme flood with FEV $V_e = 3.00\,\mathrm{Mm}^3$ are now readily described by their respective RFVs and listed below; for later use, they are also assigned corresponding seasonal probabilities of occurrence.
• S1: rainfall uniform across all three areas with RFV $(\tfrac{1}{3}, \tfrac{1}{3}, \tfrac{1}{3})$, with $p_{S1,w} = \tfrac{1}{2}$ in the winter and $p_{S1,s} = \tfrac{1}{4}$ in the summer.
• S2: rainfall distributed evenly across two of the three areas with RFV options (a) $(\tfrac{1}{2}, \tfrac{1}{2}, 0)$, (b) $(0, \tfrac{1}{2}, \tfrac{1}{2})$ or (c) $(\tfrac{1}{2}, 0, \tfrac{1}{2})$, with $p_{S2(a),w} = p_{S2(b),w} = \tfrac{1}{8}$ and $p_{S2(c),w} = 0$ in the winter and $p_{S2(a),s} = p_{S2(b),s} = p_{S2(c),s} = \tfrac{1}{12}$ in the summer.
• S3: extreme rainfall localised to just one location with RFV options (a) $(1, 0, 0)$, (b) $(0, 1, 0)$ or (c) $(0, 0, 1)$, given a combined 25% probability in the winter and a combined 50% probability in the summer, such that $p_{S3(a),w} = p_{S3(b),w} = p_{S3(c),w} = \tfrac{1}{12}$ and $p_{S3(a),s} = p_{S3(b),s} = p_{S3(c),s} = \tfrac{1}{6}$.
Table 3. Summary of the seven precipitation scenarios, with rainfall fraction for the three locations, and seasonal probabilities.
To facilitate the presentation and quantification of the analysis, the above RFVs, multiplied by $V_e$, give the volumes per scenario and area, which may be readily summarised in the following matrix representation (truncated to a maximum of three decimal places), with row-wise scenarios and column-wise locations:
$$A_r = \begin{pmatrix} 1 & 1 & 1 \\ 1.5 & 1.5 & 0 \\ 0 & 1.5 & 1.5 \\ 1.5 & 0 & 1.5 \\ 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{pmatrix}\,\mathrm{Mm}^3.$$
The corresponding storage matrix, with the available storage $(V_r, 0, V_d)$ repeated in each row, reads
$$A_s = \begin{pmatrix} 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \\ 2.8 & 0 & 0.567 \end{pmatrix}\,\mathrm{Mm}^3.$$
The amount of possible storage $A_m$ in each area is the element-wise minimum of the storage matrix $A_s$ and the volume matrix per scenario $A_r$. For the rainfall scenarios considered herein, it is given by
$$A_m = \begin{pmatrix} 1 & 0 & 0.567 \\ 1.5 & 0 & 0 \\ 0 & 0 & 0.567 \\ 1.5 & 0 & 0.567 \\ 2.8 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0.567 \end{pmatrix}\,\mathrm{Mm}^3. \tag{4.4}$$
The sum of each row of $A_m$ yields the elements of the total storage vector $r_v$ for each rainfall scenario, also given as a fraction of the FEV by the vector $r_{vf} = r_v/V_e$, as follows:
$$r_v = (1.567, 1.5, 0.567, 2.067, 2.8, 0, 0.567)^T\,\mathrm{Mm}^3 \quad \text{and} \tag{4.5a}$$
$$r_{vf} = (0.5223, 0.5, 0.189, 0.689, 0.933, 0, 0.189)^T, \tag{4.5b}$$
with transpose $(\cdot)^T$. Finally, the probability distributions assigned to the seven scenarios in winter and summer are summarised in the vectors
$$v_w = (\tfrac{1}{2}, \tfrac{1}{8}, \tfrac{1}{8}, 0, \tfrac{1}{12}, \tfrac{1}{12}, \tfrac{1}{12})^T \quad \text{and} \tag{4.6a}$$
$$v_s = (\tfrac{1}{4}, \tfrac{1}{12}, \tfrac{1}{12}, \tfrac{1}{12}, \tfrac{1}{6}, \tfrac{1}{6}, \tfrac{1}{6})^T, \tag{4.6b}$$
with components summing to unity. The average flood storage attained by the two flood-mitigation measures in winter and summer is given by the inner products $m_1 = r_{vf} \cdot v_w = 0.4408 = 44.08\%$ and $m_2 = r_{vf} \cdot v_s = 0.4325 = 43.25\%$, respectively. The spread between the flood mitigation offered for each rainfall scenario (N.B. by only the leaky-dam and reservoir mitigation measures considered here; Sheffield is protected by other measures) is quite extreme since, in both winter and summer, 0% flood protection occurs when rain falls in only the Sheaf area (cf. scenario S3(b)), with a chance of $\tfrac{1}{12}$ and $\tfrac{1}{6}$ in winter and summer, respectively, while 93.3% flood protection occurs when rain falls in only the reservoir area (cf. scenario S3(a)), with probabilities of $\tfrac{1}{12}$ and $\tfrac{1}{6}$ in winter and summer, respectively. The standard deviations in the winter and summer are respectively $s_w = s_1 = 0.1751 = 17.51\%$ and $s_s = s_2 = 0.1638 = 16.38\%$, and are calculated as in (4.7). In Fig. 9, the above information is presented graphically for both winter and summer cases in terms of partitioned square 'flood-excess lakes' of the same volume as the FEV. These lakes are overlaid by not only the mean flood mitigation offered (and its standard deviation) but also the protection offered per rainfall scenario. This graphical interpretation of the analysis illustrates the fraction of FEV accounted for by each scenario in a concise and quantifiable manner and enables the reader (or stakeholder) to make informed choices when assessing potential flood-mitigation schemes.
Figure 9. A graphical overview of the fraction of the FEV captured by the two flood-storage measures, reservoirs and leaky dams in the reservoir and Upper Don areas of the Don catchment, respectively, for (a) the winter-rainfall scenarios and (b) the summer-rainfall scenarios. Stacked vertically are the respective probability distributions implied by the components of $v_w$ and $v_s$ in (4.6), relative to the associated FEV, which is fixed for all scenarios. The blue shaded areas to the left of the thick, stepped, solid line denote the fractions of the FEV mitigated per scenario, to be read horizontally (e.g., 93.3% for S3(a)). The mean fraction of the FEV mitigated (winter 44.08%, summer 43.25%) over all 7 scenarios and the standard deviation (winter 17.51%, summer 16.38%) are indicated by thick and thin vertical dashed lines, respectively.
For our hypothetical scenarios, we conclude that the proposed and current flood-mitigation measures, comprising NFM measures and the use of storage reservoirs, offer a significant extra reduction (circa 45%) of the FEV based on the 2007 flood data, but their standard deviation over the idealised rainfall distributions is, at circa 17%, relatively large. A significant portion of the volume is captured by the use of storage reservoirs (cf. scenarios 1, 2(a,c), 3(a)). Local knowledge of the catchment from the Environment Agency (namely, that the lower River Don is fed by water flow from three sub-catchments of roughly equal area, feeding into the Lower Don Valley where the Sheffield Hadfields river gauge is located) was used to simplify and carry out the analysis. Even within those simplifications one could further refine the analysis by using realistic rainfall distributions, discretised using a piecewise-constant representation for the seven seasonally adjusted partitions. The seven scenarios could also be further refined. It should be noted that the starting point of the analysis is that each rainfall event leading to a 1:200-year return-period flood event at Sheffield Hadfields has the same FEV as the 2007 flood. However, this does not necessarily imply that the total rainfall is the same in each scenario.
The framework offered by the FEV matrix representations and subsequent interpretation (cf. Fig. 9) is simple yet elegant: information covering a range of rainfall scenarios, mitigation measures, and geographical areas of a river catchment is encapsulated in a single graphic. Moreover, it is highly flexible and can incorporate any number of scenarios, rainfall distributions, and locations; indeed, the process used to produce Fig. 9 has been fully automated in anticipation of being utilised to examine other scenarios.
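The matrix calculation leading to (4.4) to (4.6) and the seasonal means can be sketched in plain Python as follows. This is a minimal illustration rather than the automated process referred to above: the names are ours, and the rainfall fraction vectors for S1 and S2 are those consistent with the rows of (4.4) and the probabilities in (4.6). The standard deviation is not computed here.

```python
from fractions import Fraction as F

FEV = 3.00                    # V_e in Mm^3
STORAGE = (2.8, 0.0, 0.567)   # (reservoir, Sheaf, upper Don) storage in Mm^3

# Rainfall fraction vectors per scenario, in (reservoir, Sheaf, upper Don) order.
SCENARIOS = {
    "S1":    (F(1, 3), F(1, 3), F(1, 3)),
    "S2(a)": (F(1, 2), F(1, 2), F(0)),
    "S2(b)": (F(0), F(1, 2), F(1, 2)),
    "S2(c)": (F(1, 2), F(0), F(1, 2)),
    "S3(a)": (F(1), F(0), F(0)),
    "S3(b)": (F(0), F(1), F(0)),
    "S3(c)": (F(0), F(0), F(1)),
}
# Seasonal probabilities of the seven scenarios, as in (4.6).
V_W = (F(1, 2), F(1, 8), F(1, 8), F(0), F(1, 12), F(1, 12), F(1, 12))
V_S = (F(1, 4), F(1, 12), F(1, 12), F(1, 12), F(1, 6), F(1, 6), F(1, 6))


def mitigated_fractions():
    """Fraction of the FEV stored per scenario: row sums of the element-wise
    minimum of storage and rainfall volume, divided by the FEV."""
    result = []
    for rfv in SCENARIOS.values():
        stored = sum(min(s, float(w) * FEV) for s, w in zip(STORAGE, rfv))
        result.append(stored / FEV)
    return result


def seasonal_mean(fractions, probabilities):
    """Probability-weighted mean of the mitigated fractions (inner product)."""
    if sum(probabilities) != 1:
        raise ValueError("scenario probabilities must sum to one")
    return sum(f * float(p) for f, p in zip(fractions, probabilities))


r_vf = mitigated_fractions()
print("winter mean:", round(seasonal_mean(r_vf, V_W), 4))
print("summer mean:", round(seasonal_mean(r_vf, V_S), 4))
```

Adding a scenario, an area or a mitigation measure only requires extending `SCENARIOS`, `STORAGE` and the probability vectors, which reflects the flexibility of the matrix representation.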

Summary and discussion
By using flood-excess volume, we have demonstrated how to assess the efficacy of several proposed flood-mitigation measures based on Natural Flood Management (NFM) and on the controlled draw-down of water levels in drinking-water reservoirs prior to extreme-rainfall events. NFM examples included flood-attenuation features such as leaky dams, tree planting and peat restoration, with the latter two aiming to increase water retention after extreme rainfall. The concept and use of flood-excess volume (FEV) in this form was introduced in Bokhove et al. (2018a) and promoted in a novel cost-benefit analysis of a (partially) hypothetical Leeds flood-alleviation scheme for the Boxing Day 2015 flood of the River Aire. FEVs have been calculated here for two other river floods in Yorkshire: the Boxing Day 2015 flood of the River Calder, based on river-gauge data at Mytholmroyd, and the River Don flood in the summer of 2007, based on river-gauge data at Sheffield Hadfields. FEV is the flood volume one wishes to diminish to zero to prevent in-situ flooding. It is calculated by choosing an in-situ river-level threshold $h_T$ above which flooding occurs. From a full rating curve or approximations thereof, one can calculate a river discharge $Q_T$ associated with this threshold $h_T$. FEV is then the volume of water corresponding to discharge rates above $Q_T$ integrated over the duration of the flood.
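The verbal definition of the FEV given above can be sketched numerically. The following Python function is a simple illustration only: it assumes uniformly sampled discharge values and a rectangle rule, takes the threshold discharge $Q_T$ as given rather than deriving it from a rating curve, and the discharge values shown are hypothetical rather than gauge data.

```python
def flood_excess_volume(discharge_m3s, dt_s, q_threshold_m3s):
    """Rectangle-rule estimate of the FEV in m^3: the discharge in excess of
    the threshold Q_T, summed over all samples and multiplied by the time step."""
    excess = (max(q - q_threshold_m3s, 0.0) for q in discharge_m3s)
    return sum(excess) * dt_s


# Hypothetical 15-minute discharge readings in m^3/s (not real gauge data).
discharge = [80, 120, 180, 240, 260, 230, 170, 110, 70]
fev = flood_excess_volume(discharge, dt_s=900, q_threshold_m3s=150)
print(f"FEV = {fev:.0f} m^3")
```

Only the samples above the threshold contribute, so the result is insensitive to how long the record extends before and after the flood.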
For the design of flood protection it is particularly valuable to address the following question: what fraction of the FEV is reduced by a particular proposed flood-mitigation measure? It is insightful to express FEV first as the equivalent capacity of a two-metre-deep square flood-excess lake. Flood-mitigation measures can thus be expressed visually and clearly as rectangular subsections of this square flood-excess lake. Costs of each flood-mitigation measure can readily be presented in that square above and below double-headed arrows, alongside the respective fraction of each flood-mitigation measure, thus leading to a clear and visual instrument for decision-making. Our analysis therefore leads to a digestible dissemination of flood-alleviation plans for policy makers, the public and flood practitioners. A novel aspect here is that we have used FEV to analyse various actual NFM measures explored in the catchments of the River Don and River Calder, as proposed by stakeholders from city councils, Yorkshire Water and the Environment Agency.
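The square-lake representation follows from dividing the FEV by the lake depth and taking the square root. A minimal Python sketch, with function names of our own choosing and the two FEVs quoted in this paper as inputs, reads:

```python
import math


def lake_side_length_m(fev_m3, depth_m=2.0):
    """Side length (m) of a square lake of the given depth holding the FEV."""
    return math.sqrt(fev_m3 / depth_m)


def strip_width_m(fev_m3, mitigated_fraction, depth_m=2.0):
    """Width (m) of the rectangular strip of the lake, spanning its full
    length, that represents a measure mitigating the given FEV fraction."""
    return mitigated_fraction * lake_side_length_m(fev_m3, depth_m)


for name, fev in (("River Calder", 1.65e6), ("River Don", 3.00e6)):
    print(f"{name}: side length {lake_side_length_m(fev):.0f} m")
```

Because each strip spans the full side of the square, its width is proportional to the fraction of the FEV mitigated, which is what makes the graphic directly readable.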
Concerning the River Calder, the conclusion of our analysis is that the NFM measures proposed contribute only a small fraction (about 1% or less) of the FEV required to mitigate against an extreme flood with a 1:100-year return period. Despite its popularity in the UK and its media, flood mitigation by NFM is prone to a scalability problem: while NFM can often reduce flooding locally and/or for low return-period events, NFM cannot readily be, and rarely is, scaled up as a flood-mitigation measure for large-scale and extreme floods (cf. a cautionary remark in Lane (2017)). For the unrealistic case of proposing beaver dams to mitigate extreme floods, as done in The Guardian (2017) and Puttock et al. (2017) regarding a beaver colony in the River Tamar catchment, we refer to Bokhove et al. (2018b); to wit: the enhanced volume of water stored behind these beaver dams (Puttock et al. 2017) divided by the FEV is less than 0.1%, whether one uses the FEVs for the Aire or Calder Boxing Day 2015 floods (using data at Armley (Bokhove et al. 2018a) or Mytholmroyd) for a comparison, or high River Tamar floods in 2012 and 2013 (using data at Gunnislake) (Bokhove et al. 2018b). The benefit of using an FEV analysis is that it quantifies this lack of scalability and the potential (or lack thereof) for upscaling in an easy-to-understand way.
To mitigate floods for the River Calder in Mytholmroyd in a realistic manner, we therefore significantly scaled up the contributions of NFM, i.e., flow-attenuation features and tree planting, while using the factual volume estimates provided for the drinking-water reservoirs. In addition, we have provided reasonable cost estimates; moreover, what is novel and in high demand by flood professionals is that we have also included estimates of the long-term maintenance costs of NFM over 50 years †. Given the high uncertainty of the effectiveness of the flood-mitigation measures due to rainfall distributions and their robustness, our FEV analysis for the River Calder demonstrates that other classical flood-mitigation measures such as higher flood-defence walls or enhanced flood-plain storage, relative to the current case, are likely still to be required. For the River Don, the situation is different because Sheffield CC has taken a mixture of grey-green flood-mitigation measures to protect against a flood with a 1:200-year return period such as the one of the summer of 2007. We extended our analysis of the two viable NFM measures proposed by Sheffield CC by investigating the variability caused by a range of rainfall distributions across three distinct subcatchments of the River Don. The visualisation of our new FEV analysis illustrates that there is considerable variation in the flood protection offered by NFM and reservoirs, ranging from 0% to 93%, but also that there is an additional average flood protection of around 45% of the original FEV. In conclusion, the flood-mitigation analyses for both the River Calder and the River Don show that NFM can be upscaled to play a significant role in flood mitigation.
Finally, our analyses show that the proposed static draw-down of drinking-water reservoirs reduces the FEVs most significantly. Both static and dynamic draw-down of reservoirs require more investigation into how to optimise the conflicting demands on drinking-water capacity, flood buffering and mitigation, reservoir safety, scour and ecological impact due to the relatively rapid draw-downs needed prior to probabilistic forecasts of extreme rainfall. Modern optimisation strategies for river-flood mitigation such as in Breckpot (2013), Breckpot et al. (2013) and Vermuyten et al. (2018) can form a starting point for such investigations.