Quantifying Streamflow Depletion from Groundwater Pumping: A Practical Review of Past and Emerging Approaches for Water Management

Groundwater pumping can cause reductions in streamflow (“streamflow depletion”) that must be quantified for conjunctive management of groundwater and surface water resources. However, streamflow depletion cannot be measured directly and is challenging to estimate because pumping impacts are masked by streamflow variability due to other factors. Here, we conduct a management‐focused review of analytical, numerical, and statistical models for estimating streamflow depletion and highlight promising emerging approaches. Analytical models are easy to implement, but include many assumptions about the stream and aquifer. Numerical models are widely used for streamflow depletion assessment and can represent many processes affecting streamflow, but have high data, expertise, and computational needs. Statistical approaches are a historically underutilized tool due to difficulty in attributing causality, but emerging causal inference techniques merit future research and development. We propose that streamflow depletion‐related management questions can be divided into three broad categories (attribution, impacts, and mitigation) that influence which methodology is most appropriate. We then develop decision criteria for method selection based on suitability for local conditions and the management goal, actionability with current or obtainable data and resources, transparency with respect to process and uncertainties, and reproducibility.


INTRODUCTION
Conjunctive water management, which acknowledges the interconnected nature of groundwater and surface water and manages them as a single resource, is critical to sustain both human society and aquatic and terrestrial ecosystems. Groundwater inflow to streams provides a stable supply of water, which sustains human water needs for domestic use, industry, and agriculture (Taylor et al. 2013; Gleeson, Cuthbert, et al. 2020) and supports ecological communities (Larsen and Woelfle-Erskine 2018). Streamflow depletion, defined as "a reduction in total streamflow caused by groundwater pumping", can occur in both gaining and losing streams (Figure 1). Streamflow depletion occurs when pumping captures groundwater that otherwise would flow from the aquifer to the stream (reduced gains in a gaining stream), reverses the direction of flow at the stream-aquifer interface (transition from gaining to losing stream), or increases the rate of infiltration losses through the streambed (increased losses in a losing stream). For further background and details on streamflow depletion, please see Barlow and Leake (2012).
Streamflow depletion is particularly problematic when it causes streamflow to drop below environmental flows, defined as "the quantity, timing, and quality of freshwater flows and levels necessary to sustain aquatic ecosystems which, in turn, support human cultures, economies, sustainable livelihoods, and well-being" (Arthington et al. 2018). Streamflow depletion has already impaired environmental flows around the world (Konikow and Leake 2014;de Graaf et al. 2019), with diverse local impacts including transitions from perennial to non-perennial streams (Zimmer et al. 2020;Zipper, Hammond, et al. 2021), impairment of surface water right holders (Idaho Water Resource Board 2019) and collapse of aquatic ecosystems (Perkin et al. 2017). Impairment of environmental flows due to streamflow depletion is anticipated to become more widespread in the future and will be exacerbated by climate change (de Graaf et al. 2019).
Unfortunately, streamflow depletion is challenging to measure directly and, as a result, the extent to which groundwater pumping affects streamflow remains unknown or uncertain, even in settings where the hydrology has been previously studied. Quantifying streamflow depletion is hard because significant time lags between pumping and changes in streamflow may exist, and these lags vary as a function of well-stream geometry and aquifer characteristics (Bredehoeft 2011). Furthermore, the signal of streamflow depletion will be convoluted with all other factors impacting both short-term and long-term streamflow variability (Barlow and Leake 2012), many of which are difficult to characterize such as surface water diversions, weather variability, reservoir operations, land use change, and climate change. While streamflow depletion can be measured at the scale of an individual stream reach using intensive field measurements (Hunt et al. 2001;Kollet and Zlotnik 2003;Lee et al. 2017), it is not possible to measure streamflow depletion at the regional scale, nor resolve depletion in individual segments, using observational data alone.
Figure 1. Labels: Water Table; Groundwater Pumping; Groundwater Flow Paths. Water that is pumped from a well comes from two sources. Groundwater Depletion: pumping reduces groundwater storage; this can be quantified by measuring changes in groundwater levels. Streamflow Depletion: pumping captures groundwater that would have flowed into the stream and/or induces infiltration from the stream into the aquifer; this cannot be directly measured and is challenging to estimate.

Since regional-scale streamflow depletion cannot be measured, managers must base decisions on streamflow depletion estimates. Three primary approaches for estimating regional-scale streamflow depletion are analytical, numerical, and statistical models. Each approach has strengths and weaknesses for decision support purposes, making the selection of an appropriate method challenging. Analytical models were the first approaches developed for estimating streamflow depletion (Theis 1941; Glover and Balmer 1954) and have relatively low data and computational requirements, but contain many simplifying assumptions that reduce their flexibility (Hunt 2014; Huang et al. 2018). In contrast, numerical models allow for a more realistic representation of groundwater and surface water interactions and are often considered the "gold standard" for streamflow depletion assessment in that they are expected to be the most accurate, but are complex and require significant time, data, and expertise for their development, and are available only in limited locations (Mehl and Hill 2010; Barlow and Leake 2012; Fienen, Bradbury, et al. 2018; Fienen et al. 2016). Finally, statistical models attempt to relate changes in streamflow to potential drivers such as groundwater pumping and climate variability, but are limited in their ability to identify causal relationships (Barlow and Leake 2012; Karpatne et al. 2019) and to our knowledge have only rarely been used to quantify streamflow depletion. However, the use of statistical models in other fields such as climate change attribution suggests that their use may evolve going forward, particularly given recent advances in physics-informed statistical methods (Read et al. 2019). Quantifying streamflow depletion is important for numerous water management decisions, and water managers must choose among the variety of available approaches by considering their strengths and weaknesses relative to available resources.
To serve this process, our objective is to review and synthesize the advantages, disadvantages, and uncertainties in streamflow depletion estimation methods and provide water managers with a better foundation to select the most appropriate method(s) based on the management question, hydrogeological setting, data, and resources available. We provide examples to illustrate the relative utility and practicality of these approaches, and while we focus primarily on North American examples, the applicability of this work is global, much like the problem of streamflow depletion (Rohde et al. 2017; Gleeson and Richter 2018; de Graaf et al. 2019).
In this review, we use the title "water manager" to encompass multiple types of publicly and privately employed decision-makers, including staff of organizations like state or provincial water planning or regulation offices, irrigation districts, fish and wildlife organizations, watershed associations, and/or other parties working with these agencies such as environmental consultants or nongovernmental organizations. We collected literature and policy for review through several approaches including (1) searching databases (i.e., Web of Science, Google Scholar) with relevant terms such as "streamflow depletion"; (2) drawing on studies with which our group of authors was familiar; and (3) forward and backward citation tracing from studies identified in steps (1) or (2). We also had semi-structured conversations with five water managers, with specific roles spanning water planning and regulation, environmental consulting and decision support, and environmental nongovernmental organizations; more details about these conversations are in the Appendix. The focus on water management applications and inclusion of recent and emerging methods of streamflow depletion estimation distinguishes this work from the foundational contributions of Barlow and Leake (2012).

Management and Policy of Interconnected Groundwater and Surface Water
Water management primarily interfaces with streamflow depletion through questions related to changes in surface water flows to ensure water availability for downstream users and/or maintain environmental flows for aquatic ecosystems. Historically, groundwater resources and surface water resources have often been treated separately (Bredehoeft and Young 1983; Gleeson et al. 2012), but in recent decades conjunctive water management frameworks that acknowledge the interconnected nature of surface water and groundwater are being applied in many jurisdictions.
Conjunctive water management frameworks from around the world include significant variation in how (or if) streamflow depletion is addressed. In the United States (U.S.), California's Sustainable Groundwater Management Act mandates that groundwater pumping have no unreasonable impact on interconnected surface water (Rohde et al. 2018). In Canada, British Columbia's Water Sustainability Act requires that wells do not cause reductions in streamflow beyond environmental limits (Water Sustainability Act 2014). In the European Union, the European Water Framework Directive requires that pumping not impair environmental flows in surface water such as streams, though specifics on streamflow depletion estimation are not provided (Kallis and Butler 2001;Gleeson and Richter 2018). Australia's 2004 National Water Initiative acknowledged the interconnectivity of groundwater and surface water resources and requires conjunctive management, including explicit consideration of the impacts of impaired flows on groundwater-dependent ecosystems such as communities in groundwater-fed streams (Rohde et al. 2017;Ross 2018).
Despite these examples, effective conjunctive management of surface water and groundwater is lagging behind scientific understanding in many settings. A review of 54 groundwater management plans in the U.S. found that only six (11%) had quantitative targets related to streamflow depletion (Gage and Milman 2020), and there are many regions around the world where streamflow depletion is not addressed by water management. In India, for example, groundwater and surface water are typically managed separately (Srinivasan and Kulkarni 2014;Harsha 2016), and therefore "groundwater use is not considered to be linked to streamflow and is decoupled from the surface water allocation" by water management groups (Biggs et al. 2007). Even where new regulations and policies are made to address the interconnected nature of groundwater and surface water, there can be legacy effects of a different or unregulated past that adversely impact water resources (Owen et al. 2019).
The wide range of approaches to identifying, quantifying, and managing streamflow depletion around the world, as well as variable regulatory frameworks, demonstrates the need for decision resources water managers can use to select and implement appropriate streamflow depletion estimation approaches.

Streamflow Depletion Management Decisions
We identified a number of common water management questions related to streamflow depletion (Table 1; Figure 2). Broadly, these questions can be categorized into three thematic groups:
1. Attribution: Does pumping contribute to decreases in streamflow and, if so, how do pumping impacts compare to other drivers of change?
2. Impacts: What are the implications of streamflow depletion for water users, ecosystems, and society?
3. Mitigation: How can negative impacts of streamflow depletion be minimized?
Different types of information are needed to answer these questions. For attribution questions, it is necessary to quantify the relative importance of different potential drivers (e.g., climate, pumping, land use) on historical streamflow variation. For impacts questions, useful information includes the magnitude of change in streamflow (relative to management targets and/or environmental flows) that would occur as a result of pumping from a well or group of wells. Answering mitigation questions requires understanding the impacts of pumping at different times of year and the magnitude and time scale of a stream's recovery following the cessation of pumping. For all of these questions, estimates are often required at different times of year and for different locations within the stream network. Furthermore, taking management action in response to these questions involves balancing the costs, benefits, and risks of a given management strategy, and therefore the depletion estimates that underlie these decisions should include some information about the magnitude and sources of uncertainty (Doherty and Simmons 2013; White, Foster, et al. 2021).

Characteristics of a Successful Streamflow Depletion Estimation Approach
Many factors contribute to water management decisions (Figure 2). Based on literature review and our experience, we suggest four general characteristics that are essential to providing decision support for streamflow depletion management. The first two characteristics can help guide the selection of an appropriate method.
Well-Suited to Local Conditions. In order to isolate the signal of pumping, the streamflow depletion estimation method should be able to account for other potential influences on streamflow, and associated uncertainty, within the domain of interest (e.g., Knowling et al. 2020). Depending on the region, these may include weather and climate variability, land use change, surface water withdrawals, reservoir operations, or other ways that humans modify the water cycle (Abbott et al. 2019; Gleeson, Wang-Erlandsson, et al. 2020). Local expert knowledge, in the form of a place-based understanding of processes that are currently and have historically affected local hydrology, is essential for identifying the potential influences on streamflow that need to be considered by a streamflow depletion estimation approach, and also because depletion management policies are increasingly implemented at local scales (Opdam et al. 2013).
Actionable. For management purposes, the method must be able to provide an estimate within an acceptable margin of error with input data that either already exist or can be obtained, and provide sufficient information about prediction uncertainty so that a water manager can weigh costs, benefits, and risks of their decision options (Doherty and Simmons 2013; Fienen et al. 2021). Implicit within actionability are numerous practical considerations, including whether there is sufficient in-house expertise to implement the method or whether analysis must be contracted, and the related issue of whether the cost of obtaining streamflow depletion estimates is affordable.
The third and fourth characteristics are good scientific practices to enhance trust and engagement regardless of the specific streamflow depletion estimation method used.
Transparent. The logic behind the choice of the method, including the strengths, weaknesses, assumptions, and uncertainties of the chosen approach and any alternatives, should be communicated to parties who will be affected by the streamflow depletion estimates (Eker et al. 2018). Ideally, the study design would incorporate these parties because co-development of methods and scenarios enhances understanding of, and trust in, the resulting streamflow depletion estimates (Kniffin et al. 2020), increases the perceived legitimacy of research (Dickert and Sugarman 2005), and can improve the quality of decisions (Reed 2008). Furthermore, uncertainty and sensitivity analyses are necessary to evaluate the overall uncertainty in estimates and relative importance of different input parameters, respectively (Pianosi et al. 2016;Saltelli et al. 2019).
Reproducible. Ensuring that the analysis and results can be reproduced is essential to enhancing trust in streamflow depletion estimates and addressing potential legal challenges to official decisions (Munafò et al. 2017). Necessary steps to ensure reproducibility would likely include archiving raw and processed data files, model input files, calibration datasets, and the code necessary to run any analyses or models, along with the software versions used (Wilkinson et al. 2016; Lowndes et al. 2017). In some settings, in particular at smaller spatial scales where there are fewer pumping wells, care should be taken to ensure that individual privacy is not compromised during data sharing by anonymizing or aggregating data to coarser scales (Zipper, Stack Whitney, et al. 2019). While there have been substantial recent improvements in open-source tools to enable reproducible hydrological modeling workflows (Bakker et al. 2016; Fienen et al. 2021; White, Hemmings, et al. 2021), in practice true reproducibility remains rare in hydrological science (Stagge et al. 2019), indicating that the hydrologic community must continue to improve with regard to reproducibility.

METHODS USED FOR QUANTIFYING STREAMFLOW DEPLETION
In this section, we describe the strengths and weaknesses of analytical, numerical, and statistical approaches to estimate streamflow depletion (Table 2), and provide examples of where each method has been used for making water management decisions related to streamflow depletion.

Analytical Models
Overview. Analytical models were the first tool developed for streamflow depletion estimation, and have been used for almost 80 years in many regulatory and other resource management circumstances (Theis 1941; Glover and Balmer 1954; Hantush 1965; Jenkins 1968). Analytical models adopt a number of assumptions to simplify stream-aquifer interactions and estimate streamflow depletion based on governing equations for groundwater flow and the conservation of mass (Barlow and Leake 2012). They typically provide streamflow depletion estimates caused by a single well in a single stream, though estimates of depletion are often combined additively to account for impacts of multiple wells. Huang et al. (2018) review the large number of existing analytical models and present a guide for analytical model selection based on aquifer and stream characteristics.
Strengths. The primary strengths of analytical models are their relatively low data requirements and their ease of use (Table 2). For example, the only inputs required by the widely used Glover and Balmer (1954) model are aquifer transmissivity, storativity, and the distance from the well to the stream. The more complex Hunt (1999) model requires only a single additional term, the streambed conductance, to account for a potential low-permeability streambed layer, though distributed regional-scale estimates of streambed conductance are challenging to measure and rarely available (Christensen 2000; Korus et al. 2018, 2020; Abimbola et al. 2020). Spreadsheet tools are available online to calculate streamflow depletion with a variety of analytical models (e.g., Environment Canterbury 2020). Since analytical model calculations can be conducted rapidly, they are well-suited for integration into web-based decision support tools and can provide screening estimates to prioritize more detailed study (Huggins et al. 2018). Furthermore, these low computational costs enable rapid and straightforward sensitivity and uncertainty analysis of depletion results, though these assessments are inherently limited by the assumptions required to develop analytical models (see "Weaknesses" subsection).
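To illustrate how few inputs these models need, the following minimal Python sketch evaluates the commonly cited closed-form expressions of the Glover and Balmer (1954) and Hunt (1999) solutions for the fraction of the pumping rate supplied by streamflow depletion. The function names, argument names, and example parameter values are illustrative assumptions rather than part of any published tool; time since the start of continuous pumping is an additional input, and all quantities must use consistent units (here meters and days).

```python
import math


def glover_balmer_fraction(t, d, S, T):
    """Depletion rate as a fraction of pumping rate (Qs/Qw).

    t: time since pumping began, d: well-stream distance,
    S: storativity (dimensionless), T: transmissivity.
    """
    if t <= 0:
        return 0.0
    return math.erfc(math.sqrt(S * d ** 2 / (4.0 * T * t)))


def hunt_fraction(t, d, S, T, lam):
    """Same fraction with a streambed conductance term lam (length/time).

    Note: math.exp can overflow for very large lam or t; this sketch
    does not guard against that case.
    """
    if t <= 0:
        return 0.0
    z = math.sqrt(S * d ** 2 / (4.0 * T * t))
    a = lam ** 2 * t / (4.0 * S * T) + lam * d / (2.0 * T)
    b = math.sqrt(lam ** 2 * t / (4.0 * S * T)) + z
    return math.erfc(z) - math.exp(a) * math.erfc(b)


# Hypothetical example: well 500 m from the stream, pumped for one year.
frac_gb = glover_balmer_fraction(t=365.0, d=500.0, S=0.1, T=100.0)
frac_hunt = hunt_fraction(t=365.0, d=500.0, S=0.1, T=100.0, lam=1.0)
print(f"Glover-Balmer Qs/Qw: {frac_gb:.3f}")
print(f"Hunt (streambed resistance) Qs/Qw: {frac_hunt:.3f}")
```

In this sketch the streambed term can only reduce the estimated depletion relative to the Glover and Balmer (1954) value, which is consistent with its role of representing a low-permeability streambed layer.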
Weaknesses. The primary weakness of analytical models is the number of simplifying assumptions required to derive analytical solutions. Common assumptions include a homogeneous and isotropic subsurface, linear streams, and constant water levels in the stream and aquifer through time. These assumptions limit the ability of analytical models to represent some important processes, such as changes in phreatophytic evapotranspiration caused by pumping, and the possible scope of uncertainty analysis, since the impact of many uncertain processes and parameters cannot be evaluated due to the limited input requirements and simple model structure of analytical models (Table 2). Analytical models have been derived for many different, though still idealized, hydrogeological settings, including wedge-shaped aquifers at the confluence of two streams (Yeh et al. 2008), streams that intersect impermeable boundaries (Singh 2009), partially penetrating streams (Hunt et al. 2001; Hunt 2003), leaky aquifers (Butler et al. 2007; Zlotnik and Tartakovsky 2008), variable streambed conductivity (Neupauer et al. 2021), and impacts of land use change (Zlotnik 2015; Traylor and Zlotnik 2016).

Emerging Approaches. Recently, analytical depletion functions were proposed as an empirical tool to overcome the assumption of a linear stream by accounting for multiple affected stream reaches and stream sinuosity (Zipper, Dallemagne, et al. 2018; Li et al. 2020, 2021). Analytical depletion functions combine (1) an analytical model with stream proximity criteria, which are used to identify stream segments that are potentially affected by a well, and (2) a depletion apportionment equation, which then distributes the estimated streamflow depletion among the stream segments. In intermodel comparisons, the analytical depletion functions had better agreement with process-based numerical models than standalone analytical models, potentially indicating improved accuracy of spatially distributed estimates of streamflow depletion. Despite these improvements, analytical depletion functions are subject to most of the same assumptions as analytical models, and therefore require additional testing before widespread use.
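The two components described above can be sketched in a few lines of Python. This is a hedged illustration only: the inverse-distance weighting, the exponent of 2, the distance cutoff used as a proximity criterion, and all names and values are assumptions chosen for the example, not the specific equations used in the cited studies.

```python
def apportion_depletion(total_depletion, distances, exponent=2.0,
                        max_distance=None):
    """Split a total depletion estimate among candidate stream segments.

    distances: dict mapping segment id -> well-to-segment distance.
    max_distance: assumed proximity criterion; segments farther than
    this are treated as unaffected. Weights are inverse-distance based.
    """
    eligible = {
        seg: dist for seg, dist in distances.items()
        if dist > 0 and (max_distance is None or dist <= max_distance)
    }
    if not eligible:
        return {}
    weights = {seg: dist ** (-exponent) for seg, dist in eligible.items()}
    weight_sum = sum(weights.values())
    return {seg: total_depletion * w / weight_sum
            for seg, w in weights.items()}


# Hypothetical well with three nearby segments (distances in meters).
segment_distances = {"seg_A": 200.0, "seg_B": 400.0, "seg_C": 5000.0}
shares = apportion_depletion(100.0, segment_distances, max_distance=1000.0)
for seg, value in shares.items():
    print(seg, round(value, 2))
```

The design choice to pass the cutoff and exponent as arguments reflects that both are tunable choices in this sketch; in practice they would need to be evaluated against local conditions.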
Example Use in Management. Due to their relatively long history and ease of implementation, analytical models have been used for water management in a number of settings. In Colorado and other jurisdictions in the western U.S., the streamflow depletion factor (SDF) has been used to characterize streamflow depletion and establish regulatory guidelines for streamflow depletion by wells for streams that have senior rights holders (Miller et al. 2007). The SDF was defined by Jenkins (1968) from an analytical solution (Glover and Balmer 1954) as the time required for the streamflow depletion to equal 28 percent of the volume pumped from the well. The SDF is estimated using the distance from the well to the stream and the effective storativity and transmissivity of the aquifer. In some applications, the analytical solution itself is reduced to consideration of the SDF to account for the potential time lag between the initiation of pumping and impact on a stream, or, conversely, for the required time lag for the streamflow to recover once pumping is stopped. The SDF is convenient because this factor can be mapped (e.g., Jenkins and Taylor 1972) to support communication and management, and therefore provide a rapid tool for water managers to evaluate the relative magnitude and timing to impact of wells placed in different locations. Furthermore, in settings where response functions such as the SDF have been well-characterized and reliable groundwater withdrawal data are available, water use accounting can provide reasonable estimates of the attribution and impacts of streamflow depletion, as well as evaluate mitigation strategies.
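As a worked check of the SDF definition above, the following Python sketch computes the SDF in its commonly cited form (distance squared times storativity divided by transmissivity) and evaluates the cumulative depletion volume implied by the Glover and Balmer (1954) solution at a time equal to the SDF. The closed-form volume expression is obtained by integrating the depletion rate over time; the function names and parameter values are illustrative assumptions.

```python
import math


def stream_depletion_factor(d, S, T):
    """SDF = d^2 * S / T, in the time units of T."""
    return d ** 2 * S / T


def depleted_volume_fraction(t, sdf):
    """Cumulative depletion volume divided by cumulative pumped volume.

    Time integral of erfc(sqrt(sdf / (4 t))) under continuous pumping,
    normalized by t.
    """
    if t <= 0:
        return 0.0
    r = sdf / t
    return ((1.0 + r / 2.0) * math.erfc(math.sqrt(r / 4.0))
            - math.sqrt(r / math.pi) * math.exp(-r / 4.0))


# Hypothetical aquifer: d = 500 m, S = 0.1, T = 100 m^2/day.
sdf_days = stream_depletion_factor(d=500.0, S=0.1, T=100.0)
fraction_at_sdf = depleted_volume_fraction(sdf_days, sdf_days)
print(f"SDF = {sdf_days:.0f} days")
print(f"Volume fraction at t = SDF: {fraction_at_sdf:.3f}")
```

Under these assumptions the volume fraction at a time equal to the SDF is close to 0.28, matching the 28 percent in the definition.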
Another example is the State of Michigan's Water Withdrawal Assessment Tool (https://www.egle.state.mi.us/wwat/), which integrates an analytical model with a depletion apportionment equation to estimate potential impacts of groundwater pumping on surface water resources (Reeves et al. 2009). This tool is used to screen high-capacity well registration for the state using risk-based streamflow depletion criteria (Ruswick et al. 2010; Steinman et al. 2011). In the eleven years since use of the tool became part of the registration process, nearly 3,400 registrations were completed by passing the screening criteria. An additional 1,500 registrations did not initially pass the screening and were referred to the state for site-specific review, where all but 60 were allowed to register after additional analysis (Michigan Water Use Advisory Council 2020).

Numerical Models
Overview. In contrast to analytical models, numerical models typically include a three-dimensional representation of the surface and subsurface and solve for storage and flow throughout the domain. Typically, models are developed for a region of interest (such as an aquifer or a watershed), a process that includes considerable data collection, database management, model construction, history matching, and visualization. Streamflow depletion is estimated by comparing streamflow in simulations with and without pumping in all or a subset of the domain (Hill et al. 1992; Neupauer and Griebling 2012; Ahlfeld et al. 2016; Zipper, Gleeson, et al. 2021). Most streamflow depletion studies based on numerical models have used groundwater flow models such as MODFLOW, but recent examples have included integrated hydrologic models that couple land surface, vadose zone, and groundwater processes to simulate feedbacks between pumping, groundwater recharge, subsurface storage, and streamflow (Condon and Maxwell 2014, 2019; Woolfenden and Nishikawa 2014; Kollet et al. 2017). Numerical models for streamflow depletion estimation can be created at a variety of scales, ranging from an individual watershed or aquifer (Leaf et al. 2015; Tolley et al. 2019; Kniffin et al. 2020), to regions (Rossman and Zlotnik 2013), to continental or global (Condon and Maxwell 2019; de Graaf et al. 2019; Liu et al. 2019).
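The with-and-without-pumping comparison described above reduces to a simple difference once both simulations have been run. The Python sketch below assumes that streamflow time series for one stream segment have already been extracted from two otherwise identical model runs; the function name and all values are hypothetical placeholders, not output from any real model.

```python
def streamflow_depletion(q_no_pumping, q_pumping):
    """Depletion per time step = baseline streamflow - pumped streamflow."""
    if len(q_no_pumping) != len(q_pumping):
        raise ValueError("Both simulations must share the same time steps.")
    return [base - pumped for base, pumped in zip(q_no_pumping, q_pumping)]


# Hypothetical simulated streamflow at one segment (m^3/day).
baseline = [1000.0, 950.0, 900.0, 875.0]   # pumping turned off
scenario = [1000.0, 940.0, 870.0, 830.0]   # pumping turned on
pumping_rate = 100.0                       # m^3/day, assumed constant

depletion = streamflow_depletion(baseline, scenario)
fractions = [dep / pumping_rate for dep in depletion]
print(depletion)
print(fractions)
```

Expressing depletion as a fraction of the pumping rate, as in the last step, makes results comparable with the analytical model outputs discussed earlier.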
Strengths. Numerical models are often considered the "gold standard" of streamflow depletion assessment because they can evaluate the impacts of multiple scenarios caused by simultaneous changes in pumping, climate and land cover, be more readily tested via comparison to field data, and provide a rigorous framework for causation and uncertainty analysis (Hill and Tiedeman 2006;Barlow and Leake 2012;Knowling et al. 2019). As a result, numerical models are widely used management tools. As numerical models are based on the physical representation of hydrological processes and simulate both the storage and flux of water throughout the groundwater and interconnected surface water system, they are more flexible than analytical models. Processes such as vadose zone dynamics, phreatophytic evapotranspiration, and surface water management can be directly included within a numerical modeling framework to estimate their separate or combined impact on streamflow (Markstrom et al. 2008;Condon and Maxwell 2013;Brookfield and Gnau 2016;Zipper et al. 2017;Tolley et al. 2019), and data associated with each of these processes can be assimilated into the model during the history matching process (Camporese et al. 2010;Naz et al. 2019;Fienen et al. 2021).
Numerical models are typically discretized into grid cells or elements that cover the domain of interest so that each of these hydrological processes can be simulated in three spatial dimensions and through time. This process-based representation allows for explicit testing and evaluation of causal mechanisms because (for example) the effects of a pumping well on groundwater storage, streamflow depletion, evapotranspiration, and recharge can be estimated in a single simulation. In addition, the process-based representation allows users to estimate model uncertainty and identify key parameters and processes that contribute to uncertainty (Ferré 2017; Knowling et al. 2019, 2020). Since management decisions require evaluating costs, benefits, and risks, numerical models subjected to thorough uncertainty analysis can allow water managers to discriminate among competing conceptual models, reduce uncertainty through the collection of additional data, and assess the risk of undesirable outcomes (Ferré 2017; Leaf 2017; Enemark et al. 2019).
Weaknesses. Numerical models' complexity relative to the other approaches also introduces several limitations related to the data, computational, and human resources needed to develop numerical models appropriate for streamflow depletion assessment. Numerical models require hydrostratigraphic data at all grid cells or nodes (which can number from hundreds to millions), as well as appropriate parametrization for any other processes included in the simulations such as streambed properties or evapotranspiration. This requires substantial user input and expertise, including the need to make numerous subjective decisions about the processes included and how they are represented, which has been referred to as "the art of environmental simulation" and is developed through training and experience (Doherty and Simmons 2013). Often, limited field observations mean that these values are estimated from a small number of locations and extrapolated widely across the domain and/or derived from look-up tables, though ever-increasing availability of local, regional, and global-scale hydrometeorological and hydrogeological data is helping to address this challenge. Nonetheless, the high data need relative to data availability in many settings can mean that parties whose water use is affected by the outputs of the model may be concerned that the numerical model does not accurately reflect their particular context (e.g., Wardropper et al. 2017).
For a numerical model to be confidently used in streamflow depletion assessment, history matching should be performed to ensure that simulated baseflow and hydraulic head agree with observations at numerous points within the domain and for a range of different pumping conditions (Hill 2006; Hill and Tiedeman 2006). Given the highly parameterized nature of numerical models and the fact that models can never exactly characterize the hydrologic system, they are typically nonunique, meaning that many different parameter combinations can provide equally good agreement with observations and can lead to uncertainty when testing scenarios outside the model calibration conditions (sometimes referred to as the "equifinality hypothesis"; Konikow and Bredehoeft 1992; Beven 2006; Hunt et al. 2020). This has precipitated a recent shift in the discipline toward ensemble-based model development that seeks to connect uncertainty between model inputs and outputs (e.g., Foster et al. 2021; White, Hemmings, et al. 2021), rather than calibration-focused strategies that seek to identify a single set of "correct" parameter values. However, calibration-focused strategies continue to be widespread and models developed in the past using these strategies continue to be used, and can lead to a false sense of accuracy in contexts with equifinality because the model can match historical data well and appear highly realistic even if processes and parameters are incorrect (Doherty and Moore 2020).
JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION
Adopting a "forecast first" workflow, where scenario forecasting efforts are iteratively integrated with model development and calibration (White 2017), can be valuable because it allows model creators to determine whether additional model complexity and calibration provide improved forecasts, thus ensuring that forecasts provide acceptable uncertainty for decision-makers to assess risk of undesirable outcomes relative to costs and benefits of a management action (Doherty and Simmons 2013).
Furthermore, increasing data availability is enabling calibration methods based on numerous targets such as groundwater head, evapotranspiration, and land surface temperature to provide a more robust approach for streamflow and groundwater head prediction compared to calibration based on head and discharge alone (Stisen et al. 2018). For example, Hunt et al. (2020) found that including both hydraulic head and fluxes in model development substantially improved history matching and forecasting capabilities compared to using hydraulic head alone, and that multivariate or multiobjective model calibration approaches can reduce overfitting even in highly parameterized models when the practitioner has sufficient deep knowledge and expertise to implement appropriate parameter regularization techniques (see also Moore and Doherty 2006). The use of multiple evaluation datasets is becoming more prevalent with the widespread use of integrated hydrologic models and the increasing amount of hydrological data (Schreiner-McGraw and Ajami 2020).
The ability to capture depletion dynamics depends heavily on the temporal and spatial resolution of the model. While a more refined grid provides greater detail on depletion dynamics, it can increase computational demand, potentially making simulations infeasible. Numerical models rely on the convergence of the flow solution to within some user-defined head threshold, which means that regional-scale numerical models are often poorly suited for estimating the impacts of an individual well, particularly in large domains, because they cannot estimate depletion that is less than the model's mass balance error (Leake et al. 2010). This further reinforces the point that decision support models should be specifically designed for the management action under consideration, rather than developing a single model for a region that is then used to answer a variety of different management questions (Doherty and Moore 2020).
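The detection limit described above can be expressed as simple arithmetic. The sketch below assumes that the smallest resolvable flux change is the model's total water budget multiplied by its percent mass balance discrepancy; the function names, budgets, and well depletion value are hypothetical and chosen only for illustration.

```python
def min_detectable_depletion(total_budget_m3d, percent_discrepancy):
    """Smallest flux change (m3/d) distinguishable from solver mass balance error."""
    return total_budget_m3d * percent_discrepancy / 100.0

def is_resolvable(well_depletion_m3d, total_budget_m3d, percent_discrepancy):
    """True if the expected depletion exceeds the model's mass balance error."""
    return well_depletion_m3d > min_detectable_depletion(total_budget_m3d, percent_discrepancy)

well_depletion = 300.0  # hypothetical depletion from one well, m3/d

# Hypothetical regional model: large water budget, 0.05% mass balance discrepancy
regional_ok = is_resolvable(well_depletion, 2_000_000.0, 0.05)

# Hypothetical local model: small water budget, same discrepancy
local_ok = is_resolvable(well_depletion, 20_000.0, 0.05)

print("regional model resolves the well:", regional_ok)
print("local model resolves the well:", local_ok)
```

Under these assumed numbers the same well is invisible to the regional model but resolvable in the local one, which is the practical argument for designing the model around the management action.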
Finally, some numerical modeling platforms (e.g., HydroGeoSphere, FEFLOW, COMSOL) are proprietary, which limits the ability of other users to inspect and reproduce any analysis done using these platforms. The most widely used numerical modeling platform (MODFLOW) as well as many emerging approaches (e.g., GSFLOW, ParFlow) are open source and are well-suited for streamflow depletion assessment in decision-making. There are also many emerging open-source tools for the reproducible creation and analysis of numerical models (Bakker et al. 2016; White et al. 2016, 2018; Gardner et al. 2018; Ng et al. 2018; Fienen et al. 2021; White, Hemmings, et al. 2021).
Emerging Approaches. Numerical models continue to evolve as computational resources, data, and understanding of hydrologic systems advance. Relevant to managing streamflow depletion, integrated hydrologic models that capture flow and transport dynamics across the hydrologic cycle are increasingly incorporating anthropogenic activities, such as groundwater pumping, surface water diversions, reservoir management, and economic factors (Morway et al. 2016; Brookfield et al. 2017; Niswonger et al. 2017; Boyce et al. 2020; Rouhi Rad et al. 2020). Some of these models incorporate water operational rules and constraints, thereby integrating water management decision-making into numerical models (Brookfield and Gnau 2016; Morway et al. 2016; Brookfield et al. 2017). This integration allows the co-evolution of hydrological, ecological, management, and societal conditions, rather than dependence on static boundary conditions and sources/sinks (Srinivasan et al. 2017; O'Keeffe et al. 2018; Konar et al. 2019). Examples include the Agricultural Water Use package for MODFLOW and GSFLOW, which can be used to estimate agricultural water use and resulting streamflow depletion impacts (Niswonger 2020).
Hydrologic models are also integrating and improving upon vegetation dynamics, allowing the models to better predict water demand and crop yields, which drive irrigation, in future climate and policy scenarios. For example, integration of crop growth and irrigation modules in the Variable Infiltration Capacity model (VIC-CropSyst) improved hydrologic simulations in agricultural watersheds (Malek et al. 2017). HydroGeoSphere recently incorporated on-demand irrigation into its modeling framework, which triggers groundwater extraction during the user-defined growing season when the pressure head at a specified location and depth declines below a prescribed level.
Coupling of the widely used Soil and Water Assessment Tool (SWAT) with MODFLOW and the groundwater reactive solute transport model RT3D (SWAT-MODFLOW-RT3D) has broadened the applicability of the model in regions with conjunctive water use or groundwater contamination (Wei et al. 2019).
Since complexity is one of the primary challenges for numerical model development and use, several promising emerging approaches seek to balance the advantages of improved process representation in numerical models while minimizing model complexity and runtime. For example, surrogate models are simplified models focused on the dominant features of a groundwater problem of interest to allow for more robust sensitivity analysis and scenario exploration than numerical models (Razavi et al. 2012; Asher et al. 2015). Hierarchical approaches to surrogate modeling exclude some processes and therefore have a faster model runtime while maintaining a high level of accuracy. For instance, in streamflow depletion studies it may be acceptable to simplify the representation of unsaturated zone processes, which can have substantial computational costs, if pumping is not expected to substantially change groundwater recharge. Data-driven approaches to surrogate modeling, also referred to as "metamodeling," train statistical models on the input and output data from numerical models and then use the simpler statistical models for scenario assessment. Metamodels have recently emerged in the groundwater community and can be incorporated into decision support systems for streamflow depletion scenario analysis (Fienen, Nolan, et al. 2015, 2016; Starn and Belitz 2018). However, both of these surrogate modeling approaches are still only feasible in locations where numerical models already exist. Spreadsheet-based approaches provide a simplified interface for creating and developing finite-difference numerical models with lower data and expertise requirements while still retaining strong process representation that allows for examination of multiple processes simultaneously (Robinson 2020), and therefore provide a promising intermediate-complexity approach between numerical and analytical models.
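The metamodeling idea can be sketched in a few lines. The example below is a toy under stated assumptions: a closed-form depletion function stands in for an expensive numerical model, the metamodel is a cubic least-squares polynomial in well distance, and all names and parameter values are hypothetical. A real metamodel would be trained on archived runs of an actual numerical model and would usually have many inputs.

```python
import math

def numerical_model_stand_in(distance_km, t_days=365.0, T=500.0, S=0.10):
    """Placeholder for an expensive simulation; returns a depletion fraction."""
    distance_m = 1000.0 * distance_km
    return math.erfc(math.sqrt(S * distance_m**2 / (4.0 * T * t_days)))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_polynomial(xs, ys, degree=3):
    """Least-squares polynomial coefficients via the normal equations."""
    A = [[sum(x ** (i + j) for x in xs) for j in range(degree + 1)]
         for i in range(degree + 1)]
    b = [sum(y * x**i for x, y in zip(xs, ys)) for i in range(degree + 1)]
    return solve(A, b)

def metamodel(coeffs, distance_km):
    """Evaluate the fitted polynomial at a new well distance."""
    return sum(c * distance_km**k for k, c in enumerate(coeffs))

# Training set: 30 "model runs" at well distances of 0.1 to 3.0 km
train_x = [0.1 * i for i in range(1, 31)]
train_y = [numerical_model_stand_in(x) for x in train_x]
coeffs = fit_polynomial(train_x, train_y)

# Hold-out check at distances that were not used for training
test_x = [0.15 + 0.1 * i for i in range(29)]
max_error = max(abs(metamodel(coeffs, x) - numerical_model_stand_in(x)) for x in test_x)
print(f"maximum hold-out error in depletion fraction: {max_error:.4f}")
```

Once fitted, the metamodel can be evaluated thousands of times at negligible cost, but only within the range of inputs covered by the training runs.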
Example Use in Management. Numerical models have been used to estimate streamflow depletion in many settings around the world. One example is the Republican River Compact Administration groundwater model (RRCA 2003), which is a MODFLOW model used to make water allocation decisions among the states of Colorado, Nebraska, and Kansas. The original 1943 Republican River Compact allocated the distribution of water among subbasins in each of the three states, but did not explicitly address how to account for streamflow depletion caused by groundwater pumping. Following a U.S. Supreme Court settlement between Kansas, Nebraska, and Colorado, the interstate compact was modified to account for streamflow depletion due to groundwater extraction, which is quantified using the groundwater flow model jointly developed by the three states and federal government (RRCA 2003; Zipper, Gleeson, et al. 2021). Each year, the states submit estimates of water supply and use, jointly evaluate the results of water accounting, update the MODFLOW model to estimate groundwater consumptive use and streamflow depletion across the basin, and assess compliance with the terms of the Republican River Compact and legal settlements.

Statistical Assessments and Models
Overview. In contrast to analytical and numerical models, both of which model physical processes using governing equations of water flow, statistical approaches rely on interpolations, extrapolations, and relationships among observed data to characterize hydrologic states and fluxes. These statistical approaches are based on physical hydrological processes through the selection of relevant variables or model structures that have the potential to reflect key processes influencing streamflow. Therefore, adopting a statistical approach does not lead to the exclusion of physical process understanding, but merely means that relationships among variables are not necessarily controlled by governing equations such as Darcy's Law. There are numerous statistical approaches that have been used or are relevant to streamflow depletion assessment, and we adopt a broad definition to include emerging data-driven approaches such as machine learning within our discussion. Here, we distinguish between statistical assessments, which analyze hydrologic variables (e.g., trend analysis), and statistical models, which estimate hydrological variables (e.g., regression analysis).
Statistical assessments of streamflow depletion typically quantify changes or trends in streamflow or baseflow as well as changes or trends in potential drivers such as groundwater pumping and precipitation, and relate the two. For example, Kustu et al. (2010) observed a spatial match between negative trends in groundwater levels and streamflow across the U.S. High Plains Aquifer and inferred a connection between the two based on the absence of potential explanatory precipitation trends, and Juracek (2015) compared numerous gages in southern Kansas and found significant decreasing streamflow trends in basins with the greatest groundwater level decline and a lack of precipitation trends, which together suggested that streamflow depletion was the cause of observed streamflow trends. In Brazil, Lucas et al. (2021) suggested streamflow depletion was leading to a decline in baseflow due to a spatial agreement between declining baseflow trends, increasing evapotranspiration trends, and irrigated agricultural land. In contrast to statistical assessments, statistical models applied to streamflow depletion estimation typically attempt to quantify some relationship between groundwater pumping and long-term changes in streamflow and/or baseflow, often as one of several predictors. For instance, Holtschlag (2019) included irrigation in linear mixed models of summer water yield for many watersheds in Michigan, allowing them to determine whether it was an important predictor of streamflow; similar approaches have been used elsewhere (Burt et al. 2002; Prudic et al. 2006). Broadly, statistical assessments can identify potential drivers of streamflow depletion, and the links identified through assessment can then be represented and tested using more detailed approaches such as analytical, numerical, or statistical models.
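A trend-based statistical assessment of the kind described above can be sketched as follows. The example computes the Mann-Kendall S statistic and the Theil-Sen slope, two common nonparametric trend measures, for a baseflow series and a precipitation series. The data are hypothetical, and the significance test that would normally accompany the S statistic is omitted for brevity; the cited studies may have used different trend methods.

```python
from itertools import combinations
from statistics import median

def mann_kendall_s(values):
    """Mann-Kendall S: count of increasing pairs minus count of decreasing pairs."""
    s = 0
    for earlier, later in combinations(values, 2):
        s += (later > earlier) - (later < earlier)
    return s

def theil_sen_slope(years, values):
    """Median of the slopes between all pairs of observations."""
    points = list(zip(years, values))
    slopes = [(v2 - v1) / (y2 - y1) for (y1, v1), (y2, v2) in combinations(points, 2)]
    return median(slopes)

# Hypothetical annual series for one gaged basin
years = list(range(2001, 2011))
baseflow_mm = [10.0, 9.6, 9.8, 9.1, 8.9, 8.7, 8.8, 8.2, 8.0, 7.7]
precip_mm = [600, 640, 580, 620, 610, 590, 630, 600, 615, 605]

baseflow_s = mann_kendall_s(baseflow_mm)
precip_s = mann_kendall_s(precip_mm)
baseflow_slope = theil_sen_slope(years, baseflow_mm)

print("baseflow S:", baseflow_s, "slope (mm/yr):", round(baseflow_slope, 3))
print("precipitation S:", precip_s)
```

A strongly negative baseflow trend without a matching precipitation trend is consistent with streamflow depletion, but as in the studies above it is suggestive rather than causal evidence.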
Given the widespread availability of streamflow and meteorological data relative to groundwater data, there are numerous large-scale statistical assessments documenting trends in hydrological signatures that may be relevant to streamflow depletion. For example, Ayers et al. (2019) calculated monthly baseflow trends across the mid-western U.S. and found significant negative trends in areas with widespread groundwater pumping such as western Kansas and Nebraska. In practice, statistical models are rarely used for streamflow depletion management, largely due to an inability to assess causal relationships and responses to management actions, though the emerging data-driven statistical approaches discussed below are promising potential tools that may improve our ability to quantify, predict, and evaluate streamflow depletion.
Strengths. Statistical assessments and models are diverse and have their own individual strengths and weaknesses. However, we can generalize several common strengths relative to analytical and numerical models. In many other areas of hydrology, statistical approaches are popular for their ease of application and low data requirements (Farmer et al. 2014). While these approaches have not been widely used for the assessment of impacts and mitigation strategies in the field of streamflow depletion, they have some characteristics that may make them well-suited to these tasks. Statistical approaches tend to be adaptable to a wide range of potential data types and availabilities, making them flexible across different domains. Statistical approaches may be particularly useful in settings where subsurface hydrostratigraphic data, which are critical to accurate analytical and numerical model development but are not essential to statistical models, are unavailable. Similarly, statistical approaches are flexible to a wide range of target metrics; for example, statistical assessments and models can be used on any hydrological signature derived from a hydrograph (McMillan 2020), and therefore could effectively represent various aspects of the local hydrological response to pumping. This flexibility is particularly valuable where there may be specific flow conditions or metrics with high relevance to either management or ecological outcomes (Yarnell et al. 2020), as the statistical models can be developed to specifically predict the hydrological signatures that are most relevant to needed management decisions.
Additionally, statistical approaches generally have lower computational requirements than numerical models, though for some data-intensive applications, statistical model training can be computationally demanding. This means that they are well-suited for conducting large numbers of simulations necessary for accurate calibration, sensitivity and uncertainty analysis, and to develop probabilistic estimates. Statistical models are capable of quantifying uncertainty in hydrological predictions and the underlying parameters and processes that contribute to uncertainty (Pathiraja et al. 2018;Fang et al. 2020;Piazzi et al. 2021), though this type of analysis has not been done (to our knowledge) in a streamflow depletion context to date.
Weaknesses. Statistical approaches have been widely used to quantify hydrologic states and fluxes, but have rarely been used to quantify streamflow depletion (Barlow and Leake 2012). This is largely because streamflow depletion is damped and lagged relative to groundwater pumping due to the diffusivity of the groundwater system and distance of a stream from the point of withdrawal, and further obscured by natural hydrometeorological variability and other human activities that affect streamflow (e.g., land use change, reservoir operations), making the direct causal link between pumping and streamflow change hard to detect statistically. Statistical approaches are particularly challenging in settings where hydrologic data are not available prior to the onset of groundwater pumping, and where long-term groundwater pumping data are not available. To fill these gaps, developing relationships with proxies for groundwater use, such as crop evapotranspiration derived from remote sensing (Foster et al. 2019), may be necessary for the wide application of statistical models to approximate streamflow depletion, though care should be taken to account for potential errors and uncertainty in proxy datasets (Foster et al. 2020). In settings where causal attribution is impossible, statistical assessments can detect locations of potential streamflow depletion and infer potential drivers based on system understanding and available evidence (Wahl and Tortorelli 1997; Prudic et al. 2006; Penny et al. 2020), but additional methods (such as numerical models) would be needed to explicitly develop causal links between groundwater pumping and changes in baseflow or streamflow that are needed for evaluating attribution, impacts, and mitigation decisions.
While statistical approaches are highly flexible, they are constrained by the available data and the conditions represented by that data. The ability of a statistical model to represent the needed level of detail or at the required resolution of space and time is dependent on the availability of appropriate data to characterize the objectives at the required detail and resolution. Statistical models, also called data-driven models, are often limited in scope because they rely on available data for a specific objective. The objective may, of course, be far reaching, and the statistical model will require appropriate data to learn from.
Just as numerical and analytical models are calibrated to specific objectives, statistical models are designed around specific objectives. Unlike numerical and analytical models, statistical models often lack the explicit representation of processes that supports extrapolation beyond the model's original design. For example, a numerical model may be designed to estimate streamflow depletion at a particular stream gage and calibrated to reproduce this value accurately; in doing so, as a product of its process representation, this model may also produce by-products like estimated groundwater storage. A statistical model with the same calibration target may achieve similar accuracy, but may not produce other outputs not specified in the objective function. However, as with numerical models, uncertainty analysis of statistical models can be used to quantify uncertainty associated with forecasts outside of training conditions and identify the major contributors to that uncertainty. In many cases, uncertainty-centered workflows developed for numerical models, such as the "forecast first" workflow discussed in the "Numerical Models" section above (White 2017), could be directly adapted for statistical modeling.
Emerging Approaches. Determining causality between groundwater pumping and streamflow depletion is challenging with traditional statistical regression models and is a primary reason that they have not been used extensively in streamflow depletion assessments. Randomized controlled experiments used to identify causal relationships are often impractical, if not impossible, in hydrology (Runge et al. 2019; Ombadi et al. 2020). However, the ever-growing amount of observational data from sources such as stream gages, climate datasets, and remote sensing provides an opportunity to adapt existing and emerging econometric methods useful for identifying causal relationships from observational data (e.g., Athey and Imbens 2017). Although there have been recent applications of causal inference to hydrological questions such as estimating streamflow reductions from deforestation (Levy et al. 2018), linking changes in impervious cover to changes in flood events (Blum et al. 2020), or assessing the impact of groundwater policy on pumping and water levels, these techniques have not yet been used for streamflow depletion assessments to our knowledge. Causal inference methods that would be well-suited to streamflow depletion include (1) difference-in-differences comparisons with appropriate analogs that can serve as a control, similar to paired-catchment studies (Reichert et al. 2017); (2) Granger causality (Granger 1969), which tests whether including a variable (e.g., pumping) improves predictions of the outcome (e.g., streamflow or baseflow); and (3) statistical constructions of "counterfactual" scenarios. For streamflow depletion estimation, these counterfactual methods (e.g., synthetic controls, Abadie et al. 2010 or causal impact, Brodersen et al. 2015) could use pre- and postpumping relationships among streamflow in the area of interest and streamflow in nearby streams unaffected by pumping, along with covariates such as precipitation, to estimate what streamflow would have been in the absence of pumping as a counterfactual. Differences between observed streamflow and this counterfactual can then be attributed to streamflow depletion. Counterfactual methods have been used elsewhere to isolate impacts of climate and land use change on streamflow (Gao et al. 2016; Zhang et al. 2016; Zipper, Motew, et al. 2018). More information about causal inference methods is available in several recent reviews (Athey and Imbens 2017; Runge et al. 2019; Ombadi et al. 2020). Ultimately, an effective use of causal inference requires thoughtful design and interpretation to match appropriate methods for the study system, account for confounding variables, and couch conclusions within the limitations of the method.
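The counterfactual logic can be sketched with a deliberately simplified example. It is not an implementation of synthetic controls or causal impact: it assumes a single control stream unaffected by pumping, a linear prepumping relationship fitted by ordinary least squares, no additional covariates, and noise-free hypothetical data constructed so that the true depletion is 3 mm per year.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical annual baseflow (mm) before pumping began
control_pre = [40.0, 55.0, 48.0, 62.0, 51.0]   # nearby stream unaffected by pumping
target_pre = [34.0, 46.0, 40.4, 51.6, 42.8]    # stream of interest

# Hypothetical annual baseflow (mm) after pumping began
control_post = [45.0, 58.0, 50.0, 60.0]
target_post = [35.0, 45.4, 39.0, 47.0]

# Step 1: learn the prepumping relationship between the two streams
intercept, slope = fit_line(control_pre, target_pre)

# Step 2: predict what the target stream would have done without pumping
counterfactual = [intercept + slope * c for c in control_post]

# Step 3: attribute the gap between counterfactual and observed flow to pumping
depletion = [cf - obs for cf, obs in zip(counterfactual, target_post)]
mean_depletion = sum(depletion) / len(depletion)
print(f"estimated mean depletion: {mean_depletion:.2f} mm/yr")
```

With real data the prepumping fit would carry uncertainty that should be propagated into the depletion estimate, and the attribution holds only if the control stream is truly unaffected by pumping and by other confounding changes.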
Machine learning, including deep learning, is another emerging statistical approach with potential applications for streamflow depletion estimation and causal inference because machine learning methods can control for many potential covariates (Athey and Imbens 2017). Machine learning models more easily ingest and process large amounts of data compared to other statistical approaches and have the ability to detect unexpected patterns between data points (Nearing et al. 2020). Recent applications have shown the ability of machine learning models to provide better predictions than physically based hydrological models of daily streamflow in both gaged and ungaged locations (Kratzert, Klotz, Herrnegger, et al. 2019; Kratzert, Klotz, Shalev, et al. 2019). While machine learning methods have been applied separately to estimate groundwater levels (Sahoo et al. 2017), groundwater use (Majumdar et al. 2020), streamflow change (Zipper, Hammond, et al. 2021), and surface water metrics (Worland et al. 2018), to the best of our knowledge, they have not been applied to streamflow depletion (though machine learning techniques have been used for metamodeling of streamflow depletion trained on numerical model output, as described in the "Numerical Models" section). Simple machine learning techniques such as random forests have the advantages of (1) allowing for many predictors with nonlinear relationships to the response variable, (2) not being constrained by our current best understanding of process across scales, (3) reasonable transparency and interpretability through variable importance analysis, and (4) strong performance in prediction mode with reproducible uncertainty estimates (Addor et al. 2018).
Despite these strengths, random forests and other machine learning techniques are limited by their inability to extrapolate beyond the range of values in the input data (Beven 2020), which is problematic when the potential system stresses being analyzed, such as pumping scenarios, exceed what has been experienced in existing monitored conditions. Additionally, a lack of transparency in machine learning models can make them difficult to interpret, they require large input training datasets, and predictions can be highly sensitive to small perturbations in input under certain circumstances (Shen 2018). For a problem as complex as estimating streamflow depletion, process-guided deep learning in which the model is penalized for violating physical laws (e.g., Read et al. 2019) could prove useful. Machine learning methods may be especially useful for estimating streamflow depletion due to their ability to identify connections between seemingly unconnected variables, which is valuable given that groundwater pumping data are rarely monitored or available (Foster et al. 2019).
Example Use in Management. Australia's National Water Initiative in 2004 required conjunctive management of interconnected surface water and groundwater (Ross 2018). To meet this need in Australia's Murray-Darling basin, which covers >1 million square kilometers, a joint approach combining numerical and statistical models was developed through the Murray-Darling Sustainable Yields Program and is described in Rassam et al. (2008). Because of the size and complexity of the Murray-Darling Basin, as well as the presence of existing surface water and groundwater models for parts of the basin, a single basin-wide integrated numerical model was not available or feasible to develop. Instead, to assess impacts of pumping on streamflow the program used existing or developed new numerical groundwater models for high priority subbasins (those with the greatest groundwater extraction and largest likely impacts on streamflow), and for lower priority basins used a statistical model. This mixed numerical-statistical approach was enabled by a substantial amount of long-term data available for the Murray-Darling Basin that was used to parameterize and evaluate both the numerical and statistical models. The statistical model estimates streamflow depletion as a function of the pumping rate, time since pumping began, and an empirical connectivity factor (Rassam et al. 2008). Effectively, the connectivity factor is equal to the proportion of pumping that is expected to be sourced from streamflow depletion over long time scales, where a lower value indicates less streamflow depletion caused by a given pumping volume (Walker et al. 2020a). This statistical model is then used to evaluate whether changes in pumping, for example caused by climate change, may impair rivers beyond sustainable diversion limits that are set at the basin and catchment levels (Walker et al. 2020b).

CHOOSING A STREAMFLOW DEPLETION ESTIMATION APPROACH
Earlier, we identified four general characteristics of a successful streamflow depletion estimation approach: it should be well-suited to local conditions, actionable, transparent, and reproducible. Here, we evaluate analytical, numerical, and statistical models as they relate to these characteristics and with respect to common streamflow depletion management questions (Table 1). Since any well-documented approach can be made both transparent and reproducible (with the exception of proprietary software or tools, as noted above), the primary factors to consider should be the degree to which an approach is well-suited to local conditions and is actionable. In practice, this requires that the approach adequately accounts for the diverse potential drivers of streamflow change (well-suited), and the approach can provide estimates of streamflow depletion and associated uncertainty with the data, expertise, and resources available (actionable).
Suitability and actionability can be balanced by following the parsimony axiom that the approach chosen should be as simple as possible, but no simpler (Figure 3). For streamflow depletion, a well-suited approach should be sufficiently detailed to account for all relevant processes affecting streamflow depletion to avoid errors caused by model inadequacy, while avoiding the inclusion of irrelevant processes to minimize poorly constrained parameters and feedbacks to avoid propagation error (Hill and Tiedeman 2006; Saltelli 2019). To be actionable, the producer of the depletion estimates should be familiar with the strengths and weaknesses of the approach, and have sufficient skill and resources to provide estimates of uncertainty caused by parameters narrow enough to guide decision-making and assimilate available data to minimize this uncertainty (Doherty and Simmons 2013). Figure 3 illustrates the principle by showing how increased model complexity decreases inadequacy error (generally associated with improved model fit to data) and eventually increases propagation error (generally associated with inaccurate predictions and tested using data not included in model development).
Balancing model simplicity and complexity is challenging and the subject of substantial discussion in the decision support modeling community. Past work has found that oversimplified models can underestimate uncertainty and bias model predictions, which hinders effective decision-making (Knowling et al. 2019), though stochastic statistical approaches can improve the simulated distribution of this bias (Farmer and Vogel 2016). In practice, finding this balance is tricky and facilitated by experience with the technique being used, regional hydrologic expertise, and rigorous uncertainty analysis that identifies the processes and parameters contributing most to uncertainty (Leaf 2017; Doherty and Moore 2020).
Suitability primarily relates to the match between the management question being asked, the resources available, and the capabilities of each method (Table 3). For questions related to attribution ("Does pumping contribute to observed decreases in streamflow and, if so, how do pumping impacts compare to other drivers of change?"), numerical and statistical models are generally better-suited than analytical models. Both approaches can be designed to account for other potential drivers of streamflow change (such as land use or climate change). In contrast, analytical models are typically focused on groundwater pumping and do not include any other processes. Comparing between numerical and statistical models, numerical models can estimate causation more directly due to the representation of process-based links between different aspects of the interconnected stream-aquifer system, while statistical models typically provide correlative results (though emerging statistical causal inference methods may be able to overcome this limitation with further research; see, for example, Levy et al. 2018; Blum et al. 2020).
The three approaches have similar suitability strengths and weaknesses for questions related to impacts ("What are the implications of streamflow depletion for water users, ecosystems, and society?") and mitigation ("How can streamflow depletion be mitigated?"). Analytical models are best-suited for assessing the impacts of a single well, while numerical and statistical models are better-suited for answering questions about regional-scale impacts of numerous pumping wells. Regardless of the approach used, it is critical that the estimation model is designed to match the management question and decision criteria. For example, regional numerical models are not well-designed for assessing streamflow depletion from a single well because their grid size typically does not allow sufficient spatial refinement to accurately capture fine-scale dynamics, and they can only detect impacts that exceed the mass balance error of the model (Konikow and Bredehoeft 1992; Mehl and Hill 2010). For a single well, localized numerical models with fine grids and tight solver criteria can be developed. Numerical models tend to be best-suited to explore spatially and temporally distributed impacts of pumping on multiple aspects of the hydrological and broader socioenvironmental system because they can include explicit process-based coupling among different processes (e.g., streamflow depletion, phreatophytic evapotranspiration, groundwater depletion) and are increasingly coupled to other models such as agent-based or economic models (Castilla-Rho et al. 2015; Hu et al. 2017; Rouhi Rad et al. 2020).
Where there is a specific management target, statistical models may be advantageous since they can be developed for that metric and therefore bypass  complexity associated with other aspects of the system. For example, if management decisions require understanding how pumping will change 10th percentile annual streamflow, there is no need to simulate impacts on daily or monthly streamflow, significantly reducing statistical model complexity and allowing rigorous uncertainty and sensitivity analysis associated with this hydrologic signature. This is in contrast to numerical models which need to proceed through a more complete representation of the entire hydrological cycle, which means that statistical models can be significantly less complex but may also be more narrowly focused. Additionally, if estimates are needed for different climate conditions (past or future), it is critical that the approach selected acknowledges and, ideally, accounts for hydrologic nonstationarity associated with climate change (Milly et al. 2008;Rissman and Wardropper 2020). Actionability, on the other hand, is driven by the availability of data, resources, and expertise. In general, as model complexity increases, so to do the data and resources required for their applications. In general, analytical models have the lowest complexity, statistical models have intermediate complexity, and numerical models can be the most complex, though there is substantial variability within each of these three broad categories (Figure 4). Interestingly, Addor and Melsen (2019) showed that the choice of hydrological models is strongly influenced by the training and institution of the modeler (Addor and Melsen 2019), and it is therefore likely that expertise and preferred methods will vary across water management areas based on their region, staff, and history. 
However, analytical models tend to require less expertise to develop and implement than numerical models, which may make them feasible in resource-limited locations (Zipper, Dallemagne, et al. 2018). Analytical, numerical, and statistical models would all benefit from improved data collection for key streamflow depletion processes, in particular the location, volume, and timing of groundwater withdrawals, which are often available only in very well-monitored or well-studied regions (Foster et al. 2019).
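To make the hydrologic-signature example above concrete, the following sketch computes a low-flow signature directly from daily records. It assumes that "10th percentile annual streamflow" means the 10th percentile of daily flows within each year, which is only one possible reading. The function names and the short flow records are hypothetical, and a real application would fit a statistical model to the resulting annual series rather than simply differencing two scenarios.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def annual_low_flow(daily_flows_by_year, q=10.0):
    """Map each year to the q-th percentile of its daily flows."""
    return {year: percentile(flows, q) for year, flows in daily_flows_by_year.items()}


def signature_change(baseline_by_year, pumped_by_year, q=10.0):
    """Per-year change in the signature (pumped minus baseline)."""
    base = annual_low_flow(baseline_by_year, q)
    pumped = annual_low_flow(pumped_by_year, q)
    return {year: pumped[year] - base[year] for year in base if year in pumped}


# Hypothetical, very short "daily" records (m3/s) for two years.
baseline = {
    2001: [float(v) for v in range(1, 12)],
    2002: [float(v) for v in range(2, 13)],
}
# Hypothetical pumping scenario: every flow reduced by 0.5 m3/s.
pumped = {year: [flow - 0.5 for flow in flows] for year, flows in baseline.items()}

print(annual_low_flow(baseline))
print(signature_change(baseline, pumped))
```

Because only one number per year is modeled, uncertainty and sensitivity analyses operate on a short annual series instead of a full daily simulation.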
Overall, the choice of approach depends on the question at hand and processes represented. When the focus of study is the impacts of a single well on a single stream, then analytical models are likely to be the best tool for the job. For questions regional in scale, statistical or numerical models are likely to be more suitable. Statistical models, which provide an intermediate level of complexity between numerical and analytical approaches, have not been widely used for streamflow depletion estimation due to the lack of causal attribution but may be a promising area for future development. Given the contrasting strengths and weaknesses of the three approaches discussed above, there is likely to be significant value in using multiple approaches to help constrain estimates (Saltelli et al. 2020).
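As an illustration of how little is needed to apply an analytical model to a single well and a single stream, the sketch below implements the classic Glover solution, in which the depletion fraction is erfc(sqrt(S d^2 / (4 T t))). This section does not name a specific analytical model, so the choice of the Glover solution is an assumption. It also carries the usual simplifications: a straight, fully penetrating stream with no streambed resistance, and a homogeneous aquifer of infinite extent. Parameter values are hypothetical.

```python
import math


def glover_depletion_fraction(distance, storativity, transmissivity, time):
    """Fraction of the pumping rate supplied by streamflow depletion.

    distance: well-to-stream distance [L]
    storativity: storativity or specific yield [-]
    transmissivity: aquifer transmissivity [L^2/T]
    time: time since pumping started [T]
    """
    if time <= 0:
        return 0.0  # no depletion before pumping begins
    argument = math.sqrt(storativity * distance ** 2 / (4.0 * transmissivity * time))
    return math.erfc(argument)


def glover_depletion_rate(pumping_rate, distance, storativity, transmissivity, time):
    """Streamflow depletion rate, in the same units as pumping_rate."""
    return pumping_rate * glover_depletion_fraction(
        distance, storativity, transmissivity, time
    )


# Hypothetical well: 100 m from the stream, S = 0.1, T = 1000 m2/d, Q = 500 m3/d.
for days in (0.25, 10.0, 100.0, 365.0):
    fraction = glover_depletion_fraction(100.0, 0.1, 1000.0, days)
    rate = glover_depletion_rate(500.0, 100.0, 0.1, 1000.0, days)
    print(f"t = {days:6.2f} d  fraction = {fraction:.3f}  depletion = {rate:.1f} m3/d")
```

The depletion fraction rises toward 1 as pumping continues, which is why timing matters as much as distance when screening a proposed well.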

CONCLUSIONS
Reliable estimates of streamflow depletion are essential for effective water management in settings with interconnected groundwater and surface water resources. We categorize common water management questions into three groups based on water management goals: (1) attribution, to understand the potential drivers of changes in observed streamflow; (2) impacts, to understand the hydrological, ecological, or socioeconomic ramifications of streamflow depletion; and (3) mitigation, to identify ways that the impacts of streamflow depletion can be reduced or minimized. Making management decisions related to each of these goals requires accurate estimates of streamflow depletion, but quantifying streamflow depletion is challenging because it cannot be directly observed in typical hydrological data (i.e., streamflow hydrographs) and therefore is infeasible to estimate using field techniques at scales larger than a single stream reach. Due to these difficulties, there has historically been a lack of consistent streamflow depletion regulatory frameworks, which has caused local water managers to make decisions on a case-by-case basis.
In this study, we provide an updated review of analytical, numerical, and statistical approaches for regional-scale streamflow depletion estimates. From this effort, we developed criteria that water managers can use to select an appropriate and feasible approach for their needs based on suitability, actionability, transparency, and reproducibility. The approach selected should be well-suited to local conditions, produce actionable information relevant to the water management question under consideration, be transparent to affected parties such as water users, and be reproducible so it can be evaluated and used by others not involved in the quantification process.
We then used these criteria to evaluate analytical, numerical, and statistical models, finding that the strengths and weaknesses of each approach vary based on the management question being addressed. Analytical models are well-suited for rapid, screening-level assessments of potential impacts and implications of streamflow depletion, but they struggle with questions related to attribution and mitigation since they rarely include other processes that could affect streamflow. Numerical models are particularly well-suited for understanding impacts of pumping and mitigation for streamflow depletion because they can include quantitative links among many different processes and are increasingly coupled to models representing other aspects of the local social and hydrological system. Numerical models are currently the gold standard for streamflow depletion estimation, but can be infeasible in many settings with limited resources. Statistical approaches have not seen wide use for streamflow depletion estimation compared to analytical or numerical approaches because they typically provide correlative, rather than causative, output and therefore struggle with questions related to attribution and impacts. However, emerging statistical methods for causal attribution may become a new tool in the water management toolbox, and with further development could provide a valuable intermediate-complexity approach for streamflow depletion estimation to fill the gap between simple analytical models and complex numerical models.
FIGURE 4. Comparison of analytical, statistical, and numerical approaches with respect to complexity and use for streamflow depletion estimation. Large colored boxes show the general type of approach, and smaller colored text shows specific methods/tools. Locations of approaches in the graph are based on author discussions and informal feedback from colleagues.
Additionally, blended approaches (e.g., developing statistical metamodels to interpret and extend numerical model output) can leverage the strengths of multiple types of approaches and hold promise for future use. Regardless of the approach selected, it is critical to calculate and communicate the uncertainty associated with streamflow depletion estimates, particularly when extrapolating any approach beyond the conditions in which it was developed (i.e., scenario assessment). Transparency about strengths, weaknesses, and uncertainties helps affected parties better understand the logic behind decisions and can serve as a bridge to participatory approaches to streamflow depletion estimation that can enhance both scientific quality and societal impact.

APPENDIX WATER MANAGER FEEDBACK
To help guide this manuscript toward relevant, actionable information, we had conversations with five different water managers asking for their feedback on an earlier draft of the manuscript. In these conversations, we shared a draft version of the manuscript and an executive summary of the key points, with the following conversation prompts in advance:
1. What types of decisions or recommendations do you make related to streamflow depletion?
2. What do you use (data, software, equations, or other tools) to make those decisions?
3. What barriers have you encountered to using streamflow depletion information for decision-making?
4. Please look at the figure on page 1 (note: this is the current Figure 2). What about this figure aligns with your own decision process? What is different? What are we missing?
5. What information would make this paper most useful to people like you?
6. Any other thoughts or comments?
These questions provided a basis for the conversation, but we allowed the water managers to focus on aspects that were most interesting and relevant to them, so not all questions were directly addressed in all conversations.

DATA AVAILABILITY STATEMENT
No data are used in this manuscript.

ACKNOWLEDGMENTS
This work was conducted as a part of the Streamflow Depletion Across the U.S. Working Group supported by the John Wesley Powell Center for Analysis and Synthesis, funded by the U.S. Geological Survey. Thanks to Chris Beightel, Melissa Rohde, Bob Smail, and the rest of the Powell Center working group for feedback. We also appreciate constructive feedback from Paul Barlow, Ryan Bailey, and three anonymous reviewers. Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.