Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada Streamflow depletion from groundwater pumping in contrasting hydrogeological landscapes: Evaluation and sensitivity of a new management tool

Groundwater pumping can reduce streamflow by reducing groundwater discharge and/or inducing streamflow infiltration, which together are referred to as streamflow depletion. Recently, analytical depletion functions (ADFs) have been suggested as rapid and accurate tools for streamflow depletion assessment, but their performance has only been tested in a few hydrogeological settings. To evaluate whether they will be useful tools for other regions with contrasting stream network and hydrogeological characteristics, we compared ADFs to calibrated MODFLOW models in BX Creek and Peace region with distinct hydrogeological settings (interior plateaus & highlands, and boreal plains, respectively) and spatial scales (165 km 2 and 1952 km 2 , respectively) in British Columbia, Canada. Results showed that ADFs can accurately identify most affected streams by pumping for 100% and 83% of wells in the BX Creek and Peace region, respectively, and had small prediction errors compared with MODFLOW. Specifically, the mean absolute error of predicted depletion ranged from 2% to 14% of the highest simulated pumping rate over the study period of 30 years, with improved accuracy during the pumping season. We also found different responses of ADF performance to hydrostratigraphic properties such as hydraulic conductivity, aquifer thickness, streambed conductance, and well depth across two domains, indicating that different factors control ADF accuracy in different hydrogeological settings. Therefore, we conclude that ADFs are useful tools for conjunctive water management, but a good understanding of local hydrogeological conditions is needed to address the potential uncertainty of ADFs for decision-making.


Introduction
Groundwater is a critical resource for society and aquatic ecosystems (Gleeson & Richter, 2018;de Graaf et al., 2019;Kurylyk et al., 2015;Power et al., 1999). Groundwater pumping can negatively affect rivers, lakes, wetlands, and other surface water bodies by reducing groundwater discharge into them or, in severe cases, inducing infiltration through the streambed. Combined, these impacts are known as streamflow depletion (Barlow & Leake, 2012;Zipper et al., 2019A).
forest logging and fires (Kiffney et al., 2002;Wei et al., 2018), understanding the impacts of streamflow depletion is critically important for regional and global water management.
Unfortunately, streamflow depletion is impractical to measure directly because of limited streamflow monitoring data and because pumping-induced reductions in streamflow are superimposed on top of weather-driven flow variability (Barlow & Leake, 2012;Gleeson & Richter, 2018). As a result, streamflow depletion is typically estimated using numerical and/or analytical models (Chen, 2000;Huang et al., 2018;Schneider et al., 2017;Zipper et al., 2019B).
Numerical models are often used for site-specific investigations and require substantial effort, data, and knowledge to calibrate and validate. Sometimes, water managers cautiously transfer the knowledge gained from one location to another as the differences in hydrogeological and climate conditions create significant uncertainty, which hinders decision-making in locations that have not been extensively studied. In contrast, analytical models can provide a quick assessment of streamflow depletion with relatively low data requirements (Huang et al., 2018;Huggins et al., 2018). Several studies have compared the performance between the numerical and analytical models and concluded that analytical models are conservative tools for water managers to assess Manuscript submitted to Journal of Hydrology 4 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada the pumping impacts on streamflow depletion (Huggins et al., 2018;Rathfelder, 2016;Zipper et al., 2018. However, analytical models have many simplifying assumptions, for instance, the commonly-used Glover analytical model assumed an infinite horizontal aquifer bounded by a single linear stream. Additionally, streams and their underlying aquifer are hydraulically connected and do not become dry over the pumping period (Glover & Balmer, 1954). The details of other analytical depletion models can be found in a review paper by Huang et al. (2018). These assumptions could lead to a biased assessment of streamflow depletion, which further hinder their application in real-world hydrogeological settings with complex stream networks of multiple, meandering, and different types (i.e., perennial and ephemeral streams) of stream segments.
To advance the application of analytical models in real-world settings with multiple stream Republican River Compact Administration groundwater model, a regional-scale calibrated 5 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada groundwater model used for streamflow depletion decision-making in a heavily stressed aquifer (Zipper et al., 2020).
These previous studies (e.g., Zipper et al., 2018;2019A; have found that the streamflow depletion predicted by the best performed ADFs provide comparable estimate to numerical groundwater models. However, to date ADFs have only been tested in these three limited domains, and there have been no systematic cross-regional comparisons to determine how ADF performance varies across different hydrogeological conditions and what factors most strongly influence ADF performance. Therefore, additional testing against calibrated numerical models is needed to evaluate whether ADFs will perform similarly across regions with different stream network geometries, hydrogeological characteristics, and bioclimatic conditions, which is necessary to evaluate whether ADFs are suitable for decision-making in real-world settings. To address these knowledge gaps, the goals of this study are to: 1) evaluate the accuracy of ADFs for estimating impacts of groundwater pumping on streamflow in two contrasting hydrogeological settings in British Columbia, Canada; and 2) understand the accuracy and sensitivity of ADFs in evaluating streamflow depletion to advance the application of analytical depletion function in real world settings.

Methods
To assess the performance of ADFs across hydrogeological settings, we selected two study domains within British Columbia that have contrasting hydrogeology, climate, topography, and ecology ( Figure 1). For these two domains, we modified existing calibrated numerical models, built in MODFLOW, to simulate the pumping impacts on streamflow depletion for comparison 6 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada with ADFs. We treated streamflow depletion simulated by the numerical models as reference or "observed" values for comparison with ADF output since numerical models include more detailed process-based representation of subsurface flow. While both models were calibrated to field observations, we acknowledge that even numerical models are an imprecise mathematical representation of reality.

Study domains
Seven hydrogeological landscapes have been classified based on the physiographic, groundwater regions, and biogeoclimatic zones in British Columbia, Canada ( Figure 1). In this study, we selected two domains in contrasting settings based on numerical model availability and consistency: BX Creek model represents interior plateaus and highlands and the Peace region model represents boreal plains. In addition, these domains have contrasting aquifer and stream network characteristics, as BX Creek represents a small aquifer (165 km 2 ) with a relatively simple stream network while the Peace region model represents a large regional aquifer (1952 km 2 ) with a complex stream network. 7 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada

BX Creek numerical model
The numerical model in the BX Creek was initially developed to validate a long-term recharge rate estimation method for mountainous regions (Smerdon et al., 2009A). BX Creek is characterized by snowmelt-dominated uplands and a dry valley bottom. In the uplands, surface runoff and high flows occur during the snow-melt seasons and groundwater recharge is minimal.
Springs and groundwater seepage occur at mid-elevations. The aquifer in valley bottom is recharged by surface runoff as well as receiving local and regional groundwater flows (Figure 1; (Smerdon et al., 2009B). Model development, conceptualization, and parameterization are described in detail in Smerdon et al. (2009A). The model configuration including spatial 8 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada distribution of hydraulic conductivity and model layers can be found in Supplementary Information Figures S1-S3. Here we briefly review the model settings. The model has a uniform grid resolution of 50 x 50 m with 327 rows, 400 columns, and 8 layers with lateral boundaries between layers following the land surface ( Figure S3). The total model domain has an area of 165 km 2 . Based on the borehole logs from 611 water wells, the subsurface is parameterized as four hydrostratigraphic units (Table 1, Figure S2).
The original BX Creek model includes five boundary conditions: constant head (BAS package in MODFLOW), general head (GHB), drain (DRN), river (RIV), and stream (STR) ( Figure S1). As shown in Figure S1, there are four blocks of drains in the domain to represent seepage faces in the mid-elevation regions, which tend to be found in areas with steep slopes near the valley floor. However, analytical approaches are not designed to represent seepage faces (Huang et al., 2018). We found that removing these drain blocks had minor impacts on the model mass balance, so they were removed from the numerical model ( Figure S1) while keeping the other boundary conditions.
The steady-state model was originally calibrated against hydraulic heads data from 196 groundwater wells, of which 22 were located closely and in a single model cell. The root-meansquare-error of the calibration was 44.6 m, which is 3.6% of the observed hydraulic head range in the watershed. The simulated groundwater discharge from the upland creek (drain boundary) was 90% of average observed low flow. Additionally, this model's solute transport processes were also calibrated. Simulated hydrochemical data also showed strong agreement with the observed data within the domain, providing the further confidence in model performance (Smerdon et al., 2009B).

Peace region numerical model
In the boreal plains, surface runoff is minor due to gentle undulating topography which retains water on the surface in wetlands and ponds. Such landscapes also form nested groundwater flow systems with complex surface and groundwater interactions ( Figure 1; (Smerdon et al., 2009B). The objective of the Peace region numerical model was to understand the connection between local and regional groundwater flow in the buried valley aquifers, and simulation results suggested that buried valleys are not regionally connected throughout the whole network (Morgan, 2018;Morgan et al., 2019). To simulate surface water features, three boundary conditions were assigned including drain Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada (DRN), river (RIV), and general head (GHB) in MODFLOW ( Figure 4A). The river boundary represents the Halfway River and the drain boundary represents its tributaries. The general head boundary condition, which is in the south boundary, is to account for the influence of the Peace River, the mainstream of Halfway River. The details of the numerical model settings can be found in Morgan et al. (2019).
In the Peace region, there were no long-term groundwater levels for calibration.

Pumping schedule
To test the impacts of pumping on streamflow, we developed pumping schedules typical of agricultural irrigation in the BX Creek region using the British Columbia Agriculture Water Calculator (http://www.bcagriculturewatercalculator.ca/), which estimates the monthly water demand for various crops based on soil type and climate conditions. We selected four dominant agricultural crops in BX Creek (apple, cherry, forage and grape) and assumed the irrigated area from each groundwater well was equal to the size of the average land parcels within the domain.
Then, water demand was calculated for each type of crop for the irrigated area over the growing season. Finally, the water demand was averaged across the dominant crops. Specifically, monthly pumping rates for May, June, July, August, and September are 19, 108, 214, 171, and 84 m 3 day -1 , respectively ( Figure 2). We used the same pumping schedule for BX Creek and the Peace region to simulate a consistent stress on the groundwater system for direct comparison of streamflow depletion in the two domains, and the average irrigation demand in BX Creek was higher than that in the Peace region. Previous literature showed that streamflow depletion factor (i.e., streamflow depletion / pumping rate) derived by analytical models and ADFs are not sensitive to pumping Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada rates (Barlow & Leake, 2012;Zipper et al., 2018). Therefore, pumping schedule designed in this study allows us to examine a more realistic groundwater pumping impacts on streamflow.

Streamflow depletion assessment using numerical models
To systematically compare the ADFs and numerical models, we introduced synthetic groundwater wells at regular spacing throughout two domains to stress the aquifer ( We extracted stream segments from the MODFLOW simulation results and then we calculated the annual streamflow depletion factor, which is the ratio of the annual sum of 13 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada streamflow depletion that occurred in a specific year over the annual pumping rate to represent temporal changes in streamflow depletion (Barlow & Leake, 2012). In this way, streamflow depletion occurring during the pumping season and time-lagged depletion occurring during nonpumping periods were both included for assessment. In addition, two MODFLOW models reached dynamic equilibrium stressed by constant weekly recharge rates and hence led to a stable surface water and groundwater interactions. As such, the impacts of pumping for different streamflow regimes (e.g., high and low flow periods) cannot be assessed in this study. Nevertheless, Flores et al. (2020) showed that analytical depletion models have a better performance over constant flow period based on field experiments. Therefore, our research design augments the possibility of comparison between two domains with different hydrological landscape and stream networks.

Streamflow depletion assessment using analytical depletion functions
In this study, we used the highest performing stream proximity criteria and depletion apportionment equation identified by Zipper et al. (2019A, 2020) and compared two different analytical models ( Figure 2). Stream proximity criteria identify the stream segments which could potentially be impacted by a well. In this study, we used the Adjacent+Expanding stream proximity criteria, which include any stream segment that is in a catchment adjacent to the well or is within the maximum radial distance where depletion would be at least 1% of the pumping rate at a given time step. Depletion apportionment equations estimate the fraction of total depletion allocated to each stream segment. The Web Squared depletion apportionment equation splits each stream segment into a finite number of points (e.g. space between each point is 5 meters) and apportions based on the square of the inverse distance of each stream segment to the well as shown in Eqn.
(1). Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada where fi is the depletion fraction of total streamflow depletion from a well apportioned to a stream segment, P is the total number of points which a stream segment is divided into the web squared equation, d is the distance from a well to a stream segment, and n is the total number of stream segments meeting the stream proximity criteria.
The final step to estimate segment-resolution streamflow depletion is to use traditional analytical models to calculate the depletion for each stream segment, which is then adjusted based on the fraction of total depletion calculated using Eqn. 1. In this study, we compared the Glover (Glover & Balmer, 1954) and Hunt analytical models (Hunt, 1999), both of which are commonly used to calculate the streamflow depletion due to their simplicity of implementation. The Glover method assumes that streams fully penetrate the aquifer, and there is no resistance to flow through the streambed. The volumetric streamflow depletion rate, Qa of a stream segment can be calculated by Eqn. (2).
Where, erfc() is complementary error function, S is the aquifer storage coefficient (e.g., specific yield in an unconfined aquifer, unitless), T is the aquifer transmissivity (L 2 /T, L is for length and T is for time), t is the time since the start of pumping (T), d is the well-stream distance (L), and Qw is the pumping rate (L 3 /T).
where  is the streambed conductance. The streambed conductance is defined as where wr (L) is the width of the stream segments. Unlike the conductance in MODFLOW has a unit of L/T, λ in Hunt model has a unit of L 2 /T and interpreted as the conductance per unit length of stream. The ADFs were implemented in the R package "streamDepletr" (Zipper, 2019). The input parameters of the analytical depletion functions, including, T, S, d, and λ, were extracted from MODFLOW so that differences in parameters between the MODFLOW and ADFs are minimized. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada For the analytical models' assumptions, hydrostratigraphic input parameters (T, S) should ideally be averaged between the well locations and affected stream segments. However, since detailed subsurface information is typically unavailable in watershed management settings, we used T and S parameters based on the values at the well locations where aquifer testing (e.g. pump tests, borehole logs) would typically be available. Specifically, each ADF input parameter was calculated as follows: 1) Transmissivity (T). MODFLOW uses hydraulic conductivity as input while the analytical models use transmissivity, which is equal to hydraulic conductivity multiplied by the aquifer thickness. To obtain the transmissivity for the analytical models, the average hydraulic conductivity between the surface and well depth in the MODFLOW model was used and multiplied by the thickness from the water table to the well depth. 2) Specific yield (S). Average specific yield between the surface and well depth in the MODFLOW at the well locations was used as input for analytical depletion functions. 3) Well-stream distance (d). The horizontal distance between a well and affected stream was calculated based on the model domain. 4) For the Hunt model, streambed conductance () is needed. For the Peace region,  was available for all flow boundary conditions used in the MODFLOW model. However, in the BX Creek model, the constant head boundary, which does not require conductance, was used to simulate a lake. We used the highest conductance values of other boundary conditions for the constant head, while  of other flow boundary conditions was extracted from the MODFLOW model.
In BX Creek, the MODFLOW simulations revealed that no depletion occurred for the drain and general head boundary conditions, which are consistently located in the upper streams of the domain and represent intermittent headwater streams (Section 3.1). The inclusion of these boundaries in ADFs could lead to a biased streamflow depletion assessment since depletion cannot occur from dry streams. Therefore, these flow features were excluded for the ADFs in the BX Creek domain. In contrast, both gaining and ephemeral stream types are detected within the drain boundary in the Peace region ( Figure 4C) and thus all features were included in the ADFs. The different response of the drain boundary conditions in the two domains is described below in Section 3.1. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada

Evaluation metrics for comparing between analytical depletion functions and numerical models
To evaluate the performance of ADFs, we used three metrics proposed by Zipper et al., : Metric 1: Spatial distribution of primary impact evaluates whether the ADFs can correctly identify the most affected stream segments by a pumping well. It is quantified as the percentage of wells in which ADFs and MODFLOW identify the same stream segment with the greatest predicted depletion.
Metric 2: Magnitude of primary impact quantifies how big the difference is between ADFs and MODFLOW in predicting the streamflow depletion. We quantify this as the normalized MAE (mean absolute error), which is the streamflow depletion MAE normalized by the highest pumping rate simulated (214 m 3 day -1 ). In addition, we used Kling-Gupta Efficiency (Gupta et al., 2009), or KGE to examine the fit between ADFs and MODFLOW. The KGE is a hydrological fit metric, which integrates correlation, bias, and variability between two datasets. The KGE of 1 is a perfect fit and lower values implying worse performance, with KGE < -0.41 indicating poor performance (Knoben et al, 2019).

Metric 3:
Magnitude of overall impact quantifies the difference in all affected stream segments by ADFs and MODFLOW. For this metric, we used a) MAE to quantify the difference in streamflow depletion predictions from the ADFs and MODFLOW; b) KGE to quantify the fit between the two predictions; and c) accuracy of predicted streamflow capture fraction, which is the cumulative depletion summed across all stream segments from a given well at a given time step (Barlow et al., 2018;Zipper et al., 2019A). This is quantified as the MAE between the capture fraction estimated by ADFs and those by MODFLOW, normalized by the range in capture fraction among all wells from MODFLOW. The three metrics can comprehensively assess the performance of analytical depletion functions.

Sensitivity analysis of analytical depletion functions and their performance in different hydrogeological settings
To understand the drivers of ADFs performance and guide the application of ADFs in realworld settings, we conducted a one-at-a-time (OAT) local sensitivity analysis of ADF input parameters (i.e., T, S, and λ). To do so, T and λ inputs to the ADFs were increased/decreased by Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada the variations of tested parameters are within variability ranges when hydrostratigraphic unit is identified in the study region and thus the outcomes can enhance our understanding of the uncertainties in ADFs for the real-world application. The distribution and mean streamflow depletion difference among sensitivity analyses and the baseline were compared. We also took advantage of existing hydrostratigraphic variability in our two domains to examine the impacts of pumping on the performance of the ADFs. We accomplished this by calculating MAE between ADFs and MODFLOW for each well, and investigating variability in MAE in response to hydrological conductivity, streambed conductance, well-stream distance, and well depth in both domains.

Streamflow depletion assessment using numerical models
In the BX Creek domain, which is relatively small with a simple stream network compared to the Peace region, there are a mixture of gaining, losing, and ephemeral streams ( Figure 3C). Our pumping experiments found that 88% of synthetic groundwater wells caused detectable streamflow depletion over the simulation period. The other 12% of pumping wells tested affected only groundwater storage and thus caused no depletion of surface water features. As shown in water. This is because in MODFLOW, aquifer-drain fluxes drop to zero when hydraulic head in drain cells falls below a threshold level representing the bottom of the stream channel. In our simulations, head was below the drain channel elevations for the entire simulation, which is consistent with regional understanding that these ephemeral streams are disconnected from the groundwater system and primarily transport surface water (e.g., snowmelt and overland flow).
Despite the fact that the drain boundary condition represents headwater drainages with no flow during the simulations ( Figure 3C), some wells located in upland areas still caused depletion, which was sourced from nearby, down-gradient features such as the constant head (lake) and rivers.
Similarly, there was no depletion detected from the general head boundary condition In the BX Creek model, the general head boundary condition was designed to ensure that the downstream segments do not dry. Figure 3C depicting model results over the pumping period and the detailed examination of model output showed that the fluxes exchange between the aquifer and Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada general head boundary condition are small or even nearly zero because the difference of hydraulic head between the general head boundary condition and the surrounding areas are minor, leading to small flux exchanges. Overall, simulations in the BX Creek domain revealed that aquifer-stream interactions differed among boundary conditions and setting in the landscape, indicating that boundary condition choice may affect streamflow response to pumping and therefore should be carefully designed when investigating the impacts of flow responses to pumping.

Figure 3 A) Boundary conditions and topography in the interior plateau and highlands (BX Creek); B) spatial distribution of synthetic pumping wells and their depths in the numerical model; C) streamflow types, including gaining, losing, and ephemeral stream segments. D), E), and F) are cumulative streamflow depletion factors (a ratio of the sum of streamflow depletion that occurred in the year by annual pumping rate) at the 1st, 5th, and 20th years of pumping, respectively. Red colour bar corresponds to the annual streamflow depletion factor in panel D), E), and F). Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada
The Peace region is a large-scale regional domain with a more complex stream network than BX Creek and also contains gaining, losing, and ephemeral streams. In this model, all of synthetic groundwater wells caused detectable streamflow depletion within the 30-year pumping experiments (Figures 4 and S4). Like the BX Creek model, gaining and losing streams near pumping well responded quickly over to pumping. Simulated streamflow depletion factors ranged from 0 to 100% across wells and the number of wells causing streamflow depletion increased with longer pumping time. For instance, in the first eight years of pumping, less than 32% of groundwater wells led to detectable streamflow depletion, while more than 85% of wells caused streamflow depletion from ninth year after onset of pumping onwards. By the final year of pumping, all pumping wells have affected at least one streams in the domain. In contrast to BX Creek, gaining and ephemeral stream segments were found for drain flow boundary condition and we further revealed that pumping had no impacts on ephemeral streams (Figure 4 A and C), indicating that stream types should be considered in streamflow depletion assessment.
Overall, groundwater pumping had larger impacts on streams in the Peace region with the same pumping schedule (larger streamflow depletion factor in Figure 4F compared to Figure 3F). This is likely because 1) different hydrogeological settings and stream networks can have differing impacts on streams, which further highlights that streamflow depletion assessment should consider  In summary, the two domains consistently showed that groundwater pumping can have significant impacts on streamflow under realistic pumping scenarios for these regions. Therefore, the potential negative impacts of pumping on streamflow in these regions should be considered when making water management decisions. However, the two domains have different responses to pumping wells due to differences in hydrogeological settings, stream networks, and numerical model settings. Also, we found that boundary conditions in numerical models should be carefully selected when investigating the impacts of flow responses to pumping as the response to pumping varies across boundary condition types. We found that general head boundary condition in the MODFLOW did not experience streamflow depletion because there was negligible stream-aquifer flux under baseline conditions. This does not imply that general head boundary condition is unsuitable for streamflow depletion assessment, because in other domains with large difference in hydraulic heads between the general head boundary conditions and aquifer, there would be larger fluxes and hence streamflow depletion may occur. Moreover, we found that river (RIV) and stream (STR) flow boundary conditions in MODFLOW in both domains can generate reasonable streamflow depletion estimates (i.e., streamflow depletion < pumping rate) and are recommended for the future studies, while dry stream channels (DRN) disconnected from the water table do not experience streamflow depletion, they should be identified prior to streamflow depletion estimation. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada

Streamflow depletion assessment by analytical depletion functions
We calculated depletion using two different ADFs in the two domains for comparison against the MODFLOW results ( Figure S4-S5 for BX Creek and S8-S9 for the Peace region). Over the entire 30-year simulation period, the average normalized MAE between ADFs and numerical models were about 5.0% for the BX Creek and 2.3% for the Peace region, respectively. We found that the performance of the ADFs including the Glover model and the ADFs including the Hunt model was similar. The similarity between ADFs incorporating different analytical models indicates that streambed conductance is not an important driver of streamflow depletion dynamics in the BX Creek and Peace region numerical models.
In 4.8%, respectively in the BX Creek. In summary, this metric showed that ADFs performance is better over long (decadal) pumping periods compared to short (sub-annual) pumping periods.
In the Peace region, which has a larger domain and more extensive stream network than BX Creek, the ADFs correctly identified the most affected stream segment for >40% of wells in the first year of pumping, and accuracy increased up to 83% by the twelfth year of pumping as the number of wells causing detectable streamflow depletion increased ( Figure 5). The normalized MAE for the most affected segment ranged from 0.04% to 14.8% with an average of 7.6% of the highest pumping rate throughout the pumping period ( Figure 5). The KGE also showed large variation due to seasonal pumping with the highest values of 0.20 and 0.18 for ADFs including the Glover and Hunt, respectively. For all affected stream segments, the average MAE was around 2.3% of the highest pumping rate with the maximum of 3.8% ( Figure 6). The KGE of all affected stream segments also showed variation due to seasonal pumping with the highest values of 0.34 and 0.33 for ADFs including the Glover and Hunt, respectively ( Figure 6). The average KGE of all affected stream segments were 0.09 and 0.18 for ADFs including the Glover and Hunt, respectively. The accuracy of predicted streamflow capture fraction for the ADFs in the Peace region increased with time to a maximum value of 0.75. The average MAE of the most affected stream segments for the pumping season and non-pumping seasons were 7.1% and 11.1% of the highest pumping rate, respectively in the Peace region ( Figure 5). Similarly, the average MAE of all affected stream segments for the pumping season and non-pumping seasons were 2.2% and 2.5% of the highest pumping rate, respectively in the Peace region ( Figure 6). Overall, the difference between ADFs and MODFLOW results was relatively small in the Peace region compared to BX Creek. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada Across both domains, we found that the normalized MAE for all affected stream segments was smaller than normalized MAE for just the most affected stream. This is because the most affected stream segments typically had the shortest well-to-stream distances and accordingly largest predicted depletions. Therefore, small errors relative to the depletion rate could still lead to large differences between the ADFs and MODFLOW. In contrast, for all affected stream segments, many stream segments with relatively little depletion were included, and these also tended to have smaller differences between ADFs and MODFLOW. In addition, the two domains consistently showed that the MAE of the most-affected and all affected stream segments was significantly higher in the non-pumping season than the pumping season, highlighting the ADFs are more accurate in pumping season than non-pumping season. Numerical models simulate both hydraulic head and flux exchanges between aquifer and streams, allowing transmissivity to change in response to pumping, compared to analytical models which assume a constant transmissivity. The lack of dynamic transmissivity estimates in ADFs may result in large errors during the nonpumping or recovery period. This implies that ADFs are a useful tool to estimate streamflow depletion for continuously pumped wells since ADFs have a better performance during the pumping season when impacts tend to be larger.
Our results showed that ADFs can correctly identify the most affected stream segment for 100% of wells in a simpler stream network (BX Creek) and as much as 83% of the time in a more complex stream network (Peace region). Similarly, Zipper et al. (2019A) showed the ADFs can correctly identify the most affected segment ~85% for a mountainous domain in California, but closer to 50% for a larger, more complex domain (Zipper et al., 2020). Therefore, comparing across the two domains in this study and previous work, we conclude that ADFs can accurately identify the most affected stream segments in many real-world settings, and that performance is better in Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada smaller domains that are characterized by steeper catchments and/or simpler stream networks. In addition, the MAE between the ADFs and MODFLOW was small. The largest MAE in the two domains was <15% of the largest pumping rate. Similarly, in California Zipper et al. (2019A) reported that MAE was <20% of the range in observed streamflow depletion for the most affected stream segments. In summary, these results suggest the ADFs are most effective at estimating streamflow depletion in relatively small domains where well-stream distances are shorter.
28 Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada

Sensitivity analysis of analytical depletion functions and their performance in different hydrogeological settings
Our results showed that ADFs including the Glover model and the Hunt model had similar responses to changes in T, S, and  in ADFs in the two domains (Figures 7, S10, and Table S1), Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada which further demonstrate that the choice of analytical models does not significantly affect streamflow depletion estimates in these study domains. We, therefore, focus our discussion of the sensitivity analysis results on the Hunt model, which includes streambed conductance. We found that changes in T and  lead to large difference in streamflow depletion as the density distributions are more widespread for T and  than that of S (Figure 7). Additionally, the average difference relative to the original values of streamflow depletion (Table S1) was larger for T than of  or S, indicating that T has the largest effect on results while S is the least sensitive parameters to ADFs in two domains under sensitivity tests. However, it should be noted that we only changed the S by 20% of its baseline value while T and  varied by two orders of magnitude as we were intended to keep parameter values realistic with real-world hydrostratigraphic materials in these regions. In general, streamflow depletion will be underestimated when T and λ are underestimated, or overestimated when S is overestimated (and vice versa). As a result, we suggest that obtaining accurate transmissivity estimates should be given the first priority when applying the ADFs. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada

Figure 7 Sensitivity analysis of analytical depletion function in BX Creek (A, B, and C) and Peace region (D, E and F). The horizontal axis is the streamflow depletion difference between the tested parameters and original parameter.
To evaluate the impact of hydrostratigraphic conditions on ADF performance, we explored the response of ADFs to four model parameters: hydraulic conductivity, streambed conductance, well-stream distance, and pumping depth in both domains. In BX Creek, we found that ADFs performed better in materials with higher hydraulic conductivity (compared to bedrock materials), lower streambed conductance, and wells within the top five model layers ( Figure 8). Interestingly, we found opposite responses of ADFs to several characteristics in the Peace region ( Figure 9).
Specifically, ADFs performed better in lower conductivity sediments and for wells in deeper layers to affect lower streambed conductance. One possible reason is that the parameters (T and S) used in the ADFs were obtained from well locations. In theory, they should be the average values between the wells and affected streams. In the BX Creek model, due to the simple Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada hydrostratigraphic units and smaller domain, the selected parameters were more representative of the average values between well and streams. In contrast, the Peace region has a large spatial variation in hydraulic conductivity and longer well-stream distances ( Figure S6-S7). The parameters at the well location, therefore, are less likely to represent the average condition between well and streams, and this mismatch between the hydrostratigraphy at the well and the hydrostratigraphy between the well and stream may have caused a different response between the two domains. Moreover, the ADFs had higher MAE in higher hydraulic conductivity zones with higher streambed conductance than smaller ones, potentially leading to larger estimates of depletion and accordingly larger MAE. The hydraulic conductivity range in the BX Creek domain is three orders of magnitude compared to seven orders of magnitude in the Peace region, and apparently there is a much wider range of variability in Peace region which may contribute to different responses in two domains.
In addition, one of the key input parameters in ADFs is the well-stream distance. In both domains, there was consistently greater differences between the ADFs and MODFLOW for wells within ~2 kilometers of a stream, which correspond to wells with the highest predicted depletion (Figures 8 and 9). As well-stream distance increased, the predicted depletion decreases and thus leads to a smaller MAE. The analytical depletion functions' sensitivity in the BX Creek is similar to that observed by Zipper et al. (2019A), who found that ADFs performed better in places which are relatively flat (where alluvial aquifers are most likely to exist), with a near-surface water table (shallower well depth) and within a few kilometers of the down-gradient perennial streams. While we found similar results in BX Creek, our comparison in the Peace region showed that the drivers of ADF performance variability differed across the two domains and that streamflow depletion response to hydrogeological characteristics is most likely to be region-specific. As such, our results Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada indicate that conclusions related to specific parameters observed in one domain cannot be assumed to apply in other regions.
Across the two domains, we found that ADFs including the Glover and Hunt models produced similar streamflow depletion estimates ( Figure S4-S5 and S8-S9) and responded similarly to hydrogeological characteristics (Figures 8 and 9). The MAE for the most affected stream for ADFs including the Glover and Hunt models were 13.6% and 14.4%, respectively, in the BX Creek; and 5.0% and 5.1%, respectively, in the Peace region. Comparison across the two domains revealed that the Hunt model had a consistently better match with MODFLOW, suggesting that considering the streambed conductance can lead to smaller errors, though the differences were slight. Previous work also found that streambed conductance can influence the performance of analytical models (Lackey et al., 2015;Sophocleous et al., 1995;Spalding & Khaleel, 1991). For instance, Hunt et al., (2001) conducted a field experiment in a small domain in New Zealand and found that streambed conductance determined the accuracy of streamflow depletion using the Hunt model. In contrast, streambed conductance may play a minor role in streamflow depletion. Theoretically, as the streambed conductance decreases, greater differences between ADFs including the Glover and Hunt models would be expected. In the two domains, we found that the difference between ADFs including the Glover and Hunt models was minor over a range of streambed conductance conditions, suggesting that in these two model domains, the streambed conductance may not be a significant factor leading to the differences between Glover

Applicability of analytical depletion function: uncertainty limitation and future needs for analytical functions and numerical models
The results of this study confirm that ADFs are an accurate tool for estimating streamflow depletion (Huggins et al., 2018;Zipper et al., 2018Zipper et al., , 2019A, 2020. However, the results also show that streamflow depletion responds differently to hydrogeological and physiographic Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada characteristics of the Peace region and BX Creek (Figures 8-9). Due to the different performance of ADFs in these two different hydrogeological landscapes, our results are not sufficient to make generalized conclusions regarding the applicability of ADFs across British Columbia or within specific hydrogeological settings. Therefore, we recommend that ADFs should be used with caution for broader application, that good understanding of local hydrogeological conditions outside of the tested domains are needed, and that additional testing of ADF performance prior to use in decision-making should be a priority.
While our analysis primarily focused on the evaluation of ADFs, we also found substantial uncertainty associated with using previously calibrated numerical models for streamflow depletion assessment if those models were developed for a different purpose. In the BX Creek and Peace region models some of the flow boundary conditions were challenging to compare to the ADFs.
Notably, the drain boundary represented different stream types across our two domains, i.e., ephemeral streams in BX Creek and losing and ephemeral streams in the Peace region.
Groundwater pumping has limited impacts on ephemeral streams in these numerical models since they are typically disconnected from the water table. In addition, it should be noted that flow processes in unsaturated zone are not simulated by MODFLOW, which result in the river leakage being directly added to aquifers and hence leading to a biased calculation of streamflow depletion, particularly for disconnected systems between streams and the aquifer. The potential uncertainty from such MODFLOW assumptions should be addressed. Therefore, the selection of boundary conditions could potentially affect the stream depletion results in the numerical models and thus influence our comparison between the ADFs and numerical model. In summary, in order to provide reliable streamflow depletion assessments, numerical models should be specifically selected, developed, and calibrated for surface and groundwater interactions. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada Based on these results, more case studies are needed to advance understanding of streamflow depletion and guide application of ADFs for real-world settings with complex stream networks. Here, we highlight some possible directions for future research studies.
1) Cumulative impacts of multiple pumping wells: We compared the performance of ADFs to numerical models by turning one pumping well on at a time. However, multiple groundwater wells typically exist in an aquifer. Future studies are needed to examine whether ADFs are an appropriate tool for estimating stream flow depletion when considering the cumulative effects of multiple wells. Existing literature indicates that the total impacts from multiple groundwater wells may not be equal to the sum of the effects of individual wells (Ahlfeld et al., 2016;Schneider et al., 2017). 2) Clarifying appropriate numerical models for comparison: We assessed the performance of ADF by comparing to the previously calibrated numerical models, so the degree to which ADF results can be considered representative of real-world conditions depends on the representativeness of the numerical models. The selection of flow boundary conditions affects numerical model results and thus increases uncertainty in the degree to which our ADF predictions would match real-world conditions. The MODFLOW used in this study did not include flow process in unsaturated zone. Fully integrated models such as HydroGeoSphere (Brunner and Simmons, 2012) can improve the simulation of the partially saturated zone and groundwater and surface water interactions, but in practice are rarely available in calibrated models. Indeed, they are not available in the regions where we conducted our study. Future studies are recommended to adopt unsaturated zone flow (UZF) package in MODFLOW (Niswonger et al., 2006) or to use other models [e.g., HydroGeoSphere and ParFlow (Maxwell & Condon, 2016)] that have capability to simulate both unsaturated and saturated flow processes to minimize uncertainties.

Synthesis and Conclusions
In this study, we evaluated the performance of analytical depletion functions, which include depletion apportionment equations, stream proximity criteria, and analytical models, to understand the utility of ADFs in two different hydrogeological settings. Specifically, BX Creek has a simpler hydrogeological setting and stream network, while the Peace region is larger and more complex. Using MODFLOW simulations, we found that groundwater pumping can have significant impacts on streamflow in both regions. However, the two domains have different responses to pumping wells due to differences in hydrogeological settings and stream networks. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada Further, we found that streamflow depletion varies as a result of numerical model structure, with limited impacts on dry or ephemeral streams represented using the drain package in MODFLOW.
Streamflow depletion derived from the ADFs was compared to the numerical models.
Across the two domains, we found that ADFs correctly identified all wells for most affected stream segments in the BX Creek and as much as 81% of the time in the Peace region over the entire 30year simulation. The mean absolute error (MAE) of the most affected stream segment were relatively small compared to the pumping rate. Specifically, the average MAE of the most affected stream segments was 14.4% and 7.6% of the highest pumping rate (214 m 3 /day) in the BX Creek and Peace region, respectively. For all affected stream segments, the average MAE was 5.0% of the highest pumping rate in the BX Creek and 2.3% in the Peace region. In addition, we also found that ADFs predictions were more accurate during the pumping season compared to the nonpumping season. Overall, ADFs provide reasonable predictions to estimate streamflow depletion in the pumping season for perennial streams.
We found variable factors of ADFs performance across these two the hydrogeologic settings. In BX Creek, ADFs have smaller errors for wells in higher hydraulic conductivity materials, shallower aquifer, and lower streambed conductance. Conversely, in the Peace region, ADFs have smaller errors in lower hydraulic conductivity materials, deeper aquifers, and lower streambed conductance. In both regions, the performance of ADFs is most variable closest to streams, with increasing MAE in the first couple of kilometers, corresponding to areas where predicted depletion is the highest. The contrasting responses of ADFs performance to hydrogeological setting between the BX Creek and Peace region stresses the importance of additional testing of ADFs in different regions to better identify the drivers of performance. Streamflow depletion assessment with analytical depletion functions in British Columbia, Canada In summary, we conclude that ADFs are useful and accurate tools for streamflow depletion prediction for conjunctive water management. We, therefore, recommend this tool can be applied for assessing the impacts of groundwater pumping on streamflow and its associated aquatic functioning. However, a good understanding of local hydrogeological conditions is required to address the potential uncertainty of ADFs and ensure the prediction accuracy.