Closing the gap: Explaining persistent underestimation by US oil and natural gas production-segment methane inventories

Methane (CH4) emissions from oil and natural gas (O&NG) systems are an important contributor to greenhouse gas emissions. In the United States (US), recent synthesis studies of field measurements at different spatial scales find CH4 emissions ~1.5x-2x greater than official Environmental Protection Agency (EPA) greenhouse gas inventory (GHGI) estimates. Site-level field studies have isolated the production segment as the dominant contributor to this divergence. Based on an updated synthesis of measurements from component-level field studies, we develop a new inventory-based model for CH4 emissions using bootstrap resampling that agrees within error with recent syntheses of site-level field studies and allows for isolation of differences between our inventory and the GHGI at the equipment level. We find that venting and malfunction-related emissions from tanks and other equipment leaks are the largest contributors to divergence with the GHGI. To further understand this divergence, we decompose GHGI equipment-level emission factors into their underlying component-level data. This decomposition shows that GHGI inventory methods are based on measurements of emission rates that are systematically lower than those in our updated synthesis of more recent measurements. If our proposed method were adopted in the US and other jurisdictions, inventory estimates could become more accurate, helping to guide methane mitigation policy priorities.

Non-peer reviewed preprint submitted to EarthArXiv

Methane (CH4) is the principal constituent of natural gas and is also a potent greenhouse gas (GHG) [1]. During production of oil and natural gas (O&NG), some processes are designed to vent CH4 to the air, and CH4 is also emitted unintentionally via leaks in the system. According to the official United States (US) GHG inventory, CH4 emissions from O&NG operations are estimated to contribute ~3% of national GHG emissions (with 100-year GWP = 25, [2]).
At the international level the contribution is approximately 5% (based on estimates from [3] and [4]). However, the uncertainty in this estimate, data gaps, and inconsistency with alternative approaches suggested a need for further evidence [5]–[8]. To this end, significant research in the past decade has investigated CH4 emissions from the O&NG system.

The US Environmental Protection Agency (EPA) estimates O&NG CH4 emissions in an annual Greenhouse Gas Inventory (GHGI) [9]. The GHGI uses a data-rich, “bottom-up” approach to estimate national CH4 emissions by scaling up CH4 emissions measurements from activities like well completions and gas handling components like valves or seals. However, a recurrent theme in the literature is that the GHGI underestimates total US O&NG CH4 emissions compared to observed values [10]. Brandt et al. [11] summarize the literature and observe that national-scale estimates from large-scale field studies exceed the GHGI by ~1.5 times. This difference is sometimes referred to as the “top-down/bottom-up” gap [11]–[17], based on the differences in approach between the GHGI and the conflicting studies. “Top-down” studies determine total emissions from multiple sites via measurements from aircraft, satellites, or weather stations (e.g., [14]–[16], [18]–[20]). Some recent studies have used a meso-scale “site-level” approach, which measures CH4 downwind of facilities (e.g., well-pads) to estimate total emissions of an entire site or facility (e.g., [21]–[24]). A recent synthesis of site-level data by Alvarez et al. [13] finds agreement between site-level results and top-down results, with a best estimate of supply chain emissions (including all equipment from production to distribution) ~1.8 times that of the component-level GHGI [25] (up to ~2.1x in the production segment).

Most emissions sources in the GHGI are derived using bottom-up methods.
The bottom-up approach estimates overall CH4 emissions by combining counts of individual components (or activities) with emissions per component/activity (the “emission factor”). The bottom-up approach allows for representation of sources at a high resolution, with 67 and 45 separate sources for the O&NG production segments, respectively [25]. Because of this high resolution, the GHGI is useful for development of CH4 mitigation policies. For example, the Obama administration’s Climate Action Plan developed recommendations using the relative contribution of emissions sources in the GHGI [26]. Also, the bottom-up framework of the GHGI is recommended for reporting national emissions under the United Nations Framework Convention on Climate Change (UNFCCC, [27]), under which participating countries report their inventory of GHG emissions.

This study aims to answer two questions. First, why does the bottom-up EPA GHGI underestimate CH4 emissions compared to both site-level and large-scale top-down studies? Second, is this underestimation due to an inherent problem with the bottom-up methods used in the GHGI? Previous studies have noted that the underlying data sources of the GHGI were published in the 1990s and may be outdated [11], [28], [29]. The site-level synthesis study of Alvarez et al. [13] suggested that the divergence is likely due to a systematic bias in the bottom-up methodology that misses “super-emitters”, a finding supported by others (e.g., [11], [30]). Recent work suggests that top-down measurement campaigns are capturing systematically higher emissions during daytime hours from episodic events. However, this may not be true at a national level, as it has been noted that the upward bias of top-down measurements was likely explained by unusually high liquids-unloadings in the Fayetteville shale [13], [31].
Some have attempted to construct alternative inventories (e.g., [13], [32], [33]); however, these attempts have not taken full advantage of the robust set of component-level data now available.

In this study, for the first time, we explain with source-level specificity the underestimation of O&NG CH4 emissions in the GHGI as compared to top-down studies. Our analysis boundary is the O&NG production segment, which includes all active, onshore well pads and tank batteries (excluding inactive and offshore wells) and ends prior to centralized gathering and processing facilities (Figure S1). We focus on the production segment given its significant emissions (~58% of total supply chain CH4 emissions in Alvarez et al. [13]) and the large difference between site-level estimates and the GHGI [13] (~70% of the difference between Alvarez et al. [13] and the GHGI, Figure S2). This study develops and validates approaches that can be applied to other segments in the O&NG supply chain.

Our novel contributions are threefold. First, we construct a bottom-up, O&NG production-segment CH4 emissions estimation tool based on the most comprehensive public database of component-level activity and emissions measurements yet assembled. Our approach differs from the GHGI in that it applies modern statistical approaches (bootstrap resampling) to allow for inclusion of infrequent, large emitters, thus robustly addressing the issue of super-emitters. Second, we use this tool to produce an inventory of US O&NG production-segment CH4 emissions and compare this with the GHGI and previous site-level results, showing that much of the divergence between different methods at different scales vanishes when we apply our improved dataset and statistical approaches. As mentioned earlier, site-level synthesis studies have been validated against even larger-scale top-down studies, so improved alignment between our method and site-level results suggests much better agreement with top-down results [13], [34].
Third, to isolate specific sources of disagreement between the GHGI and other studies, we reconstruct the GHGI emission factors beginning with the underlying datasets and uncover some possible sources of disagreement between inventory methods and top-down studies. Based on these results, we suggest a strategy for improving the accuracy of the GHGI, and likewise that of any country using a similar approach in reporting O&NG CH4 emissions to the UNFCCC.

A new bottom-up approach

Bottom-up approaches extrapolate component or equipment emissions rates to large (e.g., national) scales by multiplying emission factors (emissions per component or equipment per unit time) by activity factors (counts of components per equipment, and equipment per well) (Figure 1). Our estimation tool requires two sequential extrapolations, first from the component to the equipment level, and second from the equipment to the national or regional level.

The approach utilized in our bottom-up estimation tool begins with a database of component-level direct emissions measurements (e.g., component-level emission factors). We generate component-level emission factor distributions for this study from a literature review building on prior work [11], [30] and adding new publicly available quantified measurements (Table 1 in Methods). Our resulting tool’s database includes ~3200 measurements from 6 studies across a 12-fold component classification scheme (see SI-3.2 for further description of this classification scheme). We applied emission factors as reported in the individual studies, with no modifications beyond unit conversion (noting that there are some differences between studies in High Flow Sampler bias correction for gas concentration and flow rate, which may introduce uncertainty to our results).
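The two sequential extrapolations described above can be illustrated with a short deterministic sketch. This is not the study's tool: every number below is a hypothetical placeholder, and the names `equipment_emission_factor` and `national_emissions_tg_per_yr` are introduced here only for illustration. The sketch assumes the simplest form of each step, namely a sum over component types of count times fraction emitting times emission factor, followed by multiplication by an equipment count.

```python
"""Two-stage bottom-up extrapolation: component -> equipment -> national.

All numeric values are hypothetical placeholders, not data from the study.
"""

# Mean emission rate of an *emitting* component (kg CH4/day), by component type.
COMPONENT_EF_KG_PER_DAY = {"valve": 2.0, "connector": 1.0}

# Component-level activity factors: components per wellhead, and the
# fraction of those components that are emitting.
COMPONENTS_PER_WELLHEAD = {"valve": 10, "connector": 40}
FRACTION_EMITTING = {"valve": 0.02, "connector": 0.01}


def equipment_emission_factor(ef, counts, frac_emitting):
    """First extrapolation: sum over components of count * fraction * EF."""
    return sum(counts[c] * frac_emitting[c] * ef[c] for c in counts)


def national_emissions_tg_per_yr(equipment_ef_kg_per_day, equipment_count):
    """Second extrapolation: equipment EF times the number of equipment."""
    kg_per_year = equipment_ef_kg_per_day * equipment_count * 365
    return kg_per_year / 1e9  # 1 Tg = 1e9 kg


wellhead_ef = equipment_emission_factor(
    COMPONENT_EF_KG_PER_DAY, COMPONENTS_PER_WELLHEAD, FRACTION_EMITTING
)
total_tg = national_emissions_tg_per_yr(wellhead_ef, equipment_count=1_000_000)
print(f"wellhead EF: {wellhead_ef:.2f} kg/day; national: {total_tg:.3f} Tg/yr")
```

Using mean values at each step, as here, hides the skewed distribution of component emissions; the resampling approach described in the text replaces these fixed means with draws from the measurement database.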
Data for component counts and fraction of components emitting (the ratio of emitting components to all components counted) were scarce, with only 3 studies containing useful information for both ([35]–[37] for component counts and [35], [36], [38] for fraction of components emitting). We derive equipment-level emission factors for our tool by random re-sampling (i.e., bootstrapping, with replacement) from our component-level database according to component counts per equipment and fraction of components emitting. Source-specific approaches were required for infrequent events (i.e., completions, workovers, liquids unloadings), methane slip from reciprocating engines, and liquid storage tanks (see SI-3.3).

We then perform a second extrapolation, using our equipment-level emission and activity factors to calculate a 2015 US O&NG production-segment CH4 emissions estimate. For this step, our tool is integrated into the Oil Production and Greenhouse Gas Emissions Estimator (OPGEE; further description in SI-3.1) and parameterized using 2015 domestic well count and O&NG production data (same dataset as Alvarez et al. [13]). A total of ~1 million wells and associated equipment are partitioned and analyzed across 74 analysis bins (SI-4.1). We performed a Monte Carlo uncertainty analysis, repeating the bootstrapping algorithm 100 times across all ~1 million wells (SI-4.4). It is worth mentioning that emission factors are often themselves only measured in a few locations, and thus in our extrapolation we assume applicability to other regions.

Figure 1: Schematic of this study’s bottom-up CH4 emissions estimation tool, which involves multiplication of emission factors (e.g., emissions per valve) by activity factors (e.g., number of valves per wellhead). Two sequential extrapolations are performed using an iterative bootstrapping approach. First, our database of component-level (e.g., valve, connector) emissions measurements (a) is extrapolated using component-level activity factors to generate equipment-level (e.g., wellhead, separator) emission factors (b). Second, these equipment-level emission factor distributions are extrapolated using equipment-level activity factors to generate a 2015 US O&NG production-segment CH4 emissions estimate. This extrapolation is performed 100 times to generate a distribution of national-level CH4 emissions (c) and estimate a 95% confidence interval (CI).

Comparison of US production-segment CH4 emissions with site-level studies and the GHGI

We first compare our resulting US 2015 O&NG production-segment CH4 emissions estimate with the GHGI estimate for 2015 from the most recent (2020) inventory [25]. We also validate our bottom-up tool by comparing total emissions and emissions distributions with those generated in site-level synthesis studies (total emissions are compared with Alvarez et al. [13]; site-level distributions are compared with Omara et al. [34]).

We estimate mean O&NG production-segment CH4 emissions of 6.3 Tg/yr (5.8-6.9 Tg/yr, at 95% confidence interval, CI) (Fig. 2a; note that the CI is relatively narrow given that this only captures uncertainty due to resampling). Our mean, production-normalized emissions rate from the production segment is 1.3% (1.2-1.4% at 95% CI, based on gross NG production of 32 trillion cubic feet and an average CH4 content of 82% [39], [40]), slightly lower than Alvarez et al. [13], [34], who estimate 1.5% (applying the same denominator as above). Both our bottom-up component-level inventory results and the Alvarez site-level results are approximately 2x those of the GHGI estimate of 3.6 Tg/yr (year 2015 data [25], excludes offshore systems) for the O&NG production segment.
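The production-normalized rate quoted above can be checked with simple arithmetic. The sketch below uses the gross production (32 trillion cubic feet) and CH4 content (82%) given in the text; the CH4 density of 19.2 g per standard cubic foot is an assumption introduced here (roughly 60 °F and 1 atm), not a value stated in the source, and the function names are hypothetical. The GHGI rate printed at the end is derived here with the same denominator and is not a figure reported in the text.

```python
"""Production-normalized CH4 emission rate: emitted CH4 / produced CH4."""

GROSS_GAS_TCF = 32.0   # gross NG production, trillion cubic feet (from the text)
CH4_FRACTION = 0.82    # average CH4 content of produced gas (from the text)
CH4_G_PER_SCF = 19.2   # ASSUMPTION: CH4 density at about 60 F and 1 atm


def produced_ch4_tg(gross_gas_tcf, ch4_fraction, g_per_scf=CH4_G_PER_SCF):
    """Mass of CH4 in gross gas production, in Tg (1 Tg = 1e12 g)."""
    scf_ch4 = gross_gas_tcf * 1e12 * ch4_fraction
    return scf_ch4 * g_per_scf / 1e12


def normalized_rate_percent(emissions_tg, produced_tg):
    """Emissions as a percentage of produced CH4."""
    return 100.0 * emissions_tg / produced_tg


produced = produced_ch4_tg(GROSS_GAS_TCF, CH4_FRACTION)
cases = [
    ("this study, mean", 6.3),
    ("this study, lower bound", 5.8),
    ("this study, upper bound", 6.9),
    ("GHGI (derived here)", 3.6),
]
for label, tg in cases:
    rate = normalized_rate_percent(tg, produced)
    print(f"{label}: {tg} Tg/yr -> {rate:.2f}% of produced CH4")
```

Under the assumed density, the mean and interval bounds round to the 1.3% (1.2-1.4%) reported in the text; a slightly different density assumption would shift the second decimal place.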
Interestingly, the difference in US production-segment emissions between this study and the GHGI is approximately the same volume as our estimate of contribution from super-emitters (top 5% of emissions events). Given that our results match the Alvarez et al. site-level results, we conclude that the divergence between the GHGI and top-down/site-level studies is not likely to be due to any inherent issue with the bottom-up approach.

Figure 2(b-c) show that site-level distributions developed using our model match empirical distributions from the site-level synthesis study of Omara et al. [34]. To report our results on a basis consistent with site-level studies (recalling that sites can contain more than one well), we cluster equipment-level emissions outputs into production sites (SI-4.3). Several other observations from our simulations are of interest. First, our modeled emissions per site are higher at liquids-rich sites versus gas-rich sites (Figure S29), in alignment with recent field measurement campaigns in both Canada and the United States [41], [42]. Second, our model recreates the trend demonstrated by Omara et al. wherein low-producing sites exhibit higher production-normalized emissions rates [34] (Figure S30). Finally, the tail of our modeled distribution closely matches the tail of the empirical Omara et al. distribution (Figure 2b and Figure S28). This is of particular interest, given that recent papers assert the divergence between the GHGI and site-level studies is mostly due to an inability of the bottom-up methods to capture super-emitters [32], [42]. Our results clearly show that a modern dataset with proper bootstrap resampling techniques can recreate observed super-emitters.

Because our approach is component-level and bottom-up, we can investigate the source of differences with the GHGI. This cannot be done with site-level data.
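To illustrate how bootstrap resampling can carry a heavy tail from component measurements through to aggregate totals, the following sketch resamples a synthetic dataset. It is a minimal illustration under stated assumptions, not the study's algorithm (which is described in the SI): the lognormal "measurements" stand in for a real component-level database, the number of emitting components per piece of equipment is drawn as a binomial count, and all parameter values and function names are hypothetical.

```python
"""Bootstrap sketch: heavy-tailed component data -> equipment totals -> CI."""
import random


def make_synthetic_measurements(rng, n=500):
    """Stand-in for a component-level measurement database (kg CH4/day)."""
    return [rng.lognormvariate(0.0, 2.0) for _ in range(n)]


def simulate_equipment(rng, measurements, n_components, frac_emitting):
    """One bootstrap draw of a single equipment-level emission rate."""
    # ASSUMPTION: each component emits independently with prob. frac_emitting.
    n_emitting = sum(rng.random() < frac_emitting for _ in range(n_components))
    # Resample emitting-component rates with replacement from the database.
    return sum(rng.choices(measurements, k=n_emitting))


def top_share(values, top_fraction=0.05):
    """Fraction of total emissions contributed by the largest sources."""
    ordered = sorted(values, reverse=True)
    n_top = max(1, int(len(ordered) * top_fraction))
    total = sum(ordered)
    return sum(ordered[:n_top]) / total if total > 0 else 0.0


def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 100]."""
    return sorted_values[round(q / 100 * (len(sorted_values) - 1))]


rng = random.Random(42)
measurements = make_synthetic_measurements(rng)
N_EQUIPMENT, N_COMPONENTS, FRAC_EMITTING, N_ITER = 1000, 20, 0.02, 100

totals = []
for _ in range(N_ITER):  # Monte Carlo repetitions of the whole bootstrap
    rates = [
        simulate_equipment(rng, measurements, N_COMPONENTS, FRAC_EMITTING)
        for _ in range(N_EQUIPMENT)
    ]
    totals.append(sum(rates))

totals.sort()
lo, hi = percentile(totals, 2.5), percentile(totals, 97.5)
share = top_share(rates)  # computed on the final repetition only
print(f"95% interval of total: {lo:.0f} to {hi:.0f} kg/day")
print(f"top 5% of equipment contribute {100 * share:.0f}% of emissions")
```

Because the interval here reflects resampling variability only, it behaves like the relatively narrow CI noted in the text; it says nothing about whether the underlying measurements are representative.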
Relative to the GHGI, contributions from equipment leaks in our estimate are larger by ~1.3 Tg CH4 and those from tank leaks and venting by ~2.1 Tg CH4 (Figure 3). Together, these two sources contribute over half of total O&NG production-segment CH4 emissions. The increase in estimated emissions from equipment leaks compared to the GHGI is due to our updated emission factor; we know that the difference is not due to equipment-level activity factors because ours are nearly identical to the GHGI (see SI-2.3). In the next section we will perform a deeper investigation into both component-level emissions data for equipment leaks and tank modelling as underlying contributors to differences between our results and the GHGI.

Figure 2: (a) Comparison of this study’s aggregate US 2015 CH4 emissions from the O&NG production segment with site-level results of Alvarez et al. (see Table S3 in [13] minus contributions from offshore platforms and abandoned wells) and the GHGI [25], including fraction estimated from super-emitters (top 5% of sources) and 95% confidence interval. We also compare probability distributions of our component-level simulations (red lines), aggregated into site-level emissions, with site-level results of Omara et al. (blue line): (b) Cumulative distribution plot (CDF) describing the fraction of well-sites with emissions below a given amount, and (c) probability distribution of emissions rate per well-site with the mean (filled square), median (x), and 95% confidence intervals shown above the plots. Results of this study are presented using 100 Monte Carlo simulations. Because of the large number of sampled sites, the Monte Carlo simulations all converge toward the same size distribution in panels (b) and (c).

Figure 3: Contributions of emissions sources to our US 2015 O&NG production-segment inventory (and 95% confidence interval) compared with 2020 GHGI [25]. Inset pie charts illustrate individual source-specific contributions of our inventory to equipment leaks (left pie chart) and tanks (right pie chart). Discrepancies with the GHGI are dominated by liquid hydrocarbon tank leaks and venting (“tanks”, ~2.1 Tg/yr CH4) and equipment leaks (~1.3 Tg/yr CH4). Details regarding the modelling of tank emissions sources are given in SI-3.3. Results in tabular form are given in Table S2 and Table S3.

Main sources of GHGI underestimation

Given that our new component-level method is validated by the empirical results from site-level field studies, can we explain why the GHGI produces lower O&NG production-segment CH4 emissions estimates? Results from our modelling (Figure 3), in addition to recent revisions by the GHGI and other analyses (SI-5.1), suggest that the downward bias of the GHGI is not due to pneumatic controllers, liquids unloadings, or completions and workovers because either the divergence is small or absolute emissions are small, or both. Methane slip in reciprocating engines is higher in the GHGI, although the overall magnitude of the difference is small. The combustion emission factor used in the GHGI for methane slip from reciprocating gas engines is based on a 1991 TRANSDAT dataset published by the Gas Research Institute [43]. The difference compared to our study is probably explained by substantial improvement in engine emissions since publication of that report (based on manufacturer reported specifications for reciprocating gas engines [44]). For these reasons, this paper focuses its analysis on the two largest sources of GHGI underestimation compared to our validated method: equipment leakage and liquid hydrocarbon storage tanks, whose emissions are 1.3 and 2.1 Tg CH4 lower than our estimates, respectively. See SI-1.1 for definitions of each emissions source.
The GHGI constructs emission factors for equipment-level leaks using an approach very similar to ours, where emission factors of individual components are aggregated according to estimated counts of components per piece of equipment. To explore differences in equipment leak estimates, we decompose equipment-level emission factors into the constituent parts: Component-level emissions data, component counts, and fraction of components emitting (the relationship between these parameters is defined in Figure 4). Reconstructing equipment-level, equipment leakage emission factors from the GHGI is complicated by the fact that the underlying studies from the 1990s [35], [45] are at a more detailed level than the GHGI itself. For example, the underlying data for natural gas system emission factors are subdivided by region (e.g., Western gas versus Eastern gas), and for petroleum systems data are subdivided by product stream (e.g., light oil versus heavy oil). Equipment-level emission factors for gas systems, for example, are a weighted average of both Western emission factors and Eastern emission factors. The GHGI approach to aggregating these factors to overall values for natural gas and petroleum systems is described in SI-5.2. We demonstrate differences in equipment-level emission factors for equipment leaks via a decomposition into constituent factors for a single example (equipment type and region) – leakage from gas wells in the West (Figure 4) – with equipment leaks from all other sources similarly described in the SI (Figure S18 – Figure S26). The difference between our study’s equipment-level equipment leakage emission factor for Western natural gas wells and the GHGI – the difference to be explained by decomposition – is ~5x (3.4 kg/day versus 0.7 kg/day). The underlying factors are plotted in Figure 4. First, we compare component-level emission factors, defined as the average emissions rate of leaking components (Figure 4a). 
(Note that the “average emission rate of leaking components” is not the same as an average emission rate for all components.) For Western gas and petroleum systems in the GHGI, component-level leakage emission factors are constructed using a method referred to by the EPA [46] as the “EPA correlation approach” (defined in detail in SI-5.2.2). In this approach, emission factors are constructed from a dataset of various facilities including oil and gas production sites, refineries, and marketing terminals (n = 445, data compiled in the EPA Protocol document [46]). The difference between our study’s component-level emission factors and the GHGI for connectors, valves, and open-ended lines (the components comprising the wells) is ~7x, 6x, and 5x respectively (Figure 4a). Note that the decomposition in Figure 4a is limited to connectors, valves, and open-ended lines (the three components inventoried by the GHGI) although our inventory also accounts for pressure relief valves, regulators, and other (miscellaneous) components on wells. The fact that GHGI equipment-level emission factors are based upon only three component types (when more component classes exist, according to our database) will contribute to some underestimation.

Figure 4b compares the fraction of components emitting (the ratio of emitting components to all components counted), while Figure 4c shows component counts (number of components counted per piece of equipment). The three factors have offsetting effects: component-level emission factors and component counts contribute to higher emissions in our study versus the GHGI, while the fraction of components emitting contributes to lower emissions in our study. The resulting total emissions per well (Figure 4d) are the product of these factors, summed across all components. Similar results are found across all equipment categories compared to the GHGI.
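The decomposition just described, in which the equipment-level emission factor is the product of three factors summed across component types, can be sketched as follows. The numbers are hypothetical placeholders chosen only to mimic the direction of the differences described in the text (higher emission factors and counts, lower fraction emitting); they are not values from the study or the GHGI, and the names `STUDY`, `INVENTORY`, `equipment_ef`, and `factor_ratios` are introduced here for illustration.

```python
"""Decompose an equipment-level leak emission factor into three factors.

Every number below is a hypothetical placeholder, not study or GHGI data.
"""

# Per component type: (EF of emitting components in kg CH4/day,
#                      fraction of components emitting,
#                      components per piece of equipment)
STUDY = {"valve": (5.0, 0.01, 20), "connector": (2.0, 0.005, 80)}
INVENTORY = {"valve": (1.0, 0.04, 10), "connector": (0.4, 0.02, 40)}


def equipment_ef(factors):
    """Sum over component types of EF * fraction emitting * count."""
    return sum(ef * frac * count for ef, frac, count in factors.values())


def factor_ratios(study, inventory):
    """Ratio (study / inventory) of each factor, per component type."""
    return {
        name: tuple(s / g for s, g in zip(study[name], inventory[name]))
        for name in study
    }


ratios = factor_ratios(STUDY, INVENTORY)
for name, (r_ef, r_frac, r_count) in ratios.items():
    net = r_ef * r_frac * r_count  # net effect for this component type
    print(f"{name}: EF x{r_ef:.1f}, fraction emitting x{r_frac:.2f}, "
          f"count x{r_count:.1f}, net x{net:.2f}")

overall = equipment_ef(STUDY) / equipment_ef(INVENTORY)
print(f"equipment-level EF ratio: x{overall:.2f}")
```

Reading the per-factor ratios side by side makes the offsetting effects explicit: a factor ratio below one (here, fraction emitting) partly cancels the ratios above one.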
In general, in our dataset, component-level emission factors are higher [5x to 46x comparing our emission factors for connectors, valves, and open-ended lines across all GHGI categories, see Figure S18 – Figure S26], the fraction of components emitting is lower [1x to 0.05x], and the number of components per piece of equipment is generally, but not always, higher [0.2x to 20x comparing our component counts for wells, separators, and meters across all GHGI categories, see Figure S18 – Figure S26]. Considering the decomposition presented here, along with the rest in the SI (Figure S18 – Figure S26, plus some discussion of smaller factors not described here), we can explain much of the overall underestimation of the GHGI compared to our results for the equipment leaks source category.

Figure 4: Example decomposition of the equipment-level emission factor for Western US gas wells (note that units differ for each panel, and that the logarithmic scale means visible differences between points often span orders of magnitude). This study’s equipment-level emission factor (d) is decomposed into constituent parts and compared with the GHGI. Constituent parts include: component-level emission factors (a), fraction of components emitting (b), and component counts (c). When multiplied together, these factors have counteracting biases, with component-level emission factors and component counts contributing to higher emissions in our study versus the GHGI, and fraction of components emitting contributing to lower emissions in our study. Note that in actual usage in the GHGI, equipment-level emission factors for gas systems are a weighted average of both Western systems (API 4598, [47]) and Eastern gas systems (Star Environmental, [45]). Here, for illustration purposes, we only show constituent data for Western gas systems; results for Eastern gas systems are reported in SI Section 5.2. Further, we also limit this figure to connectors, valves, and open-ended lines (the three components inventoried by the GHGI) although our inventory also accounts for pressure relief valves, regulators, and other (miscellaneous) components on wells.

The second source of significant divergence between this study and the GHGI for US CH4 emissions in the O&NG production segment is emissions from liquid hydrocarbon storage tanks. The EPA GHGI constructs storage tank emissions estimates using Greenhouse Gas Reporting Program (GHGRP) data. The GHGRP is a program which collects emissions data from industrial facilities, where requirements for natural gas and petroleum systems are specified by Title 40 of the Code of Federal Regulations, Part 98, Subpart W [48]. Based on GHGRP data for storage tanks (see methods in SI-5.3), we decompose total emissions for the GHGI into tank counts and emission factors, allowing us to draw comparisons to results from this study.

Before presenting our decompositions, it is worth noting two key differences in modelling of emissions from liquid hydrocarbon storage tanks between our study and the GHGI (see further description of how our model estimates tank emissions in SI-3.3.2). First, whereas our model is based on direct measurements, the GHGI is based on operator reported simulations from software programs such as API E&P Tank or AspenTech HYSYS [49], [50]. Second, as a consequence of these differing approaches, whereas our emissions are classified based on measurement source (e.g., vent stack, thief hatch, etc.), GHGI emissions are classified according to the simulated process (e.g., flash emissions). As a consequence of these differences in emissions classification, comparisons between decompositions of our study versus the GHGI will be imperfect.
With this in mind, we define emission factors in our decomposition as the summation of intentional emission factors and unintentional emission factors (Figure 5). Here, intentional (flash-related) emission factors are based on direct emission measurements at the vent stack for our study, and simulations of uncontrolled and controlled tanks in the GHGI (see details in SI-5.3). Our comparison of unintentional emission factors is less precise. In the GHGI, unintentional emissions are limited to what is reported under “malfunctioning separator dump valves” (although it is unclear if additional unintentional emissions are reported alongside flash emissions in the other tank categories, see SI-5.3). Conversely, unintentional emission factors in our study are based on direct measurements of emissions from open thief hatches, rust-related holes, and malfunctioning pressure-relief valves.

We demonstrate the decomposition in Figure 5 for petroleum systems (see Figure S28 in the SI for natural gas systems). Note that flash emissions will only occur at controlled tanks, while unintentional emissions from thief hatches, holes, or pressure-relief valves could occur at either controlled or uncontrolled tanks. Figure 5 and Figure S28 demonstrate that, while several factors contribute, differences in emission factors for various unintentional emissions sources are the greatest source of divergence between this study and the GHGI. Unintentional emission factors are the product of (i) average emissions rate per event, and (ii) frequency of unintentional emissions events per tank. Both of these values are approximately an order of magnitude higher for our study as compared to the GHGI, contributing to the nearly two orders of magnitude difference in total emissions. Our findings suggest that both the magnitude and frequency of unintentional emissions sources could contribute to significant underestimation in the GHGI.
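Because the unintentional emission factor is a product of two terms, the contribution of each term to the overall gap is easiest to read in log space, where the ratios add. The sketch below illustrates this with hypothetical values: only the 2% event frequency on the inventory side echoes a figure mentioned in the text for malfunctioning separator dump valves, while the per-event rates and the study-side frequency are placeholders, and the function names are introduced here for illustration.

```python
"""Attribute a gap in unintentional tank emission factors to its two terms."""
import math

# rate: average emissions per event (kg CH4/day per emitting tank), HYPOTHETICAL
# freq: unintentional emission events per tank
STUDY = {"rate": 40.0, "freq": 0.20}       # both values hypothetical
INVENTORY = {"rate": 5.0, "freq": 0.02}    # 2% frequency echoes the text


def unintentional_ef(source):
    """Emission factor per tank = rate per event * events per tank."""
    return source["rate"] * source["freq"]


def log10_contributions(study, inventory):
    """Orders of magnitude of difference attributable to each term."""
    rate = math.log10(study["rate"] / inventory["rate"])
    freq = math.log10(study["freq"] / inventory["freq"])
    return {"rate": rate, "frequency": freq, "total": rate + freq}


contrib = log10_contributions(STUDY, INVENTORY)
ef_ratio = unintentional_ef(STUDY) / unintentional_ef(INVENTORY)
print(f"emission factor ratio: x{ef_ratio:.0f}")
for term, value in contrib.items():
    print(f"{term}: {value:.2f} orders of magnitude")
```

With these placeholder inputs, two terms that each differ by roughly one order of magnitude combine to nearly two orders of magnitude, which is the pattern the text describes.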
Due to the limited quantified, component-level data available on tank emissions (owing to safety and accessibility issues), our tank emissions measurements come from a single study in a single geographic area (Eastern Research Group in the Barnett shale [51]). Therefore, more studies are required to provide a comprehensive view of tank emissions. However, while quantified emissions data for tank sources are scarce, the existence of unintentional emissions from tanks (due to open thief hatches, rust-related holes, pressure-relief valves, etc.) has been corroborated by numerous ground and aerial surveys [42], [52]–[54]. Several of these studies are summarized in Table S26. Taken together, these studies provide further evidence that: (i) high emissions events are frequently observed at storage tanks, not just from vents but also at open thief hatches and pressure relief valves, (ii) these high emissions events are common at both controlled tanks and uncontrolled tanks, (iii) the frequency (events/tank) of unintentional emissions events is much higher than the rate suggested by the EPA (2%, see Figure 5c) for malfunctioning separator dump valves.

Figure 5: Decomposition of total emissions for oil tanks (far right panel) into constituent parts, with comparison of this study’s dataset to those of the GHGI. From left to right: Total activity, intentional (flash-related) emission factor, unintentional emission factor, and total emissions. Flash and unintentional emission factors are decomposed into emission factors (kg CH4/emitting tanks) and control rates (emitting tanks/total tanks). Note the log scale for the right three panels.

Methane (CH4) is the principle constituent of natural gas and is also a potent greenhouse gas (GHG) [1]. During production of oil and natural gas (O&NG), some processes are designed to vent CH4 to the air, and CH4 is also emitted unintentionally via leaks in the system. According to the official United States (US) GHG inventory, CH4 from O&NG operations are estimated to contribute ~3% of national GHG emissions (with 100 year GWP = 25, [2]). At the international level the contribution is approximately 5% (based on estimates from [3] and [4]). However, the uncertainty in this estimate, data gaps, and inconsistency with alternative approaches suggested a need for further evidence [5]- [8]. To this end, significant research in the past decade has investigated CH4 emissions from the O&NG system.
The US Environmental Protection Agency (EPA) estimates O&NG CH4 emissions in an annual Greenhouse Gas Inventory (GHGI) [9]. The GHGI uses a data-rich, "bottom-up" approach to estimate national CH4 emissions by scaling up CH4 emissions measurements from activities like well completions and gas handling components like valves or seals. However, a recurrent theme consistently found in the literature is that the GHGI underestimates total US O&NG CH4 emissions compared to observed values [10]. Brandt et al. [11] summarize the literature, and observe that national-scale estimates from large-scale field studies exceed the GHGI by ~1.5 times. This difference is sometimes referred to as the "top-down/bottom-up" gap [11]- [17], based on the differences in approach between the GHGI and the conflicting studies. "Top-down" studies determine total emissions from multiple sites via measurements from aircraft, satellites, or weather stations (e.g. [14]- [16], [18]- [20]). Some recent studies have used a meso-scale "site-level" approach which measures CH4 downwind of facilities (e.g., well-pads) to estimate total emissions of an entire site or facility (e.g. [21]- [24]). A recent synthesis of site-level data by Alvarez et al. [13] finds agreement between site-level results and top-down results, with a best estimate of supply chain emissions (including all equipment from production to distribution) ~1.8 times that of the component-level GHGI [25] (up to ~2.1x in the production-segment).
Most emissions sources in the GHGI are derived using bottom-up methods. The bottom-up approach estimates overall CH4 emissions by combining counts of individual components (or activities) with emissions per component/activity (the "emission factor"). The bottom-up approach allows for representation of sources at a high resolution, with 67 and 45 separate sources for the O&NG production segments, respectively [25]. Because of this high resolution, the GHGI is useful for development of CH4 mitigation policies. For example, the Obama administration's Climate Action Plan developed recommendations using the relative contribution of emissions sources in the GHGI [26]. Also, the bottom-up framework of the GHGI is recommended for reporting national emissions under the United Nations Framework Convention on Climate Change (UNFCCC, [27]), under which participating countries report their inventory of GHG emissions.
This study aims to answer two questions. First, why does the bottom-up EPA GHGI underestimate CH4 emissions compared to both site-level and large-scale top-down studies? Second, is this underestimation due to an inherent problem with the bottom-up methods used in the GHGI? Previous studies have noted that the underlying data sources of the GHGI were published in the 1990s and may be outdated [11], [28], [29]. The site-level synthesis study of Alvarez et al. [13] suggested that the divergence is likely due to a systematic bias in the bottom-up methodology that misses "super-emitters", a finding supported by others (e.g., [11], [30]). Recent work suggests that top-down measurement campaigns are capturing systematically higher emissions during daytime hours from episodic events. However, this may not be true at a national level, as it has been noted that the upward bias of top-down measurements was likely explained by unusually high liquids-unloadings in the Fayetteville shale [13], [31]. Some have attempted to construct alternative inventories (e.g., [13], [32], [33]); however, these attempts have not taken full advantage of the robust set of component-level data now available.
In this study, for the first time, we explain with source-level specificity the underestimation of O&NG CH4 emissions in the GHGI as compared to top-down studies. Our analysis boundary is the O&NG production segment, which includes all active, onshore well pads and tank batteries (excluding inactive and offshore wells) and ends prior to centralized gathering and processing facilities (Figure S1). We focus on the production segment given its significant emissions (~58% of total supply chain CH4 emissions in Alvarez et al. [13]) and the large difference between site-level estimates and the GHGI [13] (~70% of difference between Alvarez et al. [13] and the GHGI, Figure S2). This study develops and validates approaches that can be applied to other segments in the O&NG supply chain.
Our novel contributions are threefold. First, we construct a bottom-up, O&NG production-segment CH4 emissions estimation tool based on the most comprehensive public database of component-level activity and emissions measurements yet assembled. Our approach differs from the GHGI in that it applies modern statistical approaches (bootstrap resampling) to allow for inclusion of infrequent, large emitters, thus robustly addressing the issue of super-emitters. Second, we use this tool to produce an inventory of US O&NG production segment CH4 emissions and compare this with the GHGI and previous site-level results, showing that much of the divergence between different methods at different scales vanishes when we apply our improved dataset and statistical approaches. As mentioned earlier, site-level synthesis studies have been validated against even larger-scale top-down studies, so improved alignment between our method and site-level results suggests much better agreement with top-down results [13], [34]. Third, to isolate specific sources of disagreement between the GHGI and other studies, we reconstruct the GHGI emission factors beginning with the underlying datasets and uncover some possible sources of disagreement between inventory methods and top-down studies. Based on these results, we suggest a strategy for improving the accuracy of the GHGI, and likewise any country using a similar approach in reporting O&NG CH4 emissions to the UNFCCC.

A new bottom-up approach
Bottom-up approaches extrapolate component or equipment emissions rates to large (e.g., national) scales by multiplying emission factors (emissions per component or equipment per unit time) by activity factors (counts of components per equipment, and equipment per well) (Figure 1). Our estimation tool requires two sequential extrapolations, first from the component to the equipment-level, and second from the equipment to the national or regional-level.
The approach utilized in our bottom-up estimation tool begins with a database of component-level direct emissions measurements (e.g., component-level emission factors). We generate component-level emission factor distributions for this study from a literature review building on prior work [11], [30] and adding new publicly available quantified measurements (Table 1 in Methods). Our resulting tool's database includes ~3200 measurements from 6 studies across a 12-fold component classification scheme (see SI-3.2 for further description of this classification scheme). We applied emission factors as reported in the individual studies, with no modifications beyond unit conversion (noting that there are some differences between studies in High Flow Sampler bias correction for gas concentration and flow rate, which may introduce uncertainty to our results). Data for component counts and fraction of components emitting (the ratio of emitting components to all components counted) were scarce, with only 3 studies containing useful information for both ([35]-[37] for component counts and [35], [36], [38] for fraction of components emitting).
We derive equipment-level emission factors for our tool by random re-sampling (i.e., bootstrapping, with replacement) from our component-level database according to component counts per equipment and fraction of components emitting. Source-specific approaches were required for infrequent events (i.e., completions, workovers, liquids unloadings), methane slip from reciprocating engines, and liquid storage tanks (see SI-3.3).
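The resampling step described above can be illustrated with a minimal Python sketch. It is not the study's code: the function name, the two component classes, and all numeric inputs are hypothetical and chosen only for illustration. Each component on a piece of equipment is treated as emitting with probability equal to the fraction emitting, and an emission rate is then drawn with replacement from the pool of measurements.

```python
import random

def draw_equipment_emission(component_counts, fraction_emitting, measurements, rng):
    """Draw one equipment-level emission rate [kg/day] by bootstrap resampling."""
    total = 0.0
    for component, count in component_counts.items():
        for _ in range(count):
            # Each component emits with probability equal to the fraction emitting.
            if rng.random() < fraction_emitting[component]:
                # Resample, with replacement, from the measured emission rates.
                total += rng.choice(measurements[component])
    return total

# Hypothetical illustrative inputs (NOT values from the study's database).
component_counts = {"valve": 10, "connector": 40}
fraction_emitting = {"valve": 0.05, "connector": 0.01}
measurements = {
    "valve": [0.1, 0.5, 2.0, 30.0],   # includes one large emitter
    "connector": [0.05, 0.2, 1.0],
}

rng = random.Random(0)
equipment_ef_distribution = [
    draw_equipment_emission(component_counts, fraction_emitting, measurements, rng)
    for _ in range(10_000)
]
mean_ef = sum(equipment_ef_distribution) / len(equipment_ef_distribution)
```

Because the large measurement stays in the sampling pool, rare high-emitting draws appear in the resulting distribution at roughly the frequency with which they were observed, rather than being averaged away beforehand.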
We then perform a second extrapolation, using our equipment-level emission and activity factors to calculate a 2015 US O&NG production-segment CH4 emissions estimate. For this step, our tool is integrated into the Oil Production and Greenhouse Gas Emissions Estimator (OPGEE; further description can be found in SI-3.1) and parameterized using 2015 domestic well count and O&NG production data (same dataset as Alvarez et al. [13]). A total of ~1 million wells and associated equipment are partitioned and analyzed across 74 analysis bins (SI-4.1). We performed a Monte Carlo uncertainty analysis repeating the bootstrapping algorithm 100 times across all ~1 million wells (SI-4.4). It is worth mentioning that emission factors are often themselves only measured in a few locations, and thus in our extrapolation we assume applicability to other regions.
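A minimal sketch of how repeated simulations might be summarized into a mean and 95% interval follows. The `simulate_total` function and its emission-rate pool are hypothetical stand-ins for a full pass over the well population, and the percentile method is an assumption rather than the study's documented procedure.

```python
import random
import statistics

def simulate_total(rng, n_wells=1000):
    """One hypothetical realization: total emissions summed over all wells."""
    # Hypothetical per-well emission rates [kg/day], heavy-tailed on purpose.
    pool = [0.0, 0.0, 0.5, 1.0, 2.0, 50.0]
    return sum(rng.choice(pool) for _ in range(n_wells))

def monte_carlo_interval(simulate, n_runs=100, seed=0):
    """Repeat the simulation and return (mean, 2.5th, 97.5th percentile)."""
    rng = random.Random(seed)
    totals = [simulate(rng) for _ in range(n_runs)]
    # n=40 yields cut points every 2.5%; the first and last bound the 95% range.
    cuts = statistics.quantiles(totals, n=40, method="inclusive")
    return statistics.mean(totals), cuts[0], cuts[-1]

mean_total, ci_low, ci_high = monte_carlo_interval(simulate_total)
```

An interval built this way reflects only the variability introduced by resampling; it does not capture uncertainty in whether the underlying measurements are representative of other regions.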

Comparison of US production-segment CH4 emissions with site-level studies and the GHGI
We first compare our resulting US 2015 O&NG production-segment CH4 emissions estimate with the GHGI's estimate for 2015 produced in their most recent 2020 inventory [25]. We also validate our bottom-up tool by comparing total emissions and emissions distributions with those generated in site-level synthesis studies (total emissions are compared with Alvarez et al. [13], site-level distributions are compared with Omara et al. [34]).
We estimate mean O&NG production-segment CH4 emissions of 6.3 Tg/yr (5.8-6.9 Tg/yr, at 95% confidence interval, CI) (Fig. 2a; note that the CI is relatively narrow because it captures only uncertainty due to resampling). Our mean, production-normalized emissions rate from the production segment is 1.3% (1.2-1.4% at 95% CI, based on gross NG production of 32 trillion cubic feet and an average CH4 content of 82% [39], [40]), slightly lower than Alvarez et al. [13], [34], who estimate 1.5% (applying the same denominator as above). Both our bottom-up component-level inventory results and the Alvarez site-level results are approximately 2x those of the GHGI estimate of 3.6 Tg/yr (year 2015 data [25], excludes offshore systems) for the O&NG production segment. Interestingly, the difference in US production-segment emissions between this study and the GHGI is approximately the same volume as our estimate of contribution from super-emitters (top 5% of emissions events). Given that our results match the Alvarez et al. site-level results, we conclude that the divergence between the GHGI and top-down/site-level studies is not likely to be due to any inherent issue with the bottom-up approach. Figures 2b and 2c show that site-level distributions developed using our model match empirical distributions from the site-level synthesis study of Omara et al. [34]. To report our results on a basis consistent with site-level studies (recalling that sites can contain more than one well), we cluster equipment-level emissions outputs into production sites (SI-4.3). Several other observations from our simulations are of interest. First, our modeled emissions per site are higher at liquids-rich sites versus gas-rich sites (Figure S29), in alignment with recent field measurement campaigns in both Canada and the United States [41], [42]. Second, our model recreates the trend demonstrated by Omara et al. wherein low-producing sites exhibit higher production-normalized emissions rates [34] (Figure S30).
Finally, the tail of our modeled distribution closely matches the tail of the empirical Omara et al. distribution (Figure 2b and Figure S28). This is of particular interest, given that recent papers assert the divergence between the GHGI and sitelevel studies is mostly due to an inability of the bottom-up methods to capture super-emitters [32], [42]. Our results clearly show that a modern dataset with proper bootstrap resampling techniques can recreate observed super-emitters.
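The production-normalized emissions rate reported earlier in this section can be checked with a short calculation. The gross production and CH4 content are taken from the text; the CH4 density of ~19.2 g per standard cubic foot is our assumption (it depends on the reference temperature and pressure) and is not stated in the source.

```python
GROSS_GAS_FT3_PER_YR = 32e12     # gross NG production [ft3/yr], from the text
CH4_VOLUME_FRACTION = 0.82       # average CH4 content, from the text
# ASSUMPTION: CH4 density of ~19.2 g per standard cubic foot (about 60 F, 1 atm).
CH4_DENSITY_G_PER_FT3 = 19.2
GRAMS_PER_TG = 1e12

def production_normalized_rate(emissions_tg_per_yr):
    """Fraction of produced CH4 that is emitted."""
    ch4_produced_tg = (
        GROSS_GAS_FT3_PER_YR * CH4_VOLUME_FRACTION * CH4_DENSITY_G_PER_FT3
    ) / GRAMS_PER_TG
    return emissions_tg_per_yr / ch4_produced_tg

rate = production_normalized_rate(6.3)
print(f"{rate:.2%}")
```

Under this density assumption the result falls between 1.2% and 1.3%, close to the reported 1.3%; a slightly different standard-condition convention shifts the result by a few hundredths of a percentage point.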
Because our method uses a component-level, bottom-up approach, we can investigate the source of differences with the GHGI. This cannot be done with site-level data. Relative to the GHGI, contributions from equipment leaks in our estimate are larger by ~1.3 Tg CH4 and tank leaks and venting by ~2.1 Tg CH4 (Figure 3). Together, these two sources contribute over half of total O&NG production-segment CH4 emissions. The increase in estimated emissions from equipment leaks compared to the GHGI is due to our updated emission factor; we know that the difference is not due to equipment-level activity factors because ours are nearly identical to the GHGI (see SI-2.3). In the next section we will perform a deeper investigation into both component-level emissions data for equipment leaks and tank modelling as underlying contributors to differences between our results and the GHGI.

Figure 2: (a) Total US 2015 O&NG production-segment CH4 emissions from this study compared with the site-level estimate of Alvarez et al. (Table S3 in [13] minus contributions from offshore platforms and abandoned wells) and the GHGI [25], including the fraction estimated from super-emitters (top 5% of sources) and the 95% confidence interval. We also compare probability distributions of our component-level simulations (red lines), aggregated into site-level emissions, with site-level results of Omara (blue line): (b) Cumulative distribution plot (CDF) describing the fraction of well-sites with emissions below a given amount, and (c) probability distribution of emissions rate per well-site with the mean (filled square), median (x), and 95% confidence intervals shown above the plots. Results of this study are presented using 100 Monte Carlo simulations. Because of the large number of sampled sites, the Monte Carlo simulations all converge toward the same size distribution in panels (b) and (c).

Figure 3: Source-level comparison of emissions between this study and the GHGI [25]. Inset pie charts illustrate individual source-specific contributions of our inventory to equipment leaks (left pie chart) and tanks (right pie chart). Discrepancies with the GHGI are dominated by liquid hydrocarbon tank leaks and venting ("tanks", ~2.1 Tg/yr CH4) and equipment leaks (~1.3 Tg/yr CH4). Details regarding the modelling of tank emissions sources are given in SI-3.3. Results in tabular form are given in Table S2 and Table S3.

Main sources of GHGI underestimation
Given that our new component-level method is validated by the empirical results from site-level field studies, can we explain why the GHGI produces lower O&NG production-segment CH4 emissions estimates? Results from our modelling (Figure 3), in addition to recent revisions by the GHGI and other analyses (SI-5.1), suggest that the downward bias of the GHGI is not due to pneumatic controllers, liquids unloadings, or completions and workovers because either the divergence is small or absolute emissions are small, or both. Methane slip in reciprocating engines is higher in the GHGI, although the overall magnitude of the difference is small. The combustion emission factor used in the GHGI for methane slip from reciprocating gas engines is based on a 1991 TRANSDAT dataset published by the Gas Research Institute [43]. The difference compared to our study is probably explained by substantial improvement in engine emissions since publication of that report (based on manufacturer reported specifications for reciprocating gas engines [44]). For these reasons, this paper focuses its analysis on the two largest sources of GHGI underestimation compared to our validated method: equipment leakage and liquid hydrocarbon storage tanks, whose emissions are 1.3 and 2.1 Tg CH4 lower than our estimates, respectively. See SI-1.1 for definitions of each emissions source.
The GHGI constructs emission factors for equipment-level leaks using an approach very similar to ours, where emission factors of individual components are aggregated according to estimated counts of components per piece of equipment. To explore differences in equipment leak estimates, we decompose equipment-level emission factors into the constituent parts: Component-level emissions data, component counts, and fraction of components emitting (the relationship between these parameters is defined in Figure 4).
Reconstructing equipment-level, equipment leakage emission factors from the GHGI is complicated by the fact that the underlying studies from the 1990s [35], [45] are at a more detailed level than the GHGI itself. For example, the underlying data for natural gas system emission factors are subdivided by region (e.g., Western gas versus Eastern gas), and for petroleum systems data are subdivided by product stream (e.g., light oil versus heavy oil). Equipment-level emission factors for gas systems, for example, are a weighted average of both Western emission factors and Eastern emission factors. The GHGI approach to aggregating these factors to overall values for natural gas and petroleum systems is described in SI-5.2.
We demonstrate differences in equipment-level emission factors for equipment leaks via a decomposition into constituent factors for a single example (equipment type and region): leakage from gas wells in the West (Figure 4), with equipment leaks from all other sources similarly described in the SI (Figure S18 - Figure S26). The difference between our study's equipment-level equipment leakage emission factor for Western natural gas wells and that of the GHGI (the difference to be explained by decomposition) is ~5x (3.4 kg/day versus 0.7 kg/day). The underlying factors are plotted in Figure 4.
First, we compare component-level emission factors, defined as the average emissions rate of leaking components (Figure 4a). (Note that the "average emission rate of leaking components" is not the same as an average emission rate for all components.) For Western gas and petroleum systems in the GHGI, component-level leakage emission factors are constructed using a method referred to by the EPA [46] as the "EPA correlation approach" (defined in detail in SI-5.2.2). In this approach, emission factors are constructed from a dataset of various facilities including oil and gas production sites, refineries, and marketing terminals (n = 445, data compiled in the EPA Protocol document [46]). The difference between our study's component-level emission factors and the GHGI for connectors, valves, and open-ended lines (the components comprising the wells) is ~7x, 6x, and 5x respectively (Figure 4a). Note that the decomposition in Figure 4a is limited to connectors, valves, and open-ended lines (the three components inventoried by the GHGI) although our inventory also accounts for pressure relief valves, regulators, and other (miscellaneous) components on wells. The fact that GHGI equipment-level emission factors are based upon only three component types (when more component classes exist, according to our database) will contribute to some underestimation. Similar results are found across all equipment categories compared to the GHGI. In general, in our dataset, component-level emission factors are higher [5x to 46x comparing our emission factors for connectors, valves, and open-ended lines across all GHGI categories, see Figure S18 - Figure S26], the fraction of components emitting is lower [1x to 0.05x], and the number of components per piece of equipment is generally, but not always, higher [0.2x to 20x comparing our emission factors for wells, separators, and meters across all GHGI categories, see Figure S18 - Figure S26]. 
Considering the decomposition presented here, along with the rest in the SI (Figure S18 - Figure S26, plus some discussion of smaller factors not described here), we can explain much of the overall underestimation of the GHGI compared to our results for the equipment leaks source category.
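The decomposition logic can be sketched as a product of three factors summed over component types. The functional form below is our reading of the relationship among component counts, fraction of components emitting, and average emission rate of leaking components; the two inventories and all numbers are hypothetical and serve only to show how counteracting biases combine.

```python
def equipment_emission_factor(components):
    """Sum count x fraction emitting x leaker emission factor over component types."""
    return sum(
        c["count"] * c["fraction_emitting"] * c["leaker_ef"]
        for c in components.values()
    )

# Hypothetical inventories (NOT values from this study or the GHGI).
# leaker_ef is the average emission rate of leaking components [kg/day].
inventory_a = {"valve": {"count": 10, "fraction_emitting": 0.10, "leaker_ef": 0.5}}
inventory_b = {"valve": {"count": 20, "fraction_emitting": 0.05, "leaker_ef": 3.0}}

ef_a = equipment_emission_factor(inventory_a)
ef_b = equipment_emission_factor(inventory_b)

# Per-factor ratios: counts 2x, fraction emitting 0.5x, leaker emission factor 6x.
# The first two cancel, so the overall ratio is driven by the leaker emission factor.
ratio = ef_b / ef_a
```

In this toy case a lower fraction emitting offsets a higher component count, so the difference in the equipment-level factor traces back almost entirely to the component-level emission factor.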

Figure 4: Example decomposition of the equipment-level emission factor for Western US gas wells (note that units differ for each panel and that the scale is logarithmic, so visible differences between points often span orders of magnitude). This study's equipment-level emission factor (d) is decomposed into constituent parts and compared with the GHGI. Constituent parts include: component-level emission factors (a), fraction of components emitting (b), and component counts (c). When multiplied together, these factors have counteracting biases, with component-level emission factors and component counts contributing to higher emissions in our study versus the GHGI, and fraction of components emitting contributing to lower emissions in our study. Note that in actual usage in the GHGI, equipment-level emission factors for gas systems are a weighted average of both Western systems (API 4598, [47]) and Eastern gas systems (Star Environmental, [45]). Here, for illustration purposes, we only show constituent data for Western gas systems; results for Eastern gas systems are reported in SI Section 5. Further, we also limit this figure to connectors, valves, and open-ended lines (the three components inventoried by the GHGI) although our inventory also accounts for pressure relief valves, regulators, and other (miscellaneous) components on wells.
The second source of significant divergence between this study and the GHGI for US CH4 emissions in the O&NG production-segment is emissions from liquid hydrocarbon storage tanks. The EPA GHGI constructs storage tank emissions estimates using Greenhouse Gas Reporting Program (GHGRP) data. The GHGRP is a program which collects emissions data from industrial facilities, where requirements for natural gas and petroleum systems are specified by Title 40 of the Code of Federal Regulations, Part 98, Subpart W [48]. Based on GHGRP data for storage tanks (see methods in SI-5.3), we decompose total emissions for the GHGI into tank counts and emission factors, allowing us to draw comparisons to results from this study.
Before presenting our decompositions, it is worth noting two key differences in modelling of emissions from liquid hydrocarbon storage tanks between our study and the GHGI (see further description of how our model estimates tank emissions in SI-3.3.2). First, whereas our model is based on direct measurements, the GHGI is based on operator reported simulations from software programs such as API E&P Tank or AspenTech HYSYS [49], [50]. Second, as a consequence of these differing approaches, whereas our emissions are classified based on measurement source (e.g., vent stack, thief hatch, etc.) GHGI emissions are classified according to the simulated process (e.g., flash emissions). As a consequence of these differences in emissions classification, comparisons between decompositions of our study versus the GHGI will be imperfect.
With this in mind, we define emission factors in our decomposition as the summation of intentional emission factors and unintentional emission factors (Figure 5). We demonstrate the decomposition in Figure 5 for petroleum systems (see Figure S28 in the SI for natural gas systems). Note that flash emissions will only occur at controlled tanks, while unintentional emissions from thief hatches, holes, or pressure-relief valves could occur at either controlled or uncontrolled tanks. Figure 5 and Figure S28 demonstrate that, while several factors contribute to differences, differences in emission factors for various unintentional emissions sources are the greatest source of divergence between this study and the GHGI. Unintentional emission factors are the product of (i) average emissions rate per event, and (ii) frequency of unintentional emissions events per tank. Both of these values are approximately an order of magnitude higher for our study as compared to the GHGI, contributing to the nearly two orders of magnitude difference in total emissions.
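The structure of the tank emission factor can be made concrete with a small sketch. The function and parameter names are ours, and the input values are hypothetical; the point is only that two factors which each differ by about an order of magnitude compound into roughly two orders of magnitude.

```python
def tank_emission_factor(intentional_ef, rate_per_event, events_per_tank):
    """Total emission factor per tank: intentional plus unintentional terms."""
    # Unintentional term = average emissions rate per event x events per tank.
    unintentional_ef = rate_per_event * events_per_tank
    return intentional_ef + unintentional_ef

# Hypothetical values (NOT taken from this study or the GHGI), with the
# intentional term set to zero to isolate the unintentional contribution.
low_case = tank_emission_factor(0.0, rate_per_event=1.0, events_per_tank=0.02)
high_case = tank_emission_factor(0.0, rate_per_event=10.0, events_per_tank=0.2)

# A 10x higher rate combined with a 10x higher frequency gives ~100x.
ratio = high_case / low_case
```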
Our findings suggest that both the magnitude and frequency of unintentional emissions sources could contribute to significant underestimation in the GHGI. Due to the limited quantified, component-level data available on tank emissions (based upon safety and accessibility issues) our tank emissions measurements come from a single study in a single geographic area (Eastern Research Group in the Barnett shale, [51]). Therefore, more studies are required to provide a comprehensive view of tank emissions.
However, while quantified emissions data for tank sources are scarce, the existence of unintentional emissions from tanks (due to open thief hatches, rust-related holes, pressure-relief valves, etc.) has been corroborated by numerous ground and aerial surveys [42], [52]- [54]. Several of these studies are summarized in Table S26. Taken together, these studies provide further evidence that: (i) high emissions events are frequently observed at storage tanks, not just from vents but also at open thief hatches and pressure relief valves, (ii) these high emissions events are common at both controlled tanks and uncontrolled tanks, (iii) the frequency (events/tank) of unintentional emissions events is much higher than the rate suggested by the EPA (2%, see Figure 5c) for malfunctioning separator dump valves.

Discussion
Development of accurate inventories at the equipment-level is critical for targeting CH4 mitigation strategies. US government agencies [26], environmental groups [55], [56], and researchers [57] rely on inventory data for policy design, cost analysis, formulation of leak detection and repair programs, and life-cycle assessment research. However, recent studies have emphasized a ~1.5x-2x divergence between the EPA GHGI estimates of CH4 emissions from O&NG and those estimated from field measurements at different spatial scales. This suggests an opportunity for improvement in the GHGI approach.
In this study we develop a component-level, bottom-up approach validated by previous site-level estimates of US 2015 CH4 emissions from the production segment of the O&NG sector. Consistent with site-level findings, our estimate is ~1.8 times that of the GHGI. The strength of our approach is that by developing our estimate using component-level data, we can diagnose at the equipment-level the key sources contributing to the GHGI underestimation. Our detailed decomposition identifies (i) underlying equipment-leak measurements and (ii) neglect of the contribution of unintentional emissions events at tanks (e.g., liquid hydrocarbon tank "thief hatches") as the most important contributors to the underestimation.
These results demonstrate that the bottom-up methodology is a valid approach to produce accurate emissions estimates and that improvements to inventory methods are possible. We make several recommendations:
• Improvements to equipment leak emission factors can be implemented relatively easily. This study applies a very similar approach to the GHGI, albeit using a more comprehensive set of data on component-level emission factors, fraction of components emitting, and component counts. We can only speculate on why differences exist between our dataset and the GHGI dataset, but based on the fact that our dataset is larger and contains more recent measurements, we suggest that it is likely to be more representative of today's conditions.
• Improvements to crude and condensate storage tank emission factors will be more difficult. Differences between our emissions estimate and the emissions estimate of the GHGI are believed to be largely a result of the GHGI neglecting emissions from failed tank controls (e.g., open thief hatches). Although we attempt to estimate their contribution, and reference supporting site-level surveys, a significant data gap exists in this area.
• Regular efforts to validate equipment-level emission factors by comparing existing or new emission factors with measurements from randomly sampled sources at different spatial scales would also improve accuracy and "build in" to inventory efforts the ability to correct data over time.
The results of this study are also relevant globally. All parties to the UNFCCC submit annual inventories, generated using a bottom-up approach, to report on progress towards GHG targets. The IPCC's Guidance Document on Emission factors outlines three approaches towards producing an inventory, with the simplest approach (Tier 1) based on IPCC default emission factors [27], [58]. Default emission factors for the petroleum and natural gas systems production-segment are based upon the same underlying data sets as the GHGI. This means that, in addition to the US-submitted GHGI, other countries using Tier 1 emission factors will be reporting CH4 estimates based on data that we have found likely to underestimate actual emissions. The recommendations offered herein, if implemented, would therefore improve emissions estimates globally.
Improvements offered in this study are thus potentially directly applicable to the UNFCCC inventory method and any country directly reporting emissions estimates to the UNFCCC. Our study suggests an approach which can be applied to prepare a more accurate inventory.

Methods:
Here, we describe the methodological aspects of each of this study's three key contributions: (i) tool development, (ii) generating a US CH4 estimate for the O&NG production-segment, and (iii) decomposing GHGI emission factors. Our methods are also described in greater detail in the Supplementary Information (SI). Datasets and code are available in a GitHub repository: https://github.com/JSRuthe/O-G_Methane_Supporting_Code

Tool development
Tool structure
The analysis platform for this study is the methane venting and fugitives (VF) subroutine embedded within the Oil and Gas Production Greenhouse Gas Emissions Estimator (OPGEE version 3.0). This subroutine processes equipment-level emissions distributions and well and production values and produces gross emissions estimates.
The following equation describes the methane VF subroutine:

$$E_i = \sum_{j=1}^{N_i} \sum_{k} EF_{i,j,k} \cdot af_k$$

where $E_i$ [kg/day] denotes the emissions of field $i$ and $N_i$ the number of wells in that field. Here, a "field" represents a subpopulation (or bin) of wells that share similar production characteristics (e.g., gas to oil ratio). This binning was necessary because OPGEE generates outputs (carbon intensity or methane leakage rate) on a "field" basis. For each field, $i$, emissions are calculated well-by-well. For a single well, $j$, equipment-level emissions are calculated by multiplying a randomly drawn emission factor, $EF_{i,j,k}$ [kg/equipment/day], by its respective activity scaling factor, $af_k$ [# equipment/well]. Because we iterate across wells, there is no need to explicitly multiply the activity scaling factor by well count (see SI section 3.4). Emissions are calculated across all equipment classes, $k$.
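The well-by-well calculation described here can be sketched in a few lines of Python. This is an illustrative outline rather than the OPGEE implementation: the function name, the equipment classes, and the numeric inputs are hypothetical.

```python
import random

def field_emissions(n_wells, ef_distributions, activity_factors, rng):
    """Total emissions [kg/day] for one field, accumulated well by well."""
    total = 0.0
    for _ in range(n_wells):                                # well j
        for equipment, af in activity_factors.items():      # equipment class k
            # Randomly drawn emission factor [kg/equipment/day] for this well.
            ef = rng.choice(ef_distributions[equipment])
            # Scale by the activity factor [# equipment/well].
            total += ef * af
    return total

# Hypothetical inputs (NOT values from the study).
ef_distributions = {"well": [0.0, 0.0, 1.5, 12.0], "separator": [0.0, 0.8, 3.0]}
activity_factors = {"well": 1.0, "separator": 0.6}

emissions = field_emissions(500, ef_distributions, activity_factors, random.Random(0))
```

Because the outer loop visits every well, the well count enters through the number of iterations, which is why the activity factor is not multiplied by well count separately.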

Database on component level studies
Our equipment-level emission factors are generated with a component-level measurement database. We conducted a detailed literature review to inform the database for this study. This review built on prior work done for Brandt et al. [11], [30] and adds new publicly available component-level measurements. Studies were reviewed for information regarding: (i) data on quantified emissions volumes per emitting component or source, (ii) activity counts for numbers of components per piece of equipment or per site, and (iii) data on fraction of components found to be emitting in a survey.
Quantified emissions data were further filtered for: (i) data collected within the production (upstream) segment, and (ii) data collected in the United States (although we do include some component count and fraction leaking data from Canada, see further details in SI-3.2). A total of 6 studies and ~3200 measurements met our inclusion criteria (see Table 1).
To aggregate the data from the various studies, we developed a 12-category component classification scheme and an 11-category equipment classification scheme. For components these include: threaded connections and flanges, valves, open-ended lines, pressure-relief valves, compressor seals, regulators, pneumatic controllers/actuators, chemical injection pumps, tank vents, tank thief hatches, tank pressure-relief valves, and other (miscellaneous) components. For equipment these include: well, header, heater, separator, meter, tanks - leaks, tanks - vents, reciprocating compressor, dehydrator, chemical injection pump, and pneumatic controller/actuator (note that the "tanks - leaks" category tracks all non-vent/hatch emissions on a tank (e.g., connectors, valves, etc.) while the "tanks - vents" category tracks all vent/hatch related emissions).
To align the categories of components used by the authors of a study to our common component definitions, we create a set of "correspondence matrices" to perform consistent matrix transformations (see SI-3.2.5).

Table 1 (partially recovered; only the final row and footnotes survive): [37], Canada, N, Y. NR = not reported. 1 Given that leakage data was taken in Canada, we limit usage of this data to component counts.

In addition to component-level emissions measurements, we also require component counts and fraction of components emitting. A total of 3 studies contained information on component counts [35]-[37], and we aligned the data into our standard categories. Data on fraction of components emitting were also scarce, with 3 studies containing useful information [35], [36], [51]. The fraction emitting rate is an important parameter in deriving equipment-level emission factors, but varies greatly by study due to (i) differences in screening methods between studies (e.g., Method 21 vs. infrared camera) and (ii) use of different screening sensitivity to assign a component to the emitting state (10 ppmv vs. 10,000 ppmv). Therefore, based on the technologies employed, different studies may be sampling different parts of the "true" population emissions distribution.
In order to ensure that we are not over or under-sampling a subset of the true distribution, we split our dataset at 10,000 ppmv (see reasons for this threshold in SI-3.2.4). Different quantified emissions bins and fraction emitting values were derived for the two halves.

Equipment-level emission factors
We required a variety of approaches to describe the different sources of emissions. The most common approach taken by this study, utilized for fugitive leaks and most vents, is a "stochastic failure" approach. In the stochastic failure approach we combine component-level emissions data, component counts, and fraction emitting values to produce equipment-level emission factors. These emission factors take the form of distributions which are generated by iteratively resampling our emissions datasets (see SI-3.3.1).
For each equipment category, we iterate across component categories and draw emissions measurements according to a probability specified by the fraction emitting value. Because we split our dataset at 10,000 ppmv, we develop two sets of emission factors: one below the threshold, describing quantified emitters that were missed by optical gas imaging but caught with Method 21, and one above the threshold, describing emitters that were caught with optical gas imaging. These two emission factor distributions are superposed to form our best approximation of the true emissions distribution (SI-3.2.4).
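The resampling loop described above can be sketched as follows. This is a minimal illustration for a single subset of the data, not the implementation documented in SI-3.3.1: the function names, the example component counts, the fraction emitting values, and the measured rates are all invented, and the treatment of the two subsets split at 10,000 ppmv is not shown.

```python
import random


def draw_equipment_emission(component_counts, fraction_emitting, measurements, rng):
    """Return one simulated emission rate for a single piece of equipment."""
    total = 0.0
    for component, count in component_counts.items():
        for _ in range(count):
            # A component is in the emitting state with probability equal
            # to the fraction emitting value for its category.
            if rng.random() < fraction_emitting[component]:
                # Bootstrap step: resample a measured rate with replacement.
                total += rng.choice(measurements[component])
    return total


def emission_factor_distribution(component_counts, fraction_emitting,
                                 measurements, n_draws=1000, seed=0):
    """Build an equipment-level emission factor distribution by resampling."""
    rng = random.Random(seed)
    return [
        draw_equipment_emission(component_counts, fraction_emitting, measurements, rng)
        for _ in range(n_draws)
    ]


# Invented inputs for a hypothetical separator (rates in kg CH4/day).
counts = {"valves": 10, "connectors": 40}
fractions = {"valves": 0.05, "connectors": 0.01}
rates = {"valves": [0.2, 1.5, 4.0], "connectors": [0.1, 0.3]}

distribution = emission_factor_distribution(counts, fractions, rates)
mean_emission_factor = sum(distribution) / len(distribution)
```

Keeping the full list of draws, rather than only its mean, preserves the skew of the underlying measurements in the resulting equipment-level distribution.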
We applied separate approaches for flashing emissions from tanks, methane slip from reciprocating compressors, and intermittent and startup losses from liquids unloading, completions, and workovers. These approaches are described in SI sections 3.3.2, 3.3.3, and 3.3.4, respectively.

Equipment-level activity factors
In the GHGI, direct equipment counts are not available for every year. As an approximation, the GHGI uses "activity drivers" such as gas production, number of producing wells, or system throughput. Activity drivers are multiplied by a scaling factor (e.g., separators per well) derived from a subsample of the population. For each piece of equipment, we employ well counts as the activity driver. Since the 2018 GHGI, the EPA has calculated activity factors for most equipment using scaling factors based on GHGRP data. Scaling factors based upon reporting year 2015 equipment counts are multiplied by year-specific wellhead counts to calculate year-specific equipment counts [62].
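The scaling arithmetic described above amounts to multiplying a per-well scaling factor by a year-specific well count. A minimal sketch, using invented scaling factors and an invented well count rather than GHGRP values:

```python
def equipment_counts(well_count, scaling_factors):
    """Estimate equipment counts as (equipment per well) x (number of wells)."""
    return {name: per_well * well_count for name, per_well in scaling_factors.items()}


# Invented scaling factors (equipment per well) and an invented well count.
scaling_factors = {"separator": 0.5, "heater": 0.25}
counts_by_equipment = equipment_counts(1000, scaling_factors)
```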

Development of representative "fields" for analysis
In OPGEE, fields are described with over 50 primary input parameters and numerous secondary parameters. Given that we are restricting our analysis to methane leaks and vents in the upstream sector, however, we only concern ourselves with a handful of inputs: oil production, well count, gas-to-oil ratio (GOR), and methane mole fraction. The 2015 well count and production data (Table S31) were based on the dataset from Alvarez et al. [13], which were originally derived from Enverus and filtered to remove offshore and inactive wells (~6,000).
To account for the heterogeneous nature of petroleum and NG systems, the total population was divided into several simulation sub-populations (or "bins") according to the production GOR (where gas wells have a GOR > 100 mscf/bbl, [63]), gas productivity, and liquids unloading method. Sixty bins were developed for natural gas systems and 14 bins for petroleum systems (see SI-4.1).

Uncertainty analysis
This study applies the Monte Carlo method to estimate uncertainty. Input parameters (component-level emission factors, component counts, and fraction of components emitting) are assigned distributions, and the range of uncertainty in these distributions is propagated through the model. Therefore, the full range of uncertainty is captured to the extent that these distributions encompass the full set of possible values.
A single OPGEE simulation will produce an estimate of total US CH4 emissions, but it will not output a distribution. We therefore run OPGEE 100 times (100 Monte Carlo iterations), each using a different set of equipment-level emission factor distributions (further description in SI section 4.2.1). In producing variable equipment-level emission factor distributions, component counts and fraction of components emitting are approximated as uniform distributions between the maximum and minimum values found in our surveyed studies (see Table S5 and Table S6 for component counts and Table S10 for fraction leaking). Unfortunately, our sparse dataset does not allow us to determine a likely distribution shape for these parameters.
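The propagation of uniform input uncertainty through repeated model runs can be sketched as below. The function `placeholder_model` is a deliberately simple stand-in (an expected-value product), not OPGEE, and the input ranges and mean emission rate are invented for illustration. Drawing component counts from a continuous uniform distribution is also an assumption of this sketch.

```python
import random


def placeholder_model(component_count, fraction_emitting, mean_rate):
    """Stand-in for a full model run: expected emissions per piece of equipment."""
    return component_count * fraction_emitting * mean_rate


def run_monte_carlo(count_range, fraction_range, mean_rate, n_iter=100, seed=0):
    """Propagate uniform input uncertainty through the placeholder model."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):
        # Uniform draws between the minimum and maximum surveyed values.
        count = rng.uniform(*count_range)
        fraction = rng.uniform(*fraction_range)
        results.append(placeholder_model(count, fraction, mean_rate))
    return sorted(results)


def percentile(sorted_values, p):
    """Simple rank-based percentile for 0 <= p <= 1 on a sorted list."""
    index = min(len(sorted_values) - 1, int(p * len(sorted_values)))
    return sorted_values[index]


# Invented ranges: 20-60 components, 1%-5% emitting, 2.0 kg CH4/day mean rate.
results = run_monte_carlo((20, 60), (0.01, 0.05), mean_rate=2.0)
interval = (percentile(results, 0.025), percentile(results, 0.975))
```

With only the minimum and maximum known, a uniform draw avoids asserting a distribution shape that the data cannot support, at the cost of weighting extreme values as heavily as central ones.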

Comparison with the EPA GHGI

Equipment leakage
The construction of equipment-level emission factors in the GHGI is rooted in several studies conducted in the 1990s. We review these studies and trace how emission factors in today's GHGI are derived from these earlier analyses. The modelling approach of the early 1990s studies is closely related to the approach in this paper, in that equipment-level emission factors are calculated from component-level emissions measurements and counts. By gathering the underlying datasets used to construct the GHGI's equipment-level emission factors we can generate component-level distributions for comparison with the distributions of our study.
The GHGI relies on a 1996 report by the Gas Research Institute ([64], henceforth referred to as the "GRI report") for natural gas systems and a 1996 calculation workbook by the American Petroleum Institute ([65], henceforth referred to as "API 4638") for petroleum systems. These reports were not measurement campaigns; rather, they summarized the results of multiple earlier works. The GRI report references API 4589 ([35], sites 9-12) for the Western US natural gas system and Star Environmental [45] for the Eastern US natural gas system. API 4638 references data from API 4589 (sites 1-8). Therefore, only two measurement campaigns underlie GHGI equipment leakage: the API 4589 and the Star Environmental datasets.
We first analyze the screening data in API 4589 and Star Environmental, following the methodologies outlined in SI sections 5.2.2-5.2.4. For API 4589, screening concentrations from Appendix C were scanned and tabulated. Unfortunately, it was not possible to re-derive the component-level emission factors in the Star Environmental dataset, for two reasons. First, the Eastern leak quantification data (provided in Appendix F, [45]) do not identify the components measured, so quantified emissions cannot be connected to the screening values contained in Appendix E. Second, the Eastern dataset does not report how leak volumes were assigned to the 81 instrument readings > 10,000 ppmv that were not quantified with the Hi Flow sampler. Therefore, component-by-component distributions can only be generated for API 4589.
After digitization and re-engineering of the GHGI methods, we can compare the distributions of the resulting component-level estimates with our dataset (Figure 4, with additional comparisons in SI Section 5.2.5).

Tank emissions
To reconstruct emission factors for crude and condensate storage tanks, we begin by downloading GHGRP data from the "Envirofacts GHG Customized Search" tool [66]. After gathering the data, we divide the dataset by product stream (natural gas, petroleum systems) and tank class. However, before making any comparisons with this study, we need to adjust how emission factors are reported by the GHGI. The GHGI reports storage tank emission factors on a throughput basis (kgCH4/bbl/year), whereas our study reports emission factors on a tank basis (kgCH4/tank/day). Fortunately, in addition to tank throughput, atmospheric storage tank counts per sub-basin are also reported to the GHGRP by tank class.
Emission factor distributions (Figure 5) are calculated by dividing total emissions by tank count for every sub-basin (or row in the downloaded dataset). See SI Section 5.3 for additional details on this calculation. In SI Section 5.3, we also validate this approach by calculating and comparing throughput-basis emission factors with those reported in the GHGI.
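The per-row conversion to a tank-basis emission factor can be sketched as follows. The field names `ch4_tonnes_per_year` and `tank_count` are hypothetical, and the sketch assumes that emissions are given in metric tons of CH4 per year and that a year has 365 days; the actual column names and units in the downloaded GHGRP data should be checked against SI Section 5.3.

```python
def tank_emission_factors(rows):
    """Return kg CH4/tank/day for each sub-basin row with at least one tank."""
    factors = []
    for row in rows:
        if row["tank_count"] <= 0:
            # Skip rows without tanks to avoid dividing by zero.
            continue
        kg_per_year = row["ch4_tonnes_per_year"] * 1000.0  # assumed unit: t CH4/yr
        factors.append(kg_per_year / row["tank_count"] / 365.0)
    return factors


# Invented sub-basin rows for illustration only.
example_rows = [
    {"ch4_tonnes_per_year": 36.5, "tank_count": 10},
    {"ch4_tonnes_per_year": 5.0, "tank_count": 0},
]
example_factors = tank_emission_factors(example_rows)
```

Each retained row contributes one value, so the list of factors across all sub-basins forms the empirical distribution that is compared with the tank-basis factors of this study.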