Can machine learning improve carbon storage? Synergies of deep learning, uncertainty quantification and intelligent process control

As we transition from fossil fuels to renewable energy, negative emission technologies, such as carbon capture and storage (CCS), can help us reduce CO2 emissions. Effective CO2 storage requires: (1) detailed site characterization, (2) regular, integrated risk assessment, and (3) flexible design and operation. We believe that recent advances in machine learning, coupled with uncertainty quantification and intelligent process control, can help us with these tasks and thus improve the efficiency and safety of subsurface CO2 storage.

Plain Language Summary
We are emitting a lot of CO2 to the atmosphere to produce electricity and heat, manufacture and transport goods, and construct buildings. To reduce these emissions in the short term, we can capture and store CO2 in subsurface reservoirs. Generally, successful CO2 storage needs three things: (1) a sealed reservoir, (2) regular monitoring, and (3) flexible operation. Recent advances in machine learning can help us with these tasks and thus improve subsurface CO2 storage.


Introduction
Recent IPCC reports on climate change stress the importance of negative emission technologies, such as carbon capture and storage, in limiting the amount of CO2 in the atmosphere [Masson-Delmotte et al., 2018]. Carbon capture and storage describes the technology of limiting CO2 emissions from fossil-fuel combustion and industrial production by capturing, transporting and storing CO2 in the subsurface. Carbon capture and storage is a direct emission mitigation system that can help us transition from fossil fuels to low-carbon energy, but it is currently lagging far behind its ambitions, with only a handful of commercial projects (e.g. Sleipner, In Salah, Snøhvit and Quest) exploring subsurface CO2 storage [Eiken et al., 2011]. These projects highlight at least three critical components of successful CO2 storage: (1) detailed geological and geomechanical site characterization, (2) regular risk assessments based on the integration of multiple different datasets, and (3) flexibility in the design and operation of the capture, compression, and injection system [Ringrose et al., 2013]. We believe that recent advances in machine learning, coupled with uncertainty quantification and intelligent process control, can help us with these tasks, improving the efficiency and safety of subsurface carbon storage.

Machine learning
Broadly speaking, machine learning involves methods to extract information (e.g. trends, patterns) from data; uncertainty quantification involves the characterisation of uncertainty in the presence of limited data; and intelligent process control leverages sensory data for real-time feedback and closed-loop control of the process system (e.g. subsurface CO2 injection).
Key advances in computing (e.g. GPUs, cloud infrastructure), algorithm design (e.g. back-propagation, deep neural networks) and low-cost embedded computing devices for gathering large datasets have led to large improvements in model prediction accuracy on standard benchmarks (e.g. ImageNet, CIFAR10) [Krizhevsky et al., 2012, Jordan and Mitchell, 2015]. This leap has triggered a series of revolutionary developments in the fields of (1) computer vision, (2) natural language processing, and (3) artificial intelligence. In particular, developments in computer vision involving image classification, segmentation and object detection offer enormous potential for applications in geoscience [Bergen et al., 2019], where most datasets (e.g. maps, cross-sections, 3-D models) consist of images. With large amounts of data becoming publicly available (e.g. USGS [Triezenberg et al., 2016], NPD [NPD, 2019], GA [GA, 2019]), we can start exploring these datasets for prospective carbon storage sites using machine learning.

Carbon storage
Effective carbon storage requires large-scale, long-term storage of CO2 in the pore space of subsurface reservoir rocks. A typical life cycle of a carbon storage project begins with the assessment of potential storage sites, e.g. existing oil and gas fields as well as saline aquifers. This initial assessment typically involves the identification of potential reservoirs and seals in seismic reflection data (i.e. acoustic images of the subsurface). While seismic images differ from natural images, we can adapt machine learning models (e.g. deep convolutional networks) typically used for image classification, object detection or semantic segmentation to identify geological structures, such as CO2 reservoirs [Wrona et al., 2018, Waldeland et al., 2018, Wrona et al., 2021]. Trained on data labelled by expert geologists, these models could drastically reduce the time required and the uncertainty involved: once trained, they allow us to identify hundreds to thousands of potential CO2 storage sites in seconds.
The next step involves a description of reservoir quality and seal integrity using borehole and seismic reflection data with uncertainty estimates. Wireline logs recording physical rock properties (e.g. density, resistivity) inside a borehole are a key source of information. Probabilistic machine learning [Ghahramani, 2015] and uncertainty quantification techniques are capable of predicting reservoir properties (e.g. porosity and permeability) with uncertainties from these wireline logs. Uncertainty quantification is critical here, as we typically propagate uncertainties through a long sequence of different physics-based models. Mechanistic reservoir models, for example, are calibrated with sparse physical measurements from boreholes interspersed throughout the geological reservoirs. While recent machine learning advances like generative adversarial networks (GANs) could help us model reservoirs, the quantification of uncertainty during this process remains crucial. Reservoir description is followed by the assessment of seal integrity. If the sealing formation contains leakage sites (e.g. faults, fractures or poorly cemented wells), CO2 could escape from the reservoir. This has been a major problem in previous carbon storage pilot projects, where leakage sites only became apparent during CO2 injection [Eiken et al., 2011]. Generative models, which excel at image translation tasks (e.g. super-resolution), could enhance the resolution of geophysical images, increasing the chances of an early detection of potential leakage sites. Using these methods, we can estimate reservoir quality and predict the risk of leakage with uncertainty, allowing us to compare and select future CO2 storage sites.
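As a rough illustration of predicting a reservoir property with uncertainty from wireline logs, the sketch below fits a Bayesian linear regression to synthetic data. This is a deliberately simple stand-in for the probabilistic methods cited above, not the approach of any referenced study. The log values, the density-porosity relation (matrix density 2.65 g/cm^3, fluid density 1.0 g/cm^3), the noise level and all function names are assumptions made for the example.

```python
import numpy as np

def fit_bayesian_linear(X, y, alpha=1e-2, noise_var=4e-4):
    """Gaussian posterior over weights for y = X w + noise, prior w ~ N(0, I/alpha)."""
    precision = alpha * np.eye(X.shape[1]) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predict(X_new, mean, cov, noise_var=4e-4):
    """Predictive mean and standard deviation at new log readings."""
    mu = X_new @ mean
    # Weight uncertainty plus measurement noise, one value per row of X_new.
    var = noise_var + np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    return mu, np.sqrt(var)

# Synthetic "wireline logs" (hypothetical values, not field data).
rng = np.random.default_rng(0)
n = 200
density = rng.uniform(2.0, 2.65, n)      # bulk density, g/cm^3
log_res = rng.uniform(0.0, 2.0, n)       # log10 resistivity (uninformative here)
porosity = (2.65 - density) / (2.65 - 1.0) + rng.normal(0.0, 0.02, n)

X = np.column_stack([np.ones(n), density, log_res])
w_mean, w_cov = fit_bayesian_linear(X, porosity)

# Predict porosity, with an error bar, for a new reading of 2.3 g/cm^3.
x_new = np.array([[1.0, 2.3, 1.0]])
phi_mean, phi_std = predict(x_new, w_mean, w_cov)
print(f"porosity = {phi_mean[0]:.3f} +/- {phi_std[0]:.3f}")
```

The predictive standard deviation is the quantity that would be passed on to the downstream physics-based models; a Gaussian process or a neural network with calibrated uncertainties could replace the linear model without changing that interface.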
Following a successful site assessment, we typically need to simulate the injection and flow of CO2 in reservoirs using fluid flow equations for porous media. These non-linear, spatiotemporal calculations are computationally expensive to solve. The high computational costs are compounded by the fact that the underlying permeability and porosity fields are uncertain (both structural uncertainty due to the position of channels and physical uncertainty in the magnitudes of geological properties), so multiple Monte Carlo-type calculations are required to obtain a range of possible outcomes for different sets of input parameters.
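The Monte Carlo idea can be sketched with a model that is far cheaper than a reservoir simulator. The example below samples an uncertain permeability and pushes each sample through the steady-state radial form of Darcy's law to obtain a distribution of injection rates. The lognormal permeability distribution, layer thickness, pressure difference, viscosity and radii are all hypothetical values chosen for illustration, and a real study would replace `injection_rate` with a full multiphase flow simulation.

```python
import numpy as np

MILLIDARCY = 9.869e-16  # m^2

def injection_rate(k, h=50.0, dp=2.0e6, mu=5.0e-5, r_e=1000.0, r_w=0.1):
    """Steady radial Darcy flow rate (m^3/s) into a homogeneous layer.

    k: permeability (m^2), h: layer thickness (m), dp: pressure difference (Pa),
    mu: fluid viscosity (Pa s), r_e / r_w: outer and well radius (m).
    All defaults are illustrative assumptions.
    """
    return 2.0 * np.pi * k * h * dp / (mu * np.log(r_e / r_w))

rng = np.random.default_rng(42)
n_samples = 10_000

# Uncertain permeability: lognormal with a median of 100 mD (assumed).
k_samples = 100.0 * MILLIDARCY * np.exp(rng.normal(0.0, 0.5, n_samples))

# One model evaluation per sample; with a real simulator this loop is the cost.
q_samples = injection_rate(k_samples)

p10, p50, p90 = np.percentile(q_samples, [10, 50, 90])
print(f"injection rate P10/P50/P90: {p10:.3f} / {p50:.3f} / {p90:.3f} m^3/s")
```

Each sample here costs microseconds; when every sample is a full simulation, the same loop becomes the bottleneck that motivates cheaper surrogate models.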
Statistical methods provide powerful information processing frameworks that can augment, and possibly even transform, current fluid simulations [Brunton et al., 2019]. For example, Kriging or Gaussian process models can provide uncertainty estimates. Non-intrusive methods like polynomial chaos expansions and spectral projections [Xiu, 2010] can approximate the possible range of model outcomes (e.g. extent of the CO2 plume, amount of dissolution in brine) with fewer expensive simulations. GANs can generate multiple model realisations to investigate regions of channel versus non-channel flow.
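To make the non-intrusive spectral projection concrete, the following sketch builds a polynomial chaos expansion of a scalar model output with respect to one standard normal input, using Gauss-Hermite quadrature so that only eight model runs are needed. The function `expensive_model` is a hypothetical closed-form stand-in for a flow simulation (for instance, plume extent as a function of a standardised log-permeability), and the degree and number of quadrature nodes are arbitrary choices for the example.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def expensive_model(xi):
    """Hypothetical stand-in for a simulator output as a function of xi ~ N(0, 1)."""
    return np.exp(0.5 * xi)

def pce_coefficients(model, degree=5, n_nodes=8):
    """Project the model onto probabilists' Hermite polynomials He_0..He_degree."""
    nodes, weights = He.hermegauss(n_nodes)      # weight function exp(-x^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)     # normalise to the N(0, 1) density
    basis = He.hermevander(nodes, degree)        # shape (n_nodes, degree + 1)
    norms = np.array([math.factorial(n) for n in range(degree + 1)], dtype=float)
    outputs = model(nodes)                       # the only model evaluations
    return (basis * (weights * outputs)[:, None]).sum(axis=0) / norms

coeffs = pce_coefficients(expensive_model)
norms = np.array([math.factorial(n) for n in range(len(coeffs))], dtype=float)

# Mean and variance follow directly from the coefficients (orthogonality of He_n).
pce_mean = float(coeffs[0])
pce_var = float(np.sum(coeffs[1:] ** 2 * norms[1:]))
print(f"PCE mean = {pce_mean:.4f}, PCE variance = {pce_var:.4f}")
```

For this toy model the exact mean and variance are known in closed form, which makes it a convenient check; with several uncertain inputs the number of quadrature nodes grows quickly, and sparse grids or regression-based fitting are commonly used instead.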
After simulating subsurface fluid flow, intelligent process control techniques can help us design smart wells and develop optimal injection strategies. This involves real-time process control and decision making under uncertainty. The design needs to balance multiple conflicting objectives, such as maximising subsurface CO2 storage and minimising the risk of CO2 leakage (due to induced seismicity or caprock fracturing), while honouring engineering constraints such as proximity to existing infrastructure and limits on borehole pressure. Data-driven model predictive control is useful in this context. Model-based reinforcement learning algorithms can also offer viable solutions to such problems. During the injection phase, as more field sensor data comes in, the reservoir models can be calibrated to reduce uncertainties even further (history matching) and control strategies can be adapted online to take new information into account.
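A heavily simplified receding-horizon controller illustrates the idea behind model predictive control of injection. At every step it searches for the largest constant injection rate whose predicted pressure stays below a limit over the prediction horizon, applies that rate for one step, and then re-plans. The one-line pressure model, its coefficients, the pressure limit and the candidate-rate grid are all assumptions made for this sketch; a practical controller would use a calibrated reservoir model and a proper optimiser.

```python
import numpy as np

def step(p, q, a=0.05, b=0.1, p_init=20.0):
    """Toy pressure model (MPa): injection raises pressure, dissipation relaxes it."""
    return p + a * q - b * (p - p_init)

def mpc_rate(p, horizon=10, q_max=100.0, p_max=30.0, n_candidates=101):
    """Largest candidate rate that keeps predicted pressure <= p_max over the horizon."""
    best = 0.0
    for q in np.linspace(0.0, q_max, n_candidates):
        p_pred, feasible = p, True
        for _ in range(horizon):
            p_pred = step(p_pred, q)
            if p_pred > p_max:
                feasible = False
                break
        if feasible:
            best = float(q)
    return best

# Closed loop: plan, apply only the first control move, observe, re-plan.
p = 20.0
pressures, rates = [], []
for _ in range(50):
    q = mpc_rate(p)
    p = step(p, q)   # in the field, p would come from downhole sensors
    rates.append(q)
    pressures.append(p)

print(f"first rate = {rates[0]:.1f}, final rate = {rates[-1]:.1f}, "
      f"max pressure = {max(pressures):.2f} MPa")
```

The controller injects aggressively while the reservoir has headroom and throttles back as pressure approaches the limit. Replacing the simulated `step` call with sensor readings, and re-fitting the coefficients `a` and `b` as data arrive, would correspond to the online adaptation described above.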
The final step concerns monitoring of the injected CO2. Techniques such as time-lapse 3-D seismic surveys can reveal the distribution of CO2 in the subsurface at different times, and real-time streams of sensory data from the surface and subsurface can inform us about pressure changes in the reservoir. Machine learning methods can help us process, analyze and visualize these large multi-dimensional datasets, ideally allowing us to predict future CO2 migration pathways.

Summary
To summarize, a synergetic combination of machine learning, uncertainty quantification and intelligent process control employed at different stages of the workflow can help us accelerate the implementation and scale up the deployment of carbon storage as a crucial technology for transitioning to a low carbon economy.