Probabilistic inverse problems using machine learning applied to inversion of airborne EM data

Abstract

A key challenge in geoscience is that of inferring information about a model m from different types of available information, such as geological expert knowledge, geophysical data, well log data, etc. This challenge is generally referred to as an inverse problem. Tarantola & Valette (1982b) and Tarantola (2005) describe the inverse problem as a problem of probabilistic integration of information. Available information about m is described in the form of a probability density, and the individual densities are then combined, using conjunction of information, into one probability density that describes all available information. Say a specific type of structural information is quantified by ρ(m), and information from seismic data and well logs by L(m). The conjunction of this information is given by the posterior probability distribution σ(m), which, under the assumption that the individual types of information have been obtained independently, is given by

σ(m) = k ρ(m) L(m),   (1)

where k is a normalization constant. That is, the conjunction of the independent information is proportional to the product of the probability densities describing each independent set of information. The likelihood L(m) quantifies a probability distribution describing the difference between the observed data d_obs and the noise-free data d computed by evaluating the forward model

d = g(m),   (2)

where g is a non-linear operator that maps the model parameters into data. g typically refers to some numerical algorithm solving physical equations (such as Maxwell's equations).
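To make the conjunction in Eqn. 1 concrete, the toy sketch below evaluates an unnormalized posterior on a grid for a single scalar model parameter. The specific prior, forward function, observed datum, and noise level are invented purely for illustration and are not taken from the text.

```python
import numpy as np

# Toy illustration of Eqn. 1: sigma(m) proportional to rho(m) * L(m).
m = np.linspace(0.0, 4.0, 401)             # grid of model parameter values

rho = np.exp(-0.5 * (m - 2.0) ** 2)        # invented Gaussian prior around m = 2

def g(m):
    return m ** 2                          # invented non-linear forward model, Eqn. 2

d_obs, sigma_d = 3.5, 0.5                  # invented observed datum and noise level
L = np.exp(-0.5 * ((g(m) - d_obs) / sigma_d) ** 2)  # Gaussian likelihood

sigma = rho * L                            # conjunction of information, Eqn. 1
sigma /= np.trapz(sigma, m)                # normalize so sigma integrates to 1
```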

The central problem in probabilistic inversion, the inverse problem, is inferring information about σ(m), which in principle contains the combined information of, in this case, both structural prior information, through the prior ρ(m), and information from geophysical data, through L(m).

The most widely used method for solving probabilistically formulated inverse problems is sampling the posterior distribution, Eqn. 1, using variants of the Metropolis algorithm (Metropolis et al., 1953; Hastings, 1970; Geman & Geman, 1984).

Here we consider the case where the parameter of interest may not be m itself, but instead a set of features/parameters n related to m through n = h(m). This is in practice often the case when m represents a geophysical parameter, such as resistivity or velocity, but one is interested in the lithology or hydrological properties to which the geophysical parameter is related.

The method is simple to apply and consists of two steps: A) construction of a training set (A1) and construction and training of a neural network (A2). This is done once; then, in a second step B, the trained machine learning algorithm is applied, very efficiently, to potentially many sets of observed data (as demonstrated in the following example).
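For context on the sampling-based approach mentioned above, the following is a minimal sketch of a Metropolis-style sampler of the posterior in Eqn. 1. The functions log_likelihood, prior_sample, and prior_perturb are hypothetical placeholders for the problem-specific ingredients; when proposals are drawn consistently with the prior, only the likelihood ratio enters the acceptance step.

```python
import numpy as np

def metropolis(log_likelihood, prior_sample, prior_perturb, n_iter, rng):
    """Minimal sketch of a Metropolis sampler of the posterior, Eqn. 1.

    Proposals are perturbations consistent with the prior rho(m), so the
    acceptance probability depends only on the likelihood ratio.
    """
    m = prior_sample(rng)
    ll = log_likelihood(m)
    samples = []
    for _ in range(n_iter):
        m_prop = prior_perturb(m, rng)            # propose a new model
        ll_prop = log_likelihood(m_prop)
        if np.log(rng.uniform()) < ll_prop - ll:  # accept/reject step
            m, ll = m_prop, ll_prop
        samples.append(m)
    return samples
```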

2.1 A1: Constructing training data

Eqn. 2 describes the forward problem of computing noise-free data. The forward problem describing the simulation of data with noise, d_obs, can be given by

d_obs = g(m) + e,   (3)

where g represents a (possibly) non-linear mapping, typically describing some physical process, and e is a realization of the noise. This defines a training data set of corresponding data and features, [d_obs_i , n_i], i = 1, ..., N, that can be obtained simply by 1) sampling the prior, 2) solving the forward problem, 3) simulation of the noise, and 4) extracting/computing a feature n from m.

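The four steps above can be sketched as follows. Here sample_prior, forward_g, and feature_h are hypothetical stand-ins for the prior sampler, the forward operator g, and the feature map h; their internals (layer counts, noise level, etc.) are invented for illustration only.

```python
import numpy as np

def sample_prior(rng):
    # illustrative prior: a log-resistivity profile with 125 layers
    return rng.normal(loc=2.0, scale=0.5, size=125)

def forward_g(m):
    # stand-in for a physical forward model (e.g. solving Maxwell's equations);
    # here just a smooth linear projection onto 12 data values
    A = np.ones((12, m.size)) / m.size
    return A @ m

def feature_h(m):
    # the feature of interest, n = h(m); here an invented scalar summary
    return m[:50].mean()

rng = np.random.default_rng(42)
N = 10_000                                   # number of training examples
D, F = [], []
for _ in range(N):
    m = sample_prior(rng)                    # 1) sample the prior rho(m)
    d = forward_g(m)                         # 2) solve the forward problem, Eqn. 2
    d_obs = d + rng.normal(0, 0.1, d.shape)  # 3) simulate the noise, Eqn. 3
    F.append(feature_h(m))                   # 4) extract the feature n from m
    D.append(d_obs)
D, F = np.array(D), np.array(F)              # the training set [d_obs_i, n_i]
```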
2.2 A2: Construction and training of a neural network

Usually, the forward mapping between m and noise-free data d is unique. The complexity of the neural network should be high enough that the desired mapping can be resolved, and small enough that overfitting will not be an issue.

When a neural network is trained using the training data set, its free parameters are adjusted to minimize a specific loss function that measures the difference between the expected output from the training data set, n_i, and the output of the neural network, n̂.

For the methodology presented here, it is the choice of the loss function, and of the activation function for the output layer, that is critical to allow estimation of properties of the posterior distribution.

In general, a feature n can refer to a continuous parameter (such as velocity, resistivity, temperature) or a discrete parameter (such as lithology type, event type). Each case requires a specific choice of loss function and output activation.

For a continuous feature, assume the posterior distribution σ(n|d_obs) can be described by a Gaussian N(n̄, C_n) with mean n̄ and covariance C_n. The probability that an estimated n̂ is a realization of N(n̄, C_n) is given by

f(n̂) = ((2π)^K |C_n|)^(-1/2) exp(-½ (n̂ - n̄)ᵀ C_n⁻¹ (n̂ - n̄)),   (5)

where K is the number of features in n. The values of the mean and covariance that maximize Eqn. 5 can be found by minimizing the loss function, in the form of the negative log-likelihood, that is

L(n̄, C_n) = ½ log|C_n| + ½ (n̂ - n̄)ᵀ C_n⁻¹ (n̂ - n̄) + const.   (6)

Therefore, any neural network that uses the loss function in Eqn. 6 will lead to an estimate of the mean and covariance of the posterior distribution σ(n|d_obs).

Typically, no activation function is used for regression-type neural networks, as the output could have any value.
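As an illustration, a loss like Eqn. 6 can be written as a custom loss in a deep learning framework. The sketch below is one possible implementation, assuming a diagonal covariance and a network whose final layer outputs the predicted mean and log-variance for each feature; the use of TensorFlow/Keras and these design choices are assumptions for illustration, not prescribed by the text.

```python
import tensorflow as tf

def gaussian_nll(n_true, y_pred):
    """Negative log-likelihood loss (Eqn. 6) for a diagonal covariance.

    y_pred concatenates the predicted mean and log-variance,
    [n_mean, log_var], each with one entry per feature in n.
    """
    n_mean, log_var = tf.split(y_pred, num_or_size_splits=2, axis=-1)
    # 0.5*log|C_n| + 0.5*(n - n_mean)^T C_n^-1 (n - n_mean), up to a constant
    return tf.reduce_sum(
        0.5 * log_var + 0.5 * tf.square(n_true - n_mean) / tf.exp(log_var),
        axis=-1,
    )
```

Predicting the log-variance keeps the estimated variance positive without any explicit constraint; the network's final layer then needs 2K linear outputs for K features.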

Eqn. 6 is not widely used as a loss function, but it is readily available using, for example, standard deep learning frameworks.

For a discrete feature, the output of the network, n̂, represents a probability for each of the possible classes. The choice of n̂ that maximizes Eqn. 7 can be found by minimizing the negative log-likelihood given by the loss function

L(n̂) = -Σ_j n_j log(n̂_j).   (9)
Eqn. 9 is equivalent to the categorical cross-entropy between the two probability distributions n and n̂.

The methodology can be trivially extended to account for multiple data types. In case two types of data, A and B, are available (each with a specific forward and noise model), one can create training data sets for both types of data, [d_obs,A_i , n_i] and [d_obs,B_i , n_i], and use the methodology described above to compute properties of the posterior distribution.

For each of the N_r generated models in M*, a 'feature' n_int is estimated that defines whether the resistivity varies by more than 50% between neighboring model parameters.

n_int thus represents a classification of 'interface' vs. 'no interface'.
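A minimal sketch of how such an interface feature could be computed from a resistivity model follows. The 50% threshold comes from the criterion above, while the function name and the use of relative change between neighboring parameters are illustrative assumptions.

```python
import numpy as np

def interface_feature(resistivity):
    """Classify 'interface' (1) vs. 'no interface' (0) between neighbors.

    An interface is declared where the relative change in resistivity
    between neighboring model parameters exceeds 50%.
    """
    r = np.asarray(resistivity, dtype=float)
    rel_change = np.abs(np.diff(r)) / r[:-1]
    return (rel_change > 0.5).astype(int)
```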

A fully connected multi-layer perceptron model is constructed, using 12 nodes in the input layer, 2 hidden layers with 40 nodes each, and 125 nodes in the output layer.
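A sketch of such an architecture is given below, assuming (as one plausible reading) that each of the 125 output nodes gives an independent probability of 'interface' via a sigmoid activation, trained with binary cross-entropy, the two-class case of the categorical cross-entropy in Eqn. 9. The framework, ReLU hidden activations, and optimizer are illustrative choices, not prescribed by the text.

```python
import tensorflow as tf

# Fully connected MLP: 12 inputs, two hidden layers of 40 nodes,
# and 125 outputs giving P(interface) at each location.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(12,)),
    tf.keras.layers.Dense(40, activation="relu"),
    tf.keras.layers.Dense(40, activation="relu"),
    tf.keras.layers.Dense(125, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
# model.fit(D, N_int, epochs=100, batch_size=64)  # D: data, N_int: features
```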

The output layer represents the probability of having an interface at the location of the corresponding model parameter.

To illustrate a case of facies classification, a slight variation of the prior model is considered.

Posterior statistics of a feature n can thus be estimated by training a neural network minimizing an appropriate loss function. This leads to fast and accurate estimation of posterior statistics.