Optimal experiment design for bottom friction parameter estimation

It is common practice within numerical coastal ocean modelling to perform model calibration with respect to a bottom friction parameter. While many modelling studies employ a spatially uniform coefficient, within the parameter estimation literature the coefficient is typically taken to be spatially (or even temporally) varying. A parameter estimation experiment requires an appropriate set of observations, and also the selection of an appropriate parameter space which captures the spatial variability of the bottom friction parameter. In regions such as the Bristol Channel, which is used as a case study within this work, observation data is relatively abundant; here we use observations of M2 and S2 harmonic amplitudes and phases at 20 locations within the Channel. However, as is typical within friction parameter estimation problems, there is no obvious constraint on the spatial variation of the friction coefficient. Here, we define the parameter estimation ‘experiment design’ as the mapping from a small number of friction parameters onto the model domain. We propose a robust method for the appropriate selection of a low-dimensional experiment design, utilising an optimal experiment design (OED) technique via construction of the Fisher Information Matrix. The objective is to identify the experiment design resulting in the tightest possible constraints on the unknown parameters, given the available observation data. We construct the Fisher Information Matrix via the use of an adjoint shallow water numerical model, Thetis, and perform a variant of D-optimal design to find the optimal experiment design from within two a priori choices of design space. These are based on splitting the model domain either by simple slices across the channel, or by the type of sediment found on the sea bed. 
We first validate the OED framework by utilising a Bayesian inference algorithm to perform parameter estimation using a selection of experiment designs, which confirms that the OED framework offers a good estimate of the true parameter uncertainty resulting from the use of a given experiment design. An exploration of the full space of experiment designs shows that up to three Manning’s n coefficient values can be estimated from the observation data to within an uncertainty of approximately 0.001 s m−1/3, but that the experiment design is highly influential in achieving this threshold, thus demonstrating the value of our approach as a preliminary step in a parameter estimation study. We also investigate the sensitivity of the achievable parameter uncertainty to the availability of observation data and its measurement uncertainty, providing insights useful to the design of future observation surveys. Finally, we further demonstrate our OED framework with an application to a model of the northwest European continental shelf.


Introduction
The depth-averaged shallow water equations solved by the model are

∂η/∂t + ∇·(Hu) = 0,
∂u/∂t + u·∇u + F_C + g∇η = −τ_b/(ρH) + ∇·(ν(∇u + ∇uᵀ)),

where η is the surface elevation, u is the depth-averaged velocity vector, H = η + h is the total water depth, h is the bathymetry (measured positive downwards), F_C is the Coriolis force, g is the acceleration due to gravity, ρ is the density, ν is the viscosity, and τ_b is the bottom stress. Within this work, the bottom stress is parameterised via Manning's n formulation, where n is the (spatially varying) Manning coefficient. This coefficient is assigned a uniform value of n = 0.025 s m^−1/3 outside the Bristol Channel, while its value inside the Channel is to be inferred via parameter estimation. A wetting-drying scheme is employed, controlled by a parameter α whose appropriate value depends on the bathymetry gradient and mesh resolution at the wet-dry interface. In this work, α is taken as 0.2 m, which was found by experimentation to be close to the minimum stable value for this model setup.

The unstructured mesh used throughout this study is shown in figure 1, and is based on that used in Mackie et al. (2020b). The mesh was generated on a UTM30 coordinate projection, using the Python package qmesh. The coastline data is from the Global Self-consistent, Hierarchical, High-resolution Geography Database (GSHHG) (Wessel and Smith, 1996). The mesh resolution varies between 250 m at the innermost part of the Channel and a coarser resolution towards the open boundary. The model is forced at its open boundary with the M2 and S2 tidal constituents (Egbert and Erofeeva, 2002). The next most significant constituent (N2) has less than half the amplitude of these primary constituents. While the choice to force with only two constituents neglects any transfer of energy from other tidal constituents, we assume that this forcing is sufficient for the purposes of this study, which represents a proof of concept for our OED framework. A no-normal-flow boundary condition is applied on the coastal boundaries, and river outflow is neglected.

Throughout this work, we assume that some uncertainty is introduced by the observations.
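As an illustration, the Manning's n bottom stress can be sketched in a few lines; this is a generic sketch of the standard depth-averaged formulation τ_b/ρ = g n² |u| u / H^(1/3), not the Thetis implementation, and the function name and example values are illustrative.

```python
import numpy as np

G = 9.81  # acceleration due to gravity, m s^-2

def manning_bottom_stress(u, n, H):
    """Kinematic bottom stress (tau_b / rho) for depth-averaged velocity u
    (shape (2,)), Manning coefficient n (s m^-1/3) and total depth H (m),
    following the standard Manning's n parameterisation."""
    speed = np.linalg.norm(u)
    return G * n**2 * speed * u / H**(1.0 / 3.0)

# Example: 1 m/s flow in x, n = 0.025 s m^-1/3 (the background value used
# outside the Bristol Channel), total depth 10 m.
tau = manning_bottom_stress(np.array([1.0, 0.0]), 0.025, 10.0)
```

Note the quadratic dependence on the Manning coefficient, which is one source of the nonlinearity of the parameter estimation problem discussed later.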
Specifically, we assume that the amplitude observations are independent and identically distributed variables, and similarly for the phases, with variances of 25 cm² and 6.25°², respectively. While this is somewhat arbitrary, we discuss the influence of this choice below.

The aim is to determine the optimal number of parameters m, and the optimal mapping from these parameters onto the model domain, such that tight constraints on the parameters can be achieved via a parameter estimation exercise using the observations described in section 2.2. The choice of this mapping and number of parameters will be referred to as the experiment design. We emphasise that this problem is distinct from conventional applications of optimal experiment design, in which it is the observation strategy itself that is optimised.

We first apply an a priori constraint on the friction coefficient, by splitting the model domain into subdomains, within each of which the friction coefficient is assumed to be uniform. This constraint permits the use of an exhaustive search algorithm to find the optimal design. The selected number of these subdomains must be sufficiently large that the space of experiment designs contains a 'good' design, but small enough to facilitate the use of the exhaustive search algorithm. Within this work, we consider two such a priori constraints, which are described in section 3.1.

Once a suitable subdomain parameterisation of the friction coefficient has been selected, we explore the remaining space of experiment designs to determine the optimal design. Each design corresponds to a grouping of the a priori subdomains into m groups, such that the friction coefficient is specified by m parameters. In order to rank the experiment designs, we compute the Fisher Information Matrix corresponding to each design, and find the design which optimises a selected scalar measure of this matrix. This approach is described in detail in section 3.2.

Note that, in contrast with parameter subset selection methods, we seek an experiment design in which the friction parameter is specified everywhere within the Bristol Channel; each experiment design corresponds to a grouping of model subdomains, not a subset selection. This is perhaps more similar to the clustering of parameters considered by Chu and Hahn (2009), although we take a different approach.
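The exhaustive search over groupings amounts to enumerating set partitions of the a priori subdomains. A minimal sketch of the combinatorial part, assuming K labelled subdomains grouped into exactly m non-empty groups (the generator yields one canonical representative per partition, since relabelling the groups does not change the design):

```python
import itertools

def canonical_groupings(K, m):
    """Yield each grouping of K subdomains into exactly m non-empty groups
    once, using restricted-growth strings to skip equivalent relabellings."""
    for g in itertools.product(range(m), repeat=K):
        if len(set(g)) != m:      # all m groups must actually be used
            continue
        seen, canon = {}, []
        for label in g:
            canon.append(seen.setdefault(label, len(seen)))
        if tuple(canon) == g:     # keep only the canonical labelling
            yield g
```

The number of such groupings is the Stirling number of the second kind S(K, m), which grows quickly with K; this is why the number of a priori subdomains must remain modest for the exhaustive search to be feasible.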

The following is a simple exposition of the FIM approach to optimal experiment design.

Consider a parameter estimation problem, where a vector of unknown parameters θ is to be estimated by minimising a misfit functional given by

S(θ) = (y − f(θ))ᵀ cov_e⁻¹ (y − f(θ)),    (3)

with respect to θ, where y and f(θ) are vectors of observed and modelled quantities respectively, and cov_e is the observation error covariance matrix. The parameter covariance matrix resulting from the minimisation of equation

(3) can be estimated as

cov_p ≈ (Jᵀ cov_e⁻¹ J)⁻¹,    (4)

where J is the Jacobian matrix with elements given by

J_ij = ∂f_i/∂θ_j.    (5)

It is common within the literature to utilise a forward difference method to compute this Jacobian matrix. In this work we instead use the adjoint mode of the numerical model; this is described in section 3.2.3.
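Equation (4) can be sketched numerically in a few lines; this is a generic illustration, with a placeholder Jacobian standing in for model output.

```python
import numpy as np

def parameter_covariance(J, cov_e):
    """Equation (4): cov_p ~= (J^T cov_e^{-1} J)^{-1}, with J_ij = df_i/dtheta_j."""
    A = J.T @ np.linalg.solve(cov_e, J)   # J^T cov_e^{-1} J, without forming the inverse
    return np.linalg.inv(A)

# Example: four repeated observations of a single parameter, each with
# variance 4; the estimated parameter variance is 4 / 4 = 1.
cov_p = parameter_covariance(np.ones((4, 1)), 4.0 * np.eye(4))
```

The example recovers the familiar result that the variance of an estimate from repeated independent observations shrinks in proportion to the number of observations.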

The parameter covariance given by equation (4) thus provides an estimate of the precision with which the parameters can be determined from the observations.

Within the optimal experiment design literature, it is common to work with the inverse of the parameter covariance matrix, the so-called Fisher Information Matrix (FIM) (Fedorov, 1972), defined as

FIM = cov_p⁻¹ = Jᵀ cov_e⁻¹ J.    (6)

Optimisation of the experiment design is undertaken by finding the configuration (in general, the timing and location of the observations) which optimises a scalar measure of the FIM.

Within this work, we use a modified version of the so-called D-criterion, which we motivate here. In the case of estimating a single parameter, the optimal experiment design can be defined as the one which minimises the confidence interval of the estimated parameter. The higher-dimensional analogue of this is that the optimal design minimises the volume of the confidence ellipsoid of the estimated parameters. This volume is characterised by the determinant of the parameter covariance matrix, det(cov_p) (Walter and Pronzato, 1990). Since the FIM is the inverse of cov_p, the minimisation of this confidence ellipsoid volume corresponds to the maximisation of det(FIM).

The so-called D-criterion is simply defined as this determinant, det(FIM).

Within this work, we define a modified D-criterion given by

ModD = det(FIM)^(1/m),    (7)

where m is the number of parameters to be estimated. This is sometimes given as the standard definition of the D-criterion (e.g. Emery and Nenarokomov (1998)), but for clarity we refer to this as the modified D-criterion, or ModD.
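A minimal numerical sketch of the modified D-criterion (the example FIM is illustrative):

```python
import numpy as np

def mod_d(fim):
    """Modified D-criterion, ModD = det(FIM)^(1/m), where m is the number
    of parameters (the dimension of the FIM)."""
    m = fim.shape[0]
    return np.linalg.det(fim) ** (1.0 / m)

# For a diagonal FIM the criterion is the geometric mean of the diagonal
# entries (the per-parameter 'informations'), so ModD carries the units of a
# single inverse parameter variance.
value = mod_d(np.diag([4.0, 9.0]))   # geometric mean of 4 and 9
```

This normalisation by 1/m is what allows ModD values to be compared across designs with different numbers of parameters.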

In terms of parameter confidence ellipsoids, the maximisation of the ModD criterion corresponds to the minimisation of the confidence ellipsoid volume, normalised by the number of parameters.

The above exposition is only strictly valid when the model is linear. If the model is nonlinear with respect to the unknown parameters (as is the case within this work, since the numerical tidal model is nonlinear), equation (4) is not exact, but the right-hand-side is nevertheless a measure of the maximum achievable parameter precision (de Brauwere et al., 2009). The inverse of the FIM (as defined by equation (6)) is considered to give a lower bound on the parameter error covariance matrix (Alaña and Theodoropoulos, 2011). That is,

cov_p ≥ FIM⁻¹,    (8)

where the inequality is understood to mean that (cov_p − FIM⁻¹) is positive semidefinite (Söderström and Stoica, 1989; Emery and Nenarokomov, 1998; Machado et al., 2009).
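The matrix inequality in equation (8) can be checked numerically through the eigenvalues of the difference matrix. A small sketch (the example matrices are illustrative, not model output):

```python
import numpy as np

def bound_satisfied(cov_p, fim, tol=1e-10):
    """True if cov_p - FIM^{-1} is positive semidefinite (equation (8)),
    i.e. all eigenvalues of the symmetric difference are non-negative."""
    diff = cov_p - np.linalg.inv(fim)
    return bool(np.all(np.linalg.eigvalsh(diff) >= -tol))

# Illustration: with FIM = diag(4, 4) the lower bound FIM^{-1} is diag(0.25, 0.25).
ok = bound_satisfied(np.diag([0.5, 0.5]), np.diag([4.0, 4.0]))    # bound satisfied
bad = bound_satisfied(np.diag([0.1, 0.5]), np.diag([4.0, 4.0]))   # bound violated
```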

Nonlinearity also means that the Jacobian, and therefore the FIM, are functions of the unknown parameters.

The optimisation of the modified D-criterion as defined above therefore identifies only a 'locally optimal' experiment design (Ford et al., 1989; Huan and Marzouk, 2013). That is, the optimal experiment design is optimal only in the vicinity of the parameter values at which the Jacobian is evaluated. We define a target parameter uncertainty of approximately 0.001 s m^−1/3 (equation (9)), which corresponds to a threshold value of the ModD criterion of 10^6 m^2/3 s^−2 (equation (10)).

Due to the inequality in equation (8), an experiment design satisfying equation (10) is not guaranteed to achieve the target parameter estimate precision given by equation (9). However, we show in section 4.1 that the bound provided by the FIM on cov_p appears to be reasonably tight, and therefore that equation (10) is a good indicator of experiment designs satisfying equation (9). We emphasise that the value of the OED method is that ModD can be computed in advance of the parameter estimation procedure, and therefore equation (10) provides a useful a priori indicator of experiment design performance.
The Manning coefficient field over the N mesh nodes is expressed as n = Dθ, where θ is the parameter vector of length m. The experiment design D is an N × m matrix consisting of ones and zeros, which maps the parameters θ onto the mesh nodes.

First, we use the adjoint model to compute the Jacobian with respect to n, with elements

∂f_i/∂n_k,    i = 1, …, 80,  k = 1, …, N.

This requires one adjoint model run for each observation i, i.e. for each row of the above matrix. There are 80 observations (M2 and S2 amplitude and phase at 20 locations), so the computation of this Jacobian requires 80 adjoint runs. The Jacobian matrix with respect to the parameter vector θ can then be computed via the chain rule as

J_ij = ∂f_i/∂θ_j = Σ_k (∂f_i/∂n_k) D_kj,

where the ∂f_i/∂n_k term has been computed via the set of adjoint model runs as described above.
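The mapping n = Dθ and the chain rule above can be sketched as follows, with hypothetical small dimensions and with `dfdn` standing in for the adjoint-computed Jacobian:

```python
import numpy as np

def design_matrix(grouping, m):
    """N x m matrix of ones and zeros mapping the m parameters onto N entries,
    where grouping[k] is the parameter index assigned to entry k."""
    N = len(grouping)
    D = np.zeros((N, m))
    D[np.arange(N), grouping] = 1.0
    return D

def jacobian_wrt_theta(dfdn, grouping, m):
    """Chain rule J = (df/dn) D, i.e. J_ij = sum_k (df_i/dn_k) D_kj."""
    return dfdn @ design_matrix(grouping, m)

# One observation, three entries grouped as (0, 1, 1): the sensitivities of
# entries assigned to the same parameter are summed.
J = jacobian_wrt_theta(np.array([[1.0, 2.0, 3.0]]), (0, 1, 1), 2)
```

Because the expensive term `dfdn` is independent of the grouping, the Jacobian for every candidate design reduces to a cheap matrix product.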

Note that, for a given model setup (mesh, etc.), the adjoint model runs are a one-off computational overhead, incurred only once no matter how large a space of experiment designs is considered. In the most general case, the forward differences approach would require N + 1 forward model runs, where N is O(10,000) for the mesh used in this work; this is clearly not computationally feasible. The use of an adjoint model is a prerequisite for the extension of this OED method to more complex experiment design spaces, and we therefore take the adjoint approach here. To the authors' knowledge this is the first OED study to apply adjoint methods for this purpose.

The effect of the inequality in equation (8), which represents the nonlinearity of the system, is that the FIM approach is expected to overestimate the true performance of a given experiment design. Here, the aim is to investigate the significance of this inequality, i.e. to verify whether the FIM approach is suitable for identifying 'good' experiment designs within the friction parameter estimation context. We proceed by utilising a Bayesian inference framework to solve the parameter estimation problem corresponding to a selection of experiment designs, computing the resulting parameter covariance matrix for each. We can then compare the ModD criterion with the equivalent measure of this parameter covariance matrix, given by

det(cov_p⁻¹)^(1/m).    (14)

If the inequality of equation (8) were replaced with an equality, the expression given by equation (14) would be exactly equal to ModD. We find that, while the values obtained via Bayesian inference are somewhat lower than the ModD values (as expected due to the inequality in equation (8)), they are reasonably close to the bound provided by ModD, and therefore that the ModD value is a good indicator of experiment design performance.

Experiment design using synthetic and real observations

[Table: ModD criterion values for the sediment-based designs; the optimal m = 3 design attains values of 0.871 × 10^6, 0.691 × 10^6 and 0.825 × 10^6.]

Here we explore the full space of possible experiment designs (for m = 2, 3, 4 and 5). Figure 4 shows the ModD for the slice-based subdomains. The sediment-based division yields a smaller space of candidate designs, and the maximum achievable ModD criteria are therefore reduced.

(iii) For the slice subdomains, we find that it is possible to achieve the target criterion (10^6 m^2/3 s^−2) when grouping the friction subdomains into two or three groups (m = 2 or 3), but no four-parameter designs meet the threshold ModD criterion.

(iv) For the sediment subdomains, we find that it is only possible to meet the threshold ModD criterion using m = 2.

Examination of the optimal designs gives insight into the properties of an experiment design which contribute to its efficacy within parameter estimation. Figure 7 shows the optimal value of the ModD criterion (on the y-axis), and the corresponding optimal experiment designs. The slice-based designs achieve the tightest parameter constraints. This is largely due to the more even division of the model domain into slices, compared with the sediment-based division, which resulted in several subdomains of small area. Note, however, that this result does not necessarily suggest that the use of sediment data to constrain the spatial distribution of the friction parameter is detrimental to the model performance, or that the sediment data is inconsistent with the observations.

Response of OED criterion to observation uncertainty
In this section, we briefly consider the response of the optimal designs to changes in the estimate of cov_e. In many optimal experiment design studies, the observation covariance matrix cov_e is considered to be proportional to the identity matrix and is therefore omitted from the formulation of equation (6).

To demonstrate the improvement in performance achieved via this model calibration, we make a comparison with a uniform parameter with n = 0.025 s m^−1/3. This comparison is shown in figure 11. The reduction in scatter between the modelled and observed values is easily visible by eye, and is also evident in the root mean squared errors, which reduce from 9.7 cm and 4.1° before calibration, to 5.6 cm and 2.7° after calibration. We emphasise that further calibration with a larger number of control parameters would enable further reduction in these errors.
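The dependence of the criterion on cov_e has a simple structure: scaling all observation variances by a factor α scales the FIM by 1/α and hence ModD by 1/α, independently of m. A quick numerical check of this scaling, using a hypothetical random Jacobian rather than model output:

```python
import numpy as np

def mod_d(J, cov_e):
    """Modified D-criterion det(FIM)^(1/m) with FIM = J^T cov_e^{-1} J."""
    fim = J.T @ np.linalg.solve(cov_e, J)
    return np.linalg.det(fim) ** (1.0 / J.shape[1])

rng = np.random.default_rng(0)
J = rng.standard_normal((8, 3))             # hypothetical: 8 observations, 3 parameters
cov_e = np.diag(rng.uniform(1.0, 2.0, 8))   # hypothetical observation variances

alpha = 3.0
ratio = mod_d(J, cov_e) / mod_d(J, alpha * cov_e)   # equals alpha
```

A uniform inflation of the observation uncertainty therefore rescales all ModD values equally and leaves the ranking of designs unchanged; only non-uniform changes to cov_e can alter which design is optimal.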

However, the OED framework tells us that increasing the number of control parameters is not justified, given the observation data used to constrain the unknown parameters.

For each candidate grouping of the subdomains, we compute the corresponding Fisher Information Matrix, and find the grouping which optimises a scalar measure of the FIM. Here we again use the modified D-criterion as described above.

The optimal experiment design, following the above schema for a grouping into m = 4 parameters, is shown in the corresponding figure. We note that the estimated parameter values may be implicitly correcting for other modelling errors.

Our results indicate that the number of friction parameters that can be well constrained by the observations is typically much smaller than the number of observations. This is due in part to the ratio between the observation error variance and the target parameter estimate covariance (and the relationship between them via the model), but also due to redundancy in the information contained in the observations. This is particularly evident in the application to the northwest European continental shelf. In other words, a large quantity of data does not justify the estimation of a large number of model parameters, and the methods proposed within this work offer a rigorous technique for the selection of an appropriate set of model parameters for estimation from a given set of observations. The results of section 4.2 also demonstrate that the careful selection of the spatial distribution of the friction parameters (i.e. not just the number of parameters) is important. We have also found that, for the model domain considered, the choice to divide the domain by simple slices produces better-performing experiment designs than the use of sediment data to a priori constrain the spatial variation of the bottom friction coefficient. However, as noted above, this can be attributed to the relative sizes of the spaces of experiment designs for these choices, and does not necessarily imply that additional physical knowledge via sediment data cannot be beneficial within parameter estimation.

It is useful to compare the computational cost associated with the OED method presented here with the cost of a 'trial and error' approach, where the parameter estimation problem is solved for a selection of candidate experiment designs.

The design of observation surveys, and optimal experiment design with respect to observation strategy, will be considered in future work.

Finally, while the computational cost of our approach depends on the volume of observation data, our results demonstrate that for practical parameter estimation problems the OED procedure can be considered a computationally efficient approach. For our case studies, the overall computational cost of the OED framework was on the order of 10 times that of performing an individual parameter estimation experiment with a given design. The OED framework is therefore not prohibitively expensive, compared with either a trial-and-error approach to finding the optimal experiment design, or a more traditional regularisation approach requiring repeated optimisations in order to select a regularisation parameter.

In summary, we have demonstrated that our OED framework can be used to formulate well-constrained calibration problems, and can be a valuable preliminary step in a parameter estimation study.

Gaussian process emulation

This exposition follows Rasmussen (2003), to which the reader is referred for further detail. The crux of Gaussian process emulation is that, under the assumption that model outputs follow a multivariate Gaussian distribution, a prediction of the model output at untested parameter values, together with its uncertainty, can be computed in closed form.

According to the '10d' rule (Sobol, 2001; Hristov et al., 2017), it is common to train a Gaussian process emulator using at least 10d samples, where d is the number of input parameters. As a cautious approach, within this work we use 10(d + 1) training samples for each selected experiment design to be tested using Bayesian inference. The observation error covariance matrix is taken as

cov_e = diag(σ²_amp I_40, σ²_phase I_40),

where σ²_amp and σ²_phase are the amplitude and phase measurement variances, respectively, as described in section 2.2, and I denotes an identity matrix of the specified size (we use 40 amplitude observations and 40 phase observations).

The likelihood L(y|θ) is therefore given by

L(y|θ) = (2π)^(−K/2) det(cov_e)^(−1/2) exp( −½ (y − f(θ))ᵀ cov_e⁻¹ (y − f(θ)) ),

where K = 80 is the number of observations.
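In practice such a likelihood is usually evaluated in log form for numerical stability. A minimal sketch of the generic Gaussian log-likelihood, with placeholder inputs rather than model output:

```python
import numpy as np

def log_likelihood(y, f_theta, cov_e):
    """log L(y|theta) = -K/2 log(2 pi) - 1/2 log det(cov_e)
                        - 1/2 (y - f)^T cov_e^{-1} (y - f)."""
    K = len(y)
    r = y - f_theta
    _, logdet = np.linalg.slogdet(cov_e)   # stable log-determinant
    quad = r @ np.linalg.solve(cov_e, r)   # quadratic misfit term
    return -0.5 * (K * np.log(2.0 * np.pi) + logdet + quad)

# Sanity check: a perfect fit with unit covariance and K = 2 gives -log(2 pi).
ll = log_likelihood(np.zeros(2), np.zeros(2), np.eye(2))
```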