Assessing erosion and flood risk in the coastal zone through the application of multilevel Monte Carlo methods

The risk from erosion and flooding in the coastal zone has the potential to increase in a changing climate. The development and use of coupled hydro-morphodynamic models is therefore becoming an ever higher priority. However, their use as decision support tools suffers from the high degree of uncertainty associated with them, due to incomplete knowledge as well as natural variability in the system. Here we show for the first time how the multilevel Monte Carlo method (MLMC) can be applied to hydro-morphodynamic models, in this case XBeach, to quantify uncertainty by computing statistics of output variables given uncertain input parameters. MLMC accelerates the Monte Carlo approach through the use of a hierarchy of models with different levels of resolution. A variety of theoretical and real-world coastal zone case studies are considered, for which output variables that are important to the assessment of flood and erosion risk are estimated, such as wave run-up height and total eroded volume. We show that MLMC can significantly reduce computational cost, resulting in speed-up factors of 40 or greater compared to a simple Monte Carlo approach, whilst maintaining the same level of accuracy. Furthermore, MLMC is used to estimate the cumulative distribution functions of these output variables for given uncertain parameters. This allows the risk of a variable exceeding a certain value to be calculated, for example the risk of the wave run-up height exceeding the height of a physical structure such as a seawall; this is a useful capability to inform decision-making processes.


Introduction
For millennia, the world's coastal zones have been at risk from erosion and flooding. However, climate change will lead to hydrodynamic changes in the coastal zone, resulting from location-dependent sea level rise and changes in the strength and frequency of storms, which may severely exacerbate these risks and the damage associated with them. Estimates suggest the economic cost of coastal flooding damage in Europe will be two to three orders of magnitude greater than current levels by the end of the century (Vousdoukas et al., 2018). Thus the modelling of coastal zones is a vital endeavour.
For several decades, complex hydro-morphodynamic models have been used to simulate flooding and erosion. Whilst these models may produce accurate results for a given idealised scenario, there is a large degree of uncertainty associated with them when they are applied in the real world, coming from both the input data and the models themselves (see for example Unguendoli (2018) and Villaret et al. (2016)). This uncertainty can be due to either incomplete knowledge or natural variability (Apel et al., 2004) and can be quantified by combining numerical hydro-morphodynamic models with statistical frameworks.
In this work, the depth-averaged finite-volume based coastal ocean model XBeach (Roelvink et al., 2015) is used as our model because of its ability to simulate wave propagation, flow, sediment transport and morphodynamic changes. XBeach has already been used within a Monte Carlo framework, for example in Harris et al. (2018), Pender and Karunarathna (2013) and Simmons et al. (2017). However, like all complex hydro-morphodynamic models, XBeach is computationally relatively expensive and thus does not lend itself well to Monte Carlo simulations and hazard assessments, which require a large number of individual model runs. For example, in Harris et al. (2018), a simulation comprising 240,000 individual runs of the 1D version of XBeach was required to perform the desired study. This computational cost has limited the problems that Monte Carlo based studies can be used to address and has meant that for more complex and long-term test cases, researchers have had to use simplified models. For example, Callaghan et al. (2013) used the simpler 1D semi-empirical model SBeach in their Monte Carlo simulation, which took 40 days; the authors of Callaghan et al. (2013) estimate that even using the 1D version of XBeach would have taken four and a half millennia.
Instead of reverting to simpler, less costly models, we use the relatively novel multilevel Monte Carlo method (MLMC), first presented in Giles (2008). MLMC seeks to reduce computational cost by accelerating the Monte Carlo approach through the use of a hierarchy of model configurations, each with a different level of resolution. It has been used in areas as diverse as mathematical finance (Giles, 2008), data assimilation (Gregory and Cotter, 2017a) and atmospheric dispersion modelling (Katsiolides et al., 2018). In Sánchez-Linares et al. (2016), MLMC was successfully applied to estimate uncertainty in landslide generated tsunamis using a simple shallow-water like model. MLMC quantifies uncertainty by computing estimators for the expectation of discretised random variables for uncertain parameters. In this work we present an initial exploration of how MLMC can be applied to the hydro-morphodynamic model XBeach, to investigate the impact of a variety of uncertain input parameters in both theoretical and real-world test cases. To the best of our knowledge this is the first application of MLMC with complex hydro-morphodynamic models.
In coastal problems, we are not only interested in the expected value of a variable but also in the probability of a variable exceeding a certain value. This quantity can be expressed as an expectation: $\mathbb{E}[\mathbb{1}_{X \geq x}]$, where, for example, $X$ is the maximum horizontal inundation distance, $x$ the location of a physical structure, and $\mathbb{1}$ represents the indicator function. Thus $\mathbb{E}[\mathbb{1}_{X \geq x}]$ gives the probability, or the risk, that the physical structure at $x$ will be flooded. However, as discussed in Giles (2015), MLMC struggles with binary output variables, because a large number of samples are required to ensure accurate variance estimates. A number of different methods have been proposed to deal with this problem, including smoothing and selective refinement (Elfverson et al., 2016). In this work, we use the ensemble generating method from Gregory and Cotter (2017b), which we discuss further in Section 4. Unlike other approaches, this method generates the entire cumulative distribution function for the output variable, which is of much greater value when assessing risk than a single statistic. Furthermore, it does so in a simple and computationally efficient manner.
The remainder of this paper is structured as follows: in Section 2, we briefly outline the relevant MLMC theory; in Section 3, we integrate MLMC with XBeach and run a series of test cases; in Section 4, we estimate the cumulative distribution function of the output variable using MLMC and finally in Section 5, we present conclusions from this work.

Multilevel Monte Carlo method
The simplest method to estimate the expectation of a random variable $P(\omega)$ (where $\omega$ is a single sample from the sample space $\Omega$) is to use a Monte Carlo method and take the average of $P(\omega)$ for $N$ independent samples from $\Omega$. However, the Monte Carlo method has an order of convergence of $N^{-1/2}$ (Caflisch, 1998), meaning that to achieve an accuracy of $\varepsilon$ requires $O(\varepsilon^{-2})$ samples. This makes the simulation very computationally expensive. The multilevel Monte Carlo method (MLMC) was first introduced in Giles (2008). In this section, following Giles (2008) and Giles (2015), we discuss MLMC and its use in computing estimators for the expectation of discretised random variables given uncertain input parameters. We refer the reader to Giles (2008) and Giles (2015) for more details.
Whereas the Monte Carlo method considers the model at just one resolution, MLMC accelerates the Monte Carlo method by using a hierarchy of models at different levels of resolution. The fundamental idea underlying MLMC comes from the linearity of expectations: for two variables $P_\ell$ and $P_{\ell-1}$,
$$\mathbb{E}[P_\ell] = \mathbb{E}[P_{\ell-1}] + \mathbb{E}[P_\ell - P_{\ell-1}],$$
where $\mathbb{E}(\cdot)$ denotes the expectation. Extending this idea gives the telescoping sum
$$\mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}[P_\ell - P_{\ell-1}].$$
In the MLMC framework, $P_\ell$ denotes the numerical approximation to $P$ on level $\ell$ of the multilevel hierarchy, and thus $P_0$ and $P_L$ denote the approximation on the coarsest and finest level respectively. Within the framework, each level is defined by its mesh element size $h_\ell = L_d\, m^{-\ell}$, where $L_d$ is the total length of the domain and $m$ is the integer factor by which the mesh element size is refined at each level (following standard practice, we use $m = 2$ throughout). Thus as $\ell$ increases, the mesh becomes more refined. Note the domain can be multi-dimensional, in which case there will be an $h_\ell$ for every dimension of the domain.
In MLMC, the expectation estimator is defined as
$$\hat{Y} = \sum_{\ell=0}^{L} \hat{Y}_\ell, \qquad (4)$$
where $\hat{Y}_\ell$ is the difference estimator for $\mathbb{E}[P_\ell - P_{\ell-1}]$, defined as
$$\hat{Y}_\ell = \frac{1}{N_\ell} \sum_{i=1}^{N_\ell} \left( P_\ell^{(i)} - P_{\ell-1}^{(i)} \right), \qquad P_{-1} \equiv 0, \qquad (5)$$
where $N_\ell$ is the number of samples at each level $(\ell, \ell-1)$ pair. Here the same random numbers are used to construct the variables $P_\ell^{(i)}$ and $P_{\ell-1}^{(i)}$ in order to minimize the variance of the difference estimator and hence the overall error. To ensure independence, different independent samples are used at each level, meaning $\mathrm{Cov}(\hat{Y}_\ell, \hat{Y}_k) = 0$ if $\ell \neq k$, and thus
$$\mathbb{V}(\hat{Y}) = \sum_{\ell=0}^{L} \frac{V_\ell}{N_\ell}, \qquad (6)$$
where $\mathbb{V}(\cdot)$ denotes the variance and $V_\ell$ is the variance of the sample $P_\ell^{(i)} - P_{\ell-1}^{(i)}$. The overall cost of calculating the expectation estimator (4) is given by
$$C = \sum_{\ell=0}^{L} N_\ell C_\ell, \qquad (7)$$
where $C_\ell$ is the cost per sample at level $\ell$. To calculate the error of the estimator (4), we note that the root mean square error (RMSE) is defined as
$$\mathrm{RMSE} = \sqrt{\mathbb{E}\left[\left(\hat{Y} - \mathbb{E}[P]\right)^2\right]}. \qquad (8)$$
This is simplified by noting that $\hat{Y}$ is an unbiased estimator of $\mathbb{E}[P_L]$, and so
$$\mathrm{RMSE}^2 = \mathbb{E}\left[\left(\hat{Y} - \mathbb{E}[P_L]\right)^2\right] + \left(\mathbb{E}[P_L] - \mathbb{E}[P]\right)^2.$$
Here $\mathbb{E}[(\hat{Y} - \mathbb{E}[P_L])^2]$ is equivalent to (6) and is the Monte Carlo error; and $(\mathbb{E}[P_L] - \mathbb{E}[P])^2$ is the square of the bias and is the numerical discretisation error. A bound on the RMSE (8) is provided by the key complexity theorem for MLMC. We outline it briefly here -- the full details can be found in Giles (2015) with a proof given in detail in Cliffe et al. (2011). Briefly, the theorem states that if the following conditions are met:
$$\left| \mathbb{E}[P_\ell - P] \right| \leq c_1 2^{-\alpha \ell}, \qquad (9a)$$
$$\mathbb{E}[\hat{Y}_\ell] = \begin{cases} \mathbb{E}[P_0], & \ell = 0, \\ \mathbb{E}[P_\ell - P_{\ell-1}], & \ell > 0, \end{cases} \qquad (9b)$$
$$V_\ell \leq c_2 2^{-\beta \ell}, \qquad (9c)$$
$$C_\ell \leq c_3 2^{\gamma \ell}, \qquad (9d)$$
where $\alpha \geq \frac{1}{2} \min(\beta, \gamma)$, and $\alpha, \beta, \gamma, c_1, c_2, c_3$ are positive constants, then there exists a finest level resolution $L$ and a finite number of samples $N_\ell$ such that for any $\varepsilon < e^{-1}$, the estimator $\hat{Y} = \sum_{\ell=0}^{L} \hat{Y}_\ell$ has
$$\mathrm{RMSE} \leq \varepsilon. \qquad (10)$$
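The estimators (4) and (5) can be illustrated with a short sketch. This is not the paper's XBeach wrapper: a toy midpoint-quadrature model, whose accuracy improves with level, stands in for the hydro-morphodynamic solver, and the uncertain input is a single uniform parameter.

```python
import numpy as np

def model(theta, level):
    """Toy stand-in for P_l: midpoint-rule approximation of
    sin(theta) = integral of cos(x) from 0 to theta, with 2**(level+1)
    cells, so finer levels are more accurate."""
    n = 2 ** (level + 1)
    x = (np.arange(n) + 0.5) * theta / n
    return np.cos(x).sum() * theta / n

def mlmc_expectation(L, N, rng):
    """MLMC estimator (4) as a sum of difference estimators (5).
    The SAME theta samples drive P_l and P_{l-1} within a level pair
    (common random numbers); fresh samples are drawn for each level,
    so the hat{Y}_l are independent."""
    total = 0.0
    for lev in range(L + 1):
        thetas = rng.uniform(0.0, 1.0, N)       # uncertain input parameter
        fine = np.array([model(t, lev) for t in thetas])
        if lev == 0:
            diff = fine                          # P_{-1} is defined as 0
        else:
            coarse = np.array([model(t, lev - 1) for t in thetas])
            diff = fine - coarse
        total += diff.mean()                     # hat{Y}_l
    return total

rng = np.random.default_rng(42)
est = mlmc_expectation(L=4, N=4000, rng=rng)
exact = 1.0 - np.cos(1.0)   # E[sin(theta)] for theta ~ U(0, 1)
```

Because the difference `fine - coarse` is small and shrinks with level, each correction term needs far fewer samples than a single-level Monte Carlo estimate of comparable accuracy would.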
Furthermore, the total computational cost of calculating this estimator with this RMSE is
$$\mathbb{E}[C] \leq \begin{cases} c_4\, \varepsilon^{-2}, & \beta > \gamma, \\ c_4\, \varepsilon^{-2} (\log \varepsilon)^2, & \beta = \gamma, \\ c_4\, \varepsilon^{-2 - (\gamma - \beta)/\alpha}, & \beta < \gamma, \end{cases} \qquad (11)$$
where $c_4$ is a positive constant. Note (9a), (9c) and (9d) represent a bound on the bias, variance and cost respectively at each level, and (9b) follows from the definition in (5). As the mesh is refined, $\beta$ controls the decay in the variance and $\gamma$ the growth of the cost. Thus if $\beta > \gamma$, most of the computational cost is at the coarser levels, and if $\beta < \gamma$ it is mostly at the finest levels. This means if $\beta < \gamma$, the order of convergence is better than the standard Monte Carlo method and if $\beta > \gamma$, it is equivalent to the Monte Carlo method. The values of $\alpha$, $\beta$ and $\gamma$ are found using convergence tests: the gradient of the line of best fit of $\log_2 |\mathbb{E}[P_\ell - P_{\ell-1}]|$ against $\ell$ is $-\alpha$, that of $\log_2 V_\ell$ against $\ell$ is $-\beta$, and that of $\log_2$ of the timed cost against $\ell$ is $\gamma$.
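The convergence tests just described amount to linear least-squares fits in $\log_2$ space. A minimal sketch, using synthetic per-level statistics (illustrative values only, not output from any of the test cases in this paper):

```python
import numpy as np

# Suppose a preliminary screening run has produced, for each level l,
# the mean and variance of P_l - P_{l-1} and the timed cost C_l.
# These synthetic values are constructed with alpha=1, beta=2, gamma=1.
levels = np.arange(1, 8)
mean_diff = 0.5 * 2.0 ** (-1.0 * levels)   # |E[P_l - P_{l-1}]| ~ 2^(-alpha*l)
var_diff  = 0.3 * 2.0 ** (-2.0 * levels)   # V_l ~ 2^(-beta*l)
cost      = 1.2 * 2.0 ** ( 1.0 * levels)   # C_l ~ 2^(gamma*l)

# alpha and beta are minus the slopes of the log2 lines of best fit;
# gamma is the slope of the log2 cost line.
alpha = -np.polyfit(levels, np.log2(np.abs(mean_diff)), 1)[0]
beta  = -np.polyfit(levels, np.log2(var_diff), 1)[0]
gamma =  np.polyfit(levels, np.log2(cost), 1)[0]
```

With real model output the points scatter about the fitted lines, so in practice the coarsest levels, where asymptotic decay has not yet set in, are often excluded from the fit.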
A key part of MLMC is determining the optimum $N_\ell$ to achieve the convergence stated in (10). Following standard practice (for example Giles (2008)), the optimum value of $N_\ell$ is determined by using the Euler-Lagrange method to minimize the overall cost (7) subject to a fixed overall variance of $\varepsilon^2/2$. It can be shown that the optimum number of samples at each level is
$$N_\ell = \left\lceil 2\, \varepsilon^{-2} \sqrt{\frac{V_\ell}{C_\ell}} \left( \sum_{k=0}^{L} \sqrt{V_k C_k} \right) \right\rceil. \qquad (12)$$
This formula requires a knowledge of the variance $V_\ell$ and cost $C_\ell$ at each level $\ell$. Following the proof in Cliffe et al. (2011), we instead use the conditions in (9) to create an upper bound on $N_\ell$ (for example, (9c) and (9d) mean that $V_\ell C_\ell < c_2 c_3\, 2^{(\gamma - \beta)\ell}$) and set this to be the optimum number of samples. Thus:
- if $\beta > \gamma$:
$$N_\ell \propto \varepsilon^{-2}\, 2^{-(\beta + \gamma)\ell/2}; \qquad (13)$$
- if $\beta < \gamma$:
$$N_\ell \propto \varepsilon^{-2}\, 2^{(\gamma - \beta)L/2}\, 2^{-(\beta + \gamma)\ell/2}; \qquad (14)$$
- if $\beta = \gamma$:
$$N_\ell \propto \varepsilon^{-2}\, (L+1)\, 2^{-\beta \ell}. \qquad (15)$$
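Formula (12) can be applied directly whenever sample estimates of $V_\ell$ and $C_\ell$ are available. A sketch, with illustrative variances and costs (a $\beta > \gamma$ regime, so most samples land on the coarse levels):

```python
import numpy as np

def optimal_samples(V, C, eps):
    """Optimal N_l from (12): N_l = ceil(2 eps^-2 sqrt(V_l/C_l) sum_k sqrt(V_k C_k)).
    This choice equalises the marginal cost of variance reduction across
    levels, and guarantees a total estimator variance of at most eps^2 / 2."""
    V, C = np.asarray(V, float), np.asarray(C, float)
    K = np.sqrt(V * C).sum()
    return np.ceil(2.0 * eps ** -2 * np.sqrt(V / C) * K).astype(int)

# Illustrative per-level variances and costs: V_l decays faster than C_l
# grows (beta > gamma), so the sample counts fall rapidly with level.
V = [1e-2, 2.5e-3, 6e-4, 1.5e-4]
C = [1.0, 2.0, 4.0, 8.0]
eps = 1e-2
N = optimal_samples(V, C, eps)
```

The ceiling makes the achieved variance slightly smaller than the target, so the bound $\sum_\ell V_\ell / N_\ell \leq \varepsilon^2/2$ holds with a little slack.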

MLMC verification checks
Accompanying MLMC are the following checks, detailed in Giles (2015), to ensure the method implementation is both correct and appropriate.
• Convergence check: condition (9a) implies that the remaining bias can be estimated from the finest difference estimator as $|\mathbb{E}[P - P_L]| \approx |\hat{Y}_L| / (2^\alpha - 1)$. Therefore we use the following as a convergence test:
$$\frac{|\hat{Y}_L|}{2^\alpha - 1} < \frac{\varepsilon}{\sqrt{2}}, \qquad (16)$$
where $\alpha$ is determined by calculating the gradient of the line of best fit of $\log_2 |\mathbb{E}[P_\ell - P_{\ell-1}]|$ against $\ell$. If this check fails, this means MLMC has not converged to within the specified tolerance -- this may mean there is a mathematical or programming error, or simply that the value of $\varepsilon$ chosen is too small for the number of levels in the MLMC algorithm.
• Consistency check: writing $\hat{a}$, $\hat{b}$ and $\hat{c}$ for the sample estimates of $\mathbb{E}[P_{\ell-1}]$, $\mathbb{E}[P_\ell]$ and $\mathbb{E}[P_\ell - P_{\ell-1}]$ respectively, the identity $\mathbb{E}[P_{\ell-1}] - \mathbb{E}[P_\ell] + \mathbb{E}[P_\ell - P_{\ell-1}] = 0$ means that the ratio
$$\frac{|\hat{a} - \hat{b} + \hat{c}|}{3\left(\sqrt{\mathbb{V}(\hat{a})} + \sqrt{\mathbb{V}(\hat{b})} + \sqrt{\mathbb{V}(\hat{c})}\right)}$$
has a probability of 0.3% of being greater than 1 (see Giles (2015)). If this ratio is greater than 1, this indicates that there is a difference between how $P_\ell$ and $P_{\ell-1}$ are computed, and not just that they are on a different mesh.
• Kurtosis check: The kurtosis, $\kappa$, of $P_\ell - P_{\ell-1}$ determines the number of samples required for a good variance estimate. If $\kappa$ is very large, this indicates that the variance estimate is poor and that either the number of samples used is insufficient or the method chosen for sampling is incorrect. Thus, following standard practice, if the kurtosis is greater than 100, an error is raised.
We use these checks to evaluate the coding and mathematical implementation of the test cases in this work, and to identify any inherent problems with applying MLMC to these test cases.
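The convergence and kurtosis checks are short enough to sketch directly. This is an illustrative implementation (the function names and the raw, non-excess kurtosis convention are our own choices, not lifted from any particular MLMC library):

```python
import numpy as np

def convergence_check(Y_L, alpha, eps):
    """Test (16): the remaining bias, extrapolated from the finest
    difference estimator via |E[P - P_L]| ~ |Y_L| / (2**alpha - 1),
    must fit within the bias half of the eps^2 error budget."""
    return abs(Y_L) / (2.0 ** alpha - 1.0) < eps / np.sqrt(2.0)

def kurtosis_check(samples, limit=100.0):
    """Kurtosis of the P_l - P_{l-1} samples (raw, not excess: a
    Gaussian gives ~3). Raise if it exceeds the limit, since the
    variance estimate is then unreliable."""
    d = samples - samples.mean()
    kurt = (d ** 4).mean() / (d ** 2).mean() ** 2
    if kurt > limit:
        raise RuntimeError(f"kurtosis {kurt:.1f} > {limit}: "
                           "variance estimate unreliable")
    return kurt
```

For near-Gaussian level differences the kurtosis sits far below the threshold; it is heavy-tailed differences (for example, rare wetting/drying switches between meshes) that trigger the check.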

MLMC algorithm
We have now outlined the basic MLMC method and conclude this section with a statement of the MLMC algorithm used in this work, modified from Giles (2008).

Algorithm 1: Multilevel Monte Carlo Method
1. Start with $L = 0$.
2. Estimate the variance $V_L$ using an initial estimate for the number of samples $N_L$.
3. Define the optimal $N_\ell$ for $\ell = 0, \ldots, L$ using (12).
4. If the optimal $N_\ell$ is greater than the number of samples you already have, evaluate the extra samples needed.
5. If $L \geq 2$, test for convergence, i.e. check if (16) is satisfied.
6. If $L < 2$ or the algorithm has not converged, set $L := L + 1$ and return to Step 2.
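Algorithm 1 can be sketched in a few dozen lines. This is a minimal illustrative implementation, not the XBeach wrapper used in this work: it assumes a per-sample cost of $C_\ell = 2^\ell$, takes the rate $\alpha$ as already known from a screening run, and expects a user-supplied `sample_diff(l, n)` returning `n` coupled samples of $P_\ell - P_{\ell-1}$ (with $P_{-1} = 0$):

```python
import numpy as np

def mlmc_algorithm(sample_diff, eps, alpha, N0=100, L_max=10):
    """Sketch of Algorithm 1 with assumed costs C_l = 2**l."""
    L = 0
    samples = []                                 # per-level sample arrays
    while True:
        if len(samples) <= L:                    # Steps 1-2: new level with
            samples.append(sample_diff(L, N0))   # an initial variance estimate
        V = np.array([s.var() for s in samples])
        C = 2.0 ** np.arange(L + 1)
        # Step 3: optimal N_l from (12)
        N_opt = np.ceil(2 * eps ** -2 * np.sqrt(V / C)
                        * np.sqrt(V * C).sum()).astype(int)
        # Step 4: top each level up to its optimal sample count
        for l in range(L + 1):
            extra = N_opt[l] - len(samples[l])
            if extra > 0:
                samples[l] = np.concatenate([samples[l], sample_diff(l, extra)])
        # Step 5: convergence test (16) once at least three levels exist
        Y_L = samples[-1].mean()
        converged = L >= 2 and abs(Y_L) / (2 ** alpha - 1) < eps / np.sqrt(2)
        if converged or L == L_max:
            return sum(s.mean() for s in samples)   # estimator (4)
        L += 1                                      # Step 6
```

In the real wrapper each call to `sample_diff` launches coupled coarse/fine XBeach runs, so Step 4 dominates the runtime and is the natural place to parallelise.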

Applying MLMC to XBeach
In this section, we apply MLMC to the XBeach model in order to estimate the expectation of random variables in test cases given uncertain input data. We apply MLMC through a Python wrapper around the XBeach model. We have also parallelised the wrapper meaning that, within the same level, multiple XBeach runs can be performed at the same time, thus increasing the efficiency of the algorithm. Note our MLMC wrapper is written so that it can easily be applied to other models in further work. We begin this section by giving a brief overview of the XBeach model.
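The parallelisation of the wrapper can be sketched as follows. This is an illustration rather than the actual wrapper code: `run_model(level, theta)` is a hypothetical stand-in for launching one XBeach run at a given resolution with a given input sample, and threads suffice here because each XBeach run is an external process, leaving the Python side I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor

def run_level_samples(run_model, level, thetas, max_workers=4):
    """Evaluate many independent model runs of one level concurrently.
    `run_model(level, theta)` is a hypothetical interface to one XBeach
    run; within a level, runs are independent and can proceed in parallel."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_model, [level] * len(thetas), thetas))
```

Only runs within the same level are batched together; the coupled coarse/fine members of a difference pair must still share their random inputs, as discussed in Section 3.3.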

XBeach model
XBeach is a depth-averaged finite-volume coastal ocean model which simulates near-shore hydrodynamics and morphodynamics (Roelvink et al., 2015). Surface elevation and flow are modelled using the Generalised Lagrangian Mean (GLM) formulation of the depth-averaged shallow water equations (Andrews and McIntyre, 1978), enabling the modelling of short wave-induced mass fluxes and return flows. These equations are:
$$\frac{\partial u^L}{\partial t} + u^L \frac{\partial u^L}{\partial x} + v^L \frac{\partial u^L}{\partial y} - f v^L - \nu_h \left( \frac{\partial^2 u^L}{\partial x^2} + \frac{\partial^2 u^L}{\partial y^2} \right) = \frac{\tau_{sx}}{\rho h} - \frac{\tau_{bx}}{\rho h} - g \frac{\partial \eta}{\partial x} + \frac{F_x}{\rho h},$$
$$\frac{\partial v^L}{\partial t} + u^L \frac{\partial v^L}{\partial x} + v^L \frac{\partial v^L}{\partial y} + f u^L - \nu_h \left( \frac{\partial^2 v^L}{\partial x^2} + \frac{\partial^2 v^L}{\partial y^2} \right) = \frac{\tau_{sy}}{\rho h} - \frac{\tau_{by}}{\rho h} - g \frac{\partial \eta}{\partial y} + \frac{F_y}{\rho h},$$
$$\frac{\partial \eta}{\partial t} + \frac{\partial (h u^L)}{\partial x} + \frac{\partial (h v^L)}{\partial y} = 0,$$
where $(u^L, v^L)$ is the Lagrangian velocity, defined in Roelvink et al. (2015) as the distance travelled by an individual water particle during one wave period divided by the wave period; $f$ is the Coriolis coefficient; $\nu_h$ the horizontal viscosity; $\tau_s$ and $\tau_b$ the wind and bed shear stresses respectively, where the bed shear stress is evaluated using the Eulerian velocity $u^E$, defined in Roelvink et al. (2015) as the fixed-point short-wave-averaged velocity; $\eta$ the surface elevation; $F$ the wave-induced stresses; $\rho$ the density; and $h$ the water depth. Waves can be modelled using several different options in XBeach. In this work, the stationary wave model is used in Section 3.2 and the surfbeat model in Section 3.3. Both methods solve the following wave action equation for the short wave energy:
$$\frac{\partial A}{\partial t} + \frac{\partial (c_x A)}{\partial x} + \frac{\partial (c_y A)}{\partial y} + \frac{\partial (c_\theta A)}{\partial \theta} = -\frac{D_w + D_f + D_v}{\sigma},$$
where $A = E_w / \sigma$ is the wave action and $\sigma$ the intrinsic wave frequency; in the stationary mode, $\partial A / \partial t = 0$. Here $c_x$ is the group velocity in the $x$-direction, $c_y$ the group velocity in the $y$-direction, $c_\theta$ the refraction speed, $D_w$ the dissipation by wave breaking, $D_f$ the dissipation by bottom friction, $D_v$ the dissipation by vegetation and $E_w$ is the wave energy. Note the formulations of the dissipation terms are different in the stationary and surfbeat modes (see Bart (2017) for further details).
Finally, to model morphological changes in Section 3.3.1, XBeach uses the following standard depth-averaged advection-diffusion equation to model sediment transport:
$$\frac{\partial (hC)}{\partial t} + \frac{\partial (hC u^E)}{\partial x} + \frac{\partial (hC v^E)}{\partial y} = \frac{\partial}{\partial x}\left[ D_h h \frac{\partial C}{\partial x} \right] + \frac{\partial}{\partial y}\left[ D_h h \frac{\partial C}{\partial y} \right] + \frac{h C_{eq} - hC}{T_s},$$
where $C$ is the depth-averaged sediment concentration, $(u^E, v^E)$ the Eulerian velocity, $D_h$ the diffusivity coefficient, $C_{eq}$ the equilibrium sediment concentration and $T_s$ the adaptation time; and the Exner equation to model bed changes:
$$\frac{\partial z_b}{\partial t} + \frac{f_{mor}}{1 - p} \left( \frac{\partial q_x}{\partial x} + \frac{\partial q_y}{\partial y} \right) = 0,$$
where $z_b$ is the bed level, $p$ the bed porosity, $f_{mor}$ the morphological acceleration factor used to artificially increase the rate at which the bed changes compared with the underlying hydrodynamics, and $q = (q_x, q_y)$ the sediment transport rates.
In this work, we use version 1.23.5526 of the XBeachX release and unless explicitly stated, parameter values are left at the default for this version.

Uncertainty in bed slope angle
We now apply the MLMC framework from Section 2 to the XBeach model described above. In the following test cases, we evaluate the uncertainty associated with the beach bed profile by considering the bed slope angle to be uncertain. This represents a significant source of uncertainty, as discussed in Unguendoli (2018).

One-dimensional (1D) bed slope test case
For the first test case, we consider a simple 1D hydrodynamics-only problem of a stationary wave approaching the beach, with a wave period of 10 s and a maximum wave amplitude of 2.5 m; each simulation is run for 200 s. The beach slope starts at $x = 750$ m and the bed set-up is depicted in Figure 1.
The distribution of the bed slope angle $\theta$ is set to be $\theta \sim \mathcal{U}(\arctan(0.035), \arctan(0.5))$. The lower bound is chosen to ensure that there is always an area of the bed which is above the initial water level. We seek the expected value of the maximum horizontal inundation distance reached during the simulation. In order to satisfy the constraints on $P$ required to achieve the convergence stated in (10), the inundation distance is normalised by dividing it by the total $x$-length of the domain.
Before running the full MLMC algorithm (Algorithm 1), we first run Steps 1 and 2, in order to check that the MLMC approach is appropriate for this test case. We use 7 levels ($\ell = 2$ to $8$) and 500 samples at each level. Figure 2a, a log plot of $V(P_\ell - P_{\ell-1})$ against level, shows that the variance decreases as the level number increases, i.e. as the mesh gets finer. This observation is important because, recalling (6), it means fewer samples are needed on the finer meshes. The figure also shows that $V(P_\ell - P_{\ell-1})$ is lower than $V(P_\ell)$ for all $\ell$. This is also important because otherwise a lower variance could be achieved by using the same number of samples with a simple Monte Carlo simulation. Thus we expect the MLMC simulation to be less computationally expensive than a Monte Carlo simulation, meaning the choice of the MLMC approach for this test case is justified. Figure 2b shows, as expected, that the cost calculated by timing the code increases as the level number increases (note here this is the time taken for the code to run in parallel). This is because, for the preliminary MLMC run, a fixed number of sample points is being used and, at finer grid levels, XBeach is more computationally expensive. Using these figures, as well as the log plot of $|\mathbb{E}[P_\ell - P_{\ell-1}]|$ against level $\ell$, convergence tests estimate the convergence values $\alpha$, $\beta$, $\gamma$ defined in (9) as $\alpha = 1.29$, $\beta = 1.89$, $\gamma = 1.10$ for this test case. Thus the numerical discretisation is approximately first order accurate (based upon the value of $\alpha$) and, as $\beta > \gamma$, most of the computational cost in the MLMC algorithm will come from the coarser levels. This means that we expect $C \propto \varepsilon^{-2}$, which is the same as for Monte Carlo, but we still expect MLMC to be computationally faster due to the improvements in the variance. These observations from the preliminary study, as well as the fact that the consistency and kurtosis checks from Section 2.1 passed, mean we can conclude that MLMC can be used successfully with XBeach and we can run the full algorithm.
For this test case, the chosen range of tolerance values is $\varepsilon = [0.0002, 0.0005, 0.0008, 0.0012, 0.002, 0.003]$, recalling that $\varepsilon$ controls the bound on the convergence test (16). As $\beta > \gamma$, we use these values in the formula (13) to derive the optimum number of samples required at each level, $N_\ell$. We also set a maximum limit on the finest level that $\ell$ can reach of $\ell_{max} = 13$ (see Step 6 of Algorithm 1). Figure 3a shows the optimal $N_\ell$ and, most importantly, that as the level number increases (i.e. as the grid becomes finer), fewer samples are needed, which makes MLMC more efficient than a simple Monte Carlo method.
From these $N_\ell$, we use the formula given in Giles (2008) to calculate the theoretical cost of the simulation for each $\varepsilon$. Figure 3b shows that both the gradient of the theoretical cost and that of the timed cost agree with the expected rate $C \propto \varepsilon^{-2}$, and thus the method has been correctly applied.
(a) Optimum number of samples $N_\ell$ required at each level $\ell$ for given tolerance value $\varepsilon$.
(b) Verification that both theoretical cost (23) and timed cost vary with $\varepsilon$ at the expected rate.

To test the accuracy of the MLMC estimate, we also calculate the expected value using a Monte Carlo simulation at the finest mesh considered by the MLMC simulation ($\Delta x = 0.153$ m). According to Monte Carlo, the expected maximum horizontal inundation distance divided by the total domain length is 0.6785 (to four significant figures). As the total domain length is 1250 m, this means the water inundates a horizontal distance of 848.1 m. This agrees to four significant figures with the value found using the MLMC simulation with $\varepsilon = 0.0002$. We can calculate the root mean square error of the MLMC simulation compared to the Monte Carlo simulation by using the Monte Carlo value in place of $\mathbb{E}[P]$ in (8). Figure 4a shows that as the mesh becomes finer, the RMSE decreases for all values of $\varepsilon$ considered. In addition, Figure 4b shows that as $\varepsilon$ decreases, in general the final expected value becomes closer to the real solution, showing that the convergence condition (16) has worked as expected. Note the only exception to this trend is for $\varepsilon = 0.002$ and $\varepsilon = 0.003$. This is likely because at these values of $\varepsilon$ the optimal number of samples is quite low (see Figure 3a), and so the variance approximations are likely not to be completely accurate.
Finally, Figure 5 compares the total cost and accuracy for MLMC versus Monte Carlo. It shows that, as expected, the Monte Carlo simulation does not converge uniformly to the final value. By contrast, the figure shows the MLMC convergence is much more uniform, due to the use of the convergence test in Section 2.1 and the optimum-number-of-samples calculation. Thus, unlike with Monte Carlo, for MLMC we do not need to conduct further samples to check whether the algorithm has really converged, decreasing the overall computational cost. More significantly, the figure also shows that it takes the Monte Carlo method more than 110,400 core hours ($4.0 \times 10^8$ s) to consistently achieve an RMSE lower than that achieved by MLMC with $\varepsilon = 0.0002$ in 768 core hours ($2.8 \times 10^6$ s), a speed-up factor of over 140.
Although this test case is somewhat abstract, it demonstrates that MLMC can be used with XBeach for a wave with known properties approaching a 1D beach with unknown slope. We have shown that the key advantage of MLMC is that it can achieve results considerably faster when compared to using a simple Monte Carlo approach. This means we can confidently move to more realistic and complicated 2D problems.

Two-dimensional (2D) bed slope test case
In order to show the preceding promising results are not restricted to the simple 1D case, we briefly consider the more complex test case of a 2D beach. We use the same parameters as in Section 3.2.1 and modify the bed slope angle, $\theta$, to apply in both the $x$ and $y$ directions, as shown in Figure 6. We again choose a distribution of $\theta \sim \mathcal{U}(\arctan(0.035), \arctan(0.5))$ and seek to find the expected value of the maximum horizontal inundation distance along the $x$-axis that the water reaches during the simulation over all values of $y$. The results are again normalised by dividing by the total $x$-length of the domain, 1250 m.
To check whether MLMC can be used for this test case, we first run Steps 1 and 2 of Algorithm 1. Figure 7 shows that $V(P_\ell - P_{\ell-1})$ decreases as the mesh gets finer and that for all $\ell$ it is lower than $V(P_\ell)$, meaning fewer samples are needed on finer meshes. Using convergence tests, the values in (9) are found to be $\alpha = 1.18$, $\beta = 1.69$, $\gamma = 2.93$. Thus the numerical discretisation is approximately of the same order as in the 1D test case. As $\gamma > \beta$, most of the computational cost will come from the finer levels and we expect the order of convergence to be better than the standard Monte Carlo method.
The simulation also passes the consistency and kurtosis checks from Section 2.1, and thus the preliminary application of MLMC to a 2D test case in XBeach is successful. To run the full MLMC algorithm, we choose tolerance values of $\varepsilon = [0.001, 0.0017, 0.0025, 0.0035, 0.005, 0.006]$ and set the maximum limit on the finest level $\ell$ can reach to be $\ell_{max} = 8$. Note that we choose larger values of $\varepsilon$ here than in Section 3.2.1 so that we do not need to consider as fine a mesh as in Section 3.2.1. Finer meshes are not an issue for MLMC, but using a finer mesh than $\ell_{max} = 8$ would make the computational cost of the Monte Carlo simulation later in this section unmanageable. As $\beta < \gamma$, the optimum $N_\ell$ are calculated using (14) and are shown in Figure 8a. Importantly, the number of samples required decreases as the mesh becomes finer.
For this test case $\gamma > \beta$ and so, following (11), the cost of the algorithm should be such that $C \propto \varepsilon^{-2-(\gamma-\beta)/\alpha}$. Figure 8b shows this is approximately true for both the gradient of the timed cost and that of the theoretical cost (calculated using (23)).
To calculate the RMSE of MLMC for this test case, we estimate the real solution $\mathbb{E}[P]$ using a Monte Carlo simulation with $\Delta x = 1250/2^9 = 2.44$ m and $\Delta y = 1000/2^9 = 1.95$ m, which is the same as the finest mesh in the MLMC simulation. The maximum expected horizontal inundation distance along the $x$-axis over all time and all $y$, divided by the total $x$-length of the domain, is 0.6934 (to four significant figures). As the total domain length is 1250 m, this means the water inundates a horizontal distance of 866.8 m. Figures 9a and 9b show that the RMSE of MLMC decreases both as the level number increases (i.e. as the fineness of the mesh increases) and as $\varepsilon$ decreases, showing that MLMC is working as expected. Finally, Figure 10 compares the cost and accuracy of the MLMC and Monte Carlo methods. As in Section 3.2.1, MLMC converges more uniformly than Monte Carlo and, more importantly, it takes the Monte Carlo method more than 222,240 core hours ($8.0 \times 10^8$ s) to consistently achieve an RMSE that is lower than that achieved by MLMC with $\varepsilon = 0.001$ in 160 core hours ($5.7 \times 10^5$ s). (Note the time taken to achieve the best MLMC result is less than in Section 3.2.1, despite the difference in dimensions, because in this section we use larger $\varepsilon$ values, as discussed above.) This is a speed-up factor of approximately 1400, which is an order of magnitude larger than the speed-up factor achieved in the 1D version of this test case. We have thus again significantly reduced the computational cost whilst maintaining accuracy.

Uncertainty in wave height
So far in this work, we have only considered the uncertainty associated with a simple beach bed profile. Another significant source of uncertainty is the structure of the waves approaching the beach. Whilst the previous test cases are purely theoretical, in this section we consider a lab test case and a test case using real-world data, with the uncertain parameter being the significant wave height $H_{m0}$ in the JONSWAP wave spectrum (see Hasselmann et al. (1973) for more details on this spectrum).
A vital condition of MLMC theory is that, for each $(P_\ell, P_{\ell-1})$ pair calculation, the only difference in the model must be the mesh element size. Thus, when calculating $\hat{Y}_\ell$, we use the XBeach option random = 0, which means that the same random seed is used to initialise the wave boundary conditions for both $P_{\ell-1}$ and $P_\ell$. Both test cases below pass the consistency check in Section 2.1, indicating that the formulation is both mathematically correct and implemented correctly.
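The coupling requirement above can be sketched in a few lines: both members of a difference pair are driven by the same seed, so the random wave boundary conditions are identical and only the mesh resolution differs. Here `run_model(level, seed)` is a hypothetical model interface, and the toy model is only an illustration of the seed-sharing idea:

```python
import numpy as np

def coupled_pair(run_model, level, sample_index):
    """One coupled sample of P_l - P_{l-1} for the difference estimator.
    Both pair members receive the SAME seed (the role played by XBeach's
    `random = 0` option), so the random wave train cancels in the
    difference and only resolution effects remain."""
    seed = 1000 + sample_index        # deterministic, distinct seed per sample
    fine = run_model(level, seed)
    coarse = run_model(level - 1, seed) if level > 0 else 0.0
    return fine - coarse

# Toy model: a seeded random "wave signal" plus a resolution-dependent bias.
toy = lambda level, seed: np.random.default_rng(seed).normal() + 2.0 ** (-level)
d = coupled_pair(toy, 3, 0)           # the random parts cancel exactly
```

If the two runs drew independent wave trains instead, the variance of the difference would not decay with level and condition (9c) would fail.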

Morphology test case
An important variable to consider when assessing the risk to coastal areas is the volume of material eroded. To investigate this, we consider the standard DelflandStorm test case, available as part of the XBeach documentation (Roelvink et al., 2015). Note this is a 1D test case, so we only alter the mesh element size in the $x$-direction.
In this test case, we seek the expected total volume change over the simulation, found by using the trapezium rule to approximate the integral of the difference between the initial bed profile and the final bed profile, as shown in Figure 11. Thus if this number is positive, the erosion volume is greater than the accretion volume and the beach has lost material and been net eroded; if the inverse is true, the beach has accumulated material and grown. Note that, as this is a test case in one horizontal dimension, this volume is in fact an area, but to avoid confusion with surface area we shall continue referring to it as a volume. We normalise the volume change using the total initial volume of the beach, i.e. the volume underneath the slope, with the minimum value of the bed level being the lower vertical limit. The distribution of the wave height is set to be $H_{m0} \sim \mathcal{U}(0\ \mathrm{m}, 5\ \mathrm{m})$.
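The trapezium-rule volume change can be sketched as follows. The profiles here are illustrative (a plane beach losing a Gaussian hump of sand), not the DelflandStorm data:

```python
import numpy as np

def volume_change(x, z_initial, z_final):
    """Net volume change per unit width (an area in 1D) via the trapezium
    rule; positive means net erosion, negative means net accretion."""
    d = z_initial - z_final
    return float(np.sum(0.5 * (d[1:] + d[:-1]) * np.diff(x)))

# Illustrative profiles: a plane beach that loses a Gaussian hump of sand
# between the initial (z0) and final (z1) states.
x = np.linspace(0.0, 100.0, 2001)
z0 = 0.05 * x + 0.5 * np.exp(-((x - 50.0) / 5.0) ** 2)
z1 = 0.05 * x
dv = volume_change(x, z0, z1)   # analytically 0.5 * 5 * sqrt(pi) m^2 eroded
```

Dividing `dv` by the total initial volume gives the normalised output variable used in this test case.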
To check whether MLMC can be used for this test case, we run Steps 1 and 2 of Algorithm 1. Figure 12 shows that $V(P_\ell - P_{\ell-1})$ decreases as the mesh becomes finer and that for all $\ell$ it is lower than $V(P_\ell)$. Using convergence tests, the convergence values in (9) are $\alpha = 0.921$, $\beta = 1.77$, $\gamma = 2.40$. Thus the numerical discretisation is approximately first order accurate and, as $\gamma > \beta$, we expect the order of convergence to be better than the standard Monte Carlo method. In addition, the test case also passes the standard checks in Section 2.1.
Given the success of the preliminary study, we run the full algorithm and consider tolerance values of $\varepsilon = [1.5 \times 10^{-5}, 2 \times 10^{-5}, 5 \times 10^{-5}, 10^{-4}]$ with a maximum limit for the finest level of $\ell_{max} = 12$. As $\beta < \gamma$, the optimum $N_\ell$ are calculated using (14) and are shown in Figure 13a. Importantly, the number of samples required decreases as the mesh becomes finer. Furthermore, for this test case $\gamma > \beta$ and so, following (11), the cost of the algorithm should be such that $C \propto \varepsilon^{-2-(\gamma-\beta)/\alpha}$. Figure 13b shows that both the timed cost and the theoretical cost (using (23)) follow the expected gradient. In order to calculate the RMSE of MLMC for this test case, we estimate the real solution $\mathbb{E}[P]$ through a Monte Carlo simulation using $\Delta x = 1421.562/2^{12} = 0.347$ m, which is the same as the finest mesh utilised in the MLMC simulation. The expected volume change as a proportion of the total initial volume is 0.0006150 (to four significant figures). This is positive and thus means that the beach has been net eroded during the simulation. As the total volume is 16,333 m², this means that 10 m² of beach material is eroded during the simulation (recall here that the test case is in one horizontal dimension, so the volume is in fact an area). Figures 14a and 14b confirm that the RMSE decreases as the level number increases and as $\varepsilon$ decreases respectively, thus confirming MLMC is working as expected. Unlike the other test cases considered, Figure 14b shows that the relationship between the RMSE and $\varepsilon$ is not uniform: there is a clear elbow in the plot at $\varepsilon = 5 \times 10^{-5}$. This is a discrete effect caused by the fact that, for $\varepsilon = 10^{-4}$, the convergence test (16) passes without requiring any samples on the finest level ($\ell = 12$) (see Figure 13a).
Finally, Figure 15 compares the cost and accuracy for MLMC versus Monte Carlo. It shows that in this test case the convergence of the Monte Carlo simulation is notably less uniform than that of MLMC, meaning the Monte Carlo simulation must be run for a long time to ensure convergence. (Note that the elbow present in Figure 14b is again present in the MLMC convergence, for the same reason.) Importantly, Figure 15 also shows that the Monte Carlo method requires over 400,000 core hours ($1.4 \times 10^9$ s) to consistently achieve an RMSE lower than that achieved by MLMC with $\varepsilon = 1.5 \times 10^{-5}$ in 10,911 core hours ($3.9 \times 10^7$ s). Thus MLMC results in a speed-up factor of approximately 40, and we have shown that MLMC can significantly decrease computational costs without compromising on accuracy for a full hydro-morphodynamic model with uncertainty in the wave height.

Boscombe Beach test case
For the final test case, we consider the real-world case of Boscombe Beach in Dorset, UK, which is a standard 2D test case in the XBeach documentation (Roelvink et al., 2015). The distribution of the wave height is set to be $H_{m0} \sim \mathcal{U}(0\ \mathrm{m}, 3\ \mathrm{m})$. To make the problem simpler, the morphological and tidal components present in the standard test case are switched off, but all remaining values are kept the same as in the standard test case.
When assessing flood risk, an important quantity not yet considered in this work is the wave run-up height. Traditionally, many coastal structures have been designed to withstand the wave run-up height exceeded by 2% of incoming waves, R₂% (shown in Figure 16), the idea being that soft engineering measures such as clay and grass are sufficient to withstand the 2% of waves that exceed this height (see Allsop et al. (2007)). Thus in this section we seek the expected value of the maximum run-up height R₂% attained during the simulation over the whole domain. Note that R₂% is by definition a probabilistic measure, but there exist a variety of standard formulae that can be used to estimate this value deterministically for an individual simulation (see Melby et al. (2012)). We use the industry-standard EurOtop formula given in Allsop et al. (2007),

R₂% = 1.65 γ_b γ_f γ_β ξ_{m−1,0} H_{m0},

where γ_b, γ_f and γ_β are the influence factors for the berm (raised bank), slope roughness and oblique waves respectively, H_{m0} is the wave height at the beach-toe (i.e. where the water level and the bed are equal to each other) and ξ_{m−1,0} is the breaker parameter. Note that, as the specific values of γ_b, γ_f and γ_β for Boscombe Beach are unavailable, they are assumed equal to 1; given the linear relationship between these factors and R₂%, it is simple to adjust them after the simulations have finished. The breaker parameter relates the slope steepness tan α to the wave steepness,

ξ_{m−1,0} = tan α / √(s_{m−1,0}),

where α is the beach slope angle at the beach-toe and s_{m−1,0} is the ratio of the wave height to the wavelength (H_{m0}/L_{m−1,0}). The wavelength is calculated as L_{m−1,0} = g T²_{m−1,0}/(2π), and we use the standard approximation (see Melby et al. (2012)) that T_{m−1,0} = T_p/1.1, where T_p is the peak wave period.
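The run-up calculation described above can be sketched as follows. This is our own illustrative implementation (the function name and default arguments are ours), assuming the standard EurOtop coefficient of 1.65 and the deep-water wavelength relation, with the influence factors defaulting to 1 as in the Boscombe Beach set-up.

```python
import math

def runup_2pct(H_m0, T_p, tan_alpha,
               gamma_b=1.0, gamma_f=1.0, gamma_beta=1.0):
    """Deterministic R_2% estimate from the EurOtop-style formula.
    H_m0: wave height at the beach-toe [m]; T_p: peak wave period [s];
    tan_alpha: beach slope at the toe; influence factors default to 1."""
    T_m10 = T_p / 1.1                         # spectral period approximation
    L_m10 = 9.81 * T_m10**2 / (2 * math.pi)   # deep-water wavelength
    s_m10 = H_m0 / L_m10                      # wave steepness
    xi_m10 = tan_alpha / math.sqrt(s_m10)     # breaker parameter
    return 1.65 * gamma_b * gamma_f * gamma_beta * xi_m10 * H_m0
```

Because R₂% is linear in each influence factor, results computed with unit factors can simply be rescaled afterwards, as noted above.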

Convolution filter
To set the bed in the XBeach model, we use the bed data provided in the XBeach documentation (Roelvink et al., 2015). In real-world environments, the beach profile is not smooth, unlike in the test cases considered so far in this work. Instead, there may be sandbars or other similar physical features present, either natural or anthropogenic. This could potentially cause issues, especially if a feature is not represented on coarser meshes and suddenly appears on finer meshes. However, with MLMC it is not necessary that the coarsest mesh be a good approximation to the finest mesh, merely that each mesh ℓ be a good enough approximation to the next finest mesh ℓ + 1.
In order to mitigate the issue of features suddenly appearing, we use a convolution matrix filter, which has the general expression

(w ⋆ x)_i = Σ_{j=−r}^{r} w_j x_{i+j},

where r = ⌊n/2⌋ and n is the length of w. An appropriate filter kernel to use in this case is the normalised arithmetic mean, w_j = 1/n. At the edges, we use the 'nearest' method, meaning that the edge value is repeated outwards as many times as necessary for the multiplication by the weights. For this test case, x is the underlying bed and the convolution filter is applied in the direction perpendicular to the coastline. The convolution filter length n_ℓ at each level follows n_ℓ = max(ℓ_max − ℓ, 1), where ℓ_max is the finest mesh level considered. At level ℓ_max, n = 1, which is the identity filter, and thus the bed is unchanged. Hence, as the mesh becomes finer, the convolved bed slowly converges to the original bed and thus there are no extra errors associated with using a convolved bed.
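A minimal sketch of this bed-smoothing step, assuming the level-dependent filter length n_ℓ = max(ℓ_max − ℓ, 1) described above (the function name and 1D set-up are ours):

```python
import numpy as np

def smooth_bed(bed, level, level_max):
    """Smooth a 1D cross-shore bed profile with a mean filter of length
    n = max(level_max - level, 1), using 'nearest' (edge-repeat) padding.
    At level == level_max the filter is the identity, so the original
    bed data is recovered on the finest mesh."""
    bed = np.asarray(bed, dtype=float)
    n = max(level_max - level, 1)
    r = n // 2
    # Repeat edge values outwards ('nearest'), asymmetrically for even n,
    # so the filtered profile keeps the same length as the input.
    padded = np.pad(bed, (r, n - 1 - r), mode="edge")
    kernel = np.full(n, 1.0 / n)   # normalised arithmetic mean
    return np.convolve(padded, kernel, mode="valid")
```

On coarse levels a sharp feature such as a sandbar is spread over neighbouring cells rather than appearing abruptly, while on the finest level the data passes through unchanged.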
To check whether it is valid to use MLMC for this test case, we run steps 1 and 2 of Algorithm 1. Figure 17a shows that the variance V[P_ℓ − P_{ℓ−1}] decreases as the mesh gets finer. At ℓ = 5, V[P_ℓ − P_{ℓ−1}] is approximately equal to V[P_ℓ], probably because, even with the convolution, the mesh at ℓ = 5 does not give the best approximation to ℓ = 6, thus creating a relatively large variance. To further check the effects of using a convolution, Figure 17b shows how |E[P_ℓ − P_{ℓ−1}]| varies; unlike with the variance, the decrease in this quantity is smooth and shows that the numerical discretisation is approximately first-order accurate, as with the test case in Section 3.3.1. In addition, the simulation also passes the consistency and kurtosis checks from Section 2.1, and thus we can be confident that the MLMC algorithm is appropriate. Using convergence tests, the convergence values in (9) are α = 0.996, β = 1.84 and γ = 2.52. As β > 0, we expect the order of convergence to be better than that of the standard Monte Carlo method. Proceeding, we consider tolerance values of ε = [0.002, 0.004, 0.006, 0.008, 0.01] and set a maximum limit for the finest mesh of ℓ_max = 8, because this corresponds to the mesh on which the Boscombe Beach bed data is defined. As γ > β, the optimum numbers of samples N_ℓ are calculated using (14) and are shown in Figure 18a. Importantly, the number of samples decreases as the mesh becomes finer. For this test case γ > β and so, following (11), the cost of the algorithm should satisfy C ∝ ε^(−2−(γ−β)/α). Figure 18b shows that this is roughly true for both the timed cost and the theoretical cost (calculated using (23)). Any discrepancy is likely due to the fact that on coarser grids the convolution simplifies the bed, meaning the XBeach model runs more quickly than it would if it were using the same coarse grid on the more complex original bed. This effect is a more significant proportion of the total time at larger ε, where there are very few samples at finer levels.
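The convergence rates from the screening phase can be estimated by a least-squares fit in log₂ space. The sketch below assumes the rates in (9) are obtained this way; the function name is ours and the data is synthetic with a known decay of one order per level.

```python
import numpy as np

def convergence_rate(levels, values):
    """Fit |values| ~ 2**(-rate * level) by least squares in log2 space
    and return the estimated rate (alpha, beta or gamma, depending on
    whether `values` holds level means, variances or costs)."""
    slope, _ = np.polyfit(np.asarray(levels, dtype=float),
                          np.log2(np.abs(values)), 1)
    return -slope

levels = np.arange(1, 7)
means = 0.5 ** levels            # |E[P_l - P_{l-1}]| halving each level
alpha = convergence_rate(levels, means)   # close to 1 for this data
```

The same routine applied to the per-level variances and costs yields β and γ (for costs, which grow with level, the returned rate is negative and its magnitude is taken).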
As before, we calculate the RMSE of the MLMC simulation by estimating the true solution E[P] through a Monte Carlo simulation with Δx = 1123.8461/2⁸ = 4.39 m and Δy = 1604/2⁸ = 6.27 m, which is the same as the finest mesh in the MLMC simulation. The expected maximum value of R₂% is found to be 0.06764 m (to four significant figures). Using this value, Figures 19a and 19b confirm that the RMSE decreases as the level number increases and as ε decreases, respectively, meaning the MLMC algorithm is working as expected. Interestingly, Figure 19a shows that ε only has an effect on the final error when the finest level is included. This makes sense because at previous levels the convolved version of the bed is being used rather than the real data.
Finally, Figure 20 compares the cost and accuracy of MLMC versus Monte Carlo for this test case. Unlike with previous test cases, the figure shows that for large values of ε it may be quicker to use Monte Carlo instead of MLMC. This may be because, due to the use of convolutions, a certain number of samples are required on the finest levels to obtain an accurate result. We also note that, as has been the case for all the test cases considered in this work, with Monte Carlo the simulation must be run for longer than strictly necessary to ensure convergence, which is not the case for MLMC. The choice of Monte Carlo versus MLMC for this test case therefore depends on how accurate and how certain the required result needs to be. Furthermore, for the smallest ε and longest simulation considered here, the error using MLMC is much smaller than the error using Monte Carlo for the same computational cost. In order to achieve the same accuracy, the Monte Carlo simulation requires over 100,000 core hours (3.6 × 10⁸ s) compared to 2092 core hours (7.5 × 10⁶ s) for MLMC with ε = 0.002. Thus using MLMC results in a speed-up factor of approximately 50, showing that MLMC can significantly decrease computational cost for a real-world test case with uncertain wave height.

Cumulative distribution functions
In an MLMC framework, the objective is normally to find the expectation of an output variable. However, to analyse risk in coastal problems, the quantity of interest is the probability of an output variable exceeding a certain value. As discussed in Section 1, this is a complicated output to compute because MLMC provides very few values on the same level from which to build that distribution.
To resolve this issue, we follow Gregory and Cotter (2017b) and use the inverse transform sampling method to evaluate the inverse cumulative distribution function (CDF), F⁻¹(U), where U ∼ 𝒰[0, 1]. If F is strictly increasing and absolutely continuous then X ≡ F⁻¹(U) is unique. A simple consistent estimate of F⁻¹(U) is found by sorting the samples (X_i)_{i=1,...,N} ∼ F such that X_(1) < X_(2) < ... < X_(N) and then setting

F̂⁻¹(U) = X_(⌈NU⌉),

which, as argued in Gregory and Cotter (2017b), is consistent because it converges in probability to F⁻¹(U) as N → ∞. For an MLMC approximation, the inverse CDF is then given by the telescoping sum of ordered statistics

F̂_L⁻¹(U) = X⁰_(⌈N₀U⌉) + Σ_{ℓ=1}^{L} ( Xℓ_(⌈N_ℓ U⌉) − X*ℓ⁻¹_(⌈N_{ℓ−1} U⌉) ),

where Xℓ_(i) represents the ith ordered statistic of X on each level ℓ and X*ℓ⁻¹_(i) represents the same but from a different run of the algorithm than Xℓ_(i). Note this means the MLMC algorithm must be run twice, but given that MLMC is significantly faster than Monte Carlo, even running it twice is much faster than running Monte Carlo once. Unlike with (2), there is no exact cancellation here, as the approximations on each level are not unbiased.
Thus an ensemble of values X̃ can be generated which are not samples from F, but consistent approximations to F⁻¹(U). Hence, as N_ℓ → ∞, X̃ → F⁻¹(U), and the CDF of the MLMC approximation is given by the empirical distribution of these pseudo-samples,

F̂(x) = (1/N) Σ_{i=1}^{N} 𝟙{X̃_i ≤ x}.

For more details on this method we refer the reader to Gregory and Cotter (2017b).
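The single-level building blocks of this procedure can be sketched as follows; the multilevel estimator then telescopes these ordered statistics across levels using two independent runs, as described above. Function names are ours.

```python
import numpy as np

def inverse_cdf(samples, u):
    """Consistent estimate of F^{-1}(u): the ceil(N*u)-th ordered
    statistic of the samples, for u in (0, 1] (scalar or array)."""
    x = np.sort(np.asarray(samples, dtype=float))
    idx = np.ceil(len(x) * np.asarray(u, dtype=float)).astype(int) - 1
    return x[np.clip(idx, 0, len(x) - 1)]

def empirical_cdf(pseudo_samples, t):
    """CDF of the approximation: fraction of pseudo-samples <= t, i.e.
    the probability that the output variable does not exceed t."""
    return np.mean(np.asarray(pseudo_samples) <= t)
```

In a risk-assessment setting, `1 - empirical_cdf(pseudo_samples, t)` gives the estimated probability that an output variable such as the run-up height exceeds a threshold t, for example the crest height of a seawall.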

Applying inverse transform sampling to test cases
In this section, we apply the inverse transform sampling method to each of the four test cases discussed in Section 3. This method provides a CDF from which the probability that the output variable exceeds a certain value can be predicted. This is particularly useful for predicting the probability of extreme flooding/erosion events occurring, a useful tool in coastal risk assessment. To apply the method, we select the lowest value of ε considered in each test case and run Algorithm 1 twice. The output is recorded at each level separately, ordered, and then (31) is used to generate pseudo-samples on the finest level.
The results of applying this method to all four test cases are shown in Figure 21, which also compares the MLMC results with those obtained by the Monte Carlo method. In all cases the MLMC results predict a similar CDF to the Monte Carlo method. The good agreement between MLMC and Monte Carlo is further shown by the small error norms between the two methods, given in Table 1. Note that the error for the Boscombe Beach test case, whilst still small, is a little higher than for the other test cases. This is in part because MLMC does not predict any R₂% lower than 0.025 m; if values below this quantity are excluded from the error comparison, the normalised L2 error norm drops by 25% to 0.0334. Whilst in an ideal situation MLMC would predict these lower values too, the lower values of the R₂% distribution are of the least interest because they pose the lowest risk to the beach, and thus the fact that MLMC does not predict any values in this range is arguably not a significant issue. Thus we conclude that MLMC can be used not only to find the expected value of output variables but also to determine the probability of extreme flooding and erosion events occurring, and to assess whether certain areas are at risk.

Conclusion
In this work, we have applied MLMC to a complex hydro-morphodynamic model for the first time to evaluate risk in the coastal zone. We have shown that MLMC can be used successfully to estimate both the expected value of a hydro-morphodynamic output variable and its distribution, for both 1D and 2D test cases and for both experimental and real-world test cases. The key advantage of MLMC over more traditional methods such as the Monte Carlo method is that, as we have shown, it is considerably faster whilst maintaining the same level of accuracy in the results. Another significant advantage of MLMC over Monte Carlo is that the solution is stable: in all test cases considered, decreasing ε (i.e. tightening the accuracy tolerance) results in a more accurate solution. Thus there is no need to run the simulation for longer to ensure the solution has converged, as is the case with Monte Carlo. This again provides a computational cost advantage for MLMC compared to Monte Carlo. Furthermore, throughout this work, MLMC has had no difficulties in dealing with the complex nature of the XBeach model, nor with the fact that some parametrisations in XBeach only appear at finer meshes. This is mainly because MLMC only requires that the result from one mesh be a good approximation to the result from the next finest mesh, which makes it very flexible.
In Section 1, we discussed the case in Callaghan et al. (2013), which would take approximately 4.5 millennia to run with Monte Carlo. This test case is similar to both the morphology and Boscombe Beach test cases considered in this work, for which using MLMC resulted in speed-ups of 40 and 50 times respectively. Therefore it is not unreasonable to suggest that, using MLMC, the test case in Callaghan et al. (2013) would take on the order of only 100 years of core time. Whilst this is still a large cost, because the code is parallelised this is not outside the realm of certain supercomputers. This work has shown the potential for MLMC to be used in the coastal engineering field. In future work we will consider more complex examples and investigate the effect of uncertainty in more than one parameter on more than one output variable.

Computer code availability
The relevant code for the MLMC framework presented in this work can be found at https://github.com/mc4117/MLMC_XBeach.