A web application for hydrogeomorphic flood hazard mapping

Abstract
A detailed delineation of flood-prone areas over large regions represents a challenge that cannot be easily solved with today's resources. The main limitations lie in algorithms and hardware, but also costs, scarcity and sparsity of data and our incomplete knowledge of how inundation events occur in different river floodplains. We showcase the implementation of a data-driven web application for regional analyses and detailed (i.e., tens of meters) mapping of floodplains, based on (a) the synthesis of hydrogeomorphic features into a morphological descriptor and (b) its classification to delineate flood-prone areas. We analysed the skill of the descriptor and the performance of the mapping method for European rivers. The web application can be effectively used for delineating flood-prone areas, reproducing the reference flood maps with a classification skill of 88.59% for the 270 major river basins analysed across Europe and 84.23% for the 64 sub-catchments of the Po River.


Introduction
According to the European Environment Agency (EEA, 2016), a significant part of the European population is estimated to be living in, or near to, a flood-prone area. This tendency, with historical (e.g., Barredo, 2007) and projected (e.g., Alfieri et al., 2015) consequences of flood disasters tied to it, represents a serious challenge to flood risk management that is also emphasized by the Flood Directive (EU, 2007).
In particu users to: ( ) p extrapolate flood studies across to user-selected reference maps delineation at 25 m spatial resol the-fly via a simple slider. The w attends to users' needs and exp Examples of similar tools are the 2013) that provides access to flo interactive interface or the FLIR with flood warnings, observations and remote sensing data to inform decision makers through a web-based system.
The web application we present here represents a step forward in flood hazard assessment and management, and its scientific worth is manifold:
• It provides fast and inexpensive estimates of flood-prone areas for specific return periods that complement, but do not replace, results from hydrodynamic simulations.
•  (GFI, Samela et al., 2017), was found to be the best performing and the most consistent index (Manfreda et al., 2015; Samela et al., 2016, 2017). As alternatives to the classification of flood-prone areas using the GFI, other data mining techniques somewhat related to this work have been reported in the literature, with good results (Tehrany et al., 2013, 2014; Rahmati and Pourghasem, 2017).
The GFI corresponds to the morphological descriptor adopted and used as classifier in the web application; it is a composite index derived directly from terrain analysis and is defined as:
where tpr is the true positive rat of a correct hit; Fawcett, 2006),
\[ \mathrm{tpr} = 1 - \mathrm{fnr} = \frac{tp}{p} = \frac{tp}{tp + fn} \]
with fnr the false negative rate ( flood-prone pixels in both the te and fn the number of false negatives. The true negative rate (or specificity: the probability of a correct miss), tnr, is given by:
\[ \mathrm{tnr} = 1 - \mathrm{fpr} = \frac{tn}{n} = \frac{tn}{tn + fp} \qquad (8) \]
with fpr being the false positive rate (or fall-out: the probability of an incorrect hit; Fawcett, 2006), tn the number of true negatives (i.e., number of flood-free pixels in both the test and the reference), n the total number of negative samples and fp the number of false positives.
The You during cl t. We note ts and cost purposes (Bradley, 1997; Fawcett, 2006). The AUC is invariant to selected thresholds and prior probabilities (Bradley, 1997; Schumann, Vernieuwe et al., 2014). Provided that a reasonable number of thresholds are considered, the AUC can be estimated by a trapezoidal rule approximation of the definite integral. In our case, the number of thresholds considered can surpass the one million mark, depending on the river basin.
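As an illustration of the performance measures above, the following minimal Python sketch sweeps a set of thresholds over a descriptor raster, computes tpr and fpr against a binary reference map, and estimates the AUC with the trapezoidal rule. It is not the web application's code: the function names, the toy data and the convention that pixels above the threshold are classified as flood-prone are assumptions made for the example. Youden's statistic is taken here in its standard form, J = tpr + tnr - 1, which is also an assumption about the exact definition used.

```python
import numpy as np


def roc_points(descriptor, reference, thresholds):
    """Return (fpr, tpr) arrays with one value per threshold."""
    descriptor = np.asarray(descriptor, dtype=float).ravel()
    reference = np.asarray(reference, dtype=bool).ravel()
    p = reference.sum()            # positive (flood-prone) reference pixels
    n = reference.size - p         # negative (flood-free) reference pixels
    tpr = np.empty(len(thresholds))
    fpr = np.empty(len(thresholds))
    for i, tau in enumerate(thresholds):
        # Assumption: values above the threshold are classified flood-prone.
        predicted = descriptor > tau
        tp = np.count_nonzero(predicted & reference)
        fp = np.count_nonzero(predicted & ~reference)
        tpr[i] = tp / p            # tpr = tp / (tp + fn)
        fpr[i] = fp / n            # fpr = 1 - tnr = fp / (tn + fp)
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Trapezoidal estimate of the area under the ROC curve."""
    order = np.lexsort((tpr, fpr))     # sort by fpr, ties broken by tpr
    x = np.concatenate(([0.0], fpr[order], [1.0]))
    y = np.concatenate(([0.0], tpr[order], [1.0]))
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))


def youden_optimum(fpr, tpr, thresholds):
    """Threshold maximising J = tpr + tnr - 1 = tpr - fpr (standard form)."""
    j = tpr - fpr
    best = int(np.argmax(j))
    return thresholds[best], float(j[best])


# Toy data (hypothetical): four pixels, perfectly separable.
descriptor = np.array([0.1, 0.2, 0.8, 0.9])
reference = np.array([0, 0, 1, 1])
thresholds = np.array([0.0, 0.5, 1.0])

fpr, tpr = roc_points(descriptor, reference, thresholds)
auc = auc_trapezoid(fpr, tpr)
best_tau, best_j = youden_optimum(fpr, tpr, thresholds)
print(auc, best_tau, best_j)
```

The explicit loop keeps the link to the definitions visible. With threshold counts in the millions, a cumulative count over the sorted descriptor values would likely be preferable to re-classifying the raster at every threshold.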

We
We The web pp , p y following section and employs a (GIS) and development software Upload of datasets and downloa supported by the Geospatial Da HTML5 and JavaScript with the sharing is done via a Git-type on
4.1. Architecture
A cloud-based client-server model is adopted as the web application architecture. The implementation of the web application is illustrated as a network diagram in Fig. 3. The server host functions as a web and file server for restricted uploading and storing of static layers and results, as well as for their retrieval. Clients and server communicate over the internet via any modern web browser. The web application framework incorporates a Web-GIS front-end made of a combination of HTML5 (a markup language) and OpenLayers (an open sou source g core mod Web application I/O is done using GDAL and is available in any raster file format supported by this library. The core model functions accessible to users consist of:

Parallelization strategy
The same concurrent programming model is used in different phases of the methodological workflow, from pre-processing and morphological characterization to classification and validation. An exception was made for terrain analysis, which followed a different parallelization strategy, used as an out-of-the-box feature of TauDEM utilities (Tarboton, 2015). The domain is decomposed into logical units (i.e., hydrologically unique river basins and sub-catchments) used as natural geometric domain partitions for parallel computation
( Knijff et al., 2010), by estimating and by simulating floodplain hyd hydrographs as boundary condi simulations were performed by A (unspecified delineation method than 500 km² . As highlighted by affected by a number of uncerta issues associated with the input flood defences, coarse resolutio overestimate runoff). In fact, the flood hazard map for Europe for the 100-year return period event presents hit rates between 59% and 78% and critical success index values between 43% and 65%, evaluated against specific national/regional hazard maps. We are therefore aware of the limitations of such datasets, which offer a preliminary description of flood hazard at the European scale. If new maps become available, they can easily be incorporated in the proposed web application.
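Returning to the basin-wise decomposition described at the start of this subsection, the idea can be sketched in Python as a map of one independent task over the river basins. This is only an illustration under stated assumptions: the text does not specify the concurrency library, so `concurrent.futures` is a stand-in, and `process_basin`, `run_all` and the toy rasters are hypothetical names and data introduced for the example.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def process_basin(item):
    """Hypothetical per-basin task: summarise one basin's descriptor raster."""
    basin_id, descriptor = item
    return basin_id, float(np.mean(descriptor))


def run_all(basins, max_workers=4):
    """Map the per-basin task over independent basins and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process_basin, basins.items()))


# Toy rasters (hypothetical): each basin is an independent unit of work.
basins = {
    "basin_a": np.array([[1.0, 2.0], [3.0, 4.0]]),
    "basin_b": np.array([[0.0, 10.0]]),
}
summary = run_all(basins)
print(summary)
```

Because the basins do not share state, the same `map` call works with `ProcessPoolExecutor`, which would usually be the better fit for CPU-bound raster processing; a thread pool is used here only to keep the sketch easy to run.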
In this paper we refer to a study area composed of a selection of 270 river basins (Fig. 7).

Web application generated flood hazard maps
The delineation, downscaling and extrapolation of hydrogeomorphic flood-prone areas can be performed online using the web application. In Fig. 9, we show an output of the web application relative to a specific classification The figure provides a comparison between the hydrogeo and dark rd map for example control th ets an imme pixels) fl

Optimal classifier thresholds and Youden's statistic
The classification of the morphological descriptor GFI within the pre-defined classification area resulted in an optimal GFI threshold for each European river basin, each Po River sub-catchment, and each return period considered. In this section, we compare the delineated and downscaled hydrogeomorphic flood-prone areas with the reference flood hazard maps for Europe, to understand how well the former replicate the latter. We must acknowledge that the aim of such comparison is more to test the potential of the geomorphic procedures rather th approxim The resu step (see ig. 21 in sup optimal t n supplem threshold n be seen The spat f the two c success descripto values w return pe sub-catc material) catchme
The skill of the GFI-based method is assessed by means of the ROC and AUC obtained within the pre-defined classification areas and is examined for each major river basin in Europe and return period considered (10, 20, 50, 100, 200 and 500-year), as well as for the sub-catchments of the river Po in Italy. In Fig. 14
To complement the spatial distribution of optimal GFI thresholds and performance measures presented in the previous subsections, and to have a more complete overview of the results, we summarize the data distribution in a set of box plots (Figs. 19 and 20). Additional information about the data distribution can be found in the supplementary material (Tables 2 and 3). Fig. 19 refers to the optimal GFI thresholds obtained by performing the linear binary classification of flood-prone areas within the 270 major river basins of Europe and the 64 sub-catc The optim at the sub-c maxima. slightly w expected er the disce outside t The You scale its lower at consider scale.
Lower va observed scale (pa minima t an increa median t always a The MCC mean ab
In this work, we have developed a flood hazard mapping web application based on a data-mining method where the morphological descriptor GFI is used as a classifier of flood-prone areas. Our web application enables users to delineate, downscale and extrapolate flood-prone areas over large scales, and we have showcased here its wide range of possible uses through an extended analysis at both the European and sub-catchment (Po River) scales. The study has been extremely instructive in quantifying the potential of the methodo scales.
The geo Youden's ith the refer e optimal G little inte periods, te flood-pro observed correlatio additiona neverthe Analyses Europe a small ch on to the the fact t classifica differenc limitation of return method s sub-catc The pres an impac Other fac flood haz using the possibility that the size of flat are hydrogeomorphic flood-prone ar and vertical accuracy will likely u on the elevation difference, but problematic in these areas, e.g. features, for example embankm river basins (e.g., river Po, Italy) consideration neither in the reference flood hazard maps for Europe (see subsection 5.1) nor in the GFI computation.
In spite of the limitations, to be seen as opportunities for further improving the web application, results are very encouraging. We show that European flooded areas for any of the return periods are well approximated by the outputs of the web application. On the one hand, factors affecting performances may be hard to pinpoint and confirm over such large domains with some issues requiring localized hydraulic studies However the clear need for such we driven m ch efforts ca ar, the deve e the curre studies c of higher de

Conclusions
Integratin commun supporting robust, evidence bas the progress in simplified hydrog and have adopted and testedclassification of flood-prone area Additionally, we have implemen well suited for regional hydrogeo it to classify and delineate six pa pixel resolution (one for each re evaluate their quality within major European basins, relative to the flood hazard maps for Europe (Alfieri et al., 2014;Dottori et al., 2016a-f).
We obtained an average efficiency, measured in terms of ROC analysis and AUC, of 88.59%. In the analysis focusing on the sub-catchments of the river Po in Italy, we obtained an average AUC of 84.23%, in line with results presented in other studies (e.g., Manfreda et al., 2015; Samela et al., 2017).