Suitability Analysis of Remote Sensing Techniques for Shoreline Extraction of Global River Deltas

High-frequency flooding, sea level rise, and changes to riverine sediment fluxes have threatened the habitable land area of river deltas, where close to half a billion people live globally. Understanding shoreline positions is important for the sustainable planning of deltaic communities and for predictive modeling of delta evolution. However, there is a gap in the literature: a) there is no understanding of the most effective shoreline extraction method for a given delta, and b) there are no comparisons across techniques from which to infer their performance across deltas in different climate regions. This makes it difficult to apply existing knowledge to lesser studied, data-sparse deltaic regions worldwide. To address these gaps, we evaluated the performance of 5 remote sensing techniques against hand-digitized shoreline vectors of 44 river deltas globally, representing the 3 morphological types of deltas (river-, tide-, and wave-dominated), across 4 Köppen Climate Classes, using Landsat 8 imagery. We propose a new metric (Robustness: R) to evaluate the performance of a given technique. The results show that 1) the best performing method for the majority of the deltas (35/44) was Unsupervised Classification, 2) there is no geographical significance in the performance of the tested techniques, and 3) wave-dominated deltas showed the highest classification robustness while tide-dominated deltas showed the lowest. Recommendations are made for the application of techniques in different types of deltas and in unknown deltaic territories worldwide.


I. INTRODUCTION
River deltas, home to almost half a billion people around the world, are important coastal depositional systems. They not only act as central locations for agricultural production and hydrocarbon extraction, but are also biodiversity hotspots and carry a vast cultural heritage [1], [2]. Over the past half century, changes to storm frequency and intensity, eustatic sea level rise, and natural and human-driven delta morphology evolution (e.g. changes in sedimentation patterns in deltas over time; [3], [4]) have added growing pressure on the effective deltaic land area available for human habitation and, consequently, have attracted enhanced scientific interest in studying temporal shifts in the land-sea boundary of deltas (i.e. the shoreline). Understanding river delta shoreline positions is important for the sustainable planning of deltaic communities. Shoreline positions are important in the construction of engineering structures (e.g. breakwaters, weirs), for flood mitigation, dam construction, erosion-accretion studies, regional sediment budget calculations, and for predictive modeling of coastal morphodynamics [5], [6].
All three authors are with the Department of Geography, University of Alabama, Tuscaloosa, AL, USA: Dinuke Munasinghe (dsmunainghe@crimson.ua.edu; corresponding author), Sagy Cohen (sagy.cohen@ua.edu), and Benjamin Hand (bghand@crimson.ua.edu).
Remote sensing provides a useful diagnostic technology to monitor large scale changes in river delta shoreline positions over time [7]. Although a number of studies in the literature identify shoreline positions and their temporal evolution, a recent literature review by Munasinghe et al. [8] revealed that there was no consensus as to which remote sensing technique(s) would be the most suitable to extract shorelines with satisfactory accuracy, emulating close-to-real-world shoreline positions. Challenges in shoreline identification were attributed to shoreline dynamics that are driven by many other location- and climate-related factors (e.g. inherent variability in rainfall, soil minerals, growing cycle phases of vegetation). The review also revealed that a) studies in the literature focused mostly on a few major river deltas globally, b) there were not enough studies which compared multiple techniques at a given river delta, and c) there were no comparisons of techniques across multiple deltas in different climatic regions or delta types, making it challenging to apply shoreline extraction methodologies to lesser studied deltas worldwide. A comparison of remote sensing techniques on an array of delta types (river-, tide-, and wave-dominated) across the globe could provide insights into the performance of techniques under varying fluvial and marine conditions. Elucidating which technique(s) would be the most appropriate for a given climatic region and delta type would allow us to infer why particular techniques underperform in different regions of the world. This will highlight some of the inherent problems of particular techniques and will offer a pathway for improving existing algorithms (e.g. to compensate for environmental noise) and developing new ones.

II. METHODOLOGY
In this study, we evaluate five traditional remote sensing techniques on 44 large river deltas worldwide, curated to represent 4 major and 13 sub-Köppen Climate classes and the 3 main delta morphology types: river-, tide-, and wave-dominated deltas [9]. The Köppen Climate Classification is based on air temperature and precipitation and represents biome distributions around the world: different regions in a similar class share common vegetation characteristics [10]. Five remote sensing techniques are compared: 1) Modified Normalized Difference Water Index (MNDWI), 2) Normalized Difference Vegetation Index (NDVI), 3) Principal Component Analysis (PCA), 4) Unsupervised Classification, and 5) Supervised Classification. Shorelines were extracted for the deltas using Landsat imagery from the year 2018. The robustness of each method in shoreline extraction was assessed against a hand-digitized shoreline vector created using high resolution Google Earth imagery of the same year. A performance comparison was made between techniques and different deltaic environmental settings.
Machine learning (ML) techniques were not considered in this study. While Munasinghe et al. [8] found that ML can outperform traditional methods in some river deltas, ML techniques 1) are more challenging to apply as they rely on training data which might not be available in all regions, and 2) cannot be readily transitioned from one case study to another.

Digitization of Reference Shoreline
High resolution Google Earth imagery was used to manually digitize the shorelines of river deltas (termed 'real shoreline' hereafter). Digitization was performed at an altitude of 2000 m from a nadir view, with general spacing of around 2 meters between vertices, on Google Earth Pro, on imagery from 2018. The digitized line files were saved as .kml files and subsequently converted to shapefiles in ArcMap 10.6.

Preparation of Satellite Imagery
Polygon shapefiles were created for each river delta based on the river delta extents provided by Tessler et al. [4]. Image search was carried out on Google Earth Engine (GEE), a cloud-based geospatial analysis platform by Google LLC. The Landsat 8 OLI Surface Reflectance product (cataloged within GEE) for the year 2018 was used for each delta in the study. Search parameters were governed by cloud freeness and by the low discharge season of the feeder river of the delta (high river discharge increases water turbidity, which hinders shoreline identification). Constrained by these two governing factors, creating a composite mosaic to cover an entire delta coastline generally required imagery from within a consecutive 3-month period of the year.

Extraction of shorelines using Remote Sensing Techniques
The following techniques were used in this analysis:
Modified Normalized Difference Water Index (MNDWI; [11]): an enhancement of the Normalized Difference Water Index (NDWI; [12]). It uses the shortwave infrared band (SWIR1; Band 6 on Landsat 8 OLI) [MNDWI = (Green-SWIR)/(Green+SWIR)] to enhance open water features while efficiently eliminating built-up land noise and suppressing vegetation and soil noise.
Supervised Classification: a classification technique based on user-identified sample pixels (training areas) as representatives of a specific spectral signature class (e.g., water). Subsequently, the image processing software classifies the rest of the pixels in the scene based on the maximum likelihood that they are similar to one of the user-defined classes. In this study, training areas were identified based on high resolution Google Earth imagery of 2018.
Unsupervised Classification (K-Means Classification): a classification technique based on an automated differentiation of the pixels' spectral signatures into a user-defined number of groups [13]. The identification of the nature of each group (e.g. water) is made by the user. In this study, a uniform number of land use classes (5) was specified for all deltas.
Normalized Difference Vegetation Index (NDVI; [14]): a technique based on band ratioing [NDVI = (NIR-Red)/(NIR+Red)], usually used to monitor vegetation growth/plant biomass. The strong absorption of near-infrared (NIR) radiation by water and its strong reflection by terrestrial vegetation and dry soil are leveraged in this study to distinguish the land-sea boundary [15].
Principal Component Analysis (PCA): a technique based on transforming the data to a new set of variables (principal components) which are uncorrelated and ordered, so that the first few retain most of the variation present in all the multispectral imagery [16]. The first four principal components were used in this study.
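The two band-ratio indices in the list above (MNDWI and NDVI) share the same normalized-difference form, which can be sketched in a few lines of Python. This is a minimal illustration, not the study's processing script: the reflectance values are hypothetical, and the zero threshold used to separate land from water is an assumption, since the text does not state which threshold was applied.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel; 0.0 where the denominator is zero."""
    out = []
    for a, b in zip(band_a, band_b):
        denom = a + b
        out.append((a - b) / denom if denom != 0 else 0.0)
    return out


# Hypothetical surface-reflectance values for four pixels (not from the study):
# open water, turbid water, bare soil, vegetation.
green = [0.06, 0.10, 0.15, 0.08]
red = [0.04, 0.09, 0.20, 0.05]
nir = [0.02, 0.06, 0.28, 0.45]
swir = [0.01, 0.04, 0.30, 0.20]

mndwi = normalized_difference(green, swir)  # (Green - SWIR) / (Green + SWIR)
ndvi = normalized_difference(nir, red)      # (NIR - Red) / (NIR + Red)

# Assumed zero threshold: water has positive MNDWI and negative NDVI.
water_from_mndwi = [v > 0 for v in mndwi]
water_from_ndvi = [v < 0 for v in ndvi]
```

In practice the threshold would likely need tuning per scene, particularly where sediment plumes blur the land-water contrast.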
Images were processed in batches for each technique (using Python scripts) to generate rasters with a land/water classification. Polygon layers representing land and water were generated from each raster. The polygon layers were converted into polylines in order to extract the water-land boundary (shoreline). A 5-km seaward buffer was created around the manually digitized shoreline. This buffer was used to eliminate polylines which covered the land area of the delta and to clip those which extended from the land towards the sea. Finally, the closest representation of the real (manually digitized) shoreline was extracted from the polylines within the buffer using GIS methods (Figure 1: Inset-1).
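The step from a land/water raster to a land-water boundary can be illustrated at the pixel level. The sketch below is a raster analogue of that step, not the polygon-to-polyline GIS workflow the study used: it marks land cells that touch a water cell in the 4-neighbourhood. The function name and the small grid are hypothetical.

```python
def shoreline_pixels(mask):
    """Return (row, col) of land cells (1) adjacent to a water cell (0).

    Adjacency is 4-connected (up, down, left, right).
    """
    rows, cols = len(mask), len(mask[0])
    boundary = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1:
                continue
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc] == 0:
                    boundary.append((r, c))
                    break
    return boundary


# Hypothetical 4 x 5 land (1) / water (0) classification.
mask = [
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0],
]
coast = shoreline_pixels(mask)
```

A pixel-level boundary like this would still need vectorizing and clipping to the seaward buffer before it could be compared with a digitized shoreline.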

Evaluation of the Remote Sensing Techniques
Two metrics were used to compare the robustness of the extracted shorelines: 1) the percentage length of the shoreline that was extracted in comparison to that of the real shoreline, and 2) the average distance of the extracted shoreline from the real shoreline. A new robustness index (R) was developed which joins both metrics. It is defined in terms of LE, the length of the extracted shoreline; LR, the length of the real shoreline; and DEA, the perpendicular distance between the extracted and real shorelines (Fig. 1: inset 1). The R index value increases as the shoreline extracted by a given method approaches the real shoreline in length, and decreases as the extracted shoreline lies farther from the real shoreline.
Non-parametric ANOVA tests (Kruskal-Wallis one-way ANOVA) with pairwise comparisons of robustness values across techniques were carried out to infer 1) which technique(s) performed significantly better in shoreline delineation across all the deltas, and 2) whether a given technique(s) performed better in certain regions of the world. We also evaluated how the robustness values of the best performing technique clustered based on the type of delta, and attempted to provide guidelines for the usage of these techniques in different deltaic environments.
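The two underlying metrics can be sketched for shorelines stored as lists of projected (x, y) coordinates in metres. This is a minimal illustration under stated assumptions: the coordinates are hypothetical, and the distance is sampled at the vertices of the extracted line, which may differ from how the study sampled it. The combination of the metrics into R is not reproduced here.

```python
import math


def polyline_length(line):
    """Sum of straight-line segment lengths along a polyline."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))


def point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_sq = dx * dx + dy * dy
    if seg_sq == 0:
        return math.dist(p, a)
    # Projection parameter of p onto a-b, clamped to stay on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def mean_distance(extracted, reference):
    """Average distance from each extracted vertex to the reference polyline."""
    dists = [
        min(point_to_segment(p, a, b) for a, b in zip(reference, reference[1:]))
        for p in extracted
    ]
    return sum(dists) / len(dists)


# Hypothetical shorelines in metres (not data from the study).
reference = [(0.0, 0.0), (1000.0, 0.0)]
extracted = [(100.0, 30.0), (500.0, 50.0), (900.0, 40.0)]

pct_length = 100.0 * polyline_length(extracted) / polyline_length(reference)
avg_dist = mean_distance(extracted, reference)
```

Here the extracted line covers roughly 80% of the reference length and lies on average 40 m from it.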
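As an illustration of the test named above, the Kruskal-Wallis H statistic can be computed directly from pooled ranks. The sketch below is a simplified version (ties receive average ranks, but no tie correction is applied), and the robustness values are hypothetical, not results from the study. The pairwise comparisons are not shown.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic for a list of samples (no tie correction)."""
    pooled = sorted((v, gi) for gi, g in enumerate(groups) for v in g)
    n = len(pooled)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    rank_sums = [0.0] * len(groups)
    for (_, gi), r in zip(pooled, ranks):
        rank_sums[gi] += r
    total = sum(rs * rs / len(g) for rs, g in zip(rank_sums, groups))
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical robustness values for three techniques over four deltas.
unsupervised = [2.4, 2.1, 2.6, 1.9]
ndvi = [1.2, 1.5, 1.1, 1.4]
pca = [0.3, 0.5, 0.2, 0.6]

h = kruskal_wallis_h([unsupervised, ndvi, pca])

# Chi-square critical value for 2 degrees of freedom at alpha = 0.05.
significant = h > 5.991
```

With samples this small the chi-square approximation is rough, so the comparison against the critical value is only indicative.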

III. RESULTS AND DISCUSSION
Unsupervised Classification yielded the best performance for the majority of the deltas (35 of 44), whilst Supervised Classification yielded the best performance for the remainder (9 of 44) (Table 1). For these two best performing techniques, the extracted shoreline lengths ranged between 74% and 100% of the real shoreline lengths, while the average distances from the real shoreline ranged between 19 m and 130 m (Table 1). The least successful method in shoreline delineation was PCA: the length extractions were very low (4%-84%, with a median of 26%), and the average distances were very large (as much as 2.6 km; median = 405 m; Table 1). The non-parametric ANOVA showed that, when all river deltas were considered, the R values of Unsupervised and Supervised Classification were significantly higher than those of all the other techniques but were not significantly different from each other (P = 0.087; α = 0.05). The performance of the two ratioing techniques (MNDWI and NDVI) also did not differ significantly (P = 0.49; α = 0.05). All other techniques differed significantly from PCA (Table 1).
However, a comparison of techniques based on climate classes showed that deltas located in tropical and arid steppe climates (Amazon, Fly, Mahakam, Danube, Dnieper, Ebro) did not show significant differences in the performance of the five shoreline extraction techniques. Also, even though all techniques performed significantly better than PCA in general, NDVI performed comparably with PCA in 6 climatic classes (Tropical Rainforests, Tropical Monsoons, Tropical Savannahs, Arid Deserts, Arid Steppes, and Temperate regions with no dry seasons).
The strong performance of the Unsupervised clustering method across a range of river deltas can be attributed to the automatic clustering of image pixels into n spectral classes based on fine differences in spectral reflectance, with minimum user interference. The strength of this technique is that the assignment of pixels to a spectral class is based on sampling the entirety of the image pixels, so the intra-image pixel bias (the ambiguity of allocating a class to a certain pixel that results from sampling only a portion of an image) is at a minimum. The analyst only attempts to assign or transform the spectral classes into thematic information classes of interest (e.g., forest, agriculture) after the spectral classes have been identified.
Unsupervised and Supervised Classification captured not only straight shoreline segments, but also features such as beach spits, tombolos, bay mouth bars, and cuspate forelands, which are parts of shorelines (Fig. 1). In general, four of the five techniques (all except PCA) performed well in capturing straight shoreline segments (Fig. 1). The average performance of the ratioing techniques is attributed to their use of two different bands and the compounding of errors from both. For example, the green/SWIR band ratio (Band 2/Band 5 in Landsat TM/ETM+ numbering; the basis of the MNDWI index) has a value greater than one for water and less than one for land in large areas of the coastal zone. This ratio works well in coastal zones covered by soil, but not on land with vegetative cover [17]. This can lead to other land use types being mistakenly classified as water, especially along the land-sea boundary, which seemed to be happening in most of the deltaic environments studied herein.
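The clustering behaviour described above, in which every pixel in the image is assigned to a spectral class before the analyst labels the classes, can be sketched with a small K-Means routine. This is a minimal illustration, not the software used in the study: the (green, SWIR) pixel values are hypothetical, the deterministic starting centroids are a simplification, and two classes are used for brevity where the study specified five.

```python
import math


def kmeans(pixels, k, max_iter=50):
    """Cluster pixel tuples into k groups; returns (labels, centroids)."""
    # Deterministic start: k pixels spread evenly through the sorted list.
    ordered = sorted(pixels)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(max_iter):
        # Assign every pixel to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in pixels
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == c]
            if members:
                updated.append(tuple(sum(b) / len(members) for b in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


# Hypothetical (green, SWIR) reflectance pairs: three water, three land pixels.
pixels = [
    (0.06, 0.01), (0.07, 0.02), (0.05, 0.01),
    (0.15, 0.30), (0.14, 0.28), (0.16, 0.32),
]
labels, centroids = kmeans(pixels, k=2)

# The analyst's labelling step: here the cluster with the lowest mean SWIR
# reflectance is taken to be water (an assumption for this illustration).
water_cluster = min(range(2), key=lambda c: centroids[c][1])
```

The final labelling step is where the user's judgement enters; the clustering itself uses no training areas.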
The working principle of PCA is that it reduces the dimensionality of a dataset consisting of many interrelated variables, while retaining as much of the variation present in the dataset as possible. This is achieved by transforming the data to a new set of variables (principal components) which are uncorrelated and ordered so that the first few retain most of the variation present in all the original variables [16]. However, a loss of data can also be expected during the reduction in dimensionality. Although the first four principal components usually account for over 95% of the variation in the data, for the deltaic environments in this study they accounted for only 60%-90%, which created land/water rasters with diminished accuracies and consequently yielded low robustness values.
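The share of variation retained by the first four principal components can be checked from the eigenvalues of the band covariance matrix. The sketch below assumes NumPy is available and uses a synthetic six-band image; the data, seed, and function name are hypothetical and serve only to show the calculation.

```python
import numpy as np


def explained_variance_ratio(pixels):
    """Fraction of total variance carried by each principal component.

    pixels: array of shape (n_pixels, n_bands). Returns values in
    descending order that sum to 1.
    """
    centered = pixels - pixels.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigvalsh returns ascending order
    eigvals = np.clip(eigvals, 0.0, None)    # guard against tiny negative values
    return eigvals / eigvals.sum()


# Synthetic image: 1000 pixels, 6 bands driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(1000, 2))
mixing = rng.normal(size=(2, 6))
pixels = latent @ mixing + 0.1 * rng.normal(size=(1000, 6))

ratio = explained_variance_ratio(pixels)
first_four = ratio[:4].sum()  # share retained by the first four components
```

A value of `first_four` well below 0.95 would flag a scene where the leading components discard a substantial part of the spectral information.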
Analysis was also carried out to infer whether a given technique performed significantly better in certain Köppen Climate classes. Hierarchical clustering (Ward's method was used to find links between points and cluster them around centroids; [18]) of the robustness of the best performing technique of each delta produced a dendrogram (Fig. 2b). Reasonable clustering is where the ratio between the largest and smallest cluster sizes is close to 3. Thus, different cluster configurations (3 to 8 clusters) were tested, for which a best ratio of 3.40 was obtained for a forcing of 4 clusters (see the horizontal axis of the dendrogram (Fig. 2b), representing the dissimilarity at 2.5% scaled distance). By correlating the clusters with robustness values, we identified that river deltas with high robustness values (above 2.17, clusters 3 and 4) are mostly wave-dominated.
A delta is considered wave-dominated when the maximum amount of sediment that the waves can transport along both flanks of the delta is greater than the coarse-grained fluvial sediment supplied to the river mouth [19]. River deltas with robustness values at the lower end of the spectrum (below 1.42) are mostly tide-dominated. These deltas occur in locations with large tidal ranges or high tidal current speeds, in which sediment is carried seaward during low tide and brought ashore during high tide. Cluster 2 had an equal mix of river-, wave-, and tide-dominated deltas. Wave domination limits the accumulation of fine-grained sediment at the delta mouth by transporting river-borne sediments offshore and away from the littoral zone (the area between high and low tide), and muddy sediments are generally below the shoreface toe (where the slope of the delta ends and smoothens out with the sea floor). This in turn sculpts delta shorelines into a cuspate shape consisting of sandy shorelines (Fig. 3a). Sandy shorelines, which are typical of wave-dominated deltas, provide great contrast in pixel values with their neighboring water pixels, giving clear land-water boundaries and successful shoreline extractions.
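The cluster-size criterion described above can be sketched for one-dimensional robustness values. The routine below is a simplified Ward-style agglomeration (it merges the pair of clusters whose union adds the least within-cluster sum of squares), followed by the largest-to-smallest size ratio for each candidate cluster count. The robustness values are hypothetical, and the statistical package the study used may differ in detail.

```python
def ward_clusters(values, k):
    """Agglomerate 1-D values into k clusters using Ward's merge cost."""
    clusters = [[v] for v in values]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
                # Increase in within-cluster sum of squares if a and b merge.
                cost = len(a) * len(b) / (len(a) + len(b)) * (mean_a - mean_b) ** 2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def size_ratio(clusters):
    """Ratio of the largest to the smallest cluster size."""
    sizes = [len(c) for c in clusters]
    return max(sizes) / min(sizes)


# Hypothetical robustness values for 11 deltas (not data from the study).
robustness = [1.00, 1.05, 1.10, 1.15, 1.20, 1.25, 2.20, 2.30, 3.50, 3.55, 3.60]

# Test 3 to 8 clusters and keep the count whose size ratio is closest to 3.
ratios = {k: size_ratio(ward_clusters(robustness, k)) for k in range(3, 9)}
best_k = min(ratios, key=lambda k: abs(ratios[k] - 3.0))
```

For these values, three clusters of sizes 6, 2, and 3 give a ratio of exactly 3, so `best_k` is 3.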

Tide-dominated/tidally influenced deltas, on the other hand, accumulate sediment at the shoreface through the continuous oscillatory reworking and resuspension of sediment by ocean waves and fluvial energies. As a consequence of these varying transport energies, the sedimentary facies formed in tide-dominated deltaic settings tend to be heterolithic, with interbedded sands, silts, and clays giving them a muddy texture. This muddiness extends for many kilometers over land in large deltas and is also visible as plumes in the water, making it challenging to distinguish between land and water pixels (Fig. 3b). In addition to the inferences on the robustness of techniques on different types of deltas, as a general guideline, we advocate assigning the sediment plume in the delta nearshore environment to a separate class when conducting Supervised or Unsupervised Classifications. This is especially important in the low-robustness, tide-dominated deltas and in low/mid-robustness river-dominated deltas with high sediment concentrations. In most cases, when the plume was not actively assigned to its own class, deltaic land and sediment plume features clustered together, heavily affecting DEA values and causing erroneous extractions.

IV. CONCLUSION
This global analysis, conducted to infer the suitability of remote sensing techniques for delta shoreline extraction, shows Unsupervised Classification to be generally the best among the five techniques, whilst PCA yielded the poorest results. No significant differences in the performance of a given technique were found across different climate classes. Based on the results, we recommend the use of Unsupervised Classification as a first order extraction technique for previously unstudied deltaic regions. Special attention is drawn to deltaic environments with high sediment-laden intertidal conditions. We also show that wave-dominated deltas yield the best performance in shoreline extraction, while tide-dominated deltas were the most challenging for the techniques employed. These findings are envisioned to provide prior understanding of the range of robustness values that one could expect for an unknown deltaic region, given the type of delta, and to support early decisions on the necessity of advanced algorithms and/or high resolution data for better shoreline extractions at these locations.