Utilizing Random Forest Machine Learning Models to Determine Water Table Flood Levels through Volunteered Geospatial Information

Many people use smartphone cameras to record their living environments through captured images and share aspects of their daily lives on social networks such as Facebook, Instagram, and Twitter. These platforms provide volunteered geographic information (VGI), which enables the public to know where and when events occur. At the same time, image-based VGI can also indicate environmental changes and disaster conditions, such as flooding ranges and relative water levels. However, little image-based VGI has been applied to the quantification of flooding water levels because of the difficulty of identifying water lines in image-based VGI and linking them to detailed terrain models. In this study, flood detection was achieved through image-based VGI obtained by smartphone cameras. Digital image processing and a photogrammetric method are presented to determine the water levels. In the digital image processing, random forest classification was applied to simplify ambient complexity and highlight flooding regions, and the HT-Canny method was used to detect the flooding line in the classified image-based VGI. Through the photogrammetric method and a fine-resolution digital elevation model derived from unmanned aerial vehicle mapping, the detected flooding lines were employed to determine water levels. Based on the results of the image-based VGI experiments, the proposed approach identified water levels during an urban flood event in Taipei City for demonstration. Notably, classified images were produced using random forest supervised classification for a total of three classes, with an average overall accuracy of 88.05%. Thus, the proposed approach using VGI images provides a reliable and effective flood-monitoring technique for disaster management authorities.


Introduction
Much evidence shows that rainfall has intensified globally in recent years. Within only a few hours, considerable amounts of rainfall can occur in an urban area, leading to large amounts of water in the drainage system. When the amount of accumulated water exceeds the design capacity of a drainage system, flooding occurs on roads. The characteristics of a flash flood can be estimated using simulation models for urban areas, such as SOBEK, SWMM, and Flash Flood Guidance. These models assume a small area, uniform rainfall, and an operating drainage system. Moreover, these simple simulations rely on rainfall, the design capacity of the drainage system, and numerical model data regarding elevation. Thus, flood simulations are limited by data indeterminacy and physical complexity and are subject to the appropriateness of the model and its computational efficiency. Several studies have attempted to use open-source and multiperiod data, such as optical and synthetic aperture radar (SAR) remotely sensed images, to overcome model limitations and assess the extent of flooding. In large river basins, remote sensing data provide geographical identification of flooding areas and can be combined with local hydrological monitoring data to effectively predict or reconstruct flooding impacts. However, most such satellite-derived spatial data have ground resolutions on the order of meters. When in situ water level monitoring data are lacking, the accuracy of flooding range assessments is limited by the spatial data resolution. Moreover, satellite and airborne optical and radar images are not suitable for detecting the water level on a city road, especially under severe weather.
A condensed urban perspective of critical geospatial technologies and techniques includes four components: (i) remote sensing; (ii) geographic information systems; (iii) object-based image analysis; and (iv) sensor webs, all of which have been recommended for integration within the language of Open Geospatial Consortium (OGC) standards. In addition to employing smartphone sensors in professional fields such as construction inspection, many people use smartphone cameras to record their living environments through captured images and share aspects of their daily lives on social networks such as Facebook, Instagram, and Twitter, so that crowdsourcing data regarding various phenomena can be inexpensively acquired. Social media applications can be subsumed into specific categories by characteristic, including collaborative projects, blogs, content communities, social networking sites, virtual game worlds, and virtual social worlds. Social networks, which provide platforms for sharing VGI, enable citizens to easily obtain information such as texts, images, times, and locations. Recently, the framework and applications of a civic social network, FirstLife, were developed following a participatory design approach and an agile methodology, combining VGI with social networking functionalities. These platforms allow users to create a crowd-based entity description and offer an opportunity to disseminate information and engage people at an affordable cost. Therefore, VGI has grown in popularity with the development of citizen sensors for natural hazards. A 2016 study employed a systematic mapping method to investigate research using VGI and geo-social media in the disaster management context, and found that the majority of the studies focused on potential solutions for data handling.
Many applications for disaster detection and flood positioning using crowdsourcing have been built to identify disaster-relevant documents based merely on keyword filtering or classical language processing of user-generated texts. Currently, most VGI applications handle text information with positioning and time, so-called text-based VGI. With a dramatic increase in multi-source images, image-based VGI provides great opportunities for relatively low-cost, fine-scale, and quantitative complementary data. A collective sensing approach was proposed to incorporate imperfect VGI and very-high-resolution optical remotely sensed data for the mapping of individual trees by using an individual tree crown detection technique. In addition to accurate image localization, massive amounts of street view photographs from Google Street View have been used for estimating the sky view factor, which has been proven to assist urban climate and urban planning research.
In one study, user-generated text and photographs concerning rainfall and flooding were retrieved from social media by using feature matching, and deep learning was used to detect flooding events through spatiotemporal clustering. In addition, scales of detected objects such as flooding areas and heights are expected to be quantified through spatial computation of image-based VGI and three-dimensional spatial information; however, VGI studies still face challenges, which include exploring the use of image-based VGI as data interpreters; improving methods to estimate water levels from images; and harmonizing the time frequency and spatial distribution of models with those of crowdsourced data. When image-based VGI, hydrological monitoring data, and flooding simulations are integrated to estimate flooding extents, the accuracy of the calculation is affected by the resolution of the terrain model. High-resolution urban models from LiDAR and airborne photogrammetry can provide 3D information over a large urban area, but such data are not always available due to budget and time limitations. As a novel remote sensing technology, unmanned aerial vehicles (UAVs) equipped with cameras can help build high-resolution spatial information and monitor disasters. UAV-based spatial data provide a high ground resolution that lets VGI be matched to ground information, and VGI supplies a reference for flooding simulation to confirm the timing and verify the assumptions of flood modeling. Quantitative observations of rainfall and flooding events extracted from social media by applying machine learning approaches to user-generated photos can play a significant role in further analyses. Thus, the aim of the present study is to develop an image-based VGI water level detection approach by considering unknown smartphone camera parameters and imaging positions.
Image-based VGI records flooding scenes but also contains much flooding-irrelevant information, such as trees, buildings, cars, and pedestrians. Image classification was initially employed to reduce scene complexity and identify flooding regions, in which water lines were extracted by edge detection and straight-line detection. Then, in order to reveal the imaging positions of the image-based VGI and the heights of the detected water lines, a UAV-based orthophoto with centimeter ground resolution was provided to identify the same scene features as in the image-based VGI. A UAV-derived digital elevation model was applied to obtain camera parameters and determine water levels based on the photogrammetric principle. Finally, the proposed method of water level calculation was performed in an urban flood case study and was compared with the water levels simulated through flood modeling. Through the difference between the VGI-derived and simulated water level values, the assumptions of the flood modeling were evaluated. The quantified water levels, with centimeter resolution, can validate flood modeling so as to extend point-basis observation to area-basis estimation, thereby verifying the applicability and reliability of the image-based VGI method.

Methods
The proposed water level detection using VGI involves two processes: identifying water lines in an image-based VGI and measuring the water level based on photogrammetric principles (i.e., collinearity equations). A schematic is presented in Figure 1. The proposed method aims to cope with three problems in VGI water level detection: (1) unknown smartphone camera orientation parameters and VGI shooting positions, (2) VGI water line detection, and (3) VGI water level measurement. Figure 2 presents the analysis flowchart for the proposed method. To solve the collinearity equations, the coordinate system of the object (world) must be defined according to the description in Section 2.1, which also introduces the parameters establishing the relationship between the object space and image space. Subsequently, Section 2.2 introduces a compound method for detecting water lines and measuring water levels. Section 2.3 describes a rainfall runoff simulation to estimate flooding water levels. Finally, the simulated water levels were compared with the corresponding VGI-derived water levels at a designated time.

Establishing Coordinate Systems Using Image-Based VGI
Several pretreatment processes need to be conducted to establish the relationship between the object coordinate system and the image coordinate system. These processes are described as follows: (1) identifying feature points in the image-based VGI, such as road markings, zebra crossings, street lights, traffic signals, and buildings; (2) measuring the coordinates of calibration points by using a digital surface model (DSM); and (3) determining the interior and exterior orientation of the camera. Based on image-based VGI, categories such as flooding, vegetation, and buildings are classified by RF classifiers to generate a classified image. Using HT-Canny, the classified image is then transformed into an edge image to detect water line positions in the image system. Object-scale computing using the photogrammetric method can be facilitated by introducing control points to link an image to world coordinates. Previous studies employed image information, including known camera parameters and water gauges, to develop water level monitoring systems. The interior orientation includes the focal length, the location of the principal point, and the description of lens distortion. These parameters are determined based on camera calibration or recommended reference values. The exterior orientation describes the position and orientation of the camera in the object space, which contains six independent parameters: (XL, YL, ZL) for position and θx, θy, θz for orientation. These parameters can be obtained by solving the following collinearity equations:

x = x0 − f [m11(X − XL) + m12(Y − YL) + m13(Z − ZL)] / [m31(X − XL) + m32(Y − YL) + m33(Z − ZL)]
y = y0 − f [m21(X − XL) + m22(Y − YL) + m23(Z − ZL)] / [m31(X − XL) + m32(Y − YL) + m33(Z − ZL)]   (1)

where (x, y) represent the image coordinates of the calibration point, (X, Y, Z) represent the object coordinates of the calibration point, m11-m33 are the elements in a 3 × 3 rotation matrix M, (x0, y0) are the offsets from the fiducial-based origin to the perspective center origin, and f is the focal length.
In Equation (1), (x, y) represent the calibration point in the photo coordinates transformed from the digital image coordinates (c, r), and can be expressed as follows:

x = (c − W/2) dW
y = (D/2 − r) dD

where W and D are the pixel dimensions of the image, and dW and dD are the nominal sizes of the pixel.
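The pixel-to-photo transform above can be sketched as a small helper. This is a minimal illustration assuming the photo origin sits at the image center and that rows increase downward; the function name and argument order are hypothetical, not from the paper.

```python
def pixel_to_photo(c, r, width, height, dW, dD):
    """Convert digital image coordinates (column c, row r) to photo coordinates
    centered at the image midpoint. The y-axis is flipped because rows increase
    downward; units follow dW/dD (e.g., mm per pixel)."""
    x = (c - width / 2.0) * dW
    y = (height / 2.0 - r) * dD
    return x, y
```

For a 4000 × 3000 image with 1.5-micron pixels, the center pixel maps to the photo origin (0, 0), as expected.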
The rotation matrix M, composed of sequential rotations about the three axes by the angles θx, θy, and θz, represents the camera orientation in the object space. In Equation (1), x0, y0, and f are interior parameters and should be determined in advance from a smartphone camera reference. Once seven or more calibration points have been identified and measured, at least 14 equations can be written based on Equation (1). The camera's position (XL, YL, ZL), orientation θx, θy, θz, and focal length f can then be uniquely determined through a least squares (LSQ) technique.
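The forward projection defined by the collinearity equations can be sketched as follows. This is a minimal illustration, not the paper's implementation: the rotation order (z·y·x) is an assumption, and lens distortion is ignored.

```python
import numpy as np

def rotation_matrix(tx, ty, tz):
    """Rotation matrix M from sequential rotations about the x-, y-, and z-axes.
    The exact rotation order varies by convention; z @ y @ x is assumed here."""
    cx, sx = np.cos(tx), np.sin(tx)
    cy, sy = np.cos(ty), np.sin(ty)
    cz, sz = np.cos(tz), np.sin(tz)
    Mx = np.array([[1, 0, 0], [0, cx, sx], [0, -sx, cx]])
    My = np.array([[cy, 0, -sy], [0, 1, 0], [sy, 0, cy]])
    Mz = np.array([[cz, sz, 0], [-sz, cz, 0], [0, 0, 1]])
    return Mz @ My @ Mx

def project(point, cam_pos, M, f, x0=0.0, y0=0.0):
    """Collinearity equations: project an object point (X, Y, Z) onto the
    photo plane given the camera position, rotation M, and focal length f."""
    dX = np.asarray(point, dtype=float) - np.asarray(cam_pos, dtype=float)
    v = M @ dX                     # object-space offset rotated into camera axes
    x = x0 - f * v[0] / v[2]
    y = y0 - f * v[1] / v[2]
    return x, y
```

With seven or more calibration points, pairs of these equations can be stacked and linearized to solve for the seven unknowns by least squares, as the text describes.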

Water Line Detection and Water Level Calculation
To highlight notable objects in images, image blur processing and deep learning techniques have been applied to reduce noise interference and remove background objects. Other studies used supervised classification methods, such as random forest (RF), nearest neighbor, support vector machines, genetic algorithms, wavelet transforms, and maximum likelihood classification, to identify land coverage. Of these methods, RF outperforms most others because its few tuning parameters help prevent overfitting while retaining key variables. According to relevant research on classification algorithms, an RF classifier effectively improves classification accuracy and provides the best classification results, even when used to classify remotely sensed data with strong noise. In this study, RF classification is used to simplify complex scenes and highlight water lines in collected image-based VGI. The RF algorithm is built on bootstrap aggregating (bagging) and the random selection of features for classification and regression. In bagging, a training dataset containing K examples (pixels) sampled with replacement is selected for each feature, and the pixels are evaluated by all decision trees. The decision trees rely on an attribute selection measure; a commonly used attribute selection measure is the Gini impurity, which is based on the impurity of an attribute with respect to the classes involved. When randomly selected pixels belong to classes Ci, the Gini impurity of a given training set T can be expressed as follows:

Gini(T) = 1 − Σi p(i|T)²

where p(i|T) is the probability of the selected case belonging to class Ci.
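The Gini impurity measure above can be computed directly from a vector of class labels. A minimal sketch (the function name is ours, not from the paper):

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity of a training set T: Gini(T) = 1 - sum_i p(i|T)^2,
    where p(i|T) is the fraction of samples in T belonging to class i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)
```

A perfectly pure node yields 0, while an even two-class split yields 0.5; decision trees in an RF prefer splits that reduce this value.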
In the random selection of features, the numbers of features and trees are the two tunable parameters of the RF classifier. After the bagging process and parameter setting are completed, the classification analysis generates classified images and the accuracy of the classified pixels. The accuracy of the classification results can be evaluated through commonly used indicators, including producer's accuracy, user's accuracy, F1 score, and the Kappa statistic. These indexes were calculated using a validation dataset comprising more than one-third of the reference locations. After the category classifications are obtained from the RF classifier, the boundaries between categories are identified through edge detection. Popular edge detection methods include the Canny, Sobel, Robert, Prewitt, and Laplacian operators, among which the performance of the Canny edge detector is considered superior to that of all the others. In Canny edge detection, an image smoothed by a Gaussian filter is used to calculate edge gradients by moving a mask over it, and the resulting gradient image serves as a reference for identifying edge lines. The Hough transform applied on top of the Canny edge detector (hereafter, "HT-Canny") is useful for detecting straight-line features. In the Hough transform, the detected edge lines are represented by a standard line model constituted by the detected line's slope and length. By properly setting the thresholds of slope and length, straight water lines can be extracted from the image-based VGI. The water line, generally a straight line representing the intersection between an object and the water surface, is a recognizable feature in a classified image and thus can be identified using a line detection method.
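The voting scheme behind the Hough transform can be sketched with a minimal accumulator over (d, θ) pairs, assuming a binary edge image has already been produced (e.g., by a Canny detector). This is an illustrative toy, not the paper's implementation, and it returns only the single strongest line.

```python
import numpy as np

def hough_lines(edges, n_theta=180):
    """Minimal Hough transform: each edge pixel (x, y) votes for all (d, theta)
    pairs satisfying d = x*cos(theta) + y*sin(theta); the accumulator peak
    identifies the dominant straight line."""
    height, width = edges.shape
    diag = int(np.ceil(np.hypot(height, width)))
    thetas = np.deg2rad(np.arange(n_theta))
    accumulator = np.zeros((2 * diag, n_theta), dtype=int)
    ys, xs = np.nonzero(edges)            # row index is y, column index is x
    for x, y in zip(xs, ys):
        ds = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int) + diag
        accumulator[ds, np.arange(n_theta)] += 1
    d_idx, t_idx = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return d_idx - diag, float(np.rad2deg(thetas[t_idx]))
```

For a horizontal water line at row y = 5, the peak lands at d = 5 with θ near 90°; thresholds on slope (θ) and vote count (line length) then filter out non-water-line candidates, as the text describes.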
Subsequently, the Canny edge detector extracts structural information from the image. The Hough transform is then applied to identify the position (xw, yw) of the water line in the image. The equation of the water line is expressed in the Hesse normal form as follows:

d = xw cos θ + yw sin θ

where d is the distance from the origin to the closest point on the straight line, and θ is the angle between the x-axis and the line connecting the origin to that closest point. After the water line (xw, yw), the camera's shooting position (XL, YL, ZL), orientation θx, θy, θz, and focal length f have been determined, the collinearity equations define the relationship between the photo coordinate system and the object coordinate system. The collinearity equations can be rewritten to infer the water level in the following basic form:

X = XL + (Z − ZL) [m11(x − x0) + m21(y − y0) − m31 f] / [m13(x − x0) + m23(y − y0) − m33 f]
Y = YL + (Z − ZL) [m12(x − x0) + m22(y − y0) − m32 f] / [m13(x − x0) + m23(y − y0) − m33 f]

The corresponding position of the detected water line can then be searched for around the approximate location (X, Y) in the UAV orthophoto and DSM data. UAV imagery processed through image-based modeling has been proven efficient for 3D scene reconstruction and damage estimation. At the beginning of urban flooding, the flooding depth starts from the road surface, so the initial water level h is taken to be the road elevation. With the water line (xw, yw), the camera's shooting position (XL, YL, ZL), orientation θx, θy, θz, and focal length f held fixed, (X, Y) and h are iteratively refined by the least squares (LSQ) technique. After i iterations, the elevation is treated as a new candidate water level hi+1 and compared with the previous water level hi. When the elevation difference |hi+1 − hi| is less than 0.1 m, the water level hi+1 is adopted at the location (Xi+1, Yi+1). The rainfall runoff model used in this study is a simple conceptual model, also known as a lumped model. A lumped model is the most widely used tool for operational applications, since it is easily implemented with limited climate inputs and streamflow data.
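The iterative intersection of the image ray with the terrain (often called monoplotting) can be sketched as follows. This is a simplified illustration under stated assumptions: the photo coordinates are already reduced to the principal point, `dsm_height` is a hypothetical DSM lookup callable, and the 0.1 m tolerance mirrors the convergence criterion described above.

```python
import numpy as np

def image_to_ground(x, y, cam_pos, M, f, dsm_height, h0, tol=0.1, max_iter=20):
    """Iteratively intersect the image ray with the terrain.
    dsm_height(X, Y) returns the DSM elevation at a ground position; iteration
    stops when the candidate water level changes by less than tol (meters)."""
    XL, YL, ZL = cam_pos
    # Ray direction in object space, from the inverse of the collinearity model.
    d = M.T @ np.array([x, y, -f])
    h = h0                               # start from the road elevation
    for _ in range(max_iter):
        s = (h - ZL) / d[2]              # scale factor to reach elevation h
        X, Y = XL + s * d[0], YL + s * d[1]
        h_new = dsm_height(X, Y)         # candidate water level at (X, Y)
        if abs(h_new - h) < tol:
            return X, Y, h_new
        h = h_new
    return X, Y, h
```

Over flat terrain the loop converges in one step; over sloping terrain each pass moves (X, Y) along the ray until the elevation difference falls below the tolerance.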
Water storage can be expressed as the following discrete time continuity equation:

Si = Si−1 + Ii − Qi   (8)

where Ii is the inflow at time i; Qi is the outflow at time i, which is the designed drainage capacity of the sewage system; and S0 is the initial storage at time 0, which is assumed to be 0. All units of storage are cubic meters.
The proposed method is based on the following basic assumptions: (1) Rainfall and design drainage capacity in the small study area are uniform; (2) Rainfall in the study area is drained directly through the sewage system with a completely impermeable landcover and without baseflow and soil infiltration; (3) The sewage system is fully functioning; and (4) Water detention by buildings and trees is ignored. Based on these assumptions, rainfall is accumulated in a DSM, and the design drainage capacity is the homogeneous outflow in each DSM grid. When the accumulated water exceeds the design drainage capacity of the DSM grid, the inundation water level is obtained from the lowest elevation. Finally, the flooding water level and flooding map are updated by unit observation time. The flooding water level can be expressed as follows:

Hj = min(Z) + Sj / A   (9)

where A is the study area, Sj is the water storage at time j, and Z is the ground elevation.
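The lumped storage bookkeeping above can be sketched as a short loop. This is a minimal illustration under the paper's assumptions (uniform inflow and drainage, no baseflow or infiltration); it approximates the water level as the lowest elevation plus storage spread uniformly over the study area, and the function name and units are ours.

```python
def simulate_flood_levels(inflows, outflow, area, z_min, s0=0.0):
    """Discrete-time continuity S_i = S_(i-1) + I_i - Q_i with storage floored
    at zero, followed by a water level of the lowest ground elevation plus
    storage spread over the area (storage in m^3, area in m^2, levels in m)."""
    levels, s = [], s0
    for inflow in inflows:
        s = max(s + inflow - outflow, 0.0)   # drained storage cannot go negative
        levels.append(z_min + s / area)
    return levels
```

For example, with inflows of 100, 50, and 0 m³ per step against a 30 m³ drainage capacity over a 1000 m² area starting at 5.0 m, the level rises and then recedes as the storage drains.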

Results
In this study, RF, maximum likelihood (ML), and support vector machine (SVM) classification were used to identify three classes (vegetation, water, and building) in the image-based VGI. A total of 26,933 pixels were selected for all three classes through equalized random sampling, divided into 13,000 pixels as a training set and 13,933 pixels as a test set. Regarding the RF parameter settings, three classified features at each node and 60 trees were used. Table 1 shows the percentages of producer's and user's accuracy, F1 score, overall accuracy, and Kappa statistic for the three classification categories at three times. The RF classification results were the best, with overall accuracies of 80.10% (15:20), 80.12%

Table 1. The percentages of producer's accuracy, user's accuracy, F1 score, overall accuracy, and Kappa statistic for the three classification categories of the RF, ML, and SVM classifications.
After water line extraction was completed, the calculation of VGI water levels relied on the shooting position, camera orientation parameters, and control point coordinates. The approximate shooting positions were confirmed using the collected image-based VGI, Google Street View, and the DSM. The camera interior orientation parameters, including the focal length and charge-coupled device (CCD) size, were referenced from published smartphone camera specifications.
This study calculated the average value of public smartphone parameters as initial values. For the initial interior parameters, the focal length was 4.6 mm, and the CCD size was 1/2.5". Considering that a person normally holds a camera in a typical posture when shooting a ground scene, the initial three-axis orientations of the exterior parameters θx, θy, θz were set as (80°, 5°, 80°), respectively. To solve the seven uncertain parameters, including the shooting position (XL, YL, ZL), orientations θx, θy, θz, and the focal length f, more than seven control points were provided as redundant observations for the LSQ calculation. In this case study, nine control points could be distinguished in each image-based VGI and the UAV orthophoto. The point coordinates are listed in Table 2.
Through the rainfall observations and simple discrete time continuity Equations (8) and (9), the flooding process and simulated water level were determined. The parameters of rainfall runoff simulation are shown in Table 3. The initial flood water level is assumed to be the lowest ground elevation (5.670 m), which was lower than the average ground elevation (8.989 m) of the VGI scene. When the simulated water level, which refers to the lowest ground elevation plus the depth of flooding, is higher than the ground elevations of the DSM, the flooding ranges are drawn in the UAV-derived DSM and orthophoto.
Control points, orientation parameters of the image-based VGI, detected water lines, and VGI water levels were calculated through LSQ, as listed in Table 4. VGI water levels of 9.398 m, 9.326 m, and 9.273 m occurred at 15:20, 16:10, and 16:20, respectively. The differences between the VGI water levels and simulated water levels were between 0.018 m and 0.045 m. Notably, all of the VGI water levels were higher than the simulated water levels, possibly because the simulated water levels had been underestimated; this highlighted the problem of baseflow not being considered in the simple conceptual model. Conversely, these differences could be used to estimate the baseflow in the study area. Since VGI images convey ground-truth information to assist in the correction of hypothetical lumped models, the water level differences were considered to result from neglecting baseflow and soil infiltration. The neglected values are between 0.318 and 0.796 m³/hr, estimated by multiplying the study area (0.0637 km²) by the hourly water level differences (between 0.018 and 0.045 m). The estimated values provide references for revising the hydrological modeling.
The rainfall hyetograph and simulated water level are shown in Figure 4, and the simulated flooding process from 14:30 to 17:40 is shown in Figure 5. The simulation analysis confirmed that a significant water level of 9.252 m was reached at 15:00, and the flooded area was located on the trunk road. At 15:20, the peak water level of 9.353 m was reached, and the flood spread over all roads in the study area. By 16:30, the flood had gradually receded, and the water level had decreased to 8.586 m. Finally, at 17:40, the flooding event ended. The analysis was conducted in MATLAB R2018a 64-bit (Natick, MA, USA) and was tested on a personal computer (processor: Intel Core i7-4700HQ 2.4 GHz (Santa Clara, CA, USA); memory: 8.00 GB DDR3; operating system: Windows 7 (Redmond, WA, USA)). The average time required to classify image-based VGI and detect water lines was 355 s, and the average time required to determine VGI water levels and orientations was 42 s. In summary, the proposed approach quantified the water levels to within 0.001 m by using image-based VGI and real field information; thus, this approach can determine warning water levels and positions as references for disaster relief.

Discussion
In the past few decades, many studies have used urban flood simulation models to estimate the temporal impact of flooding. These flood simulations are limited by hydrological observations and physical complexity, which affect the model's suitability and calculation efficiency. Furthermore, the collection and verification of in situ observation data are crucial for the evaluation of the simulation. Especially on busy roads in urban areas and in the absence of water gauges, image-based VGI that records flooding provides an opportunity for spatial and temporal verification.
This study combined image classification, line detection, collinearity equations, and LSQ to quantify water levels based on image-based VGI acquired by smartphone cameras. Based on the theoretical analysis and validation described in this paper, image-based VGI classified using the RF classifier can be used to identify flooded areas. The proposed novel approach successfully manages the ambient complexity of urban image-based VGI to detect water lines. Through a centimeter-accurate DSM, the detected water lines are used to quantify flood water levels. Therefore, the absence of water level monitoring equipment at a flooding site is no longer a barrier to the acquisition of information. Moreover, by employing photogrammetric principles, the proposed method can determine imaging locations, water levels, and smartphone camera parameters. In addition, differences between VGI and simulated water levels provide a baseflow reference for simple flood modeling. The quantified water levels, with centimeter resolution (<3-cm difference on average), can validate flood modeling so as to extend point-basis observations to area-basis estimations. In this way, the previously limited quantitative use of image-based VGI has been improved for dealing with flood disasters.

Conclusion
Overall, this research reveals that flooding water levels can be quantified by linking VGI classification images with a UAV-based DSM through photogrammetry techniques, thus overcoming the limitations of past studies that used image-based VGI only to qualitatively assess the impact ranges of flooding. Based on our results, we suggest that the use of image-based VGI to obtain flooding water levels must meet two requirements. The first requirement is orthographic images and a DSM with centimeter-level ground resolution, to provide spatial identification of control point features in image-based VGI, such as building corners, ground markings, and streetlights. The other requirement is that the shooting perspective of the image-based VGI should avoid being nearly parallel or perpendicular to the ground, in order to reduce the difficulty of identifying the flooding water line in images and any resulting errors. Fortunately, most VGI images are shot at an oblique perspective. In other words, a better spatial distribution of image-based VGI captures the outlines of buildings or bridges as spatial references within the image frame, as in Google Street View.
In the future, the proposed VGI setup can be promoted through the execution of street-monitoring techniques to supply long-term continuous imagery; these images, acquired by smartphone cameras, can then be incorporated into Google Street View to construct and update local spatial information. Eventually, by employing more deep learning techniques with distributed computation, a great number of VGI photos can be processed and integrated into a disaster-monitoring system for flooding and traffic management in an economical and time-efficient manner.