Proposal for an index of roads and structures for the mapping of non ‐ vegetated urban surfaces using OSM and Sentinel data

The use of volunteered geographic  information (VGI), such as OpenStreetMap (OSM), to assist in mapping land use and coverage together with remote sensing images is relatively recent. Most studies have used OSM to assist in sample collection for image classification or aggregated vectors of buildings, transport, and land‐use and coverage as ancillary data to support mapping refinements. This study proposes a metric called the “index of  roads and structures”  (IRS) created on  the basis of OSM data with  the intention of assisting in the mapping of non‐vegetated and non‐aquatic urban surfaces. IRS  thresholds were  defined  and  supplemented with  information  derived  from  the Normalized Difference Vegetation Index (NDVI) and the Modified Normalized Difference Water  Index (MNDWI) as a way of extending the restriction between urban and non‐ urban classes and  thus achieving better mapping accuracy. To  implement  this study, multispectral Sentinel 2 images resampled to 10 m on the ground were processed in the Google Earth Engine (GEE). The IRS is a raster file, in which each pixel is associated with the possibility of being inserted in an urban context; thus, the importance of this index as an aid in mapping urban areas is clear. We have demonstrated the possibility of using the IRS to map urban surfaces in Brazil (8.5 million square kilometers) and obtained a very high accuracy of 93.2%.


Introduction
The mapping of urban areas using remote sensing is not a trivial task. The difficulty is partly due to the rich diversity of deposited materials-constructed or notwith different colors and arrangements, which makes this environment particularly complex. Pixels of any size in urban environments may contain a mixture of surface coverings with different spectral signals (Wentz et al., 2014). Hidden behind this "physical" complexity is another complexity: one that influenced the expansion of the city, linked to natural, historical and economic factors (Venturi, 2021).
In practical terms, to extract information using remote sensing images, the diversity of materials present in the urban environment is interpreted through its spectral properties (Harolds;Roberts, 2010). Different segments of the city may have more or less spectral diversity depending on the parceling of the land and the type of use linked to physical and social conditions. The complexity of mapping urban areas is also linked to the characteristics of the images used in mapping, since, depending on the instrument configuration, specific details in terms of spectral signatures and spatial dimensions are recorded.
One very rational and efficient methodology of analysis has been presented by Ridd (1995), who proposed that the urban environment consists of vegetation, impermeable areas and exposed soil (the V-I-S model), but aquatic surfaces are disregarded. The great advantage of the V-I-S model is that it makes it possible to analyze the structure of the territory (neighborhoods, districts, etc.) via the presence of these three components in the landscape, also making it possible to link these components to the prevailing social, economic and functional conditions in the space. Another advantage is that the V-I-S model uses various sources of input data (aerial photographs and orbital images with different levels of resolution) and forms of extracting thematic information.
Approaches both to the classification of images and the enhancement of components, as well as hybrid strategies (enhancement followed by classification) are widespread in the mapping of urban areas. The advantages and disadvantages of each are well described in the literature (Lu;Weng, 2007;Mesev, 2010;Jensen, 2016).
The advent of the possibility of sharing volunteered geographic information, in which the user generates content on the Internet, has created new horizons in urban studies (Goodchild, 2007). OpenStreetMap (OSM) has been formed in a community effort to map the world using a platform in which natural elements and human infrastructure are defined and labeled (Albrecht, 2020). Recently, various studies have used OSM in conjunction with remote sensing images and other data sources with the aim of mapping urban areas. It can be seen from Table 1 that most of these studies used OSM data to assist in the collection of samples in the classification of images or made use of the features available as an additional data source (ancillary data) to be incorporated into their studies. Faced with the challenge of mapping urban areas and the importance of this information in territorial planning, this study proposes a new metric called the index of roads and structures (IRS) created from OSM data and spectral indices. To demonstrate the efficiency of the index, we selected the entire territorial expanse of Brazil with its 8,514,876 km² as a test area. Brazil was selected as a study area because it includes a very complete showcase of the main landscapes and ecologies of the tropical world (Ab'Sáber, 2002), into which urban areas are inserted.

Materials and methods 2.1. Analysis platform and satellite data
The data from the OSM were obtained from the Geofabrik platform (https://download.geofabrik.de/south-america/brazil.html) on dates between 03/29/2021 and 04/04/2021, so that there would be no major variations between the versions of the files provided by Geofabrik. ArcGIS software version 10.2 was used to select polygons and features, to pre-process them and for the conversion to raster files.
The cloud-image processing platform Google Earth Engine (GEE, Google Inc.) was used to process rasterized OSM files and orbital images. To map the details of urban areas, we chose to use Sentinel images from 2020, standardizing the data set to 10meter spatial resolution. As pre-processing steps, the images were filtered to eliminate clouds and considering a maximum cloud cover of 40% of the scene. Values below 40% were considered too restrictive for some localities, such as in the Caatinga biome, which would prevent the analysis of the images collected in wet seasons-a very important aspect for implementing the proposal. In total, the analysis considered 42,578 Sentinel 2 scenes in an area corresponding to 85 billion pixels.

Construction of the index of roads and structures (IRS)
The IRS is a composite index that aggregates information from the OSM, more specifically from road, building, land-use and transport features. Two stages were adopted for the construction of the IRS. The first step consisted of calculating the density of roads; in the second stage, information related to the type of land use and transport was selected. These were incorporated into the IRS in two ways: the density of contour polylines and a constant value (offset), assigned to the selected land-use and transport polygons. This constant value was necessary because some areas have dimensions that exceed the influence radius of the kernel function. To be compatible with the resolution of Sentinel 2 images, the pixel size considered for generating the density raster file was 10 meters.
For the calculation of the road density, as well as for the calculation of the density of perimeters of the land-use and transport features, the following formula, adapted from Silverman (1986, Eq. 4.5), was used (Eq. 1): Eq. 1: Where: r = influence radius; L i = length of lines in the radius of influence; P i = number of pixels in the influence radius where the lines occur (population field value).
A crucial aspect in calculating the density of roads and the polylines of buildings and land-use features concerns the definition of the influence radius. This value was defined empirically, testing different radius values until a result was achieved that was satisfactory considering the pros and cons.
In the context of the urban environment, the density values between roads should be greater than that of a road, while smaller in rural areas. In this study, considering the extent, complexity and latitudinal range of the Brazilian territory, the influence radius of 0.002 degrees, approximately 210 meters and 0.0003 degrees of resolution, proved to be satisfactory for the intended adjectives, since the values among urban roads were mostly higher than the value obtained for a single road, whereas in rural roads, the values found were predominantly lower. In GEE, the resolution was resampled to 10 meters, matching the Sentinel 2 images. It is important to note that these values are not necessarily fixed and may vary in different urban contexts.
For the calculation of the density for buildings, the composite features were divided and a limit for the perimetral length was established. Of the land-use and transport vectors, only the features associated with the urban context were considered (Table 2). Many areas represented by land-use and transport features, such as military areas and airports, have high proportions of aquatic or vegetation coverage. In such cases, it was necessary to restrict these features, eliminating areas covered by vegetation and water. The remaining individual features, whose perimeters also exceeded 200 meters, were added to the building features, which were transformed into polylines. Subsequently, the same kernel density calculation as applied to roads was used (Eq. 1).
In the density image of contour polylines, large structures such as industrial plants, transport areas and storage sheds with small or zero values were observed, which may eliminate the representativeness of some of these IRS features. This is due to the distance between their perimeters and their internal areas. To avoid this problem, a constant value (offset) was assigned to the areas of all features, which was added to the density of perimeters, generating a raster with the density of structures. Finally, the IRS was obtained by adding the image representing the density of roads to the image representing the density of structures (Figure 1). In the IRS, zero values refer to areas without a road system or structures, low values refer to single-lane highways and nonurban roads, and high values refer to road junctions and urban settlements.

Exclusion of vegetated and aquatic surfaces
High values of IRS are closely associated with the presence of urban settlements, but the index does not exclude vegetated and aquatic surfaces. Thus, as a complement to the IRS, the Normalized Difference Vegetation Index (NDVI; ROUSE et al. 1973) and Modified Normalized Difference Water Index (NDWI; XU, 2006) spectral indices were used to impose restrictions. While NDVI (Eq. 2) provides an indication of biomass above the ground, MNDWI (Eq. 3) emphasizes features of bodies of water in addition to decreasing the correlation of water with features associated with built areas. The formulas adopted for Sentinel 2 were: In order to highlight the differences in signatures, maximum values of NDVI and MNDWI constructed from a temporal set of 42,578 Sentinel 2 images obtained for all of Brazil at different times in 2020 were considered in the analysis. The cut-off values defined in the composition of the IRS for the NDVI and MNDWI indices were 0.5 and 0.2, respectively. By adopting this strategy of using temporal composites (NDVI max and MNDWI max ), much class-to-class confusion was reduced due to the coverage's own development characteristics, which permits a better separation between classes.

Methodological proposal for mapping non-vegetated urban surfaces
Figure 2 shows a model for composing the IRS ( Figure 2A) and its use, supplemented with information from the NDVI and MNDWI ( Figure 2B and Figure 2C). This model has been used to map urban areas of Brazil and can be replicated for different areas of the globe, provided that information from the OSM is available and has details that allow the features to be associated with the urban infrastructure.
For the composition of the IRS (Figure 2A), a higher value among urban roads and a lower value among rural roads should occur, taking as reference the value of an isolated road for the same influence radius.
The establishment of NDVI and MNDWI thresholds ( Figure 2B) was first used to establish restrictions on transport and land-use features, eliminating water-and vegetation-covered surfaces. In a second step ( Figure 2C), the thresholds of these indices were used to restrict the IRS image. The NDVI and MNDWI thresholds, which can be defined in the steps described in Figure 2B and Figure 2C, are not necessarily the same, neither is the nebulosity threshold.
Nebulosity thresholds, NDVI, MNDWI and IRS, with values of 10; 0.5; 0.2 and 600, respectively, were established to produce the first mapping of Brazil with the IRS. In this, the boundaries of urban areas were defined.

Mapping accuracy
In order to evaluate the accuracy of mapping of non-vegetated urban surfaces, 600 points distributed in a stratified random manner were used, distributed in 300meter-wide bands, established by means of buffers of urban area limits that take capitals and smaller urban settlements into consideration. The total area of these bands was 2,437 km². These border areas were selected because they are areas with the greatest likelihood of omission and commission errors. For the classification of validation points, high-resolution images from Google Earth (Google Inc.) and from the Planet satellites (Planet Labs PBC) were used (Figure 3). Thus, samples were collected in various climates and biomes, distributed in broad latitudinal and longitudinal bands, including diversities of urban settlements within matrices of preserved areas, agricultural crops or pastures. The use of IRS for mapping urban surfaces not covered by water or vegetation was evaluated at different thresholds of NDVI and MNDWI ( Figure 2B), IRS ( Figure 2C) and nebulosity. We performed several classifications to evaluate the sensitivity (true positive rates of points classified as urban surface not covered by water or vegetation), specificity (true negative rates of points classified as water or vegetation) and accuracy (percentage of points correctly classified in relation to the total) of the mapping.

Results
Visual analyses indicated the great potential of the IRS to improve the mapping of urban areas at different levels of agglomeration. The effectiveness of the index increased when complementary information obtained by spectral indices was incorporated into the analysis as a way of expanding the restrictions. Table 1 shows the results of the classifications adopting different thresholds for cloud, NDVI, MNDWI and IRS, and the respective performances obtained. As can be seen, the best result was found with Classification 13, since it gave the highest accuracy value (0.9167) and the lowest mean absolute deviation (0.0088). However, when NDVI and MNDWI information were not included in the classification (Classification 14), the accuracy of the mapping was much lower (0.5833) and with an absolute mean deviation well above the other classifications (0.2787). When comparing the IRS with the NDVI and MNDWI, a low linear correlation was observed (-0.2931 and -0.1255, respectively), indicating the non-redundancy of information among these indices to be considered in the mapping of urban areas (   Figure 4 B, C and D). In other words, the low correlation is an indication that the information in the IRS is complementary to that obtained with the NDVI and MNDWI.
A mapping of urban surfaces of the municipalities of Porto Nacional and Palmas, capital of the State of Tocantins, not covered by water or vegetation, was produced ( Figure 4A) using the thresholds presented in Classification 13 (Table 3). This cutoff has relatively balanced proportions of water, vegetation and urban areas. The result was found to be quite satisfactory, showing different patterns in the distribution of the intra-urban vegetation and areas where the densities of roads and structures were higher.

Discussion
Several studies have used OSM and remote sensing for urban mapping (see Table  1). As far as we can see, OSM was used to define samples (Di Yang et al., 2017;Liu et al., 2020 andZhang, Gorelick andZimba, 2020), to determine land parceling (CHEN, W. et al., 2018;CHEN, W. et al., 2018;SUN, J. et al., 2020;Zhong et al. 2020 andCHEN, et al., 2021) and to assist in obtaining semantic objects (Zhao et al., 2019 andZhong et al., 2020), which were subsequently processed by artificial-intelligence algorithms for producing mappings, which delimit distinct classes. This research demonstrated the possibility of using data from OSM in conjunction with orbital images to map urban and non-urban features involving extensive territorial areas, using relatively simple but robust techniques in terms of practical results.
The IRS can be considered to be a raster file, in which the value of each pixel is associated with the possibility of being part of the urban area. Its quality depends on the detail and completeness of surface objects representation in the vector files provided by Geofabrik. The high values of sensitivity, specificity and accuracy obtained in mapping are indicative of the good quality of representation of roads, buildings and land use in OMS.
In Ridd's V-I-S model, the urban environment consists of vegetation, impermeable areas and exposed soil. The validation samples used in this study also identified these three elements, which predominate in the Brazilian landscape, together with a fourth that corresponds to aquatic surfaces. This composition reinforces the viability of the model proposed by Ridd (1995) to characterize, in a more detailed way, the intra-urban structure of Brazilian municipalities. Of the 600 validation points, 313 were located on an urban surface not covered by water or vegetation (buildings, roads, parking lots and others). Of these, 56 were exposed soil and four were sand. The increased occurrence of these surface materials in urban areas can be explained by the peripheral location of the sampling band, often in areas where cities are expanding.
To achieve the accuracy of 91.7% mapping, IRS thresholds combined with nebulosity, NDVI and MNDWI values were defined. We considered that this value followed a very rigid and conservative protocol, since reference samples used in the calculation of accuracy were collected over areas of higher tension, i.e., areas with greater chances of finding confusion between classes. It is therefore reasonable to assume that the "real" result for mapping accuracy tends to be greater than the value disclosed here. Another point observed is that commission and omission errors showed a difference of 0.0136 (see Classification 8, Table 3). In other words, when urban-area calculations are aggregated by municipalities, the estimates obtained tend to converge to values closer to reality, since deviations associated with inclusion or exclusion errors tend to compensate one another when their values are approximately equal.
An important aspect to be considered in mapping large areas in a tropical environment concerns restrictions imposed by atmospheric conditions. For this, GEE played a crucial role, because it allowed not only the processing of a very large mass of data, but it also made it possible to produce synthetic images through the use of temporal reducers (Gorelick et al. 2017). Even using this resource, the atmospheric conditions observed in some parts of Brazil imposed limitations on the definition of the best thresholds to be adopted for classification. As can be seen in Table 3, the accuracy of Classification 1 was better than for Classification 2, suggesting that the maximum nebulosity limit of 10% would achieve better results. However, by adopting a lower threshold of atmospheric interference, 59 municipalities ended up with no urban area mapped within their territories. This problem was due to the lack of Sentinel 2 images with low nebulosity or cloud masking that returned pixels associated with urban areas without information, thus omitting these areas from the classification.
Of the Brazilian biomes, Caatinga (vegetation associated with a semi-arid climate located in the northeastern region of the country) required greater attention, considering the possible temporal series of 2020. In Caatinga, herbaceous plants only grow during the rainy season, as do grasses, which is why they are not apparent for most of the year (Rizzini, 1997 p.515). For Ab'Sáber (2003), Caatinga often functions as a cloudy semi-desert, with everything reverting when the first rains arrive. As a result of this characteristic, restricting orbital images to scenes with little or no cloud would result in lower NDVI values, even considering annual maximum values. As an example, we can highlight the area of a company in the municipality of Sobral, in the state of Ceará ( Figure  5A). Using the maximum nebulosity limit of 10%, we found lower NDVI max values ( Figure  5B) than those found when using the 40% nebulosity limit ( Figure 5C). In this way, the nebulosity limit is also an important variable in the IRS configuration and in mapping with this index. Due to this particular characteristic, unlike the rest of Brazil, the threshold for selecting scenes of up to 40% nebulosity was adopted for the composition of the IRS for the states in the northeastern region. Especially in the Caatinga and Cerrado biomes (northeastern and centralwestern regions), confusion in the classifications involving urban areas, senescent vegetation, exposed soils, sand banks and rocky outcrops is commonly reported in the reviews of local literature (Silva;Cruz, 2018), with these confusions being more pronounced in the dry period. Hence the decision to work with images preferably obtained in the wet period (Almeida Filho; Carvalho, 2010). The adoption of NDVI max and MNDWI max compositions acts in a similar way to images obtained in wet periods, with the advantage of selecting only pixels with higher temporal series values within a stack of images. This resource is very useful, for example, for reducing any effects on vegetation caused by atypical episodes of drought within the wet season.
It has been estimated that the total of urban surfaces not covered by water or vegetation occupied 43,265 km² of the Brazilian territory ( Figure 6). As can be seen, urban surfaces are not uniformly distributed across the country, with a lower concentration in the Pantanal and Amazon biomes. The highest concentrations occur in the Caatinga and Atlantic Forest biomes, with the main capitals and metropolitan regions of the country being located in the latter biome. The explanation for this regional pattern of occupation is due to the history of colonization of Brazil, which began from the coast and extended to the inland areas as colonization advanced. In the Cerrado and the Amazon, the most effective process of occupation was much more recent, being closely related to the large Amazon occupation projects implemented especially in the 1960s and 1970s (Becker, 1982). Currently, the regions of Cerrado in the Central-West have experienced urban growth driven by the expansion of agribusiness, with grain production and extensive cattle ranching.

Conclusions
The use of the IRS, based on data from OSM, in conjunction with the NDVI and MNDWI, enabled the mapping of urban surfaces not covered by water or vegetation with high accuracy in Brazil for areas inserted into diverse biomes, distributed over a wide latitudinal and longitudinal range.
The IRS, NDVI max , MNDWI max and nebulosity thresholds of the Sentinel 2 scenes were shown to be important variables in the mapping process, without which the high accuracy would not have been achieved.
The proposed methodology for mapping using the IRS made it possible to measure the areas of non-vegetated urban surfaces of all Brazilian municipalities, totaling 43,265 km².