Global inventory of landscape patterns and latent variables of landscape spatial

We present a regionalization of the entire Earth’s landmass into land units of homogeneous landscape patterns. The input to the regionalization is a high-resolution Global Land Cover (GLC) dataset. The GLC is first divided into local landscapes, small non-overlapping square blocks of GLC cells. These blocks are agglomerated into much larger land units using a pattern-based segmentation algorithm. These units are tracts encompassing cohesive patterns of land cover, and the procedure divides the entire landmass into tracts of land with discernibly different patterns. We characterize the pattern in each unit by a set of 39 landscape metrics. The resulting spatial database of land units is the major product of this study. We make this database freely available to the community in order to provide foundational information for studies aiming at explaining relationships between landscape pattern and ecological process, and between the process and patterns and their controlling factors. The procedure for obtaining the database is described, a quality assessment of the unit delineation is given, and statistics of the major properties of the units are presented. To showcase the utility of the new database we use it to demonstrate that the variability of geometric configurations of landscape patterns worldwide can be captured in terms of only two variables, complexity and aggregation, as they explain 70% of the variability. This allows for a meaningful, two-dimensional classification and mapping of landscape patterns on the basis of their geometry. Such mapping reveals that the majority of terrestrial landscapes are characterized by a simple, frequently monothematic, pattern of land cover. Thus, landscapes on Earth are mostly segregated by land cover type, and complex landscapes with a diverse mix of different land cover types are rare exceptions from the prevailing monothematic cover.

GeoPAT is based on the principle of seeded region growing (SRG) (Adams and Bischof, 1994) but has a number of features that distinguish it from image segmentation algorithms. It segments a grid consisting not of single-category cells but of blocks having complex content (a pattern of different categories) and a non-negligible spatial extent.

Because of the non-negligible size of the blocks, their grid is not rectangular; instead, it consists of alternating horizontal layers of blocks, with each layer shifted by half a block length with respect to the previous one, as in masonry. Such a grid is easy to set up and is a sufficiently good approximation of the preferred isotropic hexagonal grid, which is difficult to set up but, because of its isotropy, minimizes segmentation artifacts associated with tessellation.
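The masonry-like layout described above can be illustrated with a minimal Python sketch. It is not the GeoPAT implementation: the function names, the choice to shift odd-numbered layers to the right, and the treatment of grid edges are assumptions made for illustration only. It shows that an interior block in such a grid touches six other blocks, as it would in a hexagonal grid.

```python
def brick_grid_origins(n_rows, n_cols, block_size):
    """Upper-left corners (x, y) of blocks in a brick-wall (masonry) layout.

    Assumption for illustration: odd-numbered layers are shifted to the
    right by half a block length; even-numbered layers are not shifted.
    """
    origins = {}
    for r in range(n_rows):
        shift = block_size / 2 if r % 2 else 0.0
        for c in range(n_cols):
            origins[(r, c)] = (c * block_size + shift, r * block_size)
    return origins


def brick_neighbors(r, c, n_rows, n_cols):
    """(row, col) indices of the blocks that share an edge with block (r, c)."""
    # Two neighbors in the same layer.
    candidates = [(r, c - 1), (r, c + 1)]
    # Two neighbors in each adjacent layer; which columns they occupy
    # depends on whether the current layer is the shifted one.
    column_offsets = (0, 1) if r % 2 else (-1, 0)
    for dr in (-1, 1):
        for dc in column_offsets:
            candidates.append((r + dr, c + dc))
    # Drop candidates that fall outside the grid.
    return [(i, j) for i, j in candidates if 0 <= i < n_rows and 0 <= j < n_cols]


if __name__ == "__main__":
    print(brick_grid_origins(2, 2, 30.0))
    print(brick_neighbors(2, 2, 5, 5))
```

An interior block has six neighbors, while blocks on the grid boundary have fewer; a segmentation algorithm growing regions over such a grid would consult this neighborhood when deciding which blocks to merge.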
where $H(M)$ indicates the value of the Shannon entropy of the histogram $M$:

\[ H(M) = -\sum_{i=1}^{|M|} m_i \log_2 m_i \]

where $m_i$ is the value of the $i$th bin in the histogram $M$ and $|M|$ is the number of bins (the same for both histograms). For can be found in the other).
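As a minimal sketch of the entropy calculation, the Python code below computes the Shannon entropy of a histogram after normalizing it to unit sum. The text here defines only H(M); the second function, a Jensen-Shannon divergence between two histograms, is an assumed example of an entropy-based dissimilarity and is not confirmed by this passage. The function names and the base-2 logarithm are likewise assumptions.

```python
import math


def shannon_entropy(hist):
    """Shannon entropy (in bits) of a histogram, normalized to sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have a positive sum")
    # Empty bins contribute nothing, following the convention 0 * log 0 = 0.
    return -sum((m / total) * math.log2(m / total) for m in hist if m > 0)


def jensen_shannon(hist_a, hist_b):
    """Jensen-Shannon divergence between two histograms with identical bins.

    Assumed stand-in for an entropy-based distance; with base-2 logarithms
    the result lies between 0 (identical) and 1 (no bins in common).
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    sum_a, sum_b = sum(hist_a), sum(hist_b)
    p = [m / sum_a for m in hist_a]
    q = [m / sum_b for m in hist_b]
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2


if __name__ == "__main__":
    print(shannon_entropy([1, 1, 1, 1]))
    print(jensen_shannon([3, 1, 0], [0, 1, 3]))
```

Two histograms that share no occupied bins are maximally dissimilar under this measure, and identical histograms have zero dissimilarity.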

The most important parameter of the segmentation algorithm is the size of the block, k, which sets the scale of the  terize the composition of the patterns, and 17 of them are configurational landscape metrics (see Table 1 for details).

For these calculations, we use our own code optimized for working with 100,000s of landscape units of different sizes  principal components. Rotated components are not "principal" inasmuch as they are not uncorrelated, but they can be of landscape spatial configuration.

Using the pattern-based segmentation methodology described in Section 2.2, we regionalized the entire Earth's landmass into regions (segments) containing cohesive land cover patterns. Four regionalizations were calculated, distinguished by the assumed scale of the local landscape (size of the block): 30 km, 15 km, 9 km, and 6 km. Table 2 summarizes the number of blocks and segments in each regionalization. Increasingly smaller blocks enclose increasingly specific patterns. More specific patterns extend over smaller regions; thus, using a smaller block size leads to a larger number of regions (fourth column in Table 2).
The purpose of generating landscape regionalization/landscape database using different block sizes is that they  Table 1, standardized values of the first 10 principal components, standardized values of the first 10 rotated principal components, and the type of landscape spatial configuration (see section 4)).

The last two columns in Table 2 quantify the quality of the delineation of regions. According to Haralick and Shapiro (1985), a segmentation is "good" if patterns within regions are cohesive and adjacent regions are dissimilar from the focus region. We measure the cohesiveness of a region's pattern using an inhomogeneity metric. Inhomogeneity is a property of a single region; it measures the degree of mutual dissimilarity between all local patterns (blocks) within the region. As a measure of region inhomogeneity we use the average distance between all distinct pairs of blocks in the region. For a region $S$ consisting of blocks $(M_1, \ldots, M_k)$ the inhomogeneity $\delta$ is given as:

\[ \delta(S) = \frac{2}{k(k-1)} \sum_{i=1}^{k-1} \sum_{j=i+1}^{k} d(M_i, M_j) \]

where $d(M_i, M_j)$ is the distance between the patterns in blocks $M_i$ and $M_j$. cover.

We measure the degree to which a region stands out from its neighboring units using an isolation metric. Isolation is also a property of a single unit; it is calculated as the average linkage $\langle D(S_0, S_\alpha) \rangle$ between the focus region $S_0$ and all of its immediate neighbors $S_1, \ldots, S_n$, where $n$ is the number of neighbors and the symbol $\langle \, \rangle$ indicates averaging. The linkage (the distance between two groups of blocks in two regions) is given by

\[ D(S_0, S_\alpha) = \frac{1}{k_0 k_\alpha} \sum_{i=1}^{k_0} \sum_{j=1}^{k_\alpha} d(M_{0,i}, M_{\alpha,j}) \]

where $k_0$ and $k_\alpha$ are the numbers of blocks in the focus region and in one of its neighbors, respectively, and $M_{0,i}$ and $M_{\alpha,j}$ are blocks of those two regions. Isolation has a range between 0 and 1; regions with larger values of isolation stand out more from their neighboring segments.
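The two quality measures described above, inhomogeneity as the average distance over all distinct pairs of blocks in a region and isolation as the average linkage to the immediate neighbors, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the block-to-block distance is passed in as a parameter, the `total_variation` distance is only a simple stand-in that happens to range from 0 to 1, and assigning zero inhomogeneity to a single-block region is a convention chosen here.

```python
from itertools import combinations
from statistics import mean


def inhomogeneity(blocks, dist):
    """Average distance between all distinct pairs of blocks in one region."""
    if len(blocks) < 2:
        return 0.0  # assumed convention: a single block is perfectly cohesive
    return mean(dist(a, b) for a, b in combinations(blocks, 2))


def linkage(region_a, region_b, dist):
    """Average distance over all pairs with one block taken from each region."""
    return mean(dist(a, b) for a in region_a for b in region_b)


def isolation(focus, neighbors, dist):
    """Average linkage between the focus region and each immediate neighbor."""
    return mean(linkage(focus, nb, dist) for nb in neighbors)


def total_variation(p, q):
    """Stand-in distance between two normalized histograms (range 0 to 1)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


if __name__ == "__main__":
    # Each block is represented by a normalized two-bin histogram.
    focus = [(1.0, 0.0), (0.9, 0.1)]
    neighbors = [[(0.0, 1.0)], [(0.5, 0.5), (0.4, 0.6)]]
    print(inhomogeneity(focus, total_variation))
    print(isolation(focus, neighbors, total_variation))
```

In this toy example the focus region is cohesive (low inhomogeneity) and clearly different from its neighbors (higher isolation), which is the pattern expected of a well-delineated region. A region without neighbors would need separate handling, since the average over an empty neighbor list is undefined.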

Average values of isolation (calculated over all regions in the regionalization) are given in the sixth column of Table 2. Their values are much higher than the values of inhomogeneity, indicating that region boundaries indeed separate discernibly different patterns. due to the size of the tiles (9 km in this example) being much larger than the size of CCI-LC cells.

The sizes (and shapes) of regions vary greatly because spatial extents of various landscapes vary. This can be observed to a limited degree in Fig. 2 where Amazon Tree Cover Broadleaved Deciduous (TBD) region is larger than    In the second part of the paper, we will use the database described in the previous section to find a minimal set     negatively correlated with LPI and CONTAG. Thus, the value of RC1 increases if the unit has more category patches and decreases if the unit has fewer category patches. Overall, we interpret the RC1 as a measure of "complexity"  and 1 < A < 2). On average, simpler landscapes (C < 0) occupy larger areas than more complex landscapes (C > 0).

Finally, Fig. 5D shows the percentage of total landmass area occupied by all units in a given sector of the C-A diagram.

One sector, −3 < C < −2 and 1 < A < 2, consisting of simple and aggregated regions, occupies 31% of the total landmass. Landscapes characterized by values of C and A within a standard deviation from their means occupy between 5% and 15% of the landmass per sector; the remaining landscapes occupy small (below 1%) parts of the landmass.   Table 3).  We concentrated on latent variables of pattern configuration while they look for latent variables of pattern structure including configuration and composition.

Our analysis has the advantage of being done on a much larger sample of landscapes coming from a single source, covering the entire world, and of considering patterns on multiple scales. However, because our landscape samples come in different sizes and shapes, we used a smaller number of metrics in our analysis. Despite differences in the two methodologies, we find a correspondence between the results of the two analyses. Of the seven latent variables of landscape structure found by Cushman et al., four ("contagion/diversity", "large patch dominance", "interspersion/juxtaposition", and "patch shape variability") could be compared to our findings. The remaining three variables rely on metrics that we did not utilize for reasons stated above and thus cannot be compared to our findings. Comparing Table 6 in the Cushman et al. paper with our Fig. 4, it follows that our complexity variable C incorporates their "contagion/diversity" and "large patch dominance" variables, and our aggregation variable A incorporates their "interspersion/juxtaposition" and "patch shape variability" variables. Thus, our findings agree broadly with the results of Cushman et al. and indicate that they could be valid over the entire terrestrial landmass and on multiple scales.

In addition to being a fundamental finding about spatial patterns of global land cover, the identification of complexity and aggregation as the two dominant descriptors of LP configuration greatly simplifies designing a classification of full landscape structures. In general, identifying a meaningful set of LPTs on the global scale is a daunting task due to the great diversity of observed structures. This is why a clustering approach to the delineation of global LPTs has limita-