Stochastic, Empirically Informed Model of Landscape Dynamics and Its Application to Deforestation Scenarios

Land change including deforestation undermines the sustainability of the environment. Using data on 1992–2015 pattern change in over 1.7 million mesoscale landscapes worldwide we developed a stochastic model of long‐term landscape dynamics. The model suggests that observed heterogeneous landscapes are short‐lived stages in a transition between quasi‐stable homogeneous landscapes of different themes. As a case study we used Monte Carlo simulations based on our model to derive a probability distribution for evolutionary scenarios of landscapes that undergo a forest‐to‐agriculture transit, a prevalent element of deforestation. Results of simulations show that most likely and the fastest deforestation scenario is through the sequence of highly aggregated forest/agriculture mosaics with a decreasing share of the forest. Simulations also show that once forest share drops below 50% the remainder of the transit is rapid. This suggests that possible conservation policy is to protect mesoscale tracts of land before the forest share drops below 50%.


Introduction
Earth's lands are changing due to anthropogenic impact (Venter et al., 2016). Examples of change due to a direct human impact include urbanization and deforestation. Desertification provides an example of change due to an indirect human impact via climate change. Remote sensing of the land surface from space over the last few decades reveals that land change is a globally significant environmental trend (Song et al., 2018) that affects most landmasses and land themes with the deforestation of tropical forests being its most pronounced example (Hansen et al., 2013). Understanding the long-term dynamics of land changes is needed to identify policy options aimed at mitigating its negative impacts (Chhabra et al., 2006).
Currently, landscape dynamics is addressed in two contexts: (1) In remote sensing (Liu & Yang, 2015;Olmedo et al., 2018;Verbürg et al., 2004) land use/cover change (LUCC) models provide short-time predictions of land use/cover change and (2) in landscape ecology (Gaucherel et al., 2014;Gaucherel & Houet, 2009) landscape models (LMs) predict how change in landscape pattern influences the ecological process. LUCC models (Mas et al., 2014;NRC, 2014) are designed to predict land change mostly on a pixel-by-pixel basis though some polygonal or patch-based models have also been proposed (Meentemeyer et al., 2013;Xu & Brown, 2017). Most LUCC models investigate the land change in a specific type of landscape (e.g., agricultural, forested, arid, or urban) in the local study areas, although some regional-scale LUCC models have also been published (deNijs et al., 2004;Soares-Filho et al., 2006;Tayyebi et al., 2013). LMs are designed Geophysical Research Letters 10.1029/2019GL085952 (Gaucherel et al., 2014) to either simulate a specific process, which causes a change to landscape pattern (see, e.g., Pe'er et al. (2013)) or to use a simple set of arbitrary rules (so-called neutral LM) to simulate a progression of artificial landscapes having patterns and statistical properties similar to observed landscapes.
Our interest is in understanding a long-term evolution of landscapes which results in a profound change to the land use/cover over large spatial extents. One example of such evolution is the forest → agriculture transit (FAT), which leads to deforestation. What are the most likely intermediate steps of such a transit? Are some evolutionary paths more likely than others? Neither LUCC models nor LMs can address these questions.
To be able to start to address such problems we have developed a stochastic, empirically informed model of landscape dynamics. This model is unlike either LUCC models or LMs. It is based on the principle that a probability distribution of long-term trajectories of a single landscape can be constructed from the frequencies of various changes to a large number of landscapes during a single short-term period. The basic spatial unit of our analysis is a mesoscale (∼100 km 2 area) landscape (a pattern of land cover classes). To gather the frequencies of short-term changes, we divided the entire terrestrial landmass area into ∼1,700,000 such landscapes and tabulated their (short-term) changes between 1992 and 2015. Landscapes and their transitions are classified into a finite number of types thus enabling calculation of probabilities that a given landscape type changes to another landscape type over the period of 23 years. Assuming that, in the first approximation, the set of transition types and their frequencies is complete and stationary, the set of transition probabilities constitutes a stochastic, empirically informed model of landscape evolution that can be used to address questions like those asked in the previous paragraph.
The major purpose of this paper is to describe the stochastic, empirically informed model of landscape dynamics. In addition, we demonstrate the working of the model by simulating the evolution of the landscape from forested to agricultural. Starting from a landscape of homogeneous forest and ending with a landscape of homogeneous agricultural land, the simulation yields different deforestation scenarios and their probabilities. A landscape currently covered entirely by forest will most likely evolve to an agricultural landscape along the maximum likelihood trajectory which reflects a prevailing series of circumstances; this is the most likely deforestation scenario. For a FAT to happen along less likely trajectories rare circumstances will have to happen; these are less likely deforestation scenarios.

Data and Methods
We conceptualize land change in a given areal unit as a modification of landscape pattern within this unit between two times of observations. As an input for our model we use the European Space Agency Climate Change Initiative (CCI) global data set (CCI-LC) at t 0 = 1992 and at t 1 = 2015 (ESA, 2017). CCI-LCs are temporally consistent, 300-m-resolution maps of 22 land cover classes, which we have reclassified into nine broader Intergovernmental Panel on Climate Change land categories (see the legend to Figure 1a). The entire terrestrial landmass is tessellated into 1,764,740 nonoverlapping square areal units of the size 9 km × 9 km (30 × 30 CCI-LC cells). A mosaic formed by the nine land categories in a given areal unit at a given time constitutes a mesoscale landscape (hereafter referred to as a landscape).
A landscape is characterized by its configuration (a geometry of its pattern) and its thematic content (names of land cover classes present). Landscape configuration can be succinctly parametrized (Nowosad & Stepinski, 2018) by only two metrics interpreted as "complexity" C and "aggregation" A. The land cover data are transformed into a database consisting of 1,764,740 landscapes having attributes pertaining to landscape thematic composition { t 1 , … , t 9 } and landscape configuration {C t , A t }, where t = 1992 or 2015. To simplify the data set, we classified all landscapes into up to 64 classes with respect to their configurations (see Nowosad and Stepinski (2018), for details). This is a 2-D classification that takes into consideration values of C and A but not landscape's thematic content (Computational details are given in the supporting information; the flow of computations is outlined in Figure S1). Landscapes of similar values of C and A belong to the same class regardless of their thematic content. Figure 1a shows exemplars of landscape types taken from a forest-theme subset of all landscapes; only 34 landscape types are present in this subset. Sets of archetypes, selected from different theme subsets of landscapes, are shown in the supporting information ( Figure S2). Note that it is convenient to show the archetypes as well as all other results in a grid corresponding to increasing complexity and aggregation of landscapes. We refer to such a grid as the C-A diagram. This A histogram of all landscapes at t 0 with bins corresponding to landscape types organized into the C-A grid (diagram). Empty cells in the C-A grid correspond to landscape pattern configurations that are theoretically possible but not present in the data set. Gray colored entries indicate nonzero bins. Areas of red circles are proportional to the fraction of landscapes in the bin; the sum of all bins equals to 1. (c) Probability of landscapes of a given type to maintain its type (blue) or to transition to another type (red) during the period t. diagram systematizes landscape types. From left to right the landscapes are generally more complex (they contain more land cover categories). From bottom to top the landscapes are generally more aggregated (they have larger patches). Figure 1b shows a normalized histogram for all classified landscapes at t 0 . Note that 36 out of 64 theoretically possible landscape types are present in the data set. From Figure 1b it is clear that the Bd type, which corresponds to a "simple" (low complexity) landscape is the most frequent (42% of all landscapes). Note that landscapes with disaggregated and/or more complex configurations tend to be less abundant. Figure 1c pertains to the stability of landscape configurations during the period of t = t 1 − t 0 = 23 years. A landscape is stable if during t it did not change its type. A red part of a pie diagram shows the probability of a landscape of a given type to transition to another type during t, and the blue part shows the probability of a landscape to maintain its type. The most stable type is again Bd while other landscape types are less stable. There is an association between landscape type's abundance and its stability (correlation coefficient equal to 0.6)-more abundant landscape types are more stable. Figure S3 (in the supporting information) shows histograms and stability grids (equivalents of Figures 1b and 1c) calculated for different theme subsets of all landscapes.
Overall, landscape statistics (illustrated in Figures 1, S2, and S3) suggest that a near homogeneous configuration is the most abundant and the most stable landscape pattern for six out of eight themes considered. An exception is the subset of majority-urban landscapes and wetland landscapes.

Transitions
A transition has occurred when a landscape changed its type to another type during the period t. Among the 1,764,740 landscapes in our database, only 257,031 (15%) had transitioned during t. Because there is a finite number of landscape types there is also a finite number of possible transitions. Given our assumption of completeness and stationarity of transition types (see section 1), Figure 2a depicts a stochastic model of landscape evolution based on actual observations. A given landscape at time t n is first generalized to its type. The stochastic model assigns a type to this landscape at t n+ t with probability indicated by the transition matrix. Repeating this procedure multiple times (at each time the model would select a transition in accordance with the corresponding probability distribution), we obtain a possible evolutionary trajectory of the landscape. A large number of such simulations yield the probability distribution function of future trajectories.
From Figure 2a we observe that most landscape types may evolve in many different ways (in terms of changes to values of complexity and aggregation metrics), but with different probabilities. For example, types located at the edges of the C-A diagram must evolve away from those edges. Combining information from Figure 2a with information from Figures 1b and 1c, it follows that landscapes of type Bd (which are almost homogeneous) are long-lived stable configurations, whereas other landscapes are relatively shorter-lived and thus less stable configurations. Note that this is a purely empirical finding whose explanation is beyond the scope of this paper. Figure 2c illustrates the change to the histogram of landscape configurations (shown in Figure 1b) after t. This change is the net result of all 257,031 transitions. As landscapes transition in and out of different types, the net change amounts to only 20,408 landscapes, which, as can be seen in Figure 2c, results in some (relatively small) increase in the counts for bins corresponding to simpler landscapes at the cost of a decrease in the counts for the remaining bins. Figure S4 (in the supporting information) shows transition matrices and graphs of histogram changes (equivalents of Figures 2a and 2c) calculated for subsets of landscapes selected to have the same top category of land cover. Overall, the results for the thematic subsets of landscapes follow the same pattern as the results for all landscapes. Notable exceptions are majority-urban landscapes, where transition probability distributions are all skewed to the left of the C-A. This is consistent with the process of urbanization when landscapes that are already majority-urban increase the urban share of the landscape and become configurationally simpler. Other exceptions are majority-forest landscapes, where a change to the histogram reveals the loss of simple, Bd-type landscapes and the increase in bithematic landscapes (for example Dc type) and forested landscapes with an increasing number of intrusions (e.g., Bf type). This is consistent with significant deforestation of tropical forests during t.

Deforestation Scenarios
Deforestation is the most pronounced example of human-caused landscape change. In most cases, the forest is replaced by agricultural land cover (we refer to this long-term change as the FAT. We envision a FAT as a sequence of transitions between subsequent landscape types starting with a fully forested landscape and ending with a fully agricultural landscape; this sequence is referred to as a trajectory. We use our stochastic model to simulate a probability distribution function of the FAT trajectories. Each FAT trajectory can be divided into two phases, a forest-dominant phase ( Figure S5 (in the supporting information) shows landscape pattern types archetypes selected from among landscapes selected for the two phases of the models and their transition matrices.
We simulate FAT trajectories using the Monte Carlo method (Rubinstein & Kroese, 2016). The simulation starts from almost totally forested landscape (type Ad). At each step, a transition from a current stage to the next is chosen randomly according to a probability distribution of transitions as given by the FAT1 part of the model until landscape evolves to type Fb or Fc. Subsequently, the next evolutionary stage is chosen randomly according to a probability distribution of transitions as given by the FAT2 part of the model. The simulation ends when the trajectory reaches one of the landscapes at the left edge of the C-A diagram. Using this procedure, we have generated 20,000 FAT trajectories from which a probability distribution function of FAT trajectories is calculated.  Figures 3a and 3b show fractions of trajectories passing through a given landscape type; 100% of trajectories pass through the type Ad from which all simulations start. High probability trajectories pass only through types with large red circles, whereas low probability trajectories pass also through types with small red circles. According to Figures 3a and 3b, high probability trajectories pass through landscape types corresponding to highly aggregated patterns (look for landscape archetypes shown in Figure S4 corresponding to types frequented by the trajectories). The most likely FAT trajectory is shown in Figure 3c. This trajectory depicts the FAT in eight stages. Figure 3d shows an example of a less likely FAT trajectory. This trajectory depicts the FAT in 16 stages. The difference between this trajectory and the most likely trajectory is that in both stages the evolution goes through a series of disaggregated landscapes.
Note that the most likely FAT trajectory is not a trajectory where each subsequent stage is determined by the highest transition probability of its proceeding stage as could be expected. Such locally optimal trajectory is (Ad → Bd → Cd → Dd → Ec → Fc → Fb) + (Fb → Bd → Ad) and has half the probability of the most likely trajectory. The difference between this trajectory and the maximum likelihood trajectory is in Phase 1, where the maximum likelihood trajectory makes a jump from Bd to Ec (p tr = 0.076) instead of following the locally largest probability (p tr = 0.26) transition Bd → Cd. This jump saves the maximum likelihood trajectory from visiting stages Cd and Dd and, on balance, makes this trajectory twice as likely as the trajectory following the largest local transition probability.
A stage-to-stage transition probability of a landscape during t is a product of a probability p ch that a landscape would transition to another type (see Figure 1c) and a probability p tr of the landscape transitioning to a specific different type (see Figure S4). Thus, the product p ch p tr is a probability of a specific stage-to-stage transition occurring during a single t. The mean waiting time (in units of t) for such transition to occur is 1∕(p ch p tr ).
Numbers placed over the arrows in Figures 3c and 3d are mean waiting times for transitions between two consecutive stages to occur. For example, a transition Ad → Bd in the most likely FAT trajectory is 15 t = 345 years. The key to understanding this value is to remember that it reflects the mean waiting time for such a transition to occur. When considering the entire world, most homogeneous, forested landscapes do not change their pattern type on the time scale of 23 years resulting in p ch = 0.15. Thus, despite the p tr = 0.43, the total probability of this transition is 0.0645 resulting in the mean waiting time of 345 years. In most cases, high values of the mean waiting time are due to low values of p ch . Analyzing mean waiting times in trajectories shown in Figures 3c and 3d, we observe that Phase 1 of FAT trajectories takes longer to complete than Phase 2. Thus, it takes longer to lose the first ∼50% of the forest, than it is to lose the remaining forest; the FAT accelerates after reaching a tipping point.

Discussion and Conclusions
In this paper, we presented a novel methodology for simulating the long-term evolution of landscapes. What distinguishes our approach from existing methods is that it simulates the probability distribution function of long-term trajectories of a single landscape based on observation of 1992-2015 (short-term) transitions for a large number of different landscapes. Thus, using the Monte Carlo method, we can obtain the most likely evolutionary trajectory for a given landscape type. The method is best suited for investigating evolutionary scenarios for specific types of land change, such as deforestation, desertification, or urbanization.

10.1029/2019GL085952
The major advantage of the model is its empirical character. This makes unnecessary to account explicitly for all different processes responsible for the land change as they are implicitly accounted for by the large number statistics using methodology, which resembles machine learning. The major shortcoming of the model is that it requires (in the current implementation) some strong assumptions. The first assumption is the temporal stationarity of the transition matrix. The matrix reflects prevailing circumstances of change during the 1992-2015 period and the model assumes that those circumstances do not change. Thus, simulated trajectories reflect stationary scenarios; they show how the landscape would evolve if processes driving the change and their intensities remain unchanged. These are likely to be only the zeroth-order approximations. The second assumption is that our model (as presented in this paper) is built on the worldwide statistics. This has been done so the probabilities are calculated from the largest possible statistics of transitions. However, as a result, transition probabilities are spatially stationary which does not account for variations occurring at the regional level. The assumption of spatial stationarity can be lifted by restricted the data used to construct a model to a specific region at the cost of decreasing the number of observations from which probabilities are inferred.
These assumptions notwithstanding, our model offers an attractive addition to researching landscape dynamics. It is instructive to contrast it with a neutral model (Gaucherel et al., 2014). A neutral model uses an artificial dynamics to produce a series of landscapes with controllable patterns to emulate a specific process-it is useful to obtain specific answers to narrowly defined questions (e.g., see With (1997)).
Our model uses real-world dynamics (implicitly taking into consideration all existing processes) to produce a plausible evolutionary trajectory-it is useful to obtain generalized answers to broadly defined questions. Two examples of our approach answering broad questions were presented in this paper.
First, we were able to address the following broad question: What is the overall principle of landscape evolution? By analyzing statistics of 1995-2015 transitions we proposed the following answer: Mesoscale landscapes are predominantly transitional stages in the evolution from a homogeneous, quasi-stable landscape of one land cover class to another homogeneous, quasi-stable landscape of different land cover class.
To what degree this hypothesis depends on the stationarity of our model remains unknown, but the model at least allows to consider such a question and to formulate a hypothesis. A corollary of our hypothesis is that homogeneous landscapes are much more abundant than landscapes having complex patterns, which indeed has been shown (Nowosad & Stepinski, 2018) to be the case.
Second, we were able to address the question of how the FAT happens on a mesoscale? By performing Monte Carlo simulations based on our model we derived a probability distribution function for the forest → agriculture trajectories. The character of this distribution (see Figure 3) provides the following answer: the FAT is most likely to proceed through a series of intermediate landscapes characterized by highly aggregated forest-agriculture mosaics; once forest share drops below 50% the remainder of the transit is rapid.
As the times involved in a complete FAT on the mesoscale are quite long, direct verification of this hypothesis is not yet possible given that remotely sensed images on the global scale are only available for the past ∼30 years. In this paper, we have concentrated on the introduction of the idea of the stochastic, empirically informed model and we chose the longest period ( t = 23 years) available in the CCI-LC data to maximize the number of transitions from which to build a model. However, future work can investigate models built using t = 1-5 years transitions. CCI-LC allows for the construction of multiple such models during the 1992-2015 period; they can be used to check a degree to which the assumption of temporal stationarity is violated, and to check, at least partially, our assertion that the FAT is most likely to proceed through highly aggregated stages.
Finally, land change scenarios other than FAT can be studied using our model. It is a matter of selecting an appropriate subset of the data set. Desertification can be studied by considering grassland → sparse, grassland → bare, and sparse → bare transits. Urbanization can be studied by considering transitions from agriculture, forest, shrub, and sparse to urban. Shrinkage of wetlands can be studied from wetland → agriculture transits.