Comment: The effect of post-conflict transition on deforestation in protected areas in Colombia

A recent study on Colombian protected areas has found an increase in deforestation after ending armed conflict. The authors propose several drivers behind this trend and take their findings as proof of how these drivers specifically affect protected areas and render them particularly vulnerable to deforestation during post-conflict transition. However, after conducting an extended analysis of the data, we show that the original study merely noticed a national trend of increased deforestation in Colombia, and that forests in national protected areas are actually less affected by the transition than other forests in Colombia. Given these results, the proposed drivers and conservation lessons of the original study can only be regarded as speculative. In this comment, we point out the conceptual and statistical shortcomings of the original study to discuss how to improve forest change analyses regarding policy relevance.


Introduction
In their study 1 on Colombian national protected areas, Clerici et al. investigated how deforestation is influenced by armed conflict. The authors evaluated the change in forest loss in and around 39 protected areas before and after a peace agreement was reached to end military conflict between the Colombian government and the guerrilla groups of the Fuerzas Armadas Revolucionarias de Colombia (FARC). The authors present their observations regarding increase in forest loss, then propose several drivers that are related specifically to protected areas, and finally argue that these drivers are responsible for forest loss observed in protected areas.
After replicating the original study 1 and conducting an extended statistical analysis, we found that the interpretations of the authors are not supported by their data. The analysis and conclusions presented in the study 1 appear to be affected by four key conceptual and statistical shortcomings: (1) no reference trend (or "control") was presented against which deforestation rates can be compared; (2) no counterfactual 2 (i.e. a scenario in which the hypothesized effects are absent) was formulated to evaluate change in protected area effectiveness; (3) no statistical model was employed to assess the potential relationship between conflict periods and forest loss; and (4) only relative change in rates of forest loss was assessed, which does not account for the fact that the same relative change can lead to markedly different deforestation trajectories, depending on the initial deforestation rates. In our reanalysis of the data, we found that forests in national protected areas are actually less vulnerable than other forests in the face of post-conflict transition. The original study only happens to notice a national trend of increased forest loss that is also present in protected areas, albeit to a much smaller degree. In addition, we find it concerning that none of the variables that the authors present as causal drivers of deforestation in protected areas are actually included in their statistical analysis. In the following, we present the results of our reanalysis and the implications for investigating, and acting on, deforestation in protected areas.

Reanalysis
We structured our reanalysis into two parts. In the first part, we replicated the original analysis, but also compared the relative change in forest loss (between the periods "before" and "after" the peace agreement) to a reference trend. In the second part, we conducted a more comprehensive statistical analysis to also address the other shortcomings identified above. We followed the methods of Clerici et al. 1, and the extent of forest loss in our replication (Supplementary Tables S1, S2) closely matched that reported in the original study (Pearson's r > 0.99). In addition, we calculated a national-level reference trend of forest loss observed outside the assessed protected areas and buffer zones. The percentage increase of forest loss within national protected areas (median: 121.7 %, range: 790.2 %) differs significantly from zero (sign test, k = 31, n = 39, p < 0.01), but it does not differ significantly (sign test, k = 19, n = 39, p > 0.99) from the reference trend (116 %). The percentage increase within 10-kilometre buffer zones around protected areas (median: 158.0 %, range: 698.6 %) also does not differ significantly from the reference trend (sign test, k = 23, n = 39, p = 0.34). Therefore, forests in national protected areas (and their buffer zones) do not appear to be more heavily affected by post-conflict transition than forests elsewhere in Colombia, contrary to the conclusions of Clerici et al. 1.
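The sign tests reported above can be illustrated with a short sketch. It assumes a two-sided exact test in which k is the number of protected areas (or buffer zones) whose percentage increase lies above the comparison value, n is the number of areas compared, and the null hypothesis is that an area is equally likely to fall above or below that value. The function name `sign_test_p` is hypothetical and is not taken from the analysis scripts of this comment.

```python
from math import comb


def sign_test_p(k: int, n: int) -> float:
    """Two-sided exact sign test p-value.

    k: number of units above the comparison value (assumed meaning of k)
    n: number of units compared (ties assumed to be excluded)
    Null hypothesis: each unit is above the comparison value with probability 0.5.
    """
    total = 2 ** n
    lower = sum(comb(n, i) for i in range(0, k + 1)) / total  # P(X <= k)
    upper = sum(comb(n, i) for i in range(k, n + 1)) / total  # P(X >= k)
    # Double the smaller tail and cap at 1
    return min(1.0, 2 * min(lower, upper))


# k and n as reported in the text
comparisons = [
    ("protected areas vs. zero", 31),
    ("protected areas vs. reference trend", 19),
    ("buffer zones vs. reference trend", 23),
]
for label, k in comparisons:
    print(f"{label}: k = {k}, n = 39, p = {sign_test_p(k, 39):.4f}")
```

Under these assumptions, the computed p-values are consistent with those reported in the text (p < 0.01, p > 0.99, and p = 0.34, respectively).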
For the second part of our reanalysis, we formulated a generalized linear mixed model (Supplementary methods; Supplementary Tables S3-S5) to assess the effect of post-conflict transition on forest loss (Fig. 1). To compare deforestation trajectories, we used proportional forest loss, which expresses forest area lost as a percentage of total forest area (see Supplementary methods, Supplementary Tables S1, S2). In addition to the national reference trend, we used a counterfactual stated as: "the proportion of forest area lost within protected areas and buffer zones increases by the same amount as in Colombian forests outside these areas". We consider this the simplest counterfactual in the context of the original study, but other types of counterfactuals can surely be formulated 3,4.
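To make the distinction between relative change and the counterfactual concrete, the following sketch works through the arithmetic with purely hypothetical numbers (they are not estimates from the study or from our model). It assumes that "increases by the same amount" refers to the same absolute increase in proportional forest loss, expressed in percentage points; the function names are illustrative only.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change between two periods."""
    return 100.0 * (after - before) / before


def counterfactual_after(pa_before: float, ref_before: float, ref_after: float) -> float:
    """Counterfactual proportional loss: the protected area's initial value
    plus the absolute increase observed in the reference trend."""
    return pa_before + (ref_after - ref_before)


# Hypothetical proportional forest loss (% of forest area), for illustration only
ref_before, ref_after = 1.00, 2.16  # reference trend: +116 % relative change
pa_before, pa_after = 0.10, 0.22    # protected areas: +120 % relative change

cf_after = counterfactual_after(pa_before, ref_before, ref_after)

print(f"relative change, reference:       {relative_change(ref_before, ref_after):.0f} %")
print(f"relative change, protected areas: {relative_change(pa_before, pa_after):.0f} %")
print(f"counterfactual 'after':           {cf_after:.2f} % of forest area")
print(f"observed minus counterfactual:    {pa_after - cf_after:.2f} percentage points")
```

In this hypothetical case, the relative change in protected areas is slightly larger than in the reference trend, yet the observed proportional forest loss remains well below the counterfactual because the initial rate is much lower.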
Two main results arise from the regression analysis. First, protected areas show far less forest loss than other forested areas in Colombia, both "before" and "after" peace negotiations (Fig. 1). This is a key aspect missing from the original analysis: it essentially explains why, for protected areas, a similar relative change over time translates to a deforestation trajectory that is substantially different from both the reference trend and the counterfactual (Table 1). Second, the difference in proportional forest loss between protected areas and the counterfactual (Table 1) has widened during post-conflict transition, which means that the proportion of forest area spared from deforestation has increased compared to the counterfactual and reference trend. In contrast, the trajectory for buffer zones closely tracks the counterfactual (Fig. 1). These results show that protected areas are effective at reducing deforestation relative to forests elsewhere. It can even be argued that their effectiveness has increased relative to other forested areas. Some caution is warranted when interpreting these trends, however, as much of the variation in forest loss remains unexplained by our model (total deviance explained: 19.7%; Supplementary Tables S4, S5).
In summary, whatever drivers act upon forests in Colombian national protected areas during post-conflict transition, their overall effect leads to these forests being less affected by the transition than other forests. In the original study 1, the authors repeatedly point to weak institutions and a higher incidence of illicit crops as the drivers that specifically affect protected areas, and argue that these drivers cause protected areas to be particularly prone to deforestation during post-conflict periods. However, their statistical analysis does not actually include the drivers they propose. Taken together, this omission and the results of our reanalysis indicate that the argument in the original study 1 (i.e. attributing deforestation under post-conflict transition to the purported drivers) is not supported by the available data and therefore remains entirely speculative. The same must thus be said about the conservation implications that were derived from this argument. For example, when extending beyond the Colombian situation, the authors argue in favour of an increased presence of a central government during post-conflict transition. However, their analysis does not provide evidence for why this strategy could be effective, or why it should be preferred over alternative (or complementary) strategies such as strengthening local indigenous or community-level institutions 5-7.

Implications for forest change research and policy
Deforestation is the result of multiple interacting drivers 8,9 (which is also acknowledged by Clerici et al. 1). Like the original study, our reanalysis remains strongly limited in identifying causal drivers, as only one potential driver (the cessation of armed conflict) is included as a predictor variable, without controlling for other (and potentially confounding) variables. In comparing forest loss inside protected areas to the simple counterfactual we defined, we have used a rather coarse measure of protected area effectiveness, and other indicators would be needed to judge performance regarding ecosystem service provision and well-being of the local population 10,11. In addition to weak conservation institutions and presence of illicit crops (as proposed by Clerici et al. 1), forest loss in protected areas could be influenced by numerous biophysical and social variables, such as distance to roads, terrain ruggedness, soil fertility, population density, and availability of alternative income sources for the local population, to name but a few. For some of these factors, geospatial information is readily available and can be integrated into statistical models. Others, however, may require detailed ground surveys or in-depth interviews. Understandably, the latter are difficult to obtain in situations of armed conflict. Nonetheless, if a certain factor is absent from the analysis, we strongly suggest that its effect on forest loss be treated as a hypothesis rather than a demonstrated cause. For example, while it is entirely possible that institutions related to protected area governance are weak (contributing to increased deforestation relative to non-protected areas), it is also possible that other factors (such as remoteness) produce counteracting effects strong enough to result in reduced deforestation, relative to non-protected areas, as we have observed for Colombia. Teasing apart a set of spatially concurrent drivers based on well-defined hypotheses thus remains important for arriving at the right "lessons" for conservation 12.
Combining forest change data (such as provided by Hansen et al. 13) with other geospatial information can provide important insights into deforestation patterns. To fully leverage these data while accounting for different drivers of forest loss, we suggest that studies follow a more rigorous statistical approach and, whenever possible, use statistical models to quantify relationships between deforestation and its potential drivers. While a comprehensive discussion of these methods is beyond the scope of this comment, we suggest that at least the following aspects be taken into account. First, to fully take advantage of high-resolution forest loss data (e.g. at 1 arc-second), we suggest using observations at the level of single raster cells. The observed response will then be a binary variable ("no forest loss" or "forest loss") for a given location (i.e. raster cell) in a given year (or otherwise defined time period). Alternatively, patches of several raster cells may be aggregated (e.g. patches of 3 × 3 cells) and forest loss events counted per patch, with the response now becoming a count variable. Both methods avoid having to spatially aggregate forest loss for areas that differ widely in size (which is often the case for protected areas). In addition, data on forest loss are then structured similarly to species presence-absence data (or abundance data in the aggregated case), which makes it possible to take advantage of the rich toolbox that has been developed for analyzing them 14-16. Conceptualizing forest loss in this way also provides vastly more observations, which means (in principle) that more potential drivers of forest loss can be included as predictors in statistical models. Second, the selection of variables that may be linked to deforestation should be informed both by a priori hypotheses and by a systematic literature review for the location of interest, in order to identify potential confounding variables. Finally, as drivers of forest loss may not be independent, any collinearity between predictors should be accounted for 17, and its effects should be discussed when causal interpretation of the predictor variables is attempted. In the context of Colombian forests, a good example of a more rigorous statistical approach is given by a recent study demonstrating a link between deforestation and forest fires 18, whereas another recent study on drivers of deforestation 19 does not take the aforementioned aspects into account and is affected by most of the shortcomings discussed above.
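The two ways of structuring the response described above (binary observations per raster cell, or counts per patch of 3 × 3 cells) can be sketched as follows. This is a minimal illustration on a small, made-up raster for a single time period. The function names are hypothetical, and dropping incomplete patches at the raster edge is a simplifying assumption rather than a recommendation.

```python
def cell_observations(loss):
    """One binary observation per raster cell: (row, column, forest loss 0/1)."""
    return [(r, c, v) for r, row in enumerate(loss) for c, v in enumerate(row)]


def patch_counts(loss, size=3):
    """Count forest loss cells in non-overlapping size x size patches.

    Incomplete patches at the raster edge are dropped (simplifying assumption).
    """
    n_patch_rows = len(loss) // size
    n_patch_cols = len(loss[0]) // size
    counts = []
    for pr in range(n_patch_rows):
        row_counts = []
        for pc in range(n_patch_cols):
            total = sum(
                loss[r][c]
                for r in range(pr * size, (pr + 1) * size)
                for c in range(pc * size, (pc + 1) * size)
            )
            row_counts.append(total)
        counts.append(row_counts)
    return counts


# Made-up 6 x 6 raster: 1 = forest loss, 0 = no forest loss
loss = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 1],
]

cells = cell_observations(loss)   # binary response, one row per cell
patches = patch_counts(loss)      # count response, one value per 3 x 3 patch

print(len(cells), "cell-level observations")
print("loss events per patch:", patches)
```

In an actual analysis, each cell or patch would additionally carry its predictor values (e.g. distance to roads or protection status), and the binary or count response would be modelled with a suitable error distribution.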
With this comment we would like to encourage researchers to use readily available geospatial information on deforestation and its potential drivers to investigate policy-relevant questions, as Clerici et al. 1 have done. We greatly appreciate that the authors have brought this concerning trend of increased deforestation in Colombia to our attention, and we believe that more studies on underlying causes of deforestation are needed to improve forest governance. For future studies on forest change to be most relevant for policymaking, however, they should aim to provide the strongest supporting evidence achievable, given the available data and the question at hand. This requires that analytical concepts and statistical methods be as robust as possible, and that interdependencies and uncertainties related to potential drivers of forest loss be clearly communicated.

Data availability
The analysis scripts used in this study are available as a git repository (https://www.github.com/dschoenig/ForestchangeColPA) and have been archived with DOI 10.5281/zenodo.3984087.

Table 1 notes
Total forested area was defined as the combined area of all raster cells with at least 50% tree cover in the year 2000, and with no forest loss recorded prior to 2013. 95% confidence intervals are given in parentheses.
c Absolute difference. Negative values indicate that estimated proportional forest loss is lower than the reference trend or counterfactual, respectively. 95% confidence intervals are given in parentheses.
d Defined as the percentage of forest area lost outside the assessed protected areas and buffer zones in Colombia.
e Hypothetical scenario defined as: "the proportion of forest area lost within protected areas and buffer zones, respectively, increases by the same amount as in Colombian forests outside these areas".