Random Forest | Blas M. Benito, PhD

Comparing the performance of species distribution models of Zostera marina: Implications for conservation

A 14-year time series of distribution data for Zostera marina in the Ems estuary (The Netherlands) was used to build different data subsets: (1) total presence area; (2) a conservative estimate of the total presence area, defined as the area occupied during at least 4 years; (3) core area, defined as the area occupied during at least 2/3 of the total period; and (4–6) three random selections of monitoring years. On average, colonized and disappeared areas of the species in the Ems estuary showed remarkably similar transition probabilities of 12.7% and 12.9%, respectively. SDMs based on machine-learning methods (Boosted Regression Trees and Random Forest) outperformed regression-based methods. Current velocity and wave exposure were the most important variables for predicting the species' presence with widely distributed data. Depth and sea floor slope were relevant for predicting the conservative presence area and the core area.
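The first three data subsets described above can be illustrated with a minimal sketch. It assumes the yearly maps are available as one list of 0/1 values per year, with one value per grid cell; the function name `presence_subsets` and its parameters are hypothetical and are not taken from the paper.

```python
def presence_subsets(yearly_maps, min_years=4, core_fraction=2 / 3):
    """Derive presence subsets from yearly presence/absence maps.

    yearly_maps: one list per monitoring year, holding 0/1 per grid cell
    (an assumed input layout, chosen only for illustration).
    """
    n_years = len(yearly_maps)
    # Number of years in which each cell was occupied.
    years_occupied = [sum(cell) for cell in zip(*yearly_maps)]
    # (1) Total presence area: occupied in at least one year.
    total = [n >= 1 for n in years_occupied]
    # (2) Conservative estimate: occupied during at least `min_years` years.
    conservative = [n >= min_years for n in years_occupied]
    # (3) Core area: occupied during at least `core_fraction` of the period.
    core = [n >= core_fraction * n_years for n in years_occupied]
    return total, conservative, core


# Toy example: 14 years, three cells occupied for 1, 4 and 10 years.
occupied_years = [1, 4, 10]
yearly_maps = [[int(year < n) for n in occupied_years] for year in range(14)]
total, conservative, core = presence_subsets(yearly_maps)
```

With 14 monitoring years, the 2/3 criterion corresponds to at least 10 occupied years, so only the third toy cell falls in the core area.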

The impact of modelling choices in the predictive performance of richness maps derived from species‐distribution models: guidelines to build better diversity models

We generated 380 S‐SDMs of 1224 tree species in Mesoamerica by combining 19 distribution modelling methods with 20 different thresholds using presence‐only data from the Global Biodiversity Information Facility. We compared the predicted richness and composition with inventory data obtained from the BIOTREE‐NET forest plot database. We designed two indicators of predictive performance that were based on the diversity factors used to measure species turnover: a (shared species between the observed and predicted compositions), b and c (the exclusive species of the predicted and observed compositions respectively) and compared them with the Sørensen and Beta‐Simpson turnover measures. Some modelling methods, especially machine learning and ensemble model forecasting methods, performed significantly better than others in minimizing the error in predicted richness and composition. Our results also indicate that restrictive thresholds (with high omission errors) lead to more accurate S‐SDMs in terms of species richness and composition. Here, we demonstrate that particular combinations of modelling methods and thresholds provide results with higher predictive performance.
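The components a, b and c described above can be sketched for a single plot as follows. This is an illustration only: it computes the three components and the standard Sørensen and Simpson dissimilarity formulas, not the paper's two custom indicators, whose definitions are not given here. The function names and the toy species labels are hypothetical.

```python
def composition_components(predicted, observed):
    """Return (a, b, c) for one plot.

    a: species shared by the predicted and observed compositions
    b: species exclusive to the predicted composition
    c: species exclusive to the observed composition
    """
    predicted, observed = set(predicted), set(observed)
    a = len(predicted & observed)
    b = len(predicted - observed)
    c = len(observed - predicted)
    return a, b, c


def sorensen_dissimilarity(a, b, c):
    # Standard Sørensen dissimilarity: (b + c) / (2a + b + c).
    total = 2 * a + b + c
    return (b + c) / total if total else 0.0


def simpson_dissimilarity(a, b, c):
    # Standard Simpson dissimilarity: min(b, c) / (a + min(b, c)).
    smaller = min(b, c)
    denominator = a + smaller
    return smaller / denominator if denominator else 0.0


# Toy example with made-up species labels.
predicted = {"sp1", "sp2", "sp3", "sp4"}
observed = {"sp2", "sp3", "sp5"}
a, b, c = composition_components(predicted, observed)
richness_error = len(predicted) - len(observed)  # predicted minus observed richness
```

In this toy case the prediction shares two species with the inventory, adds two that were not observed, and misses one, so predicted richness exceeds observed richness by one species.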