Octoberweather/ENM-code

ENM-code

  • Selection of Bat Species // I selected bat species based on their threatened status (Vulnerable, VU; Endangered, EN; Critically Endangered, CR) according to the IUCN Red List guidelines (iucnredlist.org). At the time of access (August 25, 2020), there were 106 VU, 77 EN, and 22 CR bat species, for a total of 205 threatened species.

  • Presence-Only Data // Species distribution models are based on known occurrence records for a given species. To obtain these records, I used the rgbif (Chamberlain and Boettiger 2017) and maptools (Bivand and Lewin-Koh 2020) libraries in R to download and visualize occurrence records for each bat species from the Global Biodiversity Information Facility (GBIF; gbif.org), a global aggregator of museum records and ecological surveys covering a wide range of species. Of the 205 threatened species, 80 returned records, totaling 30,530 observations. I removed duplicate records and records without coordinates. I then used the CoordinateCleaner (Zizka et al. 2019) library in R to flag unusable records (tests: “capitals,” “centroids,” “equal,” “gbif,” “institutions,” “zeros”). I also removed records with low coordinate precision (retaining only records with coordinate uncertainty below 100 km), fossil records (retaining only records whose basis of record was “human observation,” “observation,” or “preserved specimen”), and records with suspicious individual counts (greater than 99), following recent approaches in predicting species distributions (Maldonado et al. 2015). To reduce spatial autocorrelation among presence records, I thinned the remaining records at 1 km² using the spThin (Aiello-Lammens et al. 2019) library in R and retained only species with 15 or more unique records (Raes and ter Steege 2007). The final count was 23 species suitable for analysis, of which 6 are Endangered and the remainder are Vulnerable according to the IUCN Red List (Table 1).

  • IUCN Range Maps // To obtain the best available estimate of current bat ranges, I downloaded range map shapefiles (.shp) for all 23 species from the IUCN Red List (https://www.iucnredlist.org/, accessed August 25, 2020). Shapefiles were visualized in R using the raster (Hijmans 2020), rgdal (Bivand et al. 2020), and rgeos (Bivand and Rundel 2020) libraries. In addition to the Red List threat categories, the IUCN provides data on population trends and current threats that may be shaping these ranges.

  • Historical Climate Data // To assess bat distributions under current climate conditions, I used WorldClim v. 2.1 (Fick and Hijmans 2017) to download the current (1970-2000) bioclimatic variables (Bio 1-19) at 1 km² spatial resolution (Table 2). These variables are derived from monthly temperature and rainfall averages. They fall into four groups: annual mean temperature and annual precipitation (Bio 1, 12); temperature range and seasonality of temperature and precipitation (Bio 2, 3, 4, 7, 15); temperature and precipitation of the warmest and coldest months or quarters (Bio 5, 6, 10, 11, 18, 19); and temperature and precipitation of the wettest and driest months or quarters (Bio 8, 9, 13, 14, 16, 17). I then converted these climate raster files into ASCII (.asc) format using the raster (Hijmans 2020), maps (Becker and Wilks 2018), and mapdata (Becker and Wilks 2018) libraries in R. WorldClim was used because it is easily accessible and widely used in species distribution studies.

  • Future Climate Data // To predict how bat distributions will be affected by climate change, I modeled a 2050 carbon scenario that represents a very high baseline for greenhouse gas emissions. This was done primarily to estimate the maximum geographic range displacement that bats may experience under a “high” carbon environment in 2050. To simulate the effects of future warming, I used the raster library in R to download the same bioclimatic variables (Bio 1-19) from WorldClim at 1 km² spatial resolution under Representative Concentration Pathway (RCP) 8.5, a very high baseline emissions scenario, following recent climate forecasting practices in niche modeling (Chhetri et al. 2018; Zamora‐Gutierrez et al. 2018). These 19 variables were derived from the Coupled Model Intercomparison Project Phase 5 (CMIP5) 2010-2014 using data from the Intergovernmental Panel on Climate Change Assessment Report 5 (IPCC AR5). Simulating the earth’s climate system in 2050 at this resolution required a downscaled Global Climate Model (GCM), in which coarse-resolution climate data are translated into the fine-resolution outputs used for modeling. From the nine available GCMs, I chose the Community Climate System Model version 4 (CCSM4) from the National Center for Atmospheric Research (NCAR) in the United States.

  • Topographic Variables // Digital Elevation Models (DEMs) were downloaded from WorldClim at 1 km spatial resolution, derived from NASA’s Shuttle Radar Topography Mission version 4 (SRTM v.4) data. In addition, I used EarthEnv (https://www.earthenv.org/topography) to download slope and aspect (east and north components) at 1 km spatial grain (aggregation), derived from the Global Multi-resolution Terrain Elevation Data 2010 (GMTED2010) from the U.S. Geological Survey (USGS) and the National Geospatial-Intelligence Agency (NGA). The same four topographic variables (elevation, slope, aspect/north, aspect/east) were used in both the current and future (2050) climate models.

  • Ecological Niche Model // Because bats are distributed across the globe, maps built from spatial models are a useful tool for projecting how climate change will affect future distributions. Both current and future species distribution models for all 23 bat species were built in Maxent (v. 3.4.1), a robust presence-only modeling program that can predict species distributions from relatively small sample sizes (<100) (Elith et al. 2006; Phillips et al. 2006; Phillips et al. 2009). Figure 3 gives an overview of my approach. To reduce overfitting and collinearity among the BioClim variables, I used the usdm library (Naimi et al. 2014) in R to run a stepwise Variance Inflation Factor (VIF) analysis (VIF > 10 indicating high inflation) and eliminate highly correlated variables from the model, following recent techniques in niche modeling (Zeng et al. 2016). Since Maxent is a presence-only algorithm, I validated the models with leave-one-out cross-validation (LOOCV), a resampling approach that makes better use of small datasets (Phillips et al. 2006). I set the number of background points to 10,000 and increased the number of replicates to 15 (Phillips and Dudík 2008), leaving all other parameters at their defaults. When all 46 models (23 species × 2 climate scenarios) were completed, I converted probability of occurrence into a binary presence (1) and absence (0) prediction using the 10th percentile training presence (10% omission rate) logistic threshold (Peterson et al. 2007), following recent approaches in the study of bats (Santos et al. 2013). I then reclassified the current and future maps according to these threshold values and converted them into shapefiles using the raster, rgdal, and rgeos libraries in R.

  • Model Evaluation // Several methods were used to evaluate the performance of the current and future models. First, I recorded the area under the curve (AUC) of the receiver operating characteristic (ROC), which indicates how well Maxent ranks presence points above random background points and is therefore a good measure of predictive accuracy (Phillips et al. 2006; Merow et al. 2013). AUC values range between 0 and 1, with values greater than 0.75 considered good model performance (Elith et al. 2006) and values of 0.5 and below considered no better than random (Phillips et al. 2006). To evaluate how influential the ecogeographical variables (EGVs) were in distinguishing presence points from random background points, I used a jackknife analysis of gain, which refits the model with each variable omitted in turn and with each variable used in isolation, to measure which EGVs contributed most, as a percentage, to the current and future distributions (Phillips 2007; Nisbet 2018). Lastly, a binomial test (significant at p < 0.05) at the 10% omission training presence threshold was used to test the statistical significance of the omission rate for all 23 current and future models.

  • Hypothesis Testing // To test whether the percent overlap between current and future ranges varied with elevation, I ran a Spearman’s rank correlation test (correlation coefficient rs) in R to measure the direction and strength of the association between percent range overlap and elevation. That is, I hypothesized that either (1) percent range overlap increases as elevation increases or (2) percent range overlap decreases as elevation increases.
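
The occurrence-record cleaning described above was done in R with rgbif, CoordinateCleaner, and spThin. As a rough, library-free illustration of the simpler filters only (missing coordinates, duplicates, basis of record, individual counts, and the 15-record minimum), here is a minimal Python sketch. The dictionary keys (`species`, `lat`, `lon`, `basis`, `individual_count`) and the function name are hypothetical, and the zero-coordinate check is only a crude stand-in for CoordinateCleaner's tests.

```python
from collections import defaultdict

# Basis-of-record values to keep; anything else (e.g. fossils) is dropped.
ALLOWED_BASIS = {"HUMAN_OBSERVATION", "OBSERVATION", "PRESERVED_SPECIMEN"}


def clean_records(records, max_count=99, min_unique=15):
    """Return {species: sorted unique (lat, lon) pairs} for usable records.

    `records` is a list of dicts with hypothetical keys: species, lat, lon,
    basis, individual_count.
    """
    unique_points = defaultdict(set)
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        if lat is None or lon is None:
            continue  # no coordinates
        if lat == 0 and lon == 0:
            continue  # crude stand-in for a "zeros" check
        if rec.get("basis") not in ALLOWED_BASIS:
            continue  # fossil or unknown basis of record
        count = rec.get("individual_count")
        if count is not None and count > max_count:
            continue  # suspicious individual count
        # A set drops exact duplicate coordinates for the species.
        unique_points[rec["species"]].add((lat, lon))
    # Keep only species with enough unique records to model.
    return {
        species: sorted(points)
        for species, points in unique_points.items()
        if len(points) >= min_unique
    }
```

Spatial thinning and the remaining CoordinateCleaner tests are deliberately left out of this sketch.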
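
The stepwise VIF analysis described above was run with the usdm library in R. To show the idea behind it, the following Python sketch computes each variable's VIF as the corresponding diagonal entry of the inverse correlation matrix (equivalent to 1 / (1 - R²) from regressing that variable on the others) and repeatedly drops the variable with the highest VIF until all are at or below 10. This is an illustration of the general technique, not the usdm implementation; the function names and the input layout (a dict of variable name to list of values) are assumptions.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length, non-constant columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [v - mean for v in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    size = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("variables are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


def vif_values(data):
    """VIF per variable: diagonal of the inverse correlation matrix."""
    names = list(data)
    inverse = invert(correlation_matrix([data[name] for name in names]))
    return {name: inverse[i][i] for i, name in enumerate(names)}


def vif_step(data, threshold=10.0):
    """Drop the highest-VIF variable until every VIF is <= threshold."""
    kept = dict(data)
    while len(kept) > 1:
        vifs = vif_values(kept)
        worst = max(vifs, key=vifs.get)
        if vifs[worst] <= threshold:
            break
        del kept[worst]
    return sorted(kept)
```

Calling `vif_step` on a dict of predictor columns sampled at the occurrence or background points returns the names of the variables that survive the screening.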
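
The 10th percentile training presence threshold mentioned above can be illustrated with a small sketch. It assumes the threshold is the suitability value below which the lowest 10% of training presence points fall; tools differ in how they round that cut-off, so Maxent's reported value may not match this exactly. The function names and the nested-list grid layout are hypothetical.

```python
def tenth_percentile_threshold(train_scores, omission_percent=10):
    """Suitability value below which `omission_percent` of training presences fall."""
    ranked = sorted(train_scores)
    # Integer arithmetic avoids floating-point surprises in the cut-off index.
    n_omitted = len(ranked) * omission_percent // 100
    return ranked[n_omitted]


def to_binary(grid, threshold):
    """Reclassify a 2-D suitability grid to 1 (presence) or 0 (absence).

    Cells holding None are treated as NoData and passed through unchanged.
    """
    return [
        [None if cell is None else int(cell >= threshold) for cell in row]
        for row in grid
    ]
```

With ten training scores, for example, the single lowest score is treated as omitted and the second lowest becomes the threshold.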
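
The AUC used for model evaluation above has a simple rank-based reading: it is the probability that a randomly chosen presence point receives a higher suitability score than a randomly chosen background point, with ties counted as half. The sketch below computes it by direct pairwise comparison. This illustrates the statistic itself and is not how Maxent computes it internally; the function name is hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: P(presence score > background score), ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))
```

A value of 1.0 means every presence point outranks every background point, and 0.5 means the ranking is no better than random.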
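
The Spearman’s rank correlation between percent range overlap and elevation was computed in R. As an illustration of what the coefficient measures, this sketch ranks both variables (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names are hypothetical, and the numbers in the usage note are invented for illustration, not results from this study.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman's rs: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)
```

For made-up elevations `[100, 500, 900, 1500, 2200]` and overlaps `[80, 60, 65, 30, 10]`, `spearman_rs` returns approximately -0.9, the pattern expected if overlap declines as elevation increases.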
