10.5061/DRYAD.CRJDFN32R
Olson, Sarah
0000-0002-8484-9006
Wildlife Conservation Society
McClure, Meredith
Conservation Science Partners
Haase, Catherine
0000-0002-7682-0625
Montana State University
Hranac, Carter
0000-0001-6145-4784
Massey University
Hayman, David
Massey University
Dickson, Brett
0000-0003-4160-839X
Conservation Science Partners
McGuire, Liam
0000-0002-5690-0804
Texas Tech University
Crowley, Daniel
0000-0003-4262-253X
Montana State University
Fuller, Nathan
Texas Tech University
Lausen, Cori
0000-0002-6012-1803
Wildlife Conservation Society Canada
Plowright, Raina
Montana State University
Analytic dataset informing modeling of winter species distributions of
North American bat species
Dryad
dataset
2020
FOS: Natural sciences
Bat
Bioenergetic model
Hibernation
Corynorhinus townsendii
Myotis californicus
Myotis lucifugus
Perimyotis subflavus
North America
United States Department of Defense
https://ror.org/0447fe631
W912HQ-16-C-0015
Royal Society of New Zealand
https://ror.org/04tajb587
MAU1701
2021-09-01T00:00:00Z
2021-09-01T00:00:00Z
en
2964575 bytes
6
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
The fungal pathogen Pseudogymnoascus destructans and resultant white-nose
syndrome (WNS) continues to advance across North America, infecting new
bat populations, species, and hibernacula. Western North America hosts the
highest bat diversity in the U.S. and Canada, yet little is known about
hibernacula and hibernation behavior in this region. An improved
understanding of where bats hibernate and the conditions that create
suitable hibernacula is critical if land managers are to anticipate and
address the conservation needs of WNS-susceptible species in regions yet
to be infected. We estimated suitability of potential winter hibernaculum
sites across the ranges of five bat species occurring in western North
America. We estimated winter survival capacity from a mechanistic
survivorship model based on bat bioenergetics and climate conditions.
Leveraging the Google Earth Engine platform for spatial data processing,
we used boosted regression trees to relate these estimates, along with key
landscape attributes, to bat occurrence data in a hybrid
correlative-mechanistic approach. Winter survival capacity, topography,
land cover, and access to caves and mines were important predictors of
winter hibernaculum selection, but the shape and relative importance of
these relationships varied among species. This suggests that the
occurrence of bat hibernacula can, in part, be predicted from readily
mapped above-ground features, and is not only dictated by below-ground
characteristics for which spatial data are lacking. Furthermore, our
mechanistic estimate of winter survivorship was, on average, the third
strongest predictor of winter occurrence probability across focal species.
Winter distributions of North American bat species were driven by their
physiological capacity to survive winter conditions and duration in a
given location, as well as selection for topographic and other landscape
features, but in species-specific ways. The influence of winter
survivorship on several species’ distributions, the underlying influence
of climate conditions on winter survivorship, and the anticipated
influence of WNS on bats’ hibernation physiology and survivorship together
suggest that North American bat distributions may undergo future shifts as
these species are exposed to not only WNS, but climate change. We
anticipate that the models presented here may offer a valuable baseline
for assessing the potential species-level impacts of these stressors.
Winter occurrence data We selected five focal species for our analyses:
Corynorhinus townsendii, Myotis californicus, Myotis lucifugus, Myotis
velifer, and Perimyotis subflavus. These species were chosen because
occurrence data and field-measured metabolic parameters were available for
estimating survivorship, and because they were representative of
variability in known habitat requirements among hibernating bats whose
ranges lie in whole or in part in the West, defined here as west of the
Mississippi River. We compiled species occurrence data from multiple
sources, including online databases of museum records (VertNet,
Biodiversity Information Serving Our Nation [BISON]), online repositories
of vetted public and scientific observations (Global Biodiversity
Information Facility [GBIF], Bat Population Database [BPD]), data
associated with published literature, data obtained from multiple Natural
Heritage Programs (Montana NHP and via NatureServe), and data collected in
our own field studies (unpublished data). We amassed thousands of
occurrence records for each focal species, but the vast majority of
records (>85%) were observed during summer or fall swarming, when
bats are more readily observed. Even in bats that do not migrate
seasonally, selection of hibernaculum microclimates and the surrounding
habitat mosaic is expected to differ from selection of summer roosts.
Moreover, due to the sensitivity of hibernaculum locations to disturbance
or exploitation, along with the difficulty of detecting torpid bats in
hibernacula, winter bat location data were difficult to come by and
limited in number. See linked publication for further details. We
included only in-hand or visual observations (i.e., no acoustic
detections) since 1948 (to match the earliest availability of gridded
climate data) with location error < 5 km. Because we were
interested only in winter distributions associated with hibernaculum use,
we filtered the compiled dataset to observations recorded during what we
defined as winter in a spatially explicit manner. We first used a
generalized linear model informed by the timing of M. lucifugus immergence
and emergence observations at hibernacula throughout North America to
estimate winter duration for each 1-km raster cell across the U.S. and
Canada (see linked publication for further details). Then, to estimate the
start and end date of winter hibernation at a given grid cell, we centered
this model-based winter duration estimate on the winter solstice. Finally,
we selected only occurrence records observed between these spatially
explicit start and end dates. Lastly, we removed repeat observations
(e.g., across multiple studies or survey dates), retaining a single record
for a given site (with unique sites defined to the nearest thousandth of a
degree of latitude and longitude). Predictor variables We identified
landscape attributes that potentially influence hibernaculum selection
from the published literature and our own knowledge. We selected publicly
available datasets representing these predictors with sufficient spatial
extent to encompass our compiled occurrence data (United States and Canada
south of the Arctic Circle). Where multiple candidate datasets were
available, we chose those with the highest spatial resolution and/or
temporal range that best encompassed our occurrence data. The scale at
which bats perceive and respond to landscape attributes may vary among
species, attributes, and locales. We therefore derived predictor variables
at multiple spatial scales (i.e., different neighborhood sizes, or the
radius around each focal raster cell across which predictor values were
smoothed) where applicable for comparison. Our selection of neighborhood
sizes, which included 500 m, 5 km, and 25 km was guided by those to which
bats were found to respond in previous studies of multiscale habitat
selection (100 m – 10 km). However, these studies focused on response to
the landscape during daily foraging bouts, and we felt it was important to
consider a broader range of spatial scales for selection of a winter
hibernaculum. All smoothing of predictor variables using each of the
selected neighborhood sizes was performed at the native resolution of each
variable prior to sampling. Thus, for a variable with native resolution of
30 m, we summarized values within 500 m, 5 km, and 25 km of each focal 30
m cell. Each layer was then aggregated to two scales, 1 km and 10 km, for
sampling. This step offered a means of exploring the scale of bats’
response to those variables to which we could not reasonably apply the
above range of neighborhood sizes, either due to the coarse native
resolution of the variables or because application of varying neighborhood
sizes did not make intuitive sense. All predictors were derived and/or
sampled using Google Earth Engine, a cloud-based computing platform
supporting large-scale analysis on an extensive catalog of remotely
sensed, climatological, and other geospatial datasets. All final
predictive maps were derived at a resolution of 1 km. Survivorship. We
estimated species-specific, spatially explicit winter survivorship
relative to the duration of winter. These estimates were based on an
existing bioenergetic model of bat winter survivorship, recently updated
and parameterized for western bat species. Full details are elsewhere (see
linked publication and references therein), but briefly, the model uses
the hypothesized energetic requirements of bats in torpor to dynamically
model torpor bouts for the duration of a predicted winter under specified
hibernaculum conditions. For M. lucifugus, torpor consumes approximately
eighty times less energy per unit time than euthermia, whereas the
infrequent but periodic arousals to euthermic temperatures use the
majority of energy stores, with each arousal consuming approximately 5% of
total overwinter energetic costs. In this model, ambient temperature and
relative humidity were drivers of arousal frequency. Using gridded spatial
data, we applied the model to values of each 1-km grid cell across the
study extent to predict the fat mass expected to remain at the end of
winter given mean ambient temperature and winter duration at each 1-km2
raster cell. Higher, positive predicted values are expected to correspond
to high survivorship, while low or negative values indicate areas where
bats are unlikely to survive. Further details regarding the bioenergetic
model and spatial parameters are described in the Supplemental
Information. Topography. We derived topographic covariates from the global
ALOS Digital Surface Model (DSM version 2.2) at 30-m resolution, including
elevation, topographic ruggedness, and topographic position. Topographic
ruggedness was quantified as the standard deviation of elevation values
within a given radius around each focal raster cell. Similarly,
topographic position was quantified as the difference between the
elevation of each focal raster cell and the mean of elevation values
within a given radius, such that high values are associated with peaks and
ridges and low values are associated with canyon bottoms. We also
extracted Continuous Heat-Insolation Load Index, a surrogate for effects
of solar insolation and topographic shading on evapotranspiration, also
derived from the global ALOS DSM at 90-m resolution. We used a moving
window approach to derive topographic ruggedness and position at three
spatial scales (diameter = 500 m, 5 km, 25 km), then the resulting values
were averaged to create ‘multiscale’ metrics. We took the focal mean of
solar insolation values over these multiple scales as well. Surface
attributes. We derived percent tree cover from the Terra MODIS Vegetation
Continuous Fields product, which estimates sub-pixel-level surface
vegetation cover globally, including percent tree cover, on an annual
basis (250-m resolution; NASA). Because data were not available for the
entire temporal range of our occurrence data, we used data for the most
recent year available (2015). We estimated percent tree cover at two
aggregated scales (diameter = 5 km, 25 km). We used global nighttime
lights imagery from the Defense Meteorological Program Operational
Line-Scan System (Radiance-Calibrated, V4) as a proxy for relative
intensity of human development (30-arcsec resolution; NOAA). We estimated
availability of surface water based on the Joint Research Center Yearly
Water Classification History (V1), which maps the location and seasonality
of surface water from Landsat 5, 7, and 8 imagery (30-m resolution). We
estimated the percent cover of seasonal or permanent surface water at
three spatial scales (diameter = 500 m, 5 km, 25 km), focusing on the most
recent year for which data were available (2015) because the data do not
span the entire temporal range of our occurrence dataset. We estimated the
frequency of snow cover based on the MODIS Global Daily Snow Cover product
(V6), which estimates percent snow cover of each 500-m pixel on a daily
basis. We counted the average number of days per year with at least 10%
snow cover over the 5-year period from July, 2013 to June, 2018. We
quantified precipitation using the DayMet dataset (V3), which provides
gridded daily precipitation data at 1-km resolution. We estimated mean
annual total precipitation by summing daily values annually then averaging
the most recent five years available (2013-2018) for consistency with the
temporal range of other available predictor data. Below-ground
attributes. To represent potential availability of karst features that may
provide suitable hibernacula, we relied on a map of karst and pseudokarst
features across the United States produced by Weary and Doctor (2014, USGS
Open-File Report 2014–1156) derived from State geological survey maps and
USGS integrated geologic map databases (1:24,000 to 1:500,000 resolution).
We merged this with an equivalent dataset for British Columbia provided by
the Ministry of Forests, Lands, Natural Resource Operations and Rural
Development (1:250,000 resolution) (Forest Analysis and Inventory, 2019).
We did not differentiate among karst types, and instead created a simple
binary indicator of karst presence vs. absence in raster format (1-km
resolution). We also estimated availability of mines as potential
hibernacula, using mine site locations available from the USGS Prospect-
and Mine-Related Features database (v4, available for all but northeastern
states) and the Mineral Resources Data System (MRDS, used for northeastern
states; USGS), and from the MINFILE Production Database for British
Columbia (BC Geological Survey). We included only mineral resource sites
classified as mines (Mine-Related Features and MRDS) or as producing at
one time (MINFILE). We derived two measures of mine availability: distance
to the nearest mine and density of mines within 50 km of each focal raster
cell (1-km resolution), calculated using a Gaussian kernel density
function (sigma = 25 km). Karst and mine data were not available for other
Canadian provinces; these predictors were not included in models for M.
lucifugus, whose range spans these areas. Finally, we estimated
groundwater depth from a global water table depth model that gap-filled
point observations with a mechanistic groundwater model (1-km resolution;
Fan et al., 2013, Science 339:940). Model fitting We estimated
species-specific relative probability of occurrence during winter using
boosted regression trees (BRT; De’Ath, 2007, Ecology 88(1):243; Elith et
al., 2008, Journal of Animal Ecology 77:802). A BRT (a.k.a. gradient
boosting machine or stochastic gradient boosting) is an ensemble approach
that combines regression trees, which relate a response to predictors by
recursive binary splits of the data, and boosting, in which inference is
drawn from the relative strength of many possible models rather than
fitting a single parsimonious model. This method offers advantages over
more traditional linear regression approaches in that a variety of
response data and model forms can be accommodated (e.g., Gaussian,
binomial, Poisson); different types of predictor variables (e.g.,
continuous, ordinal, categorical) can be included with no need for
transformation or outlier removal; nonlinear relationships are easily
captured; and interactions between predictors are handled automatically.
Furthermore, overfitting is well-controlled through the use of
cross‐validation as BRT models are ‘grown’. Importantly, a number of
studies (see linked publication and references therein) have shown strong
BRT predictive performance relative to other SDM approaches (e.g.,
generalized linear models, generalized additive models, climatic envelope
models, maximum entropy). We follow the approach detailed by Elith et al.
(2008) for application of BRT to species distribution modeling. One key
difference in our application is that we make use of presence-background
data rather than presence-absence data. Use of presence-background data,
in which sites where the focal species was absent are not known with
certainty, requires a shift in model assumptions and inference.
Presence-absence models compare landscape attributes of sites at which the
species was known to be present and absent to estimate the absolute
probability of occurrence at any unobserved site given its climate and/or
landscape characteristics. Without absence data, attributes of presence
locations must instead be compared to randomly-sampled ‘background’
locations. In this case, presence is assessed relative to availability and
the species’ absence at sampled background locations is not guaranteed.
This shift in comparison fundamentally alters the inferences that can be
made from the model: we cannot estimate the absolute probability of focal
species occurrence (i.e., 80% probability of occurrence at a given site),
but we can estimate, or rank, the relative probability of occurrence. We
sampled ‘background’ locations from geographic areas extending well beyond
each species’ known range in the US and Canada (16 western states and
British Columbia for C. townsendii, M. californicus; all states and
provinces for M. lucifugus; all US states for M. velifer, P. subflavus).
This choice aimed to sufficiently capture the full range of environmental
conditions limiting bats’ distributions. Because bats were more likely to
have been observed in locations already known to harbor bats and that are
more accessible (e.g., closer to population centers, accessible by roads,
and in less rugged topography), we generated background points so as to
replicate and thus control for this inherent spatial bias (after Hertzog
et al., 2014, Diversity and Distributions 20:1403). We first created a
bias grid based on the kernel density of occurrence locations using the
MASS package for R, then generated background points with probability
dictated by occurrence density, such that areas with high density of
occurrences had high probability of background sampling, but all locations
within the sampling extent had nonzero probability of sampling. Our
background sample consisted of three background points per occurrence
point as a balance between achieving good coverage of available habitat
and not artificially inflating sample size. Finally, we sampled all
candidate predictor variables at each presence and background location. To
identify the most appropriate scale for each predictor (i.e., the scale at
which habitat selection was most evident), we first fit univariate
generalized additive models (GAM) for each predictor. We chose GAM for
this preliminary predictor selection step to not constrain the form of the
response. We selected the best performing scale for each predictor based
on a comparison of Akaike’s Information Criterion (AIC) scores across each
scale at which the predictor was sampled. We then assessed pairwise
correlations and variance inflation factors across the resulting set of
predictors and excluded those causing standard thresholds of 0.7 and 4.0,
respectively, to be exceeded to avoid multicollinearity. We also excluded
mine density from further consideration due to its poorer AIC-based
performance across all focal species compared to distance from mines. We
fit and calibrated each BRT model using the stepwise cross-validation
process detailed by Elith et al. (2008) and accompanying R scripts (Elith
et al., 2008 Appendix S3). We adjusted the model learning rate to ensure
that a minimum of 1000 trees were fit, then calibrated the tree complexity
(range: 3-5) and bag fraction (range: 0.5-0.7) to minimize deviance. We
tested for benefits of dropping uninformative model terms based on
estimated reduction in deviance. We then used this ‘optimized’ model to
assess the relative contribution of each predictor, plot the relationship
between each predictor and relative occurrence probability, and evaluate
model performance. We evaluated the model’s fit to the training data
(iteratively partitioned in the cross-validation process) based on the
mean proportion of deviance explained in each cross-validation iteration
(D2), a pseudo-determination coefficient intended to be comparable to R2.
We also assess predictive performance based on the cross-validated area
under the receiver operating curve (AUC). Although use of this metric to
evaluate presence-background models is flawed by ‘contamination’ of
background sites with unobserved presence, we report it as an additional
evaluation metric that supports comparison with other studies that
frequently include it. As a final modeling step, we applied the optimized
model to predictor values in each 1-km cell of the extent of interest for
each species to predict and map relative probability of occurrence (Elith
et al., 2008 Appendix S3). We summarized the percentile ranks of
occurrence probability values predicted for presence and background
locations as an additional assessment of predictive performance. All
model fitting and prediction were conducted in R (version 3.4.1).
See uploaded ReadMe file.