10.5061/DRYAD.HMGQNK9DH
Dapporto, Leonardo
0000-0001-7129-4526
University of Florence
Platania, Leonardo
Institute of Evolutionary Biology
Menchetti, Mattia
University of Florence
Corbella, Cecília
0000-0001-7502-5833
Institute of Evolutionary Biology
Kay-Lavelle, Isaac
University of Manchester
Vila, Roger
Institute of Evolutionary Biology
Wiemers, Martin
0000-0001-5272-3903
Helmholtz Centre for Environmental Research
Schweiger, Oliver
0000-0001-8779-2335
Helmholtz Centre for Environmental Research
Assigning occurrence data to cryptic taxa improves climatic niche
assessments: biodecrypt, a new tool tested on European butterflies
Dryad
dataset
2020
2021-08-14T00:00:00Z
2020-08-14T00:00:00Z
en
https://doi.org/10.1111/geb.13154
2416493 bytes
3
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Aim: Occurrence data are fundamental to macroecology, but accuracy is often
compromised when multiple units are lumped together (e.g. in recently
separated cryptic species or citizen science records). Using amalgamated
data leads to inaccuracy in species mapping, to biased beta-diversity
assessments and to potentially erroneously predicted responses to climate
change. We provide a set of R functions (biodecrypt) to objectively
attribute undetermined occurrences to the most probable taxon based on a
subset of identified records.

Innovation: Biodecrypt assumes that unknown
occurrences can only be attributed at certain distances from areas of
sympatry. The function draws concave hulls based on the subset of
identified records; subsequently, based on hull geometry, it attributes
(or not) unknown records to a given taxon. Concavity can be imposed with
an alpha value and sea or land areas can be excluded. A cross-validation
function tests attribution reliability and another function optimizes the
parameters (alpha, buffer, distance ratio between hulls). We applied the
procedure to 16 European butterfly complexes recently separated into 33
cryptic species for which most records were amalgamated. We compared niche
similarity and divergence between cryptic taxa, and we re-calculated and
contributed updated CLIMBER variables for climatic preferences.

Main conclusions: Biodecrypt showed a cross-validated correct attribution of
known records always ≥98% and attributed more than 80% of unknown records
to the most likely taxon in parapatric species. The functions determined
where records can be assigned even for largely sympatric species, and
highlighted areas where further sampling is required. All the cryptic taxa
showed significantly diverging climatic niches, reflected in different
values of mean temperature and precipitation compared to the values
originally provided in the CLIMBER database. The substantial fraction of
cryptic taxa existing across different taxonomic groups and their
divergence in climatic niches highlights the importance of using reliably
assigned occurrence data in macroecology.
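
The attribution and cross-validation idea described above can be illustrated with a minimal sketch. This is not the biodecrypt code (which is written in R and works on concave hulls); it is a simplified Python stand-in that replaces hull geometry with the distance to the nearest identified record of each taxon. The function names, the `min_ratio` parameter (a loose analogue of the distance ratio between hulls), the taxon labels and all coordinates are hypothetical and chosen only for illustration.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def attribute(point, identified, min_ratio=2.0):
    """Assign `point` to the closest taxon only if the second-closest taxon is
    at least `min_ratio` times farther away; otherwise return None.
    ASSUMPTION: nearest-record distances stand in for hull geometry."""
    nearest = {}
    for coords, taxon in identified:
        d = haversine_km(point, coords)
        nearest[taxon] = min(d, nearest.get(taxon, math.inf))
    ranked = sorted(nearest.items(), key=lambda kv: kv[1])
    if len(ranked) == 1:
        return ranked[0][0]
    (best, d1), (_, d2) = ranked[0], ranked[1]
    if d1 == 0:
        # Point coincides with an identified record of `best`.
        return best if d2 > 0 else None
    return best if d2 / d1 >= min_ratio else None

def cross_validate(identified, min_ratio=2.0):
    """Leave-one-out check: hide each identified record, re-attribute it from
    the remaining ones, and count correct, wrong and unassigned outcomes."""
    counts = {"correct": 0, "wrong": 0, "unassigned": 0}
    for i, (coords, taxon) in enumerate(identified):
        rest = identified[:i] + identified[i + 1:]
        guess = attribute(coords, rest, min_ratio)
        if guess is None:
            counts["unassigned"] += 1
        elif guess == taxon:
            counts["correct"] += 1
        else:
            counts["wrong"] += 1
    return counts

# Hypothetical identified records: taxon "A" in the west, taxon "B" in the east.
identified = [
    ((-3.7, 40.4), "A"), ((-8.6, 41.1), "A"), ((2.2, 41.4), "A"),
    ((12.5, 41.9), "B"), ((11.3, 43.8), "B"), ((14.3, 40.9), "B"),
]

# Hypothetical undetermined records: two well inside one range, one in between.
unknown = [(-5.0, 39.0), (13.0, 42.5), (7.0, 42.5)]
assigned = [attribute(p, identified) for p in unknown]
validation = cross_validate(identified)
```

In this sketch the record lying between the two groups stays unassigned, which mirrors the idea that records close to areas of sympatry are left undetermined rather than forced into a taxon. Raising `min_ratio` makes the attribution more conservative, at the cost of leaving more records unassigned.
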
The "Script.R" file contains all the scripts needed to run the examples
and the analyses that separate occurrence data among cryptic taxa.
The "biodecrypt.R" file contains the functions.
The "Total_data.txt" file contains the data used for the analyses,
without duplicates for each cell.
The "all_data.txt" file contains all the occurrence data (including
duplicates), with indications of geographic locations and BOLD and
GenBank IDs.
The "Appendix_S2.txt" file contains the new CLIMBER data.