10.5061/DRYAD.J0R8127
Marshall, Charles R.
University of California, Berkeley
Finnegan, Seth
University of Alaska Fairbanks
Clites, Erica C.
University of California, Berkeley
Holroyd, Patricia A.
University of California Museum of Paleontology
Bonuso, Nicole
California State University, Fullerton
Cortez, Crystal
John D. Cooper Archaeological and Paleontological Center, Santa Ana, CA
92701-6427, USA
Davis, Edward
University of Oregon
Dietl, Gregory P.
Paleontological Research Institution, 1259 Trumansburg Road, Ithaca, NY
14850, USA
Druckenmiller, Patrick S.
University of Alaska System
Eng, Ron C.
University of Washington
Garcia, Christine
California Academy of Sciences
Estes-Smargiassi, Kathryn
Natural History Museum of Los Angeles County
Hendy, Austin
Natural History Museum of Los Angeles County
Hollis, Kathy A.
Smithsonian Institution
Little, Holly
Smithsonian Institution
Nesbitt, Elizabeth A.
University of Washington
Roopnarine, Peter
California Academy of Sciences
Skibinski, Leslie
Paleontological Research Institution, 1259 Trumansburg Road, Ithaca, NY
14850, USA
Vendetti, Jann
Natural History Museum of Los Angeles County
White, Lisa D.
University of California Museum of Paleontology
Data from: Quantifying the dark data in museum fossil collections as
palaeontology undergoes a second digital revolution
Dryad
dataset
2018
iDigBio
Museum collections
Digitization
Dark data
National Science Foundation
https://ror.org/021nxhr62
DBI-1503678, DBI-1503628, DBI-1503611, DBI-1503065, DBI-1503613,
DBI-1503545, DBI-1502500, DBI-1349430, DBI-1561759, DBI-1203600
2018-08-07T19:20:00Z
2018-08-07T19:20:00Z
en
https://doi.org/10.1098/rsbl.2018.0431
3115 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Large-scale analysis of the fossil record requires aggregation of
palaeontological data from individual fossil localities. Prior to
computers these synoptic datasets were compiled by hand, a laborious
undertaking that took years of effort and forced palaeontologists to make
difficult choices about what types of data to tabulate. The advent of
desktop computers ushered in palaeontology’s first digital revolution –
online literature-based databases, such as the Paleobiology Database
(PBDB). However, the published literature represents only a small
proportion of the palaeontological data housed in museum collections.
Although this issue has long been appreciated, the magnitude, and thus
potential significance, of these so-called “dark data” has been difficult
to determine. Here, in the early phases of a second digital revolution in
palaeontology – the digitization of museum collections – we provide an
estimate of the magnitude of palaeontology’s dark data. Digitization of
our nine institutions’ holdings of Cenozoic marine invertebrate
collections from California, Oregon, and Washington in the United States
reveals that they represent 23 times the number of unique localities that
are currently available in the Paleobiology Database. These data, and the
vast quantity of similarly untapped dark data in other museum collections,
will, when digitally mobilized, enhance palaeontologists’ ability to make
inferences about the patterns and processes of past evolutionary and
ecological changes.
Supplementary_Data
Counts of sites by county for Cenozoic fossil marine invertebrates used to
create Figure 1, specifically the number of sites (collections) from the
Paleobiology Database download, and the number of sites (localities)
digitally mobilized from nine institutions of the EPICC TCN. The
Paleobiology Database download was performed on November 10, 2017 using
the following query:
http://paleobiodb.org/data1.2/colls/list.csv?datainfo&rowcount&interval=Cenozoic,Cenozoic&cc=US&envtype=marine&show=loc,paleoloc,strat,stratext,lith,geo,methods,resgroup,refattr,secref,ent,entname
The nine institutions are: Burke Museum of Natural History and Culture,
University of Washington, Seattle, WA; California Academy of Sciences, San
Francisco, CA; Natural History Museum of Los Angeles County, Los Angeles,
CA; National Museum of Natural History, Smithsonian Institution,
Washington, DC; Orange County Paleontological Collection, Fullerton, CA;
Paleontological Research Institution, Ithaca, NY; University of Alaska
Museum, University of Alaska, Anchorage, AK; University of California
Museum of Paleontology, Berkeley, CA; University of Oregon Museum of
Natural History, Eugene, OR. Data from both Paleobiology Database
collections and museum collections were cleaned to remove terrestrial
strata erroneously categorized as marine.
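To make the download query easier to read, the following minimal Python sketch splits the URL quoted above into its endpoint, its value-less flags, its filters, and the output blocks requested through show. It only parses the string and does not contact the Paleobiology Database. The names PBDB_QUERY and describe_query are illustrative assumptions, not part of the dataset or of any PBDB client.

```python
from urllib.parse import urlsplit, parse_qs

# Query string copied verbatim from the dataset description.
PBDB_QUERY = (
    "http://paleobiodb.org/data1.2/colls/list.csv?datainfo&rowcount"
    "&interval=Cenozoic,Cenozoic&cc=US&envtype=marine"
    "&show=loc,paleoloc,strat,stratext,lith,geo,methods,resgroup,"
    "refattr,secref,ent,entname"
)


def describe_query(url):
    """Break a PBDB-style query URL into endpoint, flags, filters, and show blocks.

    Hypothetical helper for illustration only; it performs no network access.
    """
    parts = urlsplit(url)
    # keep_blank_values=True retains parameters given without a value
    # (here: datainfo and rowcount), which parse_qs would otherwise drop.
    params = parse_qs(parts.query, keep_blank_values=True)
    return {
        "endpoint": parts.path,
        "flags": sorted(k for k, v in params.items() if v == [""]),
        "filters": {
            k: v[0] for k, v in params.items() if v != [""] and k != "show"
        },
        "show": params.get("show", [""])[0].split(","),
    }


if __name__ == "__main__":
    info = describe_query(PBDB_QUERY)
    print("Endpoint:", info["endpoint"])
    print("Flags:   ", ", ".join(info["flags"]))
    for name, value in info["filters"].items():
        print(f"Filter:   {name} = {value}")
    print("Show:    ", ", ".join(info["show"]))
```

Read this way, the query restricts the collections list to the Cenozoic interval, to the country code US, and to the marine environment type. This matches the description's note that terrestrial strata miscategorized as marine still had to be removed afterwards.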
Oregon
Pacific
California
Washington