10.5061/DRYAD.64687
Schnell, Ida Bærholm
University of Copenhagen
Bohmann, Kristine
University of Bristol
University of Copenhagen
Gilbert, M. Thomas P.
University of Copenhagen
Curtin University
Schnell, Ida Baerholm
Copenhagen Zoo
University of Copenhagen
Data from: Tag jumps illuminated – reducing sequence-to-sample
misidentifications in metabarcoding studies
Dryad
dataset
2015
Tag jumping
Diversity assessment
Second generation sequencing
Chimeras
2015-03-03T16:44:21Z
en
https://doi.org/10.1111/1755-0998.12402
2644994440 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Metabarcoding of environmental samples on second-generation sequencing
platforms has rapidly become a valuable tool for ecological studies. A
fundamental assumption of this approach is that tagged amplicons can be
tracked back to the samples from which they originated. In
this study, we address the problem of sequences in metabarcoding
sequencing outputs with false combinations of used tags (tag jumps).
Unless these sequences can be identified and excluded from downstream
analyses, tag jumps that create sequences with false but already used tag
combinations can cause incorrect assignment of sequences to samples and
artificially inflate diversity. In this study, we document and investigate
tag jumping in metabarcoding studies on Illumina sequencing platforms by
amplifying mixed-template extracts obtained from bat droppings and leech
gut contents with tagged generic arthropod and mammal primers,
respectively. We found that an average of 2.6% and 2.1% of sequences had
tag combinations that could be explained by tag jumping in the leech and
bat diet studies, respectively. We suggest that tag jumping can happen
during blunt-ending of pools of tagged amplicons during library build and
as a consequence of chimera formation during bulk amplification of tagged
amplicons during library index PCR. We argue that tag jumping and
contamination between libraries represent a considerable challenge for
Illumina-based metabarcoding studies, and suggest measures to avoid false
assignment of tag jumping-derived sequences to samples.
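The bookkeeping behind such an estimate can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, category labels, tag names and read counts are all hypothetical, and it assumes reads have already been demultiplexed into counts per (forward tag, reverse tag) pair. Only jumps that produce an unused combination of used tags are visible this way; jumps that land on an already used combination look identical to genuine reads.

```python
from collections import Counter


def classify_tag_pairs(pair_counts, used_pairs):
    """Sum read counts per category of (forward, reverse) tag pair.

    pair_counts: dict mapping (fwd_tag, rev_tag) -> number of reads (hypothetical input)
    used_pairs:  set of (fwd_tag, rev_tag) combinations assigned to samples
    """
    used_fwd = {fwd for fwd, _ in used_pairs}
    used_rev = {rev for _, rev in used_pairs}
    totals = Counter()
    for (fwd, rev), n_reads in pair_counts.items():
        if (fwd, rev) in used_pairs:
            # Combination assigned to a sample in this library.
            totals["used_combination"] += n_reads
        elif fwd in used_fwd and rev in used_rev:
            # Both tags occur in the library, but not together:
            # a candidate tag jump.
            totals["unused_combination_of_used_tags"] += n_reads
        else:
            # At least one tag was not used in this library,
            # e.g. possible contamination from another library.
            totals["tag_not_in_library"] += n_reads
    return totals


def fraction(totals, key):
    """Share of all reads that fall into one category."""
    total = sum(totals.values())
    return totals[key] / total if total else 0.0


# Made-up toy counts, for illustration only.
used_pairs = {("F1", "R1"), ("F2", "R2"), ("F3", "R3")}
pair_counts = {
    ("F1", "R1"): 480,
    ("F2", "R2"): 300,
    ("F3", "R3"): 195,
    ("F1", "R2"): 15,
    ("F3", "R1"): 5,
    ("F9", "R1"): 5,
}
totals = classify_tag_pairs(pair_counts, used_pairs)
jump_fraction = fraction(totals, "unused_combination_of_used_tags")
```

Keeping the third category separate from candidate tag jumps mirrors the distinction drawn above between tag jumping within a library and contamination between libraries.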
LD1_Read1.fastq: Raw sequences, Leech diet study, Pool 1, Read 1 (TOG-3LX3-A3_S4_L001_R1_001.fastq.gz)
LD1_Read2.fastq: Raw sequences, Leech diet study, Pool 1, Read 2 (TOG-3LX3-A3_S4_L001_R2_001.fastq.gz)
LD2_Read1.fastq: Raw sequences, Leech diet study, Pool 2, Read 1 (TOG-3LX3-A4_S5_L001_R1_001.fastq.gz)
LD2_Read2.fastq: Raw sequences, Leech diet study, Pool 2, Read 2 (TOG-3LX3-A4_S5_L001_R2_001.fastq.gz)
LD3_Read1.fastq: Raw sequences, Leech diet study, Pool 3, Read 1 (TOG-3LX3-A5_S8_L001_R1_001.fastq.gz)
LD3_Read2.fastq: Raw sequences, Leech diet study, Pool 3, Read 2 (TOG-3LX3-A5_S8_L001_R2_001.fastq.gz)
LD4_Read1.fastq: Raw sequences, Leech diet study, Pool 4, Read 1 (TOG-3LX3-A6_S10_L001_R1_001.fastq.gz)
LD4_Read2.fastq: Raw sequences, Leech diet study, Pool 4, Read 2 (TOG-3LX3-A6_S10_L001_R2_001.fastq.gz)
LD5_Read1.fastq: Raw sequences, Leech diet study, Pool 5, Read 1 (TOG-3LX3-A7_S13_L001_R1_001.fastq.gz)
LD5_Read2.fastq: Raw sequences, Leech diet study, Pool 5, Read 2 (TOG-3LX3-A7_S13_L001_R2_001.fastq.gz)
LD6_Read1.fastq: Raw sequences, Leech diet study, Pool 6, Read 1 (TOG-3LX3-A9_S16_L001_R1_001.fastq.gz)
LD6_Read2.fastq: Raw sequences, Leech diet study, Pool 6, Read 2 (TOG-3LX3-A9_S16_L001_R2_001.fastq.gz)
BD1_Read1.fastq: Raw sequences, Bat diet study, Pool 1, Read 1 (TOG-TYYP-Japan5_S8_L001_R1_001.fastq.gz)
BD1_Read2.fastq: Raw sequences, Bat diet study, Pool 1, Read 2 (TOG-TYYP-Japan5_S8_L001_R2_001.fastq.gz)
BD2_Read1.fastq: Raw sequences, Bat diet study, Pool 2, Read 1 (TOG-TYYP-Japan2_S5_L001_R1_001.fastq.gz)
BD2_Read2.fastq: Raw sequences, Bat diet study, Pool 2, Read 2 (TOG-TYYP-Japan2_S5_L001_R2_001.fastq.gz)
BD3_Read1.fastq: Raw sequences, Bat diet study, Pool 3, Read 1 (TOG-TYYP-Japan3_S6_L001_R1_001.fastq.gz)
BD3_Read2.fastq: Raw sequences, Bat diet study, Pool 3, Read 2 (TOG-TYYP-Japan3_S6_L001_R2_001.fastq.gz)
BD4_Read1.fastq: Raw sequences, Bat diet study, Pool 4, Read 1 (TOG-TYYP-Japan4_S7_L001_R1_001.fastq.gz)
BD4_Read2.fastq: Raw sequences, Bat diet study, Pool 4, Read 2 (TOG-TYYP-Japan4_S7_L001_R2_001.fastq.gz)
BD5_Read1.fastq: Raw sequences, Bat diet study, Pool 5, Read 1 (TOG-TYYP-Japan1_S4_L001_R1_001.fastq.gz)
BD5_Read2.fastq: Raw sequences, Bat diet study, Pool 5, Read 2 (TOG-TYYP-Japan1_S4_L001_R2_001.fastq.gz)
Tags_in_libraries: Sheet 1: List of tags used in libraries (LD1-LD6 and BD1-BD5); Sheet 2: List of tag-sequences in the leech diet study; Sheet 3: List of tag-sequences in the bat diet study.
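When working with the files listed above, the Read 1 and Read 2 entries of each pool need to be kept together. A small sketch of that grouping is shown below; it assumes the entry labels follow the `LD<n>_Read<1|2>.fastq` / `BD<n>_Read<1|2>.fastq` pattern seen in this record, and the function name and regular expression are illustrative rather than part of the dataset.

```python
import re
from collections import defaultdict

# Labels such as "LD3_Read1.fastq" or "BD5_Read2.fastq" (pattern assumed
# from the entry names in this record).
LABEL_RE = re.compile(r"^(LD|BD)(\d+)_Read([12])\.fastq$")


def group_read_pairs(labels):
    """Map each library (e.g. "LD1") to its (Read 1, Read 2) labels."""
    by_library = defaultdict(dict)
    for label in labels:
        match = LABEL_RE.match(label)
        if match is None:
            # Not a read file (e.g. the tag spreadsheet); skip it.
            continue
        study, pool, read = match.groups()
        by_library[f"{study}{pool}"][int(read)] = label
    # Keep only libraries for which both reads are present.
    return {
        library: (reads[1], reads[2])
        for library, reads in by_library.items()
        if 1 in reads and 2 in reads
    }


labels = [
    "LD1_Read1.fastq",
    "LD1_Read2.fastq",
    "BD5_Read2.fastq",
    "BD5_Read1.fastq",
    "Tags_in_libraries",
]
paired = group_read_pairs(labels)
```

Libraries with only one read present are dropped rather than guessed at, so an incomplete download shows up as a missing key.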