10.5061/DRYAD.89K5K30
Blischak, Paul D.
The Ohio State University
Latvis, Maribeth
South Dakota State University
Morales-Briones, Diego F.
University of Minnesota
Johnson, Jens C.
University of Washington
Di Stilio, Verónica S.
University of Washington
Wolfe, Andrea D.
The Ohio State University
Tank, David C.
University of Idaho
Data from: Fluidigm2PURC: automated processing and haplotype inference for
double-barcoded PCR amplicons
Dryad
dataset
2019
Thalictrum L.
haplotype inference
microfluidic PCR
Bioinformatics
National Science Foundation
https://ror.org/021nxhr62
DEB-1455399, DEB-1253463, IOS-1121669, DBI-0939454
2019-06-06T00:00:00Z
2019-06-06T00:00:00Z
en
https://doi.org/10.1002/aps3.1156
2390466 bytes
1
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Premise of the Study: Targeted enrichment strategies for phylogenomic
inference are a time‐ and cost‐efficient way to collect DNA sequence data
for large numbers of individuals at multiple, independent loci. Automated
and reproducible processing of these data is a crucial step for
researchers conducting phylogenetic studies. Methods and Results: We
present Fluidigm2PURC, an open source Python utility for processing
paired‐end Illumina data from double‐barcoded PCR amplicons. In
combination with the program PURC (Pipeline for Untangling Reticulate
Complexes), our scripts process raw FASTQ files for analysis with PURC and
use its output to infer haplotypes for diploids, polyploids, and samples
with unknown ploidy. We demonstrate the use of the pipeline with an
example data set from the genus Thalictrum (Ranunculaceae). Conclusions:
Fluidigm2PURC is freely available for Unix‐like operating systems on
GitHub (https://github.com/pblischak/fluidigm2purc) and for all operating
systems through Docker (https://hub.docker.com/r/pblischak/fluidigm2purc).
Thalictrum Data
Paired-end Illumina MiSeq reads for six species of Thalictrum, sequenced
at two portions of the gene PISTILLATA (PIS_3 and PIS_4). The files are
Thalictrum_R1.fastq.gz and Thalictrum_R2.fastq.gz, containing paired-end
reads 1 and 2, respectively. The code to analyze these data using
dbcAmplicons and Fluidigm2PURC is in the supplemental materials.
Thalictrum-data.zip
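
Before running any pipeline on the two read files, it can help to confirm that they are properly paired. The following is a minimal sketch of such a check, not part of Fluidigm2PURC or dbcAmplicons. The function names `read_ids` and `count_pairs` are hypothetical, and the code assumes standard four-line FASTQ records with reads in the same order in both files.

```python
import gzip
from itertools import zip_longest


def read_ids(path):
    """Yield the read identifier of each 4-line record in a gzipped FASTQ file."""
    with gzip.open(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 0:
                continue  # skip sequence, separator, and quality lines
            fields = line[1:].split()  # drop the leading '@', split off the description
            if not fields:
                continue  # tolerate a trailing blank line
            name = fields[0]
            # Older headers mark the mate with a '/1' or '/2' suffix (assumption).
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def count_pairs(r1_path, r2_path):
    """Return the number of read pairs; raise ValueError if the files disagree."""
    n = 0
    for id1, id2 in zip_longest(read_ids(r1_path), read_ids(r2_path)):
        if id1 is None or id2 is None:
            raise ValueError("files contain different numbers of reads")
        if id1 != id2:
            raise ValueError(f"read name mismatch at pair {n + 1}: {id1} vs {id2}")
        n += 1
    return n


# Example call, assuming the archive has been unzipped into the working directory:
# count_pairs("Thalictrum_R1.fastq.gz", "Thalictrum_R2.fastq.gz")
```

The check streams both files in step, so memory use stays small regardless of file size.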