10.5285/6CD12DE1-02C7-4F94-86F0-C11E76B86067
Sleight, Victoria A
Victoria A
Sleight
0000-0003-0550-8500
School of Biological Sciences, University of Aberdeen
Clark, Melody S
Melody S
Clark
0000-0002-3442-3824
British Antarctic Survey
Cavallo, Alessandro
Alessandro
Cavallo
0000-0002-0135-0032
MRC Weatherall Institute of Molecular Medicine
Laternula elliptica developmental bulk RNA-Seq data analysis results 2022, collected from Hangar Cove Rothera Point, on Adelaide Island in 2018
NERC EDS UK Polar Data Centre
2022
"EARTH SCIENCE","BIOLOGICAL CLASSIFICATION","ANIMALS/INVERTEBRATES","MOLLUSKS","BIVALVES","CLAMS"
"EARTH SCIENCE","BIOLOGICAL CLASSIFICATION","ANIMALS/INVERTEBRATES","MOLLUSKS"
biomineralisation
developmental biology
mollusc
transcriptomics
Sleight, Victoria A
Victoria A
Sleight
0000-0003-0550-8500
School of Biological Sciences, University of Aberdeen
Sleight, Victoria A
Victoria A
Sleight
0000-0003-0550-8500
School of Biological Sciences, University of Aberdeen
Clark, Melody S
Melody S
Clark
0000-0002-3442-3824
British Antarctic Survey
Cavallo, Alessandro
Alessandro
Cavallo
0000-0002-0135-0032
MRC Weatherall Institute of Molecular Medicine
UK Polar Data Centre
Natural Environment Research Council
UK Polar Data Centre
Natural Environment Research Council
UK Polar Data Centre
Natural Environment Research Council
2018-04-25/2018-09-25
2022-02-11
2022-02-11
2022-02-11
2022-11-09
en
Dataset
https://www.biorxiv.org/content/10.1101/2022.04.22.489168v1
https://github.com/SleightLab/Lelliptica_bulkRNAseq_analysis
16 files
108 MB
text/csv
text/plain
application/cys
1.0
Open Government Licence V3.0
This dataset comprises analysis results derived from mRNA that was extracted from Laternula elliptica developmental stages (blastula to juvenile) and sequenced (n=3 pools of 200 individuals per stage). The resulting sequence data was analysed, and the following results files and analysis scripts are available here: results files from differential gene expression analysis in edgeR (directory = edgeR_DE) and results files from WGCNA analysis (directory = WGCNA). Data collection was carried out at Hangar Cove, Rothera Point, Adelaide Island, in Ryder Bay, from 2018-04-25 to 2018-09-25 by researchers with the British Antarctic Survey. The data was collected as part of research on the developmental biology of molluscs.
This work was supported by UKRI Natural Environment Research Council (NERC) Core Funding to the British Antarctic Survey, a DTG Studentship (Project Reference: NE/J500173/1) and a Junior Research Fellowship to VAS from Wolfson College, University of Cambridge.
Adult L. elliptica were collected from Hangar Cove, Rothera Point, Adelaide Island, Ryder Bay between 2018-04-25 and 2018-09-25 and transported in a refrigerated recirculating aquarium by ship to the British Antarctic Survey aquarium facility (Cambridge, UK). Embryos were obtained from an adult broodstock of sexually mature individuals and divided into three independent closed-system 1 L tanks. Embryos were maintained at 0 degrees Celsius (within 0.5 degrees Celsius) and aerated with an airstone, with water changes every two days using autoclaved seawater, until they reached the desired developmental stage. Embryos were staged as per Peck et al. (2007) and the following stages were studied: blastula, gastrula, trochophore, veliger, early D-larvae (PI), late D-larvae (PII) and postlarva/juvenile (DI).
Triplicate RNA-Seq samples were collected for each developmental stage, one from each independent tank. For each sample, two hundred stage-matched embryos were selected, transferred into a microcentrifuge tube, snap frozen in a 70 percent ethanol dry ice slurry and stored at -80 degrees Celsius. Total RNA was extracted from each sample as per the manufacturer's recommendations (Relia Miniprep kit, Promega) and tested for quality and quantity using Nanodrop and Agilent Tapestation. All samples had an RNA Integrity Number (RIN) of over 7. Libraries were prepared by the sequencing facility in the Biochemistry Department at the University of Cambridge (TruSeq Stranded mRNA, Illumina) and sequenced on an Illumina NextSeq500, generating over 300 million 150 bp stranded paired-end reads.
Clean, normalised reads were assembled using Trinity v.2.2.0 with default parameters.
Transcript abundance was estimated by alignment-based quantification using Trinity v.2.2.0 utilities. Reads from each cleaned library were aligned to the transcriptome using bowtie2 with default parameters, and transcript abundance estimates were calculated using RNA-Seq by Expectation-Maximization (RSEM). Matrices of raw counts and of Trimmed Mean of M-values (TMM) normalised Fragments Per Kilobase Of Exon Per Million Fragments Mapped (FPKM) were generated using Trinity v.2.2.0 utilities.
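As a reminder of what the FPKM unit expresses, the following minimal sketch computes a plain FPKM value from a fragment count, a transcript length and a library size. It is illustrative only: the function name and the example numbers are hypothetical, and it does not reproduce the RSEM estimation or the TMM scaling applied by the Trinity utilities.

```python
def fpkm(fragments: float, length_bp: float, total_fragments: float) -> float:
    """Fragments per kilobase of transcript per million mapped fragments.

    fragments       -- fragments assigned to one transcript (hypothetical input)
    length_bp       -- transcript length in base pairs
    total_fragments -- total mapped fragments in the library
    """
    if length_bp <= 0 or total_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (length_bp * total_fragments)


# Hypothetical example: 500 fragments on a 2,000 bp transcript
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
```

In the dataset itself, the values were additionally scaled by TMM factors across libraries, which this sketch omits.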
Pairwise differential gene expression tests were performed to find transcripts that were upregulated at each stage of shell development (compared to the previous stage). Using the edgeR package, a negative binomial additive generalised linear model with a quasi-likelihood F-test was fitted, and p-values were adjusted for multiple testing using the Benjamini-Hochberg method to control the false discovery rate. Cut-offs were applied for statistical significance (FDR less than or equal to 0.05) and for magnitude (log2FC less than 2). Upregulated transcripts were putatively annotated based on sequence similarity searches using blastx against Uniprot (http://www.uniprot.org/), and screened for functional categories relating to gene regulation and shell secretion.
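The Benjamini-Hochberg adjustment mentioned above can be sketched in a few lines. This is a generic, minimal illustration of the procedure in Python, not the edgeR code used to produce the results files; the function names and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the
    # adjusted values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def passes_fdr(p_values: List[float], fdr_cutoff: float = 0.05) -> List[bool]:
    """Flag tests whose adjusted p-value is at or below the FDR cut-off."""
    return [q <= fdr_cutoff for q in benjamini_hochberg(p_values)]


# Hypothetical p-values for four transcripts.
example_adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The 0.05 default mirrors the FDR cut-off stated in the text; the fold-change filter is left out of the sketch.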
For WGCNA analysis, TMM-FPKM values were used to calculate a gene dissimilarity matrix (adjacency = softpower 16 and signed, TOMsimilarity = signed) and hierarchical clustering was performed (method = average). Modules were determined using the cutreeDynamic function with a minimum gene membership threshold of 30 and dynamic tree cut-off of 25. Modules were correlated to external traits (days post fertilisation or candidate gene expression values). Modules that were significantly correlated to traits of interest were extracted, and all transcripts were putatively annotated based on sequence similarity searches using blastx against Uniprot and tested for functional enrichment.
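To illustrate what a signed adjacency with a soft power of 16 does to correlations, the sketch below applies the standard signed-network transform, (0.5 + 0.5 * cor)^power, to a toy expression table. It is a pure-Python illustration under the assumption of Pearson correlation; the gene names and values are hypothetical, and the analysis itself was run with the WGCNA R package, which also computes the topological overlap step omitted here.

```python
import math
from typing import Dict, List, Tuple


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def signed_adjacency(
    expr: Dict[str, List[float]], power: int = 16
) -> Dict[Tuple[str, str], float]:
    """Signed adjacency: ((1 + cor) / 2) ** power for every gene pair."""
    genes = list(expr)
    return {
        (g, h): (0.5 + 0.5 * pearson(expr[g], expr[h])) ** power
        for g in genes
        for h in genes
    }


# Hypothetical expression profiles across four samples.
toy_expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],    # perfectly anti-correlated with gene_a
    "gene_d": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with gene_a
}
toy_adjacency = signed_adjacency(toy_expr)
```

With a high soft power, only strongly positive correlations keep an adjacency near 1, while uncorrelated and negatively correlated pairs are pushed towards 0.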
Instrumentation:
Sequencing: TruSeq Stranded mRNA libraries sequenced on an Illumina NextSeq500
Assembly: Trinity v.2.2.0
Analysis: edgeR and WGCNA
Triplicate RNA-Seq libraries were generated for each developmental stage
Raw reads (309,593,642 in total) were cleaned using fastq-mcf from ea-utils v1.1.2 (quality -q 30 and length -l 100); after cleaning, 296,480,254 reads remained.
Hangar Cove, Rothera Point, Adelaide Island, Ryder Bay, Antarctica
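The read counts above imply the following retention figures. This is simple arithmetic on the two stated totals; the library count of 21 is an inference from seven developmental stages sequenced in triplicate, and the per-library mean is only an average, not a reported value.

```python
raw_reads = 309_593_642    # total raw reads, as stated
clean_reads = 296_480_254  # reads remaining after fastq-mcf cleaning, as stated

removed_reads = raw_reads - clean_reads
retained_pct = 100 * clean_reads / raw_reads

# Assumption: 7 developmental stages x 3 replicate libraries each.
n_libraries = 7 * 3
mean_clean_per_library = clean_reads / n_libraries

print(f"removed: {removed_reads:,} reads")
print(f"retained: {retained_pct:.2f} %")
print(f"mean clean reads per library: {mean_clean_per_library:,.0f}")
```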
-68.13333
-67.56667
Natural Environment Research Council
https://ror.org/02b5d8509
NE/J500173/1
BAS-2011-DTG-Funding 5 Studentships