Data Lineage (if applicable). Please include versions (e.g., input and forcing data, models, and coupling modules; instrument measurements; surveys; sample collections; etc.)
1. Description of methods used for collection/generation of data:
Zooplankton samples were collected from boreal lake mesocosms exposed to a simulated diluted bitumen spill and additional remediation practices.
Remediation practices consisted of enhanced monitored natural recovery and shoreline cleaner application. Samples were collected on day -3 (pre-spill), day 11, and day 38 relative to the spill.
All samples were collected following strict quality control methods, including decontaminated equipment, one-use items, and tools specific to treatment.
Samples were immediately placed in LifeGuard Soil Preservation Solution (Qiagen Inc., Mississauga, ON) on ice prior to being transferred to a -80°C freezer for long-term storage.
COI amplicon sequencing
Total genomic DNA and RNA were extracted from zooplankton samples using the AllPrep DNA/RNA Mini Kit (Qiagen Inc., Mississauga, ON). Concentration and quality were measured using a Qubit 4 Fluorometer and a NanoDrop spectrophotometer, respectively (Thermo Fisher Scientific, USA).
Complementary DNA (cDNA) was synthesized using SuperScript IV Reverse Transcriptase (Invitrogen, CA, USA) along with ezDNase to remove residual DNA.
The COI gene was amplified using the primers specified by Leray et al. (2013): the forward primer mlCOIintF (GGWACWGGWTGAACWGTWTAYCCYCC) and the reverse primer jgHCO2198R (TAAACTTCAGGGTGACCAAAAAATCA).
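The degenerate positions in these primers follow the IUPAC nucleotide codes (W = A or T, Y = C or T). As a minimal sketch, the snippet below expands a primer into a regular expression and counts how many distinct sequences it represents. The helper names `primer_to_regex` and `degeneracy` are illustrative only and were not part of the original workflow.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as listed in this README.
FORWARD = "GGWACWGGWTGAACWGTWTAYCCYCC"
REVERSE = "TAAACTTCAGGGTGACCAAAAAATCA"


def primer_to_regex(primer):
    """Compile a primer with IUPAC codes into a regular expression."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


def degeneracy(primer):
    """Number of distinct non-degenerate sequences the primer represents."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


forward_pattern = primer_to_regex(FORWARD)
print(len(FORWARD), degeneracy(FORWARD))  # 26 128
print(len(REVERSE), degeneracy(REVERSE))  # 26 1
```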
Samples were dual indexed to increase throughput of sequencing (Fadrosh et al., 2014).
Samples were amplified in 50 μL PCR reactions containing Platinum Taq Hot Start II High-Fidelity DNA Polymerase (Invitrogen, USA) on a SimpliAmp thermal cycler (Thermo Fisher Scientific) under the following conditions: initial denaturation at 98°C for 30 s, followed by 25 cycles of 98°C for 30 s, 58°C for 30 s, and 72°C for 30 s, with a final extension at 72°C for 10 min.
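For planning purposes, the programmed hold time of this cycling profile can be worked out from the values above. The sketch below is arithmetic only; it ignores ramp times between temperatures, which depend on the instrument, and the names are illustrative.

```python
# Durations in seconds, taken from the cycling conditions described above.
INITIAL_DENATURATION = 30
CYCLES = 25
CYCLE_STEPS = [30, 30, 30]  # 98°C, 58°C, 72°C
FINAL_EXTENSION = 10 * 60


def programmed_hold_seconds():
    """Total programmed hold time, excluding temperature ramping."""
    return INITIAL_DENATURATION + CYCLES * sum(CYCLE_STEPS) + FINAL_EXTENSION


total = programmed_hold_seconds()
print(total, total / 60)  # 2880 48.0
```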
PCR products were assessed for size and specificity using electrophoresis on a 1.2% w/v agarose gel and purified using the Qiagen QIAquick PCR Purification Kit (Qiagen Inc.).
All purified products were quantified with the Qubit dsDNA HS assay kit and concentrations were adjusted to 1 ng/μL with molecular-grade water.
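Adjusting a product to 1 ng/μL follows the usual dilution relation C1 × V1 = C2 × V2. The sketch below shows the calculation; the function name and the example concentration and volume are hypothetical and are not measurements from this dataset.

```python
def water_to_add(stock_ng_per_ul, stock_volume_ul, target_ng_per_ul=1.0):
    """Volume of water (μL) to add to reach the target, from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    final_volume = stock_ng_per_ul * stock_volume_ul / target_ng_per_ul
    return final_volume - stock_volume_ul


# Hypothetical example: 5 μL of purified product measured at 8.4 ng/μL.
print(round(water_to_add(8.4, 5.0), 2))  # 37.0
```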
Purified products were pooled, and libraries were constructed using the NEBNext® DNA Library Prep Master Mix Set for Illumina® (New England BioLabs Inc., Whitby, ON).
Libraries were quantified prior to sequencing using the NEBNext® Library Quant Kit for Illumina®.
Sequencing was performed on an Illumina® MiSeq instrument (Illumina, San Diego, CA) using a 2x300 base pair kit.
2. Methods for processing the data:
Sequences were trimmed, cleaned, and demultiplexed using a combination of USEARCH v11 (Edgar 2016) and fastq-multx (https://github.com/brwnj/fastq-multx).
Paired-end sequences were merged with USEARCH v11 (Edgar 2016).
Merged reads were then quality filtered, discarding reads with more than 1.0 expected errors (ee > 1.0) or shorter than 300 bp. Primer binding regions were then removed from the merged sequences, with 17 nucleotides stripped from the left end and 20 nucleotides stripped from the right end.
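The filtering and stripping steps above were run in USEARCH; the Python below is only a sketch of the underlying logic, not the USEARCH implementation. It assumes Phred+33 quality encoding and treats 300 bp as a minimum length, and the function names are illustrative. The number of expected errors in a read is the sum of its per-base error probabilities, 10^(-Q/10).

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities, 10 ** (-Q / 10)."""
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(sequence, quality_string, max_ee=1.0, min_length=300):
    """Keep reads that are long enough and have at most max_ee expected errors."""
    return len(sequence) >= min_length and expected_errors(quality_string) <= max_ee


def strip_primer_regions(sequence, left=17, right=20):
    """Remove a fixed number of nucleotides from each end of a merged read."""
    return sequence[left:len(sequence) - right]


# Hypothetical 350 bp merged read, used only to exercise the functions.
read = "ACGTA" * 70
high_quality = "I" * 350  # Q40 at every base
low_quality = "+" * 350   # Q10 at every base

print(passes_filter(read, high_quality))  # True
print(passes_filter(read, low_quality))   # False
print(len(strip_primer_regions(read)))    # 313
```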
Chimeric sequences were subsequently removed using unoise3, and ZOTUs (zero-radius OTUs) were compared to the BOLD database, an in-house curated database, and the NCBI database for taxonomic annotation.
Statistical analyses were performed in R (R Core Team, 2013).
All specified samples were collapsed to unique combinations of Time, Nucleic Acid, and Enclosure for further data analysis (see Data Specific Information For: sample-annotation.xlsx and variable list: TimeNucleicAcidEnclosure).
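The original collapsing was done in R. As a rough sketch of the idea, the Python below groups samples by Time, Nucleic Acid, and Enclosure and sums their ZOTU counts. Summation is an assumption, since this README does not state how replicates were combined, and the field names and values are hypothetical.

```python
from collections import defaultdict


def collapse(samples):
    """Sum ZOTU counts over samples sharing Time, Nucleic Acid, and Enclosure."""
    collapsed = defaultdict(lambda: defaultdict(int))
    for sample in samples:
        key = (sample["Time"], sample["NucleicAcid"], sample["Enclosure"])
        for zotu, count in sample["counts"].items():
            collapsed[key][zotu] += count
    return {key: dict(counts) for key, counts in collapsed.items()}


# Hypothetical records; all values are invented for illustration only.
samples = [
    {"Time": "Day11", "NucleicAcid": "DNA", "Enclosure": "E1",
     "counts": {"Zotu1": 10, "Zotu2": 3}},
    {"Time": "Day11", "NucleicAcid": "DNA", "Enclosure": "E1",
     "counts": {"Zotu1": 5}},
    {"Time": "Day11", "NucleicAcid": "RNA", "Enclosure": "E1",
     "counts": {"Zotu2": 7}},
]

result = collapse(samples)
print(result[("Day11", "DNA", "E1")])  # {'Zotu1': 15, 'Zotu2': 3}
```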
3. Instrument- or software-specific information needed to interpret the data:
User defined. These are fastq files. They can be analyzed in QIIME2 or R.
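For users who want to inspect the files without QIIME2 or R, a FASTQ file can be read as four-line records (identifier, sequence, separator, quality). The sketch below is a minimal reader under that assumption; it does not handle multi-line records, and the example data are invented.

```python
import io


def read_fastq(handle):
    """Yield (identifier, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ: " + header)
        yield header[1:], sequence, quality


# Invented two-record example, held in memory instead of a file on disk.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nGGCC\n+\nIIII\n")
records = list(read_fastq(example))
print(len(records), records[0][1])  # 2 ACGT
```

For a gzip-compressed file, the same function can be given a handle opened with `gzip.open(path, "rt")`.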
4. Environmental/experimental conditions:
Zooplankton samples were collected from the boreal lake shoreline enclosures in the rock habitat (Lake 260) located at the IISD-ELA.
The samples were collected on June 19th, July 3rd, and July 30th using a pump and a 53 μm mesh filter, with 20 L of water pumped per sample.
5. Describe any quality-assurance procedures performed on the data:
Quality was visualized with FastQC. USEARCH was used to trim low-quality bases prior to demultiplexing and to merge reads after demultiplexing.
unoise3 was then used to filter and denoise the merged sequences and to remove chimeras from the data.