Data: Evolution of copper tolerance in the marine diatom species Skeletonema marinoi

This project explores if, and how, contemporary populations of the coastal diatom Skeletonema marinoi have evolved in response to mining pollution. The study systems are two semi-enclosed inlets in the Baltic Sea: one, Gåsfjärden (VG: 57°34.35'N, 16°34.98'E), has been affected by mining pollution for ca. 400 years, while the other, Gropviken (GP: 58°19.92'N, 16°42.35'E), has not. Strains were isolated, the genomes of 55 individual strains were sequenced, and the strains were phenotyped for specific growth rate and dose-response to toxic copper concentrations (6-12 µM Cu). An artificial evolution experiment was conducted by assembling 28 and 30 strains from the two locations separately and letting them evolve with, and without, toxic Cu stress of 8.65 µM, corresponding to the concentration that inhibits the specific growth rate of the reference S. marinoi strain RO5AC by 50% in acute toxicity tests (Andersson et al. 2020, DOI: 10.1016/j.aquatox.2020.105551). A recently developed 523 bp long strain-specific metabarcoding locus (Sm_C12W1: https://github.com/topel-research-group/Live2Tell) was used to track the selection process. This locus is located on contig 12 of S. marinoi, inside a pentatricopeptide repeat (PPR) region of gene Sm_t00009768-RA, which encodes an RNA-binding protein. The locus has 38 SNP positions amongst the 58 strains used in this study, and 110 unique alleles with 100% heterozygosity, including two triploid/aneuploid strains. The outcome was contrasted against strain selection models computed according to Andersson et al. 2022 (DOI: 10.1038/s41396-021-01092-9). The material included here consists of raw data, the R scripts that analyse them, and essential data produced by the analysis. Sequencing data are neither processed nor included here, but have been deposited and are available at NCBI under BioProject PRJNA939970.
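The 50% inhibition criterion mentioned above can be illustrated with a minimal Python sketch. It assumes the conventional exponential-growth definition of specific growth rate, µ = ln(N_end / N_start) / t, which the description does not spell out; the function names and cell densities are hypothetical and are not taken from the dataset or its R scripts.

```python
import math


def specific_growth_rate(n_start, n_end, days):
    """Specific growth rate (per day), assuming exponential growth."""
    if n_start <= 0 or n_end <= 0 or days <= 0:
        raise ValueError("densities and duration must be positive")
    return math.log(n_end / n_start) / days


def percent_inhibition(mu_control, mu_treated):
    """Reduction in growth rate relative to the control, in percent."""
    return 100.0 * (1.0 - mu_treated / mu_control)


# Hypothetical cell densities (cells per mL) over a 2-day assay.
mu_control = specific_growth_rate(1000, 16000, 2)  # no added copper
mu_copper = specific_growth_rate(1000, 4000, 2)    # copper-exposed culture

print(f"control: {mu_control:.3f} per day")
print(f"copper:  {mu_copper:.3f} per day")
print(f"inhibition: {percent_inhibition(mu_control, mu_copper):.1f} %")
```

With these made-up numbers the copper-exposed culture grows at half the control rate, which is the situation the 8.65 µM concentration is reported to produce for strain RO5AC.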
The amplicon sequencing data were analyzed as outlined in https://github.com/topel-research-group/Bamboozle/wiki/Bamboozle-Part-2:-Barcode-Quantification. For more detailed information, see the README.md files associated with each step of the analysis, briefly outlined below. Each of the four sections includes the necessary input data and can be run separately.
Barcodes
This is an analysis pipeline for the amplicon sequences from the selection experiment, using the hypervariable locus in S. marinoi. The locus was identified bioinformatically from whole-genome sequences of 55 strains of S. marinoi from two Baltic Sea locations. It was predicted to have at least one unique allele per strain, enabling tracking of evolution through selection on standing genetic diversity in an artificial evolution experiment (see Fig. above). Two barcode loci (Sm_C2W24 and Sm_C12W1) were sequenced in the experiment, but Sm_C12W1 had much more allelic diversity, so the majority of the analysis focuses on this locus (see Barcodes/Barcoding_C12W1/README.md for more information). This Git repository does not contain the bioinformatic sequence analyses; it starts after the raw reads have been trimmed, merged, and mapped back to the known allele sequences. Data from two pre-processing approaches are included: one based on Dada2 error correction (Barcodes/Barcoding_C12W1/C12W1_BBmergeDada2_input), and one that uses exact matches of merged amplicon sequences (Barcodes/Barcoding_C12W1/C12W1_abundances). The latter is the one used for the publication: Andersson et al. Strain-specific metabarcoding reveals rapid evolution of copper tolerance in populations of the coastal diatom Skeletonema marinoi, in prep. for Molecular Ecology.
The zip file Cu_evolution.zip contains all raw data, indexing information, R scripts, and README.md files needed to reproduce the analysis and plot the data. The documentation file README.md summarizes the contents of Cu_evolution.zip.
Key data from the analysis are provided as individual files, which are summarized in the documentation file Datafile_descriptions.md.
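To make the idea of strain-specific barcode quantification concrete, the following Python sketch turns per-allele read counts into relative strain abundances. It is an illustration only, not the Bamboozle pipeline or the R scripts in Cu_evolution.zip. It assumes that every known allele maps to exactly one strain and that reads matching no known allele are discarded; the function name, allele names, strain names, and counts are all hypothetical.

```python
from collections import defaultdict


def strain_frequencies(allele_counts, allele_to_strain):
    """Sum read counts per strain and normalise to relative abundances.

    allele_counts: dict of allele name -> number of exactly matching reads
    allele_to_strain: dict of allele name -> strain name (assumed one-to-one
        from allele to strain; a strain may carry several alleles)
    """
    totals = defaultdict(int)
    for allele, count in allele_counts.items():
        strain = allele_to_strain.get(allele)
        if strain is None:
            continue  # reads without a known allele are ignored
        totals[strain] += count
    assigned = sum(totals.values())
    if assigned == 0:
        return {}
    return {strain: count / assigned for strain, count in totals.items()}


# Hypothetical example: two diploid strains with two alleles each.
allele_to_strain = {
    "A1": "strain_1", "A2": "strain_1",
    "B1": "strain_2", "B2": "strain_2",
}
allele_counts = {"A1": 300, "A2": 300, "B1": 200, "B2": 200, "unknown": 50}

freqs = strain_frequencies(allele_counts, allele_to_strain)
for strain, freq in sorted(freqs.items()):
    print(f"{strain}: {freq:.2f}")
```

Comparing such frequencies between time points, with and without copper, is one simple way to see which strains are being selected; the published analysis should be consulted for the method actually used.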

This project investigates whether, and how, contemporary populations of the diatom species Skeletonema marinoi have evolved in response to mining pollution. The sampling sites are two inlets in the Baltic Sea: one, Gåsfjärden (VG: 57°34.35'N, 16°34.98'E), has been affected by mining pollution for ca. 400 years, while the other, Gropviken (GP: 58°19.92'N, 16°42.35'E), has not. Strains were isolated, the genomes of 55 individual strains were sequenced, and the strains were phenotypically characterized in terms of specific growth rate and toxic dose-response to copper (6-12 µM Cu). An artificial evolution experiment was carried out by assembling 28 and 30 strains from the two populations separately and letting them evolve with and without toxic Cu stress of 8.65 µM copper, corresponding to the concentration that inhibits the specific growth rate of the reference strain RO5AC by 50% in acute toxicity tests (Andersson et al. 2020, DOI: 10.1016/j.aquatox.2020.105551). A recently developed 523 bp long strain-specific metabarcoding locus (Sm_C12W1: https://github.com/topel-research-group/Live2Tell) was used to track the selection process. This locus is located on contig 12 of S. marinoi, inside a pentatricopeptide repeat (PPR) region of the gene Sm_t00009768-RA, which encodes an RNA-binding protein. The locus has 38 SNP positions among the 58 strains used in this study and 110 unique alleles with 100% heterozygosity, including two triploid/aneuploid strains. The outcome of the experiment is contrasted against a selection model computed according to Andersson et al. 2022 (DOI: 10.1038/s41396-021-01092-9). The data and analyses here include raw data and R scripts. Sequencing data are neither processed nor included, but are available on the NCBI website under BioProject PRJNA939970. The amplicon sequencing data were analyzed according to the description and code at https://github.com/topel-research-group/Bamboozle/wiki/Bamboozle-Part-2:-Barcode-Quantification. Code is bundled in the archive Cu_evolution.zip.
For more detailed information about the data and code, see the README.md files associated with each step of the analysis. The four analysis steps contain the relevant input data and can be run separately. See the English version of the data description for details and resources for reproducing the analysis.

Identifier
DOI https://doi.org/10.5878/7eww-g857
Metadata Access https://datacatalogue.cessda.eu/oai-pmh/v0/oai?verb=GetRecord&metadataPrefix=oai_ddi25&identifier=8f531b21b4a8928beb84c5f86e6f8935d9ece8c0067953fb7d14e796c8115cbb
Provenance
Creator Andersson, Björn
Publisher Swedish National Data Service; Svensk nationell datatjänst
Publication Year 2023
Rights Access to data through SND. Data are freely accessible.
OpenAccess true
Contact https://snd.gu.se
Representation
Language English
Discipline Natural Sciences