Scripts used for data analysis in the manuscript: “Unraveling genetic load dynamics during biological invasion: insights from two invasive insect species”

These scripts were used for data analysis in the manuscript “Unraveling genetic load dynamics during biological invasion: insights from two invasive insect species”. Authors: Eric Lombaert, Aurélie Blin, Barbara Porro, Thomas Guillemaud, Julio S. Bernal, Gary Chang, Natalia Kirichenko, Thomas W. Sappington, Stefan Toepfer, and Emeline Deleury.

The scripts were applied to pool-seq (pooled sequencing) data. Raw reads have been deposited in the Sequence Read Archive (National Center for Biotechnology Information) under BioProject PRJNA1079689: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1079689

The scripts cover (i) trimming and mapping of raw reads, (ii) SNP calling across multiple populations, (iii) annotation of SNPs, (iv) polarisation of alleles, and (v) comparison of genetic load between populations.
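
As an illustration of steps (iv) and (v), the following minimal Python sketch shows one way polarised pool-seq read counts could be turned into a simple load proxy. It is not taken from the deposited scripts: the `PoolSite` layout, the function names, and the choice of mean derived allele frequency at putatively deleterious sites as the proxy are all assumptions made for this example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class PoolSite:
    """Read counts for one biallelic SNP in one pooled sample (hypothetical layout)."""
    ref_count: int     # reads supporting the reference allele
    alt_count: int     # reads supporting the alternate allele
    ancestral: str     # "ref" or "alt", as decided by a polarisation step
    deleterious: bool  # True if the derived allele is annotated as putatively deleterious


def derived_frequency(site: PoolSite) -> float:
    """Estimate the derived allele frequency from pooled read counts."""
    if site.ancestral not in ("ref", "alt"):
        raise ValueError("ancestral must be 'ref' or 'alt'")
    depth = site.ref_count + site.alt_count
    if depth == 0:
        raise ValueError("site has no read coverage")
    # The derived allele is whichever allele is not ancestral.
    derived_reads = site.alt_count if site.ancestral == "ref" else site.ref_count
    return derived_reads / depth


def load_proxy(sites: list[PoolSite]) -> float:
    """Mean derived allele frequency over putatively deleterious sites (assumed proxy)."""
    freqs = [derived_frequency(s) for s in sites if s.deleterious]
    if not freqs:
        raise ValueError("no putatively deleterious sites")
    return mean(freqs)


# Made-up counts for a single hypothetical population.
example_population = [
    PoolSite(ref_count=90, alt_count=10, ancestral="ref", deleterious=True),
    PoolSite(ref_count=20, alt_count=80, ancestral="alt", deleterious=True),
    PoolSite(ref_count=50, alt_count=50, ancestral="ref", deleterious=False),
]
example_load = load_proxy(example_population)
```

Computing the same proxy for each population would then allow a direct comparison, provided the same set of sites is used throughout.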

The scripts were successfully executed on a Debian 9.13 system with Linux kernel version 4.9.0-7-amd64. The processor was an Intel(R) Xeon(R) CPU E7540 at 2.00 GHz, with 24 physical cores and 48 threads across 4 sockets. The machine had 128 GB of RAM and a 2 TB hard drive for storage. Running all the scripts on this configuration takes from one to several weeks.

The main software tools used by the scripts are: FastQC v0.11.5 (Andrews, 2010), Trimmomatic v0.35 (Bolger et al., 2014), bwa-mem v0.7.15 (Li, 2013), SAMtools v1.15.1 (Li et al., 2009), freebayes v1.3.6 (Garrison & Marth, 2012), bcftools v1.13 (Danecek et al., 2021), SnpEff v5.0 (Cingolani et al., 2012), est-sfs v2.03 (Keightley & Jackson, 2018) and R v4.2.2 (R Core Team, 2021).
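
Because results can depend on tool versions, it may help to compare a local installation against the versions listed above before running anything. The Python sketch below is one possible way to do that and is not part of the deposited scripts. The `PINNED` table copies the version numbers from this description; the helper names and the sample version strings are assumptions made for illustration.

```python
import re
from typing import Optional

# Versions as listed in this description.
PINNED = {
    "FastQC": "0.11.5",
    "Trimmomatic": "0.35",
    "bwa-mem": "0.7.15",
    "SAMtools": "1.15.1",
    "freebayes": "1.3.6",
    "bcftools": "1.13",
    "SnpEff": "5.0",
    "est-sfs": "2.03",
    "R": "4.2.2",
}


def extract_version(text: str) -> Optional[str]:
    """Return the first dotted version number found in text, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    return match.group(0) if match else None


def matches_pin(tool: str, reported_text: str) -> bool:
    """Check whether a tool's reported version string matches the listed version."""
    return extract_version(reported_text) == PINNED[tool]
```

For example, `matches_pin("SAMtools", "samtools 1.15.1")` evaluates to `True`, while `matches_pin("bcftools", "bcftools 1.17")` evaluates to `False`. How each tool actually reports its version differs, so the text passed in has to be captured separately for each one.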

All the provided scripts have been executed for Diabrotica virgifera virgifera. The numbering at the beginning of each script is intended to facilitate chronological use.
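
Since the numeric prefix encodes the intended order, a plain alphabetical listing can be misleading once prefixes reach two digits (for instance, "10" sorts before "2"). The short Python sketch below sorts file names by their leading number instead. The file names in the usage example are hypothetical and do not reflect the actual contents of the archive.

```python
import re


def numeric_prefix(name: str) -> int:
    """Return the integer at the start of a file name."""
    match = re.match(r"(\d+)", name)
    if match is None:
        raise ValueError(f"no numeric prefix in {name!r}")
    return int(match.group(1))


def chronological(names: list[str]) -> list[str]:
    """Sort file names by numeric prefix, breaking ties alphabetically."""
    return sorted(names, key=lambda n: (numeric_prefix(n), n))


# Hypothetical file names, for illustration only.
ordered = chronological(["10_compare_load.R", "2_mapping.sh", "1_trimming.sh"])
```

Here `ordered` is `["1_trimming.sh", "2_mapping.sh", "10_compare_load.R"]`.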

Identifier
DOI https://doi.org/10.57745/ESQFDB
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/ESQFDB
Provenance
Creator Lombaert, Eric
Publisher Recherche Data Gouv
Contributor Lombaert, Eric; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2024
Funding Reference Agence nationale de la recherche ANR-19-CE02-0010
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact Lombaert, Eric (INRAE)
Representation
Resource Type Workflow; Dataset
Format application/zip
Size 912886
Version 1.0
Discipline Geosciences