These scripts were used for the data analyses in the manuscript “Unraveling genetic load dynamics during biological invasion: insights from two invasive insect species”.
Authors: Eric Lombaert, Aurélie Blin, Barbara Porro, Thomas Guillemaud, Julio S. Bernal, Gary Chang, Natalia Kirichenko, Thomas W. Sappington, Stefan Toepfer and Emeline Deleury
The scripts were applied to pool-seq data. Raw reads have been deposited in the Sequence Read Archive of the National Center for Biotechnology Information under BioProject PRJNA1079689: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1079689
The scripts perform (i) trimming and mapping of raw reads, (ii) SNP calling across multiple populations, (iii) annotation of SNPs, (iv) polarisation of alleles, and (v) comparison of genetic load between populations.
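To illustrate what allele polarisation in step (iv) means for pool-seq read counts, here is a minimal Python sketch. It is not the code used in the repository: the function name, its arguments and the rule of discarding sites whose ancestral allele matches neither observed allele are all assumptions made for this example.

```python
def derived_allele_frequency(ref_count, alt_count, ref_allele, alt_allele, ancestral_allele):
    """Return the derived-allele frequency at a biallelic SNP in one pool.

    Hypothetical helper for illustration only. Returns None when the site
    cannot be polarised (no reads, or the ancestral allele matches neither
    observed allele).
    """
    depth = ref_count + alt_count
    if depth == 0:
        return None
    if ancestral_allele == ref_allele:
        # The alternate allele is the derived one.
        return alt_count / depth
    if ancestral_allele == alt_allele:
        # The reference allele is the derived one.
        return ref_count / depth
    return None


# Made-up counts: 30 reads carry A, 10 reads carry G.
print(derived_allele_frequency(30, 10, "A", "G", "A"))
print(derived_allele_frequency(30, 10, "A", "G", "G"))
```

The two calls show that the same read counts give a different derived-allele frequency depending on which allele is inferred to be ancestral, which is why polarisation precedes any comparison of genetic load.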
The scripts were successfully executed on a Debian 9.13 system with Linux kernel version 4.9.0-7-amd64. The processor was an Intel(R) Xeon(R) CPU E7540 at 2.00 GHz, with 24 physical cores and 48 threads across 4 sockets. The machine had 128 GB of RAM and a 2 TB hard drive for storage. Running all the scripts with this configuration takes one to several weeks.
The main software tools used by the scripts are: FastQC v0.11.5 (Andrews, 2010), Trimmomatic v0.35 (Bolger et al., 2014), bwa-mem v0.7.15 (Li, 2013), SAMtools v1.15.1 (Li et al., 2009), freebayes v1.3.6 (Garrison & Marth, 2012), bcftools v1.13 (Danecek et al., 2021), SnpEff v5.0 (Cingolani et al., 2012), est-sfs v2.03 (Keightley & Jackson, 2018) and R v4.2.2 (R Core Team, 2021).
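Before launching a pipeline that runs for weeks, it can help to check that the tools listed above are reachable. The Python sketch below is one possible check, not part of the repository. The executable names in ASSUMED_EXECUTABLES are assumptions: Trimmomatic and SnpEff are often run as Java jar files, so they may not appear on PATH even when installed.

```python
import shutil

# Assumed executable names; adjust them to the local installation.
ASSUMED_EXECUTABLES = [
    "fastqc", "trimmomatic", "bwa", "samtools", "freebayes",
    "bcftools", "snpEff", "est-sfs", "Rscript",
]


def missing_tools(executables):
    """Return the names from `executables` that are not found on PATH."""
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(ASSUMED_EXECUTABLES)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All assumed executables were found on PATH.")
```

This check only confirms that a tool can be found, not that its version matches the one listed above.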
All the provided scripts have been executed for Diabrotica virgifera virgifera. The number at the beginning of each script name indicates the order in which the scripts should be run.
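The numbering convention can be used to list scripts in their intended run order. The Python sketch below sorts file names by their leading number, so that 10 comes after 2 rather than after 1 as in a plain alphabetical sort. The file names in the example are hypothetical and do not correspond to the actual scripts in this repository.

```python
import re


def order_scripts(names):
    """Sort file names by their leading integer; names without one are skipped."""
    numbered = []
    for name in names:
        match = re.match(r"(\d+)", name)
        if match:
            # Sort key: numeric prefix first, then the full name to break ties.
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


# Hypothetical file names, for illustration only.
example = ["10_load.R", "2_map.sh", "1_trim.sh", "README"]
print(order_scripts(example))
```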