Data for Phaeoexplorer publication: Evolutionary genomics of the emergence of brown algae as key components of coastal ecosystems

The Phaeoexplorer project sequenced 60 genomes corresponding to 44 brown algal and sister species. This dataset provides supplementary information on the initial annotation of the Phaeoexplorer genomes and on multiple analyses of the genome data. It includes:

- presubmission (v0) versions of the Phaeoexplorer genome annotation (GFF) files (GFF_v0.tar.gz)
- presubmission (v0) genome-wide predicted proteomes as FASTA files (Proteomes_v0.tar.gz)
- de novo transcriptome assemblies for the Phaeoexplorer species, from RNA-seq data assembled with Trinity or rnaSPAdes (de-novo-transcriptomes.tar.gz)
- RepeatMasker analyses of repeat sequences (RepeatMasker.tar.gz)
- alignment files used to generate a phylogenetic tree for the Phaeoexplorer species (PhylogeneticTree.tar.gz)
- alignments used to build a DensiTree specifically for Ectocarpus species (Microevolution_Ectocarpus.tar.gz)
- an OrthoFinder-based analysis of shared orthologues (Orthogroups.tar.gz)
- a Dollo-logic-based analysis of orthogroup gain and loss during evolution (Dollo_analysis.tar.gz)
- a phylostratigraphy analysis of brown algal genes (Phylostratigraphy.tar.gz)
- an analysis of protein functional domain fissions and fusions (CompositeGenes.tar.gz)
- InterProScan analyses of protein domains (InterProScan.tar.gz)
- HECTAR predictions of protein subcellular localisations (Hectar.tar.gz)
- eggNOG output providing information about predicted protein functions (eggNOG.tar.gz)
- RNA-seq-based data on gene expression levels (mRNAexpression.tar.gz)
- results of a search for genes acquired via horizontal gene transfer (HGT.tar.gz)
- analyses of intron conservation across genomes (Introns_conservation.tar.gz)
- an analysis of tandem gene duplications (Tandemely_duplicated_genes.tar.gz)
- comparisons of CDS size with the Ectocarpus reference genome, used to evaluate gene model completeness (CDS_size.tar.gz)
- a DESeq2 analysis of differential gene expression between the sporophyte and gametophyte generations of several brown algal species (DEG_LifeCycle.tar.gz)
- information about orthogroups selected to analyse the effects of morphological complexity and life cycle structure on gene evolution (Genes_selection.tar.gz)

Each individual dataset contains a README file explaining its content. Detailed information about the methodology used for each analysis can be found in the Methods section of the manuscript preprint (https://doi.org/10.1101/2024.02.19.579948). Most of these analyses and datasets can also be accessed via the Phaeoexplorer website (https://phaeoexplorer.sb-roscoff.fr/).
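Since each archive is described as shipping with its own README, it can be convenient to read that file before unpacking anything. The following is a minimal sketch using only the Python standard library. It assumes that the README is a regular file whose name starts with "README" and that it is UTF-8 text; the internal layout of the archives is not documented here, so both points are assumptions, and the helper names `list_readmes` and `read_member_text` are hypothetical.

```python
import tarfile
from pathlib import PurePosixPath


def list_readmes(archive_path):
    """Return member names whose file name starts with 'README' (assumed naming)."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile()
            and PurePosixPath(member.name).name.upper().startswith("README")
        ]


def read_member_text(archive_path, member_name, encoding="utf-8"):
    """Read one archive member as text without extracting the archive to disk."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        handle = archive.extractfile(member_name)
        if handle is None:
            # extractfile returns None for members that are not regular files
            raise ValueError(f"{member_name} is not a regular file")
        with handle:
            # errors="replace" guards against a README that is not valid UTF-8
            return handle.read().decode(encoding, errors="replace")
```

For example, after downloading Hectar.tar.gz, `list_readmes("Hectar.tar.gz")` would list any candidate README members, and `read_member_text` can then print the one of interest. Note that listing members requires scanning the whole compressed archive, which may take a while for the larger files.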

Identifier
DOI https://doi.org/10.57745/9U1J85
Related Identifier IsCitedBy https://doi.org/10.1101/2024.02.19.579948
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/9U1J85
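The Metadata Access link above is an OAI-PMH GetRecord request. As a rough sketch, the same URL can be assembled and fetched with the Python standard library. The endpoint, verb, metadata prefix and identifier are taken from the link itself; the function names `build_getrecord_url` and `fetch_record` are hypothetical, and the structure of the returned XML is not described in this record, so the sketch only parses it into an element tree.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint copied from the Metadata Access link in this record.
OAI_ENDPOINT = "https://entrepot.recherche.data.gouv.fr/oai"


def build_getrecord_url(doi, metadata_prefix="oai_datacite"):
    """Build an OAI-PMH GetRecord URL for a dataset DOI."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": f"doi:{doi}",
    }
    # safe=":/" keeps the identifier readable, matching the link in this record
    return f"{OAI_ENDPOINT}?{urlencode(params, safe=':/')}"


def fetch_record(url, timeout=30):
    """Download the record and parse it as XML (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return ET.fromstring(response.read())
```

Calling `build_getrecord_url("10.57745/9U1J85")` reproduces the Metadata Access link for this dataset; passing the result to `fetch_record` returns the root XML element of the response.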
Provenance
Creator Godfroy, Olivier; Denoeud, France; Cruaud, Corinne; Heesch, Svenja; Nehr, Zofia; Tadrent, Nachida; Couloux, Arnaud; Brillet-Guéguen, Loraine; Delage, Ludovic; Mckeown, Dean; Motomura, Taizo; Sussfeld, Duncan; Fan, Xiao; Mazéas, Lisa; Terrapon, Nicolas; Barrera-Redondo, Josué; Petroll, Romy (ORCID: 0000-0003-0165-982X); Reynes, Lauric; Choi, Seok-Wan; Jo, Jihoon; Uthanumallian, Kavitha; Bogaert, Kenny; Duc, Céline; Ratchinski, Pélagie; Lipinska, Agnieszka; Noel, Benjamin; Murphy, Eleanor A.; Lohr, Martin; Khatei, Ananya; Hamon-Giraud, Pauline; Vieira, Christophe; Akerfors, Svea Sanja; Akita, Shingo; Avia, Komlan; Badis, Yacine; Barbeyron, Tristan; Belcour, Arnaud; Berrabah, Wahiba; Blanquart, Samuel; Bouguerba-Collin, Ahlem; Bringloe, Trevor; Cattolico, Rose Ann; Cormier, Alexandre; Cruz de Carvalho, Helena; Dallet, Romain; De Clerck, Olivier; Debit, Ahmed; Denis, Erwan; Destombe, Christophe; Dinatale, Erica; Dittami, Simon; Drula, Elodie; Faugeron, Sylvain; Got, Jeanne; Graf, Louis; Groisillier, Agnès (ORCID: 0000-0002-5358-923X); Guillemin, Marie-Laure; Harms, Lars; Hatchett, William John; Henrissat, Bernard; Hoarau, Galice; Jollivet, Chloé; Jueterbock, Alexander; Kayal, Ehsan; Kogame, Kazuhiro; Le Bars, Arthur; Leblanc, Catherine; Ley, Ronja; Liu, Xi; Lopez, Pascal Jean; Lopez, Philippe; Manirakiza, Eric; Massau, Karine; Mauger, Stéphane; Mest, Laetitia; Michel, Gurvan; Monteiro, Catia; Nagasato, Chikako; Nègre, Delphine; Pelletier, Eric; Phillips, Naomi; Potin, Philippe; Rensing, Stefan A. (ORCID: 0000-0002-0225-873X); Rousselot, Ellyn; Rousvoal, Sylvie; Schroeder, Declan; Scornet, Delphine; Siegel, Anne; Tirichine, Leila; Tonon, Thierry; Valentin, Klaus; Verbruggen, Heroen; Weinberger, Florian; Wheeler, Glen; Kawai, Hiroshi; Peters, Akira F. (ORCID: 0000-0001-5332-199X); Yoon, Hwan Su; Hervé, Cécile; Ye, Naihao; Bapteste, Eric; Valero, Myriam; Markov, Gabriel V.; Corre, Erwan; Coelho, Susana M.; Wincker, Patrick; Aury, Jean-Marc; Cock, J. Mark
Publisher Recherche Data Gouv
Contributor COCK, Jeremy Mark; Godfroy, Olivier; Centre national de la recherche scientifique; Genoscope; Entrepôt-Catalogue Recherche Data Gouv
Publication Year 2024
Funding Reference France Genomique ANR-10-INBS-09 ; European Research Council 638240 ; Laoshan Laboratory grants LSKJ202203801 ; Laoshan Laboratory grants LSKJ202203204 ; Taishan Scholars Program and Talent Projects of Distinguished Scientific Scholars in Agriculture ; CNRS international research network DABMA 00022 ; Agence Nationale de la Recherche ANR-19-CE20-0028-01 ; Agence Nationale de la Recherche ANR-20-CE44-0011 ; Agence Nationale de la Recherche ANR-22-CE20-0025 ; Agence Nationale de la Recherche ANR-20-CE43-0013 ; Agence Nationale de la Recherche ANR-23-CE20-0048-01 ; National Research Foundation of Korea 2022R1A2B5B03002312 ; National Research Foundation of Korea 2022R1A5A1031361 ; Pays de la Loire-Nantes Métropole, Connect Talent EpiAlg Région ; Région Pays de la Loire, Etoiles Montantes M-EpiCC ; MITI Algometabionte ; CNRS ; Sorbonne University
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact COCK, Jeremy Mark (CNRS); Godfroy, Olivier (CNRS)
Representation
Resource Type Dataset
Format application/gzip
Size 8295789; 3001583; 11937355; 3240531067; 4733167; 178701287; 3229; 109639186; 23544291; 221745658; 1326761329; 2710754; 1842964; 55214139; 24005853; 46812; 41765885; 274519329; 2185084903; 192677
Version 1.0
Discipline Life Sciences; Medicine