Synthetic data for GAZEL-ADN blood-saliva comparison

This study sets out to establish the suitability of saliva-based whole-genome sequencing (WGS) by comparing it against blood-based WGS. To fully appraise the observed differences, we developed a novel pseudo-replication technique. We also investigated the potential of characterizing individual salivary microbiomes from non-human DNA fragments found in saliva. We observed that the majority of discordant genotype calls between blood and saliva fell into known regions of the human genome that are typically sequenced with low confidence and could be identified by quality control measures. Pseudo-replication demonstrated that the levels of discordance between blood- and saliva-derived WGS data were fully comparable to those expected between technical replicates, that is, if an individual's blood or saliva had been sequenced twice. Finally, we successfully sequenced salivary microbiomes in parallel with human genomes, as demonstrated by a comparison against the Human Microbiome Project.
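
As an illustration of what a blood-saliva discordance level means, the sketch below computes the fraction of sites called in both samples whose genotypes differ. It is a minimal sketch only, not the pipeline used in the study: the function name `discordance_rate`, the dictionary layout, and the toy genotypes are all hypothetical, and real data would be read from VCF files with a dedicated parser.

```python
def discordance_rate(calls_a, calls_b):
    """Fraction of sites called in both samples whose genotypes differ.

    Each argument maps a (chromosome, position) site to a pair of allele
    indices, or to None when the genotype is missing. This layout is an
    assumption made for the example.
    """
    shared = [
        site for site in calls_a
        if site in calls_b
        and calls_a[site] is not None
        and calls_b[site] is not None
    ]
    if not shared:
        raise ValueError("no sites called in both samples")
    # Genotypes are compared unphased, so (0, 1) and (1, 0) are concordant.
    mismatches = sum(
        1 for site in shared
        if sorted(calls_a[site]) != sorted(calls_b[site])
    )
    return mismatches / len(shared)


# Toy genotypes, invented for illustration only.
blood = {
    ("chr1", 1000): (0, 1),
    ("chr1", 2000): (1, 1),
    ("chr2", 500): (0, 0),
    ("chr2", 900): None,  # missing call, excluded from the comparison
}
saliva = {
    ("chr1", 1000): (1, 0),
    ("chr1", 2000): (0, 1),
    ("chr2", 500): (0, 0),
    ("chr2", 900): (0, 1),
}

rate = discordance_rate(blood, saliva)
print(f"discordance: {rate:.3f}")
```

Under the same assumptions, applying this function to two sequencing runs of the same tissue would give a technical-replicate baseline against which the blood-saliva value can be read.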

A synthetic data set has been generated that allows our principal results to be replicated without full disclosure of individual-level sequencing data. Read counts and relative abundances for the microbiome profiling analyses are also available.
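
For readers unfamiliar with the two microbiome quantities mentioned here, the following sketch shows how relative abundances follow from read counts: each taxon's count is divided by the sample total. The function name `relative_abundances` and the counts are hypothetical and do not reflect the column layout of the deposited tab-separated files.

```python
def relative_abundances(read_counts):
    """Convert per-taxon read counts for one sample into proportions."""
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("sample has no classified reads")
    return {taxon: count / total for taxon, count in read_counts.items()}


# Invented counts for a single saliva sample, for illustration only.
counts = {"Streptococcus": 600, "Prevotella": 300, "Veillonella": 100}

abundances = relative_abundances(counts)
for taxon, share in abundances.items():
    print(f"{taxon}\t{share:.2f}")
```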

Identifier
DOI https://doi.org/10.57745/MFIXFW
Related Identifier https://doi.org/10.1002/gepi.22386
Metadata Access https://entrepot.recherche.data.gouv.fr/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.57745/MFIXFW
Provenance
Creator Herzig, Anthony
Publisher Recherche Data Gouv
Contributor Herzig, Anthony
Publication Year 2023
Funding Reference French Ministry of Research PFMG2025
Rights etalab 2.0; info:eu-repo/semantics/openAccess; https://spdx.org/licenses/etalab-2.0.html
OpenAccess true
Contact Herzig, Anthony (Inserm)
Representation
Resource Type Dataset
Format text/x-vcard; text/tab-separated-values
Size 8269859015; 6740
Version 1.0
Discipline Life Sciences; Medicine