Sequence cross-references and taxonomic lineage for glycoside hydrolase family 19

The Glycoside Hydrolase 19 Engineering Database (GH19ED) contains information on protein sequences and structures of glycoside hydrolases from family 19. This dataset lists cross-references to the National Center for Biotechnology Information (NCBI), cross-references to the Protein Data Bank (PDB) and the taxonomic lineage for each sequence entry in the GH19ED.

The tab-separated tabular file comprises nine columns: (1) the sequence identifier from the GH19ED, integer (Sequence_id), (2) the protein sequence accessions from the NCBI, semicolon-separated (NCBI_accessions), (3) the PDB accessions, semicolon-separated (PDB_accessions), (4) the name of the source or source organism (Source_name), (5) the NCBI taxonomy identifier for the source (NCBI_taxonomy_id), (6) the taxonomic lineage from the lowest to the highest rank, as inferred from NCBI taxonomy (Lineage), (7) the "protein" identifier from the GH19ED, integer (Protein_id), (8) the "homologous family" (or group) identifier from the GH19ED, integer (Homologous_family_id), (9) the "superfamily" (or subfamily) identifier from the GH19ED, integer (Superfamily_id). For sequence entries assigned to more than one source organism name, only the first taxonomic lineage found in the GH19ED is listed.
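The column layout described above can be read with a few lines of Python. The following is a minimal sketch, not part of the dataset itself. It assumes that the file starts with a header row using the column names given in parentheses above, and that empty accession fields are simply left blank; neither is confirmed by this description. The sample row is invented placeholder data, and the function name parse_gh19ed_rows is hypothetical.

```python
import csv
import io

# Columns described as integers and as semicolon-separated lists.
INT_COLUMNS = ("Sequence_id", "Protein_id", "Homologous_family_id", "Superfamily_id")
LIST_COLUMNS = ("NCBI_accessions", "PDB_accessions")


def parse_gh19ed_rows(handle):
    """Yield one dict per sequence entry from a tab-separated handle.

    Assumption: the first line is a header row with the column names
    listed in the dataset description.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        for name in LIST_COLUMNS:
            # Split on semicolons and drop empty items, so a blank field
            # becomes an empty list.
            row[name] = [item.strip() for item in row[name].split(";") if item.strip()]
        yield row


# Placeholder data for illustration only; these are not real GH19ED values.
SAMPLE = (
    "Sequence_id\tNCBI_accessions\tPDB_accessions\tSource_name\t"
    "NCBI_taxonomy_id\tLineage\tProtein_id\tHomologous_family_id\tSuperfamily_id\n"
    "1\tACC_0001;ACC_0002\t\tExample organism\t0\tplaceholder lineage\t10\t3\t1\n"
)

rows = list(parse_gh19ed_rows(io.StringIO(SAMPLE)))
print(rows[0]["NCBI_accessions"])
```

To read the downloaded file instead of the sample, pass an open file handle, for example open(path, newline="", encoding="utf-8"), where path is the local location of the file. NCBI_taxonomy_id and Lineage are left as strings here because the description does not state how missing values or lineage ranks are delimited.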

Identifier
DOI https://doi.org/10.18419/darus-1163
Metadata Access https://darus.uni-stuttgart.de/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18419/darus-1163
Provenance
Creator Buchholz, Patrick C. F.
Publisher DaRUS
Contributor Pleiss, Jürgen
Publication Year 2021
Rights CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Contact Pleiss, Jürgen (Universität Stuttgart)
Representation
Resource Type Dataset
Format text/tab-separated-values
Size 5239878
Version 1.0
Discipline Life Sciences; Medicine