NGS data related to Adam et al.: On the accuracy of the epigenetic copy machine - comprehensive specificity analysis of the DNMT1 DNA methyltransferase

Expression and purification of DNMT1 for biochemical work Full-length murine DNMT1 (UniProtKB P13864) was overexpressed and purified as described (Adam, et al. 2020) using the Bac-to-Bac baculovirus expression system (Invitrogen). The expression construct for DNMT1 with a mutated CXXC domain was taken from Bashtrykov, et al. (2012).

Synthesis of long DNA substrates and methylation reactions with them The sequence of the 349 bp substrate with 44 CpG sites was taken from Adam, et al. (2020). It was used in unmethylated and hemimethylated form. Generation of the substrates and the methylation reactions were conducted as described (Adam, et al. 2020). In brief, the unmethylated DNA was first methylated in vitro by M.SssI (purified as described in Adam, et al. 2020) to introduce methylation at all CpG sites, or by M.HhaI (NEB) together with M.MspI (NEB) to introduce methylation at GCGC and CCGG sites. To obtain hemimethylated substrates, the upper strand of the methylated substrate was digested with lambda exonuclease, the single-stranded DNA was purified, and double-stranded hemimethylated DNA was generated by primer extension using Phusion® HF DNA Polymerase (Thermo). Methylation reactions were conducted using mixtures of unmethylated (UM), fully hemimethylated and patterned substrates (total DNA amount 200 ng in 20 µL) in methylation buffer (100 mM HEPES, 1 mM EDTA, 0.5 mM DTT, 0.1 mg mL⁻¹ BSA, pH 7.2 with KOH) containing 1 mM AdoMet. DNMT1 concentrations and incubation times are indicated in the text. Methylation was followed by bisulfite conversion using the EZ DNA Methylation-Lightning™ Kit (Zymo Research), library generation and Illumina paired-end sequencing (Novogene).
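The bisulfite read-out used for these substrates can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the short toy reference are hypothetical (the real substrate is 349 bp with 44 CpG sites), and the read is assumed to be already aligned to the reference in the orientation of the analysed strand. After bisulfite conversion an unmethylated C is read as T, whereas a methylated C is still read as C.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of one strand: unmethylated C reads as T."""
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def cpg_positions(reference):
    """Return the 0-based index of the C in every CpG of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def call_methylation(reference, read):
    """Map each CpG position to True (C retained) or False (C converted to T)."""
    calls = {}
    for pos in cpg_positions(reference):
        base = read[pos]
        if base == "C":
            calls[pos] = True
        elif base == "T":
            calls[pos] = False
        # Any other base is treated as a sequencing error and skipped.
    return calls


# Hypothetical toy reference, NOT the 349 bp substrate from the data set.
reference = "ATCGGACCGGTTGCGCAT"
# Assume the CpGs at positions 2 and 13 are methylated, the one at 7 is not.
read = bisulfite_convert(reference, [2, 13])
calls = call_methylation(reference, read)
```

Averaging such per-read calls over all reads of one barcode would give the methylation level of each CpG site in that experiment.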

Flanking sequence preference analysis with randomized single-site substrates Methylation reactions of the randomized substrates with DNMT1 were performed essentially as described (Adam, et al. 2020; Gao, et al. 2020). Briefly, single-stranded oligonucleotides containing a methylated, hydroxymethylated or unmethylated CpG site embedded in a 10 nucleotide random context were obtained from IDT and used to generate 67 bp long double-stranded DNA substrates by primer extension. Pools of these randomized substrates were then mixed in different combinations and methylated by DNMT1 in methylation buffer (100 mM HEPES, 1 mM EDTA, 0.5 mM DTT, 0.1 mg mL⁻¹ BSA, pH 7.2 with KOH) containing 1 mM AdoMet. DNMT1 concentrations and incubation times are indicated in the text. Methylation was followed by bisulfite conversion using the EZ DNA Methylation-Lightning™ Kit (Zymo Research), library generation and Illumina paired-end sequencing (Novogene).
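As an illustration of how the randomized flanks can be grouped, the following Python sketch enumerates the 4^4 = 256 possible NNCGNN contexts and extracts the context around a CpG site. The function names and the example fragment are hypothetical, and the sketch assumes the original (unconverted) sequence has already been reconstituted from the bisulfite reads.

```python
from itertools import product

BASES = "ACGT"


def all_nncgnn_contexts():
    """Enumerate every NNCGNN context (two random bases on each side of CG)."""
    return [f"{a}{b}CG{c}{d}" for a, b, c, d in product(BASES, repeat=4)]


def extract_flank(sequence, cpg_index, width=2):
    """Return the context around the CpG whose C sits at cpg_index.

    Returns None if there is no CG at that index or the flank would run
    past either end of the sequence.
    """
    start = cpg_index - width
    end = cpg_index + 2 + width
    if start < 0 or end > len(sequence):
        return None
    if sequence[cpg_index:cpg_index + 2] != "CG":
        return None
    return sequence[start:end]


contexts = all_nncgnn_contexts()
# Hypothetical fragment of a reconstituted substrate with its CpG at index 5.
fragment = "GATTACGTTAGC"
flank = extract_flank(fragment, 5)
```

Counting methylated and unmethylated reads per extracted context then yields one methylation level for each of the 256 flanks.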

Bioinformatics analysis NGS data sets were analyzed using a local instance of the Galaxy server as described (Adam, et al. 2020; Dukatz, et al. 2020; Dukatz, et al. 2022). In brief, for the long substrate, reads were trimmed, filtered by quality, mapped against the reference sequence and demultiplexed using barcodes specific for substrate type and experiment. Afterwards, methylation information was assigned and retrieved with custom scripts. For the randomized substrates, reads were trimmed and filtered according to the expected DNA size. The original DNA sequence was then reconstituted from the bisulfite-converted upper and lower strands to determine the average methylation state of both CpG sites and of the NNCGNN flanks using custom scripts. Methylation rates of the 256 NNCGNN sequence contexts in the competitive methylation experiments with the mixed single-site substrates were determined by fitting monoexponential reaction progress curves with variable time points using MATLAB scripts as described (Adam, et al. 2022). Pearson correlation coefficients were calculated in Excel using the CORREL function.
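The rate determination and the correlation analysis can be sketched in Python as follows. This is a stand-in under stated assumptions, not the MATLAB scripts or the Excel workflow used for the deposited data: it assumes the progress curve m(t) = 1 - exp(-k t), estimates k from the linearized form -ln(1 - m) = k t by least squares through the origin, and uses synthetic, noise-free data with hypothetical rate constants and time points.

```python
import math


def fit_rate_constant(times, fractions):
    """Estimate k in m(t) = 1 - exp(-k*t) from methylated fractions.

    Uses the linearized form -ln(1 - m) = k*t and a least-squares line
    through the origin. Points with m >= 1 or m < 0 are skipped because
    they cannot be log-transformed.
    """
    numerator = 0.0
    denominator = 0.0
    for t, m in zip(times, fractions):
        if not 0.0 <= m < 1.0:
            continue
        numerator += t * -math.log(1.0 - m)
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("no usable time points")
    return numerator / denominator


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Synthetic example: hypothetical time points and rate constants (arbitrary units).
times = [5.0, 15.0, 30.0, 60.0]
true_rates = {"AACGAA": 0.02, "TACGTT": 0.05, "GGCGCC": 0.01}
fitted = {
    ctx: fit_rate_constant(times, [1.0 - math.exp(-k * t) for t in times])
    for ctx, k in true_rates.items()
}
```

With real data a nonlinear fit of the untransformed curve would weight noisy points near full methylation less heavily than this linearized version; the sketch only shows the shape of the calculation.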

Structure of the deposited data Methylation data of the long substrates are placed in the “long DNA substrates” folder. Methylation data of the short single-site substrates with randomized flanks are placed in the “single sites substrates” folder. In both folders, an explanatory PDF file gives further information. Subfolders are arranged by enzyme (CXXC mutant or DNMT1 WT). For each enzyme, the different substrates or substrate mixtures are provided in separate subfolders.

References
Adam S, Bräcker J, Klingel V, Osteresch B, Radde NE, Brockmeyer J, Bashtrykov P, Jeltsch A. Flanking sequences influence the activity of TET1 and TET2 methylcytosine dioxygenases and affect genomic 5hmC patterns. Communications Biology 5, 92 (2022).
Adam S, Anteneh H, Hornisch M, Wagner V, Lu J, Radde NE, Bashtrykov P, Song J, Jeltsch A. DNA sequence-dependent activity and base flipping mechanisms of DNMT1 regulate genome-wide DNA methylation. Nature Commun 11, 3723 (2020).
Bashtrykov P, et al. Specificity of Dnmt1 for methylation of hemimethylated CpG sites resides in its catalytic domain. Chem Biol 19, 572-578 (2012).
Dukatz M, Dittrich M, Stahl E, Adam S, de Mendoza A, Bashtrykov P, Jeltsch A. DNA methyltransferase DNMT3A forms interaction networks with the CpG site and flanking sequence elements for efficient methylation. J Biol Chem 298(10), 102462 (2022).
Dukatz M, Adam S, Biswal M, Song J, Bashtrykov P, Jeltsch A. Complex DNA sequence readout mechanisms of the DNMT3B DNA methyltransferase. Nucleic Acids Res 48, 11495-11509 (2020).
Gao L, Emperle M, Guo Y, Grimm SA, Ren W, Adam S, Uryu H, Zhang ZM, Chen D, Yin J, Dukatz M, Anteneh H, Jurkowska RZ, Lu J, Wang Y, Bashtrykov P, Wade PA, Wang GG, Jeltsch A, Song J. Comprehensive Structure-Function Characterization of DNMT3B and DNMT3A Reveals Distinctive De Novo DNA Methylation Mechanisms. Nature Commun 11, 3355 (2020).

Data set 1 contains the combined rates of methylation by DNMT1 for all 256 NNCGNN sequences in hemimethylated (HM), hydroxymethylated (OH) and unmethylated (UM) context, together with the corresponding standard error of the mean (SEM) values. For details on how these numbers were determined, refer to the description in the corresponding publication.

Identifier
DOI https://doi.org/10.18419/darus-3334
Related Identifier IsCitedBy https://doi.org/10.1093/nar/gkad465
Metadata Access https://darus.uni-stuttgart.de/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=doi:10.18419/darus-3334
Provenance
Creator Jeltsch, Albert; Bashtrykov, Pavel; Adam, Sabrina
Publisher DaRUS
Contributor Jeltsch, Albert
Publication Year 2023
Funding Reference DFG JE 252/48 - 498335429
Rights CC BY 4.0; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/4.0
OpenAccess true
Contact Jeltsch, Albert (Universität Stuttgart)
Representation
Resource Type Raw DNA sequences extracted from Fastq NGS files; Bisulfite-seq of 5mC analysis; Dataset
Format application/octet-stream; text/plain; text/tab-separated-values; application/pdf
Size 2475776; 5594880; 5801544; 2659968; 5638900; 6342084; 2460672; 5738224; 5836698; 2439424; 5157408; 5825862; 2186624; 5579504; 5055624; 1862656; 4032480; 4650660; 2148224; 4554148; 4993884; 2109184; 4807852; 5075784; 1970944; 4258656; 4282866; 1220992; 3715536; 3905874; 29018221; 15002082; 23767550; 19030287; 23585540; 25747201; 23208530; 25558456; 24752803; 19374743; 24963831; 17831736; 21080; 4893312; 4933888; 5421312; 4323968; 4965760; 5486592; 4491900; 4699170; 4012974; 3891762; 4706352; 4886154; 2053120; 2863104; 3732864; 1880064; 2752384; 2118144; 2491904; 4847160; 5056472; 3251652; 3940596; 3972092; 5753104; 5684284; 4508640; 2672712; 3029670; 3048318; 88; 10177391; 10429407; 11486745; 21890884; 11410626; 7978088; 9726723; 21118757; 22621715; 7271865; 10450846; 16954582; 12846896; 16266058; 13828088; 10565501; 5728563; 11165309; 13568822; 10471886; 5299053; 15820219; 18038737; 19506026; 13167350; 33910300; 41842821; 4334080; 3426432; 3638912; 3288448; 3350272; 4180224; 49899; 90364; 4189374; 3535686; 3954384; 3476214; 4027590; 3418254
Version 2.1
Discipline Basic Biological and Medical Research; Biochemistry; Biology; Chemistry; Life Sciences; Medicine; Natural Sciences