Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention

  • Conference paper
Algorithms for Computational Biology (AlCoB 2019)

Abstract

DNA methylation has been extensively linked to alterations in gene expression, playing a key role in the manifestation of multiple diseases, especially cancer. Hence, the sequence determinants of methylation and the relationship between methylation and expression are of great interest from a molecular biology perspective. Several models have been suggested to support the prediction of methylation status. These models, however, have two main limitations: (a) they are limited to specific CpG loci; and (b) they are not easily interpretable. We address these limitations using deep learning with attention. We produce a general model that predicts DNA methylation for a given sample in any CpG position based solely on the sample’s gene expression profile and the sequence surrounding the CpG. Depending on gene-CpG proximity, our model attains a Spearman correlation of up to 0.84 for thousands of CpG sites on two separate test sets of CpG positions and subjects (cancer and healthy samples). Importantly, our approach, especially the use of attention, offers a novel framework with which to extract valuable insights from gene expression data when combined with sequence information. We demonstrate this by linking several motifs and genes to methylation activity, including Nodal and Hand1. The code and trained weights are available at: https://github.com/YakhiniGroup/Methylation.
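The abstract reports performance as a Spearman correlation between predicted and measured methylation. As a minimal sketch of how such a score can be computed (not the authors' evaluation code), the snippet below ranks both vectors, averaging tied ranks, and takes the Pearson correlation of the ranks. The function names `average_ranks` and `spearman` and the methylation values are hypothetical and chosen only for illustration.

```python
import numpy as np


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied entries their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks inside each group of equal values by the group mean.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    rank_sums = np.bincount(inverse, weights=ranks)
    return (rank_sums / counts)[inverse]


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    rx = average_ranks(x)
    ry = average_ranks(y)
    rx = rx - rx.mean()
    ry = ry - ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))


# Made-up methylation levels (beta values in [0, 1]) for five CpG sites.
measured = [0.10, 0.40, 0.35, 0.80, 0.90]
predicted = [0.20, 0.30, 0.50, 0.70, 0.95]

rho = spearman(measured, predicted)
print(f"Spearman correlation: {rho:.2f}")
```

Because the score depends only on rank order, it rewards a model that orders CpG sites correctly even when the predicted values are not calibrated. In practice, `scipy.stats.spearmanr` provides the same statistic along with a p-value.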

Notes

  1. https://github.com/YakhiniGroup/Methylation.

Acknowledgments

We would like to thank the Yakhini Group, and specifically Leon Anavy and Oz Solomon, for valuable discussions and suggestions. We also thank Anthony Mathelier and colleagues from the Kristensen Group for important comments.

Author information

Corresponding author

Correspondence to Alona Levy-Jurgenson.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Levy-Jurgenson, A., Tekpli, X., Kristensen, V.N., Yakhini, Z. (2019). Predicting Methylation from Sequence and Gene Expression Using Deep Learning with Attention. In: Holmes, I., Martín-Vide, C., Vega-Rodríguez, M. (eds) Algorithms for Computational Biology. AlCoB 2019. Lecture Notes in Computer Science, vol 11488. Springer, Cham. https://doi.org/10.1007/978-3-030-18174-1_13

  • DOI: https://doi.org/10.1007/978-3-030-18174-1_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18173-4

  • Online ISBN: 978-3-030-18174-1

  • eBook Packages: Computer Science (R0)
