Recurrent Deep Neural Networks for Nucleosome Classification

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 11925)

Abstract

Nucleosomes are the fundamental repeating units of chromatin. A nucleosome is a complex of eight histone proteins to which approximately 147–150 base pairs of DNA bind. Several biological studies have shown that the regulation of cell type-specific gene activities is influenced by nucleosome positioning. Bioinformatic studies have strengthened those results by providing evidence of sequence specificity in the DNA fragments of nucleosomes. In this work, we present a recurrent neural network that classifies nucleosomes from a feature representation of their sequences. In particular, we implement an architecture that stacks convolutional and long short-term memory layers, with the main purpose of avoiding the feature extraction and selection steps. We have computed classifications on eight datasets from three organisms of increasing genome complexity, from yeast to human. We have also studied the ability of the model trained on the most complex species to recognize nucleosomes of the other organisms.
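
To make the idea of stacking convolutional and long short-term memory layers concrete, the following is a minimal sketch of a forward pass in plain Python. It is not the authors' implementation: the one-hot input encoding, the filter count, the kernel width, the hidden size, and all function names (`one_hot`, `conv1d_relu`, `lstm_last_state`, `build_model`, `predict`) are assumptions chosen for illustration, and the weights are random and untrained, so the output score carries no biological meaning.

```python
import math
import random

ALPHABET = "ACGT"  # assumed encoding order; not specified in the abstract


def one_hot(seq):
    """Encode a DNA string as a list of 4-dimensional one-hot vectors.
    Symbols outside ACGT (e.g. N) become all-zero vectors."""
    return [[1.0 if base == letter else 0.0 for letter in ALPHABET]
            for base in seq.upper()]


def conv1d_relu(x, kernels, biases):
    """Valid 1D convolution along the sequence, followed by ReLU.
    x is L x C, kernels is F x K x C; the result is (L - K + 1) x F."""
    width = len(kernels[0])
    out = []
    for start in range(len(x) - width + 1):
        row = []
        for kernel, bias in zip(kernels, biases):
            total = bias
            for k in range(width):
                total += sum(w * v for w, v in zip(kernel[k], x[start + k]))
            row.append(max(0.0, total))
        out.append(row)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_last_state(x, params, hidden):
    """Run a single-layer LSTM over x (T x D) and return the final hidden state."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x_t in x:
        z = x_t + h  # concatenate current input and previous hidden state
        pre = {}
        for name, (weights, bias) in params.items():
            pre[name] = [sum(w * v for w, v in zip(weights[j], z)) + bias[j]
                         for j in range(hidden)]
        i = [sigmoid(v) for v in pre["i"]]    # input gate
        f = [sigmoid(v) for v in pre["f"]]    # forget gate
        o = [sigmoid(v) for v in pre["o"]]    # output gate
        g = [math.tanh(v) for v in pre["g"]]  # candidate cell update
        c = [f[j] * c[j] + i[j] * g[j] for j in range(hidden)]
        h = [o[j] * math.tanh(c[j]) for j in range(hidden)]
    return h


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def build_model(rng, filters=3, width=5, hidden=4):
    """Create random (untrained) weights; all sizes are hypothetical."""
    return {
        "kernels": [rand_matrix(rng, width, len(ALPHABET)) for _ in range(filters)],
        "biases": [0.0] * filters,
        "lstm": {name: (rand_matrix(rng, hidden, filters + hidden), [0.0] * hidden)
                 for name in "ifog"},
        "hidden": hidden,
        "dense": ([rng.uniform(-0.1, 0.1) for _ in range(hidden)], 0.0),
    }


def predict(model, seq):
    """Convolution -> LSTM -> sigmoid unit; returns a score in (0, 1)."""
    feats = conv1d_relu(one_hot(seq), model["kernels"], model["biases"])
    h = lstm_last_state(feats, model["lstm"], model["hidden"])
    w, b = model["dense"]
    return sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)


# Usage: score a made-up 148-base fragment with untrained weights.
model = build_model(random.Random(0))
print(predict(model, "ACGT" * 37))
```

The convolution operates directly on the encoded sequence and the recurrent layer consumes its output in sequence order, which is how such a stack can avoid a separate hand-crafted feature extraction step. A practical model would be trained with a deep learning framework rather than hand-written loops.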

Author information

Corresponding author

Correspondence to Giosuè Lo Bosco.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Amato, D., Di Gangi, M.A., Lo Bosco, G., Rizzo, R. (2020). Recurrent Deep Neural Networks for Nucleosome Classification. In: Raposo, M., Ribeiro, P., Sério, S., Staiano, A., Ciaramella, A. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2018. Lecture Notes in Computer Science, vol 11925. Springer, Cham. https://doi.org/10.1007/978-3-030-34585-3_11

  • DOI: https://doi.org/10.1007/978-3-030-34585-3_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-34584-6

  • Online ISBN: 978-3-030-34585-3

  • eBook Packages: Computer Science, Computer Science (R0)
