
Analysis of Relevance and Redundance on Topoisomerase 2b (TOP2B) Binding Sites: A Feature Selection Approach

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10784)

Abstract

Topoisomerases are proteins that regulate the topology of DNA by introducing transient breaks to relax supercoiling. In this paper we focus on topoisomerases 2 (TOP2), which generate double-strand DNA breaks that, if inefficiently repaired, can seriously compromise genomic stability. It is therefore important to gain insight into the molecular processes involved in TOP2-DNA binding. To this end, we collected genomic and epigenomic information from publicly available high-throughput sequencing projects and systematically quantified it within experimentally measured TOP2 binding sites. We then applied feature selection techniques both to improve classification performance and to gain insight into the properties that may be of biological relevance. The results allowed us to identify a core set of predictive chromatin features that faithfully explain TOP2 binding.
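
As a rough illustration of what a relevance and redundancy analysis can look like, the sketch below ranks features by their absolute Pearson correlation with a binary bound/unbound label, then discards any feature that correlates at least as strongly with an already selected feature as it does with the label. It is a toy under stated assumptions, not the procedure used in the paper: the correlation measure, the `min_relevance` threshold, the function names, and the synthetic features (`mark_a`, `mark_a_copy`, `unrelated`) are all hypothetical.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def select_features(features, labels, min_relevance=0.2):
    """Greedy relevance/redundancy filter (illustrative only).

    features: dict mapping a feature name to its list of values per site.
    labels:   list of 0/1 class labels (e.g. unbound/bound), one per site.
    """
    # Relevance: how strongly each feature tracks the class label.
    relevance = {name: abs(pearson(col, labels)) for name, col in features.items()}
    # Keep only sufficiently relevant features, most relevant first.
    ranked = sorted(
        (name for name in relevance if relevance[name] >= min_relevance),
        key=relevance.get,
        reverse=True,
    )
    selected = []
    for name in ranked:
        # Redundancy: a kept feature already explains this one at least
        # as well as this one explains the label.
        redundant = any(
            abs(pearson(features[name], features[kept])) >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected, relevance


# Synthetic data: one informative signal, a near-duplicate of it, and pure noise.
rng = random.Random(0)
n_sites = 400
labels = [rng.randint(0, 1) for _ in range(n_sites)]
mark_a = [y + rng.gauss(0, 0.5) for y in labels]
mark_a_copy = [a + rng.gauss(0, 0.05) for a in mark_a]
unrelated = [rng.gauss(0, 1) for _ in range(n_sites)]

features = {"mark_a": mark_a, "mark_a_copy": mark_a_copy, "unrelated": unrelated}
selected, relevance = select_features(features, labels)
print("selected:", selected)
```

On this synthetic input the filter should keep one of the two near-identical informative features and drop both its duplicate and the noise column. That mirrors the idea of reducing a large feature set to a small, non-redundant core.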



Acknowledgements

This research was partly funded by the Ministry of Economy and the European Regional Development Fund under grant TIN2015-64776-C3-2-R (MINECO/FEDER).

Author information

Corresponding author

Correspondence to Federico Divina.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Martínez García, P.M., García Torres, M., Divina, F., Gómez Vela, F.A., Cortés-Ledesma, F. (2018). Analysis of Relevance and Redundance on Topoisomerase 2b (TOP2B) Binding Sites: A Feature Selection Approach. In: Sim, K., Kaufmann, P. (eds) Applications of Evolutionary Computation. EvoApplications 2018. Lecture Notes in Computer Science, vol 10784. Springer, Cham. https://doi.org/10.1007/978-3-319-77538-8_7

  • DOI: https://doi.org/10.1007/978-3-319-77538-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-77537-1

  • Online ISBN: 978-3-319-77538-8

  • eBook Packages: Computer Science, Computer Science (R0)
