TASB-AC: Term Annotated Sliding-Window-Based Boosting Associative Classifier for DNA Repair Gene Categorization

  • A. Vidya
  • Santosh Pattar
  • M. S. Roopa
  • K. R. Venugopal
  • L. M. Patnaik
Conference paper


Damage to DNA affects the biochemical pathways of the cell and, if not repaired, leads to aging. Several genes in the genome of an organism are responsible for DNA repair activities; however, not all of them are related to the biological aging process. In this paper, we develop a data mining technique to associate DNA repair genes with the aging process of the organism. The nucleotide sequences of the DNA repair genes are annotated with their respective biochemical properties and then converted to a transactional dataset. Biological features are then extracted from the dataset by constructing an associative classifier. To select significant gene features, we employ a sliding-window technique that divides the gene sequence into subsequences and thus increases their count. We evaluate the proposed technique extensively on human DNA repair genes together with their biochemical properties, such as gene ontology terms and protein–protein interactions. We also provide a biological interpretation of the features extracted by the classification technique.
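The pipeline outlined in the abstract (sliding-window subsequences, term annotation, transactional dataset, class association rules) can be illustrated with a minimal sketch. This is not the authors' implementation: the window size, step, the use of k-mers as transaction items, the class labels, and all function names are assumptions made for illustration only. The rule statistics shown are the standard support and confidence of a class association rule.

```python
from typing import FrozenSet, List, Sequence, Tuple

Transaction = Tuple[FrozenSet[str], str]  # (items, class label)


def sliding_windows(sequence: str, window: int, step: int) -> List[str]:
    """Split a sequence into overlapping subsequences of length `window`."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if len(sequence) < window:
        # Sequence shorter than one window: keep it as a single subsequence.
        return [sequence] if sequence else []
    return [sequence[i:i + window]
            for i in range(0, len(sequence) - window + 1, step)]


def to_transaction(subsequence: str, k: int, terms: Sequence[str]) -> FrozenSet[str]:
    """Build one transaction: the k-mers of a subsequence plus its annotation terms.

    Using k-mers as items is an assumption of this sketch, not a detail
    taken from the paper.
    """
    kmers = {subsequence[i:i + k] for i in range(len(subsequence) - k + 1)}
    return frozenset(kmers | set(terms))


def rule_stats(transactions: Sequence[Transaction],
               antecedent: FrozenSet[str],
               label: str) -> Tuple[float, float]:
    """Support and confidence of the class association rule antecedent -> label."""
    covered = [lab for items, lab in transactions if antecedent <= items]
    hits = sum(1 for lab in covered if lab == label)
    support = hits / len(transactions) if transactions else 0.0
    confidence = hits / len(covered) if covered else 0.0
    return support, confidence


# Toy usage: one short, made-up sequence annotated with one illustrative term.
windows = sliding_windows("ATGCGTAC", window=4, step=2)
dataset = [(to_transaction(w, k=2, terms=["GO:0006281"]), "aging") for w in windows]
print(windows)
print(rule_stats(dataset, frozenset({"GC"}), "aging"))
```

In this toy run a single sequence yields three subsequences, so one gene contributes three transactions instead of one, which is how windowing raises the number of instances available for rule mining.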


Keywords: Associative classifier, DNA repair genes, Gene-document, Rule pruning, Sliding window, Subsequence



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  • A. Vidya (1)
  • Santosh Pattar (2)
  • M. S. Roopa (2)
  • K. R. Venugopal (2)
  • L. M. Patnaik (3)
  1. Department of Information Science and Engineering, Vivekananda Institute of Technology, Bangalore, India
  2. Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore University, Bangalore, India
  3. Department of Computer Science and Automation, Indian Institute of Science, Bangalore, India
