Application of Data Mining Techniques to Protein-Protein Interaction Prediction

  • A. Kocatas
  • A. Gursoy
  • R. Atalay
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2869)


Protein-protein interactions are key to understanding biological processes and disease mechanisms in organisms. There is a vast amount of data on proteins waiting to be explored. In this paper, we describe the application of data mining techniques, namely association rule mining and ID3 classification, to the problem of predicting protein-protein interactions. We combine available interaction data with protein domain decomposition data to infer new interactions. Preliminary results show that our approach helps find plausible rules for understanding biological processes.
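
To make the idea of mining association rules from interaction and domain data concrete, here is a minimal sketch. It is not the authors' procedure, which this abstract does not spell out. The protein identifiers, domain names, thresholds, and the function `mine_domain_rules` are all hypothetical, and the rules are found by brute-force counting rather than an Apriori-style search. A rule `a -> b` reads: when one protein in an interacting pair carries domain `a`, its partner carries domain `b`.

```python
from collections import Counter
from itertools import product

# Hypothetical toy data: protein -> set of domains (names invented for illustration).
domains = {
    "P1": {"SH3", "Kinase"},
    "P2": {"ProlineRich"},
    "P3": {"SH3"},
    "P4": {"ProlineRich", "PDZ"},
    "P5": {"Kinase"},
}

# Hypothetical known interactions, given as unordered protein pairs.
interactions = [("P1", "P2"), ("P3", "P4"), ("P3", "P2"), ("P5", "P4")]


def mine_domain_rules(domains, interactions, min_support=0.5, min_confidence=0.6):
    """Return (a, b, support, confidence) tuples for domain rules a -> b."""
    n = len(interactions)
    pair_count = Counter()    # interactions with domain a on one side, b on the other
    domain_count = Counter()  # interactions in which domain a appears on either side

    for p, q in interactions:
        seen = set()
        for a, b in product(domains[p], domains[q]):
            # Interactions are unordered, so record both rule directions.
            seen.add((a, b))
            seen.add((b, a))
        pair_count.update(seen)
        domain_count.update(domains[p] | domains[q])

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n                  # fraction of all interactions
        confidence = count / domain_count[a]  # fraction of interactions involving a
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return sorted(rules)


rules = mine_domain_rules(domains, interactions)
for a, b, support, confidence in rules:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

On this toy data, the rules that pass both thresholds link `SH3` and `Kinase` to `ProlineRich`. A pair of proteins with no recorded interaction whose domains match such a rule would be a candidate for a predicted interaction.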


Association Rule, Domain Decomposition, Rule Mining, Minimum Support, Association Rule Mining
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • A. Kocatas (1)
  • A. Gursoy (1)
  • R. Atalay (2)
  1. Computer Engineering Department, Koç University, Istanbul, Turkey
  2. Department of Molecular Biology and Genetics, Bilkent University, Ankara, Turkey
