Data Mining by Navigation – An Experience with Systems Biology

  • Amarnath Gupta
  • Michael Baitaluk
  • Animesh Ray
  • Aditya Bagchi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5909)

Abstract

This paper proposes a navigational method for data mining that collects evidence from diverse data sources. Because the representation and even the semantics of data elements differ widely from one data source to another, consolidating the data under a single platform is not cost-effective. Instead, the paper proposes mining in steps, where knowledge gathered in one step or from one data source is transferred to the next step or the next data source, exploiting a distributed environment. This incremental mining process ultimately leads to the desired result. The work has been carried out entirely in the domain of systems biology, and the paper indicates how the process can be followed in other application areas as well.

Keywords

Graph Mining, Phenyl Ethanolamine, Common Transcription Factor, Distribute Data Mining, OMIM Database
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Amarnath Gupta (1)
  • Michael Baitaluk (1)
  • Animesh Ray (2)
  • Aditya Bagchi (3)
  1. San Diego Supercomputer Center, Univ. of California San Diego, La Jolla, USA
  2. Keck Graduate Institute, Claremont, USA
  3. Indian Statistical Institute, Kolkata, India
