Automatic Parameter Learning for Multiple Network Alignment

  • Jason Flannick
  • Antal Novak
  • Chuong B. Do
  • Balaji S. Srinivasan
  • Serafim Batzoglou
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)

Abstract

We developed Græmlin 2.0, a new multiple network aligner with (1) a novel scoring function that can use arbitrary features of a multiple network alignment, such as protein deletions, protein duplications, protein mutations, and interaction losses; (2) a parameter learning algorithm that uses a training set of known network alignments to learn parameters for our scoring function and thereby adapt it to any set of networks; and (3) an algorithm that uses our scoring function to find approximate multiple network alignments in linear time.

We tested Græmlin 2.0’s accuracy on protein interaction networks from IntAct, DIP, and the Stanford Network Database. We show that, on each of these datasets, Græmlin 2.0 has higher sensitivity and specificity than existing network aligners. Græmlin 2.0 is available under the GNU public license at http://graemlin.stanford.edu.
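The abstract's scoring and learning ideas can be illustrated with a minimal sketch. It assumes that the score is a weighted sum of alignment features and that weights are adjusted with a perceptron-style update toward a known alignment; the abstract specifies neither, so both are assumptions and not necessarily the method the paper uses. The feature names and counts are hypothetical, loosely following the events listed above.

```python
from typing import Dict

Features = Dict[str, float]


def score(weights: Features, features: Features) -> float:
    """Assumed linear score: sum of weight * feature value."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def perceptron_update(weights: Features, known: Features,
                      predicted: Features, rate: float = 0.1) -> Features:
    """Shift weights toward the known alignment's features and away
    from the predicted alignment's features (perceptron-style step)."""
    updated = dict(weights)
    for name in set(known) | set(predicted):
        diff = known.get(name, 0.0) - predicted.get(name, 0.0)
        updated[name] = updated.get(name, 0.0) + rate * diff
    return updated


# Hypothetical feature counts for one training example.
weights = {"protein_deletion": 0.0, "protein_duplication": 0.0, "interaction_loss": 0.0}
known = {"protein_deletion": 1.0, "protein_duplication": 0.0, "interaction_loss": 1.0}
predicted = {"protein_deletion": 3.0, "protein_duplication": 1.0, "interaction_loss": 4.0}

new_weights = perceptron_update(weights, known, predicted)
```

After this single step the known alignment scores higher than the predicted one under `new_weights`, which is the behavior a training procedure of this kind aims for across the whole training set.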

Keywords

Equivalence Class; Protein Interaction Network; Alignment Algorithm; Multiple Network; Edge Deletion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jason Flannick (1)
  • Antal Novak (1)
  • Chuong B. Do (1)
  • Balaji S. Srinivasan (2)
  • Serafim Batzoglou (1)

  1. Department of Computer Science, Stanford University, Stanford, USA
  2. Department of Statistics, Stanford University, Stanford, USA
