InterDB, a Prediction-Oriented Protein Interaction Database for C. elegans

  • Nicolas Thierry-Mieg
  • Laurent Trilling
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2066)

Abstract

Protein-protein interactions are critical to many biological processes, extending from the formation of cellular macromolecular structures and enzymatic complexes to the regulation of signal transduction pathways. With the availability of complete genome sequences, several groups have begun large-scale identification and characterization of such interactions, relying mostly on high-throughput two-hybrid systems. We collaborate with one such group, led by Marc Vidal, whose aim is the construction of a protein-protein interaction map for C. elegans. In this paper we first describe WISTdb, a database designed to store the interaction data generated in Marc Vidal’s laboratory. We then describe InterDB, a multi-organism prediction-oriented database of protein-protein interactions. We finally discuss our current approaches, based on inductive logic programming and on a data mining technique, for extracting predictive rules from the collected data.

Keywords

Association Rule; Frequent Itemset; Data Mining Technique; Inductive Logic Programming; Horn Clause
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Nicolas Thierry-Mieg
    • 1
  • Laurent Trilling
    • 1
  1. Laboratoire LSR-IMAG, Saint-Martin-d’Hères cedex, France
