Quasi-bicliques: Complexity and Binding Pairs

  • Xiaowen Liu
  • Jinyan Li
  • Lusheng Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5092)


Protein-protein interactions (PPIs) are among the most important mechanisms in cellular processes. To model protein interaction sites, recent studies have suggested first finding interacting protein group pairs in large PPI networks and then searching for conserved motifs within the protein groups to form interacting motif pairs. To account for noise and incompleteness in biological data, we propose using quasi-bicliques to find interacting protein group pairs. We investigate two new problems that arise from this task: the maximum vertex quasi-biclique problem and the maximum balanced quasi-biclique problem. We prove that both problems are NP-hard. This is a surprising result, as the widely known maximum vertex biclique problem is polynomial-time solvable [16]. We then propose a greedy heuristic algorithm for finding quasi-bicliques in PPI networks. Our experimental results on real data show that this algorithm outperforms a benchmark algorithm at identifying highly matched BLOCKS and PRINTS motifs.
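The abstract does not state the formal definition of a quasi-biclique. As a minimal sketch, assume the common tolerance-based formulation: a pair of vertex sets (X, Y) in a bipartite graph is a δ-quasi-biclique if every vertex in X is adjacent to at least (1 − δ)|Y| vertices of Y, and every vertex in Y is adjacent to at least (1 − δ)|X| vertices of X. The function name `is_quasi_biclique`, the adjacency-dictionary representation, and the parameter `delta` are illustrative assumptions, not taken from the paper. The check below only verifies a candidate pair; it is not the greedy heuristic the authors describe.

```python
from typing import Dict, Hashable, Set


def is_quasi_biclique(
    adj: Dict[Hashable, Set[Hashable]],
    left: Set[Hashable],
    right: Set[Hashable],
    delta: float,
) -> bool:
    """Check the assumed delta-quasi-biclique condition for (left, right).

    adj maps each vertex to the set of its neighbours and is expected to
    contain both directions of every edge of the bipartite graph.
    """
    if not left or not right:
        return False  # treat empty sides as invalid in this sketch

    # Minimum number of neighbours each vertex needs on the opposite side.
    need_in_right = (1 - delta) * len(right)
    need_in_left = (1 - delta) * len(left)

    left_ok = all(len(adj.get(u, set()) & right) >= need_in_right for u in left)
    right_ok = all(len(adj.get(v, set()) & left) >= need_in_left for v in right)
    return left_ok and right_ok


# Toy interaction data: protein b does not interact with protein y.
adj = {
    "a": {"x", "y"},
    "b": {"x"},
    "x": {"a", "b"},
    "y": {"a"},
}
exact = is_quasi_biclique(adj, {"a", "b"}, {"x", "y"}, delta=0.0)
tolerant = is_quasi_biclique(adj, {"a", "b"}, {"x", "y"}, delta=0.5)
```

With `delta=0.0` the condition reduces to an ordinary biclique, so the missing edge b–y makes the pair fail; with `delta=0.5` each vertex may miss up to half of the opposite side, so the same pair is accepted. This is how the tolerance lets a noisy or incomplete interaction set still count as a protein group pair.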


Bipartite Graph; Protein Group; Domain Pair; Motif Pair; Protein Interaction Data
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  1. Andreopoulos, B., An, A., Wang, X., Faloutsos, M., Schroeder, M.: Clustering by Common Friends Finds Locally Significant Proteins Mediating Modules. Bioinformatics 23(9), 1124–1131 (2007)
  2. Apweiler, R., Attwood, T.K., Bairoch, A., Bateman, A., Birney, E., Biswas, M., Bucher, P., Cerutti, L., Corpet, F., Croning, M.D., Durbin, R., Falquet, L., Fleischmann, W., Gouzy, J., Hermjakob, H., Hulo, N., Jonassen, I., Kahn, D., Kanapin, A., Karavidopoulou, Y., Lopez, R., Marx, B., Mulder, N.J., Oinn, T.M., Pagni, M., Servant, F., Sigrist, C.J., Zdobnov, E.M.: The InterPro Database, an Integrated Documentation Resource for Protein Families, Domains and Functional Sites. Nucleic Acids Research 29(1), 37–40 (2001)
  3. Attwood, T.K., Beck, M.E.: PRINTS-a Protein Motif Fingerprint Database. Protein Engineering, Design and Selection 7, 841–848 (1994)
  4. Bu, D., Zhao, Y., Cai, L., Xue, H., Zhu, X., Lu, H., Zhang, J., Sun, S., Ling, L., Zhang, N., Li, G., Chen, R.: Topological Structure Analysis of the Protein-Protein Interaction Network in Budding Yeast. Nucleic Acids Research 31(9), 2443–2450 (2003)
  5. Finn, R.D., Marshall, M., Bateman, A.: iPfam: Visualization of Protein-Protein Interactions in PDB at Domain and Amino Acid Resolutions. Bioinformatics 21(3), 410–412 (2005)
  6. Garey, M.R., Johnson, D.S.: Computers and Intractability, A Guide to the Theory of NP-Completeness. Freeman, San Francisco (1979)
  7. Grahne, G., Zhu, J.: Efficiently using Prefix-Trees in Mining Frequent Itemsets. In: Proceedings of the Workshop on Frequent Itemset Mining Implementations (FIMI) (2003)
  8. Hishigaki, H., Nakai, K., Ono, T., Tanigami, A., Takagi, T.: Assessment of Prediction Accuracy of Protein Function From Protein–Protein Interaction Data. Yeast 18(6), 523–531 (2001)
  9. Karp, R.M.: Reducibility among Combinatorial Problems. In: Miller, R.E., Thatcher, J.W. (eds.) Complexity of Computer Computations, pp. 85–103 (1972)
  10. Li, H., Li, J., Wang, L.: Discovering Motif Pairs at Interaction Sites from Protein Sequences on a Proteome-Wide Scale. Bioinformatics 22(8), 989–996 (2006)
  11. Morrison, J.L., Breitling, R., Higham, D.J., Gilbert, D.R.: A Lock-and-Key Model for Protein-Protein Interactions. Bioinformatics 22(16), 2012–2019 (2006)
  12. Peeters, R.: The Maximum Edge Biclique Problem is NP-Complete. Discrete Applied Mathematics 131(3), 651–654 (2003)
  13. Pietrokovski, S.: Searching Databases of Conserved Sequence Regions by Aligning Protein Multiple-Alignments. Nucleic Acids Research 24, 3836–3845 (1996)
  14. Sonnhammer, E.L.L., Eddy, S.R., Durbin, R.: Pfam: A Comprehensive Database of Protein Domain Families Based on Seed Alignments. Proteins: Structure, Function and Genetics 28, 405–420 (1997)
  15. Thomas, A., Cannings, R., Monk, N.A.M., Cannings, C.: On the Structure of Protein-Protein Interaction Networks. Biochemical Society Transactions 31(Pt 6), 1491–1496 (2003)
  16. Yannakakis, M.: Node Deletion Problems on Bipartite Graphs. SIAM Journal on Computing 10, 310–327 (1981)

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Xiaowen Liu (1, 3)
  • Jinyan Li (2)
  • Lusheng Wang (1)
  1. Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
  2. School of Computer Engineering, Nanyang Technological University, Singapore
  3. Department of Computer Science, University of Western Ontario, Canada
