On the statistical significance of protein complex
Statistical validation of predicted complexes is a fundamental issue in proteomics and bioinformatics. The goal is to measure the statistical significance of each predicted complex as a p-value. Surprisingly, this issue has not received much attention in the literature. To our knowledge, only a few research efforts have been made in this direction.
In this article, we propose a novel method for calculating the p-value of a predicted complex. The null hypothesis is that there is no difference between the number of edges in the target protein complex and the number expected under the random null model. In addition, we assume that a true protein complex must be a connected subgraph. Based on this null hypothesis, we present an algorithm to compute the p-value of a given predicted complex.
We test our method on five benchmark data sets to evaluate its effectiveness.
The experimental results show that our method outperforms state-of-the-art algorithms in assessing the statistical significance of candidate protein complexes.
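To make the edge-count null hypothesis concrete, the following is a minimal sketch and not the algorithm proposed in this article. It assumes a simple Erdős–Rényi null model in which each pair of proteins in the complex is linked independently with a background probability, and it ignores the connectivity constraint that the article imposes. The function name `edge_count_p_value` and its parameters are hypothetical, chosen only for illustration.

```python
from math import comb


def edge_count_p_value(n_vertices: int, n_edges: int, edge_prob: float) -> float:
    """Upper-tail probability P(X >= n_edges) for X ~ Binomial(C(n_vertices, 2), edge_prob).

    Assumption (not from the article): the null model is an Erdos-Renyi
    random graph, and the connectivity requirement is ignored.
    """
    n_pairs = comb(n_vertices, 2)  # number of possible edges among the vertices
    if not 0 <= n_edges <= n_pairs:
        raise ValueError("n_edges must lie between 0 and C(n_vertices, 2)")
    if not 0.0 <= edge_prob <= 1.0:
        raise ValueError("edge_prob must lie in [0, 1]")
    # Sum the binomial probability mass from the observed edge count upward.
    return sum(
        comb(n_pairs, k) * edge_prob**k * (1.0 - edge_prob) ** (n_pairs - k)
        for k in range(n_edges, n_pairs + 1)
    )


# Hypothetical example: a candidate complex of 5 proteins with 8 internal
# edges, in a network whose assumed background edge density is 0.1.
p_value = edge_count_p_value(n_vertices=5, n_edges=8, edge_prob=0.1)
print(f"p-value under the assumed null model: {p_value:.3e}")
```

A small p-value here only says that the candidate is denser than the assumed background; restricting the null model to connected subgraphs, as the article does, changes the reference distribution and therefore the resulting p-value.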
Keywords: predicted complex; statistical significance testing; subgraph mining; community detection
This work was partially supported by the National Natural Science Foundation of China (No. 61572094) and the Fundamental Research Funds for the Central Universities of China (Nos. DUT2017TB02 and DUT14QY07). Additionally, we thank Mr. Ben Teng and Dr. Xiuli Ma for their academic support.