Abstract
This paper develops a novel statistical framework for inferring dependence between the distributions of variables in omics data. We propose building a dependence network with copula-based kernel dependency measures to reconstruct the underlying association network between the distributions. The resulting method, ISaaC, is applied to reverse-engineering gene regulatory networks and is competitive with several state-of-the-art gene regulatory network inference methods on the DREAM3 and DREAM4 Challenge datasets. An open-source implementation of ISaaC is available at https://bitbucket.org/HossamAlmeer/isaac/.
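As a rough illustration of the idea, and not the ISaaC implementation itself, the sketch below scores pairwise dependence by rank-transforming each pair of variables to its empirical copula and measuring the kernel distance (a biased maximum mean discrepancy estimate) between that copula and a uniform sample on the unit square, which stands in for the independence copula. The function names, the Gaussian kernel, the bandwidth `sigma=0.2`, and the toy data are all assumptions made for this example.

```python
import numpy as np

def empirical_copula(x):
    """Map each column of x (n_samples, n_vars) to normalized ranks in (0, 1]."""
    n = x.shape[0]
    # argsort of argsort yields 0-based ranks per column; shift to 1..n.
    ranks = np.argsort(np.argsort(x, axis=0), axis=0) + 1
    return ranks / n

def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (a**2).sum(axis=1)[:, None] + (b**2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def copula_mmd(x, y, sigma=0.2, seed=0):
    """Biased MMD^2 between the empirical copula of (x, y) and a uniform sample.

    Assumption: the uniform reference sample approximates the independence
    copula, so larger values suggest stronger dependence.
    """
    rng = np.random.default_rng(seed)
    z = empirical_copula(np.column_stack([x, y]))
    u = rng.uniform(size=z.shape)
    mmd2 = (gaussian_gram(z, z, sigma).mean()
            + gaussian_gram(u, u, sigma).mean()
            - 2.0 * gaussian_gram(z, u, sigma).mean())
    return max(mmd2, 0.0)  # guard against tiny negative round-off

def dependence_network(data, sigma=0.2, seed=0):
    """Symmetric matrix of pairwise copula-MMD scores for the columns of data."""
    p = data.shape[1]
    w = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            w[i, j] = w[j, i] = copula_mmd(data[:, i], data[:, j], sigma, seed)
    return w

# Toy data (hypothetical): variable 1 is a noisy monotone function of
# variable 0, while variable 2 is independent of both.
rng = np.random.default_rng(1)
g0 = rng.normal(size=300)
toy = np.column_stack([g0,
                       np.exp(g0) + 0.05 * rng.normal(size=300),
                       rng.normal(size=300)])
print(dependence_network(toy).round(3))
```

Because the scores depend only on ranks, they are unchanged by strictly monotone transformations of each variable, which is the main appeal of working on the copula. Thresholding or ranking the entries of the resulting matrix would give candidate edges for an association network.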
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Al Meer, H., Mall, R., Ullah, E., Megrez, N., Bensmail, H. (2018). ISaaC: Identifying Structural Relations in Biological Data with Copula-Based Kernel Dependency Measures. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2018. Lecture Notes in Computer Science, vol 10813. Springer, Cham. https://doi.org/10.1007/978-3-319-78723-7_6
DOI: https://doi.org/10.1007/978-3-319-78723-7_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78722-0
Online ISBN: 978-3-319-78723-7
eBook Packages: Computer Science (R0)