Inferring Gene Regulatory Networks from Time-Series Expressions Using Random Forests Ensemble
Abstract
Reconstructing gene regulatory networks (GRNs) from time-series expression data has become increasingly popular because time-course data contain temporal information about gene regulation. A typical microarray gene expression dataset contains expression levels for thousands of genes, but the number of time samples is usually very small. Inferring a GRN from such high-dimensional expression data therefore poses a major challenge. This paper proposes a tree-based ensemble of random forests in a multivariate auto-regression framework to tackle this problem. The efficacy of the proposed approach is demonstrated on synthetic time-series datasets and on Saccharomyces cerevisiae (yeast) microarray gene expression data with nine genes. Its performance is comparable to or better than that of GRNs inferred using dynamic Bayesian networks and an ordinary differential equation (ODE) model.
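The abstract does not give implementation details, so the following is only a minimal sketch of how random forests might be combined with a lag-1 multivariate auto-regression, not the authors' method. It assumes scikit-learn's `RandomForestRegressor`, a single time lag, and impurity-based feature importances as edge scores. The function name `infer_grn_scores` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def infer_grn_scores(expr, n_trees=200, seed=0):
    """Score candidate regulatory edges from a time-series expression matrix.

    expr: array of shape (T, G), rows are time points, columns are genes.
    Returns a (G, G) matrix where scores[i, j] is the importance of gene i
    at time t for predicting gene j at time t + 1 (assumed lag of one step).
    """
    expr = np.asarray(expr, dtype=float)
    n_genes = expr.shape[1]
    past, future = expr[:-1], expr[1:]  # lag-1 predictor/target pairs
    scores = np.zeros((n_genes, n_genes))
    for j in range(n_genes):
        # One forest per target gene; all genes at time t are predictors,
        # including the target's own past value (an assumption here).
        forest = RandomForestRegressor(
            n_estimators=n_trees, max_features="sqrt", random_state=seed
        )
        forest.fit(past, future[:, j])
        scores[:, j] = forest.feature_importances_
    return scores


if __name__ == "__main__":
    # Toy data: gene 1 at time t + 1 follows gene 0 at time t.
    rng = np.random.default_rng(0)
    data = rng.normal(size=(60, 4))
    for t in range(1, 60):
        data[t, 1] = 0.9 * data[t - 1, 0] + 0.05 * rng.normal()
    print(np.round(infer_grn_scores(data), 2))
```

Ranking the entries of the returned matrix and keeping those above a threshold would give a directed network. How the paper thresholds or aggregates the scores is not stated in the abstract.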
Keywords
Gene regulatory networks, time-series gene expression data, gene regulation, random forests, multivariate auto-regression, regression trees