SFFS-MR: A Floating Search Strategy for GRNs Inference

  • Fabrício M. Lopes
  • David C. Martins Jr.
  • Junior Barrera
  • Roberto M. Cesar Jr.
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


An important problem in bioinformatics is the inference of gene regulatory networks (GRNs) from temporal expression profiles. In general, the main limitations faced by GRN inference methods are the small number of samples relative to the very high dimensionality of the data and the noisy nature of the expression measurements. Given these limitations, alternatives are needed to improve the accuracy of GRN inference. This work addresses the problem by presenting SFFS-MR, an alternative feature selection method that applies prior knowledge in its search strategy. The proposed search strategy is based on the SFFS algorithm, with the inclusion of multiple roots at the beginning of the search, which are defined by the best and worst single results of the SFS algorithm. The search space traversed is thus guided by these roots in order to find the predictor genes for a given target gene, especially to better identify genes presenting intrinsically multivariate prediction, without worsening the asymptotic computational cost of SFFS. Experimental results show that SFFS-MR provides better inference accuracy than SFS and SFFS, while maintaining robustness similar to that of both methods. In addition, SFFS-MR achieved 60% accuracy on network recovery after only 20 observations from a state space of size 2^20.
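The multiple-root idea described above can be illustrated with a small sketch. It is not the authors' implementation: it assumes binary expression values, uses empirical conditional entropy as the criterion, and replaces the floating search with a plain greedy forward search started from each root, omitting the backward step of SFFS for brevity. All function names, the toy data and the hidden noise variable `c` are hypothetical. The target `t` is the XOR of genes `a` and `b`, so neither predicts `t` alone, while a decoy gene `d` is weakly predictive on its own.

```python
from collections import Counter
from itertools import product
from math import log2


def conditional_entropy(samples, target, predictors):
    """Empirical H(target | predictors) in bits over the observed samples."""
    joint = Counter((tuple(s[p] for p in predictors), s[target]) for s in samples)
    marginal = Counter(tuple(s[p] for p in predictors) for s in samples)
    n = len(samples)
    return sum(c / n * log2(marginal[x] / c) for (x, _), c in joint.items())


def forward_search(samples, target, candidates, root, max_size):
    """Greedy forward selection (SFS-style) that starts from a given root gene."""
    subset = [root]
    best = conditional_entropy(samples, target, subset)
    while len(subset) < max_size:
        scored = [(conditional_entropy(samples, target, subset + [g]), g)
                  for g in candidates if g not in subset]
        if not scored:
            break
        score, gene = min(scored)
        if score >= best:  # no improvement: stop growing this branch
            break
        subset.append(gene)
        best = score
    return best, sorted(subset)


def multi_root_search(samples, target, candidates, max_size=2, n_roots=1):
    """Run the forward search from the best and the worst single genes."""
    ranked = sorted(candidates,
                    key=lambda g: conditional_entropy(samples, target, [g]))
    roots = set(ranked[:n_roots]) | set(ranked[-n_roots:])
    return min(forward_search(samples, target, candidates, r, max_size)
               for r in roots)


# Toy data: t = a XOR b; d copies t unless the hidden noise bit c is set.
samples = [{"a": a, "b": b, "d": (a ^ b) | c, "t": a ^ b}
           for a, b, c in product((0, 1), repeat=3)]
genes = ["a", "b", "d"]

sfs_like = forward_search(samples, "t", genes, root="d", max_size=2)
multi = multi_root_search(samples, "t", genes)
print("best-root only:", sfs_like)
print("best and worst roots:", multi)
```

In this toy case the search rooted only at the best single gene `d` keeps a residual entropy of roughly 0.69 bits, whereas the search that also starts from a worst single gene recovers the pair `a`, `b` with zero residual entropy. This is the kind of intrinsically multivariate predictor set the abstract refers to.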


SFS, SFFS, feature selection, inference, gene networks, pattern recognition, systems biology, bioinformatics



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Fabrício M. Lopes (1, 2)
  • David C. Martins Jr. (3)
  • Junior Barrera (4)
  • Roberto M. Cesar Jr. (2, 5)

  1. Federal University of Technology - Paraná, Brazil
  2. Institute of Mathematics and Statistics, University of São Paulo, Brazil
  3. Federal University of ABC, Brazil
  4. Faculty of Philosophy, Sciences and Letters of Ribeirão Preto, University of São Paulo, Brazil
  5. CTBE, Brazil
