
An Effective Biclustering Algorithm for Time-Series Gene Expression Data

  • Conference paper
Machine Learning and Cybernetics (ICMLC 2014)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 481)

Abstract

Biclustering is a useful tool for analyzing massive gene expression data: it clusters the rows and columns of the data matrix simultaneously to find subsets of genes that are coherently expressed under subsets of conditions. In time-series gene expression data in particular, it is meaningful to restrict biclusters to contiguous time points that exhibit coherent evolutions. In this paper, the BCCC-Bicluster is proposed as an extension of the CCC-Bicluster, and an algorithm based on frequent sequential pattern mining is proposed to find all maximal BCCC-Biclusters. A newly defined Frequent-Infrequent Tree-Array (FITA) is constructed to speed up traversal, with pruning strategies derived from the Apriori property to avoid redundant search. For further efficiency, the bitwise XOR operation is applied to capture identical or opposite contiguous patterns between two rows. The algorithm is tested on yeast microarray data. Experimental results show that it finds all embedded BCCC-Biclusters, which are shown to reveal significant GO terms involved in biological processes.
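
The XOR idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes each row has already been discretized into a two-symbol pattern ('U' for up, 'D' for down), and the function names `encode` and `contiguous_runs` and the `min_len` parameter are hypothetical. Under that assumption, XOR-ing two encoded rows gives a 0 bit wherever the rows agree and a 1 bit wherever they are opposite, so maximal runs of equal bits correspond to contiguous identical or opposite patterns.

```python
from typing import List, Tuple


def encode(pattern: str) -> int:
    """Pack a 'U'/'D' string into an integer, one bit per time point (U=1, D=0).

    The two-symbol alphabet is an assumption made for this illustration.
    """
    bits = 0
    for symbol in pattern:
        if symbol not in ("U", "D"):
            raise ValueError(f"unexpected symbol: {symbol!r}")
        bits = (bits << 1) | (1 if symbol == "U" else 0)
    return bits


def contiguous_runs(row_a: str, row_b: str, min_len: int = 2) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, kind) for each maximal run of time points
    where the two rows are identical (XOR bit 0) or opposite (XOR bit 1)."""
    n = len(row_a)
    if len(row_b) != n:
        raise ValueError("rows must cover the same time points")

    diff = encode(row_a) ^ encode(row_b)  # one XOR compares all time points at once

    def bit(i: int) -> int:
        # Time point 0 is the most significant of the n bits.
        return (diff >> (n - 1 - i)) & 1

    runs: List[Tuple[int, int, str]] = []
    start = 0
    for i in range(1, n + 1):
        # Close the current run at the end of the row or when the XOR bit flips.
        if i == n or bit(i) != bit(start):
            if i - start >= min_len:
                kind = "identical" if bit(start) == 0 else "opposite"
                runs.append((start, i, kind))
            start = i
    return runs


if __name__ == "__main__":
    # Rows agree on time points 0..2 and are opposite on time points 3..5.
    print(contiguous_runs("UUDDUD", "UUDUDU"))
```

Packing a row into a single integer means one XOR compares every time point of two rows at once; the run scan afterwards is linear in the number of time points.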

Author information

Corresponding author

Correspondence to Yun Xue.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Xu, H. et al. (2014). An Effective Biclustering Algorithm for Time-Series Gene Expression Data. In: Wang, X., Pedrycz, W., Chan, P., He, Q. (eds) Machine Learning and Cybernetics. ICMLC 2014. Communications in Computer and Information Science, vol 481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45652-1_12

  • DOI: https://doi.org/10.1007/978-3-662-45652-1_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-45651-4

  • Online ISBN: 978-3-662-45652-1

  • eBook Packages: Computer Science, Computer Science (R0)
