Abstract
Biclustering is an important approach in microarray data analysis. Using biclustering algorithms, one can identify sets of genes that share compatible expression patterns across subsets of samples. These patterns may provide clues about the main biological processes associated with different physiological states. In this study, we present a new biclustering algorithm for identifying local structures in gene expression data sets. Our method uses singular value decomposition (SVD) as its framework. Based on the SVD, the problem of identifying biclusters in a gene expression matrix is transformed into two global clustering problems. After biclustering, our algorithm arranges the gene expression matrix into blocks of up-regulated or down-regulated genes, from which one can infer which genes are co-regulated and which genes are possibly functionally related. Experimental results on three benchmark datasets (Human Tissues, Lymphoma, Leukemia) show that the resulting biclusters are easy to visualize and interpret.
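The idea of turning biclustering into two global clustering problems can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in the abstract. The sketch assumes global mean-centering, a plain k-means on the leading singular vectors, and hypothetical function names (`kmeans`, `svd_bicluster`, `reorder`): genes are clustered using the left singular vectors, samples using the right singular vectors, and the matrix is then reordered so that the clusters form contiguous blocks.

```python
import numpy as np


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Squared distance from every point to its nearest chosen center.
        d = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


def svd_bicluster(X, n_components=1, n_gene_clusters=2, n_sample_clusters=2):
    """Cluster rows (genes) and columns (samples) separately in SVD space.

    Assumption: global mean-centering and k-means are illustrative choices,
    not taken from the paper.
    """
    Xc = X - X.mean()
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Scale singular vectors by singular values so strong components dominate.
    gene_embed = U[:, :n_components] * s[:n_components]
    sample_embed = Vt[:n_components, :].T * s[:n_components]
    gene_labels = kmeans(gene_embed, n_gene_clusters)
    sample_labels = kmeans(sample_embed, n_sample_clusters)
    return gene_labels, sample_labels


def reorder(X, gene_labels, sample_labels):
    """Permute rows and columns so that members of a cluster are adjacent."""
    rows = np.argsort(gene_labels, kind="stable")
    cols = np.argsort(sample_labels, kind="stable")
    return X[np.ix_(rows, cols)]


# Toy data (synthetic, for illustration only): a noisy 2 x 2 checkerboard
# of up-regulated and down-regulated blocks with shuffled rows and columns.
rng = np.random.default_rng(0)
u = np.array([1, 1, 1, -1, -1, -1])
v = np.array([1, 1, -1, -1])
toy = 2.0 * np.outer(u, v) + 0.1 * rng.standard_normal((6, 4))
toy = toy[rng.permutation(6)][:, rng.permutation(4)]
g, c = svd_bicluster(toy)
print(np.round(reorder(toy, g, c), 1))
```

On real expression data one would typically keep more than one singular component and choose the number of clusters from the data; both are left as parameters here.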
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yang, WH., Dai, DQ., Yan, H. (2007). Biclustering of Microarray Data Based on Singular Value Decomposition. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_21
DOI: https://doi.org/10.1007/978-3-540-77018-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77016-9
Online ISBN: 978-3-540-77018-3
eBook Packages: Computer Science, Computer Science (R0)