A Novel Approach for Biclustering Gene Expression Data Using Modular Singular Value Decomposition

Aradhya, V. N. Manjunath; Masulli, Francesco; Rovetta, Stefano

doi:10.1007/978-3-642-14571-1_19

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 6160)

Included in the following conference series:

International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics

Abstract

Clustering is a method of unsupervised learning and a common technique for statistical data analysis, used in many fields including machine learning, data mining, pattern recognition, image analysis and bioinformatics. Recently, biclustering (or co-clustering), which clusters the row and column dimensions of the data matrix simultaneously, has been shown to be remarkably effective in a variety of applications. In this paper we propose a novel approach to biclustering gene expression data based on Modular Singular Value Decomposition (Mod-SVD). Instead of applying SVD directly to the data matrix, the proposed approach computes SVD in a modular fashion. Experiments conducted on synthetic and real datasets demonstrated the effectiveness of the algorithm on gene expression data.
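The abstract does not specify how the modules are formed, so the following is only a minimal sketch of the general idea of computing SVD per module rather than on the whole matrix. It assumes that a module is a contiguous block of rows and that each block is summarized by its leading singular triplets. The function names `modular_svd` and `reconstruct`, the parameters `n_modules` and `rank`, and the toy data with a planted bicluster are all hypothetical and are not taken from the paper.

```python
import numpy as np


def modular_svd(data, n_modules, rank=1):
    """Split the rows of `data` into blocks and run a truncated SVD on each.

    Assumption: modules are contiguous row blocks. The paper may partition
    the matrix differently.
    """
    blocks = np.array_split(data, n_modules, axis=0)
    results = []
    for block in blocks:
        # Thin SVD of one module: block = u @ diag(s) @ vt
        u, s, vt = np.linalg.svd(block, full_matrices=False)
        # Keep only the leading `rank` singular triplets
        results.append((u[:, :rank], s[:rank], vt[:rank, :]))
    return results


def reconstruct(results):
    """Stack the low-rank approximations of all modules back together."""
    return np.vstack([(u * s) @ vt for u, s, vt in results])


# Hypothetical toy data: small noise plus one planted constant bicluster
rng = np.random.default_rng(0)
data = rng.normal(scale=0.1, size=(40, 12))
data[5:15, 2:6] += 3.0

parts = modular_svd(data, n_modules=4, rank=1)
approx = reconstruct(parts)

# Columns with the largest weights in the leading right singular vector
# of the first module are candidate bicluster conditions.
leading_v = np.abs(parts[0][2][0])
candidate_columns = sorted(np.argsort(-leading_v)[:4].tolist())
print(candidate_columns)
```

In this sketch the columns that dominate the leading right singular vector of a module serve as candidate bicluster conditions, and the corresponding entries of the left singular vector would play the same role for genes. Whether Mod-SVD extracts biclusters this way is an assumption, not something the abstract states.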

Author information

Authors and Affiliations

Dept. of Computer and Information Sciences, University of Genova, Via Dodecaneso 35, 16146, Genova, Italy
V. N. Manjunath Aradhya, Francesco Masulli & Stefano Rovetta
Dept. of ISE, Dayananda Sagar College of Engg, Bangalore, India, 560078
V. N. Manjunath Aradhya
Sbarro Institute for Cancer Research and Molecular Medicine, Center for Biotechnology, Temple University, BioLife Science Bldg., 1900 N 12th Street, Philadelphia, PA, 19122, USA
Francesco Masulli

Editor information

Editors and Affiliations

DISI - Dipartimento di Informatica e Scienze dell’Informazione, Università di Genova, Via Dodecaneso 35, 16146, Genova, Italy
Francesco Masulli
Center for Biostatistics, The Methodist Hospital Research Institute (TMHRI), Weill Cornell Medical College, Cornell University, 6565 Fannin, Suite MGJ6-031, 77030, Houston, Texas, USA
Leif E. Peterson
Dipartimento di Matematica ed Informatica, Università di Salerno, Via Ponte don Melillo, 84084, Fisciano, (Sa), Italy
Roberto Tagliaferri

Cite this paper

Aradhya, V.N.M., Masulli, F., Rovetta, S. (2010). A Novel Approach for Biclustering Gene Expression Data Using Modular Singular Value Decomposition. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2009. Lecture Notes in Computer Science, vol 6160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14571-1_19

DOI: https://doi.org/10.1007/978-3-642-14571-1_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14570-4
Online ISBN: 978-3-642-14571-1
eBook Packages: Computer Science; Computer Science (R0)
