Abstract
Clustering and biclustering are important techniques in data mining. Unlike clustering, biclustering simultaneously groups objects and features according to their expression levels. This review covers the background, motivation, data input, objective tasks, and history of data biclustering. The bicluster types and biclustering structures of a data matrix are defined mathematically. Recent algorithms, including OREO, nsNMF, BBC, and cMonkey, are reviewed with formal mathematical models. Additionally, a match score between biclusters is defined to compare algorithms. The application of biclustering in computational neuroscience is also reviewed in this chapter.
Copyright information
© 2010 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Fan, N., Boyko, N., Pardalos, P.M. (2010). Recent Advances of Data Biclustering with Application in Computational Neuroscience. In: Chaovalitwongse, W., Pardalos, P., Xanthopoulos, P. (eds) Computational Neuroscience. Springer Optimization and Its Applications, vol 38. Springer, New York, NY. https://doi.org/10.1007/978-0-387-88630-5_6
DOI: https://doi.org/10.1007/978-0-387-88630-5_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-88629-9
Online ISBN: 978-0-387-88630-5
eBook Packages: Mathematics and Statistics (R0)