Mean Square Residue Biclustering with Missing Data and Row Inversions

Gremalschi, Stefan; Altun, Gulsah; Astrovskaya, Irina; Zelikovsky, Alexander

doi:10.1007/978-3-642-01551-9_4

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 5542)

Abstract

Cheng and Church proposed a greedy deletion-addition algorithm that finds a given number k of biclusters whose mean squared residues (MSRs) are below given thresholds, with the missing values in the matrix replaced by random numbers. In our previous paper, we introduced the dual biclustering method with quadratic optimization for missing data and row inversions.
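As an illustration of the quantity being thresholded, the following is a minimal sketch of the standard mean squared residue H(I, J) for a complete matrix: the average over the submatrix of (a_ij − a_iJ − a_Ij + a_IJ)², where a_iJ, a_Ij and a_IJ are the row, column and overall means. The function name and the list-of-lists layout are assumptions made for this example, not code from the paper.

```python
def mean_squared_residue(matrix, rows, cols):
    """Return H(I, J) for the submatrix of `matrix` selected by `rows` and `cols`.

    Assumes every selected entry is present (no missing values).
    """
    n_rows, n_cols = len(rows), len(cols)
    # Mean of each selected row and column, restricted to the bicluster.
    row_mean = {i: sum(matrix[i][j] for j in cols) / n_cols for i in rows}
    col_mean = {j: sum(matrix[i][j] for i in rows) / n_rows for j in cols}
    # Overall mean of the bicluster.
    all_mean = sum(row_mean[i] for i in rows) / n_rows

    total = 0.0
    for i in rows:
        for j in cols:
            residue = matrix[i][j] - row_mean[i] - col_mean[j] + all_mean
            total += residue * residue
    return total / (n_rows * n_cols)


# A purely additive pattern (each row is a shifted copy of the first).
additive = [[1.0, 2.0, 3.0],
            [2.0, 3.0, 4.0],
            [5.0, 6.0, 7.0]]
# A pattern with no additive structure.
crossed = [[1.0, 2.0],
           [2.0, 1.0]]

h_additive = mean_squared_residue(additive, [0, 1, 2], [0, 1, 2])
h_crossed = mean_squared_residue(crossed, [0, 1], [0, 1])
```

The additive matrix gives an H value of zero up to floating-point rounding, which is the "ideal bicluster" case; the crossed matrix gives a strictly positive H.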

In this paper, we modify the dual biclustering method with quadratic optimization and add three new features. First, we introduce a "row status" for each row in a bicluster, and we add rows to and delete rows from biclusters based on their status in order to minimize the MSR. We compare our results with Cheng and Church's approach, in which rows are inverted only when they are added to a bicluster. We select the row or the negated row not only at addition but also at deletion, and we show that this yields an improvement. Second, we give a proof of the theorem introduced by Cheng and Church in [4]. Third, because missing data often occur in the data matrices given for biclustering, they are usually filled with random numbers; we show that ignoring the missing data is a better approach and avoids the additional noise caused by randomness. Since an ideal bicluster has an H value of zero, our results show a significant decrease in the H value of the biclusters, with less noise, compared with the original dual biclustering and the Cheng and Church method.
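The two ideas above (skipping missing entries rather than filling them, and choosing between a row and its negation) can be sketched as follows. This is only an illustration under stated assumptions: missing entries are encoded as `None`, all means are taken over the present entries only, and the names `msr_ignoring_missing` and `choose_orientation` are hypothetical. The paper's quadratic-optimization formulation may treat both steps differently.

```python
def msr_ignoring_missing(matrix, rows, cols, inverted=frozenset()):
    """MSR computed over present entries only; rows in `inverted` are negated.

    Assumption: a missing entry is None and simply does not contribute
    to any mean or to the residue sum.
    """
    cells = []
    for i in rows:
        sign = -1.0 if i in inverted else 1.0
        for j in cols:
            if matrix[i][j] is not None:
                cells.append((i, j, sign * matrix[i][j]))
    if not cells:
        return 0.0  # nothing observed, so nothing to measure

    def means(key):
        # Average the present values grouped by row (key=0) or column (key=1).
        sums, counts = {}, {}
        for cell in cells:
            k = cell[key]
            sums[k] = sums.get(k, 0.0) + cell[2]
            counts[k] = counts.get(k, 0) + 1
        return {k: sums[k] / counts[k] for k in sums}

    row_mean, col_mean = means(0), means(1)
    all_mean = sum(v for _, _, v in cells) / len(cells)
    total = sum((v - row_mean[i] - col_mean[j] + all_mean) ** 2
                for i, j, v in cells)
    return total / len(cells)


def choose_orientation(matrix, rows, cols, candidate, inverted=frozenset()):
    """Score `candidate` joined to the bicluster as-is and negated; keep the lower H."""
    trial_rows = list(rows) + [candidate]
    h_plain = msr_ignoring_missing(matrix, trial_rows, cols,
                                   set(inverted) - {candidate})
    h_negated = msr_ignoring_missing(matrix, trial_rows, cols,
                                     set(inverted) | {candidate})
    if h_negated < h_plain:
        return "negated", h_negated
    return "as-is", h_plain


# Row 2 mirrors the additive pattern of rows 0 and 1 with the opposite sign.
data = [[1.0, 2.0, 3.0],
        [2.0, 3.0, 4.0],
        [-3.0, -4.0, -5.0]]
status, h_value = choose_orientation(data, [0, 1], [0, 1, 2], candidate=2)
```

In this toy matrix the negated row fits the additive pattern of the first two rows, so `status` comes back as `"negated"`. The same comparison could be rerun for a row already in the bicluster, which is the spirit of re-evaluating the row status at deletion as well as at addition.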

References

Angiulli, F., Pizzuti, C.: Gene Expression Biclustering using Random Walk Strategies. In: Tjoa, A.M., Trujillo, J. (eds.) DaWaK 2005. LNCS, vol. 3589, pp. 509–519. Springer, Heidelberg (2005)
Baldi, P., Hatfield, G.W.: DNA Microarrays and Gene Expression. In: From Experiments to Data Analysis and Modelling. Cambridge Univ. Press, Cambridge (2002)
Bertsimas, D., Tsitsiklis, J.: Introduction to Linear Optimization. Athena Scientific
Cheng, Y., Church, G.: Biclustering of Expression Data. In: Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology (ISMB), pp. 93–103. AAAI Press, Menlo Park (2000)
Madeira, S.C., Oliveira, A.L.: Biclustering Algorithms for Biological Data Analysis: A Survey. IEEE Transactions on Computational Biology and Bioinformatics 1(1), 24–45 (2004)
Papadimitriou, C.H., Steiglitz, K.: Combinatorial optimization: algorithms and complexity, p. 2982. Prentice-Hall, Inc., Upper Saddle River
Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A systematic comparison and evaluation of biclustering methods for gene expression data. Bioinformatics 22(9), 1122–1129 (2006)
Shamir, R.: Lecture notes, http://www.cs.tau.ac.il/~rshamir/ge/05/scribes/lec04.pdf
Tanay, A., Sharan, R., Shamir, R.: Discovering Statistically Significant Biclusters in Gene Expression Data. Bioinformatics 18, 136–144 (2002)
Tavazoie, S., Hughes, J.D., Campbell, M.J., Cho, R.J., Church, G.M.: Systematic determination of genetic network architecture. Nature Genetics 22, 281–285 (1999)
Yang, J., Wang, H., Wang, W., Yu, P.: Enhanced biclustering on gene expression data. In: Proceedings of the 3rd IEEE Conference on Bioinformatics and Bioengineering (BIBE), pp. 321–327 (2003)
Zhang, Y., Zha, H., Chu, C.H.: A time-series biclustering algorithm for revealing co-regulated genes. In: Proc. Int. Symp. Information and Technology: Coding and Computing (ITCC 2005), Las Vegas, USA, pp. 32–37 (2005)
Zhou, J., Khokhar, A.A.: ParRescue: Scalable Parallel Algorithm and Implementation for Biclustering over Large Distributed Datasets. In: 26th IEEE International Conference on Distributed Computing Systems, ICDCS 2006 (2006)
Gremalschi, S., Altun, G.: Mean Squared Residue Based Biclustering Algorithms. In: Măndoiu, I., Sunderraman, R., Zelikovsky, A. (eds.) ISBRA 2008. LNCS (LNBI), vol. 4983, pp. 232–243. Springer, Heidelberg (2008)
Divina, F., Aguilar-Ruiz, J.: Biclustering of Expression Data with Evolutionary Computation. IEEE Transactions on Knowledge and Data Engineering 18(5), 590–602 (2006)
Yang, J., Wang, W., Wang, H., Yu, P.S.: Enhanced biclustering on expression data. In: Proceedings of the 3rd IEEE Conference on Bioinformatics and Bioengineering (BIBE 2003), pp. 321–327 (2003)
Prelic, A., Bleuler, S., Zimmermann, P., Wille, A., Bühlmann, P., Gruissem, W., Hennig, L., Thiele, L., Zitzler, E.: A Systematic Comparison and Evaluation of Biclustering Methods for Gene Expression Data. Bioinformatics 22(9), 1122–1129 (2006)
Xiao, J., Wang, L., Liu, X., Jiang, T.: An Efficient Voting Algorithm for Finding Additive Biclusters with Random Background. Journal of Computational Biology 15(10), 1275–1293 (2008)
Liu, X., Wang, L.: Computing the maximum similarity bi-clusters of gene expression data. Bioinformatics 23(1), 50–56 (2007)

Author information

Authors and Affiliations

Department of Computer Science, Georgia State University, Atlanta, GA 30303
Stefan Gremalschi, Irina Astrovskaya & Alexander Zelikovsky
Department of Reproductive Medicine, University of California, San Diego, CA 92093
Gulsah Altun

Editor information

Editors and Affiliations

Computer Science & Engineering Department, University of Connecticut, 371 Fairfield Way, Unit 2155, CT 06269, Storrs, USA
Ion Măndoiu
Bioinformatics Research Group (BioRG), School of Computing and Information Sciences, Florida International University, 11200 SW 8th Street, Room ECS254, University Park, FL 33199, Miami, USA
Giri Narasimhan
Department of Computer Science, Georgia State University, P.O. Box 3994, GA 30302-3994, Atlanta, USA
Yanqing Zhang

About this paper

Cite this paper

Gremalschi, S., Altun, G., Astrovskaya, I., Zelikovsky, A. (2009). Mean Square Residue Biclustering with Missing Data and Row Inversions. In: Măndoiu, I., Narasimhan, G., Zhang, Y. (eds) Bioinformatics Research and Applications. ISBRA 2009. Lecture Notes in Computer Science, vol 5542. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01551-9_4

DOI: https://doi.org/10.1007/978-3-642-01551-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01550-2
Online ISBN: 978-3-642-01551-9
eBook Packages: Computer Science, Computer Science (R0)
