Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data

Strickert, Marc; Schleif, Frank-Michael; Villmann, Thomas; Seiffert, Udo

doi:10.1007/978-3-642-01805-3_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5400)

Abstract

Pearson correlation is one of the standard measures for comparisons in biomedical analyses, yet much of its potential remains unused. Substantial value is added by transferring Pearson correlation into the framework of adaptive similarity measures and by exploiting properties of its mathematical derivatives. This opens access to optimization-based data models applicable to attribute characterization, clustering, classification, and visualization. Modern high-throughput measuring equipment creates high demand for the analysis of extensive biomedical data, including spectra and high-resolution gel-electrophoretic images. In this study, cDNA arrays are considered as the data sources of interest. Recent computational methods are presented for the characterization and analysis of these very high-dimensional data sets.
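
As a minimal sketch of what exploiting the derivatives can look like, the following Python code computes the Pearson correlation of two vectors and its gradient with respect to the first vector, which is the quantity a gradient-based optimizer would need. The function names `pearson` and `pearson_grad_x` are illustrative assumptions and are not taken from the chapter. The gradient is the standard closed form obtained by differentiating the correlation coefficient; it is not necessarily the notation or the exact formulation used by the authors.

```python
import math


def _centered_sums(x, y):
    """Return centered vectors a, b and the sums B = a.b, C = a.a, D = b.b."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    a = [xi - mx for xi in x]
    b = [yi - my for yi in y]
    B = sum(ai * bi for ai, bi in zip(a, b))
    C = sum(ai * ai for ai in a)
    D = sum(bi * bi for bi in b)
    if C == 0.0 or D == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return a, b, B, C, D


def pearson(x, y):
    """Pearson correlation r = B / sqrt(C * D)."""
    _, _, B, C, D = _centered_sums(x, y)
    return B / math.sqrt(C * D)


def pearson_grad_x(x, y):
    """Gradient of r with respect to x: dr/dx_k = (b_k - (B / C) * a_k) / sqrt(C * D)."""
    a, b, B, C, D = _centered_sums(x, y)
    s = math.sqrt(C * D)
    return [(bk - (B / C) * ak) / s for ak, bk in zip(a, b)]
```

Under these assumptions the gradient components sum to zero and the gradient is orthogonal to x itself, which reflects that the correlation does not change when x is shifted by a constant or scaled by a positive factor.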

Author information

Authors and Affiliations

Data Inspection Group, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
Marc Strickert
Research group Computational Intelligence, University of Leipzig, Germany
Frank-Michael Schleif & Thomas Villmann
Biosystems Engineering, Fraunhofer Institute for Factory Operation and Automation (IFF), Magdeburg, Germany
Udo Seiffert

Editor information

Editors and Affiliations

Mathematics and Computing Science, Intelligent Systems Group, University Groningen, P.O. Box 407, 9700 AK, Groningen, Netherlands
Michael Biehl
Department of Computer Science, Clausthal University of Technology, 38679, Clausthal-Zellerfeld, Germany
Barbara Hammer
Machine Learning Group, DICE, Place du Levant, Université catholique de Louvain, 3-B-1348, Louvain-la-Neuve, Belgium
Michel Verleysen
Dep. of Mathematics/Physics/Computer Sciences, University of Applied Sciences Mittweida, Technikumplatz 17, 09648, Mittweida, Germany
Thomas Villmann

About this chapter

Cite this chapter

Strickert, M., Schleif, F.-M., Villmann, T., Seiffert, U. (2009). Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_5

DOI: https://doi.org/10.1007/978-3-642-01805-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01804-6
Online ISBN: 978-3-642-01805-3
eBook Packages: Computer Science (R0)
