Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data

  • Chapter
Similarity-Based Clustering

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5400)

Abstract

Pearson correlation is one of the standard measures for comparisons in biomedical analyses, and it has as yet untapped potential. Substantial value is added by transferring Pearson correlation into the framework of adaptive similarity measures and by exploiting properties of its mathematical derivatives. This opens access to optimization-based data models applicable to attribute characterization, clustering, classification, and visualization. Modern high-throughput measuring equipment creates high demand for the analysis of extensive biomedical data, including spectra and high-resolution gel-electrophoretic images. This study considers cDNA arrays as the data source of interest. Recent computational methods are presented for the characterization and analysis of these very high-dimensional data sets.
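
To make the idea of "exploiting derivatives" concrete, the following is a minimal sketch and not the chapter's own implementation, since the chapter body is not shown here. It computes the Pearson correlation r(x, w) between two vectors and its gradient with respect to w, using the standard closed form dr/dw_i = [(x_i - mean(x)) - (B/C)(w_i - mean(w))] / sqrt(C·D), where B is the sum of centered cross products and C, D are the centered sums of squares of w and x. The function names (`pearson`, `pearson_grad`, `numeric_grad`) and the sample vectors are assumptions chosen for illustration.

```python
import math


def _centered_terms(x, w):
    """Return centered copies of x and w plus the sums B, C, D (illustrative helper)."""
    n = len(x)
    mx = sum(x) / n
    mw = sum(w) / n
    xc = [xi - mx for xi in x]
    wc = [wi - mw for wi in w]
    b = sum(p * q for p, q in zip(xc, wc))  # B: centered cross products
    c = sum(q * q for q in wc)              # C: centered sum of squares of w
    d = sum(p * p for p in xc)              # D: centered sum of squares of x
    return xc, wc, b, c, d


def pearson(x, w):
    """Pearson correlation coefficient of two equal-length sequences."""
    _, _, b, c, d = _centered_terms(x, w)
    return b / math.sqrt(c * d)


def pearson_grad(x, w):
    """Analytic gradient of pearson(x, w) with respect to each component of w."""
    xc, wc, b, c, d = _centered_terms(x, w)
    s = math.sqrt(c * d)
    return [(p - (b / c) * q) / s for p, q in zip(xc, wc)]


def numeric_grad(x, w, eps=1e-6):
    """Central finite differences, used only to check the analytic gradient."""
    grad = []
    for i in range(len(w)):
        up = list(w)
        dn = list(w)
        up[i] += eps
        dn[i] -= eps
        grad.append((pearson(x, up) - pearson(x, dn)) / (2 * eps))
    return grad


if __name__ == "__main__":
    # Made-up sample vectors, purely for demonstration.
    x = [1.0, 2.0, 4.0, 7.0, 11.0]
    w = [0.5, 1.5, 1.0, 3.0, 2.5]
    print("r(x, w)       =", pearson(x, w))
    print("analytic grad =", pearson_grad(x, w))
    print("numeric grad  =", numeric_grad(x, w))
```

A gradient of this kind is what lets a correlation-based similarity be plugged into gradient-driven models, for example by moving a prototype vector w in the direction that increases its correlation with a data vector x. The gradient components sum to zero and are orthogonal to the centered w, which reflects the invariance of Pearson correlation to shifting and rescaling w.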

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Strickert, M., Schleif, F.-M., Villmann, T., Seiffert, U. (2009). Unleashing Pearson Correlation for Faithful Analysis of Biomedical Data. In: Biehl, M., Hammer, B., Verleysen, M., Villmann, T. (eds) Similarity-Based Clustering. Lecture Notes in Computer Science, vol 5400. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01805-3_5

  • DOI: https://doi.org/10.1007/978-3-642-01805-3_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01804-6

  • Online ISBN: 978-3-642-01805-3

  • eBook Packages: Computer Science (R0)
