Metabolomics, pp. 87–104

Visualization and Analysis of Molecular Data

  • Matthias Scholz
  • Joachim Selbig
Part of the Methods in Molecular Biology™ book series (MIMB, volume 358)


This chapter provides an overview of visualization and analysis techniques applied to large-scale datasets from genomics, metabolomics, and proteomics. The aim is to reduce the number of variables (genes, metabolites, or proteins) by extracting a small set of new, relevant variables, usually termed components. The advantages and disadvantages of classical principal component analysis (PCA) are discussed, and its link to the closely related singular value decomposition and multidimensional scaling is described. Special emphasis is given to the recent trend toward independent component analysis (ICA), which aims to extract statistically independent components and therefore usually provides more meaningful components than PCA. We also discuss normalization techniques and their influence on the results of the different analysis techniques.
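As a rough illustration of what extracting a component means, the sketch below computes the first principal component of a small, made-up data matrix (samples in rows, variables in columns). It is not taken from the chapter: the data, the function names, and the choice of power iteration on the covariance matrix are assumptions made to keep the example dependency-free. In practice, a library routine based on the singular value decomposition would normally be used instead.

```python
import math
import random


def center(rows):
    """Subtract each variable's (column's) mean so every variable has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (variables x variables) of already centered data."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]


def first_component(cov, iterations=500, seed=0):
    """Power iteration: returns (unit-length loading vector, variance along it)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in cov]  # arbitrary positive start vector
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance captured by this direction.
    variance = sum(vi * sum(c * x for c, x in zip(row, v))
                   for vi, row in zip(v, cov))
    return v, variance


# Hypothetical toy data: 6 samples x 3 variables. The first two variables
# rise together; the third is nearly constant.
samples = [
    [2.0, 4.1, 0.5],
    [3.0, 5.9, 0.4],
    [4.0, 8.2, 0.6],
    [5.0, 9.8, 0.5],
    [6.0, 12.1, 0.4],
    [7.0, 13.9, 0.6],
]

centered = center(samples)
cov = covariance(centered)
loadings, variance = first_component(cov)

# One new variable (score) per sample replaces the three original variables.
scores = [sum(x * l for x, l in zip(row, loadings)) for row in centered]

# Fraction of the total variance retained by this single component.
explained = variance / sum(cov[i][i] for i in range(len(cov)))

print("loadings:", [round(l, 3) for l in loadings])
print("explained variance fraction:", round(explained, 4))
```

Because the first two toy variables are almost perfectly correlated, a single component retains nearly all of the variance here. Real omics data rarely behave this cleanly, and the result depends on how the variables were normalized beforehand.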


Keywords: Principal Component Analysis; Singular Value Decomposition; Independent Component Analysis; Sample Vector
These keywords were added by machine and not by the authors.



Copyright information

© Humana Press Inc. 2007

Authors and Affiliations

  • Matthias Scholz (1)
  • Joachim Selbig (1)
  1. Institute of Biochemistry and Biology, University of Potsdam, Potsdam, Germany
