Abstract
This chapter provides an overview of visualization and analysis techniques applied to large-scale datasets from genomics, metabolomics, and proteomics. The aim is to reduce the number of variables (genes, metabolites, or proteins) by extracting a small set of new, relevant variables, usually termed components. The advantages and disadvantages of classical principal component analysis (PCA) are discussed, and its links to the closely related singular value decomposition and multidimensional scaling are outlined. Special emphasis is given to the recent trend toward independent component analysis, which aims to extract statistically independent components and therefore usually provides more meaningful components than PCA. We also discuss normalization techniques and their influence on the results of the different analytical techniques.
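The link between PCA and the singular value decomposition mentioned above can be illustrated with a minimal sketch. It is not taken from the chapter: the function name `pca_via_svd`, the synthetic random matrix standing in for a samples-by-variables dataset, and the choice of two components are all assumptions made for illustration, and NumPy is assumed to be available.

```python
import numpy as np


def pca_via_svd(data, n_components=2):
    """Hypothetical helper: PCA of a samples-by-variables matrix via SVD."""
    # Center each variable (column) so the components describe variance.
    centered = data - data.mean(axis=0)
    # Thin SVD: centered = u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Scores: coordinates of each sample on the leading components.
    scores = u[:, :n_components] * s[:n_components]
    # Loadings: the component directions in variable space (rows of vt).
    loadings = vt[:n_components]
    # Variance captured by each component.
    explained_variance = s[:n_components] ** 2 / (data.shape[0] - 1)
    return scores, loadings, explained_variance


# Synthetic stand-in for molecular data: 20 samples, 100 variables (assumed sizes).
rng = np.random.default_rng(0)
data = rng.normal(size=(20, 100))

scores, loadings, explained_variance = pca_via_svd(data, n_components=2)

# Cross-check: the same variances are the largest eigenvalues of the
# covariance matrix, which is the classical eigen-decomposition view of PCA.
cov = np.cov(data, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1][:2]
```

In this sketch the squared singular values, divided by the number of samples minus one, coincide with the leading covariance eigenvalues, which is one way to see why the two methods are so closely related. Working from the SVD also avoids forming the full variables-by-variables covariance matrix, which can be convenient when variables far outnumber samples.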
Copyright information
© 2007 Humana Press Inc.
About this protocol
Cite this protocol
Scholz, M., Selbig, J. (2007). Visualization and Analysis of Molecular Data. In: Weckwerth, W. (eds) Metabolomics. Methods in Molecular Biology™, vol 358. Humana Press. https://doi.org/10.1007/978-1-59745-244-1_6
DOI: https://doi.org/10.1007/978-1-59745-244-1_6
Publisher Name: Humana Press
Print ISBN: 978-1-58829-561-3
Online ISBN: 978-1-59745-244-1
eBook Packages: Springer Protocols