Multivariate Methods for the Integration and Visualization of Omics Data
As high-throughput technologies have become more common and accessible, it is increasingly usual to study the same problem with several distinct approaches simultaneously. In practice, this means that data of different types (expression, proteins, metabolites...) may be available for the same study, highlighting the need for methods and tools to analyze them in a combined way. In recent years, many methods have been developed that allow for the integrated analysis of different types of data. Following a certain tradition in bioinformatics, many of these methodologies are rooted in machine learning, such as Bayesian networks, support vector machines or graph-based methods. In contrast with the large number of applications from these fields, multivariate statistics seems to have contributed less to “omic” data integration, even though it has a long tradition of being used to combine and visualize multidimensional data. In this work, we discuss the application of multivariate statistical approaches to integrate bio-molecular information by using multiple factor analysis. The techniques are applied to a real, unpublished data set consisting of three different data types: clinical variables, expression microarrays and DNA gel electrophoretic bands. We show how these statistical techniques can be used to perform dimension reduction and then visualize data of one type that are useful to explain those of other types. Whereas this is more or less straightforward when dealing with two types of data, it turns out to be more complicated when the goal is to visualize more than two types simultaneously. Comparison between the approaches shows that the information they provide is complementary, suggesting that their combined use yields more information than using any one of them alone.
Keywords: Data Integration, Omic Data, Visualization, Multiple Factor Analysis
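To make the core idea of multiple factor analysis more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard formulation in which each data block is standardized, divided by its first singular value so that no block dominates, and the concatenated table is then analyzed by an ordinary principal component analysis. The function name `mfa_scores`, the block sizes and the random data are all hypothetical, and the band data are treated as quantitative for simplicity.

```python
import numpy as np

def mfa_scores(blocks, n_components=2):
    """Sketch of multiple factor analysis for blocks sharing the same rows (samples).

    Assumption: all variables are treated as quantitative.
    """
    weighted = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        Xc = X - X.mean(axis=0)              # center each variable
        sd = Xc.std(axis=0)
        sd[sd == 0] = 1.0                    # leave constant columns at zero
        Xc = Xc / sd                         # scale each variable to unit variance
        # Weight the block by its first singular value so that every
        # block has the same maximal inertia along any one direction.
        s1 = np.linalg.svd(Xc, compute_uv=False)[0]
        weighted.append(Xc / s1)
    Z = np.hstack(weighted)                  # concatenated, balanced table
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    explained = s**2 / np.sum(s**2)                   # share of total inertia
    return scores, explained[:n_components]

# Hypothetical data: 20 samples measured with three data types.
rng = np.random.default_rng(0)
n = 20
clinical = rng.normal(size=(n, 5))
expression = rng.normal(size=(n, 200))
bands = rng.integers(0, 2, size=(n, 30)).astype(float)

scores, explained = mfa_scores([clinical, expression, bands])
```

The rows of `scores` give one point per sample in a common low-dimensional space, which can be plotted to compare samples across all data types at once. Without the per-block weighting, the expression block would dominate simply because it has many more variables.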