Multivariate Data Exploration
This chapter contains our first excursion away from the simple problems of univariate samples and univariate distribution estimation. We consider samples of simultaneous observations of several numerical variables. We generalize some of the exploratory data analysis tools used in the univariate case. In particular, we discuss histograms and kernel density estimators. Then we review the properties of the most important multivariate distribution of all, the normal or Gaussian distribution. For jointly normal random variables, dependence can be completely captured by the classical Pearson correlation coefficient. In general, however, the situation can be quite different. We review the classical measures of dependence and emphasize how inappropriate some of them can become in cases of significant departure from the Gaussian hypothesis. In such situations, quantifying dependence requires new ideas, and we introduce the concept of a copula as a solution to this problem. We show how copulas can be estimated, and how one can use them for Monte Carlo computations and random scenario generation. We illustrate all these concepts with an example of coffee futures prices. The last section deals with principal component analysis, a classical technique from multivariate data analysis, which is best known for its use in dimension reduction. We demonstrate its usefulness on data from the fixed income markets.
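The point about Pearson correlation and departures from the Gaussian hypothesis can be illustrated with a small simulation. The following is only a sketch in Python with NumPy, not code from the chapter: the sample size, the value `rho = 0.7`, the exponential transform, and the helper names `pearson`, `ranks`, and `spearman` are all assumptions chosen for illustration. It draws jointly normal pairs, then applies a strictly increasing transform to each margin. The dependence structure (the copula) is unchanged, so a rank-based measure such as Spearman's rho is preserved, while the Pearson coefficient generally is not.

```python
import numpy as np


def pearson(x, y):
    """Classical Pearson correlation coefficient of two samples."""
    return float(np.corrcoef(x, y)[0, 1])


def ranks(x):
    """Ranks 0..n-1 of a sample (assumes no ties, as for continuous data)."""
    return np.argsort(np.argsort(x))


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Illustrative settings (assumptions, not values from the chapter).
rng = np.random.default_rng(0)
rho = 0.7
cov = np.array([[1.0, rho], [rho, 1.0]])

# Jointly normal sample: here rho fully describes the dependence.
z = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=20000)
x, y = z[:, 0], z[:, 1]

# Strictly increasing transform of each margin (lognormal margins).
# The copula is the same as before; only the marginals change.
hx, hy = np.exp(x), np.exp(y)

print("Pearson,  normal margins:   ", pearson(x, y))
print("Pearson,  lognormal margins:", pearson(hx, hy))
print("Spearman, normal margins:   ", spearman(x, y))
print("Spearman, lognormal margins:", spearman(hx, hy))
```

In this sketch the two Spearman values coincide because ranks are unaffected by increasing transforms, whereas the two Pearson values differ even though both samples share the same Gaussian copula.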
Keywords: Marginal Distribution, Yield Curve, Multivariate Normal Distribution, Generalized Pareto Distribution, Futures Contract