Abstract

The paper contains some general remarks on the high art of data analysis, some philosophical thoughts about classification, a partial review of outliers and robustness from the point of view of applications, including a discussion of the problem of model choice, and a review of several aspects of robust estimation of covariance matrices, including the pragmatic choice of a weight function based on empirical and theoretical evidence. Several sections contain new (or at least original) ideas: There are some proposals for incorporating robustness into Bayesian practice and theory, including weighted log likelihoods and Bayes’ theorem for weighted data. Some small ideas refer to artificial classification in a continuum, to a “robust” (Prohorov-type) metric for high-dimensional data, and to the use of multiple minimum spanning trees. A promising but difficult research idea for clustering on the real line, based on a new smoothing method, concludes the paper.

Keywords

Minimum Span Tree; Influence Function; Breakdown Point; Philosophical Thought; Epistemic Probability
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Frank Hampel, Seminar for Statistics of ETH, Swiss Federal Institute of Technology, Zurich, Switzerland
