Hierarchical Clustering for Boxplot Variables

  • Conference paper
Data Science and Classification

Abstract

Boxplots are well-known exploratory charts used to extract meaningful information from batches of data at a glance. Their strength lies in their ability to summarize data while retaining the key information, which is also a desirable property of symbolic variables. In this paper, boxplots are presented as a new kind of symbolic variable. In addition, two different approaches to measuring distances between boxplot variables are proposed. The usefulness of these distances is illustrated by means of a hierarchical clustering of boxplot data.
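As a rough illustration of the pipeline the abstract describes, the sketch below summarizes each batch by its five-number summary, measures distances between the summaries, and merges the closest clusters agglomeratively. The abstract does not specify the two proposed distances, so a plain Euclidean distance between five-number summaries is used here purely as an assumed stand-in, and single linkage is likewise an assumption. All function names (`five_number_summary`, `boxplot_distance`, `single_linkage`) and the sample batches are hypothetical.

```python
from itertools import combinations
from math import sqrt
from statistics import quantiles


def five_number_summary(batch):
    """Return (min, Q1, median, Q3, max) for a batch of numbers."""
    q1, med, q3 = quantiles(batch, n=4, method="inclusive")
    return (min(batch), q1, med, q3, max(batch))


def boxplot_distance(a, b):
    """Euclidean distance between two five-number summaries.

    Assumption: a stand-in only, not one of the distances proposed in the paper.
    """
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def linkage(boxplots, ca, cb):
    """Single linkage: smallest pairwise distance between two clusters."""
    return min(boxplot_distance(boxplots[i], boxplots[j]) for i in ca for j in cb)


def single_linkage(boxplots):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, distance)."""
    clusters = [frozenset([name]) for name in boxplots]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters that are closest under single linkage.
        ca, cb = min(combinations(clusters, 2),
                     key=lambda pair: linkage(boxplots, *pair))
        merges.append((set(ca), set(cb), linkage(boxplots, ca, cb)))
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (ca, cb)] + [ca | cb]
    return merges


# Hypothetical batches of observations, one boxplot per name.
batches = {"a": [1, 2, 3, 4, 5], "b": [1, 2, 3, 4, 6], "c": [10, 20, 30, 40, 50]}
boxplots = {name: five_number_summary(values) for name, values in batches.items()}
merges = single_linkage(boxplots)
```

The list `merges` records the order in which clusters were joined and at what distance, which is the information a dendrogram displays. Swapping `boxplot_distance` for another dissimilarity leaves the rest of the sketch unchanged.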

Copyright information

© 2006 Springer-Verlag Berlin · Heidelberg

About this paper

Cite this paper

Arroyo, J., Maté, C., Roque, A.MS. (2006). Hierarchical Clustering for Boxplot Variables. In: Batagelj, V., Bock, HH., Ferligoj, A., Žiberna, A. (eds) Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-34416-0_7
