Abstract
In this paper, we propose a method for monitoring the evolution of data described by histograms of values. Our proposal consists in defining new order statistics on the quantile functions associated with the empirical distributions represented by the histogram data. We introduce the Median, First Quartile, and Third Quartile quantile functions, as well as a generalized representation of the box-and-whiskers plot. The proposed representations and indices are useful, for example, for identifying and classifying outliers arriving over time in a data stream environment.
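As a rough illustration of the idea, the sketch below computes quartile quantile functions for a small set of histograms. It rests on two assumptions that the abstract does not confirm: values are taken to be uniformly distributed within each bin (so the quantile function is piecewise linear), and the order statistics are computed pointwise, level by level, across the quantile functions. The function names `histogram_quantile` and `pointwise_quartile_functions` are hypothetical, and the chapter's actual definitions may differ.

```python
import numpy as np


def histogram_quantile(edges, freqs, probs):
    """Quantile function of one histogram, evaluated at the levels in probs.

    Assumes values are uniform within each bin and every bin has a
    positive frequency, so the CDF is strictly increasing.
    """
    edges = np.asarray(edges, dtype=float)
    cum = np.cumsum(np.asarray(freqs, dtype=float))
    # CDF values at the bin edges, running from 0 to exactly 1.
    cdf = np.concatenate(([0.0], cum / cum[-1]))
    # Inverting a piecewise-linear CDF amounts to interpolating edges against it.
    return np.interp(probs, cdf, edges)


def pointwise_quartile_functions(histograms, probs):
    """First quartile, median and third quartile quantile functions.

    histograms is a sequence of (edges, freqs) pairs. The quartiles are
    taken across histograms separately at each probability level
    (an assumption of this sketch, not necessarily the chapter's definition).
    """
    q = np.array([histogram_quantile(e, f, probs) for e, f in histograms])
    q1, median, q3 = np.quantile(q, [0.25, 0.5, 0.75], axis=0)
    return q1, median, q3


# Three toy histograms, each with two equally weighted unit-width bins.
histograms = [
    ([0, 1, 2], [1, 1]),
    ([1, 2, 3], [1, 1]),
    ([2, 3, 4], [1, 1]),
]
probs = np.array([0.0, 0.5, 1.0])
q1, median, q3 = pointwise_quartile_functions(histograms, probs)
```

Because each input quantile function is non-decreasing, the pointwise quartiles are non-decreasing as well, so each result is itself a valid quantile function on the chosen grid of levels.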
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Verde, R., Irpino, A., Rivoli, L. (2014). A Box-Plot and Outliers Detection Proposal for Histogram Data: New Tools for Data Stream Analysis. In: Vicari, D., Okada, A., Ragozini, G., Weihs, C. (eds) Analysis and Modeling of Complex Data in Behavioral and Social Sciences. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-06692-9_30
DOI: https://doi.org/10.1007/978-3-319-06692-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06691-2
Online ISBN: 978-3-319-06692-9
eBook Packages: Mathematics and Statistics (R0)