Abstract
The effectiveness of change-detection algorithms is often assessed on real-world datasets by injecting synthetically generated changes. Typically, the magnitude of the introduced changes is not controlled, and most experimental practices lead to results that are difficult to reproduce and to compare against. This problem becomes particularly relevant as the data dimension grows, as happens in big data applications.
To enable a fair comparison among change-detection algorithms, we have designed “Controlling Change Magnitude” (CCM), a rigorous method to introduce changes in multivariate datasets. In particular, we measure the change magnitude as the symmetric Kullback-Leibler divergence between the pre- and post-change distributions, and introduce changes by applying a roto-translation directly to the data. We present an algorithm to identify the parameters yielding the desired change magnitude, and analytically prove its convergence. Our experiments show the effectiveness of the proposed method and the limitations of tests run on high-dimensional datasets when changes are injected following traditional approaches. The MATLAB framework implementing the proposed method is made publicly available for download.
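To make the idea of a controlled change magnitude concrete, the following is a minimal sketch, not the authors' MATLAB framework. It assumes the pre-change distribution is a single Gaussian (so the symmetric Kullback-Leibler divergence has a closed form), takes the symmetric divergence to be the sum of the two directed divergences, and uses plain bisection on the translation length to reach a target magnitude. The function names and the parameters `Q`, `direction`, and `target` are hypothetical and chosen only for illustration.

```python
import numpy as np


def symmetric_kl_gaussian(mu0, cov0, mu1, cov1):
    """KL(p0||p1) + KL(p1||p0) for two Gaussians (log-determinants cancel)."""
    d = mu0.shape[0]
    inv0, inv1 = np.linalg.inv(cov0), np.linalg.inv(cov1)
    diff = mu1 - mu0
    trace_term = 0.5 * (np.trace(inv1 @ cov0) + np.trace(inv0 @ cov1) - 2 * d)
    mean_term = 0.5 * diff @ (inv0 + inv1) @ diff
    return trace_term + mean_term


def roto_translate(mu, cov, Q, v):
    """Parameters of the Gaussian obtained by mapping x to Q x + v."""
    return Q @ mu + v, Q @ cov @ Q.T


def find_translation_scale(mu, cov, Q, direction, target, tol=1e-10):
    """Bisection on rho so that sKL(pre, post) is close to target.

    Assumption: the rotation Q is fixed and only the translation
    rho * direction is tuned. This is an illustrative choice only.
    """
    def magnitude(rho):
        mu1, cov1 = roto_translate(mu, cov, Q, rho * direction)
        return symmetric_kl_gaussian(mu, cov, mu1, cov1)

    if magnitude(0.0) > target:
        raise ValueError("the rotation alone already exceeds the target")

    lo, hi = 0.0, 1.0
    while magnitude(hi) < target:  # grow the bracket until it contains a root
        hi *= 2.0
    while hi - lo > tol:  # keep magnitude(lo) <= target <= magnitude(hi)
        mid = 0.5 * (lo + hi)
        if magnitude(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage: a 2-D Gaussian, a 30-degree rotation, and a target magnitude of 1.
mu = np.zeros(2)
cov = np.array([[2.0, 0.5], [0.5, 1.0]])
theta = np.pi / 6
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
direction = np.array([1.0, 0.0])
target = 1.0

rho = find_translation_scale(mu, cov, Q, direction, target)
mu1, cov1 = roto_translate(mu, cov, Q, rho * direction)
achieved = symmetric_kl_gaussian(mu, cov, mu1, cov1)
print(f"rho = {rho:.6f}, achieved sKL = {achieved:.6f}")
```

Fixing the rotation and tuning only the translation length keeps the search one-dimensional. For distributions without a closed-form divergence, such as Gaussian mixtures, the magnitude would have to be estimated numerically instead.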
Notes
1. Publicly available for download at: http://home.deib.polimi.it/carrerad.
2. This bound is trivial for Gaussian pdfs and follows from basic algebra in case of GM.
3. There is no need to find a dominant function for \(\left\| \mathbf{x}\right\|_2 \le r\) since there \(g(\cdot, \mathbf{x})\) is bounded.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Alippi, C., Boracchi, G., Carrera, D. (2017). CCM: Controlling the Change Magnitude in High Dimensional Data. In: Angelov, P., Manolopoulos, Y., Iliadis, L., Roy, A., Vellasco, M. (eds) Advances in Big Data. INNS 2016. Advances in Intelligent Systems and Computing, vol 529. Springer, Cham. https://doi.org/10.1007/978-3-319-47898-2_23
DOI: https://doi.org/10.1007/978-3-319-47898-2_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47897-5
Online ISBN: 978-3-319-47898-2
eBook Packages: Engineering, Engineering (R0)