Abstract
Classic data analysis techniques generally assume that each variable takes a single value. However, data complexity in the age of big data has gone beyond this classic framework: a variable's value may instead take the form of a set of stochastic measurements. We refer to this case as stochastic pattern-based symbolic data, where each measurement set is an instance of an underlying stochastic pattern. In such a case, no existing classic data analysis approach, such as the crystal item or fuzzy region ones, can yet be applied. For this reason, we put forward a novel Incremental Hierarchical Clustering algorithm for stochastic Pattern-based Symbolic Data (IHCPSD). IHCPSD is robust to overlapping and missing measurements and well adapted to incremental learning. Experiments on synthetic data and an application to real-life emitter parameter data have validated its effectiveness.
X. Xu—This work was supported by National Natural Science Foundation of China (No. 61402426, 61373129) and partially supported by Collaborative Innovation Center of Novel Software Technology and Industrialization.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Xu, X., Lu, J., Wang, W. (2016). Incremental Hierarchical Clustering of Stochastic Pattern-Based Symbolic Data. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_13
DOI: https://doi.org/10.1007/978-3-319-31750-2_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31749-6
Online ISBN: 978-3-319-31750-2
eBook Packages: Computer Science, Computer Science (R0)