Incremental Hierarchical Clustering of Stochastic Pattern-Based Symbolic Data

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9652)

Abstract

Classic data analysis techniques generally assume that each variable takes a single value. In the age of big data, however, data complexity has outgrown this classic framework: a variable's value may instead be a set of stochastic measurements. We refer to this case as stochastic pattern-based symbolic data, where each measurement set is an instance of an underlying stochastic pattern. No existing classic data analysis approach, such as the crystal item or fuzzy region ones, applies in this case. For this reason, we put forward a novel Incremental Hierarchical Clustering algorithm for stochastic Pattern-based Symbolic Data (IHCPSD). IHCPSD is robust to overlapping and missing measurements and well adapted to incremental learning. Experiments on synthetic data and an application to real-life emitter parameter data have validated its effectiveness.

X. Xu—This work was supported by National Natural Science Foundation of China (No. 61402426, 61373129) and partially supported by Collaborative Innovation Center of Novel Software Technology and Industrialization.

Author information

Corresponding author

Correspondence to Xin Xu.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Xu, X., Lu, J., Wang, W. (2016). Incremental Hierarchical Clustering of Stochastic Pattern-Based Symbolic Data. In: Bailey, J., Khan, L., Washio, T., Dobbie, G., Huang, J., Wang, R. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2016. Lecture Notes in Computer Science, vol 9652. Springer, Cham. https://doi.org/10.1007/978-3-319-31750-2_13

  • DOI: https://doi.org/10.1007/978-3-319-31750-2_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31749-6

  • Online ISBN: 978-3-319-31750-2

  • eBook Packages: Computer Science, Computer Science (R0)
