An Integrated Graph and Probability Based Clustering Framework for Sequential Data

  • Conference paper
Discovery Science (DS 2008)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5255)

Abstract

This paper proposes a new integrated framework for clustering sequential data, based on an iterative process that alternates between the EM procedure and a modified b-coloring clustering algorithm. It exhibits two important features. First, it assigns clusters to the sequences in such a way that the b-coloring properties are maintained throughout the clustering process. Second, it gives each cluster a twofold representation: a generative model (a Markov chain) and a set of dominant members, which ensure the global stability of the returned partition. The framework is evaluated on benchmark datasets from the UCI repository, and the results confirm its effectiveness.
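The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update rules, so the code below only shows (a) fitting a first-order Markov chain to the sequences of one cluster and scoring a sequence under it, and (b) checking the standard b-coloring property on a graph, where a color class plays the role of a cluster and a vertex adjacent to every other color class plays the role of a dominant member. All function names, the smoothing parameter `alpha`, and the toy data are assumptions made for illustration.

```python
import math


def fit_markov_chain(sequences, states, alpha=1.0):
    """Estimate initial and transition probabilities of a first-order
    Markov chain from a list of sequences, with additive smoothing
    (alpha is an illustrative assumption, not taken from the paper)."""
    init = {s: alpha for s in states}
    trans = {s: {t: alpha for t in states} for s in states}
    for seq in sequences:
        if not seq:
            continue
        init[seq[0]] += 1
        for a, b in zip(seq, seq[1:]):
            trans[a][b] += 1
    init_total = sum(init.values())
    init = {s: c / init_total for s, c in init.items()}
    for s in states:
        row_total = sum(trans[s].values())
        trans[s] = {t: c / row_total for t, c in trans[s].items()}
    return init, trans


def log_likelihood(seq, model):
    """Log-probability of a non-empty sequence under a fitted chain."""
    init, trans = model
    ll = math.log(init[seq[0]])
    for a, b in zip(seq, seq[1:]):
        ll += math.log(trans[a][b])
    return ll


def dominating_vertices(adjacency, coloring):
    """For each color, list the vertices that have at least one
    neighbor in every other color class."""
    colors = set(coloring.values())
    result = {c: [] for c in colors}
    for v, c in coloring.items():
        neighbor_colors = {coloring[u] for u in adjacency[v]}
        if colors - {c} <= neighbor_colors:
            result[c].append(v)
    return result


def is_b_coloring(adjacency, coloring):
    """A coloring is a b-coloring if it is proper (no edge joins two
    vertices of the same color) and every color class contains at
    least one dominating vertex."""
    proper = all(coloring[u] != coloring[v]
                 for v in adjacency for u in adjacency[v])
    return proper and all(dominating_vertices(adjacency, coloring).values())


# Toy data (hypothetical): two groups of symbol sequences over {a, b}.
states = "ab"
model_0 = fit_markov_chain(["aab", "aaab"], states)
model_1 = fit_markov_chain(["bba", "bbba"], states)

# Toy graph (hypothetical): a path on four vertices, 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
two_colors = {0: 0, 1: 1, 2: 0, 3: 1}
three_colors = {0: 0, 1: 1, 2: 2, 3: 0}
```

In this toy setting, a sequence would be scored under each cluster's chain with `log_likelihood` and compared across clusters, while `is_b_coloring` checks whether a given assignment of colors still has a dominating vertex in every class. How the paper couples these two steps is not specified in the abstract and is left open here.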

Copyright information

© 2008 Springer Berlin Heidelberg

About this paper

Cite this paper

Elghazel, H., Yoshida, T., Hacid, MS. (2008). An Integrated Graph and Probability Based Clustering Framework for Sequential Data. In: Boulicaut, JF., Berthold, M.R., Horváth, T. (eds) Discovery Science. DS 2008. Lecture Notes in Computer Science, vol 5255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88411-8_24

  • DOI: https://doi.org/10.1007/978-3-540-88411-8_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-88410-1

  • Online ISBN: 978-3-540-88411-8

  • eBook Packages: Computer Science, Computer Science (R0)
