EVIDIST: A Similarity Measure for Uncertain Data Streams

  • Abdelwaheb Ferchichi (Email author)
  • Mohamed Salah Gouider
  • Lamjed Ben Said
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9375)

Abstract

The large volume of data generated by sensors and the growing use of privacy-preserving techniques have led to increasing interest in mining uncertain data streams. Traditional distance measures such as the Euclidean distance do not always work well for uncertain data streams. In this paper, we present EVIDIST (Evidential Distance), a new distance measure for uncertain data streams in which uncertainty is modeled as sample observations at each time slot. We conduct an extensive experimental evaluation of EVIDIST on the 1-NN classification task with 15 real datasets. The results show that, compared with the Euclidean distance, EVIDIST increases classification accuracy by about 13% and is also far more resilient to error.
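
To make the setting concrete, the following is a minimal sketch of the Euclidean baseline mentioned above, applied to 1-NN classification. It is not EVIDIST, whose definition is not given in this abstract. It assumes each uncertain series is a list of time slots, each holding several sample observations, and that the baseline collapses each slot to its sample mean before computing the distance. The names `collapse`, `euclidean`, and `classify_1nn` are hypothetical.

```python
import math
from statistics import mean


def collapse(series):
    """Reduce an uncertain series to one value per time slot.

    Assumption of this sketch: each slot's samples are replaced by their mean.
    """
    return [mean(slot) for slot in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("series must have the same number of time slots")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify_1nn(query, training):
    """Return the label of the training series nearest to the query.

    `training` is a list of (uncertain_series, label) pairs.
    """
    q = collapse(query)
    best_label, best_dist = None, math.inf
    for series, label in training:
        d = euclidean(q, collapse(series))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data (invented for illustration): two time slots, two samples per slot.
training = [
    ([[0.0, 0.2], [1.0, 1.2]], "low"),
    ([[5.0, 5.2], [6.0, 6.2]], "high"),
]
query = [[0.1, 0.3], [0.9, 1.1]]
predicted = classify_1nn(query, training)
```

Collapsing each slot to a single value discards the spread of the samples, which is one plausible reason a plain Euclidean baseline may handle uncertain streams poorly.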

Keywords

Data mining; Distance measure; Similarity; Uncertain data streams

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Abdelwaheb Ferchichi (1), Email author
  • Mohamed Salah Gouider (1)
  • Lamjed Ben Said (1)
  1. SOIE, University of Tunis, Le Bardo, Tunisia
