An Algorithm for Outlier Detection on Uncertain Data Stream

  • Keyan Cao
  • Donghong Han
  • Guoren Wang
  • Yachao Hu
  • Ye Yuan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7808)

Abstract

Outlier detection plays an important role in fraud detection, sensor networks, computer network management, and many other areas. As the streaming nature and uncertainty of data become increasingly apparent, outlier detection on uncertain data streams has emerged as a new research topic. First, we propose a new outlier concept for uncertain data streams based on possible worlds. We then propose an outlier detection method for uncertain data streams that meets the demands of limited storage and real-time processing. Next, we design a dynamic storage structure for outlier detection over a sliding window, again to meet the demands of limited storage and real-time response. Furthermore, we propose an efficient range query method based on the SM-tree (Statistics M-tree) to reduce redundant calculation. Finally, we verify the performance of our method through extensive simulation experiments. The experimental results show that our method is an effective solution to outlier detection on uncertain data streams and that it significantly reduces execution time and storage space.
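
The abstract does not give the paper's formal outlier definition, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each stream tuple exists independently with a known probability, and it treats a tuple as an outlier when the total probability of the possible worlds in which it has fewer than `k` neighbours within distance `r` exceeds a threshold. The names `prob_fewer_than_k` and `SlidingWindowDetector` and all parameters are hypothetical. The sketch ignores the tuple's own existence probability, and it uses a plain linear scan for the range query rather than the SM-tree described above.

```python
from collections import deque
from math import dist


def prob_fewer_than_k(probs, k):
    """Probability that fewer than k of the independent events occur.

    Summing over possible worlds directly is exponential; this dynamic
    program over neighbour counts gives the same value in O(n * k).
    """
    # dp[j] = probability that exactly j neighbours exist so far (j < k).
    dp = [1.0] + [0.0] * (k - 1)
    for p in probs:
        for j in range(k - 1, 0, -1):
            dp[j] = dp[j] * (1 - p) + dp[j - 1] * p
        dp[0] *= 1 - p
    return sum(dp)


class SlidingWindowDetector:
    """Hypothetical count-based sliding window over uncertain tuples."""

    def __init__(self, window, r, k, threshold):
        self.items = deque(maxlen=window)  # oldest tuple is evicted automatically
        self.r = r
        self.k = k
        self.threshold = threshold

    def insert(self, point, prob):
        self.items.append((point, prob))

    def outliers(self):
        result = []
        for i, (x, _) in enumerate(self.items):
            # Linear-scan range query (assumption: no index structure).
            neighbour_probs = [
                p
                for j, (y, p) in enumerate(self.items)
                if j != i and dist(x, y) <= self.r
            ]
            if prob_fewer_than_k(neighbour_probs, self.k) > self.threshold:
                result.append(x)
        return result


detector = SlidingWindowDetector(window=4, r=1.0, k=2, threshold=0.5)
for point in [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]:
    detector.insert(point, 0.9)
print(detector.outliers())
```

In this usage example the three clustered tuples each have two likely neighbours, while the isolated tuple at `(5.0, 5.0)` has none in any possible world, so it is the only one reported.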

Keywords

Outlier detection; uncertain data stream; possible world

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Keyan Cao (1, 2)
  • Donghong Han (1, 2)
  • Guoren Wang (1, 2)
  • Yachao Hu (1, 2)
  • Ye Yuan (1, 2)

  1. College of Information Science & Engineering, Northeastern University, China
  2. Key Laboratory of Medical Image Computing (NEU), Ministry of Education, China
