Outlier Detection with Uncertain Data Using Graphics Processors

  • Living reference work entry
Encyclopedia of Social Network Analysis and Mining

Synonyms

Data mining; Graphics processors; Parallel processing; Outlier detection; Uncertain data

Glossary

Outlier detection:

A data mining task in which data points that are outside expected patterns in a given dataset are identified.

Parallel processing:

A technique in which a task is split into multiple parts to be executed simultaneously by multiple processors.

Graphics processing unit (GPU):

A specialized processor designed to perform large numbers of mathematical operations in parallel, primarily for generating 3D graphics. Modern GPUs can also be programmed to perform a variety of other tasks.

General-purpose computing using GPUs (GPGPU):

Programming GPUs for computational tasks other than graphics.

Floating-point operations per second (FLOPS):

A measure of computing performance: the number of floating-point mathematical operations executed per second, often expressed in billions of FLOPS (GFLOPS).
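
To make the first two glossary terms concrete, the following is a minimal sketch of a simple distance-based outlier test whose work is split into parts and handed to several workers. It is an illustration only and not the method of this entry, which concerns uncertain data on GPUs. The function names, the `radius` and `min_neighbors` parameters, and the use of a thread pool are all assumptions made for the example; in CPython, threads show the structure of splitting a task but do not give a real speedup for this kind of computation.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def neighbor_count(point, data, radius):
    """Count the other points in `data` that lie within `radius` of `point`.

    `point` is assumed to be a member of `data`, so one match (the point
    itself) is subtracted.
    """
    return sum(1 for other in data if math.dist(point, other) <= radius) - 1


def find_outliers(data, radius, min_neighbors, workers=4):
    """Flag points that have fewer than `min_neighbors` neighbors within `radius`.

    The list of points is split into `workers` parts, and each part is
    scanned by a separate worker (hypothetical parameters for illustration).
    """
    # Split the task: worker i handles every `workers`-th point starting at i.
    chunks = [data[i::workers] for i in range(workers)]

    def scan(chunk):
        return [p for p in chunk if neighbor_count(p, data, radius) < min_neighbors]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(scan, chunks)

    # Merge the partial results into one sorted list.
    return sorted(p for part in parts for p in part)


if __name__ == "__main__":
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(find_outliers(points, radius=1.5, min_neighbors=2))
```

In the sample data, the four corner points of the unit square are all within distance 1.5 of one another, while `(10, 10)` has no neighbors at that radius and is the only point flagged.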

Definition

Outlier detection, also known as anomaly detection, is a widely used fundamental data...

Acknowledgments

The work described in this entry was partially supported by grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (PolyU 5191/09E, PolyU 5182/08E, PolyU 5166/11E), and the Hong Kong PhD Fellowship.

Corresponding author

Correspondence to Takazumi Matsumoto.

Copyright information

© 2018 Springer Science+Business Media LLC

About this entry

Cite this entry

Matsumoto, T., Hung, E., Yiu, M.L. (2018). Outlier Detection with Uncertain Data Using Graphics Processors. In: Alhajj, R., Rokne, J. (eds) Encyclopedia of Social Network Analysis and Mining. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7163-9_376-1

  • DOI: https://doi.org/10.1007/978-1-4614-7163-9_376-1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-7163-9

  • Online ISBN: 978-1-4614-7163-9

  • eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
