Abstract
It is difficult to detect anomalies whose matching relationship among some data attributes differs markedly from that of the other samples in a dataset. To address this problem, an approach based on wavelet analysis for detecting and amending anomalous samples is proposed. By exploiting the multi-resolution and local-analysis properties of wavelet analysis, the approach detects and amends anomalous samples effectively. To compute the wavelet transform of a discrete sequence rapidly, a modified numerical algorithm based on the Newton-Cotes formula is also proposed. The experimental results show that the approach is feasible and practical.
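The general idea can be illustrated with a minimal Python sketch. It is not the modified algorithm of the paper, whose details are not given in this abstract. The sketch assumes a Mexican-hat mother wavelet, unit sample spacing, and the composite trapezoidal rule (the lowest-order closed Newton-Cotes formula) to approximate the wavelet transform integral. The function names, the scale value, and the neighbour-averaging repair step are all hypothetical choices made for illustration.

```python
import math


def mexican_hat(t):
    """Unnormalised Mexican-hat wavelet; chosen here only as an example."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)


def trapezoid(values, h=1.0):
    """Composite trapezoidal rule over equally spaced samples."""
    if len(values) < 2:
        return 0.0
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def wavelet_coefficient(x, scale, shift, support=5.0):
    """Approximate W(scale, shift) = scale**-0.5 * integral of x(t) * psi((t - shift) / scale) dt.

    The wavelet is treated as negligible beyond `support * scale` samples
    from the shift, and the window is clipped at the ends of the sequence.
    """
    lo = max(0, int(math.ceil(shift - support * scale)))
    hi = min(len(x) - 1, int(math.floor(shift + support * scale)))
    integrand = [x[k] * mexican_hat((k - shift) / scale) for k in range(lo, hi + 1)]
    return trapezoid(integrand) / math.sqrt(scale)


def most_anomalous_index(x, scale=2.0):
    """Return the position whose fine-scale coefficient has the largest magnitude."""
    magnitudes = [abs(wavelet_coefficient(x, scale, b)) for b in range(len(x))]
    return max(range(len(x)), key=magnitudes.__getitem__)


def amend(x, index):
    """Replace one sample by the mean of its neighbours (an assumed repair rule)."""
    y = list(x)
    left = x[index - 1] if index > 0 else x[index + 1]
    right = x[index + 1] if index < len(x) - 1 else x[index - 1]
    y[index] = 0.5 * (left + right)
    return y


# A smooth sequence with one injected spike at position 30.
signal = [math.sin(2 * math.pi * k / 64) for k in range(64)]
signal[30] += 5.0

idx = most_anomalous_index(signal)
repaired = amend(signal, idx)
```

Because the wavelet has zero mean, the smooth part of the sequence contributes little at a fine scale, while an isolated spike produces a large local coefficient. This is the local-analysis property the abstract refers to. A higher-order Newton-Cotes rule could be substituted for `trapezoid`, but its non-uniform weights would need care when the window slides one sample at a time.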
Additional information
Foundation item: Project(50374079) supported by the National Natural Science Foundation of China
About this article
Cite this article
Peng, Xq., Song, Yp., Tang, Y. et al. Approach based on wavelet analysis for detecting and amending anomalies in dataset. J Cent. South Univ. Technol. 13, 491–495 (2006). https://doi.org/10.1007/s11771-006-0074-9