Abstract
Identifying potentially hazardous text is an important problem in intelligent data analysis. The methods and techniques commonly used to solve it have low efficiency under theme uncertainty.
This paper presents a novel approach to identifying potentially hazardous text under theme uncertainty. The main idea of the proposed data processing approach is to compare user-supplied keywords with automatically extracted keywords. The paper contains a brief overview of text identification methods, a description of the proposed approach, some statistical experimental results, a discussion, and a conclusion.
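The abstract does not say how keywords are extracted or how the two keyword sets are compared, so the following is only a minimal sketch of the general idea and not the authors' method. It assumes a simple frequency-based extractor and a Jaccard overlap score. The stop-word list, the function names, the `top_n` parameter, and the `threshold` value are all hypothetical choices made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for illustration).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def extract_keywords(text, top_n=5):
    """Return the top_n most frequent non-stop-word tokens.

    Assumed extractor: plain term frequency stands in for whatever
    automatic keyword extraction the paper actually uses.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return {word for word, _ in counts.most_common(top_n)}


def keyword_overlap(user_keywords, extracted_keywords):
    """Jaccard similarity between the two keyword sets, in [0, 1]."""
    user = {k.lower() for k in user_keywords}
    union = user | extracted_keywords
    if not union:
        return 0.0
    return len(user & extracted_keywords) / len(union)


def is_potentially_hazardous(text, user_keywords, threshold=0.2, top_n=5):
    """Flag a text when the overlap reaches an assumed threshold."""
    extracted = extract_keywords(text, top_n)
    return keyword_overlap(user_keywords, extracted) >= threshold
```

In this sketch the user describes the theme of interest only through a keyword list, so no theme-specific training corpus is needed. A real system would likely need stemming, a fuller stop-word list, and a tuned threshold.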
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Babutskiy, V., Sidorov, I. (2019). A Novel Approach to the Potentially Hazardous Text Identification Under Theme Uncertainty Based on Intelligent Data Analysis. In: Silhavy, R., Silhavy, P., Prokopova, Z. (eds) Computational and Statistical Methods in Intelligent Systems. CoMeSySo 2018. Advances in Intelligent Systems and Computing, vol 859. Springer, Cham. https://doi.org/10.1007/978-3-030-00211-4_4
DOI: https://doi.org/10.1007/978-3-030-00211-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00210-7
Online ISBN: 978-3-030-00211-4
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)