Abstract
In this work, we propose ULR-Discr, a novel unsupervised discretization method based on a Left to Right (LR) scanning technique. Its originality lies in applying fusion and division operations at the same time. Among its strengths, we report two advantages. First, the algorithm traverses the input stream in a single pass, which significantly reduces time complexity relative to previous work. Second, any cut-point function can easily be supplied to reach the desired effectiveness. To evaluate the method, we conducted extensive experiments on large datasets, comparing it with several classical and recent discretization methods.
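The abstract does not give the algorithm itself, so the following is only an illustrative sketch of the general idea of a single left-to-right pass with a pluggable cut-point function. It is not the authors' ULR-Discr procedure. The names `lr_discretize`, `gap_cut`, and `max_size`, the gap-based cut rule, and the assumption that the input is already sorted are all hypothetical choices made for this example.

```python
from typing import Callable, List, Sequence

# Hypothetical signature for a cut-point function: given the interval built
# so far and the next value, return True if a boundary should be placed.
CutPoint = Callable[[List[float], float], bool]


def gap_cut(threshold: float) -> CutPoint:
    """Example cut rule (an assumption): cut when the gap between the next
    value and the last value of the current interval exceeds `threshold`."""
    def decide(interval: List[float], value: float) -> bool:
        return value - interval[-1] > threshold
    return decide


def lr_discretize(sorted_values: Sequence[float],
                  cut: CutPoint,
                  max_size: int) -> List[List[float]]:
    """Scan sorted values once, left to right.

    Fusion: the value is appended to the current interval.
    Division: the current interval is closed and a new one is started,
    either because `cut` asks for it or because the interval already
    holds `max_size` values.
    """
    intervals: List[List[float]] = []
    current: List[float] = []
    for value in sorted_values:
        if current and (cut(current, value) or len(current) >= max_size):
            intervals.append(current)   # division
            current = [value]
        else:
            current.append(value)       # fusion
    if current:
        intervals.append(current)
    return intervals


values = [1.0, 1.2, 1.3, 5.0, 5.1, 9.0]
bins = lr_discretize(values, gap_cut(1.0), max_size=10)
print(bins)
```

Each value is visited exactly once, so the scan is linear in the number of values once they are sorted. Swapping `gap_cut` for another function with the same signature changes the discretization criterion without touching the scan loop.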
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Drias, H., Rehkab, N., Moulai, H. (2017). ULR-Discr: A New Unsupervised Approach for Discretization. In: Ghosh, A., Pal, R., Prasath, R. (eds) Mining Intelligence and Knowledge Exploration. MIKE 2017. Lecture Notes in Computer Science, vol 10682. Springer, Cham. https://doi.org/10.1007/978-3-319-71928-3_36
DOI: https://doi.org/10.1007/978-3-319-71928-3_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-71927-6
Online ISBN: 978-3-319-71928-3
eBook Packages: Computer Science, Computer Science (R0)