Abstract
As datasets grow, reducing their size to lower computational cost has become an important task. Instance selection is a common preprocessing step in data mining that removes redundant instances and noisy points from a dataset. Various instance selection algorithms have been proposed. Most of them are effective at selecting convex instances but perform poorly on concave instances. This paper proposes a hybrid instance selection algorithm based on the convex hull and nearest neighbor information. First, the proposed algorithm identifies the nearest enemy of each data point. It then divides the dataset into several subsets by grouping the points that share the same nearest enemy into one subset. Finally, a convex hull algorithm is run on each subset to select its convex hull points. The algorithm is evaluated on 14 datasets and compared with several traditional instance selection algorithms. The experimental results show that the proposed algorithm outperforms the traditional algorithms it is compared with.
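The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: points are 2-D tuples, the nearest enemy is the closest point with a different label under Euclidean distance, subsets are formed purely by nearest-enemy index, and the hull is computed with Andrew's monotone chain (the abstract does not name a specific convex hull algorithm). The function names `nearest_enemy`, `convex_hull`, and `select_instances` are hypothetical, and the sketch does no separate noise filtering.

```python
from collections import defaultdict
from math import dist


def nearest_enemy(i, points, labels):
    """Index of the closest point whose label differs from that of point i."""
    enemies = [j for j in range(len(points)) if labels[j] != labels[i]]
    return min(enemies, key=lambda j: dist(points[i], points[j]))


def convex_hull(indices, points):
    """Hull vertex indices of a 2-D subset (Andrew's monotone chain)."""
    idx = sorted(set(indices), key=lambda k: points[k])
    if len(idx) < 3:
        return idx  # too few points for a hull: keep them all

    def cross(o, a, b):
        (ox, oy), (ax, ay), (bx, by) = points[o], points[a], points[b]
        return (ax - ox) * (by - oy) - (ay - oy) * (bx - ox)

    def half(seq):
        chain = []
        for k in seq:
            # Drop the last vertex while it makes a non-left turn.
            while len(chain) >= 2 and cross(chain[-2], chain[-1], k) <= 0:
                chain.pop()
            chain.append(k)
        return chain[:-1]  # the endpoint is the start of the other half

    return half(idx) + half(reversed(idx))


def select_instances(points, labels):
    """Group points by nearest enemy, then keep each group's hull points."""
    groups = defaultdict(list)
    for i in range(len(points)):
        groups[nearest_enemy(i, points, labels)].append(i)
    selected = set()
    for members in groups.values():
        selected.update(convex_hull(members, points))
    return sorted(selected)


# Toy data (assumed): a square of class 0 with one interior point,
# and a single class 1 point to the right.
points = [(0, 0), (0, 2), (2, 0), (2, 2), (1, 1), (10, 1)]
labels = [0, 0, 0, 0, 0, 1]
print(select_instances(points, labels))
```

In this toy case all class 0 points share the same nearest enemy, so they form one subset, and only its four corners survive; the interior point at index 4 is discarded as redundant. The brute-force nearest-enemy search is quadratic in the number of points, which is acceptable for a sketch but would need an index structure on larger data.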
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Wang, S., Tian, J. (2020). A Hybrid Instance Selection Method Based on Convex Hull and Nearest Neighbor. In: Chien, CF., Qi, E., Dou, R. (eds) IE&EM 2019. Springer, Singapore. https://doi.org/10.1007/978-981-15-4530-6_1
DOI: https://doi.org/10.1007/978-981-15-4530-6_1
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-4529-0
Online ISBN: 978-981-15-4530-6
eBook Packages: Business and Management (R0)