Abstract
Recent work has focused on techniques for constructing a learning machine able to classify, at any given accuracy, all members of two mutually exclusive classes. Good numerical results have been reported; however, some concerns remain about prediction ability on large databases. This paper introduces clustering, which decreases the number of variables in the linear programming (LP) models that must be solved at each iteration. Preliminary results show better prediction accuracy while keeping the good characteristics of the previous classification scheme: a piecewise (non)linear surface that discriminates individuals from two classes with an a priori classification accuracy is built, and at each iteration a new piece of the surface is obtained by solving an LP model. The proposed technique reduces the number of LP variables by linking one error variable to each cluster, instead of linking one error variable to each individual in the population. Preliminary numerical results are reported on real datasets from the Irvine repository of machine learning databases.
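To illustrate the variable reduction described in the abstract, the following Python sketch builds the data of a soft-margin LP in which all individuals of a cluster share one error variable. The formulation (constraints y_i (w·x_i − γ) + e_k ≥ 1 with e_k ≥ 0, and an objective that weights each cluster error by the cluster size), the function name `build_cluster_lp`, and the toy data are assumptions made for illustration only; they are not taken from the paper, whose exact model may differ.

```python
from typing import List, Sequence, Tuple


def build_cluster_lp(
    points: Sequence[Sequence[float]],
    labels: Sequence[int],
    clusters: Sequence[int],
) -> Tuple[List[float], List[List[float]], List[float]]:
    """Build the data of  min cost.z  subject to  rows.z >= rhs.

    The variable vector is z = (w_1..w_d, gamma, e_1..e_K), where K is the
    number of clusters. Non-negativity of the e_k is assumed to be imposed
    separately as variable bounds by whatever LP solver is used.
    """
    n_features = len(points[0])
    n_clusters = max(clusters) + 1  # cluster ids are assumed to be 0..K-1
    n_vars = n_features + 1 + n_clusters

    # Objective: each cluster error counted once per member (assumed weighting).
    cost = [0.0] * n_vars
    for k in clusters:
        cost[n_features + 1 + k] += 1.0

    rows: List[List[float]] = []
    rhs: List[float] = []
    for x, y, k in zip(points, labels, clusters):
        # Constraint for one individual: y * (w.x - gamma) + e_k >= 1
        row = [0.0] * n_vars
        for j, xj in enumerate(x):
            row[j] = y * xj
        row[n_features] = -y             # coefficient of gamma
        row[n_features + 1 + k] = 1.0    # error variable shared by cluster k
        rows.append(row)
        rhs.append(1.0)
    return cost, rows, rhs


if __name__ == "__main__":
    # Hypothetical toy data: six 2-D individuals, two classes, two clusters.
    points = [(0, 0), (0, 1), (1, 0), (3, 3), (3, 4), (4, 3)]
    labels = [-1, -1, -1, 1, 1, 1]
    clusters = [0, 0, 0, 1, 1, 1]
    cost, rows, rhs = build_cluster_lp(points, labels, clusters)
    per_individual = len(points[0]) + 1 + len(points)
    print("LP columns with cluster errors:", len(cost))
    print("LP columns with one error per individual:", per_individual)
```

In this toy case the model has 5 columns (two weights, one threshold, two cluster errors) instead of the 9 that one error variable per individual would require, while the number of constraints stays at one per individual. The returned arrays can be passed to any LP solver after converting the `>=` rows to the solver's expected form.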
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Manzanilla-Salazar, O.G., Espinal-Kohler, J., García-Palomares, U.M. (2014). Minimizing Cluster Errors in LP-Based Nonlinear Classification. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2014. Lecture Notes in Computer Science, vol 8556. Springer, Cham. https://doi.org/10.1007/978-3-319-08979-9_13
DOI: https://doi.org/10.1007/978-3-319-08979-9_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08978-2
Online ISBN: 978-3-319-08979-9
eBook Packages: Computer Science (R0)