Abstract
In this work we propose a method for computing a minimum-size training-set-consistent subset for the nearest neighbor rule (also known as the CNN problem) via SAT encodings. We introduce the SAT–CNN algorithm, which exploits a suitable encoding of the CNN problem as a sequence of SAT problems in order to solve it exactly, provided that enough computational resources are available. A comparison of SAT–CNN with well-known greedy methods shows that SAT–CNN is able to return better solutions. The proposed approach can be extended to several hard subset-selection classification problems.
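The abstract summarizes SAT–CNN without giving the encoding itself. As an illustrative sketch only (not the authors' encoding), the code below makes the underlying optimization problem concrete: find the smallest subset S of the training set such that every training point is classified correctly by the 1-nearest-neighbor rule using only S. The exhaustive search over subset sizes is a brute-force stand-in for the paper's sequence of SAT queries of the form "does a consistent subset of size at most k exist?"; all function names and the one-dimensional toy data are invented for illustration.

```python
from itertools import combinations

def nn_label(q, subset, X, y):
    # Label of q's nearest neighbor among the selected training indices.
    best = min(subset, key=lambda i: abs(X[i] - q))
    return y[best]

def is_consistent(subset, X, y):
    # A subset is "training-set consistent" (Hart's sense) when every
    # training point is classified correctly by 1-NN restricted to it.
    return all(nn_label(X[p], subset, X, y) == y[p] for p in range(len(X)))

def min_consistent_subset(X, y):
    # Exhaustive search in increasing subset size; the first consistent
    # subset found is therefore minimum-size. SAT-CNN replaces this
    # exponential loop with a sequence of SAT decision problems
    # "is there a consistent subset of size <= k?".
    for k in range(1, len(X) + 1):
        for subset in combinations(range(len(X)), k):
            if is_consistent(subset, X, y):
                return subset
    return tuple(range(len(X)))  # the full set is always consistent
```

For example, on the points 0, 1, 10, 11 with labels A, A, B, B, a single point can never be consistent (it misclassifies the other class), while one point per cluster suffices, so the minimum consistent subset has size two.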
Copyright information
© 2008 International Federation for Information Processing
Cite this paper
Angiulli, F., Basta, S. (2008). Optimal Subset Selection for Classification through SAT Encodings. In: Bramer, M. (eds) Artificial Intelligence in Theory and Practice II. IFIP AI 2008. IFIP – The International Federation for Information Processing, vol 276. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-09695-7_30
Print ISBN: 978-0-387-09694-0
Online ISBN: 978-0-387-09695-7