Learning via Prototype Generation and Filtering

Chapter in Instance Selection and Construction for Data Mining

Abstract

The family of instance-based learning algorithms has been shown to be effective for learning classification schemes in many domains. However, these algorithms demand a high data retention rate and are sensitive to noise. We investigate an integration of instance-filtering and instance-averaging techniques to address these problems. We compare different variants of this integration, as well as existing learning algorithms such as C4.5 and kNN. Our new framework achieves good data reduction while maintaining or even improving classification accuracy on 19 real-world data sets.
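The abstract describes the framework only at a high level. As a rough illustration of how instance filtering and instance averaging can be combined for prototype generation, the Python sketch below pairs Wilson-style editing (discard instances misclassified by their own k nearest neighbors) with Chang-style pairwise averaging of nearby same-class instances. The function names, the merge threshold, and the specific filtering and merging rules are assumptions made for this example, not the chapter's actual algorithm.

```python
# A minimal sketch of prototype generation via filtering then averaging.
# Illustrative reconstruction only: the Wilson-style filter, Chang-style
# merge, and `threshold` parameter are assumptions, not the chapter's method.
import numpy as np

def knn_predict(X, y, query, k=3):
    """Majority vote among the k nearest training instances (Euclidean)."""
    dists = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, counts = np.unique(y[nearest], return_counts=True)
    return labels[np.argmax(counts)]

def filter_noise(X, y, k=3):
    """Instance filtering: drop instances misclassified by their own
    k nearest neighbors (leave-one-out), removing likely noise."""
    keep = [i for i in range(len(X))
            if knn_predict(np.delete(X, i, axis=0),
                           np.delete(y, i), X[i], k) == y[i]]
    return X[keep], y[keep]

def merge_prototypes(X, y, threshold=1.0):
    """Instance averaging: repeatedly replace the closest same-class pair
    with its mean, while that pair is closer than `threshold`."""
    X, y = X.copy(), y.copy()
    while len(X) > 1:
        d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
        d[y[:, None] != y[None, :]] = np.inf   # never merge across classes
        np.fill_diagonal(d, np.inf)            # ignore self-distances
        i, j = np.unravel_index(np.argmin(d), d.shape)
        if d[i, j] > threshold:
            break                              # remaining pairs too far apart
        proto, label = (X[i] + X[j]) / 2.0, y[i]
        X = np.vstack([np.delete(X, [i, j], axis=0), proto])
        y = np.append(np.delete(y, [i, j]), label)
    return X, y

# Example: two Gaussian classes; filter first, then average into prototypes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(4, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
Xf, yf = filter_noise(X, y, k=3)
Xp, yp = merge_prototypes(Xf, yf, threshold=1.0)
print(len(X), "instances ->", len(Xf), "after filtering ->", len(Xp), "prototypes")
```

Ordering matters in this sketch: filtering runs before averaging so that noisy instances are discarded before they can be averaged into prototypes, and the resulting prototype set then stands in for the full training set at classification time, which is where the data reduction reported in the abstract comes from.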




Copyright information

© 2001 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Lam, W., Keung, C.-K., Ling, C.X. (2001). Learning via Prototype Generation and Filtering. In: Liu, H., Motoda, H. (eds) Instance Selection and Construction for Data Mining. The Springer International Series in Engineering and Computer Science, vol 608. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-3359-4_13

  • DOI: https://doi.org/10.1007/978-1-4757-3359-4_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-4861-8

  • Online ISBN: 978-1-4757-3359-4

  • eBook Packages: Springer Book Archive
