Improving Supervised Learning by Feature Decomposition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2284)

Abstract

This paper presents the Feature Decomposition Approach for improving supervised learning tasks. Whereas Feature Selection aims to identify a representative subset of features from which to construct a classification model, Feature Decomposition aims to decompose the original set of features into several subsets. A classification model is built for each subset, and all the generated models are then combined. This paper presents theoretical and practical aspects of the Feature Decomposition Approach. A greedy procedure, called DOT (Decomposed Oblivious Trees), is developed to decompose the input feature set into subsets and to build a classification model for each subset separately. The results of an empirical comparison with well-known learning algorithms (such as C4.5) indicate the superiority of the Feature Decomposition Approach in learning tasks with a high number of features and a moderate number of tuples.
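To make the decompose-train-combine structure concrete, the following minimal Python sketch illustrates the general approach described in the abstract. It is not the paper's DOT procedure: the round-robin assignment of features to subsets is a placeholder for DOT's greedy decomposition, scikit-learn's DecisionTreeClassifier stands in for the oblivious trees that DOT builds, and the sub-models are combined by summing their class-probability estimates, one plausible combination rule among several.

    # Hypothetical sketch of feature decomposition (not the paper's DOT algorithm):
    # partition the features into subsets, fit one classifier per subset, and
    # combine the per-subset class-probability estimates.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Placeholder decomposition: round-robin assignment of the feature indices
    # to k disjoint subsets (DOT instead grows the subsets greedily).
    k = 3
    subsets = [list(range(i, X.shape[1], k)) for i in range(k)]

    # Build one classification model per feature subset.
    models = [DecisionTreeClassifier(random_state=0).fit(X_train[:, s], y_train)
              for s in subsets]

    # Combine the models: sum the class-probability estimates and predict the
    # argmax, i.e., a uniform probabilistic vote over the sub-models.
    proba = sum(m.predict_proba(X_test[:, s]) for m, s in zip(models, subsets))
    y_pred = np.argmax(proba, axis=1)
    print("combined accuracy:", np.mean(y_pred == y_test))

Because each sub-model estimates a distribution over a lower-dimensional feature space, it needs fewer training tuples to do so reliably, which is why decomposition is expected to pay off precisely in the many-features, few-tuples regime the abstract describes.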

References

  1. Ali, K. M., and Pazzani, M. J., "Error Reduction through Learning Multiple Descriptions," Machine Learning, 24(3): 173–202, 1996.

  2. Almuallim, H., and Dietterich, T. G., "Learning Boolean concepts in the presence of many irrelevant features," Artificial Intelligence, 69(1–2): 279–306, 1994.

  3. Attneave, F., Applications of Information Theory to Psychology, Holt, Rinehart and Winston, 1959.

  4. Bay, S., "Nearest neighbor classification from multiple feature subsets," Intelligent Data Analysis, 3(3): 191–209, 1999.

  5. Bellman, R., Adaptive Control Processes: A Guided Tour, Princeton University Press, 1961.

  6. Blum, A., and Mitchell, T., "Combining Labeled and Unlabeled Data with Co-Training," COLT: Proceedings of the Workshop on Computational Learning Theory, Morgan Kaufmann, 1998.

  7. Buntine, W., "Graphical Models for Discovering Knowledge," in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 59–82, AAAI/MIT Press, 1996.

  8. Chan, P. K., and Stolfo, S. J., "A Comparative Evaluation of Voting and Meta-learning on Partitioned Data," Proceedings of the 12th International Conference on Machine Learning (ICML-95), 1995.

  9. Dietterich, T. G., and Bakiri, G., "Solving multiclass learning problems via error-correcting output codes," Journal of Artificial Intelligence Research, 2: 263–286, 1995.

  10. Domingos, P., and Pazzani, M., "On the Optimality of the Simple Bayesian Classifier under Zero-One Loss," Machine Learning, 29: 103–130, 1997.

  11. Duda, R., and Hart, P., Pattern Classification and Scene Analysis, New York, NY: Wiley, 1973.

  12. Dunteman, G. H., Principal Components Analysis, Sage Publications, 1989.

  13. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., "From Data Mining to Knowledge Discovery: An Overview," in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 1–30, MIT Press, 1996.

  14. Friedman, J. H., and Tukey, J. W., "A Projection Pursuit Algorithm for Exploratory Data Analysis," IEEE Transactions on Computers, 23(9): 881–889, 1974.

  15. Friedman, J. H., "On bias, variance, 0/1-loss and the curse of dimensionality," Data Mining and Knowledge Discovery, 1(1): 55–77, 1997.

  16. Fukunaga, K., Introduction to Statistical Pattern Recognition, San Diego, CA: Academic Press, 1990.

  17. Hwang, J., Lay, S., and Lippman, A., "Nonparametric multivariate density estimation: A comparative study," IEEE Transactions on Signal Processing, 42: 2795–2810, 1994.

  18. Jimenez, L. O., and Landgrebe, D. A., "Supervised Classification in High-Dimensional Space: Geometrical, Statistical, and Asymptotical Properties of Multivariate Data," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 28: 39–54, 1998.

  19. Kim, J. O., and Mueller, C. W., Factor Analysis: Statistical Methods and Practical Issues, Sage Publications, 1978.

  20. Kononenko, I., "Comparison of inductive and naive Bayesian learning approaches to automatic knowledge acquisition," in Current Trends in Knowledge Acquisition, IOS Press, 1990.

  21. Kononenko, I., "Semi-naive Bayesian classifier," Proceedings of the Sixth European Working Session on Learning, Springer-Verlag, pp. 206–219, 1991.

  22. Kusiak, A., "Decomposition in Data Mining: An Industrial Case Study," IEEE Transactions on Electronics Packaging Manufacturing, 23(4): 345–353, 2000.

  23. Langley, P., "Selection of relevant features in machine learning," Proceedings of the AAAI Fall Symposium on Relevance, AAAI Press, 1994.

  24. Langley, P., and Sage, S., "Oblivious decision trees and abstract cases," Working Notes of the AAAI-94 Workshop on Case-Based Reasoning, Seattle, WA: AAAI Press, pp. 113–117, 1994.

  25. Liu, H., and Motoda, H., Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers, 1998.

  26. Maimon, O., and Last, M., Knowledge Discovery and Data Mining: The Info-Fuzzy Network (IFN) Methodology, Kluwer Academic Publishers, 2000.

  27. Maimon, O., and Rokach, L., "Data Mining by Attribute Decomposition with Semiconductors Manufacturing Case Study," in D. Braha, editor, Data Mining for Design and Manufacturing: Methods and Applications, Kluwer Academic Publishers, 2001.

  28. Mansour, Y., and McAllester, D., "Generalization Bounds for Decision Trees," COLT 2000, pp. 220–224, 2000.

  29. Merz, C. J., and Murphy, P. M., UCI Repository of Machine Learning Databases, Irvine, CA: University of California, Department of Information and Computer Science, 1998.

  30. Michie, D., "Problem decomposition and the learning of skills," Proceedings of the European Conference on Machine Learning, Springer-Verlag, pp. 17–31, 1995.

  31. Pfahringer, B., "Controlling constructive induction in CiPF," Proceedings of the European Conference on Machine Learning, Springer-Verlag, pp. 242–256, 1994.

  32. Pickard, L., Kitchenham, B., and Linkman, S., "An investigation of analysis techniques for software datasets," Proceedings of the 6th IEEE International Software Metrics Symposium, Boca Raton, FL: IEEE Computer Society, 1999.

  33. Quinlan, J. R., C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.

  34. Ridgeway, G., Madigan, D., Richardson, T., and O'Kane, J., "Interpretable Boosted Naive Bayes Classification," Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining, pp. 101–104, 1998.

  35. Salzberg, S. L., "On Comparing Classifiers: Pitfalls to Avoid and a Recommended Approach," Data Mining and Knowledge Discovery, 1: 312–327, Kluwer Academic Publishers, Boston, 1997.

  36. Schmitt, M., "On the complexity of computing and learning with multiplicative neural networks," Neural Computation, 2001 (to appear).

  37. Schlimmer, J. C., "Efficiently inducing determinations: A complete and systematic search algorithm that uses optimal pruning," Proceedings of the 1993 International Conference on Machine Learning, pp. 284–290, San Mateo, CA: Morgan Kaufmann, 1993.

  38. Shapiro, A. D., Structured Induction in Expert Systems, Turing Institute Press in association with Addison-Wesley, 1987.

  39. Vapnik, V. N., The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.

  40. Wallace, C. S., "MML Inference of Predictive Trees, Graphs and Nets," in A. Gammerman, editor, Computational Learning and Probabilistic Reasoning, Wiley, pp. 43–66, 1996.

  41. Walpole, R. E., and Myers, R. H., Probability and Statistics for Engineers and Scientists, pp. 268–272, 1986.

  42. Zaki, M. J., and Ho, C. T., editors, Large-Scale Parallel Data Mining, New York: Springer-Verlag, 2000.

  43. Zupan, B., Bohanec, M., Demsar, J., and Bratko, I., "Feature transformation by function decomposition," IEEE Intelligent Systems & Their Applications, 13: 38–43, 1998.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maimon, O., Rokach, L. (2002). Improving Supervised Learning by Feature Decomposition. In: Eiter, T., Schewe, KD. (eds) Foundations of Information and Knowledge Systems. FoIKS 2002. Lecture Notes in Computer Science, vol 2284. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45758-5_12

Download citation

  • DOI: https://doi.org/10.1007/3-540-45758-5_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43220-3

  • Online ISBN: 978-3-540-45758-9

  • eBook Packages: Springer Book Archive
