Data Mining by Attribute Decomposition with Semiconductor Manufacturing Case Study

  • Oded Maimon
  • Lior S. Rokach
Part of the Massive Computing book series (MACO, volume 3)

Abstract

This chapter examines the Attribute Decomposition Approach, with a simple Bayesian combination, for dealing with classification problems in the semiconductor industry. Classification problems in this industry often involve a large number of attributes (due to the complexity of the manufacturing process) and a moderate number of records. In the Attribute Decomposition Approach, the set of input attributes is automatically decomposed into several subsets; a classification model is built for each subset, and all the models are then combined using a simple Bayesian combination. This chapter presents the theoretical and practical foundations of the Attribute Decomposition Approach. A greedy procedure, called D-IFN, is developed to decompose the input attribute set into subsets and to build a classification model for each subset separately. The algorithm has been applied to problems in the semiconductor industry and to a variety of databases from other application domains. The results of an empirical comparison with well-known classification methods (such as C4.5) indicate the superiority of the decomposition approach.
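To make the combination step concrete, the sketch below illustrates the general idea under the usual conditional-independence assumption. It is a minimal illustration, not the chapter's implementation: scikit-learn decision trees stand in for the Info-Fuzzy Network base models, a fixed hand-picked decomposition stands in for the greedy D-IFN search, and the function names and synthetic data are hypothetical.

```python
# Minimal sketch of attribute decomposition with simple Bayesian combination.
# NOT the chapter's D-IFN algorithm: the decomposition is fixed by hand and
# decision trees replace the Info-Fuzzy Network base models.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def fit_decomposed(X, y, subsets):
    """Train one base classifier per attribute subset (lists of column indices)."""
    return [DecisionTreeClassifier(max_depth=3).fit(X[:, cols], y)
            for cols in subsets]

def predict_bayes_combination(models, subsets, X, priors):
    """Simple Bayesian combination of per-subset posteriors.

    Assuming the subsets are conditionally independent given the class:
        P(c | x)  proportional to  P(c) * prod_k [ P(c | x_k) / P(c) ],
    i.e. in log space: sum_k log P(c | x_k) - (K - 1) * log P(c).
    """
    K = len(models)
    log_scores = np.tile(-(K - 1) * np.log(priors), (X.shape[0], 1))
    for model, cols in zip(models, subsets):
        proba = np.clip(model.predict_proba(X[:, cols]), 1e-9, 1.0)
        log_scores += np.log(proba)  # columns align: every model saw all classes
    return models[0].classes_[np.argmax(log_scores, axis=1)]

# Illustrative run on synthetic data with a hypothetical two-subset decomposition:
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = (X[:, 0] + X[:, 3] > 0).astype(int)
subsets = [[0, 1, 2], [3, 4, 5]]
models = fit_decomposed(X, y, subsets)
priors = np.bincount(y) / len(y)
predictions = predict_bayes_combination(models, subsets, X, priors)
```

Because each base model sees only a few attributes, it can be trained reliably from a moderate number of records; the Bayesian combination then pools the subset posteriors while correcting for the class prior being counted once per subset.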

Keywords

Data Mining · Feature Selection · Classification Problem · Decomposition Approach · Target Attribute


References

  1. Attneave, F., Applications of Information Theory to Psychology, Holt, Rinehart and Winston, 1959.
  2. Biermann, A. W., Fairfield, J., and Beres, T., "Signature table systems and learning," IEEE Transactions on Systems, Man, and Cybernetics, 12(5): 635–648, 1982.
  3. Buntine, W., "Graphical Models for Discovering Knowledge," in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 59–82, AAAI/MIT Press, 1996.
  4. Cover, T. M., and Thomas, J. A., Elements of Information Theory, Wiley, 1991.
  5. Dietterich, T. G., and Michalski, R. S., "A comparative review of selected methods for learning from examples," Machine Learning: An Artificial Intelligence Approach, 1: 41–81, 1983.
  6. Domingos, P., and Pazzani, M., "On the Optimality of the Simple Bayesian Classifier under Zero-One Loss," Machine Learning, 29: 103–130, 1997.
  7. Dougherty, J., Kohavi, R., and Sahami, M., "Supervised and unsupervised discretization of continuous features," in Proceedings of the Twelfth International Conference on Machine Learning, pp. 194–202, 1995.
  8. Duda, R., and Hart, P., Pattern Classification and Scene Analysis, New York, NY: Wiley, 1973.
  9. Dunteman, G. H., Principal Components Analysis, Sage Publications, 1989.
  10. Elder IV, J. F., and Pregibon, D., "A Statistical Perspective on Knowledge Discovery in Databases," in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 83–113, AAAI/MIT Press, 1996.
  11. Fayyad, U., Piatetsky-Shapiro, G., and Smyth, P., "From Data Mining to Knowledge Discovery: An Overview," in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy, editors, Advances in Knowledge Discovery and Data Mining, pp. 1–30, MIT Press, 1996.
  12. Friedman, J. H., and Tukey, J. W., "A Projection Pursuit Algorithm for Exploratory Data Analysis," IEEE Transactions on Computers, 23(9): 881–889, 1974.
  13. Friedman, J. H., "On bias, variance, 0/1-loss and the curse of dimensionality," Data Mining and Knowledge Discovery, 1(1): 55–77, 1997.
  14. Heckerman, D., "Bayesian Networks for Data Mining," Data Mining and Knowledge Discovery, 1(1): 79–119, 1997.
  15. Kim, J. O., and Mueller, C. W., Factor Analysis: Statistical Methods and Practical Issues, Sage Publications, 1978.
  16. Kononenko, I., "Comparison of inductive and naive Bayesian learning approaches to automatic knowledge acquisition," in Current Trends in Knowledge Acquisition, IOS Press, 1990.
  17. Kononenko, I., "Semi-naive Bayesian classifier," in Proceedings of the Sixth European Working Session on Learning, Springer-Verlag, pp. 206–219, 1991.
  18. Langley, P., "Selection of relevant features in machine learning," in Proceedings of the AAAI Fall Symposium on Relevance, AAAI Press, 1994.
  19. Liu, H., and Motoda, H., Feature Selection for Knowledge Discovery and Data Mining, Kluwer Academic Publishers, 1998.
  20. Maimon, O., and Last, M., Knowledge Discovery and Data Mining: The Info-Fuzzy Network (IFN) Methodology, Kluwer Academic Publishers, 2000.
  21. McMenamin, S., and Monforte, F., "Short Term Energy Forecasting with Neural Networks," The Energy Journal, 19(4): 43–61, 1998.
  22. Merz, C. J., and Murphy, P. M., UCI Repository of Machine Learning Databases, Irvine, CA: University of California, Department of Information and Computer Science, 1998.
  23. Michie, D., "Problem decomposition and the learning of skills," in Proceedings of the European Conference on Machine Learning, Springer-Verlag, pp. 17–31, 1995.
  24. Mitchell, T. M., Machine Learning, McGraw-Hill, 1997.
  25. Pfahringer, B., "Controlling constructive induction in CiPF," in Proceedings of the European Conference on Machine Learning, Springer-Verlag, pp. 242–256, 1994.
  26. Quinlan, J. R., "Induction of Decision Trees," Machine Learning, 1(1): 81–106, 1986.
  27. Quinlan, J. R., C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993.
  28. Ragavan, H., and Rendell, L., "Lookahead feature construction for learning hard concepts," in Proceedings of the Tenth International Machine Learning Conference, Morgan Kaufmann, pp. 252–259, 1993.
  29. Samuel, A., "Some studies in machine learning using the game of checkers II: Recent progress," IBM Journal of Research and Development, 11: 601–617, 1967.
  30. Shapiro, A. D., Structured Induction in Expert Systems, Turing Institute Press in association with Addison-Wesley Publishing Company, 1987.
  31. Schwarz, G., "Estimating the Dimension of a Model," Annals of Statistics, 6: 461–464, 1978.
  32. Van Zant, P., Microchip Fabrication: A Practical Guide to Semiconductor Processing, third edition, New York: McGraw-Hill, 1997.
  33. Walpole, R. E., and Myers, R. H., Probability and Statistics for Engineers and Scientists, pp. 268–272, 1986.
  34. Zupan, B., Bohanec, M., Demsar, J., and Bratko, I., "Learning by discovering concept hierarchies," Artificial Intelligence, 109: 211–242, 1999.
  35. Zupan, B., Bohanec, M., Demsar, J., and Bratko, I., "Feature transformation by function decomposition," IEEE Intelligent Systems & Their Applications, 13: 38–43, 1998.

Copyright information

© Springer Science+Business Media Dordrecht 2001

Authors and Affiliations

  • Oded Maimon (1)
  • Lior S. Rokach (1)

  1. Department of Industrial Engineering, Tel-Aviv University, Israel
