
Decomposition Methodology for Knowledge Discovery and Data Mining

Chapter in: Data Mining and Knowledge Discovery Handbook

Abstract

The idea of decomposition methodology is to break down a complex Data Mining task into several smaller, less complex, and more manageable sub-tasks that are solvable with existing tools, and then to join their solutions together in order to solve the original problem. In this chapter we provide an overview of decomposition methods in classification tasks, with emphasis on elementary decomposition methods. We present the main properties that characterize various decomposition frameworks and the advantages of using these frameworks. Finally, we discuss the uniqueness of decomposition methodology as opposed to other closely related fields, such as ensemble methods and distributed data mining.
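The break-down, solve, and join pattern described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the chapter: it assumes one simple form of decomposition (splitting the input features into disjoint subsets), a nearest-centroid learner as the "existing tool" for each sub-task, and majority voting as the joining step. All function names and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def train_centroids(rows, labels, feature_idx):
    """Solve one sub-task: compute a per-class mean over a feature subset."""
    sums = defaultdict(lambda: [0.0] * len(feature_idx))
    counts = Counter(labels)
    for row, label in zip(rows, labels):
        for k, j in enumerate(feature_idx):
            sums[label][k] += row[j]
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroid(centroids, row, feature_idx):
    """Return the class whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(feature_idx, centroids[c]))
    # Sorting the class labels makes tie-breaking deterministic.
    return min(sorted(centroids), key=dist)


def decompose_and_train(rows, labels, feature_subsets):
    """Break the task down: train one small model per feature subset."""
    return [(idx, train_centroids(rows, labels, idx)) for idx in feature_subsets]


def combined_predict(models, row):
    """Join the sub-solutions: majority vote over the sub-models."""
    votes = Counter(predict_centroid(c, row, idx) for idx, c in models)
    return max(sorted(votes), key=votes.get)


# Toy data (illustrative only): three numeric features, two classes.
rows = [[1.0, 1.2, 0.9], [0.8, 1.0, 1.1], [3.0, 3.2, 2.9], [3.1, 2.8, 3.0]]
labels = ["a", "a", "b", "b"]

# Assumed decomposition: each feature forms its own sub-task.
models = decompose_and_train(rows, labels, feature_subsets=[[0], [1], [2]])

# Two of the three sub-models vote "a" for this instance.
prediction = combined_predict(models, [1.1, 0.9, 3.0])
```

Other choices are possible at every step (for example, decomposing by instances or by class, or joining with weighted votes); the sketch only shows the overall shape of the approach.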




Copyright information

© 2005 Springer Science+Business Media, Inc.

About this chapter

Cite this chapter

Maimon, O., Rokach, L. (2005). Decomposition Methodology for Knowledge Discovery and Data Mining. In: Maimon, O., Rokach, L. (eds) Data Mining and Knowledge Discovery Handbook. Springer, Boston, MA. https://doi.org/10.1007/0-387-25465-X_46


  • DOI: https://doi.org/10.1007/0-387-25465-X_46

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-24435-8

  • Online ISBN: 978-0-387-25465-4

  • eBook Packages: Computer Science, Computer Science (R0)
