Abstract
So far, we have mostly assumed a concept learning framework, in which the learner’s task is to induce a rule set describing the target concept from a set of positive and negative examples of this concept. In this chapter, we discuss approaches that extend this framework. We start with multiclass problems, which commonly occur in practice, and discuss the most popular methods for handling them: one-against-all classification and pairwise classification. We also discuss error-correcting output codes as a general framework for reducing multiclass problems to binary classification. As many prediction problems have complex, structured output variables, we also present label ranking and show how a generalization of pairwise classification can address this problem and related problems such as multilabel, hierarchical, and ordered classification. General ranking problems, in particular methods for optimizing the area under the ROC curve, are addressed as well. Finally, we briefly review rule learning approaches to regression and clustering.
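To make the two decomposition schemes named above concrete, here is a minimal Python sketch, not the book's algorithms: it reduces a multiclass problem to binary subproblems via one-against-all and pairwise classification, using a toy nearest-centroid scorer as the base binary learner (all function names are hypothetical illustrations).

```python
from itertools import combinations

def centroid(xs):
    # mean of a list of feature vectors
    n = len(xs)
    return [sum(v[i] for v in xs) / n for i in range(len(xs[0]))]

def train_binary(pos, neg):
    # toy base learner: nearest-centroid scoring function;
    # score(x) > 0 means x is closer to the positive centroid
    cp, cn = centroid(pos), centroid(neg)
    def score(x):
        dp = sum((a - b) ** 2 for a, b in zip(x, cp))
        dn = sum((a - b) ** 2 for a, b in zip(x, cn))
        return dn - dp
    return score

def one_vs_all(X, y, classes):
    # one binary problem per class: that class vs. all remaining classes;
    # predict the class whose binary model gives the highest score
    models = {}
    for c in classes:
        pos = [x for x, l in zip(X, y) if l == c]
        neg = [x for x, l in zip(X, y) if l != c]
        models[c] = train_binary(pos, neg)
    return lambda x: max(classes, key=lambda c: models[c](x))

def pairwise(X, y, classes):
    # one binary problem per pair of classes, trained only on the
    # examples of those two classes; predict by majority voting
    models = {}
    for a, b in combinations(classes, 2):
        pos = [x for x, l in zip(X, y) if l == a]
        neg = [x for x, l in zip(X, y) if l == b]
        models[(a, b)] = train_binary(pos, neg)
    def predict(x):
        votes = {c: 0 for c in classes}
        for (a, b), m in models.items():
            votes[a if m(x) > 0 else b] += 1
        return max(classes, key=lambda c: votes[c])
    return predict
```

Note the structural difference: one-against-all trains as many models as there are classes, each on all examples, while the pairwise decomposition trains one model per pair of classes, each on only the examples of those two classes.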
Notes
- 1.
- 2.
Recall that \(\hat{\pi}_{i} = \hat{P}_{i}/\hat{E}\) is the proportion of covered examples of class \(c_{i}\).
- 3.
Here we refer to William Cohen’s original C implementation of the algorithm. At the time of this writing, JRip, the more accessible Weka reimplementation of Ripper, does not support these options.
- 4.
Stacking (Wolpert, 1992) denotes a family of techniques that use the predictions of a set of classifiers as inputs for a meta-level classifier that makes the final prediction.
- 5.
Bagging (Breiman, 1996) is a popular ensemble technique that trains a set of classifiers, each on a sample of the training data generated by sampling uniformly and with replacement. The predictions of these classifiers are then combined, which often yields better practical performance than using a single classifier.
- 6.
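The bagging procedure described in note 5 can be sketched in a few lines of Python. This is a generic illustration, not an implementation from the book; the base learner `train_1nn` is a hypothetical toy 1-nearest-neighbour classifier chosen only to make the sketch self-contained.

```python
import random
from collections import Counter

def train_1nn(X, y):
    # toy base learner (hypothetical): 1-nearest-neighbour classifier
    data = list(zip(X, y))
    def predict(x):
        return min(data, key=lambda p: sum((a - b) ** 2
                                           for a, b in zip(x, p[0])))[1]
    return predict

def bagging(train, X, y, n_models=11, seed=0):
    # train each base classifier on a bootstrap sample: len(X) examples
    # drawn uniformly and with replacement from the training data
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train([X[i] for i in idx], [y[i] for i in idx]))
    def predict(x):
        # combine the ensemble's predictions by majority vote
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict
```

Stacking (note 4) differs only in the combination step: instead of a fixed vote, the base classifiers' predictions become input features for a second-level classifier trained to produce the final prediction.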
References
Ali, K. M., & Pazzani, M. J. (1993). HYDRA: A noise-tolerant relational concept learning algorithm. In R. Bajcsy (Ed.), Proceedings of the 13th Joint International Conference on Artificial Intelligence (IJCAI-93), Chambéry, France (pp. 1064–1071). San Mateo, CA: Morgan Kaufmann.
Allwein, E. L., Schapire, R. E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1, 113–141.
Bisson, G. (1992). Conceptual clustering in a first order logic representation. In B. Neumann (Ed.), Proceedings of the 10th European Conference on Artificial Intelligence (ECAI-92), Vienna (pp. 458–462). Chichester, UK/New York: Wiley.
Blaszczynski, J., Stefanowski, J., & Zajac, M. (2009). Ensembles of abstaining classifiers based on rule sets. In J. Rauch, Z. W. Ras, P. Berka, & T. Elomaa (Eds.), Proceedings of the 18th International Symposium on Foundations of Intelligent Systems (ISMIS-09), Prague, Czech Republic (pp. 382–391). Berlin, Germany: Springer.
Blockeel, H., De Raedt, L., & Ramon, J. (1998). Top-down induction of clustering trees. In J. Shavlik (Ed.), Proceedings of the 15th International Conference on Machine Learning, Madison, WI (pp. 55–63). San Francisco: Morgan Kaufmann.
Bose, R. C., & Ray Chaudhuri, D. K. (1960). On a class of error correcting binary group codes. Information and Control, 3(1), 68–79.
Boström, H. (2007). Maximizing the area under the ROC curve with decision lists and rule sets. In Proceedings of the 7th SIAM International Conference on Data Mining (SDM-07), Minneapolis, MN (pp. 27–34). Philadelphia: SIAM.
Bradley, R. A., & Terry, M. E. (1952). The rank analysis of incomplete block designs—I. The method of paired comparisons. Biometrika, 39, 324–345.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.
Breiman, L., Friedman, J. H., Olshen, R., & Stone, C. (1984). Classification and regression trees. Pacific Grove, CA: Wadsworth & Brooks.
Cardoso, J. S., & da Costa, J. F. P. (2007). Learning to classify ordinal data: The data replication method. Journal of Machine Learning Research, 8, 1393–1429.
Clark, P., & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Proceedings of the 5th European Working Session on Learning (EWSL-91), Porto, Portugal (pp. 151–163). Berlin, Germany: Springer.
Cohen, W. W., Schapire, R. E., & Singer, Y. (1999). Learning to order things. Journal of Artificial Intelligence Research, 10, 243–270.
Cook, D. J., & Holder, L. B. (1994). Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, 1, 231–255.
Crammer, K., & Singer, Y. (2002). On the learnability and design of output codes for multiclass problems. Machine Learning, 47(2–3), 201–233.
Davis, J., Burnside, E., Castro Dutra, I. d., Page, D., & Santos Costa, V. (2004). Using Bayesian classifiers to combine rules. In Proceedings of the 3rd SIGKDD Workshop on Multi-Relational Data Mining (MRDM-04), Seattle, WA.
Dekel, O., Manning, C. D., & Singer, Y. (2004). Log-linear models for label ranking. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems (NIPS-03) (pp. 497–504). Cambridge, MA: MIT.
Dembczyński, K., Kotłowski, W., & Słowiński, R. (2008). Solving regression by learning an ensemble of decision rules. In L. Rutkowski, R. Tadeusiewicz, L. A. Zadeh, & J. M. Zurada (Eds.), Proceedings of the 9th International Conference on Artificial Intelligence and Soft Computing (ICAISC-08), Zakopane, Poland (pp. 533–544). Berlin, Germany/New York: Springer.
Dietterich, T. G., & Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.
Eineborg, M., & Boström, H. (2001). Classifying uncovered examples by rule stretching. In C. Rouveirol & M. Sebag (Eds.), Proceedings of the Eleventh International Conference on Inductive Logic Programming (ILP-01), Strasbourg, France (pp. 41–50). Berlin, Germany/New York: Springer.
Escalera, S., Pujol, O., & Radeva, P. (2006). Decoding of ternary error correcting output codes. In J. F. M. Trinidad, J. A. Carrasco-Ochoa, & J. Kittler (Eds.), Proceedings of the 11th Iberoamerican Congress in Pattern Recognition (CIARP-06), Cancun, Mexico (pp. 753–763). Berlin, Germany/Heidelberg, Germany/New York: Springer.
Fawcett, T. E. (2001). Using rule sets to maximize ROC performance. In Proceedings of the IEEE International Conference on Data Mining (ICDM-01), San Jose, CA (pp. 131–138). Los Alamitos, CA: IEEE.
Fawcett, T. E. (2008). PRIE: A system for generating rulelists to maximize ROC performance. Data Mining and Knowledge Discovery, 17(2), 207–224.
Fisher, D. H. (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2(2), 139–172.
Fodor, J., & Roubens, M. (1994). Fuzzy preference modelling and multicriteria decision support. Dordrecht, The Netherlands/Boston: Kluwer.
Frank, A., & Asuncion, A. (2010). UCI machine learning repository. Irvine, CA: University of California, School of Information and Computer Science.
Frank, E., & Hall, M. (2001). A simple approach to ordinal classification. In L. D. Raedt & P. Flach (Eds.), Proceedings of the 12th European Conference on Machine Learning (ECML-01), Freiburg, Germany (pp. 145–156). Berlin, Germany/New York: Springer.
Frank, E., & Witten, I. H. (1998). Generating accurate rule sets without global optimization. In J. Shavlik (Ed.), Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI (pp. 144–151). San Francisco: Morgan Kaufmann.
Friedman, J. H. (1996). Another approach to polychotomous classification (Tech. rep.). Stanford, CA: Department of Statistics, Stanford University.
Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29, 131–161.
Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. Annals of Applied Statistics, 2, 916–954.
Fürnkranz, J. (2002b). Round robin classification. Journal of Machine Learning Research, 2, 721–747.
Fürnkranz, J. (2003). Round robin ensembles. Intelligent Data Analysis, 7(5), 385–404.
Fürnkranz, J., & Flach, P. (2005). ROC ’n’ rule learning – Towards a better understanding of covering algorithms. Machine Learning, 58(1), 39–77.
Fürnkranz, J., & Hüllermeier, E. (2003). Pairwise preference learning and ranking. In N. Lavrač, D. Gamberger, H. Blockeel, & L. Todorovski (Eds.), Proceedings of the 14th European Conference on Machine Learning (ECML-03), Cavtat, Croatia (pp. 145–156). Berlin, Germany/New York: Springer.
Fürnkranz, J., & Hüllermeier, E. (Eds.). (2010a). Preference learning. Heidelberg, Germany/New York: Springer.
Fürnkranz, J., & Hüllermeier, E. (2010b). Preference learning and ranking by pairwise comparison. In J. Fürnkranz & E. Hüllermeier (Eds.), Preference learning (pp. 65–82). Heidelberg, Germany/New York: Springer.
Fürnkranz, J., Hüllermeier, E., Loza Mencía, E., & Brinker, K. (2008). Multilabel classification via calibrated label ranking. Machine Learning, 73(2), 133–153.
Fürnkranz, J., Hüllermeier, E., & Vanderlooy, S. (2009). Binary decomposition methods for multipartite ranking. In W. L. Buntine, M. Grobelnik, D. Mladenić, & J. Shawe-Taylor (Eds.), Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-09), Bled, Slovenia (Vol. Part I, pp. 359–374). Berlin, Germany: Springer.
Fürnkranz, J., & Sima, J. F. (2010). On exploiting hierarchical label structure with pairwise classifiers. SIGKDD Explorations, 12(2), 21–25. Special Issue on Mining Unexpected Results.
Gamberger, D., Lavrač, N., & Krstačić, G. (2002). Confirmation rule induction and its applications to coronary heart disease diagnosis and risk group discovery. Journal of Intelligent and Fuzzy Systems, 12(1), 35–48.
Ghani, R. (2000). Using error-correcting codes for text classification. In Proceedings of the 17th International Conference on Machine Learning (ICML-00) (pp. 303–310). San Francisco: Morgan Kaufmann Publishers.
Gönen, M., & Heller, G. (2005). Concordance probability and discriminatory power in proportional hazards regression. Biometrika, 92(4), 965–970.
Har-Peled, S., Roth, D., & Zimak, D. (2002). Constraint classification: A new approach to multiclass classification. In N. Cesa-Bianchi, M. Numao, & R. Reischuk (Eds.), Proceedings of the 13th International Conference on Algorithmic Learning Theory (ALT-02), Lübeck, Germany (pp. 365–379). Berlin, Germany/New York: Springer.
Hastie, T., & Tibshirani, R. (1998). Classification by pairwise coupling. In M. Jordan, M. Kearns, & S. Solla (Eds.), Advances in neural information processing systems 10 (NIPS-97) (pp. 507–513). Cambridge, MA: MIT.
Hocquenghem, A. (1959). Codes correcteurs d’erreurs. Chiffres, 2, 147–156. In French.
Holmes, G., Hall, M., & Frank, E. (1999). Generating rule sets from model trees. In N. Y. Foo (Ed.), Proceedings of the 12th Australian Joint Conference on Artificial Intelligence (AI-99), Sydney, Australia (pp. 1–12). Berlin, Germany/New York: Springer.
Hsu, C.-W., & Lin, C.-J. (2002). A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2), 415–425.
Hühn, J., & Hüllermeier, E. (2009a). FR3: A fuzzy rule learner for inducing reliable classifiers. IEEE Transactions on Fuzzy Systems, 17(1), 138–149.
Hüllermeier, E., & Fürnkranz, J. (2010). On predictive accuracy and risk minimization in pairwise label ranking. Journal of Computer and System Sciences, 76(1), 49–62.
Hüllermeier, E., Fürnkranz, J., Cheng, W., & Brinker, K. (2008). Label ranking by learning pairwise preferences. Artificial Intelligence, 172, 1897–1916.
Janssen, F., & Fürnkranz, J. (2011). Heuristic rule-based regression via dynamic reduction to classification. In T. Walsh (Ed.), Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI-11), Barcelona, Spain (pp. 1330–1335). Menlo Park, CA: AAAI.
Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-02), Edmonton, AB (pp. 133–142). New York: ACM.
Joachims, T. (2006). Training linear SVMs in linear time. In T. Eliassi-Rad, L. H. Ungar, M. Craven, & D. Gunopulos (Eds.), Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), Philadelphia (pp. 217–226). New York: ACM.
Karalič, A., & Bratko, I. (1997). First order regression. Machine Learning, 26(2/3), 147–176. Special Issue on Inductive Logic Programming.
Kittler, J., Ghaderi, R., Windeatt, T., & Matas, J. (2003). Face verification via error correcting output codes. Image and Vision Computing, 21(13–14), 1163–1169.
Knerr, S., Personnaz, L., & Dreyfus, G. (1990). Single-layer learning revisited: A stepwise procedure for building and training a neural network. In F. Fogelman Soulié & J. Hérault (Eds.), Neurocomputing: Algorithms, architectures and applications (NATO ASI Series, Vol. F68, pp. 41–50). Berlin, Germany/New York: Springer.
Knerr, S., Personnaz, L., & Dreyfus, G. (1992). Handwritten digit recognition by neural networks with single-layer training. IEEE Transactions on Neural Networks, 3(6), 962–968.
Koller, D., & Sahami, M. (1997). Hierarchically classifying documents using very few words. In Proceedings of the 14th International Conference on Machine Learning (ICML-97), Nashville, TN (pp. 170–178). San Francisco: Morgan Kaufmann Publishers.
Kong, E. B., & Dietterich, T. G. (1995). Error-correcting output coding corrects bias and variance. In Proceedings of the 12th International Conference on Machine Learning (ICML-95) (pp. 313–321). San Mateo, CA: Morgan Kaufmann.
Kramer, S. (1996). Structural regression trees. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96) (pp. 812–819). Menlo Park, CA: AAAI.
Kreßel, U. H.-G. (1999). Pairwise classification and support vector machines. In B. Schölkopf, C. Burges, & A. Smola (Eds.), Advances in Kernel methods: Support vector learning (pp. 255–268). Cambridge, MA: MIT. Chap. 15.
Landwehr, N., Kersting, K., & De Raedt, L. (2007). Integrating Naive Bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.
Langford, J., Oliveira, R., & Zadrozny, B. (2006). Predicting conditional quantiles via reduction to classification. In Proceedings of the 22nd Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Cambridge, MA (pp. 257–264). Arlington, VA: AUAI.
Lindgren, T., & Boström, H. (2004). Resolving rule conflicts with double induction. Intelligent Data Analysis, 8(5), 457–468.
Loza Mencía, E., Park, S.-H., & Fürnkranz, J. (2009). Efficient voting prediction for pairwise multilabel classification. In Proceedings of the 17th European Symposium on Artificial Neural Networks (ESANN-09), Bruges, Belgium (pp. 117–122). Evere, Belgium: d-side publications.
Lu, B.-L., & Ito, M. (1999). Task decomposition and module combination based on class relations: A modular neural network for pattern classification. IEEE Transactions on Neural Networks, 10(5), 1244–1256.
MacWilliams, F. J., & Sloane, N. J. A. (1983). The theory of error-correcting codes (North-Holland Mathematical Library). Amsterdam: North-Holland.
Melvin, I., Ie, E., Weston, J., Noble, W. S., & Leslie, C. (2007). Multi-class protein classification using adaptive codes. Journal of Machine Learning Research, 8, 1557–1581.
Michalski, R. S. (1969). On the quasi-minimal solution of the covering problem. In Proceedings of the 5th International Symposium on Information Processing (FCIP-69), Bled, Yugoslavia (Switching circuits, Vol. A3, pp. 125–128).
Michalski, R. S. (1980). Pattern recognition and rule-guided inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 349–361.
Michalski, R. S., & Stepp, R. E. (1983). Learning from observation: Conceptual clustering. In R. Michalski, J. Carbonell, & T. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Palo Alto, CA: Tioga.
Mooney, R. J., & Califf, M. E. (1995). Induction of first-order decision lists: Results on learning the past tense of English verbs. Journal of Artificial Intelligence Research, 3, 1–24.
Park, S.-H., & Fürnkranz, J. (2007). Efficient pairwise classification. In J. N. Kok, J. Koronacki, R. López de Mántaras, S. Matwin, D. Mladenić, & A. Skowron (Eds.), Proceedings of 18th European Conference on Machine Learning (ECML-07), Warsaw, Poland (pp. 658–665). Berlin, Germany/New York: Springer.
Park, S.-H., & Fürnkranz, J. (2009). Efficient decoding of ternary error-correcting output codes for multiclass classification. In W. L. Buntine, M. Grobelnik, D. Mladenić, & J. Shawe-Taylor (Eds.), Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-09), Bled, Slovenia (Vol. Part II, pp. 189–204). Berlin, Germany: Springer.
Pazzani, M., Merz, C. J., Murphy, P., Ali, K., Hume, T., & Brunk, C. (1994). Reducing misclassification costs. In W. W. Cohen & H. Hirsh (Eds.), Proceedings of the 11th International Conference on Machine Learning (ML-94) (pp. 217–225). New Brunswick, NJ: Morgan Kaufmann.
Pelleg, D., & Moore, A. (2001). Mixtures of rectangles: Interpretable soft clustering. In C. E. Brodley & A. P. Danyluk (Eds.), Proceedings of the 18th International Conference on Machine Learning (ICML-01), Williamstown, MA (pp. 401–408). San Francisco: Morgan Kaufmann.
Pietraszek, T. (2007). On the use of ROC analysis for the optimization of abstaining classifiers. Machine Learning, 68(2), 137–169.
Pimenta, E., Gama, J., & de Leon Ferreira de Carvalho, A. C. P. (2008). The dimension of ECOCs for multiclass classification problems. International Journal on Artificial Intelligence Tools, 17(3), 433–447.
Platt, J. C., Cristianini, N., & Shawe-Taylor, J. (2000). Large margin DAGs for multiclass classification. In S. A. Solla, T. K. Leen, & K.-R. Müller (Eds.), Advances in neural information processing systems 12 (NIPS-99) (pp. 547–553). Cambridge, MA/London: MIT.
Prati, R. C., & Flach, P. A. (2005). Roccer: An algorithm for rule learning based on ROC analysis. In L. P. Kaelbling & A. Saffiotti (Eds.), Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, UK (pp. 823–828). Professional Book Center.
Price, D., Knerr, S., Personnaz, L., & Dreyfus, G. (1995). Pairwise neural network classifiers with probabilistic outputs. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in neural information processing systems 7 (NIPS-94) (pp. 1109–1116). Cambridge, MA: MIT.
Pujol, O., Radeva, P., & Vitriá, J. (2006). Discriminant ECOC: A heuristic method for application dependent design of error correcting output codes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6), 1007–1012.
Quevedo, J. R., Montañés, E., Luaces, O., & del Coz, J. J. (2010). Adapting decision DAGs for multipartite ranking. In J. L. Balcázar, F. Bonchi, A. Gionis, & M. Sebag (Eds.), Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD-10) Barcelona, Spain (Part III, pp. 115–130). Berlin, Germany/Heidelberg, Germany: Springer.
Quinlan, J. R. (1987a). Generating production rules from decision trees. In Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI-87) (pp. 304–307). Los Altos, CA: Morgan Kaufmann.
Quinlan, J. R. (1992). Learning with continuous classes. In N. Adams & L. Sterling (Eds.), Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, TAS (pp. 343–348). Singapore: World Scientific.
Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.
Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14, 465–471.
Salzberg, S. (1991). A nearest hyperrectangle learning method. Machine Learning, 6, 251–276.
Schmidt, M. S., & Gish, H. (1996). Speaker identification via support vector classifiers. In Proceedings of the 21st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-96), Atlanta, GA (pp. 105–108). Piscataway, NJ: IEEE.
Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14, 199–222.
Stepp, R. E., & Michalski, R. S. (1986). Conceptual clustering of structured objects: A goal-oriented approach. Artificial Intelligence, 28(1), 43–69.
Sulzmann, J.-N., & Fürnkranz, J. (2011). Rule stacking: An approach for compressing an ensemble of rule sets into a single classifier. In T. Elomaa, J. Hollmèn, & H. Mannila (Eds.), Proceedings of the 14th International Conference on Discovery Science (DS-11), Espoo, Finland (pp. 323–334). Berlin, Germany/New York: Springer.
Torgo, L. (1995). Data fitting with rule-based regression. In J. Zizka & P. B. Brazdil (Eds.), Proceedings of the 2nd International Workshop on Artificial Intelligence Techniques (AIT-95). Brno, Czech Republic: Springer.
Torgo, L., & Gama, J. (1997). Regression using classification algorithms. Intelligent Data Analysis, 1(4), 275–292.
Van Horn, K. S., & Martinez, T. R. (1993). The BBG rule induction algorithm. In Proceedings of the 6th Australian Joint Conference on Artificial Intelligence (AI-93), Melbourne, VIC (pp. 348–355). Singapore: World Scientific.
Webb, G. I. (1994). Recent progress in learning decision lists by prepending inferred rules. In Proceedings of the 2nd Singapore International Conference on Intelligent Systems (pp. B280–B285). Singapore: World Scientific.
Webb, G. I., & Brkič, N. (1993). Learning decision lists by prepending inferred rules. In Proceedings of the AI’93 Workshop on Machine Learning and Hybrid Systems, Melbourne, VIC (pp. 6–10). Melbourne, Australia.
Weiss, S. M., & Indurkhya, N. (1995). Rule-based machine learning methods for functional prediction. Journal of Artificial Intelligence Research, 3, 383–403.
Windeatt, T., & Ghaderi, R. (2003). Coding and decoding strategies for multi-class learning problems. Information Fusion, 4(1), 11–21.
Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–260.
Wu, T.-F., Lin, C.-J., & Weng, R. C. (2004). Probability estimates for multi-class classification by pairwise coupling. Journal of Machine Learning Research, 5, 975–1005.
Zenko, B. (2007). Learning predictive clustering rules. PhD thesis, University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, Slovenia.
Zenko, B., Džeroski, S., & Struyf, J. (2006). Learning predictive clustering rules. In F. Bonchi & J.-F. Boulicaut (Eds.), Proceedings of the 4th International Workshop on Knowledge Discovery in Inductive Databases (KDID-05), Porto, Portugal (pp. 234–250). Berlin, Germany/New York: Springer.
Zimmermann, A., & De Raedt, L. (2009). Cluster-grouping: From subgroup discovery to clustering. Machine Learning, 77(1), 125–159.
© 2012 Springer-Verlag Berlin Heidelberg
Fürnkranz, J., Gamberger, D., & Lavrač, N. (2012). Beyond concept learning. In Foundations of rule learning (Cognitive Technologies). Berlin/Heidelberg: Springer. https://doi.org/10.1007/978-3-540-75197-7_10
Print ISBN: 978-3-540-75196-0
Online ISBN: 978-3-540-75197-7