Beyond Concept Learning

Foundations of Rule Learning

Part of the book series: Cognitive Technologies ((COGTECH))

Abstract

So far, we have mostly assumed a concept learning framework, in which the learner’s task is to learn a rule set describing the target concept from a set of positive and negative examples of this concept. In this chapter, we discuss approaches that extend this framework. We start with multiclass problems, which commonly occur in practice, and discuss the most popular methods for handling them: one-against-all classification and pairwise classification. We also discuss error-correcting output codes as a general framework for reducing multiclass problems to binary classification. As many prediction problems have complex, structured output variables, we also present label ranking and show how a generalization of pairwise classification can address this problem and related problems such as multilabel, hierarchical, and ordered classification. General ranking problems, in particular methods for optimizing the area under the ROC curve, are also addressed in this chapter. Finally, we briefly review rule learning approaches to regression and clustering.
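As a rough illustration of the two decomposition schemes named above, one-against-all trains one binary task per class, while pairwise classification trains one task per unordered pair of classes and aggregates their votes. This is a minimal sketch; the nearest-centroid base learner and all function names are hypothetical stand-ins, not from the book, for any learner that handles binary concept-learning problems.

```python
# Sketch of the two multiclass decomposition schemes: one-against-all
# and pairwise (one-vs-one) classification, with a toy binary learner.

from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+

class CentroidBinary:
    """Toy binary learner: predict +1 if closer to the positive centroid."""
    def fit(self, X, y):                      # y contains +1 / -1 labels
        pos = [x for x, label in zip(X, y) if label == +1]
        neg = [x for x, label in zip(X, y) if label == -1]
        self.pos = tuple(sum(c) / len(pos) for c in zip(*pos))
        self.neg = tuple(sum(c) / len(neg) for c in zip(*neg))
        return self
    def predict(self, x):
        return +1 if dist(x, self.pos) < dist(x, self.neg) else -1

def one_vs_all(X, y, classes):
    """One binary task per class: examples of class c vs. all others."""
    return {c: CentroidBinary().fit(X, [+1 if l == c else -1 for l in y])
            for c in classes}

def one_vs_one(X, y, classes):
    """One binary task per unordered pair of classes (round robin)."""
    models = {}
    for ci, cj in combinations(classes, 2):
        pairs = [(x, +1 if l == ci else -1) for x, l in zip(X, y)
                 if l in (ci, cj)]
        Xp, yp = zip(*pairs)
        models[(ci, cj)] = CentroidBinary().fit(Xp, yp)
    return models

def predict_pairwise(models, x, classes):
    """Each pairwise model votes for one class; the class with most votes wins."""
    votes = {c: 0 for c in classes}
    for (ci, cj), m in models.items():
        votes[ci if m.predict(x) == +1 else cj] += 1
    return max(classes, key=votes.get)
```

On a toy three-class problem, `predict_pairwise` lets each of the three pairwise models cast one vote and predicts the class with the most votes, while `one_vs_all` yields one detector per class.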


Notes

  1.

    Parts of this chapter are based on Fürnkranz (2002b), Park and Fürnkranz (2009) and Fürnkranz and Hüllermeier (2010b).

  2.

    Recall that \(\hat{\pi }_{i} = \frac{\hat{P}_{i}}{\hat{E}}\) is the proportion of covered examples that belong to class \(c_{i}\).
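As a quick worked illustration of this proportion (the numbers are hypothetical, not taken from the book):

```python
# Illustrative numbers only: suppose a rule covers E_hat = 20 examples,
# of which P_hat_i = 12 belong to class c_i.
P_hat_i = 12          # covered examples of class c_i
E_hat = 20            # all covered examples
pi_hat_i = P_hat_i / E_hat
print(pi_hat_i)       # 0.6
```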

  3.

    Here we refer to William Cohen’s original C implementation of the algorithm. At the time of this writing, JRip, the more accessible Weka reimplementation of Ripper, does not support these options.

  4.

    Stacking (Wolpert, 1992) denotes a family of techniques that use the predictions of a set of classifiers as inputs for a meta-level classifier that makes the final prediction.
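A minimal sketch of this idea (all names here are illustrative stand-ins; real stacking would train an arbitrary induction algorithm at the meta level): the base classifiers' predictions on each training example form the meta-level feature vector.

```python
# Sketch of stacking (Wolpert, 1992): the predictions of base classifiers
# become the input features of a meta-level classifier. The meta-learner
# here is a toy majority-vote lookup table over prediction vectors.

from collections import Counter

def train_meta(base_classifiers, X, y):
    """Build a lookup-table meta-classifier over base-prediction vectors."""
    table = {}
    for x, label in zip(X, y):
        key = tuple(clf(x) for clf in base_classifiers)  # meta-level features
        table.setdefault(key, []).append(label)
    # For each observed prediction vector, predict the majority label.
    return {key: Counter(labels).most_common(1)[0][0]
            for key, labels in table.items()}

def predict_stacked(base_classifiers, meta_table, x, default=None):
    key = tuple(clf(x) for clf in base_classifiers)
    return meta_table.get(key, default)
```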

  5.

    Bagging (Breiman, 1996) is a popular ensemble technique that trains a set of classifiers, each on a sample of the training data generated by sampling uniformly with replacement. The predictions of these classifiers are then combined, which often yields better practical performance than a single classifier.
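The procedure can be sketched as follows; the one-nearest-neighbour base learner and all names are illustrative stand-ins, not from the book.

```python
# Sketch of bagging (Breiman, 1996): train each ensemble member on a
# bootstrap sample (uniform sampling with replacement) of the training
# data, then combine predictions by majority vote.

import random
from collections import Counter

def one_nn(sample):
    """Toy base learner: classify by the nearest training example."""
    def predict(x):
        return min(sample, key=lambda ex: abs(ex[0] - x))[1]
    return predict

def bagging(data, n_models=11, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(data) for _ in range(len(data))]  # with replacement
        models.append(one_nn(boot))
    def predict(x):  # majority vote over the ensemble members
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict
```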

  6.

    http://www.rulequest.com/cubist-info.html

References

  • Ali, K. M., & Pazzani, M. J. (1993). HYDRA: A noise-tolerant relational concept learning algorithm. In R. Bajcsy (Ed.), Proceedings of the 13th International Joint Conference on Artificial Intelligence (IJCAI-93), Chambéry, France (pp. 1064–1071). San Mateo, CA: Morgan Kaufmann.

  • Allwein, E. L., Schapire, R. E., & Singer, Y. (2000). Reducing multiclass to binary: A unifying approach for margin classifiers. Journal of Machine Learning Research, 1, 113–141.

  • Bisson, G. (1992). Conceptual clustering in a first order logic representation. In B. Neumann (Ed.), Proceedings of the 10th European Conference on Artificial Intelligence (ECAI-92), Vienna (pp. 458–462). Chichester, UK/New York: Wiley.

  • Blaszczynski, J., Stefanowski, J., & Zajac, M. (2009). Ensembles of abstaining classifiers based on rule sets. In J. Rauch, Z. W. Ras, P. Berka, & T. Elomaa (Eds.), Proceedings of the 18th International Symposium on Foundations of Intelligent Systems (ISMIS-09), Prague, Czech Republic (pp. 382–391). Berlin, Germany: Springer.

  • Blockeel, H., De Raedt, L., & Ramon, J. (1998). Top-down induction of clustering trees. In J. Shavlik (Ed.), Proceedings of the 15th International Conference on Machine Learning, Madison, WI (pp. 55–63). San Francisco: Morgan Kaufmann.

  • Bose, R. C., & Ray Chaudhuri, D. K. (1960). On a class of error correcting binary group codes. Information and Control, 3(1), 68–79.

  • Boström, H. (2007). Maximizing the area under the ROC curve with decision lists and rule sets. In Proceedings of the 7th SIAM International Conference on Data Mining (SDM-07), Minneapolis, MN (pp. 27–34). Philadelphia: SIAM.

  • Bradley, R. A., & Terry, M. E. (1952). The rank analysis of incomplete block designs—I. The method of paired comparisons. Biometrika, 39, 324–345.

  • Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123–140.

  • Breiman, L., Friedman, J. H., Olshen, R., & Stone, C. (1984). Classification and regression trees. Pacific Grove, CA: Wadsworth & Brooks.

  • Cardoso, J. S., & da Costa, J. F. P. (2007). Learning to classify ordinal data: The data replication method. Journal of Machine Learning Research, 8, 1393–1429.

  • Clark, P., & Boswell, R. (1991). Rule induction with CN2: Some recent improvements. In Proceedings of the 5th European Working Session on Learning (EWSL-91), Porto, Portugal (pp. 151–163). Berlin, Germany: Springer.

  • Cohen, W. W., Schapire, R. E., & Singer, Y. (1999). Learning to order things. Journal of Artificial Intelligence Research, 10, 243–270.

  • Cook, D. J., & Holder, L. B. (1994). Substructure discovery using minimum description length and background knowledge. Journal of Artificial Intelligence Research, 1, 231–255.

  • Crammer, K., & Singer, Y. (2002). On the learnability and design of output codes for multiclass problems. Machine Learning, 47(2–3), 201–233.

  • Davis, J., Burnside, E., Castro Dutra, I. d., Page, D., & Santos Costa, V. (2004). Using Bayesian classifiers to combine rules. In Proceedings of the 3rd SIGKDD Workshop on Multi-Relational Data Mining (MRDM-04), Seattle, WA.

  • Dekel, O., Manning, C. D., & Singer, Y. (2004). Log-linear models for label ranking. In S. Thrun, L. K. Saul, & B. Schölkopf (Eds.), Advances in neural information processing systems (NIPS-03) (pp. 497–504). Cambridge, MA: MIT.

  • Dembczyński, K., Kotłowski, W., & Słowiński, R. (2008). Solving regression by learning an ensemble of decision rules. In L. Rutkowski, R. Tadeusiewicz, L. A. Zadeh, & J. M. Zurada (Eds.), Proceedings of the 9th International Conference on Artificial Intelligence and Soft Computing (ICAISC-08), Zakopane, Poland (pp. 533–544). Berlin, Germany/New York: Springer.

  • Dietterich, T. G., & Bakiri, G. (1995). Solving multiclass learning problems via error-correcting output codes. Journal of Artificial Intelligence Research, 2, 263–286.

  • Eineborg, M., & Boström, H. (2001). Classifying uncovered examples by rule stretching. In C. Rouveirol & M. Sebag (Eds.), Proceedings of the 11th International Conference on Inductive Logic Programming (ILP-01), Strasbourg, France (pp. 41–50). Berlin, Germany/New York: Springer.

  • Escalera, S., Pujol, O., & Radeva, P. (2006). Decoding of ternary error correcting output codes. In J. F. M. Trinidad, J. A. Carrasco-Ochoa, & J. Kittler (Eds.), Proceedings of the 11th Iberoamerican Congress in Pattern Recognition (CIARP-06), Cancun, Mexico (pp. 753–763). Berlin, Germany/Heidelberg, Germany/New York: Springer.

  • Fawcett, T. E. (2001). Using rule sets to maximize ROC performance. In Proceedings of the IEEE International Conference on Data Mining (ICDM-01), San Jose, CA (pp. 131–138). Los Alamitos, CA: IEEE.

  • Fawcett, T. E. (2008). PRIE: A system for generating rulelists to maximize ROC performance. Data Mining and Knowledge Discovery, 17(2), 207–224.

  • Fisher, D. H. (1987). Knowledge acquisition via incremental conceptual clustering. Machine Learning, 2(2), 139–172.

  • Fodor, J., & Roubens, M. (1994). Fuzzy preference modelling and multicriteria decision support. Dordrecht, The Netherlands/Boston: Kluwer.

  • Frank, A., & Asuncion, A. (2010). UCI machine learning repository. Irvine, CA: University of California, School of Information and Computer Science.

  • Frank, E., & Hall, M. (2001). A simple approach to ordinal classification. In L. De Raedt & P. Flach (Eds.), Proceedings of the 12th European Conference on Machine Learning (ECML-01), Freiburg, Germany (pp. 145–156). Berlin, Germany/New York: Springer.

  • Frank, E., & Witten, I. H. (1998). Generating accurate rule sets without global optimization. In J. Shavlik (Ed.), Proceedings of the 15th International Conference on Machine Learning (ICML-98), Madison, WI (pp. 144–151). San Francisco: Morgan Kaufmann.

  • Friedman, J. H. (1996). Another approach to polychotomous classification (Tech. rep.). Stanford, CA: Department of Statistics, Stanford University.

  • Friedman, N., Geiger, D., & Goldszmidt, M. (1997). Bayesian network classifiers. Machine Learning, 29, 131–161.

  • Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. Annals of Applied Statistics, 2, 916–954.

  • Fürnkranz, J. (2002b). Round robin classification. Journal of Machine Learning Research, 2, 721–747.

  • Fürnkranz, J. (2003). Round robin ensembles. Intelligent Data Analysis, 7(5), 385–404.

  • Fürnkranz, J., & Flach, P. (2005). ROC ’n’ rule learning – Towards a better understanding of covering algorithms. Machine Learning, 58(1), 39–77.

  • Fürnkranz, J., & Hüllermeier, E. (2003). Pairwise preference learning and ranking. In N. Lavrač, D. Gamberger, H. Blockeel, & L. Todorovski (Eds.), Proceedings of the 14th European Conference on Machine Learning (ECML-03), Cavtat, Croatia (pp. 145–156). Berlin, Germany/New York: Springer.

  • Fürnkranz, J., & Hüllermeier, E. (Eds.). (2010a). Preference learning. Heidelberg, Germany/New York: Springer.

  • Fürnkranz, J., & Hüllermeier, E. (2010b). Preference learning and ranking by pairwise comparison. In J. Fürnkranz & E. Hüllermeier (Eds.), Preference learning (pp. 65–82). Heidelberg, Germany/New York: Springer.

  • Fürnkranz, J., Hüllermeier, E., Loza Mencía, E., & Brinker, K. (2008). Multilabel classification via calibrated label ranking. Machine Learning, 73(2), 133–153.

  • Fürnkranz, J., Hüllermeier, E., & Vanderlooy, S. (2009). Binary decomposition methods for multipartite ranking. In W. L. Buntine, M. Grobelnik, D. Mladenić, & J. Shawe-Taylor (Eds.), Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-09), Bled, Slovenia (Vol. Part I, pp. 359–374). Berlin, Germany: Springer.

  • Fürnkranz, J., & Sima, J. F. (2010). On exploiting hierarchical label structure with pairwise classifiers. SIGKDD Explorations, 12(2), 21–25. Special Issue on Mining Unexpected Results.

  • Gamberger, D., Lavrač, N., & Krstačić, G. (2002). Confirmation rule induction and its applications to coronary heart disease diagnosis and risk group discovery. Journal of Intelligent and Fuzzy Systems, 12(1), 35–48.

  • Ghani, R. (2000). Using error-correcting codes for text classification. In Proceedings of the 17th International Conference on Machine Learning (ICML-00) (pp. 303–310). San Francisco: Morgan Kaufmann.

  • Gönen, M., & Heller, G. (2005). Concordance probability and discriminatory power in proportional hazards regression. Biometrika, 92(4), 965–970.

  • Har-Peled, S., Roth, D., & Zimak, D. (2002). Constraint classification: A new approach to multiclass classification. In N. Cesa-Bianchi, M. Numao, & R. Reischuk (Eds.), Proceedings of the 13th International Conference on Algorithmic Learning Theory (ALT-02), Lübeck, Germany (pp. 365–379). Berlin, Germany/New York: Springer.

  • Hastie, T., & Tibshirani, R. (1998). Classification by pairwise coupling. In M. Jordan, M. Kearns, & S. Solla (Eds.), Advances in neural information processing systems 10 (NIPS-97) (pp. 507–513). Cambridge, MA: MIT.

  • Hocquenghem, A. (1959). Codes correcteurs d’erreurs. Chiffres, 2, 147–156. In French.

  • Holmes, G., Hall, M., & Frank, E. (1999). Generating rule sets from model trees. In N. Y. Foo (Ed.), Proceedings of the 12th Australian Joint Conference on Artificial Intelligence (AI-99), Sydney, Australia (pp. 1–12). Berlin, Germany/New York: Springer.

  • Hsu, C.-W., & Lin, C.-J. (2002). A comparison of methods for multi-class support vector machines. IEEE Transactions on Neural Networks, 13(2), 415–425.

  • Hühn, J., & Hüllermeier, E. (2009a). FR3: A fuzzy rule learner for inducing reliable classifiers. IEEE Transactions on Fuzzy Systems, 17(1), 138–149.

  • Hüllermeier, E., & Fürnkranz, J. (2010). On predictive accuracy and risk minimization in pairwise label ranking. Journal of Computer and System Sciences, 76(1), 49–62.

  • Hüllermeier, E., Fürnkranz, J., Cheng, W., & Brinker, K. (2008). Label ranking by learning pairwise preferences. Artificial Intelligence, 172, 1897–1916.

  • Janssen, F., & Fürnkranz, J. (2011). Heuristic rule-based regression via dynamic reduction to classification. In T. Walsh (Ed.), Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI-11), Barcelona, Spain (pp. 1330–1335). Menlo Park, CA: AAAI.

  • Joachims, T. (2002). Optimizing search engines using clickthrough data. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-02), Edmonton, AB (pp. 133–142). New York: ACM.

  • Joachims, T. (2006). Training linear SVMs in linear time. In T. Eliassi-Rad, L. H. Ungar, M. Craven, & D. Gunopulos (Eds.), Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-06), Philadelphia (pp. 217–226). New York: ACM.

  • Karalič, A., & Bratko, I. (1997). First order regression. Machine Learning, 26(2/3), 147–176. Special Issue on Inductive Logic Programming.

  • Kittler, J., Ghaderi, R., Windeatt, T., & Matas, J. (2003). Face verification via error correcting output codes. Image and Vision Computing, 21(13–14), 1163–1169.

  • Knerr, S., Personnaz, L., & Dreyfus, G. (1990). Single-layer learning revisited: A stepwise procedure for building and training a neural network. In F. Fogelman Soulié & J. Hérault (Eds.), Neurocomputing: Algorithms, architectures and applications (NATO ASI Series, Vol. F68, pp. 41–50). Berlin, Germany/New York: Springer.

  • Knerr, S., Personnaz, L., & Dreyfus, G. (1992). Handwritten digit recognition by neural networks with single-layer training. IEEE Transactions on Neural Networks, 3(6), 962–968.

  • Koller, D., & Sahami, M. (1997). Hierarchically classifying documents using very few words. In Proceedings of the 14th International Conference on Machine Learning (ICML-97), Nashville, TN (pp. 170–178). San Francisco: Morgan Kaufmann.

  • Kong, E. B., & Dietterich, T. G. (1995). Error-correcting output coding corrects bias and variance. In Proceedings of the 12th International Conference on Machine Learning (ICML-95) (pp. 313–321). San Mateo, CA: Morgan Kaufmann.

  • Kramer, S. (1996). Structural regression trees. In Proceedings of the 13th National Conference on Artificial Intelligence (AAAI-96) (pp. 812–819). Menlo Park, CA: AAAI.

  • Kreßel, U. H.-G. (1999). Pairwise classification and support vector machines. In B. Schölkopf, C. Burges, & A. Smola (Eds.), Advances in kernel methods: Support vector learning (pp. 255–268). Cambridge, MA: MIT. Chap. 15.

  • Landwehr, N., Kersting, K., & De Raedt, L. (2007). Integrating Naive Bayes and FOIL. Journal of Machine Learning Research, 8, 481–507.

  • Langford, J., Oliveira, R., & Zadrozny, B. (2006). Predicting conditional quantiles via reduction to classification. In Proceedings of the 22nd Annual Conference on Uncertainty in Artificial Intelligence (UAI-06), Cambridge, MA (pp. 257–264). Arlington, VA: AUAI.

  • Lindgren, T., & Boström, H. (2004). Resolving rule conflicts with double induction. Intelligent Data Analysis, 8(5), 457–468.

  • Loza Mencía, E., Park, S.-H., & Fürnkranz, J. (2009). Efficient voting prediction for pairwise multilabel classification. In Proceedings of the 17th European Symposium on Artificial Neural Networks (ESANN-09), Bruges, Belgium (pp. 117–122). Evere, Belgium: d-side publications.

  • Lu, B.-L., & Ito, M. (1999). Task decomposition and module combination based on class relations: A modular neural network for pattern classification. IEEE Transactions on Neural Networks, 10(5), 1244–1256.

  • MacWilliams, F. J., & Sloane, N. J. A. (1983). The theory of error-correcting codes. Amsterdam, The Netherlands: North-Holland.

  • Melvin, I., Ie, E., Weston, J., Noble, W. S., & Leslie, C. (2007). Multi-class protein classification using adaptive codes. Journal of Machine Learning Research, 8, 1557–1581.

  • Michalski, R. S. (1969). On the quasi-minimal solution of the covering problem. In Proceedings of the 5th International Symposium on Information Processing (FCIP-69), Bled, Yugoslavia (Switching circuits, Vol. A3, pp. 125–128).

  • Michalski, R. S. (1980). Pattern recognition and rule-guided inference. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2, 349–361.

  • Michalski, R. S., & Stepp, R. E. (1983). Learning from observation: Conceptual clustering. In R. Michalski, J. Carbonell, & T. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Palo Alto, CA: Tioga.

  • Mooney, R. J., & Califf, M. E. (1995). Induction of first-order decision lists: Results on learning the past tense of English verbs. Journal of Artificial Intelligence Research, 3, 1–24.

  • Park, S.-H., & Fürnkranz, J. (2007). Efficient pairwise classification. In J. N. Kok, J. Koronacki, R. López de Mántaras, S. Matwin, D. Mladenić, & A. Skowron (Eds.), Proceedings of the 18th European Conference on Machine Learning (ECML-07), Warsaw, Poland (pp. 658–665). Berlin, Germany/New York: Springer.

  • Park, S.-H., & Fürnkranz, J. (2009). Efficient decoding of ternary error-correcting output codes for multiclass classification. In W. L. Buntine, M. Grobelnik, D. Mladenić, & J. Shawe-Taylor (Eds.), Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD-09), Bled, Slovenia (Vol. Part II, pp. 189–204). Berlin, Germany: Springer.

  • Pazzani, M., Merz, C. J., Murphy, P., Ali, K., Hume, T., & Brunk, C. (1994). Reducing misclassification costs. In W. W. Cohen & H. Hirsh (Eds.), Proceedings of the 11th International Conference on Machine Learning (ML-94) (pp. 217–225). New Brunswick, NJ: Morgan Kaufmann.

  • Pelleg, D., & Moore, A. (2001). Mixtures of rectangles: Interpretable soft clustering. In C. E. Brodley & A. P. Danyluk (Eds.), Proceedings of the 18th International Conference on Machine Learning (ICML-01), Williamstown, MA (pp. 401–408). San Francisco: Morgan Kaufmann.

  • Pietraszek, T. (2007). On the use of ROC analysis for the optimization of abstaining classifiers. Machine Learning, 68(2), 137–169.

  • Pimenta, E., Gama, J., & de Leon Ferreira de Carvalho, A. C. P. (2008). The dimension of ECOCs for multiclass classification problems. International Journal on Artificial Intelligence Tools, 17(3), 433–447.

  • Platt, J. C., Cristianini, N., & Shawe-Taylor, J. (2000). Large margin DAGs for multiclass classification. In S. A. Solla, T. K. Leen, & K.-R. Müller (Eds.), Advances in neural information processing systems 12 (NIPS-99) (pp. 547–553). Cambridge, MA/London: MIT.

  • Prati, R. C., & Flach, P. A. (2005). Roccer: An algorithm for rule learning based on ROC analysis. In L. P. Kaelbling & A. Saffiotti (Eds.), Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI-05), Edinburgh, UK (pp. 823–828). Professional Book Center.

  • Price, D., Knerr, S., Personnaz, L., & Dreyfus, G. (1995). Pairwise neural network classifiers with probabilistic outputs. In G. Tesauro, D. Touretzky, & T. Leen (Eds.), Advances in neural information processing systems 7 (NIPS-94) (pp. 1109–1116). Cambridge, MA: MIT.

  • Pujol, O., Radeva, P., & Vitriá, J. (2006). Discriminant ECOC: A heuristic method for application dependent design of error correcting output codes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(6), 1007–1012.

  • Quevedo, J. R., Montañés, E., Luaces, O., & del Coz, J. J. (2010). Adapting decision DAGs for multipartite ranking. In J. L. Balcázar, F. Bonchi, A. Gionis, & M. Sebag (Eds.), Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases (ECML/PKDD-10), Barcelona, Spain (Part III, pp. 115–130). Berlin, Germany/Heidelberg, Germany: Springer.

  • Quinlan, J. R. (1987a). Generating production rules from decision trees. In Proceedings of the 10th International Joint Conference on Artificial Intelligence (IJCAI-87) (pp. 304–307). Los Altos, CA: Morgan Kaufmann.

  • Quinlan, J. R. (1992). Learning with continuous classes. In N. Adams & L. Sterling (Eds.), Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, TAS (pp. 343–348). Singapore: World Scientific.

  • Quinlan, J. R. (1993). C4.5: Programs for machine learning. San Mateo, CA: Morgan Kaufmann.

  • Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14, 465–471.

  • Salzberg, S. (1991). A nearest hyperrectangle learning method. Machine Learning, 6, 251–276.

  • Schmidt, M. S., & Gish, H. (1996). Speaker identification via support vector classifiers. In Proceedings of the 21st IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-96), Atlanta, GA (pp. 105–108). Piscataway, NJ: IEEE.

  • Smola, A. J., & Schölkopf, B. (2004). A tutorial on support vector regression. Statistics and Computing, 14, 199–222.

  • Stepp, R. E., & Michalski, R. S. (1986). Conceptual clustering of structured objects: A goal-oriented approach. Artificial Intelligence, 28(1), 43–69.

  • Sulzmann, J.-N., & Fürnkranz, J. (2011). Rule stacking: An approach for compressing an ensemble of rule sets into a single classifier. In T. Elomaa, J. Hollmén, & H. Mannila (Eds.), Proceedings of the 14th International Conference on Discovery Science (DS-11), Espoo, Finland (pp. 323–334). Berlin, Germany/New York: Springer.

  • Torgo, L. (1995). Data fitting with rule-based regression. In J. Zizka & P. B. Brazdil (Eds.), Proceedings of the 2nd International Workshop on Artificial Intelligence Techniques (AIT-95). Brno, Czech Republic: Springer.

  • Torgo, L., & Gama, J. (1997). Regression using classification algorithms. Intelligent Data Analysis, 1(4), 275–292.

  • Van Horn, K. S., & Martinez, T. R. (1993). The BBG rule induction algorithm. In Proceedings of the 6th Australian Joint Conference on Artificial Intelligence (AI-93), Melbourne, VIC (pp. 348–355). Singapore: World Scientific.

  • Webb, G. I. (1994). Recent progress in learning decision lists by prepending inferred rules. In Proceedings of the 2nd Singapore International Conference on Intelligent Systems (pp. B280–B285). Singapore: World Scientific.

  • Webb, G. I., & Brkič, N. (1993). Learning decision lists by prepending inferred rules. In Proceedings of the AI’93 Workshop on Machine Learning and Hybrid Systems, Melbourne, VIC (pp. 6–10).

  • Weiss, S. M., & Indurkhya, N. (1995). Rule-based machine learning methods for functional prediction. Journal of Artificial Intelligence Research, 3, 383–403.

  • Windeatt, T., & Ghaderi, R. (2003). Coding and decoding strategies for multi-class learning problems. Information Fusion, 4(1), 11–21.

  • Wolpert, D. H. (1992). Stacked generalization. Neural Networks, 5(2), 241–260.

  • Wu, T.-F., Lin, C.-J., & Weng, R. C. (2004). Probability estimates for multi-class classification by pairwise coupling. Journal of Machine Learning Research, 5, 975–1005.

  • Zenko, B. (2007). Learning predictive clustering rules. Ph.D. thesis, University of Ljubljana, Faculty of Computer and Information Science, Ljubljana, Slovenia.

  • Zenko, B., Džeroski, S., & Struyf, J. (2006). Learning predictive clustering rules. In F. Bonchi & J.-F. Boulicaut (Eds.), Proceedings of the 4th International Workshop on Knowledge Discovery in Inductive Databases (KDID-05), Porto, Portugal (pp. 234–250). Berlin, Germany/New York: Springer.

  • Zimmermann, A., & De Raedt, L. (2009). Cluster-grouping: From subgroup discovery to clustering. Machine Learning, 77(1), 125–159.




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fürnkranz, J., Gamberger, D., Lavrač, N. (2012). Beyond Concept Learning. In: Foundations of Rule Learning. Cognitive Technologies. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75197-7_10


  • DOI: https://doi.org/10.1007/978-3-540-75197-7_10


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75196-0

  • Online ISBN: 978-3-540-75197-7

  • eBook Packages: Computer Science (R0)
