Abstract
We propose using systematic simulation studies, rather than real-world benchmark datasets alone, to better understand the behaviour, strengths, and weaknesses of machine learning algorithms. Simulated datasets allow much tighter control over, and a clearer understanding of, the nature of the learning problem than empirical benchmark datasets do.
To demonstrate the value of our proposed research methodology, we describe in this paper the results of our studies of the problem of learning multiple classes. We derived the following hypothesis: “Learning classification functions using decision tree learners can be helped by providing additional subclass labels.” For example, when learning a two-class problem such as “car is OK/car needs service”, it can be helpful to provide a finer-grained classification in the training data, such as “car OK”, “faulty brakes”, “faulty engine”, “faulty lights”, etc.
This hypothesis was corroborated using a number of ‘real-world’ multi-class datasets from the UCI Machine Learning Repository. Our empirical studies demonstrate the usefulness of the proposed research methodology of using artificial datasets as an important methodological complement to real-world datasets.
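The subclass idea described above can be sketched with a standard decision tree learner. The synthetic “car diagnosis” data, class names, and cluster layout below are illustrative assumptions, not the paper’s actual experimental setup: one tree is trained directly on the two-class labels, and a second tree is trained on fine-grained subclass labels whose predictions are then mapped back to the two classes.

```python
# Illustrative sketch of training with subclass labels (hypothetical data).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Four hypothetical subclasses, each clustered in its own region of a
# 2-D feature space: "ok" plus three fault types.
centers = {"ok": (0, 0), "brakes": (4, 0), "engine": (0, 4), "lights": (4, 4)}
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in centers.values()])
sub = np.repeat(list(centers), 50)                     # fine-grained subclass labels
coarse = np.where(sub == "ok", "OK", "needs service")  # two-class labels

# Variant 1: learn the two-class problem directly.
direct = DecisionTreeClassifier(random_state=0).fit(X, coarse)

# Variant 2: learn the subclasses, then map predictions back to the two classes.
fine = DecisionTreeClassifier(random_state=0).fit(X, sub)
mapped = np.where(fine.predict(X) == "ok", "OK", "needs service")
```

Comparing the two variants on held-out data is the kind of controlled experiment the abstract argues for: with simulated data, the number, separation, and shape of the subclasses can be varied systematically.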
© 2001 Springer-Verlag Berlin Heidelberg
Hoffmann, A., Kwok, R., Compton, P. (2001). Using Subclasses to Improve Classification Learning. In: De Raedt, L., Flach, P. (eds) Machine Learning: ECML 2001. ECML 2001. Lecture Notes in Computer Science, vol 2167. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44795-4_18
Print ISBN: 978-3-540-42536-6
Online ISBN: 978-3-540-44795-5