Transductive Learning for Text Classification Using Explicit Knowledge Models

Ifrim, Georgiana; Weikum, Gerhard

doi:10.1007/11871637_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4213)

Included in the following conference series:

European Conference on Principles of Data Mining and Knowledge Discovery

Abstract

We present a generative-model-based approach to transductive learning for text classification. Our approach combines three methodological ingredients: learning from background corpora; latent variable models that decompose the topic-word space into topic-concept and concept-word spaces; and explicit knowledge models (lightweight ontologies and thesauri, e.g. WordNet) whose named concepts populate the latent variables. Used together, these ingredients reinforce one another and can improve classification performance. This paper presents the theoretical model and extensive experimental results on three data collections. Our experiments show improved classification results over state-of-the-art techniques such as the Spectral Graph Transducer and Transductive Support Vector Machines, particularly when training data is sparse.
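
The decomposition of the topic-word space named in the abstract can be illustrated with a small sketch. It assumes the simplest aspect-model form, P(w | t) = sum over c of P(w | c) P(c | t), which may differ from the exact model in the chapter. The matrices, their sizes, and the `classify` helper are hypothetical toy values chosen for illustration, not taken from the paper.

```python
import numpy as np

# Hypothetical toy sizes: 2 topics, 3 concepts, 4 words.
# Each row is a conditional distribution and sums to 1.
p_concept_given_topic = np.array([
    [0.7, 0.2, 0.1],   # P(c | topic 0)
    [0.1, 0.3, 0.6],   # P(c | topic 1)
])
p_word_given_concept = np.array([
    [0.5, 0.3, 0.1, 0.1],   # P(w | concept 0)
    [0.1, 0.4, 0.4, 0.1],   # P(w | concept 1)
    [0.1, 0.1, 0.2, 0.6],   # P(w | concept 2)
])

# Assumed decomposition: P(w | t) = sum_c P(w | c) * P(c | t),
# i.e. the topic-word matrix is the product of the two factors.
p_word_given_topic = p_concept_given_topic @ p_word_given_concept


def classify(word_counts, topic_prior):
    """Return the topic index with the highest log posterior score
    under a multinomial model over word counts (illustrative helper)."""
    log_scores = np.log(topic_prior) + word_counts @ np.log(p_word_given_topic).T
    return int(np.argmax(log_scores))


doc = np.array([3, 1, 0, 0])      # toy word counts for one document
prior = np.array([0.5, 0.5])      # uniform topic prior
predicted_topic = classify(doc, prior)
```

In this sketch the concept layer is fixed by hand. In the setting the abstract describes, the concepts would instead come from an explicit knowledge model such as WordNet, and the two factors would be estimated from labeled and unlabeled data.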

Author information

Authors and Affiliations

Max-Planck Institute for Informatics, Stuhlsatzenhausweg 85, 66123, Saarbrücken, Germany
Georgiana Ifrim & Gerhard Weikum

Editor information

Editors and Affiliations

Knowledge Engineering Group, Technische Universität Darmstadt,
Johannes Fürnkranz
Max Planck Institute for Computer Science, Saarbrücken, Germany
Tobias Scheffer
Faculty of Computer Science, Otto-von-Guericke-University Magdeburg, Germany
Myra Spiliopoulou

Cite this paper

Ifrim, G., Weikum, G. (2006). Transductive Learning for Text Classification Using Explicit Knowledge Models. In: Fürnkranz, J., Scheffer, T., Spiliopoulou, M. (eds) Knowledge Discovery in Databases: PKDD 2006. Lecture Notes in Computer Science, vol 4213. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11871637_24

DOI: https://doi.org/10.1007/11871637_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45374-1
Online ISBN: 978-3-540-46048-0
eBook Packages: Computer Science, Computer Science (R0)
