Abstract
Traditional machine learning methods only consider relationships between feature values within individual data instances while disregarding the dependencies that link features across instances. In this work, we develop a general approach to supervised learning by leveraging higher-order dependencies between features. We introduce a novel Bayesian framework for classification named Higher Order Naive Bayes (HONB). Unlike approaches that assume data instances are independent, HONB leverages co-occurrence relations between feature values across different instances. Additionally, we generalize our framework by developing a novel data-driven space transformation that allows any classifier operating in vector spaces to take advantage of these higher-order co-occurrence relations. Results obtained on several benchmark text corpora demonstrate that higher-order approaches achieve significant improvements in classification accuracy over the baseline (first-order) methods.
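To make the idea of higher-order co-occurrence concrete, here is a minimal sketch. It is not the HONB algorithm or the authors' space transformation. It assumes that a "second-order" relation links two terms that each co-occur with a shared intermediate term in different documents. The function names and the toy corpus are hypothetical.

```python
from itertools import combinations


def first_order_pairs(docs):
    """Map each unordered term pair to the set of document indices where both terms occur."""
    pairs = {}
    for idx, terms in enumerate(docs):
        for a, b in combinations(sorted(set(terms)), 2):
            pairs.setdefault((a, b), set()).add(idx)
    return pairs


def second_order_pairs(docs):
    """Return term pairs (a, c) joined by a path a-b-c whose two links come from different documents.

    Illustrative assumption only: this is one simple reading of
    "higher-order co-occurrence", not the method from the paper.
    """
    # neighbors[term][other] = documents in which term and other co-occur
    neighbors = {}
    for (a, b), doc_ids in first_order_pairs(docs).items():
        neighbors.setdefault(a, {})[b] = doc_ids
        neighbors.setdefault(b, {})[a] = doc_ids

    linked = set()
    for nbrs in neighbors.values():
        for a, c in combinations(sorted(nbrs), 2):
            # The two links must be supported by two distinct documents.
            if any(d1 != d2 for d1 in nbrs[a] for d2 in nbrs[c]):
                linked.add((a, c))
    return linked


# Toy corpus (hypothetical): each document is a list of terms.
docs = [
    ["bayes", "classifier"],
    ["classifier", "text"],
    ["text", "corpus"],
]

first = first_order_pairs(docs)
second = second_order_pairs(docs)
```

In this toy corpus, "bayes" and "text" never appear in the same document, yet they are linked through "classifier" across two documents. A first-order method that treats instances as independent cannot see this kind of cross-instance relation.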
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ganiz, M.C., Lytkin, N.I., Pottenger, W.M. (2009). Leveraging Higher Order Dependencies between Features for Text Classification. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04180-8_42
DOI: https://doi.org/10.1007/978-3-642-04180-8_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04179-2
Online ISBN: 978-3-642-04180-8
eBook Packages: Computer Science (R0)