Abstract
Information systems must now be more adaptable and flexible than before to cope with the rapidly growing quantity of available data and with changing information needs. Text Classification (TC) is a task that helps to solve problems in many fields. This paper investigates the application of descriptive approaches for modelling classification. The main objectives are to increase abstraction and flexibility so that expert users can customise strategies for their specific needs.
The contribution of this paper is two-fold. Firstly, it illustrates that classifiers can be modelled in a descriptive approach and that the resulting definitions stay close to their mathematical formulations. Moreover, the automatic translation from Probabilistic Datalog (PDatalog) to mathematical formulation is discussed. Secondly, quality and efficiency results demonstrate the feasibility of the approach for real-scale collections.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martinez-Alvarez, M., Roelleke, T. (2011). A Descriptive Approach to Classification. In: Amati, G., Crestani, F. (eds) Advances in Information Retrieval Theory. ICTIR 2011. Lecture Notes in Computer Science, vol 6931. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23318-0_27
DOI: https://doi.org/10.1007/978-3-642-23318-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23317-3
Online ISBN: 978-3-642-23318-0
eBook Packages: Computer Science (R0)