Abstract
This paper describes a clustering method for organizing a list of terms into semantic classes. The experiments were carried out on a POS-annotated corpus, the ACL Anthology, which consists of technical articles in the field of Computational Linguistics. The method, based mainly on assumptions from Formal Concept Analysis, builds bi-dimensional clusters of both terms and their lexico-syntactic contexts. Each generated cluster is defined as a semantic class with a set of terms describing the extension of the class and a set of contexts interpreted as the intensional attributes (or properties) valid for all the terms in the extension. The clustering process relies on two restrictive operations: abstraction and specification. The result is a concept lattice that describes a domain-specific ontology of terms.
This work has been supported by the Spanish Government, within the project GaricoTer.
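The pairing of an extension (terms) with an intension (shared contexts) that the abstract describes can be illustrated with the standard derivation operators of Formal Concept Analysis. The sketch below is only an illustration under stated assumptions: the toy term/context table and all function names are hypothetical, the enumeration is a brute-force closure over term subsets, and it does not reproduce the paper's own abstraction and specification operations.

```python
from itertools import combinations

# Hypothetical toy data (not from the paper): each term maps to the
# lexico-syntactic contexts in which it was observed.
TERM_CONTEXTS = {
    "parser": {"train_obj", "evaluate_obj"},
    "tagger": {"train_obj", "evaluate_obj"},
    "corpus": {"annotate_obj", "evaluate_obj"},
    "treebank": {"annotate_obj"},
}


def common_contexts(terms, table):
    """Intension: the contexts shared by every term in `terms`."""
    shared = set().union(*table.values())  # start from all known contexts
    for term in terms:
        shared &= table[term]
    return frozenset(shared)


def terms_with(contexts, table):
    """Extension: the terms that occur in every context in `contexts`."""
    return frozenset(t for t, ctxs in table.items() if contexts <= ctxs)


def formal_concepts(table):
    """Enumerate all (extension, intension) pairs.

    Brute force: close every subset of terms. This is exponential in the
    number of terms, so it is only suitable for toy-sized tables.
    """
    concepts = set()
    terms = sorted(table)
    for size in range(len(terms) + 1):
        for subset in combinations(terms, size):
            intension = common_contexts(subset, table)
            extension = terms_with(intension, table)
            concepts.add((extension, intension))
    return concepts


if __name__ == "__main__":
    # Print the concepts from the largest extension to the smallest.
    for extension, intension in sorted(
        formal_concepts(TERM_CONTEXTS), key=lambda c: -len(c[0])
    ):
        print(sorted(extension), "<->", sorted(intension))
```

Ordering the resulting pairs by inclusion of their extensions gives the concept lattice. On a real corpus the exhaustive enumeration above would be impractical, which is one reason a method would need restrictive operations to limit the clusters it generates.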
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gamallo, P., Lopes, G.P., Agustini, A. (2007). Inducing Classes of Terms from Text. In: Matoušek, V., Mautner, P. (eds) Text, Speech and Dialogue. TSD 2007. Lecture Notes in Computer Science, vol 4629. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74628-7_7
DOI: https://doi.org/10.1007/978-3-540-74628-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74627-0
Online ISBN: 978-3-540-74628-7
eBook Packages: Computer Science, Computer Science (R0)