Abstract
An algorithm is presented that bootstraps the acquisition of large dictionaries of entity types (names) and pattern types from a few seeds and a large unannotated corpus. The algorithm iteratively builds a bigraph of entities and collocated patterns by querying the text. Several classes compete simultaneously to label the entity types. Several experiments were carried out to acquire resources from a 1 GB corpus of Spanish news. The usefulness of the acquired list of entity types for the task of Name Classification was also evaluated, with good results for a weakly supervised method.
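The bootstrapping loop summarised in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function `bootstrap`, the majority-vote scoring, the tie rule, and the sample pairs are all assumptions made for illustration, and the (entity, pattern) pairs are taken as already extracted rather than obtained by querying a corpus.

```python
from collections import Counter, defaultdict


def bootstrap(pairs, seeds, rounds=3):
    """Toy label propagation over (entity, pattern) co-occurrence pairs.

    Hypothetical helper: the scoring rule is a plain majority vote,
    not the scoring used in the paper.
    """
    # Store the bipartite graph as two adjacency maps.
    patterns_of = defaultdict(set)
    entities_of = defaultdict(set)
    for entity, pattern in pairs:
        patterns_of[entity].add(pattern)
        entities_of[pattern].add(entity)

    entity_label = {e: c for c, names in seeds.items() for e in names}
    pattern_label = {}

    def winner(votes):
        # Classes compete: return the strict top class, or None on a tie
        # or when there are no votes at all.
        ranked = votes.most_common(2)
        if not ranked:
            return None
        if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
            return None
        return ranked[0][0]

    for _ in range(rounds):
        # Step 1: label patterns from the entities labelled so far.
        for pattern, entities in entities_of.items():
            votes = Counter(entity_label[e] for e in entities if e in entity_label)
            label = winner(votes)
            if label is not None:
                pattern_label[pattern] = label
        # Step 2: label unlabelled entities from the labelled patterns.
        for entity, patterns in patterns_of.items():
            if entity in entity_label:
                continue
            votes = Counter(pattern_label[p] for p in patterns if p in pattern_label)
            label = winner(votes)
            if label is not None:
                entity_label[entity] = label
    return entity_label, pattern_label


# Invented sample data: "X" marks the slot where the name occurred.
pairs = [
    ("Madrid", "alcalde de X"),
    ("Sevilla", "alcalde de X"),
    ("Sevilla", "viajó a X"),
    ("Bilbao", "viajó a X"),
    ("Aznar", "dijo X"),
    ("Zapatero", "dijo X"),
    ("Zapatero", "presidente X"),
    ("Rajoy", "presidente X"),
    ("Córdoba", "alcalde de X"),
    ("Córdoba", "dijo X"),
]
seeds = {"LOCATION": {"Madrid"}, "PERSON": {"Aznar"}}

labels, pattern_labels = bootstrap(pairs, seeds)
```

In this sample, labels spread from the two seeds to names that share no pattern with them directly ("Bilbao", "Rajoy"), while "Córdoba" receives one vote from each class and is left unlabelled. Leaving ties undecided is one simple way to model competition between classes; a real system would more likely use weighted scores and confidence thresholds.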
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Pablo-Sánchez, C., Martínez, P. (2009). Building a Graph of Names and Contextual Patterns for Named Entity Classification. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_47
DOI: https://doi.org/10.1007/978-3-642-00958-7_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00957-0
Online ISBN: 978-3-642-00958-7
eBook Packages: Computer Science, Computer Science (R0)