Semantic Rule Filtering for Web-Scale Relation Extraction

Moro, Andrea; Li, Hong; Krause, Sebastian; Xu, Feiyu; Navigli, Roberto; Uszkoreit, Hans

doi:10.1007/978-3-642-41335-3_22

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8218)

Included in the following conference series:

International Semantic Web Conference

Abstract

Web-scale relation extraction is a means for building and extending large repositories of formalized knowledge. This type of automated knowledge building requires a decent level of precision, which is hard to achieve with rule sets that are acquired automatically from unlabeled data by means of distant or minimal supervision. This paper shows how the precision of relation extraction can be considerably improved by employing a wide-coverage, general-purpose lexical semantic network, namely BabelNet, for effective semantic rule filtering. We apply Word Sense Disambiguation to the content words of the automatically extracted rules. As a result, a set of relation-specific relevant concepts is obtained, and each of these concepts is then used to represent the structured semantics of the corresponding relation. The resulting relation-specific subgraphs of BabelNet are used as semantic filters for estimating the adequacy of the extracted rules. For the seven semantic relations tested here, the semantic filter consistently yields a higher precision at any relative recall value in the high-recall range.
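
The pipeline outlined in the abstract (disambiguate the content words of each rule, collect relation-specific concepts, build a concept subgraph, filter rules against it) can be illustrated with a minimal sketch. This is not the authors' implementation and it does not use the BabelNet API: the sense inventory `TOY_SENSES`, the graph `TOY_EDGES`, the frequency threshold `min_count`, the one-hop neighbourhood, and the "at least one concept inside the subgraph" acceptance test are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical toy sense inventory standing in for Word Sense Disambiguation
# over BabelNet: each content word maps to a single concept identifier.
TOY_SENSES = {
    "marry": "c:marriage",
    "wed": "c:marriage",
    "wife": "c:spouse",
    "husband": "c:spouse",
    "meet": "c:meeting",
    "visit": "c:visit",
}

# Hypothetical undirected edges between concepts (a tiny stand-in graph).
TOY_EDGES = {
    "c:marriage": {"c:spouse"},
    "c:spouse": {"c:marriage"},
    "c:meeting": {"c:visit"},
    "c:visit": {"c:meeting"},
}


def disambiguate(words):
    """Map content words to concepts; words without a known sense are skipped."""
    return [TOY_SENSES[w] for w in words if w in TOY_SENSES]


def relevant_concepts(rules, min_count=2):
    """Concepts that occur in at least `min_count` rules of one relation."""
    counts = Counter(c for rule in rules for c in set(disambiguate(rule)))
    return {c for c, n in counts.items() if n >= min_count}


def relation_subgraph(core):
    """Core concepts plus their direct neighbours in the toy graph."""
    nodes = set(core)
    for concept in core:
        nodes |= TOY_EDGES.get(concept, set())
    return nodes


def semantic_filter(rules, min_count=2):
    """Keep rules that have at least one concept inside the relation subgraph."""
    allowed = relation_subgraph(relevant_concepts(rules, min_count))
    return [r for r in rules if any(c in allowed for c in disambiguate(r))]


# Each rule is reduced to its content words (an assumption of this sketch).
marriage_rules = [
    ["marry"],
    ["wed", "husband"],
    ["wife"],
    ["marry", "wife"],
    ["meet"],
    ["visit"],
]
kept = semantic_filter(marriage_rules)
```

In this toy setting the concepts for marriage and spouse recur across several candidate rules and form the relation subgraph, so the rules built on "meet" and "visit" fall outside it and are discarded. A real system would replace the dictionary lookup with an actual disambiguation step and would likely score rules rather than accept or reject them outright.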

Author information

Authors and Affiliations

Dipartimento di Informatica, Sapienza Università di Roma, Viale Regina Elena 295, 00161, Roma, Italy
Andrea Moro & Roberto Navigli
Language Technology Lab, DFKI, Alt-Moabit 91c, Berlin, Germany
Hong Li, Sebastian Krause, Feiyu Xu & Hans Uszkoreit

Editor information

Editors and Affiliations

Knowledge Media Institute, The Open University, Milton Keynes, UK
Harith Alani
Massachusetts Institute of Technology, Cambridge, MA, USA
Lalana Kagal
IBM Research, Hawthorne, NY, USA
Achille Fokoue
Free University Amsterdam, The Netherlands
Paul Groth
Technical University Darmstadt, Germany
Chris Biemann
Digital Enterprise Research Institute, National University of Ireland, Galway, Ireland
Josiane Xavier Parreira
VU Amsterdam, The Netherlands
Lora Aroyo
Stanford University, CA, USA
Natasha Noy
IBM Research, Yorktown Heights, NY, USA
Chris Welty
University of California, Santa Barbara, CA, USA
Krzysztof Janowicz

Cite this paper

Moro, A., Li, H., Krause, S., Xu, F., Navigli, R., Uszkoreit, H. (2013). Semantic Rule Filtering for Web-Scale Relation Extraction. In: Alani, H., et al. The Semantic Web – ISWC 2013. ISWC 2013. Lecture Notes in Computer Science, vol 8218. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41335-3_22

DOI: https://doi.org/10.1007/978-3-642-41335-3_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41334-6
Online ISBN: 978-3-642-41335-3
eBook Packages: Computer Science, Computer Science (R0)
