Entity Network Extraction Based on Association Finding and Relation Extraction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8092)

Abstract

One of the core aims of semantic search is to directly present users with information instead of lists of documents. Various entity-oriented tasks have been or are being considered, including entity search and related entity finding. In the context of digital libraries for computational humanities, we consider another task, network extraction: given an input entity and a document collection, extract related entities from the collection and present them as a network. We develop a combined approach for entity network extraction that consists of a co-occurrence-based approach to association finding and a machine learning-based approach to relation extraction. We evaluate our approach by comparing the results on a ground truth obtained using a pooling method.
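As a rough illustration of the co-occurrence-based association finding step, the sketch below ranks the entities that share documents with an input entity and returns them as weighted network edges. It is a minimal sketch under stated assumptions, not the method evaluated in the paper: the abstract does not name an association measure, so pointwise mutual information (PMI) over document-level co-occurrence is swapped in as a common choice, entity mentions are assumed to be already extracted, and the function name `rank_associations` and the entity names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def rank_associations(docs, query, top_k=5):
    """Rank entities associated with `query` by document-level PMI.

    docs  : list of sets, each holding the entity names found in one document
            (assumed to come from an upstream entity recognizer).
    query : the input entity whose network we want to build.
    Returns a list of (query, related_entity, pmi_score) edges.
    """
    n_docs = len(docs)
    entity_df = Counter()  # number of documents mentioning each entity
    pair_df = Counter()    # number of documents mentioning both entities

    for entities in docs:
        unique = set(entities)
        entity_df.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_df[(a, b)] += 1

    scores = {}
    for (a, b), joint in pair_df.items():
        if query not in (a, b):
            continue
        other = b if a == query else a
        # PMI = log( P(a, b) / (P(a) * P(b)) ) with document-frequency estimates.
        scores[other] = math.log((joint * n_docs) / (entity_df[a] * entity_df[b]))

    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(query, other, score) for other, score in ranked[:top_k]]


# Toy collection with hypothetical entity names.
docs = [
    {"Entity A", "Entity B"},
    {"Entity A", "Entity B", "Entity C"},
    {"Entity A", "Entity C"},
    {"Entity C", "Entity D"},
]
edges = rank_associations(docs, "Entity A")
for source, target, score in edges:
    print(f"{source} -- {target}: {score:.3f}")
```

In a combined approach such as the one the abstract describes, edges found this way would be one source of candidates, with a separately trained relation extraction classifier supplying another; how the two are merged is not specified in the abstract and is left out of the sketch.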



Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Reinanda, R., Utama, M., Steijlen, F., de Rijke, M. (2013). Entity Network Extraction Based on Association Finding and Relation Extraction. In: Aalberg, T., Papatheodorou, C., Dobreva, M., Tsakonas, G., Farrugia, C.J. (eds) Research and Advanced Technology for Digital Libraries. TPDL 2013. Lecture Notes in Computer Science, vol 8092. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40501-3_16

  • DOI: https://doi.org/10.1007/978-3-642-40501-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40500-6

  • Online ISBN: 978-3-642-40501-3

  • eBook Packages: Computer Science, Computer Science (R0)
