Scientific Claims Characterization for Claim-Based Analysis in Digital Libraries

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11057)

Abstract

In this paper, we promote the idea of automatic semantic characterization of scientific claims to explore entity-entity relationships in digital collections. Our approach aims to alleviate the time-consuming analysis of query results when the information need is not a single document but an overview of a set of documents. Using this semantic characterization, we propose to find what we call “dominant” claims, relying on two core properties: the consensual support of a claim in light of the collection’s previous knowledge, and the assertiveness of the language the authors use when expressing it. We discuss useful features to efficiently capture these two core properties and formalize the idea of finding “dominant” claims by relying on Pareto dominance. We demonstrate the quality of our method through a practical evaluation on a real-world document collection from the medical domain, which shows the potential of our approach.
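The notion of Pareto dominance mentioned above can be illustrated with a minimal sketch. It assumes each claim has already been assigned two numeric scores, one for consensual support and one for assertiveness, where higher is better. The `Claim` type, the score values, and the function names are hypothetical and are not taken from the paper; the paper's actual features and scoring are not shown here.

```python
from typing import List, NamedTuple


class Claim(NamedTuple):
    # All fields are illustrative assumptions, not the paper's data model.
    text: str
    support: float        # hypothetical consensual-support score (higher is better)
    assertiveness: float  # hypothetical assertiveness score (higher is better)


def dominates(a: Claim, b: Claim) -> bool:
    """Return True if a Pareto-dominates b: a is at least as good as b
    on both scores and strictly better on at least one."""
    at_least_as_good = a.support >= b.support and a.assertiveness >= b.assertiveness
    strictly_better = a.support > b.support or a.assertiveness > b.assertiveness
    return at_least_as_good and strictly_better


def dominant_claims(claims: List[Claim]) -> List[Claim]:
    """Keep the claims that no other claim dominates (the Pareto set, or skyline).
    This is a simple quadratic scan, chosen for readability over speed."""
    return [c for c in claims if not any(dominates(o, c) for o in claims)]


# Made-up scores, purely for illustration.
claims = [
    Claim("A", support=0.9, assertiveness=0.4),
    Claim("B", support=0.6, assertiveness=0.8),
    Claim("C", support=0.5, assertiveness=0.3),
    Claim("D", support=0.9, assertiveness=0.2),
]
skyline = dominant_claims(claims)
print([c.text for c in skyline])
```

In this toy input, claims C and D are each dominated by A, while A and B trade off support against assertiveness, so neither dominates the other and both remain in the result.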

Notes

  1. https://www.crowdflower.com/

Author information

Corresponding author

Correspondence to José María González Pinto.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

González Pinto, J.M., Balke, W.-T. (2018). Scientific Claims Characterization for Claim-Based Analysis in Digital Libraries. In: Méndez, E., Crestani, F., Ribeiro, C., David, G., Lopes, J. (eds) Digital Libraries for Open Knowledge. TPDL 2018. Lecture Notes in Computer Science, vol 11057. Springer, Cham. https://doi.org/10.1007/978-3-030-00066-0_22

  • DOI: https://doi.org/10.1007/978-3-030-00066-0_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00065-3

  • Online ISBN: 978-3-030-00066-0

  • eBook Packages: Computer Science, Computer Science (R0)
