Scientific Claims Characterization for Claim-Based Analysis in Digital Libraries

González Pinto, José María; Balke, Wolf-Tilo

doi:10.1007/978-3-030-00066-0_22

Conference paper
First Online: 05 September 2018

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11057)

Abstract

In this paper, we promote the idea of automatic semantic characterization of scientific claims to explore entity-entity relationships in digital collections. Our approach aims to alleviate the time-consuming analysis of query results when the information need is not a single document but an overview of a set of documents. Using this semantic characterization, we propose to find what we call “dominant” claims, relying on two core properties: the consensual support of a claim in the light of the collection’s previous knowledge, and the assertiveness of the language the authors use when expressing it. We discuss features that efficiently capture these two core properties and formalize the idea of finding “dominant” claims by relying on Pareto dominance. We demonstrate the quality and potential of our method through a practical evaluation on a real-world document collection from the medical domain.
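
The notion of Pareto dominance mentioned in the abstract can be illustrated with a minimal sketch. It assumes each claim has already been reduced to two numeric scores, one for consensual support and one for assertiveness, where higher is better. The `Claim` type, the score values, and the function names are hypothetical and serve only as illustration; the abstract does not specify how the paper computes these features.

```python
from typing import List, NamedTuple


class Claim(NamedTuple):
    text: str
    support: float        # hypothetical consensual-support score, higher is better
    assertiveness: float  # hypothetical assertiveness score, higher is better


def dominates(a: Claim, b: Claim) -> bool:
    """True if a is at least as good as b on both scores and strictly better on one."""
    at_least_as_good = a.support >= b.support and a.assertiveness >= b.assertiveness
    strictly_better = a.support > b.support or a.assertiveness > b.assertiveness
    return at_least_as_good and strictly_better


def dominant_claims(claims: List[Claim]) -> List[Claim]:
    """Return the claims that no other claim dominates (the Pareto front, or skyline)."""
    return [c for c in claims if not any(dominates(other, c) for other in claims)]


# Illustrative values only, not data from the paper.
claims = [
    Claim("A", 0.9, 0.4),
    Claim("B", 0.6, 0.8),
    Claim("C", 0.5, 0.3),
    Claim("D", 0.6, 0.7),
]
front = dominant_claims(claims)
print([c.text for c in front])
```

In this toy input, claim C is dominated by A and claim D is dominated by B, so only A and B remain. The quadratic pairwise comparison is chosen for readability; dedicated skyline algorithms scale better on large collections.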

Author information

Authors and Affiliations

Institut für Informationssysteme, Mühlenpfordstrasse 23, 38106, Brunswick, Germany
José María González Pinto & Wolf-Tilo Balke

Authors

José María González Pinto
Wolf-Tilo Balke

Corresponding author

Correspondence to José María González Pinto.

Editor information

Editors and Affiliations

University Carlos III, Madrid, Spain
Eva Méndez
USI, Università della Svizzera Italiana, Lugano, Switzerland
Fabio Crestani
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
Cristina Ribeiro
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
Gabriel David
INESC TEC, Faculty of Engineering, University of Porto, Porto, Portugal
João Correia Lopes

About this paper

Cite this paper

González Pinto, J.M., Balke, W.-T. (2018). Scientific Claims Characterization for Claim-Based Analysis in Digital Libraries. In: Méndez, E., Crestani, F., Ribeiro, C., David, G., Lopes, J. (eds) Digital Libraries for Open Knowledge. TPDL 2018. Lecture Notes in Computer Science, vol 11057. Springer, Cham. https://doi.org/10.1007/978-3-030-00066-0_22

DOI: https://doi.org/10.1007/978-3-030-00066-0_22
Published: 05 September 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00065-3
Online ISBN: 978-3-030-00066-0
eBook Packages: Computer Science, Computer Science (R0)
