Truthfulness of Candidates in Set of t-uples Expansion

  • Conference paper
Database and Expert Systems Applications (DEXA 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10438)

Abstract

Set of t-uples expansion refers to the task of building a set of t-uples from a corpus based on some examples, or seed t-uples. The task requires a ranking mechanism to select the relevant candidates. We propose to harness state-of-the-art truth finding algorithms for this ranking and to compare their performance. We empirically evaluate and compare the accuracy of these ranking algorithms. We show that truth finding algorithms provide a practical and effective solution.
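
As a rough illustration of how a truth finding algorithm can serve as a ranking mechanism, the sketch below scores candidate t-uples by iterating between source trust and candidate belief, in the spirit of simple sum-based truth finding schemes. It is not the authors' implementation or any of the specific algorithms evaluated in the paper; the function name `rank_candidates`, the `(source, candidate)` input format, and the example pages and t-uples are all assumptions made for illustration.

```python
from collections import defaultdict


def rank_candidates(claims, iterations=20):
    """Rank candidate t-uples by an iteratively estimated belief score.

    `claims` is an iterable of (source, candidate) pairs, read as
    "this source (e.g. a web page) asserts this candidate t-uple".
    Hypothetical helper for illustration only.
    """
    sources_of = defaultdict(set)     # candidate -> sources asserting it
    candidates_of = defaultdict(set)  # source -> candidates it asserts
    for source, candidate in claims:
        sources_of[candidate].add(source)
        candidates_of[source].add(candidate)
    if not sources_of:
        return []

    # Every source starts with the same trust.
    trust = {s: 1.0 for s in candidates_of}
    belief = {}
    for _ in range(iterations):
        # A candidate is believed in proportion to the trust of its sources.
        belief = {c: sum(trust[s] for s in srcs) for c, srcs in sources_of.items()}
        top = max(belief.values())
        belief = {c: b / top for c, b in belief.items()}
        # A source is trusted in proportion to the belief in its candidates.
        trust = {s: sum(belief[c] for c in cands) for s, cands in candidates_of.items()}
        top = max(trust.values())
        trust = {s: t / top for s, t in trust.items()}

    # Highest belief first; these are the candidates kept for expansion.
    return sorted(belief.items(), key=lambda kv: kv[1], reverse=True)


# Made-up example: three pages asserting (country, capital) t-uples.
claims = [
    ("page_a", ("France", "Paris")),
    ("page_b", ("France", "Paris")),
    ("page_c", ("France", "Paris")),
    ("page_a", ("Italy", "Rome")),
    ("page_b", ("Italy", "Rome")),
    ("page_c", ("France", "Lyon")),
]
for candidate, score in rank_candidates(claims):
    print(candidate, round(score, 3))
```

Scores are normalised by the maximum in each round so they stay bounded; a candidate asserted by more, and more trusted, sources ends up higher in the ranking.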

Acknowledgment

This work has been partially funded by the Big Data and Market Insights Chair of Télécom ParisTech and supported by the National University of Singapore under a grant from Singapore Ministry of Education for research project number T1 251RES1607.

Corresponding author

Correspondence to Ngurah Agus Sanjaya Er.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Er, N.A.S., Ba, M.L., Abdessalem, T., Bressan, S. (2017). Truthfulness of Candidates in Set of t-uples Expansion. In: Benslimane, D., Damiani, E., Grosky, W., Hameurlain, A., Sheth, A., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2017. Lecture Notes in Computer Science, vol 10438. Springer, Cham. https://doi.org/10.1007/978-3-319-64468-4_24

  • DOI: https://doi.org/10.1007/978-3-319-64468-4_24

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-64467-7

  • Online ISBN: 978-3-319-64468-4

  • eBook Packages: Computer Science, Computer Science (R0)
