Skip to main content

On the Discovery of Continuous Truth: A Semi-supervised Approach with Partial Ground Truths

  • Conference paper
  • First Online:
Book cover Web Information Systems Engineering – WISE 2018 (WISE 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11233))

Included in the following conference series:

Abstract

In many applications, the information regarding to the same object can be collected from multiple sources. However, these multi-source data are not reported consistently. In the light of this challenge, truth discovery is emerged to identify truth for each object from multi-source data. Most existing truth discovery methods assume that ground truths are completely unknown, and they focus on the exploration of unsupervised approaches to jointly estimate object truths and source reliabilities. However, in many real world applications, a set of ground truths could be partially available. In this paper, we propose a semi-supervised truth discovery framework to estimate continuous object truths. With the help of ground truths, even a small amount, the accuracy of truth discovery can be improved. We formulate the semi-supervised truth discovery problem as an optimization task where object truths and source reliabilities are modeled as variables. The ground truths are modeled as a regularization term and its contribution to the source weight estimation can be controlled by a parameter. The experiments show that the proposed method is more accurate and efficient than the existing truth discovery methods.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

  1. 1.

    http://lunadong.com/fusionDataSets.htm.

References

  1. Bertsekas, D.P.: Nonlinear Programming. Athena Scientific, Belmont (1999)

    MATH  Google Scholar 

  2. Cho, J.H., Swami, A., Chen, R.: A survey on trust management for mobile ad hoc networks. IEEE Commun. Surv. Tutor. 13(4), 562–583 (2011)

    Article  Google Scholar 

  3. Dong, X.L., Berti-Equille, L., Srivastava, D.: Integrating conflicting data: the role of source dependence. Proc. VLDB Endow. 2(1), 550–561 (2009)

    Article  Google Scholar 

  4. Dong, X.L., Berti-Equille, L., Srivastava, D.: Truth discovery and copying detection in a dynamic world. Proc. VLDB Endow. 2(1), 562–573 (2009)

    Article  Google Scholar 

  5. Fang, X.S.: Truth discovery from conflicting multi-valued objects. In: Proceedings of the 26th International Conference on World Wide Web Companion, pp. 711–715. International World Wide Web Conferences Steering Committee (2017)

    Google Scholar 

  6. Fang, X.S., Sheng, Q.Z., Wang, X., Ngu, A.H.: SmartMTD: a graph-based approach for effective multi-truth discovery. arXiv preprint arXiv:1708.02018 (2017)

  7. Galland, A., Abiteboul, S., Marian, A., Senellart, P.: Corroborating information from disagreeing views. In: Proceedings of the Third ACM International Conference on Web Search and Data Mining, pp. 131–140. ACM (2010)

    Google Scholar 

  8. Lee, Y.W., Pipino, L.L., Funk, J.D., Wang, R.Y.: Journey to Data Quality. The MIT Press, Cambridge (2009)

    Google Scholar 

  9. Li, M., Sun, X., Wang, H., Zhang, Y., Zhang, J.: Privacy-aware access control with trust management in web service. World Wide Web 14(4), 407–430 (2011)

    Article  Google Scholar 

  10. Li, Q., Li, Y., Gao, J., Zhao, B., Fan, W., Han, J.: Resolving conflicts in heterogeneous data by truth discovery and source reliability estimation. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 1187–1198. ACM (2014)

    Google Scholar 

  11. Li, Y., et al.: A survey on truth discovery. ACM SIGKDD Explor. Newsl. 17(2), 1–16 (2016)

    Article  MathSciNet  Google Scholar 

  12. Li, Y., et al.: On the discovery of evolving truth. In: Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 675–684. ACM (2015)

    Google Scholar 

  13. Meng, C., et al.: Truth discovery on crowd sensing of correlated entities. In: Proceedings of the 13th ACM Conference on Embedded Networked Sensor Systems, pp. 169–182. ACM (2015)

    Google Scholar 

  14. Pasternack, J., Roth, D.: Knowing what to believe (when you already know something). In: Proceedings of the 23rd International Conference on Computational Linguistics, pp. 877–885. Association for Computational Linguistics (2010)

    Google Scholar 

  15. Pasternack, J., Roth, D.: Latent credibility analysis. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 1009–1020. ACM (2013)

    Google Scholar 

  16. Pochampally, R., Das Sarma, A., Dong, X.L., Meliou, A., Srivastava, D.: Fusing data with correlations. In: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data, pp. 433–444. ACM (2014)

    Google Scholar 

  17. Qi, G.J., Aggarwal, C.C., Han, J., Huang, T.: Mining collective intelligence in diverse groups. In: Proceedings of the 22nd International Conference on World Wide Web, pp. 1041–1052. ACM (2013)

    Google Scholar 

  18. Yin, X., Han, J., Philip, S.Y.: Truth discovery with multiple conflicting information providers on the web. IEEE Trans. Knowl. Data Eng. 20(6), 796–808 (2008)

    Article  Google Scholar 

  19. Yin, X., Tan, W.: Semi-supervised truth discovery. In: Proceedings of the 20th International Conference on World Wide Web, pp. 217–226. ACM (2011)

    Google Scholar 

  20. Zhang, J., Tao, X., Wang, H.: Outlier detection from large distributed databases. World Wide Web 17(4), 539–568 (2014)

    Article  Google Scholar 

  21. Zhao, B., Han, J.: A probabilistic model for estimating real-valued truth from conflicting sources. In: Proceedings of QDB (2012)

    Google Scholar 

  22. Zhao, B., Rubinstein, B.I., Gemmell, J., Han, J.: A Bayesian approach to discovering truth from conflicting sources for data integration. Proc. VLDB Endow. 5(6), 550–561 (2012)

    Article  Google Scholar 

  23. Zheng, Y., Li, G., Li, Y., Shan, C., Cheng, R.: Truth inference in crowdsourcing: is the problem solved? Proc. VLDB Endow. 10(5), 541–552 (2017)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yi Yang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Yang, Y., Bai, Q., Liu, Q. (2018). On the Discovery of Continuous Truth: A Semi-supervised Approach with Partial Ground Truths. In: Hacid, H., Cellary, W., Wang, H., Paik, HY., Zhou, R. (eds) Web Information Systems Engineering – WISE 2018. WISE 2018. Lecture Notes in Computer Science(), vol 11233. Springer, Cham. https://doi.org/10.1007/978-3-030-02922-7_29

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-02922-7_29

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02921-0

  • Online ISBN: 978-3-030-02922-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics