A Faster Algorithm for Truth Discovery via Range Cover

Huang, Ziyun; Ding, Hu; Xu, Jinhui

doi:10.1007/s00453-019-00562-z

Published: 25 March 2019

Volume 81, pages 4118–4133 (2019)

Algorithmica

Abstract

Truth discovery is a key problem in data analytics that has received a great deal of attention in recent years. In this problem, we seek to obtain trustworthy information from data aggregated from multiple (possibly) unreliable sources. Most existing approaches to this problem are heuristic in nature and do not provide any quality guarantee. Very recently, the first quality-guaranteed algorithm was discovered. However, the running time of that algorithm depends on the spread ratio of the input points and is fully polynomial only when the spread ratio is relatively small, which could restrict its applicability. To resolve this issue, we propose in this paper a new algorithm which, with constant probability, yields a \((1+\epsilon)\)-approximation in near-quadratic time for any dataset. Our algorithm relies on a data structure called range cover, which is interesting in its own right. The data structure provides a general approach for solving some high-dimensional optimization problems by breaking them down into a small number of parametrized cases.
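
To make the problem statement concrete, the following is a minimal sketch of the kind of alternating heuristic that the abstract contrasts with quality-guaranteed algorithms. It is not the range-cover algorithm of this paper. It assumes a formulation commonly used in the truth-discovery literature but not stated in this abstract: each source i reports a vector p_i, and one seeks a truth vector p* and source weights w_i that minimize the sum of w_i·‖p_i − p*‖² subject to the sum of exp(−w_i) being 1. The function name, parameters, and sample data are hypothetical.

```python
import math


def heuristic_truth_discovery(points, iterations=20, eps=1e-12):
    """Alternate between re-weighting sources and re-estimating the truth.

    Illustrative heuristic only (assumed formulation, no quality guarantee).
    `points` is a list of equal-length lists, one report per source.
    """
    n = len(points)
    dim = len(points[0])
    # Start from the unweighted mean of all source reports.
    truth = [sum(p[k] for p in points) / n for k in range(dim)]
    weights = [1.0] * n
    for _ in range(iterations):
        # Squared distance of each report from the current estimate;
        # eps keeps the logarithm below finite when a distance is zero.
        dists = [
            sum((p[k] - truth[k]) ** 2 for k in range(dim)) + eps
            for p in points
        ]
        total = sum(dists)
        # Log-ratio rule: reports closer to the estimate get larger weights.
        weights = [math.log(total / d) for d in dists]
        wsum = sum(weights)
        if wsum <= 0:
            break  # degenerate input (for example, a single source)
        # The new estimate is the weighted mean of the reports.
        truth = [
            sum(w * p[k] for w, p in zip(weights, points)) / wsum
            for k in range(dim)
        ]
    return truth, weights


if __name__ == "__main__":
    # Three roughly consistent sources and one unreliable outlier.
    reports = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.1], [5.0, 5.0]]
    estimate, source_weights = heuristic_truth_discovery(reports)
    print(estimate, source_weights)
```

On the sample data the outlying source ends up with the smallest weight and the estimate settles near the three consistent reports. Iterations of this kind can stall at a local optimum, which is the gap that approximation algorithms with provable guarantees are meant to close.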

Author information

Authors and Affiliations

Department of Computer Science and Software Engineering, Penn State Erie, The Behrend College, Erie, USA
Ziyun Huang
School of Computer Science and Technology, University of Science and Technology of China, Hefei, China
Hu Ding
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, USA
Jinhui Xu

Corresponding author

Correspondence to Ziyun Huang.

Additional information

The research of the first and third authors was supported in part by NSF through Grants CCF-1422324, IIS-1422591, CNS-1547167, and CCF-1716400. The research of the second author was supported by NSF through Grant CCF-1656905 and a start-up fund from Michigan State University. Part of the research of the first author was conducted when the author was a graduate student at SUNY Buffalo.

About this article

Cite this article

Huang, Z., Ding, H. & Xu, J. A Faster Algorithm for Truth Discovery via Range Cover. Algorithmica 81, 4118–4133 (2019). https://doi.org/10.1007/s00453-019-00562-z

Received: 14 September 2017
Accepted: 28 February 2019
Published: 25 March 2019
Issue Date: 01 October 2019
DOI: https://doi.org/10.1007/s00453-019-00562-z
