
Attribute Value Matching by Maximizing Benefit

  • Fengfeng Fan
  • Zhanhuai Li
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11158)

Abstract

Attribute value matching (AVM) identifies equivalent values that refer to the same entities. Traditional approaches ignore the weights of the values themselves. In this demonstration, we present AVM-LB (Attribute Value Matching with Limited Budget), which preferentially matches hot equivalent values so that the maximal benefit to data consistency can be achieved within a limited budget. By defining a rank function and greedily matching the hot equivalent values, AVM-LB enables users to interactively explore the achieved benefit to data consistency.
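
The greedy, budget-limited idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define the rank function or the cost model, so everything below is an assumption made for illustration only: the `CandidatePair` structure, the per-pair `cost`, and a rank equal to the combined frequency of the two values divided by that cost are all hypothetical and are not taken from AVM-LB itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidatePair:
    """Two attribute values suspected to be equivalent (hypothetical structure)."""
    left: str
    right: str
    cost: float  # assumed cost of verifying and matching this pair


def rank(pair, frequency):
    """Assumed rank function: occurrences of both values per unit of cost."""
    benefit = frequency[pair.left] + frequency[pair.right]
    return benefit / pair.cost


def greedy_match(pairs, frequency, budget):
    """Select pairs in descending rank order while the budget allows it."""
    chosen = []
    remaining = budget
    for pair in sorted(pairs, key=lambda p: rank(p, frequency), reverse=True):
        if pair.cost <= remaining:
            chosen.append(pair)
            remaining -= pair.cost
    # Return the selected pairs and the amount of budget actually spent.
    return chosen, budget - remaining


# Illustrative data only: how often each value occurs in a dataset.
frequency = {
    "NYC": 120, "New York City": 80,
    "LA": 15, "Los Angeles": 10,
    "SF": 3, "San Francisco": 2,
}
pairs = [
    CandidatePair("SF", "San Francisco", 1.0),
    CandidatePair("NYC", "New York City", 1.0),
    CandidatePair("LA", "Los Angeles", 1.0),
]

chosen, spent = greedy_match(pairs, frequency, budget=2.0)
for pair in chosen:
    print(pair.left, "=", pair.right)
print("budget spent:", spent)
```

With a budget that covers only two pairs, the sketch matches the two most frequent ("hot") pairs first and leaves the rare pair unmatched, which is the behaviour the abstract attributes to preferential matching under a limited budget.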

Keywords

Attribute Value Matching, Entity resolution, Hot data, Data cleaning, Big Data

Notes

Acknowledgments

The work was supported by the Ministry of Science and Technology of China, National Key Research and Development Program (Project Number 2016YFB1000703), and the National Natural Science Foundation of China under No. 61732014, No. 61332006, No. 61472321, No. 61502390 and No. 61672432.


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science, Northwestern Polytechnical University, Xi’an, China
  2. Key Laboratory of Big Data Storage and Management, Ministry of Industry and Information Technology, Northwestern Polytechnical University, Xi’an, China
