Probabilistic Entity Resolution

Synonyms

Deduplication; Linkage; Matching

Definition

Entity resolution is the task of analyzing a collection of data (e.g., a database or data set) in order to create entities by merging the data instances that describe the same real-world object. Uncertain entity resolution is a group of resolution methodologies that focus on handling the uncertainties that are either present in the data or generated during the resolution process.

Historical Background

The fundamental component of resolution techniques is an instance, which provides some characteristics of a real-world object. An instance is a tuple with k attributes 〈v1, …, vk〉, with each attribute being one characteristic of the corresponding object. Consider now a collection of instances. The goal of resolution is to detect the instances that describe the same real-world object and merge them into entities, e.g., create entity e to represent instances r1, r2, and r3.
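As an illustration of the instance and entity notions above, the following is a minimal Python sketch, not a method taken from this entry. The sample instances, the attribute layout (name, city, year), the match_score function, and the 0.6 threshold are all assumptions made for the example. Instances are tuples of attribute values, and instances whose pairwise score reaches the threshold are merged into the same entity.

```python
from itertools import combinations

# Hypothetical instances: each is a tuple of k = 3 attribute values
# (name, city, year). None marks a missing value.
instances = {
    "r1": ("J. Smith", "Boston", "1980"),
    "r2": ("John Smith", "Boston", "1980"),
    "r3": ("J. Smith", "Boston", None),
    "r4": ("Maria Lopez", "Madrid", "1975"),
}


def match_score(a, b):
    """Illustrative score: fraction of attributes present in both
    instances that carry equal values."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return 0.0
    return sum(x == y for x, y in shared) / len(shared)


def resolve(instances, threshold=0.6):
    """Group instance ids into entities by merging every pair whose
    score reaches the (assumed) threshold, using union-find."""
    parent = {rid: rid for rid in instances}

    def find(rid):
        # Follow parent links to the representative, compressing the path.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in combinations(instances, 2):
        if match_score(instances[a], instances[b]) >= threshold:
            parent[find(a)] = find(b)

    entities = {}
    for rid in instances:
        entities.setdefault(find(rid), set()).add(rid)
    return sorted(sorted(group) for group in entities.values())


print(resolve(instances))
```

With this sample data, r1, r2, and r3 end up in one entity and r4 in another. The score here is only a stand-in: how the uncertainty of a match is measured and handled is precisely what distinguishes one resolution methodology from another.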

The initial resolution approaches focused on handling the...

Corresponding author

Correspondence to Ekaterini Ioannou.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Ioannou, E. (2018). Probabilistic Entity Resolution. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_80805
