Entity Matching Technique for Bibliographic Database

  • Conference paper
Database and Expert Systems Applications (DEXA 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8056)

Abstract

Some attributes of a database relation evolve over time, i.e., they change their values at different instants. For example, in a bibliographic database that maintains the publication details of various authors, the affiliation attribute of an author relation may change its value. When a database contains records of this nature and the number of records grows large, it becomes challenging to identify which records belong to which entity, because no proper key is available. In such a situation, the other attributes of the records and the time information associated with them may help determine whether records belong to the same entity or to different ones. In the proposed work, the records are initially clustered on the email-id attribute, and the clusters are then refined using other temporal and non-temporal attributes. The refinement process involves similarity checks against other records and clusters. A comparative analysis with two existing systems, DBLP and ArnetMiner, shows that the proposed technique is able to produce better results in many cases.
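
The two-stage idea in the abstract (cluster on email-id, then refine with temporal and non-temporal attributes) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the similarity measures, weights, or thresholds, so the `Record` fields, the token-level Jaccard similarity, the weights 0.5/0.3/0.2, the two-year affiliation window, the 0.6 merge threshold, and the sample records are all assumptions chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical publication record; field names are assumptions."""
    name: str
    email: Optional[str]          # may be missing in real data
    affiliation: str              # temporal attribute (changes over time)
    year: int                     # time information attached to the record
    coauthors: frozenset = frozenset()


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def initial_clusters(records):
    """Stage 1: group records that share an email-id.

    Records without an email start as singleton clusters.
    """
    by_email = defaultdict(list)
    singletons = []
    for r in records:
        if r.email:
            by_email[r.email.lower()].append(r)
        else:
            singletons.append([r])
    return list(by_email.values()) + singletons


def cluster_similarity(c1, c2, max_gap=2):
    """Assumed score mixing non-temporal and temporal evidence."""
    names1 = {t for r in c1 for t in r.name.lower().split()}
    names2 = {t for r in c2 for t in r.name.lower().split()}
    co1 = {c for r in c1 for c in r.coauthors}
    co2 = {c for r in c2 for c in r.coauthors}
    # Temporal evidence: same affiliation observed within max_gap years.
    temporal = any(
        r1.affiliation == r2.affiliation and abs(r1.year - r2.year) <= max_gap
        for r1 in c1 for r2 in c2
    )
    return (0.5 * jaccard(names1, names2)
            + 0.3 * jaccard(co1, co2)
            + 0.2 * (1.0 if temporal else 0.0))


def refine(clusters, threshold=0.6):
    """Stage 2: repeatedly merge any pair of clusters scoring >= threshold."""
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if cluster_similarity(clusters[i], clusters[j]) >= threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


# Made-up sample data: one author who changed email, plus an unrelated author.
sample = [
    Record("A Kumar", "ak@a.example", "Univ A", 2010, frozenset({"B Rao"})),
    Record("A Kumar", "ak@a.example", "Univ A", 2011, frozenset({"C Das"})),
    Record("A Kumar", "ak@b.example", "Univ A", 2012, frozenset({"B Rao"})),
    Record("J Smith", None, "Univ C", 2012, frozenset({"K Lee"})),
]
refined = refine(initial_clusters(sample))
```

In this sketch the email-based stage keeps the third record apart because its address differs, and the refinement stage is what reunites it with the first two through the shared name, a shared co-author, and the same affiliation in adjacent years. A real system would need a more careful similarity measure and a blocking step, since the pairwise merge loop here is quadratic in the number of clusters.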

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mishra, S., Mondal, S., Saha, S. (2013). Entity Matching Technique for Bibliographic Database. In: Decker, H., Lhotská, L., Link, S., Basl, J., Tjoa, A.M. (eds) Database and Expert Systems Applications. DEXA 2013. Lecture Notes in Computer Science, vol 8056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40173-2_5

  • DOI: https://doi.org/10.1007/978-3-642-40173-2_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40172-5

  • Online ISBN: 978-3-642-40173-2

  • eBook Packages: Computer Science, Computer Science (R0)