On Link Validity in Bibliographic Knowledge Bases

  • Madalina Croitoru
  • Léa Guizol
  • Michel Leclère
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 297)


The Entity Resolution problem has been widely addressed in the literature. In its simplest version, the problem takes as input a knowledge base composed of records describing real-world entities and outputs the sets of records judged to correspond to the same real-world entity. More elaborate versions take into account links among records, which represent relationships between the entities those records describe. However, none of the approaches in the literature questions the validity of certain links between records. In this paper we highlight this new aspect of “link validity” in knowledge bases and show how Entity Resolution approaches should take it into consideration.
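As an illustration of the simplest version of the problem only, the following minimal sketch groups records that a pairwise test judges to describe the same entity. It is not the method of this paper: the record fields (`title`, `year`), the matching rule in `same_entity`, and the function names are all assumptions made for the example, and links between records (and therefore link validity) are not modelled at all.

```python
from itertools import combinations


def normalize(title):
    # Lowercase and keep only letters, digits, and spaces (an illustrative choice).
    kept = "".join(ch for ch in title.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def same_entity(a, b):
    # Hypothetical matching rule: same normalized title and same year.
    return normalize(a["title"]) == normalize(b["title"]) and a["year"] == b["year"]


def resolve(records):
    # Union-find over record indices: matched pairs end up in the same set.
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        if same_entity(records[i], records[j]):
            parent[find(j)] = find(i)

    # Collect the sets of record indices judged to describe one entity.
    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())
```

Given a list of record dictionaries, `resolve` returns lists of record indices, one list per presumed real-world entity. The pairwise comparison is quadratic in the number of records, which is acceptable for a sketch but not for a large knowledge base.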





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Madalina Croitoru (1)
  • Léa Guizol (1)
  • Michel Leclère (1)
  1. LIRMM (University of Montpellier II & CNRS), INRIA, Sophia-Antipolis, France
