Community Enhanced Record Linkage Method for Vehicle Insurance System

  • Christian Lu
  • Guangyan Huang
  • Yong Xiang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11888)


Record linkage is a pivotal data integration stage in vehicle insurance claims analysis systems and serves as a foundation for fraud detection, market promotion and other major business applications. While the traditional method of rule-based classification plus clerical review is still in use in the industry, recent work has advanced to collective record linkage based on link analysis, which places the blocking and classification processes in a global context. To apply this method with a fraud detection objective, we have developed a community enhanced record linkage model tailored to the requirements of vehicle insurance claim systems. The major novelty is the construction of claim communities that link the claims, customers and vehicles involved, together with the application of probabilistic data matching algorithms integrated with spatio-temporal co-occurrence patterns. In addition, the matched results can be used to identify outliers in fraud detection analysis.
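To make the idea of a claim community concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical claim schema (`claim_id`, `customers`, `vehicles`) and treats a community as a connected component of the graph that links each claim to the customers and vehicles it involves. The abstract does not specify how communities are actually built, so the union-find grouping here is an illustrative stand-in.

```python
from collections import defaultdict


def build_claim_communities(claims):
    """Group claims that share a customer or a vehicle, directly or transitively.

    Assumption: each claim is a dict with hypothetical keys 'claim_id',
    'customers' and 'vehicles'. A community is a connected component of the
    claim/customer/vehicle graph, found with a union-find structure.
    """
    parent = {}

    def find(node):
        # Register unseen nodes as their own root, then walk up to the root.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Link every claim to the customers and vehicles it involves.
    for claim in claims:
        claim_node = ("claim", claim["claim_id"])
        find(claim_node)
        for customer in claim["customers"]:
            union(claim_node, ("customer", customer))
        for vehicle in claim["vehicles"]:
            union(claim_node, ("vehicle", vehicle))

    # Collect claim ids by the root of their component.
    grouped = defaultdict(set)
    for claim in claims:
        root = find(("claim", claim["claim_id"]))
        grouped[root].add(claim["claim_id"])
    return sorted((sorted(ids) for ids in grouped.values()), key=lambda c: c[0])


# Illustrative data only: C1 and C2 share a vehicle, C2 and C3 share a customer.
claims = [
    {"claim_id": "C1", "customers": ["alice"], "vehicles": ["V1"]},
    {"claim_id": "C2", "customers": ["bob"], "vehicles": ["V1"]},
    {"claim_id": "C3", "customers": ["bob"], "vehicles": ["V2"]},
    {"claim_id": "C4", "customers": ["carol"], "vehicles": ["V3"]},
]
communities = build_claim_communities(claims)
print(communities)
```

In this toy input, claims C1, C2 and C3 fall into one community and C4 stands alone. In a real system the customer and vehicle identifiers would themselves be outputs of record linkage rather than clean keys, which is where the probabilistic matching and spatio-temporal co-occurrence evidence described above would come in.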


Record linkage, Collective classification, Spatio-temporal co-occurrence, Fraud detection, Vehicle insurance



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Deakin University, Burwood, Australia
