Electre Tri Machine Learning Approach to the Record Linkage

  • Valentina Minnetti
  • Renato De Leone
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

This paper proposes, for the first time in the literature, applying the Electre Tri method to the matching problem in record linkage. Results from the preliminary stage show that the Electre Tri method achieves high accuracy: the procedure correctly identifies more than 99% of the matches and nonmatches.
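As a rough illustration of how an Electre Tri style sorting rule could label candidate record pairs, the following minimal sketch applies the standard pessimistic assignment rule to a vector of field similarities. It is not the authors' procedure: the three categories, the field similarities, the weights, the thresholds, the boundary profiles, and the cutting level are all hypothetical values chosen for the example, and the veto thresholds of the full method are omitted.

```python
from typing import Sequence

# Hypothetical ordered categories, worst to best (an assumption of this sketch).
CATEGORIES = ["nonmatch", "possible match", "match"]


def partial_concordance(g_a: float, g_b: float, q: float, p: float) -> float:
    """Degree to which 'a is at least as good as b' holds on one criterion.

    q is the indifference threshold and p the preference threshold (p >= q).
    """
    diff = g_a - g_b
    if diff >= -q:
        return 1.0
    if diff <= -p:
        return 0.0
    return (p + diff) / (p - q)  # linear interpolation between the thresholds


def concordance(a: Sequence[float], b: Sequence[float],
                weights: Sequence[float],
                q: Sequence[float], p: Sequence[float]) -> float:
    """Weighted global concordance index c(a, b), a value in [0, 1]."""
    total = sum(weights)
    return sum(
        w * partial_concordance(x, y, qj, pj)
        for x, y, w, qj, pj in zip(a, b, weights, q, p)
    ) / total


def assign_pessimistic(a: Sequence[float], profiles: Sequence[Sequence[float]],
                       weights: Sequence[float],
                       q: Sequence[float], p: Sequence[float],
                       lam: float) -> int:
    """Pessimistic rule: scan the profiles from best to worst and stop at the
    first one that a outranks (concordance >= lam). Without veto thresholds
    the credibility index reduces to the concordance index."""
    for h in range(len(profiles) - 1, -1, -1):
        if concordance(a, profiles[h], weights, q, p) >= lam:
            return h + 1
    return 0


# Illustrative parameters only: three field similarities per record pair,
# for example name, address, and birth date, each scaled to [0, 1].
weights = [0.5, 0.3, 0.2]
q = [0.02, 0.02, 0.02]
p = [0.10, 0.10, 0.10]
profiles = [[0.50, 0.50, 0.50],   # boundary between nonmatch and possible match
            [0.85, 0.85, 0.85]]   # boundary between possible match and match
lam = 0.75

pairs = [[0.95, 0.90, 1.00], [0.90, 0.60, 0.70], [0.10, 0.20, 0.00]]
labels = [CATEGORIES[assign_pessimistic(a, profiles, weights, q, p, lam)]
          for a in pairs]
print(labels)
```

In practice the weights, profiles, and cutting level would be estimated from labelled pairs rather than fixed by hand; the keywords suggest the paper does this with linear programming, but that estimation step is not shown here.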

Keywords

Multiple criteria classification, Linked data, Linear programming

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  1. Department of Statistic Science, Faculty of Information Engineering, Informatics and Statistics, Sapienza University of Rome, Rome, Italy
  2. School of Science and Technology, Section of Mathematics, University of Camerino, Camerino, Italy
