New Approach to the Re-identification Problem Using Neural Networks

  • Conference paper
Modeling Decisions for Artificial Intelligence (MDAI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3885)

Abstract

Schema matching and record matching are tools for integrating files or databases. Record linkage is one such tool: it links records that belong to different files but correspond to the same individual.

Standard record linkage methods apply when the records of both files are described using the same variables. Non-standard methods address, among other cases, files that are not described using the same variables.

In this paper we study record linkage for non-common variables. In particular, we use a supervised approach based on neural networks. We use a neural network to find the relationships between the variables of the two files. Then, we use these relationships to translate the information in the domain of one file into the domain of the other file.
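The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the network size, the training settings, and the names `forward`, `train_mlp`, `mse`, `link`, and `make_pair` are all assumptions made for illustration. A small network is trained on pairs of records known to match, learns to translate the variables of file A into the variables of file B, and each translated record is then linked to its nearest record in file B.

```python
import math
import random

def forward(model, x):
    """Return (hidden activations, outputs) of a one-hidden-layer tanh network."""
    w1, w2 = model
    h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w1]
    out = [sum(w * v for w, v in zip(row, h + [1.0])) for row in w2]
    return h, out

def train_mlp(xs, ys, hidden=6, epochs=150, lr=0.05, seed=0):
    """Fit the network by plain stochastic gradient descent on squared error."""
    rng = random.Random(seed)
    n_in, n_out = len(xs[0]), len(ys[0])
    # The last weight in every row is the bias term.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)] for _ in range(n_out)]
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, out = forward((w1, w2), x)
            err = [o - t for o, t in zip(out, y)]
            # Compute the hidden-layer gradient before the output weights change.
            grad_h = [sum(err[k] * w2[k][j] for k in range(n_out)) * (1.0 - h[j] ** 2)
                      for j in range(hidden)]
            for k in range(n_out):
                for j, v in enumerate(h + [1.0]):
                    w2[k][j] -= lr * err[k] * v
            for j in range(hidden):
                for i, v in enumerate(x + [1.0]):
                    w1[j][i] -= lr * grad_h[j] * v
    return w1, w2

def mse(model, xs, ys):
    """Mean squared error of the translated records against the true ones."""
    total = sum((o - t) ** 2
                for x, y in zip(xs, ys)
                for o, t in zip(forward(model, x)[1], y))
    return total / (len(xs) * len(ys[0]))

def link(model, file_a, file_b):
    """Translate each record of file_a into file_b's domain and pick the nearest record."""
    links = []
    for a in file_a:
        translated = forward(model, a)[1]
        dists = [sum((p - q) ** 2 for p, q in zip(translated, b)) for b in file_b]
        links.append(dists.index(min(dists)))
    return links

# Synthetic data (an assumption): file A stores (u, v) for each individual,
# file B stores two different variables derived from the same individual.
data_rng = random.Random(1)

def make_pair():
    u, v = data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)
    return [u, v], [0.5 * (u + v), 0.5 * (u - v)]

train = [make_pair() for _ in range(60)]
test = [make_pair() for _ in range(20)]
train_a, train_b = [a for a, _ in train], [b for _, b in train]
test_a, test_b = [a for a, _ in test], [b for _, b in test]

model = train_mlp(train_a, train_b)
links = link(model, test_a, test_b)
correct = sum(1 for i, j in enumerate(links) if i == j)
print(f"train MSE: {mse(model, train_a, train_b):.4f}")
print(f"correctly linked: {correct} of {len(test_a)}")
```

In this sketch, record `i` of `test_a` and record `i` of `test_b` describe the same individual, so a link counts as correct when `links[i] == i`. The supervised step needs a set of already matched pairs. How such pairs are obtained, and which distance is used for the final linkage, are left open here.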



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nin, J., Torra, V. (2006). New Approach to the Re-identification Problem Using Neural Networks. In: Torra, V., Narukawa, Y., Valls, A., Domingo-Ferrer, J. (eds) Modeling Decisions for Artificial Intelligence. MDAI 2006. Lecture Notes in Computer Science, vol 3885. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11681960_25

  • DOI: https://doi.org/10.1007/11681960_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32780-6

  • Online ISBN: 978-3-540-32781-3

  • eBook Packages: Computer Science, Computer Science (R0)
