Privacy-Enhanced Fraud Detection with Bloom Filters

  • Conference paper
Security and Privacy in Communication Networks (SecureComm 2018)

Abstract

The online shopping sector is continuously growing, generating a turnover of billions of dollars each year. Unfortunately, this growth in popularity is not limited to regular customers: organized crime targeting online shops has evolved considerably in recent years, causing significant financial losses to merchants. As criminals often use similar strategies across different merchants, sharing information about fraud patterns could help reduce the success of these malicious activities. In practice, however, sharing data is difficult, since shops are often competitors or have to follow strict privacy laws. In this paper, we propose a novel method for fraud detection that allows merchants to exchange information on recent fraud incidents without exposing customer data. To this end, our method pseudonymizes orders on the client side before sending them to a central service for analysis. Although the service cannot access individual features of these orders, it is able to infer fraudulent patterns using machine learning techniques. We examine the capabilities of this approach and measure its impact on the overall detection performance on a dataset of more than 1.5 million orders from a large European online fashion retailer.
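
The preview does not show how the pseudonymization is implemented. As a rough illustration of the general idea only, the following minimal Python sketch maps the string fields of an order to a Bloom filter bit vector on the client. The function names, the use of character bigrams, the HMAC-SHA256 keyed hash, the filter size `m`, the number of hash functions `k`, and the sample orders are all assumptions made for this example, not the authors' parameters or code.

```python
import hashlib
import hmac


def ngrams(text, n=2):
    """Return the set of character n-grams of a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def bloom_encode(order, key, m=256, k=3, n=2):
    """Map an order (dict of field name -> string) to an m-bit Bloom filter.

    Each (field, n-gram) pair sets k bit positions derived from a keyed
    hash, so the receiver sees bit patterns rather than field values.
    All parameters here are illustrative assumptions.
    """
    bits = [0] * m
    for field, value in sorted(order.items()):
        for gram in ngrams(value.lower(), n):
            for i in range(k):
                msg = f"{i}|{field}|{gram}".encode("utf-8")
                digest = hmac.new(key, msg, hashlib.sha256).digest()
                pos = int.from_bytes(digest[:8], "big") % m
                bits[pos] = 1
    return bits


# Hypothetical key shared among merchants but unknown to the central service.
key = b"hypothetical-shared-secret"
order_a = {"name": "Jane Doe", "city": "Berlin"}
order_b = {"name": "Jane Do", "city": "Berlin"}

vec_a = bloom_encode(order_a, key)
vec_b = bloom_encode(order_b, key)

# Similar orders share many set bits, which a classifier can exploit.
shared = sum(a & b for a, b in zip(vec_a, vec_b))
print(sum(vec_a), sum(vec_b), shared)
```

In this sketch, similar orders produce overlapping bit patterns, which is what would let a central service learn from the vectors without seeing the underlying fields. A plain Bloom filter encoding of this kind is a pseudonymization, not encryption, and can be vulnerable to frequency-based cryptanalysis.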

Notes

  1. http://www.github.com/darp/abbo-tools

Acknowledgments

The authors would like to thank Alwin Maier and Paul Schmidt for their assistance during the research project. Moreover, the authors gratefully acknowledge funding from the German Federal Ministry of Education and Research (BMBF) under the project ABBO (FKZ: 13N13634).

Author information

Corresponding author

Correspondence to Daniel Arp.

Copyright information

© 2018 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering

About this paper

Cite this paper

Arp, D., Quiring, E., Krueger, T., Dragiev, S., Rieck, K. (2018). Privacy-Enhanced Fraud Detection with Bloom Filters. In: Beyah, R., Chang, B., Li, Y., Zhu, S. (eds) Security and Privacy in Communication Networks. SecureComm 2018. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 254. Springer, Cham. https://doi.org/10.1007/978-3-030-01701-9_22

  • DOI: https://doi.org/10.1007/978-3-030-01701-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01700-2

  • Online ISBN: 978-3-030-01701-9

  • eBook Packages: Computer Science, Computer Science (R0)