DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track (ECML PKDD 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12978)

Abstract

Detecting abusive and fraudulent claims is one of the key challenges in online food delivery. The problem is aggravated because, unlike in e-commerce, reverse logistics on food is not practical: a claim cannot be confirmed as legitimate by inspecting the item(s), which makes the already hard problem of harvesting fraud labels even harder. Manually analyzing transactions to generate labels is often expensive and time-consuming. On the other hand, there is typically a wealth of ‘noisy’ information about what constitutes fraud, in the form of customer service interactions, weak and hard rules derived from data analytics, business intuition, and domain understanding.

In this paper, we present a novel end-to-end framework for detecting fraudulent transactions based on large-scale label generation using weak supervision. We directly use Stanford AI Lab’s (SAIL) Snorkel and tree-based methods for manual and automated discovery of labeling functions, which generate weak labels. We follow this with an auto-encoder method that uses reconstruction error to reduce label noise. The final step is a discriminator model, which is an ensemble of an MLP and an LSTM. In addition to cross-sectional and longitudinal features on customer history and transactions, we also harvest customer embeddings from a Graph Convolution Network (GCN) on a customer-customer relationship graph to capture collusive behavior. The final score is thresholded and used in decision making.
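The weak-labeling step can be illustrated with a minimal sketch. It is not the authors' implementation: the paper uses Snorkel, whereas this example substitutes a plain majority vote over labeling functions, and every field name, threshold, and rule below is a hypothetical assumption chosen only for illustration.

```python
from collections import Counter

ABSTAIN, LEGIT, FRAUD = -1, 0, 1

# Hypothetical labeling functions. The field names and thresholds are
# illustrative assumptions, not rules taken from the paper.
def lf_many_recent_claims(txn):
    return FRAUD if txn["claims_last_30d"] >= 5 else ABSTAIN

def lf_long_tenure_no_claims(txn):
    if txn["orders_total"] >= 50 and txn["claims_last_30d"] == 0:
        return LEGIT
    return ABSTAIN

def lf_agent_flagged(txn):
    return FRAUD if txn["agent_flagged"] else ABSTAIN

LABELING_FUNCTIONS = [
    lf_many_recent_claims,
    lf_long_tenure_no_claims,
    lf_agent_flagged,
]

def weak_label(txn, lfs=LABELING_FUNCTIONS):
    """Majority vote over non-abstaining labeling functions.

    Returns ABSTAIN when no function votes or when the vote is tied.
    This is a simple stand-in for a learned label model.
    """
    votes = [v for v in (lf(txn) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]

# Made-up transactions, one per interesting case.
transactions = [
    {"claims_last_30d": 6, "orders_total": 10, "agent_flagged": True},   # two FRAUD votes
    {"claims_last_30d": 0, "orders_total": 80, "agent_flagged": False},  # one LEGIT vote
    {"claims_last_30d": 1, "orders_total": 5, "agent_flagged": False},   # no votes
    {"claims_last_30d": 0, "orders_total": 60, "agent_flagged": True},   # tied vote
]
weak_labels = [weak_label(t) for t in transactions]
print(weak_labels)
```

Transactions that end up with ABSTAIN would simply be left out of the weakly labeled training set in a setup like this. A learned label model, as in Snorkel, weights labeling functions by their estimated accuracy instead of counting votes equally.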

This solution is currently deployed for real-time serving and has yielded an improvement of 16 percentage points in recall at a given precision level. This improvement is measured against a baseline MLP model trained on manually labeled data and is highly significant at our scale. Our approach can easily scale to additional fraud scenarios or to use-cases where ‘strong’ labels are hard to get but weak labels are prevalent.
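For readers unfamiliar with the metric, recall at a given precision level can be computed by sweeping the score threshold and keeping the highest recall whose precision still meets the target. The sketch below is a hedged illustration only: the function name `recall_at_precision`, the 0.75 precision target, and the labels and scores are all assumptions made up for this example, since the abstract does not state the precision level or any data.

```python
def recall_at_precision(labels, scores, min_precision):
    """Return (recall, threshold) with the highest recall whose
    precision is at least min_precision.

    labels: 1 for fraud, 0 for legitimate.
    scores: model scores, higher means more likely fraud.
    A transaction is flagged when its score >= threshold.
    """
    total_pos = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    best_recall, best_threshold = 0.0, None
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # With tied scores, evaluate only after the last tied item,
        # because a threshold cannot separate equal scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        if precision >= min_precision and recall > best_recall:
            best_recall, best_threshold = recall, score
    return best_recall, best_threshold

# Made-up evaluation data for illustration only.
labels = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40, 0.30, 0.10]
recall, threshold = recall_at_precision(labels, scores, min_precision=0.75)
print(recall, threshold)
```

An improvement in percentage points is then the absolute difference between two such recall values computed for two models at the same precision target.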

Author information

Corresponding author

Correspondence to Rutvik Vijjali.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Mathew, J., Negi, M., Vijjali, R., Sathyanarayana, J. (2021). DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_6

  • DOI: https://doi.org/10.1007/978-3-030-86514-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86513-9

  • Online ISBN: 978-3-030-86514-6

  • eBook Packages: Computer Science, Computer Science (R0)
