DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery

Mathew, Jose; Negi, Meghana; Vijjali, Rutvik; Sathyanarayana, Jairaj

doi:10.1007/978-3-030-86514-6_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12978)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Detecting abusive and fraudulent claims is one of the key challenges in online food delivery. The problem is aggravated by the fact that, unlike in e-commerce, reverse logistics on food is not practical. This makes the already hard problem of harvesting fraud labels even harder, because we cannot confirm whether a claim was legitimate by inspecting the item(s). Manually analyzing transactions to generate labels is often expensive and time-consuming. On the other hand, there is typically a wealth of ‘noisy’ information about what constitutes fraud, in the form of customer service interactions, weak and hard rules derived from data analytics, business intuition and domain understanding.

In this paper, we present a novel end-to-end framework for detecting fraudulent transactions based on large-scale label generation using weak supervision. We directly use Stanford AI Lab’s (SAIL) Snorkel and tree-based methods for manual and automated discovery of labeling functions, which generate weak labels. We follow this with an auto-encoder reconstruction-error-based method to reduce label noise. The final step is a discriminator model, which is an ensemble of an MLP and an LSTM. In addition to cross-sectional and longitudinal features around customer history and transactions, we also harvest customer embeddings from a Graph Convolution Network (GCN) on a customer-customer relationship graph, to capture collusive behavior. The final score is thresholded and used in decision making.
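As a rough illustration of the weak-labeling step described in the abstract, the sketch below combines a few hand-written labeling functions by simple majority vote. This is not the paper's implementation: Snorkel fits a generative label model rather than taking a plain vote, and the field names (`claims_last_30d`, `agent_flagged`, `orders_lifetime`) and the rule cut-offs are hypothetical, chosen only for the example.

```python
from collections import Counter

# Label values follow the common weak-supervision convention.
ABSTAIN, LEGIT, FRAUD = -1, 0, 1


def lf_many_recent_claims(txn):
    # Hypothetical rule: many claims in a short window suggests abuse.
    return FRAUD if txn["claims_last_30d"] >= 5 else ABSTAIN


def lf_agent_flagged(txn):
    # Hypothetical rule: a customer service agent flagged the claim.
    return FRAUD if txn["agent_flagged"] else ABSTAIN


def lf_long_clean_history(txn):
    # Hypothetical rule: long order history with no recent claims.
    if txn["orders_lifetime"] >= 50 and txn["claims_last_30d"] == 0:
        return LEGIT
    return ABSTAIN


LABELING_FUNCTIONS = [
    lf_many_recent_claims,
    lf_agent_flagged,
    lf_long_clean_history,
]


def weak_label(txn):
    """Majority vote over non-abstaining labeling functions.

    Returns ABSTAIN when no function fires or when the vote is tied.
    """
    votes = [lf(txn) for lf in LABELING_FUNCTIONS]
    counts = Counter(v for v in votes if v != ABSTAIN)
    if counts[FRAUD] == counts[LEGIT]:
        return ABSTAIN
    return FRAUD if counts[FRAUD] > counts[LEGIT] else LEGIT
```

Transactions on which every function abstains, or on which the vote is tied, stay unlabeled in this sketch; a downstream noise-reduction step such as the one the abstract mentions would then operate only on the labeled subset.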

This solution is currently deployed for real-time serving and has yielded a 16 percentage point improvement in recall at a given precision level. These results are measured against a baseline MLP model based on manually labeled data and are highly significant at our scale. Our approach can easily scale to additional fraud scenarios or to use cases where ‘strong’ labels are hard to get but weak labels are prevalent.
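To make "recall at a given precision level" concrete, here is a minimal sketch of how a score threshold could be chosen: among all candidate thresholds whose precision meets a target, pick the one with the highest recall. The function name, the toy scores and the precision targets are assumptions for illustration and do not come from the paper.

```python
def threshold_for_precision(scores, labels, min_precision):
    """Return (threshold, recall) maximizing recall subject to
    precision >= min_precision, or None if no threshold qualifies.

    A transaction is predicted fraudulent when its score >= threshold.
    labels use 1 for fraud and 0 for legitimate.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return None  # recall is undefined without any positives

    best = None
    # Every distinct score is a candidate threshold.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        precision = tp / (tp + fp)  # tp + fp >= 1 because t is a score
        recall = tp / total_pos
        if precision >= min_precision and (best is None or recall > best[1]):
            best = (t, recall)
    return best


# Toy data: model scores and ground-truth labels (hypothetical).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40]
labels = [1, 1, 0, 1, 0, 0]
result = threshold_for_precision(scores, labels, min_precision=0.75)
```

Comparing two models with this procedure at the same precision target gives the kind of recall difference the abstract reports, although the paper's own evaluation protocol may differ.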

Author information

Authors and Affiliations

Swiggy, Bangalore, India
Jose Mathew, Meghana Negi, Rutvik Vijjali & Jairaj Sathyanarayana

Corresponding author

Correspondence to Rutvik Vijjali.

Editor information

Editors and Affiliations

Facebook AI, Seattle, WA, USA
Yuxiao Dong
Torre Telefonica, Barcelona, Spain
Nicolas Kourtellis
Bielefeld University, CITEC, Bielefeld, Germany
Barbara Hammer
Basque Center for Applied Mathematics, Bilbao, Spain
Jose A. Lozano

About this paper

Cite this paper

Mathew, J., Negi, M., Vijjali, R., Sathyanarayana, J. (2021). DeFraudNet: An End-to-End Weak Supervision Framework to Detect Fraud in Online Food Delivery. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_6

DOI: https://doi.org/10.1007/978-3-030-86514-6_6
Published: 10 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86513-9
Online ISBN: 978-3-030-86514-6
eBook Packages: Computer Science (R0)
