Big Data for Fraud Detection

  • Vahid Mojtahed
Part of the Computational Social Sciences book series (CSS)


Fraud is domain-specific, and no single detection technique fits every case. To make this chapter concrete, we draw our examples from one common type of fraud: food fraud. Food fraud can have irreversible effects because it poses risks to human life. The aim of this chapter is thus to present a conceptual and methodological solution for real-time fraud detection that global food producers, regulatory bodies, or retailers can implement in the food sector, and that generalizes to other domains.


Big data; Fraud detection; Anomaly detection; Clustering; Multivariate statistics



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Avanade Ltd., London, UK
  2. Fera Science Ltd., National Agri-food Innovation Campus, Sand Hutton, York, UK
