Detecting Value-Added Tax Evasion by Business Entities of Kazakhstan
This paper presents a statistics-based method for detecting value-added tax evasion by Kazakhstani legal entities. Starting from feature selection, we perform an initial exploratory data analysis using Kohonen self-organizing maps; this allows us to make basic assumptions about the nature of tax-compliant companies. We then select a statistical model and propose an algorithm to estimate its parameters in an unsupervised manner. The statistical approach appears to benefit the task of detecting tax evasion: our model outperforms the scoring model used by the State Revenue Committee of the Republic of Kazakhstan, demonstrating a significantly closer association between scores and audit results.
Keywords: Self-organizing maps, Cluster analysis, Anomaly detection, Tax evasion detection
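To make the exploratory step concrete, the following is a minimal sketch of a Kohonen self-organizing map trained with online updates. It is not the authors' implementation: the grid size, learning-rate and neighborhood schedules, the function names `train_som` and `best_matching_unit`, and the two-feature synthetic data are all assumptions chosen for illustration, since the abstract does not specify the features or the training setup.

```python
import numpy as np

def train_som(data, rows=5, cols=5, n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small Kohonen map; all hyperparameters are illustrative."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every unit, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                    # linearly decaying step size
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighborhood radius
        x = data[rng.integers(len(data))]          # pick one random sample
        bmu = best_matching_unit(weights, x)
        # Gaussian neighborhood around the best matching unit on the grid.
        grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull every unit toward the sample, weighted by its neighborhood value.
        weights += lr * h[..., None] * (x - weights)
    return weights

def best_matching_unit(weights, x):
    """Return the (row, col) index of the unit closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)

# Hypothetical data: two synthetic groups in a two-feature space.
rng = np.random.default_rng(1)
group_a = rng.normal(loc=0.2, scale=0.05, size=(200, 2))
group_b = rng.normal(loc=0.8, scale=0.05, size=(200, 2))
data = np.vstack([group_a, group_b])

som = train_som(data)
bmu_a = best_matching_unit(som, np.array([0.2, 0.2]))
bmu_b = best_matching_unit(som, np.array([0.8, 0.8]))
```

In a setting like the one described, each row of `data` would be a company's feature vector, and units that attract few or unusual companies could be inspected as candidate anomalies. How the paper actually uses the map to form its assumptions is not stated in the abstract.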
We would like to thank Inês Russinho Mouga for the thorough review of .