Detecting Value-Added Tax Evasion by Business Entities of Kazakhstan

  • Zhenisbek Assylbekov
  • Igor Melnykov
  • Rustam Bekishev
  • Assel Baltabayeva
  • Dariya Bissengaliyeva
  • Eldar Mamlin
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 56)


This paper presents a statistics-based method for detecting value-added tax evasion by Kazakhstani legal entities. Starting from feature selection, we perform an initial exploratory data analysis using Kohonen self-organizing maps, which allows us to make basic assumptions about the nature of tax-compliant companies. We then select a statistical model and propose an algorithm to estimate its parameters in an unsupervised manner. The statistical approach appears to benefit the task of detecting tax evasion: our model outperforms the scoring model used by the State Revenue Committee of the Republic of Kazakhstan, demonstrating a significantly closer association between scores and audit results.
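
As a rough illustration of the exploratory step, a Kohonen self-organizing map can be sketched in a few lines of Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the synthetic three-column feature matrix, the 5 x 5 grid, the linear decay schedules, and the function names `train_som` and `best_matching_units` are all hypothetical choices made for the example, since the abstract does not specify the features, map size, or software used.

```python
import numpy as np


def train_som(data, rows=5, cols=5, epochs=10, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular Kohonen map (online updates).

    Returns an array of unit weights with shape (rows, cols, n_features).
    """
    rng = np.random.default_rng(seed)
    n, d = data.shape
    weights = rng.normal(size=(rows, cols, d))
    # Grid coordinates of every unit, used by the neighbourhood function.
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                     # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac  # radius shrinks to 0.5
            x = data[i]
            # Best-matching unit: the unit whose weight vector is closest to x.
            dist = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Gaussian neighbourhood on the grid, centred at the BMU.
            grid_dist2 = np.sum((grid - np.array(bmu)) ** 2, axis=-1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def best_matching_units(data, weights):
    """Return, for each row of data, the flat index of its closest map unit."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    return np.argmin(d, axis=1)


# Hypothetical data: 150 "typical" and 50 "atypical" entities, 3 features each.
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(150, 3)),
    rng.normal(loc=4.0, scale=1.0, size=(50, 3)),
])
# Standardize columns so that no single feature dominates the distances.
features = (features - features.mean(axis=0)) / features.std(axis=0)

weights = train_som(features)
units = best_matching_units(features, weights)

# Hit map: how many entities land on each unit of the 5 x 5 grid.
hits = np.bincount(units, minlength=25).reshape(5, 5)
print(hits)
```

Inspecting the hit map (or the distances between neighbouring unit weights) is one common way to see whether the data separate into a dense main group and sparser outlying regions, which is the kind of observation an exploratory analysis of this sort is meant to support.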


Keywords: Self-organizing maps, Cluster analysis, Anomaly detection, Tax evasion detection



We would like to thank Inês Russinho Mouga for the thorough review of [9].


References

  1. Anderberg, M.R.: Cluster Analysis for Applications. Monographs and Textbooks on Probability and Mathematical Statistics (1973)
  2. Chandola, V., Banerjee, A., Kumar, V.: Anomaly detection: a survey. ACM Comput. Surv. (CSUR) 41(3), 15 (2009)
  3. Dolnicar, S.: The use of neural networks in marketing: market segmentation with self organising feature maps. Proc WSOM 97, 4–6 (1997)
  4. González, P.C., Velásquez, J.D.: Characterization and detection of taxpayers with false invoices using data mining techniques. Expert Syst. Appl. 40(5), 1427–1436 (2013)
  5. Gupta, M., Nagadevara, V.: Audit selection strategy for improving tax compliance: application of data mining techniques. In: Foundations of Risk-Based Audits. Proceedings of the eleventh International Conference on e-Governance, Hyderabad, India, December, pp. 28–30 (2007)
  6. Hsu, K.W., Pathak, N., Srivastava, J., Tschida, G., Bjorklund, E.: Data mining based tax audit selection: a case study of a pilot project at the minnesota department of revenue. In: Real World Data Mining Applications, pp. 221–245. Springer (2015)
  7. Iivarinen, J., Kohonen, T., Kangas, J., Kaski, S.: Visualizing the clusters on the self-organizing map (1994)
  8. Kohonen, T.: The self-organizing map. Neurocomputing 21(1), 1–6 (1998)
  9. Lückeheide, S., Velásquez, J.D., Cerda, L.: Segmentación de los contribuyentes que declaran iva aplicando herramientas de clustering. Revista de Ingeniería de Sistemas 21, 87–110 (2007)
  10. Markey, M.K., Lo, J.Y., Tourassi, G.D., Floyd, C.E.: Self-organizing map for cluster analysis of a breast cancer database. Artif. Intell. Med. 27(2), 113–127 (2003)
  11. Melnykov, V., Chen, W.C., Maitra, R.: MixSim: an R package for simulating data to study performance of clustering algorithms. J. Stat. Softw. 51, 1–25 (2012)
  12. Pampalk, E., Rauber, A., Merkl, D.: Using Smoothed Data Histograms for Cluster Visualization in Self-organizing Maps. Springer (2002)
  13. Squire, D.M., et al.: Visualization of Cluster Changes by Comparing Self-organizing Maps. Springer (2005)
  14. Vesanto, J., Alhoniemi, E.: Clustering of the self-organizing map. IEEE Trans. Neural Netw. 11(3), 586–600 (2000)
  15. Viveros, M.S., Nearhos, J.P., Rothman, M.J.: Applying data mining techniques to a health insurance information system. VLDB 286–294 (1996)
  16. Wehrens, R., Buydens, L.M., et al.: Self-and super-organizing maps in R: the kohonen package. J. Stat. Softw. 21(5), 1–19 (2007)
  17. Williams, G.J., Christen, P., et al.: Exploratory multilevel hot spot analysis: Australian taxation office case study. In: Proceedings of the Sixth Australasian conference on Data mining and analytics, vol. 70, pp. 77–84. Australian Computer Society, Inc. (2007)

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  • Zhenisbek Assylbekov (1, email author)
  • Igor Melnykov (1)
  • Rustam Bekishev (1)
  • Assel Baltabayeva (2)
  • Dariya Bissengaliyeva (2)
  • Eldar Mamlin (2)

  1. School of Science and Technology, Nazarbayev University, Astana, Kazakhstan
  2. State Revenue Committee, Ministry of Finance of Kazakhstan, Astana, Kazakhstan
