
Local and Target Exploration of Conglomerate-Hosted Gold Deposits Using Machine Learning Algorithms: A Case Study of the Witwatersrand Gold Ores, South Africa



Determining gold grade and facies type in areas with little geological information and sparse exploration samples is fraught with uncertainty and often results in high operational costs. Point-wise gold grade data are commonly used to guide exploration and resource estimation through spatial interpolation techniques such as kriging. Where data are scarce, kriging leads to significant grade estimation errors because high nugget thresholds reduce its effectiveness; the gold deposits in the Witwatersrand Basin of South Africa are a good example. To reduce the impact of subjective grade interpolation and geological interpretation, and to exploit currently unused geological descriptions, we present a novel machine learning-based algorithm called GS-Pred. It combines sedimentological and gold assay data for point-wise gold grade prediction and automated facies identification in a conglomerate-hosted gold deposit. For this application, GS-Pred requires an input database of sedimentological descriptions, spatial information and gold grades, and it predicts gold grades at any point within the spatial coverage of the input database, provided that appropriate sedimentological descriptions are available for that point. In essence, GS-Pred examines the spatial and non-spatial variability of metal grades and provides information on the estimated resource below the nugget threshold. The proposed algorithm has been validated on subsets of gold grade and sedimentological data from conglomerates in the Witwatersrand Basin. Validation results suggest that GS-Pred is more accurate than current machine learning techniques and ordinary kriging. The clustering results show that four, or at most five, facies can be distinguished within the dataset, a partition that maximises the contrast in inter-cluster prediction behavior. These clusters show good spatial correspondence with the known geology, and the method, combined with gold grade predictions, was able to identify probable mineralization patterns, thus assisting in target exploration. This novel machine learning algorithm is entirely data driven. We have shown its successful application in a complex geological setting such as the Witwatersrand Basin.
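The abstract does not describe the internals of GS-Pred, so the following is only an illustrative sketch of the general idea it mentions: grouping samples by sedimentological descriptors with k-means and attaching a grade estimate to each group. It is not the authors' algorithm. The two descriptors (pebble size and pyrite content), the sample values, the per-cluster mean as a grade predictor, and all function names are assumptions made for illustration; in practice the descriptors would need scaling before clustering.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct samples
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels

def fit_cluster_grades(samples, k, seed=0):
    """Cluster on descriptors only, then average the assayed grade per cluster."""
    features = [f for f, _ in samples]
    centroids, labels = kmeans(features, k, seed=seed)
    grades = {}
    for (_, grade), lab in zip(samples, labels):
        grades.setdefault(lab, []).append(grade)
    means = {lab: sum(v) / len(v) for lab, v in grades.items()}
    return centroids, means

def predict_grade(feature, centroids, means):
    """Assign a new description to its nearest non-empty cluster."""
    nearest = min(means, key=lambda c: sq_dist(feature, centroids[c]))
    return means[nearest]

# Hypothetical samples: ((pebble size in mm, pyrite in %), gold grade in g/t).
SAMPLES = [
    ((10.0, 1.0), 2.0), ((12.0, 1.5), 3.0), ((11.0, 0.5), 2.5),
    ((40.0, 8.0), 14.0), ((42.0, 9.0), 16.0), ((38.0, 7.0), 15.0),
]

centroids, means = fit_cluster_grades(SAMPLES, k=2)
print(predict_grade((39.0, 8.0), centroids, means))
```

A per-cluster mean is the simplest possible stand-in for a grade model; any regressor fitted within each cluster could take its place. Repeating the fit for several values of k and comparing how much the per-cluster predictions differ is one way to explore the cluster count, in the spirit of the four-to-five facies result reported above.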






Glen Nwaila thanks CIMERA (Centre of Excellence for Integrated Mineral and Energy Resource Analysis) for funding this research and Sibanye Stillwater for providing the data.

Author information

GN contributed to research and methods development, analysing data and preparing the paper; SEZ contributed to research and methods development, interpreting data and preparing the paper; HF contributed to compiling the geological background and helped check drafts of the paper; and MM, CD, RD, MB and LT helped check drafts of the paper.

Correspondence to Glen T. Nwaila.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLSX 201 kb)


About this article


Cite this article

Nwaila, G.T., Zhang, S.E., Frimmel, H.E. et al. Local and Target Exploration of Conglomerate-Hosted Gold Deposits Using Machine Learning Algorithms: A Case Study of the Witwatersrand Gold Ores, South Africa. Nat Resour Res 29, 135–159 (2020). https://doi.org/10.1007/s11053-019-09498-1



  • Gold
  • Sedimentology
  • Machine learning
  • GS-Pred
  • Witwatersrand
  • k-Means clustering