Guidelines for Enhancing the Signature of Multi-element Mineralization Using Principal Component Analysis: Part 1—Monte Carlo Simulation

  • Jie Yang
  • Eric Grunsky
  • Qiuming Cheng
Original Paper


Principal component analysis (PCA) is widely used in geochemical data processing, where it can integrate the variables associated with mineralization into a single component. In this paper, a Monte Carlo simulation is designed and applied to explore the performance of PCA under conditions controlled by four factors: the number of geo-objects (lithologic units), the differences between geo-objects, the relationships between the variables, and the number of variables. The results imply that: (1) larger differences between geo-objects lead to less stable principal components; (2) a larger number of geo-objects makes the result more robust; (3) variables with similar relationships help to stabilize the result; (4) adding more input variables does not always lead to a better result. These conclusions provide useful guidelines for using PCA to obtain a targeted component, such as one that represents mineralization.
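The kind of experiment described above can be illustrated with a minimal sketch. It is not the simulation design used in the paper: the Gaussian data model, the function names (`simulate_once`, `first_pc`, `pc1_recovery`), the `separation` parameter and the recovery score are all assumptions chosen for illustration. Each run draws several synthetic geo-objects that share one hypothetical "mineralization" loading direction but have different background levels, then measures how closely the first principal component matches that direction.

```python
import numpy as np

def simulate_once(rng, n_objects=3, n_vars=5, n_per_object=200, separation=1.0):
    """Draw one synthetic data set (assumed model, not the paper's design)."""
    # Hypothetical mineralization loading: all variables respond equally.
    target = np.ones(n_vars) / np.sqrt(n_vars)
    blocks = []
    for _ in range(n_objects):
        # Background level of this geo-object; 'separation' sets how far
        # the geo-objects differ from one another.
        center = rng.normal(0.0, separation, size=n_vars)
        # Shared signal along the target direction plus independent noise.
        signal = rng.normal(0.0, 1.0, size=(n_per_object, 1)) * target
        noise = rng.normal(0.0, 0.5, size=(n_per_object, n_vars))
        blocks.append(center + signal + noise)
    return np.vstack(blocks), target

def first_pc(data):
    """Unit-length loading vector of the first principal component."""
    centered = data - data.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

def pc1_recovery(n_runs=200, seed=0, **kwargs):
    """Mean and spread of |cosine| between PC1 and the target over many runs."""
    rng = np.random.default_rng(seed)
    scores = np.empty(n_runs)
    for i in range(n_runs):
        data, target = simulate_once(rng, **kwargs)
        scores[i] = abs(first_pc(data) @ target)  # sign of a PC is arbitrary
    return scores.mean(), scores.std()

if __name__ == "__main__":
    for sep in (0.0, 1.0, 3.0):
        mean, std = pc1_recovery(separation=sep)
        print(f"separation={sep}: mean |cos|={mean:.3f}, std={std:.3f}")
```

Under these assumptions, a score near 1 means PC1 captures the intended signal, and a large spread across runs indicates an unstable result. The other three factors can be explored in the same way by varying `n_objects` and `n_vars`, or by changing `target` so that the variables relate to the signal differently.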


Principal component analysis; Monte Carlo simulation; Data mining; Geochemical exploration



We thank an anonymous reviewer and Prof. Graeme Bonham-Carter for their suggestions and revisions. This research was supported financially by the National Key R&D Program of China (2016YFC0600501), the National Natural Science Foundation of China (Nos. 41430320 and 41602337) and a Chinese Geological Survey project (Minerals and Geological Prospecting on Shallow Covered Areas of Jinning, Inner Mongolia, No. DD20160045).



Copyright information

© International Association for Mathematical Geosciences 2018

Authors and Affiliations

  1. Institute of Geosciences, China University of Geosciences (Beijing), Haidian District, Beijing, China
  2. State Key Lab of Geological Processes and Mineral Resources, China University of Geosciences (Beijing), Haidian District, Beijing, China
  3. Department of Earth and Environmental Sciences, University of Waterloo, Waterloo, Canada
