Mathematical Geosciences, 40:639

Automated Kerogen Classification in Microscope Images of Dispersed Kerogen Preparation

  • L. I. Kuncheva
  • J. J. Charles
  • N. Miles
  • A. Collins
  • B. Wells
  • I. S. Lim


We develop the classification part of a system that analyses transmitted light microscope images of dispersed kerogen preparation. The system automatically extracts kerogen pieces from the image and labels each piece as either inertinite or vitrinite. Image pre-processing consists of background removal, identification of kerogen material, object segmentation, object extraction (producing an individual image of each piece of kerogen) and feature calculation for each object. An expert palynologist labelled the objects as inertinite or vitrinite, which provided the ground truth for the classification experiment. Ten state-of-the-art classifiers and classifier ensembles were compared: Naïve Bayes, decision tree, nearest neighbour, the logistic classifier, multilayer perceptron (MLP), support vector machine (SVM), AdaBoost, Bagging, LogitBoost and Random Forest. The logistic classifier was the most accurate, with an accuracy greater than 90%. Using 10 repetitions of 10-fold cross-validation, as provided within the Weka software, we found that the logistic classifier was significantly better than five of the other classifiers (p<0.05) and statistically indistinguishable from the remaining four. The initial set of 32 features was subsequently reduced to 6 features without compromising classification accuracy. A further evaluation of the system alerted us to the possible sensitivity of the classification to the ground truth, which may vary from one human expert to another. The analysis also revealed that the logistic classifier made most of its correct classifications with high certainty.
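The evaluation protocol described above, 10 repetitions of 10-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' Weka setup. It is a pure-Python illustration under stated assumptions: the data are synthetic two-feature points rather than the kerogen features, the folds are plain (unstratified) shuffles, a nearest-neighbour rule stands in for the ten classifiers, and all function names are hypothetical. The significance test used to compare classifiers is not reproduced here.

```python
import random
from statistics import mean


def repeated_kfold_indices(n, k=10, repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for `repeats` shuffled k-fold splits."""
    rng = random.Random(seed)
    for r in range(repeats):
        order = list(range(n))
        rng.shuffle(order)  # a fresh random partition for every repetition
        folds = [order[i::k] for i in range(k)]
        for f, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield r, f, train_idx, test_idx


def nearest_neighbour_predict(train_X, train_y, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def cross_validated_accuracy(X, y, k=10, repeats=10, seed=0):
    """Return one accuracy value per held-out fold (k * repeats values in total)."""
    accuracies = []
    for _, _, train_idx, test_idx in repeated_kfold_indices(len(X), k, repeats, seed):
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            nearest_neighbour_predict(train_X, train_y, X[i]) == y[i]
            for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return accuracies


def make_synthetic_objects(n_per_class=50, seed=1):
    """Hypothetical stand-in data: two Gaussian clouds, one per kerogen class."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in (("inertinite", 0.0), ("vitrinite", 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)])
            y.append(label)
    return X, y


X, y = make_synthetic_objects()
fold_accuracies = cross_validated_accuracy(X, y)
mean_accuracy = mean(fold_accuracies)
print(f"{len(fold_accuracies)} folds, mean accuracy {mean_accuracy:.3f}")
```

Each repetition reshuffles the objects before splitting, so the 100 fold accuracies come from 10 different partitions of the same data. Averaging over them reduces the dependence of the estimate on any single random split.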


Kerogen recognition; Machine learning; Image processing; Transmitted light microscopy



Copyright information

© International Association for Mathematical Geology 2008

Authors and Affiliations

  • L. I. Kuncheva (1)
  • J. J. Charles (1)
  • N. Miles (2)
  • A. Collins (3)
  • B. Wells (3)
  • I. S. Lim (1)

  1. School of Computer Science, Bangor University, Bangor, UK
  2. PetroStrat Limited, Llandudno, UK
  3. Conwy Valley Systems Ltd, Deganwy, UK
