Machine Learning Methods for Sweet Spot Detection: A Case Study

Geostatistics Valencia 2016

Part of the book series: Quantitative Geology and Geostatistics (QGAG, volume 19)

Abstract

In the geosciences, sweet spots are defined as areas of a reservoir that represent the best production potential. From the outset, it is not always obvious which reservoir characteristics best determine the location, and influence the likelihood, of a sweet spot. Here, we view detection of sweet spots as a supervised learning problem and use tools and methodology from machine learning to build data-driven sweet spot classifiers. We discuss some popular machine learning methods for classification, including logistic regression, k-nearest neighbors, support vector machines, and random forests, and highlight the strengths and shortcomings of each method. In particular, we draw attention to a complex setting and focus on a smaller real data study with limited evidence for sweet spots, where most of these methods struggle. We illustrate a simple solution in which we aim to increase the performance of these methods by optimizing for precision. In conclusion, we observe that all methods considered need some sort of preprocessing or additional tuning to attain practical utility. While the application of support vector machines and random forests shows a fair degree of promise, we still stress the need for caution in the naive use of machine learning methodology in the geosciences.
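The abstract does not say how the optimization for precision is carried out. As a minimal sketch of one common way to do it, assuming a classifier that outputs a score per reservoir cell, the decision threshold can be chosen to maximize precision while keeping recall above a floor. The function names, the `min_recall` parameter and the toy labels and scores below are hypothetical and are not taken from the chapter.

```python
def precision_recall(labels, scores, threshold):
    """Precision and recall when cells with score >= threshold are flagged."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if s >= threshold and y == 1)
    fp = sum(1 for y, s in pairs if s >= threshold and y == 0)
    fn = sum(1 for y, s in pairs if s < threshold and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def best_precision_threshold(labels, scores, min_recall=0.5):
    """Pick the threshold with the highest precision among those
    whose recall is at least min_recall (an assumed constraint)."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall(labels, scores, t)
        if r < min_recall:
            continue
        if best is None or p > best[1]:
            best = (t, p, r)
    return best


# Hypothetical toy data: 1 = sweet spot, 0 = not a sweet spot.
# Sweet spots are rare, as in the limited-evidence setting described above.
labels = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.45, 0.60, 0.70, 0.90]

default_precision, default_recall = precision_recall(labels, scores, 0.5)
threshold, precision, recall = best_precision_threshold(
    labels, scores, min_recall=0.6
)
print(default_precision, default_recall)
print(threshold, precision, recall)
```

On this toy data the default threshold of 0.5 flags one false sweet spot, while the tuned threshold flags none at the same recall. With real data the threshold would have to be selected on held-out cells, otherwise the reported precision is optimistic.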

Acknowledgment

We thank Arne Skorstad and Markus Lund Vevle, both at Emerson Process Management Roxar AS, for the data set and for answering questions related to it.

Author information

Corresponding author

Correspondence to Vera Louise Hauge.

Copyright information

© 2017 Springer International Publishing AG

About this chapter

Cite this chapter

Hauge, V.L., Hermansen, G.H. (2017). Machine Learning Methods for Sweet Spot Detection: A Case Study. In: Gómez-Hernández, J., Rodrigo-Ilarri, J., Rodrigo-Clavero, M., Cassiraga, E., Vargas-Guzmán, J. (eds) Geostatistics Valencia 2016. Quantitative Geology and Geostatistics, vol 19. Springer, Cham. https://doi.org/10.1007/978-3-319-46819-8_38
