
1 Introduction

Through the windows of the cottage, we watched the sun slip behind the trees on the ridge across the lake, turning the light dusting of snow from pink to red to crimson. As darkness settled outside, the windows became mirrors, lit by the flame from the logs in the fireplace, until all we could see was our two reflections, each resting comfortably in an armchair, wine glass in hand. We talked into the late evening, past the rising of the crescent moon, reminiscing about people, about ideas, about where it all began. We’d known each other for more than three decades, and were comfortable when talk lapsed into silence … and equally comfortable when silence gave way to a new thought, a different recollection, and the conversation flared up into a dispute about memory, about theory or about practice. Even when the wine bottle stood empty, and the embers in the fireplace seemed to be exhausted, the logs would sometimes adjust, one breaking and settling against another as sparks shot into the air. New fire from old.

It was December 2013, and I had succeeded in having my old advisor from Stanford, André Journel, visit me in Ontario to discuss a joint contribution to the volume on multiple-point statistics being compiled by Grégoire Mariethoz and Philippe Renard. Busy lives kept us from completing that task, but the conversation from that weekend by Lake Muskoka did become enough of an almost-paper that I was grateful for the opportunity of this 50th anniversary volume to complete what we began. Neither André nor I have much to contribute to modern MPS research; we are both “gray hairs” and now stand well back from the fire of leading-edge research. But our hair was once not so gray, and we were there at the beginning when we laid the kindling for what has become a remarkably rich idea. So our offering from that Lake Muskoka discussion is reflections on how the MPS came together. It is a tale familiar to science, with chance encounters, casual remarks that turn out to have great depth, cocktail napkins turned into whiteboards, heads shaking in disagreement: “that can’t be right”. As we yield the stage to the next generations of researchers, our hope is that others continue to recognize the value of cross-pollination, of interacting with others in the field, especially those who have ideas that contradict one’s own beliefs. When one sturdy idea burns and breaks, settling against another, sparks fly and we have our best chance to ignite new understandings of both theory and practice.

2 1970s

2.1 A Hammer Without a Nail

Although the theory of geostatistical simulation was firmly established by the early 1970s (Journel 1974), it had still not been widely accepted in practice by the end of the decade. The now-venerable turning bands algorithm was the only game in town when one wanted to create a conditional simulation. There were a handful of practical case study examples of conditional simulation in the mining industry, but it remained a hard sell in an industry that prefers, even now, to report one single “best” estimate of mineral resources and reserves rather than to wrestle with a family of equi-probable outcomes.

3 1980s

3.1 Interest in Geostatistics Spreads to the Oil Industry

Through the 1970s, the oil industry lagged behind the mining industry as an adopter of geostatistics. Many oil companies found value in some of the trend variants of kriging as additional tools in their contouring toolkit, especially when dealing with structural traps where trends are common in the elevation of the top of structure. Kriging with an external drift offered a good contouring solution in faulted reservoirs where seismic data provided strongly correlated indirect measurements of depth to the top of the reservoir (Maréchal 1984). But oil companies had many good contouring methods that worked well without any geostatistics, and it was not until the late 1980s that most of the major oil companies took notice of conditional simulation, because it offered something new: the ability to do Monte Carlo analysis with 3D models of a reservoir’s rock and fluid properties that honored data and that were geologically plausible.

3.2 New Simulation Tools and the Struggle for Visual Realism

At Stanford, where I studied in the 1980s, research was supported by the Stanford Center for Reservoir Forecasting. The SCRF consortium’s interest in risk analysis fueled a growing number of new geostatistical algorithms for creating realizations that honored continuous data (typically rock and fluid properties) and categorical data (typically lithologies): sequential indicator simulation (Alabert 1987), LU decomposition (Alabert 1987; Davis 1987), and sequential Gaussian simulation (Isaaks 1990; Gómez-Hernández 1991).

Despite having new algorithms for the conditional simulation of continuous variables, Stanford’s toolkit still struggled to produce convincing simulations of categorical variables such as lithologies in a sand-shale sequence. Although indicator realizations could be made to honor indicator variogram models, the results usually were not convincing as artwork; they simply looked wrong. In Fig. 32.1, much of the (limited) success of the SIS simulation is due to the use of a trend model and to locally varying directions of maximum continuity, and not so much to the indicator kriging or to sequential simulation.

Fig. 32.1

Examples of indicator simulation and object-based simulation of fluvial channels. The image at the top shows a training image (a satellite image of the Brahmaputra River) from which indicator variograms were calculated and used to create the SIS realization in the middle frame, conditioned to the data shown as circles. The same training image provides information on the distribution of parameters that describe object geometry; these were used as input to an object-based simulator, FLUVSIM (Deutsch and Tran 2002), to create the unconditional realization at the bottom. Although the object-based simulation succeeds in creating channels that are visually more coherent, it is difficult to condition to known lithologies at specific locations

Boolean simulations that stochastically arranged prescribed geometries into a computer model usually won more approval for realism, but because these object-based algorithms were not pixel-based, they had difficulty with conditioning to well data, especially if there were lots of closely-spaced wells. In Fig. 32.1, the SIS realization is conditioned to 240 data points; but the object-based simulation, which produces a more satisfying result, is unconditional.

Throughout my time as a graduate student at Stanford, the Holy Grail of conditional simulation was a best-of-both-worlds algorithm that had the visual realism of object-based methods but that conditioned easily to hard data, no matter how dense. There were discussions at that time about the possibility that we might never achieve what we thought we wanted because of the fundamental difference between the statistical characteristics of an image and the meaning that knowledgeable experts extract from the image. In the example in Fig. 32.1, human vision allows us to see the entire set of meandering braided channels. Statistical summaries, especially variograms, do not “see” anything in its entirety; they see the image two points at a time. The analogy that André Journel often used was that of a blind person trying to understand an object in front of him while allowed to probe only with his two forefingers. Limited to poking here and poking there, the blind person would struggle to tell the difference between an elephant and a rhinoceros.

The envy of the visual success of object-based realizations, and the desire to maintain the ease of conditioning with pixel-based methods, catalyzed a lot of discussion in the late 1980s about multi-point geostatistics. What would three-point or four-point or n-point variograms look like? How might they be calculated experimentally? How could they be modeled? How should they be used in an improved version of kriging?

3.3 Outcrops and Scanned Images as Analogs

In the mining industry, where geostatistics was first embraced, drill hole spacing is typically on the order of tens of meters, close enough that the choice of a variogram model could be based on experimental variograms. In petroleum reservoirs, wells are typically spaced several hundreds of meters apart, sometimes thousands of meters. This practical reality of petroleum applications gave rise to an immediate practical problem when the oil industry took an interest in conditional simulation in the 1980s: where to get the closely-spaced information required to make experimental variograms?

The common advice in the 1980s was that outcrop studies could provide the data required to support statistical and geostatistical parameter choices, such as the length, anisotropy and orientation distributions required for object-based methods, or the variograms required for geostatistical methods. Outcrop studies did not begin in the 1980s; but this was the decade when they flourished. Many of the major oil companies, either individually or in consortiums, funded detailed quantitative studies of outcrops that could serve as good geological analogs for producing fields. And outcrop studies from earlier decades were dusted off and re-purposed as sources for data that could support parameterization of computer models.

Figure 32.2 shows an example of data from a 1960s outcrop study that was re-discovered by several researchers in the 1980s. It was created by digitizing shale streaks from a photograph of a cliff face of an outcrop of the Assakao Formation in the Tassili region of the central Sahara (Dupuy and du Prey 1968). Fifteen years after the data were first presented, Helge Haldorsen used the Assakao outcrop study as the basis for choosing the shale length distribution for object-based simulations of sand-shale sequences in his Ph.D. research (Haldorsen 1983; Haldorsen and Chang 1985). During my time as a student at Stanford, I shared an office with Alec Desbarats, to whom Helge had given the Assakao data for his research on stochastic modeling of flow in sand-shale sequences (Desbarats 1987).

Fig. 32.2

The Assakao Sandstone data set (from Desbarats 1987). The formation is generally sandstone (white) with occasional shale streaks (black)

If a good outcrop analog was not available, one could (with fingers crossed and a prayer for absolution of sin) invoke a fractal argument and choose as an analog something with an entirely different scale. At a much larger scale than most reservoirs, satellite imagery, which started to become widely available in the 1980s, could serve as the source of information on spatial statistics. At the regional scale, or even at the scale of very large reservoirs, images like the top frame in Fig. 32.1 could help in sorting out statistical parameters for numerical simulation. And at a much smaller scale, there were scanned images of slabs of sedimentary rock at the scale of hand specimens, such as the example shown in Fig. 32.3.

Fig. 32.3

Digital image of a slab of cross-bedded sandstone from Utah

Digitized images, whether of outcrops or of similar phenomena at different scales, provide a basis for calculating not only experimental variograms but also multi-point statistics. When calculated from a rasterized image, the length distribution of shale streaks can be seen as a multi-point statistic. In the Assakao outcrop example shown in Fig. 32.2, where the individual pixels are 20 × 20 cm, the probability of encountering a shale streak that is 20 m long can be calculated by scanning the image across each row, counting up the number of times we get a white pixel followed by 100 black pixels and then another white pixel … then dividing this by the total number of shales of any length. Alec Desbarats did exactly this in his Ph.D. thesis when he wanted to test the fidelity of the synthetic sand-shale sequences he had created using indicator simulation (Fig. 32.4). He knew he had the correct proportion of shales and that he had matched the indicator variogram; but he was curious about how well he had done on the multi-point statistic that Helge Haldorsen controlled directly in his simulations. Figure 32.5 shows the histograms of the shale length distributions from an indicator simulation of the Assakao outcrop and from the original image; the indicator simulation shows more very short shales than does the original image, with a lower mean length and higher variance.
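
This run-length tally amounts to only a few lines of code. Here is a minimal sketch in Python, not the code used at the time, assuming the outcrop image is held as a 2-D array of 0s (sand) and 1s (shale); assakao_img is a hypothetical array name.

```python
import numpy as np

def shale_length_distribution(img):
    """Distribution of shale-streak lengths (in pixels) from a rasterized
    outcrop image where 0 = sand (white) and 1 = shale (black).

    Returns a dict mapping streak length in pixels to its relative
    frequency among all shale streaks, regardless of length.
    """
    lengths = []
    for row in img:
        run = 0
        for pixel in row:
            if pixel == 1:          # still inside a shale streak
                run += 1
            elif run > 0:           # streak just ended at a sand pixel
                lengths.append(run)
                run = 0
        if run > 0:                 # streak touching the right edge of the image
            lengths.append(run)

    values, counts = np.unique(lengths, return_counts=True)
    return dict(zip(values.tolist(), (counts / counts.sum()).tolist()))

# With the Assakao raster's 20 x 20 cm pixels, the probability of a 20 m
# shale streak is the relative frequency of 100-pixel runs:
#   probs = shale_length_distribution(assakao_img)   # hypothetical array name
#   probs.get(100, 0.0)
```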

Fig. 32.4

Indicator simulation of the Assakao outcrop image in Fig. 32.2 (from Desbarats 1987)

Fig. 32.5

Histograms of shale lengths from Fig. 32.2 (left) and Fig. 32.4 (right) (from Desbarats 1987)

Other similar studies at the same time by François Alabert showed the same result: indicator simulation produces realizations with too many short features and too few long features. The over-representation of short features is also obvious from a visual comparison of indicator simulations to the reality they try to mimic, e.g. the top two frames in Fig. 32.1, or the realization in Fig. 32.4 against the outcrop image in Fig. 32.2. The common explanation given at the time was that when an algorithm controls only the first- and second-order moments (the histogram, or indicator proportions, and the variogram), the uncontrolled higher-order moments drift in the direction of disorder, or maximum entropy.

3.4 Leaving the Ivory Tower and Getting on with Adult Life

My years as a student at Stanford ended in 1988. Sold my bicycle, the one that hadn’t been stolen. Gave up the wonderful room I had in a camping trailer behind a house in Palo Alto. Headed off into the world of consulting, with Neil Schofield and Roland Froidevaux as my partners in FSS International Consultants. The notion was simple: Neil and I were familiar with student poverty and didn’t mind another year of living with little money. After a year, if we failed as consultants then we could get real jobs.

We managed not to fail, and each of the FSS partners found himself busy with clients who wanted advice and assistance with geostatistical studies. My workload was split between mining studies, where simulation was rarely discussed, and petroleum studies, where kriging was rarely discussed.

Even though my mining studies had little to do with stochastic modeling, there was one mining project that, in hindsight, probably planted some useful seeds for what later became the MPS prototype algorithm. It was a project in which some of the useful geological and numerical data were available only from paper records written by hand decades ago: drill logs with assay values transcribed manually. In the late 1980s, software for optical character recognition (OCR) struggled with handwriting; it still does today, but it was worse back then. Even though commercial OCR software could make no sense of the handwritten logs, my sense was that it should be possible to extract much of it automatically, instead of going through a time-consuming and error-prone process of manual data entry. The drill logs were neat and legible, and all of the key numerical values were written in boxes on a form. With only 11 possible characters in use, the ten digits and the decimal point, it seemed possible to me that the handwriting could be recognized by an algorithm that trained itself from actual images. I wrote a program that would search the scanned image (an eight-level grayscale raster), looking for islands of non-white in the appropriate boxes on the form. It would then show what it had found to the user, who would identify the symbol by typing in one of the 11 choices. After a few dozen examples of each of the 11 possibilities, the software was able to estimate the probability that a new small patch corresponded to each of the possibilities. It did this simply by direct pixel-to-pixel matching of grayscale levels, without any clever rescaling or rotation. If it could not establish a sufficiently high probability for one particular choice, it would drop pixels from the comparison and try again. The user would correct it when it made mistakes, and the software would store its acquired collection of confirmed examples in a growing database. As with most of my Mo-code, it took a bit of tinkering to get it to work well; but it ended up being used, and saved weeks of data entry from hundreds of old drill logs. We ended up calling the program “Am-I-Right” because that’s how the program worked: by making guesses based on pixel-to-pixel pattern matching, and then checking with the user to see if that guess was correct.
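
What follows is a rough sketch of the matching idea, not the original Mo-code; the names, the common patch shape, and the way the scores are normalized into probabilities are my assumptions, and the pixel-dropping retry described above is omitted.

```python
import numpy as np

def guess_symbol(patch, examples):
    """Estimate, for each of the 11 symbols, the probability that `patch`
    is that symbol, by direct pixel-to-pixel matching of grayscale levels
    against every confirmed example in the database.

    patch    : 2-D array of gray levels (0-7 for an eight-level raster)
    examples : dict mapping each symbol ('0'..'9' and '.') to a list of
               previously confirmed patches of the same shape
    Returns (best_symbol, {symbol: probability}).
    """
    patch = patch.astype(float)
    scores = {}
    for symbol, confirmed in examples.items():
        # similarity of the closest confirmed example: 1 minus the mean
        # absolute grayscale difference, rescaled to the 0-1 range
        diffs = [np.abs(patch - ex.astype(float)).mean() / 7.0 for ex in confirmed]
        scores[symbol] = 1.0 - min(diffs)
    total = sum(scores.values())
    probs = {s: v / total for s, v in scores.items()}
    best = max(probs, key=probs.get)
    return best, probs

def confirm(patch, correct_symbol, examples):
    """The 'Am I right?' step: the user's confirmation (or correction) is
    stored so that future guesses have one more example to match against."""
    examples.setdefault(correct_symbol, []).append(patch)
```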

3.5 Chris Farmer’s Unexpected Claim

1988 was also the year when I first met Chris Farmer, at the SPE Forum on reservoir characterization in Grindelwald, Switzerland. He was working on methods for numerically simulating reservoir rocks, recognized the benefits of a pixel-based approach, and had developed new ideas about what information to extract from outcrop studies and scanned images of analogs (Farmer 1989). During my early years as a consultant, I managed to visit Chris at the UK Atomic Energy Authority’s research centre at Winfrith. During this visit, he made a claim that seemed implausible … no, it actually seemed flat-out wrong; but I was raised well by my parents, and knew that it was rude for a guest to precipitate an argument.

We had been talking about extracting indicator variogram and cross-variogram information from scanned images, and Chris remarked that you have to be careful when you do this: if you try to make a realization exactly match all of the indicator variograms and cross-variograms from a scanned image, you’ll just get back the scanned image; and the purpose of creating realizations is not to exactly match one “true” image, but to sample a space of uncertainty that shares something in common with the original image. I checked that I had understood him correctly: did he really mean that you can exactly … exactly … match an image just by reproducing its indicator variograms and cross-variograms? I knew (or thought I knew) that this wasn’t true. Even with multiple indicators, all of the variograms and cross-variograms are still two-point statistics; you’re still a blind person, feebly prodding either an elephant or a rhino.

Chris clarified that he did mean exactly, with one minor caveat: you actually get two possible images that are 180° rotations of each other; you might end up with an upside-down elephant, but you’d easily be able to figure out that it wasn’t a rhino. And he also explained that he meant matching the complete experimental indicator variograms for every possible separation distance and direction on the rasterized image. Even with these caveats, I still found his claim implausible; but I kept thinking about why he would be so sure of it.

The other reason it was not worth getting into the details of why Chris was confused was that I agreed with the basic point he was trying to make: the purpose of what we have now taken to calling a “training image” is not to match it, but instead to use it as a guide for selected spatial statistical characteristics. You want to match the statistics, while conditioning to data, not replicate one training image.

4 1990s

4.1 Why Chris Farmer Was Right

In 1991, the SPE Forum on reservoir characterization was held in Crested Butte, Colorado, and I had a chance to continue the discussion with Chris Farmer about indicator variograms and training images. When I explained, as diplomatically as I could, that I didn’t think his claim was correct, he grabbed a nearby napkin, sketched a small grid, and colored in some pixels as black, white and gray. He agreed that I was right if we lived in a world of variogram models for random functions that are infinite in all directions. “But in the real world, things have edges,” he explained patiently, “and this means there’s only one pair of pixels in the original image that completely spans the diagonal”. He went on to show how you can actually deduce the grayscale levels for the two corner pixels (up to the 180° rotation) and then work inwards from the corners. The Appendix to this paper shows a small worked-out example of the trick that Chris explained.
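
A small count is enough to illustrate the edge effect behind Chris’s napkin sketch. The snippet below is only an illustration, not the reconstruction itself (a worked example of that is in the Appendix): it tallies how many pixel pairs on a finite grid share each lag, showing that the longest diagonal lag is shared by exactly one pair, the two opposite corners, which is why the complete experimental indicator variograms pin those corner values down (up to the 180° rotation) and let you work inward.

```python
def pairs_per_lag(nx, ny):
    """Number of pixel pairs on an nx-by-ny image that share each lag (dx, dy),
    restricted here to non-negative offsets for simplicity."""
    counts = {}
    for dx in range(nx):
        for dy in range(ny):
            if dx == 0 and dy == 0:
                continue
            counts[(dx, dy)] = (nx - dx) * (ny - dy)
    return counts

counts = pairs_per_lag(5, 5)
print(counts[(4, 4)])   # 1  -> only the two opposite corners span the full diagonal
print(counts[(1, 0)])   # 20 -> short lags are shared by many pixel pairs
```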

As soon as he explained it, and I realized that I was the one who was wrong, Chris dismissed it as an algorithmic oddity, a cute and clever trick that has no practical value for simulating reservoir rocks, especially because the goal is never to exactly replicate the original image.

Even though I understood the principle behind the procedure of attacking the corners first and then working inwards, the algorithm still wasn’t clear in my head, and I spent some time that year trying to write code for doing what Chris had described. I never did manage to work out all the special cases, and it ended up on the back burner as one more unfinished project.

4.2 Back to the Ivory Tower: A Brief Escape from Adult Life

In late 1991, my consulting business was thriving and growing; I had a small staff in Vancouver, and plenty of project work. But I was spending more time as an administrator and manager, neither of which I am good at, and less time doing the technical work that I enjoy.

My old advisor convinced me that I could let the staff run the show while I spent a year at Stanford, back in the ebb and flow of new ideas with his new crop of graduate students. Twenty-five years later, I find it remarkable what was accomplished during that year: P-field simulation, collocated cokriging, and a proof-of-concept algorithm for multiple-point statistics. All of these new geostatistical methods that we investigated in 1992 began with a piece of Mo-code that did something useful, and not with theory; that came later. André comes at research from the side of theory that leads to equations that can be coded and tested. I tend to come at it the opposite way, with a piece of code that achieves a desired result and that then leads to the question “I wonder why that works?”.

In the early part of 1992, with the luxury of time to do research again, I dusted off some of my back-burner projects, and came back to my attempt to code Chris Farmer’s trick for replicating an image from its indicator variograms and cross-variograms. The details of the algorithm were still a mess, but I realized that I could get very close to a satisfactory result using simulated annealing, a possibility that came to the forefront because Clayton Deutsch was finishing his Ph.D. thesis on simulated annealing that year. I wrote a program that would start with a grid that had exactly the correct proportions of the gray levels, randomly scattered, and that would use simulated annealing to iteratively adjust the image by swapping pixels, pushing the experimental indicator variograms and cross-variograms of the evolving grid toward the target values established by the complete indicator variograms and cross-variograms of the original image. No variogram models were used; everything was done using look-up tables of variogram values. I used a photo of André, exhausted after a climb on Mount Whitney, as the training image, converted to an eight-level grayscale image with seven indicators: 200 columns and rows, seven direct indicator variograms, 21 indicator cross-variograms, all calculated for every one of approximately 80,000 lags on the image. It took four days of run-time and hundreds of millions of swaps before the difference between the indicator variograms of the simulation and those of the image could no longer be reduced. It was hopelessly inefficient, but it confirmed for me that Chris Farmer had been right.
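
The logic of that experiment can be sketched compactly. The code below is a toy reconstruction, not the 1992 program: it restricts the look-up table to a small set of lags, omits the indicator cross-variograms, uses a generic geometric cooling schedule, and, like the original, is hopelessly inefficient because it recomputes the full objective after every proposed swap.

```python
import numpy as np

def indicator_variograms(img, lags, thresholds):
    """Experimental indicator variograms of a 2-D grayscale image, stored as a
    look-up table gamma[k, l] for indicator k (img <= thresholds[k]) at lag l."""
    nx, ny = img.shape
    gamma = np.zeros((len(thresholds), len(lags)))
    for k, t in enumerate(thresholds):
        ind = (img <= t).astype(float)
        for l, (dx, dy) in enumerate(lags):
            head, tail = ind[:nx - dx, :ny - dy], ind[dx:, dy:]
            gamma[k, l] = 0.5 * np.mean((head - tail) ** 2)
    return gamma

def anneal_to_variograms(train, n_iter=100_000, temp=0.01, cooling=0.9999, seed=0):
    """Swap pixels of a random scatter (with exactly the right gray-level
    proportions) to push its indicator variograms toward the training image's."""
    rng = np.random.default_rng(seed)
    thresholds = np.unique(train)[:-1]          # e.g. seven indicators for eight levels
    lags = [(dx, dy) for dx in range(5) for dy in range(5) if (dx, dy) != (0, 0)]
    target = indicator_variograms(train, lags, thresholds)

    real = rng.permutation(train.ravel()).reshape(train.shape)
    obj = np.sum((indicator_variograms(real, lags, thresholds) - target) ** 2)

    nx, ny = real.shape
    for _ in range(n_iter):
        i1, i2 = rng.integers(nx, size=2)
        j1, j2 = rng.integers(ny, size=2)
        if real[i1, j1] == real[i2, j2]:
            continue                            # swapping equal values changes nothing
        real[i1, j1], real[i2, j2] = real[i2, j2], real[i1, j1]
        new_obj = np.sum((indicator_variograms(real, lags, thresholds) - target) ** 2)
        if new_obj <= obj or rng.random() < np.exp((obj - new_obj) / temp):
            obj = new_obj                       # accept the swap
        else:
            real[i1, j1], real[i2, j2] = real[i2, j2], real[i1, j1]   # undo it
        temp *= cooling
    return real
```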

For me, the recognition that you can exactly match an image from a very complete and specific statistical summary of specific patterns was an eye-opener. Although it now probably seems fairly obvious, in the early 1990s the wealth of information contained in an image’s statistical summaries was not immediately apparent. At the time, the normal workflow was to assemble statistical parameters by fitting models to experimental statistics. The experimental variogram, for example, was an important stepping-stone to a variogram model; but it was only a means to an end. We did not think of the massive look-up table of summary statistics for thousands of grouped pairs of data as something that could serve directly as an input parameter. But why not? Why, in an age of computer power, did we continue to create simplified mathematical models of statistical characteristics? Was it really necessary to boil the parameterization down to a few numbers, a nugget effect and a range, rather than leave the statistical summary in its original form as a massive look-up table? For me, this was the “aha” moment catalyzed by my belief, years earlier, that Chris Farmer’s claim about indicator variograms was not correct. The reason I was wrong was that massive look-up tables of indicator variograms are a rich source of very detailed information. The mistake we were making was that we moved past this wealth of information and replaced it with a simple model.

The idea for the first prototype of an MPS simulation algorithm came from the accidental meeting of thoughts about the role of training images in reservoir simulation and the experience of having coded the Am-I-Right procedure for optical character recognition for a mining project. The principal difference between Am-I-Right and the MPS prototype is that, after scanning the image to build a probability distribution, the Am-I-Right procedure always took the most likely value while MPS used the distribution as a basis for random sampling.
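
A minimal sketch of that single-pixel step is given below. It is an interpretation of the description above, not the original prototype code: the function and variable names are hypothetical, it assumes a 2-D training image and a small set of known neighboring values, and it ignores the sequential simulation path and the fallback needed when a data event is never found in the training image.

```python
import numpy as np

def sample_from_training(ti, data_event, rng):
    """One step of the MPS-prototype idea: scan the training image for every
    occurrence of the conditioning data event, tally the central values of
    those replicates, and draw from the resulting distribution.

    ti         : 2-D array, the training image
    data_event : dict mapping (di, dj) offsets from the pixel being simulated
                 to the known values at those offsets
    rng        : numpy random Generator
    Returns a simulated value, or None if the event is never found (a fuller
    implementation would then drop the farthest conditioning value and rescan).
    """
    nx, ny = ti.shape
    tallied = []
    for i in range(nx):
        for j in range(ny):
            match = True
            for (di, dj), v in data_event.items():
                ii, jj = i + di, j + dj
                if not (0 <= ii < nx and 0 <= jj < ny) or ti[ii, jj] != v:
                    match = False
                    break
            if match:
                tallied.append(ti[i, j])     # central value of this replicate
    if not tallied:
        return None
    # Am-I-Right would return the most frequent value here; the MPS
    # prototype instead samples the conditional distribution at random.
    return rng.choice(tallied)

# usage sketch (hypothetical names): simulate one pixel given two known neighbors
#   rng = np.random.default_rng(0)
#   value = sample_from_training(training_img, {(0, 1): 1, (1, 0): 0}, rng)
```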

The first tests of the MPS prototype were done on a digital image of a cross-bedded sandstone, like the one shown in Fig. 32.3. This was chosen because it presents curved structures that are difficult to capture with most geostatistical simulations, which tend to show straight features in the direction of maximum continuity unless an explicit attempt is made to use locally varying directions of anisotropy. Figure 32.6 shows the first published results of an MPS simulation (Guardiano and Srivastava 1992). That Tróia ’92 paper used a two-level black-and-white training image because the first tests on an eight-level grayscale image were very slow; it would be several years before Sebastien Strebelle’s Ph.D. research (Strebelle 2000) produced the first efficient and practical implementation of the original clumsy prototype.

Fig. 32.6

The first published example of results of an MPS simulation (from Guardiano and Srivastava 1992). The frame on the left shows the training image, a black-and-white image obtained from a digital photograph of a slab of cross-bedded sandstone. The middle frame shows a realization from sequential indicator simulation. The right frame shows a realization from the MPS prototype algorithm

Even though the first results were not brilliant, certainly not by today’s standards, they did show that it was possible to impart to a simulation higher-order connectivities and patterns that are not explicitly summarized in variograms. In the right frame of Fig. 32.6, it is the black pixels that make the thin curved arcs, while the contiguous regions of white pixels tend to be larger and blockier. The middle frame of Fig. 32.6 shows that these features are hard to capture in an indicator simulation, which tends to symmetrize the black and white geometries when the proportion is near 50%.

5 Concluding Thoughts

Where do ideas come from? Is it possible to create fertile conditions for innovation? Of the many who have studied these questions, my favorite is Steven Johnson, who wrote Where Good Ideas Come From: The Natural History of Innovation; he has presented his thoughts in a 2010 TED Talk and also in a short YouTube video (https://www.youtube.com/watch?v=NugRZGDbPFU). Much of what Johnson identifies as the key elements of innovation is in evidence in the origins of the MPS simulation algorithm: the slow incubation of hunches, the borrowing and combining of ideas from other people with related hunches, the catalytic effect of recognizing error, and of finding the missing piece.

The one piece of Johnson’s message that resonates most strongly with my experience is the importance of staying connected to others; he often concludes his presentations with the observation that innovation comes by chance, but chance favors the connected mind. By “connected mind” he means a mind that is connected to what others are doing, to how they are thinking about similar problems. It is the hunches and cast-off ideas of those people that you’ll end up borrowing and adapting to improve a hunch of your own that has not yet reached fruition.

Of the many different ideas that ended up being woven together into the MPS prototype, there may be a dropped thread, something that might be research worth pursuing. It is the fact that complete indicator variograms and cross-variograms provide extremely rich and detailed information about an image, so rich and detailed that they can, in fact, be used to replicate the original image. While replication of a training image should never be a goal, it’s intriguing to think about what we might be able to do if we matched a small sub-set of the complete look-up table of all indicator variograms. We know that we get a “perfect” realization if we use 100% of the look-up table. Would the realization look “fairly good” or “completely ugly” if we decimated the complete look-up table and used only 10% of it, or only 1%? My own tests with the annealing version of this procedure, and the example in Appendix A, indicate that the indicator cross-variograms are sometimes not necessary, i.e. that you can achieve nearly perfect reproduction of the original image without them. Dropping all the indicator cross-variograms would considerably reduce the size of the look-up table, or of any subset of it. Something worth trying?

My final reflection is on the beneficial tug-of-war between theory and practice. Throughout my career as a consultant, and tourist in academia, I have enjoyed discovering that the path to a solution sometimes starts when you enter the maze from the theory side, and sometimes starts from an entrance on the practical side. When theory leads you to the point of a set of equations, that need not be the end because there may be something useful to be learned in attempting to implement those equations in practice, in writing a piece of computer code that produces an answer in a reasonable amount of time. And, coming from the other end, having developed an algorithm that produces an intriguing result that seems “good” or “right”, it’s useful to try to work out why it works. Even if the answer came heuristically, the theory that explains why it’s an approximately correct answer might reveal a generalization that makes it possible to improve the answer.