1 Introduction

The concept of mindfulness is largely dependent on one’s theoretical perspective but, in general, there is agreement that it involves open receptive attention, present moment awareness, and de-automization in thought processes. As a contemplative training intervention, mindfulness has been lauded by many practitioners for improving performance, with claimed benefits ranging from increased productivity to enhanced decision making [11]. While some of these results are backed by empirical evidence, the scientific community lags in comprehensively validating these claims [19]. Despite these techniques’ ancient origins in religious practices (e.g., Zen Buddhism) and their use in Clinical Psychology settings, mindfulness remains a nascent field of scientific inquiry [4, 10, 20]. This has resulted in calls from the scientific community to establish a comprehensive research agenda across disciplines of Psychology so that practical prescriptions are underpinned by empirically derived principles and guidelines [5, 6, 12, 19].

Criticisms of the existing body of empirical research include the lack of a single operational definition for mindfulness, an overreliance on subjective recall measures that leaves common method bias as a concern, an ill-defined nomological network, a failure to control for confounds, and an inability to replicate results [1, 4, 19]. Additionally, Clinical Psychology scholars are pointing out potential negative outcomes of mindfulness-based interventions, with contraindications for certain populations emerging in the empirical literature [19]. This points to the need to more fully understand the boundary conditions of contemplative strategies. Acknowledging these challenges provides opportunities to employ rigorous, theory-driven methods to arrive at an informed evidence-based practice. Further, such studies will help practitioners determine the return on investment of interventions such as mindfulness practice.

Historically, Modeling and Simulation (M&S) test environments have supported the development of principles and guidelines based on multitrait-multimethod approaches in other contexts and may provide a similar supporting role for mindfulness concepts [3]. When combined with a statistical methodology such as Design of Experiments (DoE), M&S offers an effective and efficient strategy for determining and evaluating key system and human performance parameters. M&S and DoE have been successfully applied to a wide variety of industries including medical, agricultural, e-commerce, and defense [13]. One of the fundamental concepts of DoE – replication – is well-suited for M&S applications. Replication via M&S increases test data and confidence in test results, allowing for comparisons across samples and techniques that would be difficult, impractical, or too expensive otherwise [17].

Given the lack of empirical evidence for the effectiveness of contemplative training interventions, could a similar analytical approach (M&S plus DoE) be used to validate current results or find alternative ones? The ability to identify and scope significant factors and control the test environment remains paramount. Certainly, factors such as the background of the individual, the quantity and type of contemplative (e.g., mindful breathing, focused thought, meditation) or other training interventions experienced, task expertise level, and nature of the task require further examination and control, which M&S offers [4]. What other factors have the potential to affect the results, and to what degree do those factors interact? A recent meta-analysis of trait mindfulness, the average/baseline level of a person’s mindfulness absent a mindfulness practice or intervention, suggests the existence of mediating variables between this construct and both work effort and perceived job stress. An M&S environment would offer opportunities to identify such relationships and interaction effects in a controlled setting [16]. What test environment features (M&S or otherwise) are necessary to realistically stimulate the system under test and conduct data collection? We posit that an M&S testbed, along with a statistically designed test approach, could provide empirical answers to these questions for a given task or activity and system under test. Prior to a discussion of M&S, we introduce mindfulness definitions and conceptualizations from multiple academic disciplines. Next, we discuss methodological shortfalls in mindfulness research and propose measures for assessment across human functions and performance categories. We then apply basic M&S and DoE principles to a sample task (driving) to show how they can clarify the relationships among, and the significance of, the mindfulness categories and associated measures. Finally, we draw preliminary conclusions from the presented research and propose foundations for future mindfulness research and testing.

2 Mindfulness Definition and Concept Confusion

Globally, what is meant by the term mindfulness is largely dependent upon theoretical perspective, leading to an incohesive literature base, which is further compounded by interest in the topic across disciplines. While cross-disciplinary interest provides some exciting prospects, it also presents challenges when crosstalk is stilted. The theoretical perspectives underpinning work in mindfulness can be crudely dichotomized into those that nest neatly within Eastern Philosophy and those that have been adapted to fit within Western Philosophy. Regardless of philosophical stance, there is agreement that mindfulness involves open receptive attention, present moment awareness, and de-automization in thought processes. Beyond this, there are significant departures that serve as sources of debate. Table 1 below provides some popular definitions for mindfulness across academic disciplines. It is evident from this list that mindfulness is largely considered a state rather than a trait construct. However, while reviewing the literature in preparing this manuscript, we noted on several occasions that surveys psychometrically validated to assess trait mindfulness were used to assess state mindfulness. Unfortunately, this is not an uncommon occurrence in the study of mindfulness [9].

Table 1. Mindfulness operational definitions in the literature

One of the greatest challenges in the empirical literature is the lack of a consistent conceptualization for mindfulness. Good et al. [9] conducted a review of the mindfulness literature to understand its effects in the workplace. This review revealed that the term “mindfulness” has been used to refer to trait mindfulness, state mindfulness, mindfulness practice, and mindfulness interventions. While all of these uses are valid, the umbrella term “mindfulness” is not recommended for building a coherent scientific and technical base to advance understanding. Rather, it is imperative that any given study specify which conceptualizations are under consideration. In alignment with this recommendation, we offer Table 2 below to facilitate selection of terminology. Regardless of the concept(s) under test, research methodology remains a concern across disciplines.

Table 2. Mindfulness conceptualizations.

3 Addressing Methodological Shortfalls in Testing Mindfulness Concepts

Recently, Goldberg et al. conducted a systematic review of the methodological quality of the Clinical Psychology mindfulness literature base, which revealed modest improvements over the last 17 years [8]. However, they did identify needed methodological improvements: (1) active control conditions, (2) larger sample sizes, (3) longitudinal studies, (4) treatment fidelity assessment, and (5) reporting of instructors/instruction certification/validation. These shortfalls can be leveled at the Work Psychology literature as well, with the addition of an overreliance on cross-sectional methods, which leaves common method bias a concern and causation in the existing nomological network unanswered [9]. Further, a failure to replicate results has been noted. All told, many of these methodological deficiencies can be readily remedied (e.g., conducting a power analysis can help identify the sample size needed to adequately test a concept in any given study).
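
As a minimal sketch of the power analysis mentioned above, the following Python snippet approximates the per-group sample size for a two-sided, two-sample comparison of means using the standard normal approximation. The function name, the default significance level and power, and the effect size of 0.5 are illustrative assumptions, not values taken from the studies cited here; an exact t-based calculation would give a slightly larger number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Normal approximation: n = 2 * ((z_(1 - alpha/2) + z_power) / d) ** 2,
    where d is the standardized effect size (Cohen's d).
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Hypothetical medium effect (d = 0.5), chosen only for illustration.
n_medium = sample_size_per_group(0.5)
# A smaller assumed effect requires a much larger sample.
n_small = sample_size_per_group(0.2)
```

Running such a calculation before data collection makes the sample-size assumption explicit, which speaks directly to the call for larger samples.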

Recently, there have been efforts across both the Clinical and Work Psychology disciplines to provide frameworks to organize existing research and define points of departure for future research [9, 19]. We integrated these frameworks in Table 3 below where there was convergence and added a category where one should naturally exist (i.e., attitudes). Additionally, we culled existing measures that have been used to test mindfulness concepts in the literature, demonstrating that researchers are reaching beyond the surveys used in cross-sectional studies. This list is in no way exhaustive but rather is intended to serve as a point of departure to inspire future directions. Working with these measures, studies to test antecedents, correlates, and proximal/distal outcomes can easily be conceived. In this vein, such an organizing framework lends itself to the development of testable theories of mindfulness, where few exist. Further, with rigorous methodologies and an understanding of the variables that may have substantial payoff, one could use Experimental Design to rapidly define a research agenda.

Table 3. Mindfulness categories, potential measures, & potential utilization of M&S testbed

M&S offers a strong resource for rigorous empirical testing of mindfulness concepts. First, M&S offers the suspension of reality through the creation of contexts that research participants experience. For example, in a clinical setting one could easily conceive of modeling a series of anxiety-inducing environments in which the efficacy of various mindfulness practices could be tested. Similarly, in a work setting, environments that simulate task settings could be developed to test the effectiveness of mindfulness on workers’ performance. These types of environments could be used to support laboratory, quasi-experimental, longitudinal, and computational experimental methods while also providing a measure of control that has been missing in many past efforts. Further, it is plausible that these measures can be combined in a myriad of ways depending on the research questions of interest.

4 Leveraging M&S and Experimental Design to Test a Sample Mindfulness Research Agenda

M&S environments are typically very good at computationally based problems and can often be executed many times and very quickly. In doing so, M&S can produce large sets of results, generally with much less time, effort, and resources than would otherwise be required. It is these basic characteristics that often lead people to use M&S to address their research questions. But how do you know which research questions can best be addressed by M&S? How should you interpret your results? And how should you use your results to refine your model and to improve system performance? These and related questions led to the development of Design of Experiments (DoE), a statistical methodology for planning, conducting, and analyzing experiments, including those that use computer-based M&S.

Believed to have begun in the 1920s in the agricultural industry, DoE uses statistical methods to efficiently identify key factors and obtain the most information with as few trials as possible. Maximizing these efficiencies becomes very important when dealing with limited, expensive, or high-risk resources. The process of identifying which factors or combinations of factors impact the desired response variable(s) is called screening. In its simplest form, screening can be implemented by a factorial design that includes all combinations of factors at all levels: two factors, each with two levels, would produce 2² = 4 trials. If one factor has a different effect (response variable outcome) at different levels of another factor, this is called an interaction [13]. The existence of an interaction, along with an understanding of the desired response variable(s), can be used to make more efficient experimental designs (fewer trials). One can imagine that complex experiments with many factors and non-continuous levels would produce an unmanageable number of trials. To deal with this situation, experimenters can intelligently reduce the number of factors and levels based on their higher order interactions through a fractional factorial design [17].
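
The factorial and fractional factorial ideas above can be sketched in a few lines of Python. This is an illustrative sketch only: the factor names A, B, and C, the coded levels -1 and +1, and the choice of the defining relation I = ABC for the half fraction are assumptions made for the example, not designs taken from the cited references.

```python
from itertools import product


def full_factorial(levels_by_factor):
    """Return every combination of factor levels as a list of dicts."""
    names = list(levels_by_factor)
    return [
        dict(zip(names, combo))
        for combo in product(*levels_by_factor.values())
    ]


# Two factors at two coded levels each: 2^2 = 4 trials.
two_by_two = full_factorial({"A": [-1, 1], "B": [-1, 1]})

# Three two-level factors: 2^3 = 8 trials in the full design.
full_three_factor = full_factorial({"A": [-1, 1], "B": [-1, 1], "C": [-1, 1]})

# Half fraction (2^(3-1) = 4 trials): keep only the runs where A*B*C == +1.
half_fraction = [
    run for run in full_three_factor
    if run["A"] * run["B"] * run["C"] == 1
]

for run in half_fraction:
    print(run)
```

The half fraction halves the number of trials, but at a cost: under this defining relation each main effect is aliased with a two-factor interaction (for example, C cannot be separated from A×B), so the reduction is only appropriate when those interactions can reasonably be assumed negligible.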

Additional principles for experimental designs can be used to ensure the objectivity and efficiency of trials. Randomization means running trials in random order to reduce bias to the degree possible. This principle is especially useful when human participants are involved. Replication is the repetition of one or more trials in order to estimate the experimental error (typically minor differences in response variables due to unimportant factors, e.g., the accuracy or consistency of a scale). Blocking attempts to suppress the impact of high-variance factors on the experimental error [13, 17].
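
Randomization and replication can be illustrated with a short Python sketch. The trial labels, the helper name `randomized_run_order`, and the seed are hypothetical; blocking is not shown. Fixing the seed is a design choice that makes the randomized order reproducible for later audit.

```python
import random


def randomized_run_order(trials, replicated=(), seed=None):
    """Return the trials plus one repeat of each replicated trial, shuffled.

    trials:     the planned trials (any hashable labels)
    replicated: trials to run a second time to estimate experimental error
    seed:       optional seed so the random order can be reproduced
    """
    rng = random.Random(seed)
    runs = list(trials) + list(replicated)
    rng.shuffle(runs)
    return runs


# Hypothetical trial labels used only for illustration.
planned = ["T1", "T2", "T3", "T4"]
order = randomized_run_order(planned, replicated=["T1", "T3"], seed=7)
print(order)
```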

In the late 1990s, the National Highway Traffic Safety Administration, along with the National Center on Sleep Disorders Research, conducted a comprehensive study on driver drowsiness and fatigue. While not directly related to mindfulness, the study provided a framework for understanding various effects on driving that could be extended and applied to mindfulness. Further, the study provides a context (driving) that is already well-represented within the civilian, commercial, and military M&S community. The following discussion is offered as an example of how M&S and DoE could be applied to a mindfulness context with drivers.

The purpose of our sample research project is to determine mindfulness-related effects on driving. Our dependent variables (measured response outcomes) will be both physiological (heart rate, respiration) and attentional (stability, control, efficiency). Our independent variables (factors) will be driver age (16–25 [L], 26–55 [M], 56+ [H]), periodicity (how often the driver completes this route: daily [D], weekly [W], monthly [M]), traffic level (low [L], medium [M], high [H]), and distraction level (occurrence of unanticipated events: low [L], medium [M], high [H]). Therefore, we have up to 3⁴ = 81 trials if we were to conduct a full factorial design of this experiment, as further detailed in Table 4.
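
The full factorial run matrix for this design can be enumerated programmatically. The sketch below uses the four factors and level codes given above; the dictionary keys, the helper name, and the trial numbering order are assumptions and may not match the ordering used in Table 4.

```python
from itertools import product

# Factors and level codes from the sample research project.
FACTORS = {
    "driver_age": ["L", "M", "H"],         # 16-25, 26-55, 56+
    "periodicity": ["D", "W", "M"],        # daily, weekly, monthly
    "traffic_level": ["L", "M", "H"],      # low, medium, high
    "distraction_level": ["L", "M", "H"],  # low, medium, high
}


def full_factorial_runs(factors):
    """Enumerate every combination of levels, numbering trials from 1."""
    names = list(factors)
    return [
        {"trial": number, **dict(zip(names, combo))}
        for number, combo in enumerate(product(*factors.values()), start=1)
    ]


run_matrix = full_factorial_runs(FACTORS)
print(len(run_matrix))  # four factors at three levels each: 3**4 trials
```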

Table 4. Full factorial run matrix

Conducting a live experiment with 81 trials including human drivers, different traffic levels, and different distraction levels would be difficult to control and potentially very time-consuming. For these reasons, we have decided to use a virtual simulator (human operator using simulated equipment) to conduct our experiment. We estimate that each trial will take approximately 45 min (~60 h total) to complete. Unfortunately, we only have access to the driving simulator for a maximum of 30 h (~40 trials), so we will have to find ways to reduce the number of required trials by roughly half while still maintaining confidence in our results.
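
The time budget above follows from simple arithmetic, sketched here in Python. The constant and variable names are illustrative; the 45 min per trial and 30 h of simulator access are the figures stated in the text.

```python
TRIAL_MINUTES = 45          # estimated duration of one trial
SIMULATOR_HOURS = 30        # maximum simulator access
FULL_FACTORIAL_TRIALS = 3 ** 4

# Time needed to run the full factorial design, in hours.
full_design_hours = FULL_FACTORIAL_TRIALS * TRIAL_MINUTES / 60

# Number of whole trials that fit in the available simulator time.
affordable_trials = (SIMULATOR_HOURS * 60) // TRIAL_MINUTES

# Share of the full design that can actually be run.
fraction_affordable = affordable_trials / FULL_FACTORIAL_TRIALS

print(full_design_hours, affordable_trials, round(fraction_affordable, 2))
```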

The first step in trying to reduce the number of trials is to identify which factors do and do not interact (have an impact on the response outcomes). Unless there is an existing data set for the factors of interest, one will need to find a method to determine whether, and to what degree, there are interactions. One of the most straightforward ways of determining interactions is by sampling and executing a subset of the trials. Using the full factorial run matrix above, we have decided to sample Driver Age [L, M, H], Periodicity [D, M], Traffic Level [L, H], and Distraction Level [L, H]. This is a reasonable approach because we are sampling all levels of Driver Age and the boundaries of all other factors. These samples will use 24 of the 40 available trials, as detailed in Table 5, but should give us strong indications of interactions.
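
The sampling subset described above can be generated the same way as a full factorial design, restricted to the chosen levels. In this sketch the dictionary keys and variable names are assumptions, and the run ordering may differ from Table 5.

```python
from itertools import product

# Levels retained for the sampling runs: all driver ages, boundary levels elsewhere.
SAMPLED_LEVELS = {
    "driver_age": ["L", "M", "H"],
    "periodicity": ["D", "M"],         # daily, monthly
    "traffic_level": ["L", "H"],
    "distraction_level": ["L", "H"],
}
AVAILABLE_TRIALS = 40  # trials that fit in the simulator time budget

sampling_runs = [
    dict(zip(SAMPLED_LEVELS, combo))
    for combo in product(*SAMPLED_LEVELS.values())
]

# Trials left over for the refined run matrix and any replications.
remaining_trials = AVAILABLE_TRIALS - len(sampling_runs)

print(len(sampling_runs), remaining_trials)
```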

Table 5. Sampling run matrix

Through execution of the selected trials, we have learned that Driver Age and Traffic Level have minimal interaction (the effect on the desired response outcomes is not significant). Based on this information and the number of available trials remaining (16), we have refined our run matrix in Table 6 as follows:

Table 6. Refined run matrix

The refined run matrix largely keeps driver age and traffic level constant while iterating through all levels of periodicity and distraction level. The final four trials (37–40) are “extra” and are replications of trials from our sampling matrix to ensure continuity of results.

The notional results of our sample research project indicate that periodicity has the largest physiological effect, distraction level has the largest attention effect, and these two factors interact. Specifically, higher periodicity combined with lower distraction levels decreases our physiological (heart rate, respiration) and attention (stability, control, efficiency) measures, while lower periodicity combined with higher distraction levels increases them. Furthermore, due to the appropriate application of DoE methods, we are able to state that our results have statistical significance and can be used to refine our current model or as input to future studies. Finally, the use of M&S allowed us to conduct many more trials, with much more control of the experimental environment, than would be possible in a live experiment.

5 Conclusion

While there is general agreement that improving “mindfulness” has positive effects, the research summarized in this paper confirms that there is no consensus regarding the various mindfulness theories, definitions, or measures. This suggests that the discipline and context to which one is applying mindfulness concepts must be strongly considered. This also suggests that discipline-specific approaches to mindfulness concepts may need to be further researched and developed.

We propose mindfulness categories and measures across human functions and performance that are essential for testing and assessing any mindfulness theory or definition. The sample research project uses those measures, along with a statistically designed and replicable methodology (DoE plus M&S), to determine and evaluate mindfulness factors for a given context (driving). This example and approach are significant because they can be improved and applied to other contexts and can assist with the research and evolution of additional theories and definitions of mindfulness.

Collectively, this research assists the community in achieving a broader understanding and body of knowledge of the state of mindfulness concepts. Additional research is recommended to further understand and refine discipline-specific mindfulness concepts and test those concepts using proven experimental methods. If conducted, these activities are expected to result in improved mindfulness theories, definitions, and measures that can be used for the benefit of individual and collective human performance across a spectrum of disciplines.