1 Introduction

Each and every individual human life is instructive about how human lives are lived and why so, which makes the study of any one possibly useful for the progress of scientific human Psychology (Sandelowski 1996). Such study involves describing and trying to causally explain specific events in single human lives: experiential, behavioral, situational, and perhaps someday distinct enough neurophysiological events. Such descriptions, whether in narrative or numerical form, are conjunctions of gradations, one from each of several dimensions of the event described. Causal explanations of such events are in terms of some conjunction of prior events that constitutes a sufficient condition for the subsequent occurrence of the event explained (see, e.g., Krause 2010). For scientific human Psychology (SHP) what is explained are experiencings (the content of moments of consciousness: see, e.g., Krause 2016a) and behaviors (culturally distinguished patterns of apparently voluntary body movements, actions, or apparently involuntary ordinarily-perceptible bodily changes, i.e., expressions: see Krause 2005). Both must eventually be causally explicable in terms of genetics, situations, behaviors, epigenetics, neurophysiology, and (for persons who do not consider experiencings to be epiphenomenal, about which notion see, e.g., Walter 2009) also experiencings. These explanations may be derived deductively from general principles and then empirically tested or derived inductively (i.e., as grounded theory: e.g., Strauss and Corbin 1998) from the study of how an individual human life is lived and progressively conditionalized as further lives are studied. 
Both approaches involve effect dimensions and possible causal dimensions, so whether causally explanatory notions guide the gathering of evidence or evidence evokes such notions does not matter because both approaches can be useful (see, e.g., Bruscaglioni 2016; Fann 2012; Flach and Hadjiantonis 2013; Magnani 2001; Morse 2009; Reichertz 2007). The point of SHP theory and practice research is to try to understand how the full variety of human lives are lived, why so, how each could be better lived, and what would promote this.

Because the nature, course, and circumstances of each human life are somewhat unique (e.g., what situations and experiencings causally influenced what experiencings and behaviors when, and what behaviors and situations causally influenced what subsequent situations when), averaging over the results of studies of several lives (or over a series of occasions in a single life) wastes much of the information obtained about this uniqueness of individual persons’ lives (or of occasions in these lives). Therefore, increasing the number, n, and so variety, v, of lives (or occasions) studied before averaging increases the amount of information wasted about the individual members of this n or v, respectively.

Averaging over persons is what most fundamentally distinguishes current “quantitative” SHP research, which does do so, from proper “qualitative” SHP theory and practice research, which does not do so. The proper purpose for such averaging over persons is to inform social policies for aggregations of persons rather than to be adequately informative about any individual person, whereas SHP theory and practice are properly about individual persons (Krause 2016b, 2016c). Many journals have already published papers on the differences between qualitative and quantitative research (see, e.g., Eby et al. 2014), but so far as I have been able to discern none of these (other than, e.g., Strauss and Corbin 1998, 10–11; Taylor et al. 2015, 9; but these too succinctly) have explicitly capitalized on the distinction between averaging on and correlating of variables, on the one hand, and locating each individual person in a single descriptive multi-phase hyperspace (e.g., Krause 2010; Krause and Howard 2002), and so respecting human individuality in a common descriptive framework, on the other. (For more on the history and prospects of the nomothetic and the idiographic, see, e.g., Lamiell 2014; Robinson 2011; Salvatore and Valsiner 2010.)

Each aggregation of human lives (or of occasions in a life) has a set of statistical properties. The statistical properties that have been of most interest to quantitative SHP research are the size of the aggregation, n, and the “moments” (see, e.g., Tietjen 1986, 9–15) of its distribution over dimensions of interest (i.e., variables: see, e.g., Krause 2016b), especially the first two “moments”: the aggregation’s mean and variance on single dimensions or its regression curve and bi- or multi-variate correlation on two or more dimensions. The statistics of aggregations (assembled within some feasibly brief interval of time) are well discussed, for example, in Yule and Kendall (1950, 102–168) for the univariate case, in Yule and Kendall (1950, 199–339) and further in Kendall and Stuart (1966a, 278–418) for the bi- and multi-variate cases, and in Kendall and Stuart (1966b, 285–313) for the canonical (i.e., conjointly independent- and dependent-variable multivariate) case. (For time-series aggregations of occasions, i.e., curve fitting, over some substantively interestingly extended period of time see, for example, Kendall and Stuart 1966b, 342–503; Nesselroade 2010.) These mathematical matters are sufficiently sophisticated or arcane for many of us to inspire only our respect or avoidance, while for many others of us learning the mathematics is interesting or challenging enough to divert our attention from their irrelevance for SHP theory and practice, which too easily goes unnoticed. However, what is crucially important to appreciate is that these statistics concern aggregations and therefore indeed are all irrelevant for SHP theory and practice (but not social policy) purposes (e.g., Krause 2016b; Krause and Lutz 2009; Krause et al. 2011).

The n members of any sample of persons are generally not all at the sample mean or on its regression line (or surface) for the measured dimensions, so in general most members of an SHP sample of persons are not described by the sample’s mean or regression line (or surface), respectively, on any dimension(s). Likewise, a sample’s variance on any dimension does not usually describe the squared deviation from the sample mean of most of the sample’s n, nor does the correlation between two variables usually indicate most of the sample’s n distances from their regression line for bivariate regression or from their surface for multiple or canonical correlation-regression. Therefore, the familiar statistics of aggregations are irrelevant for SHP theory and practice (but not social policy). Thus, the distinction between aggregations of persons and individual persons is the most fundamental logical distinction between what are currently being called “quantitative” and “qualitative” Psychology (but also see, e.g., Waterman 2013).
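The point can be made concrete with a minimal sketch in Python. The seven scores below are invented purely for illustration and come from no study; the sketch only shows a sample whose mean is occupied by none of its members.

```python
from statistics import mean

# Hypothetical scores of seven persons on one ordered dimension
# (illustrative data only, not taken from any study).
scores = [1, 2, 2, 3, 6, 7, 7]

sample_mean = mean(scores)  # 28 / 7 = 4

# Which persons, if any, are actually located at the sample mean?
at_mean = [s for s in scores if s == sample_mean]

# How far each individual person is from the aggregate's mean.
deviations = [abs(s - sample_mean) for s in scores]

print("sample mean:", sample_mean)
print("persons located at the mean:", len(at_mean))
print("each person's distance from the mean:", deviations)
```

In this invented sample the mean describes the aggregation but the location of no individual person in it, which is the distinction the paragraph above draws.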

2 Sample correlations, regressions, and structural equations

Nevertheless, concern for understanding how and explaining why individual human lives are lived as they are or are changed by psychological craft (i.e., psychotherapy, education, parenting, supervision, rehabilitation…: Krause, M. S., manuscript: Psychological crafts and ways of practicing them.) practices remains upstaged for many Psychologists by the present normativity of Linear Model statistics of case (or occasion) sample aggregations (see, e.g., Stigler 2016, on the essential nature of Mathematical Statistics and, e.g., Porter 1996, for some caveats on its applications) and their statistical significance testing (see, e.g., Chow 1996; Goodman 1999, 2016). This includes testing of 2 variables on the same dimension by the t-ratio (see, e.g., Krause 2011, 2016b) and of Structural Equation Models (SEM: e.g., Kline 2011) concerned with what variables (each being an aggregation of n measurements distributed on each of d dimensions for some sample of n cases) have co-varied how with what others, which is the way “quantitative” SHP research is now orthodoxly pursued (as to why consider, e.g., Nickerson 1998) regardless of its irrelevance for most of the individuals in the aggregations studied (why it is irrelevant for individuals is further discussed in Krause 2016b, 2016c; Krause et al. 2011; see Footnote 1).

Because a variable is a distribution of cases on a dimension, how variables on different dimensions apparently are inter-correlated and regress on each other depends upon how some given sample of n cases is distributed on each of these dimensions (see, e.g., Krause 2016b). How any two of d given dimensions actually covary in some incompletely accessible population of cases can only be estimated on the basis of how the available sample variables covary in some available set of c different case samples or in all these pooled. However, there are no logical grounds for relying on the mean of any such c correlation or regression sets or the pooling’s correlation or regression as an accurate estimate of how the dimensions themselves covary in the population from which the c samples were drawn. How any two dimensions covary in a population of cases at some specific time can only be known by measuring that whole population of cases at that time. No feasible sample of cases from that population can safely be relied upon by SHP to produce this result, because in general SHP can only opportunity sample the human population. Even were it randomly sampled (which is no easy nor usual matter: see, e.g., Feller 1968, 30 and 243–63; Krause 2016b) only the statistically expected, rather than the actual, correlation logically must be the same for each sample. As the pooled size of actually random samples increases it is this statistically expected correlation that stochastically tends to be more nearly approached (see, e.g., Krause 2016c), but without ever revealing how nearly. Random sampling is an important mathematical notion but of little practical use to SHP.

3 Case distributions in multidimensional phase spaces

A clear distinction needs to be made between a distribution of cases in a multidimensional phase space (i.e., one including a time dimension) and the relations among variables mathematically derived from this distribution (see Krause 2010, 2016b, 2016c; Krause and Howard 2002). Each location in such a space implies a set of d adjectives or adverbs narratively descriptive in dimension gradation terms of every case located there. The same sample data can be represented in either way, case by case or sample variable by sample variable, but with very different implications.

A d-dimensional hyperspace’s case distribution entails specific correlations and regressions among the space’s d variables, but only if these dimensions are real-number gradated because this is necessary for performing the arithmetic operations for calculating correlations and regressions. Such correlations and regressions themselves, however, entail nothing about any individual case’s actual location in this hyperspace and so nothing about these cases’ actual distribution in it. This is so because quite different distributions can entail the very same correlations and regressions if these are sufficiently narrowly dimension-spanning distributions that they can differ only in their precise location or orientation in the hyperspace. Picture a glued together bundle of crossed sticks moved and rotated to various different positions within a room. In other words, it is the internal structure of a multivariate distribution, not its actual location in its hyperspace, that directly determines its correlations and regressions.
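A minimal sketch in Python may help here. It covers only the "moved" part of the stick-bundle picture: a distribution shifted to a different region of a two-dimensional space keeps exactly the same correlation. The case coordinates and the helper name pearson_r are hypothetical, introduced only for this illustration.

```python
from math import sqrt

def pearson_r(cases):
    """Pearson correlation of a list of (x, y) case locations."""
    n = len(cases)
    mx = sum(x for x, _ in cases) / n
    my = sum(y for _, y in cases) / n
    sxy = sum((x - mx) * (y - my) for x, y in cases)
    sxx = sum((x - mx) ** 2 for x, _ in cases)
    syy = sum((y - my) ** 2 for _, y in cases)
    return sxy / sqrt(sxx * syy)

# Hypothetical case locations on two real-number-gradated dimensions.
cases_a = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]

# The same internal structure moved to a different region of the space.
cases_b = [(x + 10, y + 20) for x, y in cases_a]

r_a = pearson_r(cases_a)
r_b = pearson_r(cases_b)

print("correlation, distribution A:", r_a)
print("correlation, distribution B:", r_b)
print("cases shared by A and B:", len(set(cases_a) & set(cases_b)))
```

The two distributions share no case location, yet knowing only the correlation one could not tell them apart.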

To accumulate N cases in c independently acquired samples logically requires the sampling to be with replacement or the population sampled to be large enough, so that no sample drawn alters the shape of the population’s distribution. Both conditions are mathematically interesting and empirically whimsical possibilities, but only they would permit the statistics of these c samples (i.e., their c hetero-variable sets of correlations and regressions and iso-variable mean differences/treatment contrasts) to vary strictly randomly across samples that were randomly drawn from the population. Only then would the values for the statistics of each of c samples have the corresponding statistics (i.e., the same correlation, regression, or mean-difference statistics) of the population’s N as their statistically expected values (see, e.g., Krause 2016c).

In actual practice, however, any c samples are drawn in some order without replacement, and so none but the first sample could be randomly drawn (were even that actually feasible) from the whole N (see Footnote 2). This means that neither averaging over nor pooling of c samples from the same population of cases, unless they happen to exhaust this population (an unlikely possibility for SHP research), can guarantee an accurate estimate of the N’s distribution of cases in a multidimensional phase space or of the relations among variables for this whole population distribution. Is some sample estimate close enough to this for SHP theory or practice purposes? How could that possibly be determined? So we are left with having to inductively make what sense we can from whatever cases have already been studied, and no feasible sample (i.e., anything short of the whole population itself) of cases is definitive enough to assuredly prove a causal explanation, whereas only a single validly measured (something which SHP has yet to definitively settle for any of its descriptive dimensions: Krause 2012) disconfirming case is enough to disprove a causal explanation. The elegant mathematics of case sample relative frequencies and probabilities does not address the issues of individual cases that SHP theory and practice properly must deal with. This deserves further spelling out, which shall be done in what follows in terms of case sample size, variety, representativeness of the population, differences, and population estimation.

4 The role of sample size

Sample size, n, has a key role in prevailing SHP quantitative research primarily because of statistical significance testing. The larger is n the smaller can a t-ratio or a correlation be to reach statistical significance (see, e.g., Kanji 1993, 8 and 33–34, but also note there and in Goodman 1999, the assumptions required). The principal role of n in proper SHP theory and practice relevant (i.e., qualitative) empirical research, however, is to reveal all the important effect dimensions and all the dimensions causally influential on or predictive of these and to span and saturate the ranges of these dimensions in order to maximize the variety, v, of persons’ lives (or of occasions in an individual person’s life) studied (see, e.g., Cleary et al. 2014; Francis et al. 2010; Krause 2010; Mason 2010). These dimensions are the essential framework of all causally explanatory theory because how they themselves, rather than the variables presently used in SHP to represent dimensions (i.e., case sample distributions on dimensions), are inter-related is what properly matters for SHP theory and practice (Krause 2016b).
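The dependence of statistical significance on n noted at the start of this section can be sketched numerically. The sketch below uses the standard textbook conversion of a Pearson correlation r into a t-ratio with n − 2 degrees of freedom, t = r·√(n − 2)/√(1 − r²); this formula is not taken from the sources cited above, and the function name and the chosen r and n values are hypothetical.

```python
from math import sqrt

def t_ratio_for_r(r, n):
    """t-ratio for testing a Pearson correlation r from n cases (n - 2 df)."""
    return r * sqrt(n - 2) / sqrt(1 - r * r)

# The same modest correlation evaluated at three hypothetical sample sizes.
r = 0.2
for n in (10, 100, 1000):
    print(f"n = {n:4d}  r = {r}  t = {t_ratio_for_r(r, n):.2f}")
```

With r held fixed, the t-ratio grows roughly with the square root of n, so a correlation that is far from significant in a small sample becomes significant in a large one without describing any individual case any better.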

Case sample variety, v, has had no explicit role in usual SHP quantitative research design, but because it may result in longer or fatter tailed dependent-variable distributions (even without specifically sampling for interesting independent-variable dimension outliers, which maximizing v would favor) it has acquired a quite contrary role in SHP statistical data analysis. The more such univariate or bivariate (i.e., scatter-gram) outlier results do occur, the larger must n be to produce statistically significant t-ratios and correlations, respectively. So censoring outliers has become recommended (see, e.g., Carling 2000; Chatterjee and Hadi 1986; Iglewicz and Hoaglin 1993) to facilitate achieving statistical significance. This is a safer tactic than increasing n because increasing n may also increase sample variance and so impede attaining statistical significance. The arguments against statistical significance testing have certainly not been soundly rebutted (see, e.g., the details and references in Goodman 1999, and Nickerson 2000, for a start on these arguments as well as on those for statistical significance testing; also see Krause 2011, 2013a, 2016b).

Sample size, n, is and sample variety, v, can be enhanced by further sampling. Especially for meagerly resourced research projects sequential sampling may be more feasible than single-stage sampling. However, only sequential sampling allows sampling to be done most purposefully (as distinct from opportunistically or, where this is actually possible: Krause 2016b, randomly) in order to most cost-effectively maximize v.

This purpose of sequential sampling is radically different from that of sequential sampling designed for cost-effectively accumulating a just large enough n (something that single-stage sampling cannot guarantee to achieve), which is a much worked at topic of Mathematical Statistics (see, e.g., DeGroot 2005, 267–384; Turner et al. 2003). The concern for cost control requires a stopping rule to deter n from getting too much larger than necessary for achieving the effectiveness of the sampling. The mathematics involved requires real-number-gradated dimensions, which are generally or at least often unavailable in SHP theory and practice research (which properly must deal mostly in ordered-category-gradated dimensions: see Krause 2012, 2013b; Michell 1999, 2008). This kind of sequential sampling theory concerns the statistics of aggregations of cases (or occasions) rather than the description and causal explanation of individual cases (or occasions). So how should SHP theory and practice researchers determine what n to settle for while trying, as they should, to maximize v?

Francis et al. (2010) provide an introduction to how this issue might be approached. They use a “no new ideas found” criterion that may reasonably be interpreted to mean “no new varieties found”, v stable (and so “theory saturation”, see, e.g., Morse 1995, possibly achieved), in some given number of successive cases that justifies stopping collecting further cases. This approach to setting an n stopping rule depends on already having a rich enough theory for recognizing what are relevant “new ideas”, which must imply further varieties and so must mean already having some set of d_I input (i.e., causal or predictive) dimensions, each (for simplicity here) gradated by k ordered-categories that define the set of d_I × k = L “old ideas” that remains still unchallenged by any “new ideas”: increments to v.

Studying only some predetermined number of cases, regardless of which if any of the L theoretically distinguishable varieties has been found in these, is another stopping rule, but what n? Perhaps that it not be less than L, so the smaller are d and k the smaller can this n properly be. For SHP theory or practice research k can be useful as at least 2, but dimension definition properly must determine dimension gradation (Krause 2012). Because there surely are many different causal influences on how persons live their lives, however, d must surely be quite large. Although opportunity sampling will produce some occupation of many of the L locations, specifically focused special efforts will likely be needed to find at least 1 case for each of the rest or to reasonably argue why this is impossible for some of them. Finding the most needed cases for theory development: grounding and testing (see, e.g., Baker et al. 2012; Ritchie et al. 2003), rather than simply enough cases for some mooted statistical power (e.g., Cohen 1992; Kraemer and Blasey 2015), is the prime requirement of SHP theory and practice research. So purposive sequential sampling is most advisable for SHP theory and practice research, the purpose being to fully specify d_I and ultimately to span and saturate with cases the ranges of all these dimensions but most immediately to explore the presently apparently most theoretically and practically important members of L. This is ideally done by close and sustained study of one case at a time to permit its thorough study in order to locate it accurately in L and, for theory purposes, to discover additional dimensions to add to d, supplements to any dimension’s k, and any clues to what still unoccupied members of L to try to find an occupant of next, especially those of greater theoretical or practical importance (e.g., Coyne 1997; Dourdouma and Mortl 2012; Flick 2009, 114–126).
So n cannot properly be predetermined for SHP theory purposes, whereas it often must be for social policy purposes and is variously determined for practitioners of the various psychological crafts.

Not already having a credible enough single L for all causal theory and practice purposes, which is a continuing SHP predicament, purposive sequential case sampling itself ought to be taken advantage of for suggesting and grounding whatever seem the d and their k that are needed and feasible (see Corbin and Strauss 1990; Breckenridge and Jones 2009; Onwuegbuzie and Leech 2007). Besides very close and sustained study of each obtained case for possible addition to d and to k for each member of d, this also requires persistence at trying to obtain at least one case for each of the k gradations of each of the d dimensions before at least tentatively concluding that some of these hyperspace locations are naturally unoccupied (see Krause 2016b). If more theories arise from this process, proceeding to purposefully sequentially sample in light of them should then be done until theory- and practice-based needs for cases have been exhausted. This is a further reason that n cannot properly be predetermined.
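The bookkeeping this calls for can be sketched in a few lines of Python. The sketch tracks, for each of d dimensions, which of its k ordered gradations no studied case has yet occupied, so that these can serve as targets for the next purposive sampling effort. The dimension names, gradation labels, cases, and the function name uncovered_gradations are all hypothetical and serve only as an illustration.

```python
def uncovered_gradations(cases, gradations):
    """Return {dimension: [gradations not yet occupied by any studied case]}."""
    missing = {}
    for dimension, levels in gradations.items():
        seen = {case[dimension] for case in cases}
        gaps = [level for level in levels if level not in seen]
        if gaps:
            missing[dimension] = gaps
    return missing

# Hypothetical framework: d = 2 dimensions, each with k = 3 ordered gradations.
gradations = {
    "support": ["low", "medium", "high"],
    "distress": ["low", "medium", "high"],
}

# Hypothetical cases studied so far, each located on every dimension.
cases = [
    {"support": "low", "distress": "high"},
    {"support": "medium", "distress": "high"},
    {"support": "medium", "distress": "medium"},
]

targets = uncovered_gradations(cases, gradations)
print("gradations still needing at least one case:", targets)
```

With d = 2 and k = 3 there are d × k = 6 gradations to cover; in this invented example two remain, and they indicate where to look for the next case or what absence eventually needs explaining.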

Francis et al. (2010) tested their estimated nothing new run lengths (i.e., their amount of further sampling) by exceeding them by some longer extension of still nothing new: for example “After 10 interviews, when three further interviews have been conducted with no new themes emerging, we will define this as the point of data saturation. The stopping criterion is tested after each successive interview (i.e., 11, 12 and 13; then 12, 13 and 14, and so on) until there are three consecutive interviews without additional material.” (p. 8). This stratagem is applicable for any given hyperspace or for developing one, although its arbitrariness is troubling, especially in light of the preceding paragraph. Because there obviously is no sound effectiveness-based n-stop rule (see, e.g., Wood and Christy 1999, 11–15; Mason 2010), only cost considerations and “all plausibly occupied L locations already occupied” can rationally be employed for defining an n-stop rule. Periodic case re-samplings are desirable to give so far unoccupied L locations further opportunity to be occupied or additions to L to be encountered (see Corbin and Strauss 1990). Each SHP theory-and-practice research project will have its particular cost considerations that dictate what n of cases can be adequately studied, but international coordination of these projects as to what core set of d, what k for each, what measure is used to represent each, and what minimum level of funding is necessary will be needed to deter SHP research resources from continuing to be squandered on psychological research projects that truly are incomparable with one another, as has too often been the case (see Krause 2016b, 2016c; and for a richly detailed psychotherapy outcome research example of such incomparability and so n waste, carefully read Orlinsky et al. 2004).
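The stopping criterion quoted above can be sketched as a short Python function. This is one reading of the quoted rule, not code from Francis et al. (2010): the function name saturation_point, its parameter names, and the interview themes are hypothetical, and the defaults of 10 initial interviews and a run of 3 are taken from the quotation.

```python
def saturation_point(themes_per_interview, initial=10, run=3):
    """Return the 1-based interview number at which sampling stops, or None.

    Stops once `run` consecutive interviews after the first `initial`
    interviews have contributed no theme that was not already seen.
    """
    seen = set()
    quiet = 0  # consecutive post-initial interviews with nothing new
    for number, themes in enumerate(themes_per_interview, start=1):
        new_themes = set(themes) - seen
        seen |= set(themes)
        if number <= initial:
            continue  # the initial interviews only build up the theme set
        quiet = 0 if new_themes else quiet + 1
        if quiet == run:
            return number
    return None  # criterion not yet met; keep sampling

# Hypothetical themes raised in each successive interview.
interviews = (
    [{"a"}, {"b"}, {"a", "c"}, {"d"}, {"b"}, {"e"}, {"a"}, {"c"}, {"f"}, {"d"}]
    + [{"g"}, {"a"}, {"b", "g"}, {"f"}]  # interviews 11 to 14
)

print("stop after interview:", saturation_point(interviews))
```

In this invented sequence interview 11 raises a new theme, so the run of three restarts and the criterion is first met at interview 14. The arbitrariness noted above is visible in the two defaults, either of which could be set otherwise.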

5 The role of sample variety

Linear Model research is concerned with a sample’s variety, v, only as an influence on the variance of the sample’s variables’ distributions, because variance, as well as sample size, influences statistical significance tests of regression (as r² or R²) and mean difference (as s²). This is distinct from the hyperspace spanning and saturating multidimensional variety that actually matters for SHP theory and practice research. It matters because nothing is too novel or inconsistent with prevailing theory or practice to be of interest to such research (see, e.g., Krause 2010; Welles 2014), so every individual case, rather than only relationships among variables, may have something important to contribute to SHP.

The openness and curiosity of proper SHP theory and practice research differs sharply from the current Linear Model interest in pruning data of “outliers” and so avoiding “excessive” dependent-variable variance (again see Carling 2000; Chatterjee and Hadi 1986; Iglewicz and Hoaglin 1993). It favors close study, rather than indiscriminate pruning, of every case that occupies an extreme location in the measured dimensions’ hyperspace (see, e.g., Christianson et al. 2009, for an example of such study). Pruning validly measured and otherwise unexceptionable outliers narrows the range and so more centrally concentrates an outcome variable distribution, reducing its variance and so reducing the difference between comparison group means that is required for statistical significance. What outliers to prune to enhance a correlation is more nuanced, so scattergrams need to be studied to see what cases need pruning to most enhance a correlation (e.g., Koh et al. 2007; Osborne and Overbay 2004).
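The arithmetic behind this incentive can be sketched in Python. The sketch is not a recommendation to prune; it only shows, on invented scores, how dropping one extreme case from each of two comparison groups shrinks the variance and inflates an equal-variance two-sample t-ratio. The data and the function name t_ratio are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def t_ratio(a, b):
    """Equal-variance (pooled) two-sample t-ratio for groups a and b."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))

# Hypothetical outcome scores; the last case in each group is an extreme one.
treated = [6, 7, 7, 8, 8, 9, 2]
control = [4, 5, 5, 6, 6, 7, 10]

# The same groups after the extreme case in each has been pruned.
treated_pruned = treated[:-1]
control_pruned = control[:-1]

t_full = t_ratio(treated, control)
t_pruned = t_ratio(treated_pruned, control_pruned)

print("variance, treated, before and after pruning:",
      round(variance(treated), 2), round(variance(treated_pruned), 2))
print("t-ratio, all cases:", round(t_full, 2))
print("t-ratio, outliers pruned:", round(t_pruned, 2))
```

The two pruned persons are exactly the cases that a variety-maximizing study would want to examine most closely, yet removing them is what moves the comparison toward statistical significance.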

SHP theory and practice research sampling should aim to maximize variety, v, and so should be purposive rather than opportunity or random sampling (see, e.g., Coyne 1997; Dourdouma and Mortl 2012; Flick 2009, 114–126; Morse 1995, on the former two and, e.g., Krause 2016b, on their differences from random sampling). One requisite for maximizing v is to measure each case on as many dimensions as possible that conceivably might be outcome important or causally relevant to these and to study it for further dimensions to measure on that might usefully increase d. A second requisite is to carefully study how this d_I hyperspace is being progressively spanned and saturated by measurements as case sampling progresses and then to actively seek cases in its most sparsely occupied and plausible theory suggested regions. A third is to return to study further the earlier studied cases in order to obtain information from them on their subsequent evolution and on dimensions added to the research set since then and to reassess the validity of the previously obtained data, which requires having developed a relationship with the persons involved that facilitates such revisiting and further study (see, e.g., Wax and Shapiro 1956). A fourth is to encourage from every person studied speculation about further dimensions and anything else that might be relevant to one’s research topic. All four differ from usual SHP Linear Model research practice by trying to make research subjects research collaborators, because this is necessary for obtaining complete and valid data (Krause, M. S., manuscript b: Psychological measuring and measurements: Their possible effects on persons measured and unmeasured).

6 The roles of sample representativeness of population variety and variety proportion

A population’s full variety (V) representativeness in the obtained sample of cases (v = V) is ultimately crucial for SHP theory and practice research. Each variety’s population-proportions’ (P) representativeness (vP = VP) in the sample data is important for SHP policy research. Both V and VP are generally unknown for SHP because only opportunity sample data have generally been available to SHP.

Population variety (V) is estimated in terms of some chosen set, d, of gradated dimensions, each (for simplicity here) of k ordinal gradations. This defines a hyperspace of d × k = L discrete locations, each occupied in a population of interest by one or more cases or by none. Only if all the L occupied by at least one case in the population are occupied by at least one case in the sample data is population variety (V) fully represented. Some of these L locations may have no sample occupant, which calls for some testable explanation as to why as part of SHP theory, just as much as does a location’s having one or more sample occupants. For example, why some injuries sometimes are initially not reported as painful by the injured is no less important to understand than why some seemingly comparable injuries are initially reported as painful by the injured (see, e.g., Wall 1979) or why some pains have apparently never been accompanied by any seemingly relevant injury.

For SHP theory and practice purposes the number of occupants beyond one that an L location has is not crucial (although it may be for policy purposes), so the most cost-effective sampling would be to find at least one case for each location: L in all. This calls for purposive sampling informed by which of the L locations remain unoccupied, by which of these are of most practical or theoretical interest, and by whatever notions there are about how to find cases for each of these. The most immediately obvious aspect of such informing is the currently proposed d, which ideally would include every cause and effect dimension considered relevant in any currently entertained theory by any faction of the SHP theory and practice community concerned with any of these d dimensions. Dimensional under-specification is far more serious than dimensional over-specification (Krause 2010, 2013a, 2016c; Krause et al. 2007), even though the latter is more costly in the short run. The methodological implication of this too is that each case should be exhaustively studied to precisely locate it somewhere among the given L and for clues to increasing d or k and so L. Both concerns are hallmarks of “qualitative” and so v = V tending research.

Prolonged autobiographical study (see, e.g., Van Manen 2015; Wertz et al. 2011) by highly trans-theoretically trained and humanity-representative SHP researchers, who then strenuously dialogue about what they discover, is the obviously best initial way to pursue SHP theory and practice research. This is so because all data about how persons live their lives and why so necessarily are descriptions of personal experiencings, and a rich variety of highly sensitized informants offers the best chance of constructing the most comprehensive d, k, and so L to use for purposively choosing and then thoroughly studying further persons (Krause 2016a). This is the optimal nature of what can properly be called “qualitative psychological research”, and it contrasts starkly with current Linear Model “quantitative” research.

7 The role of samples’ differences

What role sample differences play in influencing a Linear Model statistic or an L occupancy pattern depends on how each of a set of c samples is drawn, which for SHP has generally been opportunistically. Each addition to c creates the opportunity for there to be greater v and so fuller occupancy of L (just as does each addition to any sample’s n) in a pooling of these samples (Σc n = N), as well as for a greater variety in intra-variable mean differences and variances and for inter-variable regressions and correlations across the set of c samples. SHP sampling for theory or practice purposes should be done to maximize v in each of and across the set of c samples and so should be purposive.

Single opportunity sample intra-variable mean differences and inter-variable regressions ought not be taken seriously as estimates of population intra-variable mean differences and inter-variable regressions, although they traditionally are. By contrast, their occupied (but not their still unoccupied) L locations should be taken seriously for SHP theory and practice purposes because these at least suggest specific sufficient condition causal relationships, as mean differences with residual dependent-variable variance and regressions based upon imperfect correlations do not. As c and so the N of all these c pooled is increased, the still unoccupied L locations should be taken increasingly seriously as targets for purposive sampling but also as likely constraints on theory or practice that require explanation: why might this specific member of L be un-instantiable (e.g., why, if we haven’t, have we still not encountered a case of completely cured schizophrenia, of spontaneous adult self-actualization from a severely traumatized infancy, of mindfulness/enlightenment lost…)?

Only given uniformly valid measurement on all d dimensions (without which proceeding is merely wasteful if not preparatory for achieving this) in some c samples of cases is pooling these into one sample of N cases for a comprehensive analysis a proper option (see Krause 2016b, 2016c). Their pooled occupancy pattern of L would then be what is most informative. As c increases, this pattern (and so any statistics derived from it) will likely evolve somewhat, more obviously so the smaller and the more differently opportunistic or purposive (given the usual infeasibility of actually random sampling in SHP) the c samplings are. Therefore, single samples’ occupancy patterns on L and the statistics based on these ought not be taken seriously whatever their level of statistical significance. Instead, only the occupancy pattern of L based on all the accumulated c sample data should be taken seriously, but even then only as the so far best estimate of the case-population occupancy pattern of L.
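The pooling described above can be illustrated with a minimal sketch. It assumes that L is the set of all combinations of one gradation from each dimension, and it uses two hypothetical dimensions ("mood" and "support"), hypothetical gradations, and two small invented samples; none of these names or data come from the studies cited here.

```python
from collections import Counter
from itertools import product

# Hypothetical dimensions and gradations (illustrative only): d = 2, k = 3 and 2.
gradations = {
    "mood": ("low", "mid", "high"),
    "support": ("absent", "present"),
}

# Assumption: L is every combination of one gradation per dimension.
locations = set(product(*gradations.values()))

# c = 2 invented samples; each case is described by one location in the space.
samples = [
    [("low", "absent"), ("low", "absent"), ("mid", "present")],
    [("high", "present"), ("mid", "present")],
]

def occupancy(sample_list):
    """Count cases per location across the pooled samples."""
    return Counter(case for sample in sample_list for case in sample)

single = occupancy(samples[:1])        # occupancy pattern of the first sample alone
pooled = occupancy(samples)            # occupancy pattern of all c samples pooled
unoccupied = locations - set(pooled)   # candidate targets for purposive sampling

print("L =", len(locations), "N =", sum(pooled.values()))
print("occupied by first sample:", len(single))
print("occupied after pooling:", len(pooled))
print("still unoccupied:", sorted(unoccupied))
```

In this toy case the second sample adds one newly occupied location, and the three locations that remain empty are the ones a purposive sampler would seek next or would have to explain.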

The greater is c and the larger is n for each of the uniformly validly measured c samples, and the more independently of each other is the drawing of these samples, the more comprehensively will SHP be informed about the full variety (V) of humanity and the population proportion of each (VP). It can never be certain, however, that all the ways human lives are lived or the causes of this have yet been covered, so as SHP progresses more purposeful sampling and L expansion will likely be required to continue to close the gap between what is known (the size and occupancy of L) and what remains to be learned about how human lives are lived and why so.

Unless univariate mean differences and multivariate regressions reliably account for all the dependent-variable variance they cannot confidently be claimed to account for any of it. This may seem counterintuitive but it is so because, given valid measurement, any Linear Model residual dependent-variable variance clearly indicates that causal influences other than those of the study’s specified independent-variables are influencing the dependent-variables and could also be confounding the study’s independent-variables (Krause 2013a, 2016b, 2016c). How could the mathematically convenient assumptions that residual dependent-variable variance is entirely random measurement error or that any still unaccounted for other influences on the dependent-variables are orthogonal to all of a study’s independent-variables possibly be justified? Any residual dependent-variable variance whatsoever (i.e., R² < 1 or s² > 0) means that d_I is too small or includes some causally irrelevant dimensions, that is, that this particular Linear Model or SEM is dimensionally under- or otherwise mis-specified.
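A small numerical sketch may make the confounding concern concrete. It assumes an invented data-generating rule, y = 1·x + 2·z, in which the omitted influence z is correlated with the specified independent variable x; the data and variable names are hypothetical and serve only to illustrate the argument, not to reproduce any analysis from the cited work.

```python
# Invented data: z is an omitted influence that is correlated with x.
x = [0, 1, 2, 3, 4, 5]
z = [0, 0, 1, 1, 2, 3]
y = [xi + 2 * zi for xi, zi in zip(x, z)]  # true effect of x on y is 1

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least squares for the under-specified model y ~ x (z left out).
s_xx = sum((xi - mean_x) ** 2 for xi in x)
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

# Residual sum of squares: variance the specified model does not account for.
ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

print("estimated slope for x:", round(slope, 3))
print("residual sum of squares:", round(ss_res, 3))
```

Here the residual variance is nonzero and, because z is not orthogonal to x, the estimated slope for x (2.2) is more than double the value of 1 used to generate the data. The residual variance alone does not reveal how much of the estimated effect is due to the omitted dimension.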

Sample differences dramatize this Linear Model problem but can quite un-problematically more fully occupy L. They are innocuous for random sampling because there only N matters. They express the differences in the opportunities in opportunity sampling. They are the point in proper purposive sampling.

8 Building theory rather than statistically estimating population distributions

Sample data generalizability to their population depends upon the sample’s representativeness of the population from which the data were drawn. This representativeness properly is in terms of including all the varieties of persons (V) and of indicating these varieties’ proportions in the population (VP). This is either exact representativeness, which can be recognizable only if there somehow were prior knowledge about the population’s distribution, or it is stochastic statistical expectation representativeness made possible by truly random sampling from the population. Both are presently unavailable to SHP (see, e.g., Krause 2016b, 2016c), so SHP must sample to inductively infer the human population’s distribution in SHP’s currently normative d space by widely purposively accumulating individual case studies, because individual lives are what SHP theory and practice and so “qualitative” research properly are about.

Projecting population parameters from sample data, rather than estimating actual ones, is what feasible SHP “quantitative” social policy research is about. Such research is not statistical estimation of the parameters of an actual but unknown population distribution, because neither assuredly representative nor random sampling of the human population is yet feasible. It is instead the building of a known sample distribution that presently must stand in for the real but still not ascertainable human population distribution in SHP’s attempt to take account of the full psychological variety of humanity (V) and of the human population proportion of each variety (VP).

For the purposes of both SHP’s “qualitative” theory and practice research and its “quantitative” social policy research, this means: (a) Progressively finding, or (where it is ethical to do so) creating, cases in every one of the L locations in whatever is the present SHP-normative trans-theoretical hyperspace. (b) Progressively increasing d, adjusting their k, and so increasing L as much as is descriptively and causally explanatorily useful and possible. How densely any location of L is sample occupied (beyond not at all), and so the achieved sample frequency-distribution of the human population in this hyperspace (i.e., VP), is irrelevant for SHP theory and practice purposes but relevant for SHP social policy purposes.

Insofar as any causal-dimension space location maps one-to-many to effect-dimension space locations, the causal-dimension space is dimensionally underspecified (Krause 2010, 2016b). Insofar as any effect-dimension space location remains unoccupied, the possibility exists that some locations in the full causal-dimension space relevant to it have not yet been sampled or cannot be until some further dimensions are added to the present causal-dimension space. In this way SHP theory and practice research must gradually build by invention and discovery a hyperspace of some L locations and in it a known sample distribution of some N cases of human life, rather than pretend to statistically estimate on the basis of random sampling what this distribution actually is, which would be ideal to know for SHP’s policy research purposes (See footnote 2).
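The one-to-many criterion stated above can be checked mechanically. The following is a minimal sketch under stated assumptions: each case is represented as a pair of a causal-dimension location and an effect-dimension location, the function name `one_to_many` is hypothetical, and the stress, support, and sleep gradations are invented for illustration.

```python
from collections import defaultdict

def one_to_many(cases):
    """Return the causal locations that were observed with more than one effect location."""
    effects = defaultdict(set)
    for cause, effect in cases:
        effects[cause].add(effect)
    return {cause: found for cause, found in effects.items() if len(found) > 1}

# Invented cases as (causal location, effect location), with two causal dimensions.
cases_2d = [
    (("high_stress", "no_support"), "distressed"),
    (("high_stress", "support"), "coping"),
    (("high_stress", "support"), "distressed"),
]
ambiguous = one_to_many(cases_2d)

# The same cases after adding a hypothetical third causal dimension (sleep).
cases_3d = [
    (("high_stress", "no_support", "poor_sleep"), "distressed"),
    (("high_stress", "support", "good_sleep"), "coping"),
    (("high_stress", "support", "poor_sleep"), "distressed"),
]
resolved = one_to_many(cases_3d)

print("under-specified causal locations:", ambiguous)
print("after adding a dimension:", resolved)
```

In the two-dimensional space one causal location maps to two different effects, which flags the space as under-specified. With the added dimension every observed causal location maps to a single effect, although further cases could of course reopen the question.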