Data Collection, Control, and Sample Size

  • J. Christopher Westland
Part of the Studies in Systems, Decision and Control book series (SSDC, volume 22)


Models are only as good as the data that they analyze—“garbage in, garbage out.” Unfortunately, data-model duality and data-model fit are often overlooked in theory-driven research. This chapter explores the many considerations that are necessary for proper collection of data and credible conduct of research. The chapter covers important concepts in causality, data adequacy, resampling, screening, exploratory data analysis, the nature of latent constructs, and the role of data in research.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Information & Decision Systems, University of Illinois at Chicago, Chicago, USA
