Research in Higher Education, Volume 50, Issue 8, pp 775–794

Psychometric Properties of Three New National Survey of Student Engagement Based Engagement Scales: An Item Response Theory Analysis

  • Adam C. Carle
  • David Jaffee
  • Neil W. Vaughan
  • Douglas Eder


We sought to develop and psychometrically describe three new student engagement scales to measure college students’ engagement with their faculty (student-faculty engagement: SFE), community-based activities (CBA), and transformational learning opportunities (TLO) using items selected from the National Survey of Student Engagement (NSSE), a widely used, standardized student engagement survey. We used confirmatory factor analysis for ordered-categorical measures, item response theory (IRT), and data from 941 US college students’ NSSE responses. Our findings indicated acceptable construct validity. The scales measured related but separable areas of engagement. IRT demonstrated that scores on the student-faculty engagement scale offered the most precise measurement in the middle range of student-faculty engagement. The CBA scale most reliably measured above average engagement, while TLO scores provided relatively precise descriptions of engagement across this spectrum. Findings support these scales’ utility in institutional efforts to describe “local” student engagement, as well as efforts to use these scales in cross-institutional comparisons.


Student engagement; Item response theory; Test reliability; Measurement; Transformational learning; Community learning; Student faculty interaction



We would like to thank the students who participated in our University's National Survey of Student Engagement. We would also like to thank the reviewers whose comments improved our original manuscript. Finally, Adam would also like to thank Tara J. Carle and Margaret Carle whose unending support and thoughtful comments make his work possible.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Adam C. Carle (1)
  • David Jaffee (2)
  • Neil W. Vaughan (1)
  • Douglas Eder (3)

  1. Department of Psychology, University of North Florida, Jacksonville, USA
  2. Academic Affairs, University of North Florida, Jacksonville, USA
  3. Institutional Effectiveness, University of North Florida, Jacksonville, USA
