What Influences the Agreement Among Student Ratings of Science Instruction?

Abstract

In multilevel research on classroom instruction, individual student ratings are often aggregated to the class level in order to obtain a representative indicator of the classroom construct under study. Whether students within a class provide ratings consistent enough to justify aggregation, however, has not been the object of much research. Drawing on data from N = 9524 students from 391 classes who participated in the national extension to the PISA 2006 study in Germany, the interrater reliability and interrater agreement of student ratings of science instruction were examined. Results showed that students within a class tended to accurately and reliably rate various aspects of their science lessons. However, agreement among ratings was influenced by class size, learning time, school track, and science performance. In multiple regression analyses, science performance turned out to be of particular importance in accounting for differences in the homogeneity of ratings. The findings suggest that agreement among students’ perceptions of instruction should be a central consideration for researchers using aggregated measures to examine classroom teaching.
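To make the distinction between interrater agreement and interrater reliability concrete, here is a minimal sketch on invented ratings. It assumes two indices that are widely used for this purpose, the single-item r_WG index for agreement within one class and the one-way ANOVA intraclass correlations ICC(1) and ICC(2) for reliability across classes. The abstract does not state which formulas the chapter applied, so the function names, the toy data, and the 4-point response scale are all illustrative assumptions, not the author's procedure.

```python
from statistics import mean, variance


def rwg(ratings, n_options):
    """Single-item r_WG for one class (assumed index, see lead-in).

    Compares the observed variance of the ratings with the variance
    expected if students answered uniformly at random on a scale
    with n_options response categories.
    """
    expected_var = (n_options ** 2 - 1) / 12.0  # variance of a discrete uniform null
    observed_var = variance(ratings)            # sample variance (n - 1 denominator)
    # Negative values are conventionally truncated to 0 (no agreement).
    return max(0.0, 1.0 - observed_var / expected_var)


def icc(classes):
    """ICC(1) and ICC(2) from a one-way ANOVA; equal class sizes assumed."""
    k = len(classes[0])
    if any(len(c) != k for c in classes):
        raise ValueError("this sketch only handles equally sized classes")
    n_classes = len(classes)
    grand_mean = mean(x for c in classes for x in c)
    # Mean square between classes and mean square within classes.
    msb = k * sum((mean(c) - grand_mean) ** 2 for c in classes) / (n_classes - 1)
    msw = sum(variance(c) for c in classes) / n_classes
    icc1 = (msb - msw) / (msb + (k - 1) * msw)  # reliability of a single rating
    icc2 = (msb - msw) / msb                    # reliability of the class mean
    return icc1, icc2


# Hypothetical ratings from three classes of four students each (4-point scale).
toy_classes = [[3, 3, 4, 3], [1, 2, 1, 2], [4, 4, 3, 4]]

agreement = [rwg(c, n_options=4) for c in toy_classes]
icc1, icc2 = icc(toy_classes)
print("r_WG per class:", [round(a, 3) for a in agreement])
print("ICC(1):", round(icc1, 3), "ICC(2):", round(icc2, 3))
```

In this sketch r_WG is computed separately for each class, so it could serve as a class-level outcome when asking which characteristics (for example class size or performance) go along with more homogeneous ratings, whereas the ICCs summarize the whole sample of classes.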

Editor information

Manfred Prenzel, Jürgen Baumert

Copyright information

© 2009 VS Verlag für Sozialwissenschaften | GWV Fachverlage GmbH, Wiesbaden

About this chapter

Cite this chapter

Wittwer, J. (2009). What Influences the Agreement Among Student Ratings of Science Instruction? In: Prenzel, M., Baumert, J. (eds) Vertiefende Analysen zu PISA 2006. VS Verlag für Sozialwissenschaften. https://doi.org/10.1007/978-3-531-91815-0_11

  • DOI: https://doi.org/10.1007/978-3-531-91815-0_11

  • Publisher Name: VS Verlag für Sozialwissenschaften

  • Print ISBN: 978-3-531-15929-4

  • Online ISBN: 978-3-531-91815-0

  • eBook Packages: Humanities, Social Science (German Language)
