Abstract
Canadian students experience many different assessments throughout their schooling (O’Connor 2011). Using a variety of assessment types, item formats, and science-based performance tasks in the classroom offers many benefits for measuring the many dimensions of science education. Although using a variety of assessments is beneficial, it is unclear exactly which types, formats, and tasks are used in Canadian science classrooms. Additionally, since assessments are often administered to help improve student learning, this study identified assessments that may improve student learning as measured by achievement scores on a standardized test. Secondary analyses of students’ and teachers’ responses to the questionnaire items in the Pan-Canadian Assessment Program were performed. The results of the hierarchical linear modeling analyses indicated that both students and teachers identified teacher-developed classroom tests or quizzes as the most common type of assessment used. Although this ranking was similar across the country, statistically significant differences among the provinces in the assessments used in science classrooms were also identified. The investigation of which assessment best predicted student achievement indicated that minds-on science performance-based tasks significantly explained 4.21% of the variance in student scores. However, mixed results were observed between student and teacher responses regarding tasks that required students to choose their own investigations and design their own experiments or investigations. Additionally, teachers who reported conducting more demonstrations of an experiment or investigation had students with lower scores.
Notes
The discrepancy between the MANOVA results, which indicated differences in assessment practice among the provinces, and the three-level HLM intraclass correlation, which indicated minimal provincial differences, may be due to the large sample size, which can make even small effects statistically significant (Tabachnick and Fidell 2013). This interpretation is further supported by the low partial eta-squared values from each of the MANOVAs.
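As an illustration of the intraclass correlation referred to above, the following sketch simulates students nested within groups (e.g., classes or provinces) and estimates the ICC from one-way ANOVA variance components. This is not the study’s code or data; the group counts, sample sizes, and variances are assumed purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(42)
n_groups, n_per = 100, 30
tau2, sigma2 = 4.0, 16.0  # assumed between-group and within-group variances

# Simulate achievement scores for students nested within groups.
group_means = rng.normal(0.0, np.sqrt(tau2), n_groups)
y = np.array([rng.normal(m, np.sqrt(sigma2), n_per) for m in group_means])

# Method-of-moments (one-way ANOVA) variance-component estimates.
msb = n_per * np.var(y.mean(axis=1), ddof=1)  # between-group mean square
msw = np.mean(np.var(y, axis=1, ddof=1))      # pooled within-group mean square
tau2_hat = (msb - msw) / n_per

# ICC = tau^2 / (tau^2 + sigma^2): the share of total variance lying
# between groups. Population value here is 4 / (4 + 16) = 0.20.
icc = tau2_hat / (tau2_hat + msw)
print(f"estimated ICC = {icc:.3f}")
```

A near-zero ICC, as reported in the note above for the province level, means group membership explains almost none of the score variance, so a multilevel model adds little over a single-level one.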
References
Abrahams, I., & Millar, R. (2008). Does practical work really work? A study of the effectiveness of practical work as a teaching and learning method in school science. International Journal of Science Education, 30(14), 1945–1969. https://doi.org/10.1080/09500690701749305.
American Educational Research Association [AERA], American Psychological Association [APA], & National Council on Measurement in Education [NCME]. (2014). Standards for educational and psychological testing. Washington, DC: Author.
Barab, S. A., Gresalfi, M. S., & Ingram-Goble, A. (2010). Transformational play: using games to position person, content, and context. Educational Researcher, 39(7), 525–536. Retrieved from http://ase.tufts.edu/DevTech/courses/readings/Barab_Transformational_Play_2010.pdf.
Bennett, R. E., & Gitomer, D. H. (2009). Transforming K-12 assessment: integrating accountability testing, formative assessment and professional support. In C. Wyatt-Smith & J. J. Cumming (Eds.), Educational assessment in the 21st century (pp. 43–61). New York, NY: Springer.
Bennett, R.E., Persky, H., Weiss, A.R., and Jenkins, F. (2007). Problem solving in technology-rich environments: a report from the NAEP technology-based assessment project (NCES 2007–466). U.S. Department of Education. Washington, DC: National Center for Education Statistics. Retrieved from the Institute of Education Sciences website: http://nces.ed.gov/nationsreportcard/pubs/studies/2007466.asp.
Black, P., & Wiliam, D. (1998). Inside the black box: raising standards through classroom assessment. London: School of Education, King’s College.
Chalmers, A. F. (1999). What is this thing called science? Indianapolis, IN: Hackett Publishing Company.
Chu, M.-W. (2017, March). Using computer simulated science laboratories: a test of pre-laboratory activities with the learning error and formative feedback model. Unpublished doctoral dissertation, University of Alberta, Edmonton.
Council of Ministers of Education, Canada [CMEC] (2013a). Pan-Canadian assessment program PCAP—2013 student questionnaire. Council of Ministers of Education, Canada. Toronto: Author. Retrieved from https://www.cmec.ca/docs/pcap/pcap2013/Student%20Questionnaire.pdf.
Council of Ministers of Education, Canada [CMEC] (2013b). Pan-Canadian assessment program PCAP—2013 teacher questionnaire. Council of Ministers of Education, Canada. Toronto: Author. Retrieved from https://www.cmec.ca/docs/pcap/pcap2013/Teacher%20Questionnaire.pdf.
Council of Ministers of Education, Canada [CMEC] (2014). Pan-Canadian assessment program 2013: report on the pan-Canadian assessment of science, reading, and mathematics. Council of Ministers of Education, Canada. Toronto: Author. Retrieved from http://cmec.ca/Publications/Lists/Publications/Attachments/337/PCAP-2013-Public-Report-EN.pdf.
Duncan, & Noonan. (2007). Factors affecting teachers’ grading and assessment practices. Alberta Journal of Educational Research, 53(1), 1–21. Retrieved from http://ajer.journalhosting.ucalgary.ca/index.php/ajer/article/view/602/585.
Frontline (2014). The testing industry’s big four: profiles of the four companies that dominate the business of making and scoring standardized achievement tests. Retrieved from http://www.pbs.org/wgbh/pages/frontline/shows/schools/testing/companies.html.
Fung, K., & Chu, M.-W. (2015). Fairness of standardized assessments: discrepancy between provincial and territorial results. Journal of Contemporary Issues in Education, 10(1), 2–24. https://doi.org/10.20355/C5KG6P.
Gobert, J., Sao Pedro, M., Raziuddin, J., & Baker, R. (2013). From log files to assessment metrics for science inquiry using educational data mining. Journal of the Learning Sciences, 22(4), 521–563. Retrieved from http://slinq.org/projectfiles/pubs/GobertEtAlJLS2013.pdf.
Hodson, D. (1996). Laboratory work as scientific method: three decades of confusion and distortion. Journal of Curriculum Studies, 28(2), 115–135. https://doi.org/10.1080/0022027980280201.
Hodson, D. (2003). Time for action: science education for an alternative future. International Journal of Science Education, 25(6), 645–670. https://doi.org/10.1080/09500690305021.
Hofstein, A., & Lunetta, V. N. (2003). The laboratory in science education: foundations for the twenty-first century. Science Education, 88(1), 28–54. https://doi.org/10.1002/sce.10106.
Leighton, J. P., Chu, M.-W., & Seitz, P. (2013). Cognitive diagnostic assessment and the learning errors and formative feedback (LEAFF) model. In R. Lissitz (Ed.), Informing the practice of teaching using formative and interim assessment: A systems approach (pp. 183–207). Charlotte: Information Age Publishing.
Kane, M. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Westport, CT: American Council on Education/Praeger.
Klinger, D. A., & Saab, H. (2012). Educational leadership in the context of low-stakes accountability: the Canadian perspective. In L. Volante (Ed.), School leadership in the context of standard-based reform: International perspective (pp. 69–94). New York, NY: Springer Science + Business Media.
Klinger, D., DeLuca, C., & Miller, T. (2008). The evolving culture of large-scale assessments in Canadian education. Canadian Journal of Educational Administration and Policy, 76, 1–34.
Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: an overview. Theory Into Practice, 41(4), 212–218. Retrieved from https://www.depauw.edu/files/resources/krathwohl.pdf.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Erlbaum.
Ma, J., & Nickerson, J. V. (2006). Hands-on, simulated, and remote laboratories: a comparative literature review. ACM Computing Surveys, 38(3), 7. https://doi.org/10.1145/1132960.1132961.
McMillan, J. H. (2001). Fundamental assessment principles for teachers and school administrators. Practical Assessment, Research & Evaluation, 7(8). Retrieved from http://pareonline.net/getvn.asp?v=7&n=8.
National Research Council. (2006). America’s lab report: investigations in high school science. S. R. Singer, M. L. Hilton, & H. A. Schweingruber (Eds.). Committee on High School Science Laboratories: Role and Vision, Board on Science Education, Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press. Retrieved from http://www.nap.edu/catalog/11311.html.
National Research Council. (2014). Developing assessments for the next generation science standards. J. W. Pellegrino, M. R. Wilson, J. A. Koenig, & A. S. Beatty (Eds.). Committee on Developing Assessments of Science Proficiency in K-12, Board on Testing and Assessment and Board on Science Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press. Retrieved from http://www.nap.edu/catalog.php?record_id=18409.
Next Generation Science Standards Lead States. (2013). Next generation science standards: for states, by states. Washington, DC: The National Academies Press. Retrieved from http://www.nap.edu/catalog.php?record_id=18290.
O’Connor, K. (2011). 15 fixes for broken grades (Canadian edition). Toronto, ON: Pearson Canada.
Organization for Economic Co-operation and Development [OECD] (2017). PISA 2015 assessment and analytical framework: science, reading, mathematic, financial literacy and collaborative problem solving, OECD Publishing, Paris. Retrieved from https://doi.org/10.1787/9789264281820-en.
PhET. (2017). PhET interactive simulations: research. Retrieved from https://phet.colorado.edu/en/research.
Popham, W. J. (2011). Classroom assessment: what teachers need to know (6th ed.). Boston: Pearson.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: applications and data analysis methods (2nd ed.). Newbury Park, CA: Sage.
Shepard, L. A. (2000). The role of assessment in a learning culture. Educational Researcher, 29(7), 4–14. Retrieved from http://nepc.colorado.edu/files/TheRoleofAssessmentinaLearningCulture.pdf.
Shute, V. J., & Ventura, M. (2013). Measuring and supporting learning in games: stealth assessment. Cambridge, MA: Massachusetts Institute of Technology Press. Retrieved from http://myweb.fsu.edu/vshute/pdf/white.pdf.
Shute, V., Leighton, J. P., Jang, E. E., & Chu, M.-W. (2016). Advances in the science of assessment. Educational Assessment, 21(1), 34–59. https://doi.org/10.1080/10627197.2015.1127752.
Snijders, T. A. B., & Bosker, R. J. (2012). Multilevel analysis: an introduction to basic and advanced multilevel modeling (2nd ed.). London: Sage Publishers.
Supovitz, J. (2009). Can high-stakes testing leverage educational improvement? Prospects from the last decade of testing and accountability reform. Journal of Educational Change, 10(1), 211–227. https://doi.org/10.1007/s10833-009-9105-2.
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics. Boston: Pearson/Allyn & Bacon.
Volante, L., & Jaafar, S. B. (2008). Educational assessment in Canada. Assessment in Education: Principles, Policy, & Practice, 15(2), 201–210. https://doi.org/10.1080/09695940802164226.
Wainer, H. (1990). Introduction and history. In H. Wainer, N. J. Dorans, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: a primer (pp. 1–21). Hillsdale, NJ: Erlbaum.
Zenisky, A. L., & Sireci, S. G. (2002). Technological innovations in large-scale assessment. Applied Measurement in Education, 15(4), 337–362. https://doi.org/10.1207/S15324818AME1504_02.
Acknowledgements
Preparation of this paper was supported by the Council of Ministers of Education, Canada (CMEC). CMEC encourages researchers to freely express their professional judgment. This paper, therefore, does not necessarily represent the positions or policies of CMEC, and no official endorsement should be inferred.
Appendices
Appendix A
PCAP-2013 Student and Teacher Survey Questionnaire Assessment Items
Student Questionnaire (CMEC, 2013a)
Assessment Types
1. How often do you do the following in your science class? (Four-point Likert scale: 1=never, 2=rarely, 3=sometimes, and 4=often)
   a) Write tests or quizzes.
Science Performance-Based Tasks
2. How often do you do the following in your science class? (Four-point Likert scale: 1=never, 2=rarely, 3=sometimes, and 4=often)
   a) Watch the teacher do experiments as demonstrations.
   b) Do experiments following the instructions of the teacher or textbook.
   c) Choose your own investigations.
   d) Design an investigation to test your own ideas.
   e) Explain your ideas or solutions to other students.
   f) Spend time doing science activities or investigations.
Teacher Questionnaire (CMEC, 2013b)
Assessment Types
3. In the science class selected for PCAP-2013, how often are students assessed in the following ways? (Four-point Likert scale: 1=never, 2=rarely, 3=sometimes, and 4=often)
   a) Common school-wide tests or assessments
   b) Teacher-developed classroom tests
   c) Student portfolios and/or journals
   d) Individual student assignments/projects
   e) Group assignments/projects
   f) Homework
   g) Performance assessment (e.g., design a research project, an investigation or a machine)
Item Formats
4. In your teacher-developed science tests/examinations, how often do you use the following kinds of items or questions? (Four-point Likert scale: 1=never, 2=rarely, 3=sometimes, and 4=often)
   a) Selected-response items (e.g., true/false, multiple choice)
   b) Short-response items (e.g., one or two words, facts, short sentences)
   c) Extended-response items requiring an explanation or justification
   d) Performance assessment (e.g., design a research project, an investigation or a machine)
Science Performance-Based Tasks
5. To what extent do you ask the students to do the following during science instruction in the science class selected for PCAP-2013? (Four-point Likert scale: 1=not at all, 2=a little, 3=more than a little, and 4=a lot)
   a) Observe natural phenomena and describe what they see
   b) Watch you demonstrate an experiment or investigation
   c) Formulate their own questions for investigations
   d) Design ways to seek answers to their own questions
   e) Design or plan experiments or investigations
   f) Conduct experiments or investigations
Cite this article
Chu, MW., Fung, K. Relationships Between the Way Students Are Assessed in Science Classrooms and Science Achievement Across Canada. Res Sci Educ 50, 791–812 (2020). https://doi.org/10.1007/s11165-018-9711-1