Three Things Game Designers Need to Know About Assessment

Mislevy, Robert J.; Behrens, John T.; Dicerbo, Kristen E.; Frezzo, Dennis C.; West, Patti

doi:10.1007/978-1-4614-3546-4_5

Abstract

Designing game-based assessments requires coordinating the work of people from communities with little overlap, such as subject matter experts, game designers, software engineers, assessment specialists, and psychometricians. This chapter discusses three things that game designers should know about assessment to help their work come together toward the common goal: (1) Assessment design is compatible with game design, because they build on the same principles of learning. (2) Assessment is not really about numbers; it is about the structure of reasoning. (3) The key constraints of assessment design and game design need to be addressed, even if in rudimentary form, from the very beginning of the design process. The assessment design framework called “evidence-centered design” is introduced to complement game design principles, so that designers can address assessment criteria such as reliability and validity jointly with game criteria such as engagement and interactivity. The ideas are illustrated with examples from the Packet Tracer simulation environment and the Aspire game, both of which are used in the Cisco Networking Academies for learning and assessing computer network engineering.

Notes

1. Pearl (1988) quoted the statistician Glenn Shafer as having said “Probability isn’t really about numbers; it’s about the structure of reasoning.”
2. http://www.facebook.com/apps/application.php?id=123204664363385. Downloaded April 15, 2011.
3. Behrens, Frezzo, Mislevy, Kroopnick, and Wise (2007) analyzed an earlier prototype of Aspire, called Network City, in terms of the structural, functional, and semiotic symmetries between simulation-based games and assessments.
4. This figure is based on Almond, Steinberg, and Mislevy’s (2002) four-process architecture for assessment delivery systems. They describe how the processes are structured around the information in student, evidence, and task models discussed later in this chapter. Frezzo, Behrens, and Mislevy (2009) show how it plays out in Cisco’s Packet Tracer Skills Assessments.
5. These depictions and the narrative discussion of them set the stage for the more technical specifications that experts will need to address, such as measurement models, scoring algorithms, and generative task models. The interested reader is referred to Mislevy et al. (2003) and Mislevy and Riconscente (2006).
6. Susan Embretson’s (1985) Test design: Developments in psychology and psychometrics was a watershed publication on this problem. Leighton and Gierl (2007) provide more recent examples.
7. http://www.norsys.com. Downloaded May 1, 2011.

References

Alexander, C., Ishikawa, S., & Silverstein, M. (1977). A pattern language: Towns, buildings, construction. New York: Oxford University Press.
Almond, R. G., Steinberg, L. S., & Mislevy, R. J. (2002). Enhancing the design and delivery of assessment systems: A four-process architecture. Journal of Technology, Learning, and Assessment, 1(5). Retrieved May 1, 2011, from http://www.bc.edu/research/intasc/jtla/journal/v1n5.shtml.
Bagley, E., & Shaffer, D. W. (2009). When people get in the way: Promoting civic thinking through epistemic gameplay. International Journal of Gaming and Computer-mediated Simulations, 1, 36–52.
Barab, S. A., Dodge, T., & Gee, J. P. (in press). The worked example: Invitational scholarship in service of an emerging field. Educational Researcher.
Behrens, J. T., Frezzo, D. C., Mislevy, R. J., Kroopnick, M., & Wise, D. (2007). Structural, functional, and semiotic symmetries in simulation-based games and assessments. In E. L. Baker, J. Dickieson, W. Wulfeck, & H. F. O’Neil (Eds.), Assessment of problem solving using simulations (pp. 59–80). New York: Erlbaum.
Behrens, J. T., Mislevy, R. J., Bauer, M., Williamson, D. M., & Levy, R. (2004). Introduction to evidence centered design and lessons learned from its application in a global e-learning program. International Journal of Testing, 4, 295–301.
Behrens, J. T., Mislevy, R. J., DiCerbo, K. E., & Levy, R. (2012). An evidence centered design for learning and assessment in the digital world. In M. C. Mayrath, J. Clarke-Midura, & D. Robinson (Eds.), Technology-based assessments for 21st century skills: Theoretical and practical implications from modern research (pp. 13–53). Charlotte, NC: Information Age.
Bejar, I. I., & Braun, H. (1999). Architectural simulations: From research to implementation. Final report to the National Council of Architectural Registration Boards (ETS RM-99-2). Princeton, NJ: Educational Testing Service.
Bennett, R. E., & Bejar, I. I. (1998). Validity and automated scoring: It’s not only the scoring. Educational Measurement: Issues and Practice, 17(4), 9–17.
Cheng, B. H., Ructtinger, L., Fujii, R., & Mislevy, R. (2010). Assessing systems thinking and complexity in science (Large-Scale Assessment Technical Report 7). Menlo Park, CA: SRI International.
Chi, M. T. H., Glaser, R., & Farr, M. J. (Eds.). (1988). The nature of expertise. Hillsdale, NJ: Erlbaum.
Chung, G. K. W. K., Baker, E. L., Delacruz, G. C., Bewley, W. L., Elmore, J., & Seely, B. (2008). A computational approach to authoring problem-solving assessments. In E. L. Baker, J. Dickieson, W. Wulfeck, & H. F. O’Neil (Eds.), Assessment of problem solving using simulations (pp. 289–307). Mahwah, NJ: Erlbaum.
Clarke-Midura, J., & Dede, C. (2010). Assessment, technology, and change. Journal of Research on Technology in Education, 42, 309–328.
Claxton, G. (2002). Education for the learning age: A sociocultural approach to learning to learn. In G. Wells & G. Claxton (Eds.), Learning for life in the 21st century (pp. 19–33). Oxford, UK: Blackwell.
Conejo, R., Guzmán, E., Millán, E., Trella, M., Pérez-De-La-Cruz, J. L., & Ríos, A. (2004). A web-based tool for adaptive testing. International Journal of Artificial Intelligence in Education, 14, 29–61.
Csíkszentmihályi, M. (1975). Beyond boredom and anxiety. San Francisco, CA: Jossey-Bass.
Embretson, S. E. (Ed.). (1985). Test design: Developments in psychology and psychometrics. Orlando: Academic.
Embretson, S. E. (1998). A cognitive design system approach to generating valid tests: Application to abstract reasoning. Psychological Methods, 3, 380–396.
Ericsson, K. A., Charness, N., Feltovich, P., & Hoffman, R. R. (2006). Cambridge handbook on expertise and expert performance. Cambridge, UK: Cambridge University Press.
Fletcher, J. D., & Morrison, J. E. (2007). Representing cognition in games and simulations. In E. Baker, J. Dickieson, W. Wulfeck, & H. O’Neil (Eds.), Assessment of problem solving using simulations (pp. 107–137). New York: Lawrence Erlbaum.
Frezzo, D. C. (2009). Using activity theory to understand the role of a simulation-based interactive learning environment in a computer networking course. Doctoral dissertation, ProQuest. Retrieved August 29, 2011, from http://gradworks.umi.com/33/74/3374268.html.
Frezzo, D. C., Behrens, J. T., & Mislevy, R. J. (2009). Design patterns for learning and assessment: Facilitating the introduction of a complex simulation-based learning environment into a community of instructors. The Journal of Science Education and Technology. Retrieved April 10, 2012, from Springer Open Access http://www.springerlink.com/content/566p6g4307405346/.
Fullerton, T., Swain, C., & Hoffman, S. S. (2008). Game design workshop: Designing, prototyping, and playtesting games (2nd ed.). Burlington, MA: Morgan Kaufmann.
Gamma, E., Helm, R., Johnson, R., & Vlissides, J. (1994). Design patterns. Reading, MA: Addison-Wesley.
Gee, J. P. (2003). What video games have to teach us about learning and literacy. New York: Palgrave/Macmillan.
Greeno, J. G. (1998). The situativity of knowing, learning, and research. American Psychologist, 53, 5–26.
Katz, I. R. (1994). Coping with the complexity of design: Avoiding conflicts and prioritizing constraints. In A. Ram, N. Nersessian, & M. Recker (Eds.), Proceedings of the sixteenth annual meeting of the Cognitive Science Society (pp. 485–489). Mahwah, NJ: Erlbaum.
Koster, R. (2005). A theory of fun for game design. Scottsdale, AZ: Paraglyph.
Lave, J., & Wenger, E. (1991). Situated learning: Legitimate peripheral participation. Cambridge: Cambridge University Press.
Leighton, J., & Gierl, M. (Eds.). (2007). Cognitive diagnostic assessment for education: Theory and applications. New York, NY: Cambridge University Press.
Levy, R., Behrens, J. T., & Mislevy, R. J. (2006). Variations in adaptive testing and their online leverage points. In D. D. Williams, S. L. Howell, & M. Hricko (Eds.), Online assessment, measurement, and evaluation (pp. 180–202). Hershey, PA: Information Science Publishing.
Loftus, E. F., & Loftus, G. R. (1983). Mind at play: The psychology of video games. New York: Basic Books.
Lord, F. M. (1980). Applications of item response theory to practical testing problems. Mahwah, NJ: Erlbaum.
Luecht, R. M. (2006). Assessment engineering: An emerging discipline. Paper presented in the Centre for Research in Applied Measurement and Evaluation, University of Alberta, Edmonton.
Malone, T. W. (1981). What makes computer games fun? Byte, 6, 258–277.
Margolis, M. J., & Clauser, B. E. (2006). A regression-based procedure for automated scoring of a complex medical performance assessment. In D. M. Williamson, R. J. Mislevy, & I. I. Bejar (Eds.), Automated scoring for complex tasks in computer-based testing (pp. 123–167). Hillsdale, NJ: Lawrence Erlbaum.
Mayer, R. E. (1981). Frequency norms and structural analysis of algebra story problems into families, categories, and templates. Instructional Science, 10, 135–175.
Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13–23.
Mislevy, R. J. (2004). Can there be reliability without “reliability”? Journal of Educational and Behavioral Statistics, 29, 241–244.
Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design: Layers, concepts, and terminology. In S. Downing & T. Haladyna (Eds.), Handbook of test development (pp. 61–90). Mahwah, NJ: Erlbaum.
Mislevy, R. J., Riconscente, M. M., & Rutstein, D. W. (2009). Design patterns for assessing model-based reasoning (Large-Scale Assessment Technical Report 6). Menlo Park, CA: SRI International.
Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational assessments. Measurement: Interdisciplinary Research and Perspectives, 1, 3–67.
Mislevy, R. J., Steinberg, L. S., Breyer, F. J., Johnson, L., & Almond, R. G. (2002). Making sense of data from complex assessments. Applied Measurement in Education, 15, 363–378.
Moss, P. (1994). Can there be validity without reliability? Educational Researcher, 23(2), 5–12.
Nelson, B. C., Erlandson, B., & Denham, A. (2011). Global channels of evidence for learning and assessment in complex game environments. British Journal of Educational Technology, 42, 88–100.
Pausch, R., Gold, R., Skelly, T., & Thiel, D. (1994). What HCI designers can learn from video game designers. In Conference on human factors in computer systems (pp. 177–178). Boston, MA: ACM.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Kaufmann.
Quellmalz, E., & Pellegrino, J. W. (2009). Technology and testing. Science, 323, 75–79.
Rollings, A., & Morris, D. (2000). Game architecture and design. Scottsdale, AZ: Coriolis.
Roschelle, J. (1996). Designing for cognitive communication: Epistemic fidelity or mediating collaborative inquiry? In D. L. Day & D. K. Kovacs (Eds.), Computers communication and mental models (pp. 13–25). Bristol, PA: Taylor and Francis.
Rupp, A., Gushta, M., Mislevy, R. J., & Shaffer, D. W. (2010). Evidence-centered design of epistemic games: Measurement principles for complex learning environments. Journal of Technology, Learning, and Assessment, 8(4). Retrieved April 10, 2012, from http://ejournals.bc.edu/ojs/index.php/jtla/article/download/1623/1467.
Rupp, A., Templin, J., & Henson, R. (2010). Diagnostic measurement: Theory, methods, and applications. New York, NY: Guilford.
Salen, K., & Zimmerman, E. (2004). Rules of play: Game design fundamentals. Cambridge: MIT.
Salthouse, T. A. (1991). Expertise as the circumvention of human processing limitations. In K. A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise (pp. 286–300). Cambridge, UK: Cambridge University Press.
Scalise, K., & Gifford, B. (2006). Computer-based assessment in E-learning: A framework for constructing “Intermediate Constraint” questions and tasks for technology platforms. Journal of Technology, Learning, and Assessment, 4(6). Retrieved July 17, 2009, from http://ejournals.bc.edu/ojs/index.php/jtla/article/view/1653/1495.
Schmit, M. J., & Ryan, A. (1992). Test-taking dispositions: A missing link? Journal of Applied Psychology, 77, 629–637.
Schwartz, D. L., & Arena, D. (2009). Choice-based assessments for the digital age. Palo Alto: Stanford University.
Shaffer, D. W. (2006). How computer games help children learn. New York: Palgrave/Macmillan.
Shaffer, D. W., & Gee, J. P. (2012). The Right Kind of GATE: Computer games and the future of assessment. In M. Mayrath, J. Clarke-Midura, & D. H. Robinson (Eds.), Technology-based assessments for 21st century skills: Theoretical and practical implications from modern research (pp. 211–228). Charlotte: Information Age Publishing.
Shaffer, D. W., Hatfield, D., Svarovsky, G. N., Nash, P., Nulty, A., Bagley, E., et al. (2009). Epistemic network analysis: A prototype for 21st century assessment of learning. The International Journal of Learning and Media, 1, 33–53.
Shao, J., & Tu, D. (1995). The jackknife and bootstrap. New York: Springer.
Shute, V. J. (2011). Stealth assessment in computer-based games to support learning. In S. Tobias & J. D. Fletcher (Eds.), Computer games and instruction (pp. 503–524). Charlotte, NC: Information Age Publishers.
Shute, V. J., & Torres, R. (2012). Where streams converge: Using evidence-centered design to assess Quest to Learn. In M. Mayrath, J. Clarke-Midura, & D. H. Robinson (Eds.), Technology-based assessments for 21st century skills: Theoretical and practical implications from modern research (pp. 91–204). Charlotte, NC: Information Age Publishing.
Sundre, D. L., & Wise, S. L. (2003). ‘Motivation filtering’: An exploration of the impact of low examinee motivation on the psychometric quality of tests. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago.
Vendlinski, T. P., Baker, E. L., & Niemi, D. (2008). Templates and objects in authoring problem solving assessments. In E. L. Baker, J. Dickieson, W. Wulfeck, & H. F. O’Neil (Eds.), Assessment of problem solving using simulations (pp. 309–333). New York: Erlbaum.
Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes. Cambridge, MA: Harvard University Press.
Wainer, H., Dorans, N. J., Flaugher, R., Green, B. F., Mislevy, R. J., Steinberg, L., et al. (2000). Computerized adaptive testing: A primer (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
Wertsch, J. (1998). Mind as action. New York: Oxford University Press.
Williamson, D. M., Bauer, M., Steinberg, L. S., Mislevy, R. J., Behrens, J. T., & DeMark, S. (2004). Design rationale for a complex performance assessment. International Journal of Testing, 4, 303–332.

Acknowledgments

The work reported here was supported in part by a research contract from Cisco Systems, Inc., to the University of Maryland, College Park, and the Center for Advanced Technology in Schools (CATS), PR/Award Number R305C080015, as administered by the Institute of Education Sciences, U.S. Department of Education. The findings and opinions expressed in this report are those of the authors and do not necessarily reflect the positions or policies of Cisco, the CATS, the National Center for Education Research (NCER), the Institute of Education Sciences (IES), or the U.S. Department of Education.

Author information

Authors and Affiliations

Educational Testing Service, Rosedale Road, 12-T, Princeton, NJ, 08541, USA
Robert J. Mislevy
Center for Digital Experience and Analytics, Pearson, 400 Center Ridge Drive, Austin, TX, 78753, USA
John T. Behrens & Kristen E. Dicerbo
Instructional Research and Technology, Cisco Systems, 300 Berry Street #552, San Francisco, CA, 94158, USA
Dennis C. Frezzo
Cisco Networking Academy, 4085 SE 23rd Terrace, Ocala, FL, 34480, USA
Patti West

Corresponding author

Correspondence to Robert J. Mislevy.

Editor information

Editors and Affiliations

Fakultät für Sozialwissenschaften, University of Mannheim, Gebaudeteil B A5,6, Mannheim, 68131, Baden-Württemberg, Germany
Dirk Ifenthaler
College of Education, University of Oklahoma, Van Vleet Oval 820, Norman, 73019, Oklahoma, USA
Deniz Eseryel
Van Vleet Oval 820, Norman, 73019, Oklahoma, USA
Xun Ge

About this chapter

Cite this chapter

Mislevy, R.J., Behrens, J.T., Dicerbo, K.E., Frezzo, D.C., West, P. (2012). Three Things Game Designers Need to Know About Assessment. In: Ifenthaler, D., Eseryel, D., Ge, X. (eds) Assessment in Game-Based Learning. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3546-4_5

DOI: https://doi.org/10.1007/978-1-4614-3546-4_5
Published: 25 May 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3545-7
Online ISBN: 978-1-4614-3546-4
eBook Packages: Humanities, Social Sciences and Law; Education (R0)