Abstract
The chapter discusses a discrepancy between test developers' judgments of item difficulty and the difficulty students actually experienced. Previous studies showed that difficulty level is critical in multiple-choice question tests (Naqvi et al., Procedia—Social and Behavioral Sciences 2:3909–3913, 2010; Sim and Rasiah, Annals Academy of Medicine Singapore 35:67–71, 2006). A high number of invalid test items also reduces the effectiveness of a test (Ratnaningsih & Isfarudi, 2013). The aim of the study was to compare the difficulty level of the test items as judged by the test developers with the difficulty level obtained from item analysis. The hypothesis was that a gap between the two difficulty levels makes the test less effective. The study used data from three examinations of BIOL4110 (a General Biology test at Universitas Terbuka, Indonesia) held in three consecutive semesters between 2014 and 2015, with 469, 536, and 520 participating students, respectively. The relationship between the two difficulty levels was analyzed with a chi-square test. In addition, the relevance of the test to the textbook was analyzed using KR-20, and a discrimination index was computed. The analysis showed that in each semester the difficulty levels based on the test developers' judgment differed from those obtained by item analysis. The relevance level of the test was greater than 0.5, which was good, whereas the discrimination index was not, since some test items had an rpbis of <0.3. However, the passing rate of each test (62–73%) was satisfactory.
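The classical item-analysis statistics the abstract relies on — the difficulty index (proportion correct), the point-biserial discrimination rpbis, and KR-20 reliability — can be sketched as follows. This is a minimal illustration over a simulated 0/1 response matrix, not the BIOL4110 data; the function name `item_analysis` and the data-generating step are assumptions for the example.

```python
import numpy as np

def item_analysis(responses):
    """Classical item analysis for a 0/1 response matrix
    (rows = students, columns = items)."""
    n_students, k = responses.shape
    totals = responses.sum(axis=1)          # each student's total score

    # Difficulty index p: proportion of students answering each item correctly.
    p = responses.mean(axis=0)

    # Point-biserial discrimination: correlation of each item with the total score
    # (uncorrected, i.e., the item is included in the total).
    r_pbis = np.array([np.corrcoef(responses[:, i], totals)[0, 1]
                       for i in range(k)])

    # KR-20 reliability: (k/(k-1)) * (1 - sum(p*q) / variance of total scores).
    # Population variance is used here for simplicity.
    kr20 = (k / (k - 1)) * (1 - (p * (1 - p)).sum() / totals.var())

    return p, r_pbis, kr20

# Simulated responses: 300 students, 40 items, with correct-answer
# probability driven by a simple ability-minus-difficulty model.
rng = np.random.default_rng(0)
ability = rng.normal(size=(300, 1))
difficulty = rng.normal(size=(1, 40))
prob = 1.0 / (1.0 + np.exp(-(ability - difficulty)))
responses = (rng.random((300, 40)) < prob).astype(int)

p, r_pbis, kr20 = item_analysis(responses)

# Flag weakly discriminating items, mirroring the rpbis < 0.3 criterion
# mentioned in the abstract.
weak_items = np.flatnonzero(r_pbis < 0.3)
```

With statistics like these in hand, each item's empirical difficulty can be binned (e.g., easy/moderate/hard) and cross-tabulated against the test developers' judged difficulty for the chi-square comparison the study describes.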
References
Abdulghani, H. M., Ahmad, F., Irshad, M., Khalil, M. S., Al-Shaikh, G. K., Syed, S., Aldrees, A., Alrawais, N., & Haque, S. (2015). Faculty development programs improve the quality of multiple choice questions items’ writing. Scientific Reports, 5, 1–7.
Baker, F. B. (2001). The basics of item response theory (2nd ed.). College Park: ERIC Clearinghouse on Assessment and Evaluation.
Diki, D. (2015). Creativity of biology students in online learning: Case study of Universitas Terbuka, Indonesia. Doctoral dissertation, The Claremont Graduate University.
Erturk, N. O. (2015). Testing your tests: Reliability issues of academic English exams. International Journal of Psychology and Educational Studies, 2(2), 47–52.
Hambleton, R. K., & Jones, R. W. (1993). Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12(3), 38–47.
Hewindati, Y. T., & Zuhairi, A. (2009). Conducting biological science practicum at a distance at Universitas Terbuka, Indonesia. Asian Association of Open Universities Journal, 4(1), 47–58.
Holmberg, B. (2005). Theory and practice of distance education (2nd ed.). New York: Routledge.
Hotiu, A. (2006). The relationship between item difficulty and discrimination indices in multiple-choice tests in a physical science course. Doctoral dissertation, Florida Atlantic University, Boca Raton, Florida.
Kehoe, J. (1995). Basic item analysis for multiple-choice tests. Practical Assessment, Research, and Evaluation, 4(10), 1–13.
Kubinger, K. D., & Gottschall, C. H. (2007). Item difficulty of multiple choice tests dependant on different item response formats – An experiment in fundamental research on psychological assessment. Psychological Science, 49(4), 361.
Mitra, N. K., Nagaraja, H. S., Ponnudurai, G., & Judson, J. P. (2009). The levels of difficulty and discrimination indices in type A multiple choice questions of pre-clinical semester 1 multidisciplinary summative tests. International e-Journal of Science, Medicine and Education, 3(1), 2–7.
Moore, M. G., & Kearsley, G. (2012). Distance education: A system view of online learning (3rd ed.). Belmont, CA: Wadsworth.
Mukerjee, P., & Lahiri, S. K. (2015). Analysis of multiple choice questions (MCQ): Item and test statistics from an assessment in a medical college of Kolkata, West Bengal. IOSR Journal of Dental and Medical Sciences, 14(VI), 47–52.
Naqvi, S. I. H., Hashmi, M. A., & Hussain, A. (2010). Validation of objective-type test in biology at secondary school level. Procedia - Social and Behavioral Sciences, 2(2), 3909–3913.
Ratnaningsih, J. D., & Isfarudi, I. (2013). Analisis butir tes obyektif ujian akhir semester mahasiswa Universitas Terbuka berdasarkan teori tes modern [Item analysis of Universitas Terbuka students' end-of-semester objective tests based on modern test theory]. Jurnal Pendidikan Terbuka dan Jarak Jauh, 14(2), 98–109.
Sabri, S. (2013). Item analysis of student comprehensive test for research in teaching beginner string ensemble using model based teaching among music students in public universities. International Journal of Education and Research, 1(12), 3–14.
Sim, S., & Rasiah, R. I. (2006). Relationship between item difficulty and discrimination indices in true/false-type multiple choice questions of a para-clinical multidisciplinary paper. Annals Academy of Medicine Singapore, 35(2), 67–71.
Siri, A., & Freddano, M. (2011). The use of item analysis for the improvement of objective examinations. Procedia - Social and Behavioral Sciences, 29, 188–197.
Swanson, D. B., Holtzman, K. Z., Allbee, K., & Clauser, B. E. (2006). Psychometric characteristics and response times for content-parallel extended-matching and one-best-answer items in relation to number of options. Academic Medicine, 81(10), S52–S55.
Copyright information
© 2018 Association for Educational Communications and Technology (AECT)
Cite this chapter
Diki, D., Yuliastuti, E. (2018). Discrepancy of Difficulty Level Based On Item Analysis and Test Developers’ Judgment: Department of Biology at Universitas Terbuka, Indonesia. In: Persichitte, K., Suparman, A., Spector, M. (eds) Educational Technology to Improve Quality and Access on a Global Scale. Educational Communications and Technology: Issues and Innovations. Springer, Cham. https://doi.org/10.1007/978-3-319-66227-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66226-8
Online ISBN: 978-3-319-66227-5