Social Indicators Research, Volume 141, Issue 1, pp 31–60

Measurement Assessment in Cross-Country Comparative Analysis: Rasch Modelling on a Measure of Institutional Quality

  • Paola Annoni
  • Nicholas Charron


The European Quality of Government Index (EQI) is the only measure of institutional quality available at the regional level in the European Union. The index, published in 2010 and again in 2013, is based on an ad hoc survey that measures three broad aspects of governance within countries: corruption, impartiality and quality. In this paper the EQI is assessed for the first time by means of Rasch modelling, a popular Item Response Theory method. It is demonstrated that Rasch modelling allows for a wide range of validity and consistency tests on surveys of this kind. The analysis helped strengthen the survey, and consequently the index, by highlighting areas for improvement that can be applied to future rounds of the EQI survey. For instance, it allowed for testing the equivalence of the questions across countries and across respondents' socio-demographic backgrounds, the validity and fit of each question's measurement scale, and the internal consistency of the EQI domains of corruption, impartiality and quality. Several of the shortcomings highlighted by the Rasch analysis will be addressed in the upcoming round of data collection for the third edition of the EQI. The analysis is thus expected to improve the first measure of quality of government in the European Union regions.
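
For orientation, the Rasch family of models referred to above can be sketched as follows. The notation is an assumption made here for illustration: θ_n is the latent perception of respondent n, δ_i the location of question i, and τ_j the threshold between adjacent response categories. The abstract does not state which Rasch variant is fitted to the EQI questions, so the rating scale form shown second is one common choice for ordered survey responses, not necessarily the authors' specification.

```latex
% Dichotomous Rasch model: probability that respondent n endorses question i
P(X_{ni} = 1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

% Rating scale form for a question with ordered categories k = 0, 1, \dots, M
% (an empty sum, i.e. k = 0 or m = 0, is taken to be zero)
P(X_{ni} = k) =
  \frac{\exp\left[\sum_{j=1}^{k} (\theta_n - \delta_i - \tau_j)\right]}
       {\sum_{m=0}^{M} \exp\left[\sum_{j=1}^{m} (\theta_n - \delta_i - \tau_j)\right]}
```

Under this formulation, a question is equivalent across groups such as countries or socio-demographic categories when its location δ_i does not depend on group membership once θ_n is accounted for. Checking for such dependence is usually called differential item functioning analysis.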


European Quality of Government Index; European Union regions; Item response theory; Rasch modelling; Comparative analysis



Copyright information

© Springer Science+Business Media B.V., part of Springer Nature 2017

Authors and Affiliations

  1. European Commission, Policy Development and Economic Analysis Unit, Directorate General for Regional and Urban Policy, Brussels, Belgium
  2. Quality of Government Institute, University of Gothenburg, Gothenburg, Sweden
