Peer review versus bibliometrics: Which method better predicts the scholarly impact of publications?

  • Giovanni Abramo
  • Ciriaco Andrea D’Angelo
  • Emanuela Reale


In this work, we ask which method, peer review or bibliometrics, better predicts the future overall scholarly impact of scientific publications. We measure the agreement between peer review evaluations of Web of Science indexed publications submitted to the first Italian research assessment exercise and the long-term citations of the same publications. We then measure the same agreement for an early citation-based indicator. We find that the latter shows stronger predictive power: it more reliably predicts late citations in all the disciplinary areas examined, and for any citation time window starting 1 year after publication.


Research evaluation; Scientometrics; Publication quality; Scientific advancement



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2019

Authors and Affiliations

  • Giovanni Abramo (1, corresponding author)
  • Ciriaco Andrea D’Angelo (2)
  • Emanuela Reale (3)

  1. Laboratory for Studies in Research Evaluation, Institute for System Analysis and Computer Science (IASI-CNR), National Research Council, Rome, Italy
  2. Department of Engineering and Management, University of Rome “Tor Vergata”, Rome, Italy
  3. Research Institute on Sustainable Economic Growth (IRCRES-CNR), National Research Council, Rome, Italy
