Scientometrics, Volume 121, Issue 3, pp 1815–1824

The data source of this study is Web of Science Core Collection? Not enough

  • Weishu Liu


Clarivate Analytics’ Web of Science Core Collection, a comprehensive database consisting of ten sub-datasets, is increasingly used in academic research across more than two hundred Web of Science categories. A total of 271 English-language SCIE and SSCI papers published in 2017–2018 in the category of Information Science and Library Science mention “Web of Science” in the topic field. A manual check of the full texts of these papers reveals that 243 of them used Web of Science Core Collection as the data source, but more than half did not specify which sub-datasets of the Core Collection were used in the study. Since many institutions subscribe only to a customized subset of the whole Core Collection, this lack of transparency about the data source hinders the reproducibility of the corresponding studies. This study suggests that researchers specify the sub-datasets and the corresponding coverage timespans when using Web of Science Core Collection as the data source.
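The retrieval described above could, for instance, be expressed as a Web of Science advanced-search query of roughly the following form. The field tags (TS for topic, WC for Web of Science category, PY for publication year, LA for language, DT for document type) are standard advanced-search tags, but this is a sketch of the likely query, not the authors' exact search string:

```
TS="Web of Science"
AND WC="Information Science & Library Science"
AND PY=(2017-2018)
AND LA=(English)
AND DT=(Article)
```

In the Web of Science interface, the indexes searched (SCIE and SSCI here) are selected separately from the query itself, which is precisely why reporting the sub-datasets and their subscribed coverage timespans matters for reproducibility.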


Keywords: Web of Science Core Collection · Reproducibility · Responsible research · Bibliographic database · Bibliometric analysis



This research is partially supported by the National Natural Science Foundation of China (Grant No. 71801189) and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LQ18G030010). The author would like to thank the referee for the insightful suggestions and comments, which have significantly improved the manuscript.

Compliance with ethical standards

Conflict of interest

The author declares that there is no conflict of interest.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2019

Authors and Affiliations

  1. School of Information Management and Engineering, Zhejiang University of Finance and Economics, Hangzhou, China
