Mining Google Scholar Citations: An Exploratory Study

  • Ze Huang
  • Bo Yuan
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7389)


The official launch of Google Scholar Citations in 2011 opens a new horizon for analyzing the citations of individual researchers with unprecedented convenience and accuracy. This paper presents one of the first exploratory studies based on the data provided by Google Scholar Citations. More specifically, we conduct a series of investigations on: i) the overall citation patterns across different disciplines; ii) the correlation among various index metrics; iii) the personal citation patterns of researchers; iv) the transformation of research topics over time. Our results suggest that Google Scholar Citations is a powerful data source for citation analysis and provides a solid basis for performing more sophisticated data mining research in the future.


Keywords: Google Scholar Citations, Citation Analysis, Tag Cloud, Clustering





Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Ze Huang (1)
  • Bo Yuan (1)
  1. Intelligent Computing Lab, Division of Informatics, Graduate School at Shenzhen, Tsinghua University, Shenzhen, P.R. China
