Creation and Analysis of Large-Scale Bibliometric Networks

Springer Handbook of Science and Technology Indicators

Part of the book series: Springer Handbooks (SHB)

Abstract

In the more than a decade since the last Handbook of Quantitative Science and Technology Research [8.1] was published, a sea change has occurred in the creation and analysis of bibliometric networks that describe the Science & Technology (S&T) landscape. Previously, networks were typically restricted in size to hundreds or thousands of objects (papers, journals, authors, etc.) due to lack of data access and computing capacity. However, recent years have seen the increased availability of full databases, increased computing capacity, and the development of partitioning and community detection algorithms that work effectively at large scale. As a result, much larger networks, comprising millions or tens of millions of objects, are being created and analyzed. These large-scale networks have enabled analyses that were simply not possible in the past: analyses that require the context of complete networks to give accurate results.

In this chapter, we focus on large-scale, global bibliometric networks, and on the types of analysis that are enabled by these networks. We start by providing a historical perspective that sets the stage for recent advances that have culminated in the ability to create and analyze large-scale bibliographic networks. We then discuss data sources and the methods that are commonly used to create large-scale networks. We review many of these networks, along with the types of unique analyses that they enable, and ways that their results can be effectively communicated. After reviewing the state of the art, we discuss our most recent large-scale topic-level model of science in detail as an example of a global bibliometric model and show how it can be used for various applications.

References

  • H.F. Moed, W. Glänzel, U. Schmoch (Eds.): Handbook of Quantitative Science and Technology Research (Springer, Dordrecht 2004)

  • R. Klavans, K.W. Boyack: Toward a consensus map of science, J. Am. Soc. Inf. Sci. Technol. 60(3), 455–476 (2009)

  • I. Rafols, A.L. Porter, L. Leydesdorff: Science overlay maps: A new tool for research policy and library management, J. Am. Soc. Inf. Sci. Technol. 61(9), 1871–1887 (2010)

  • T. Velden, K.W. Boyack, J. Gläser, R. Koopman, A. Scharnhorst, S. Wang: Comparison of topic extraction approaches and their results, Scientometrics 111, 1169–1221 (2017)

  • E. Garfield: Citation indexes for science: A new dimension in documentation through association of ideas, Science 122, 108–111 (1955)

  • E. Garfield, I.H. Sher, R.J. Torpie: The Use of Citation Data in Writing the History of Science (Institute for Scientific Information, Philadelphia 1964)

  • H. Small: Co-citation in the scientific literature: A new measure of the relationship between two documents, J. Am. Soc. Inf. Sci. 24, 265–269 (1973)

  • H. Small, B.C. Griffith: The structure of scientific literatures, I: Identifying and graphing specialties, Soc. Stud. Sci. 4, 17–40 (1974)

  • Clarivate Analytics: Research Fronts 2016, https://clarivate.com/wp-content/uploads/2017/10/Research_Fronts_2016_Report_EN.pdf (2016)

  • H. Small, K.W. Boyack, R. Klavans: Identifying emerging topics in science and technology, Res. Policy 43, 1450–1467 (2014)

  • D. Rotolo, D. Hicks, B. Martin: What is an emerging technology?, Res. Policy 44(10), 1827–1843 (2015)

  • T.S. Kuhn: The Structure of Scientific Revolutions, 2nd edn. (Univ. Chicago Press, Chicago 1970)

  • N.C. Mullins: Theories and Theory Groups in Contemporary American Sociology (Harper & Row, New York 1973)

  • D. Crane: Invisible Colleges. Diffusion of Knowledge in Scientific Communities (Univ. Chicago Press, Chicago 1972)

  • K.W. Boyack: Investigating the effect of global data on topic detection, Scientometrics 111, 999–1015 (2017)

  • R. Klavans, K.W. Boyack: Using global mapping to create more accurate document-level maps of research fields, J. Am. Soc. Inf. Sci. Technol. 62(1), 1–18 (2011)

  • B. Latour: Science in Action (Harvard Univ. Press, Cambridge 1987)

  • D. Herrmannova, P. Knoth: An analysis of the Microsoft Academic Graph, D-Lib Magazine (2016), https://doi.org/10.1045/september2016-herrmannova

  • S. Ribas, A. Ueda, R.L.T. Santos, B. Ribeiro-Neto, N. Ziviani: UFMG/LATIN at WSDM Cup 2016: Simplified relative citation ratio for static paper ranking. In: 9th ACM International Conference on Web Search and Data Mining (ACM, San Francisco 2016)

  • C. Freyman, J. Byrnes, T. Muezzinoglu: Knowledge flows: Linking big data sets. In: OECD Blue Sky III (Ghent 2016), http://de.slideshare.net/innovationoecd/110-freyman-knowledge-flows-linking-big-dataset

  • A. Breitzman, P. Thomas: The emerging clusters model: A tool for identifying emerging technologies across multiple patent systems, Res. Policy 44, 195–205 (2015)

  • K. Börner, C. Chen, K.W. Boyack: Visualizing knowledge domains, Annu. Rev. Inf. Sci. Technol. 37, 179–255 (2003)

  • H. Small: Update on science mapping: Creating large document spaces, Scientometrics 38(2), 275–293 (1997)

  • K.W. Boyack, R. Klavans: Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?, J. Am. Soc. Inf. Sci. Technol. 61(12), 2389–2404 (2010)

  • L. Waltman, N.J. van Eck: A new methodology for constructing a publication-level classification system of science, J. Am. Soc. Inf. Sci. Technol. 63(12), 2378–2392 (2012)

  • N.J. van Eck, L. Waltman: How to normalize cooccurrence data? An analysis of some well-known similarity measures, J. Am. Soc. Inf. Sci. Technol. 60(8), 1635–1651 (2009)

  • K.W. Boyack, D. Newman, R.J. Duhon, R. Klavans, M. Patek, J.R. Biberstine, B. Schijvenaars, A. Skupin, N. Ma, K. Börner: Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches, PLoS ONE 6(3), e18029 (2011)

  • S. Fortunato: Community detection in graphs, Phys. Rep. 486(3–5), 75–174 (2010)

  • L. Šubelj, N.J. van Eck, L. Waltman: Clustering scientific publications based on citation relations: A systematic comparison of different methods, PLoS ONE 11(4), e0154404 (2016)

  • M. Rosvall, C.T. Bergstrom: Maps of random walks on complex networks reveal community structure, Proc. Natl. Acad. Sci. USA 105(4), 1118–1123 (2008)

  • S. Bae, D. Halperin, J.D. West, M. Rosvall, B. Howe: Scalable and efficient flow-based community detection for large-scale graph analysis, ACM Trans. Knowl. Discov. Data 11(3), 32 (2017)

  • V.D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre: Fast unfolding of communities in large networks, J. Stat. Mech. Theory Exp. 10, P10008 (2008)

  • L. Waltman, N.J. van Eck: A smart local moving algorithm for large-scale modularity-based community detection, Eur. Phys. J. B 86, 471 (2013)

  • C. Chen, F. Ibekwe-SanJuan, J. Hou: The structure and dynamics of cocitation clusters: A multiple-perspective cocitation analysis, J. Am. Soc. Inf. Sci. Technol. 61(7), 1386–1409 (2010)

  • C. Chen: CiteSpace: A Practical Guide for Mapping Scientific Literature (Nova Science, New York 2016)

  • N.J. van Eck, L. Waltman: Software survey: VOSviewer, a computer program for bibliometric mapping, Scientometrics 84(2), 523–538 (2010)

  • M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: Science mapping software tools: Review, analysis, and cooperative study among tools, J. Am. Soc. Inf. Sci. Technol. 62(7), 1382–1402 (2011)

  • M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: SciMAT: A new science mapping analysis software tool, J. Am. Soc. Inf. Sci. Technol. 63(8), 1609–1630 (2012)

  • T. Kamada, S. Kawai: An algorithm for drawing general undirected graphs, Inf. Process. Lett. 31, 7–15 (1989)

  • T.M.J. Fruchterman, E.M. Reingold: Graph drawing by force-directed placement, Softw. Pract. Exp. 21(11), 1129–1164 (1991)

  • S. Martin, W.M. Brown, R. Klavans, K.W. Boyack: OpenOrd: An open-source toolbox for large graph layout, Proc. SPIE Int. Soc. Opt. Eng. 7868, 786806 (2011)

  • A. Mrvar, V. Batagelj: Analysis and visualization of large networks with program package Pajek, Complex Adapt. Syst. Model. 4, 6 (2016)

  • M. Bastian, S. Heymann, M. Jacomy: Gephi: An open source software for exploring and manipulating networks. In: 3rd Int. AAAI Conf. Weblogs Soc. Media (2009)

  • K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 1, Inf. Process. Manag. 36(6), 779–808 (2000)

  • K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 2, Inf. Process. Manag. 36(6), 809–840 (2000)

  • K.W. Boyack, R. Klavans: Creation of a highly detailed, dynamic, global model and map of science, J. Assoc. Inf. Sci. Technol. 65(4), 670–685 (2014)

  • K. Börner, R. Klavans, M. Patek, A.M. Zoss, J.R. Biberstine, R.P. Light, V. Lariviere, K.W. Boyack: Design and update of a classification system: The UCSD map of science, PLoS ONE 7(7), e39464 (2012)

  • K.W. Boyack: Using detailed maps of science to identify potential collaborations, Scientometrics 79(1), 27–44 (2009)

  • R. Klavans, K.W. Boyack: Toward an objective, reliable and accurate method for measuring research leadership, Scientometrics 82(3), 539–553 (2010)

  • J. Ruiz-Castillo, L. Waltman: Field-normalized citation impact indicators using algorithmically constructed classification systems of science, J. Informetr. 9, 102–117 (2015)

  • K.W. Boyack, R. Klavans: Including non-source items in a large-scale map of science: What difference does it make?, J. Informetr. 8, 569–580 (2014)

  • R. Klavans, K.W. Boyack: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?, J. Assoc. Inf. Sci. Technol. 68(4), 984–998 (2017)

  • J.G. Foster, A. Rzhetsky, J.A. Evans: Tradition and innovation in scientists' research strategies, Am. Sociol. Rev. 80(5), 875–908 (2015)

  • F. Shi, J.G. Foster, J.A. Evans: Weaving the fabric of science: Dynamic network models of science's unfolding structure, Soc. Netw. 43, 73–85 (2015)

  • I. Wesley-Smith, C.T. Bergstrom, J.D. West: Static ranking of scholarly papers using article-level Eigenfactor (ALEF). In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)

  • J.D. West, I. Wesley-Smith, C.T. Bergstrom: A recommendation system based on hierarchical clustering of an article-level citation network, IEEE Trans. Big Data 2(2), 113–123 (2016)

  • K.W. Boyack, M. Patek, L.H. Ungar, P. Yoon, R. Klavans: Classification of individual articles from all of science by research level, J. Informetr. 8(1), 1–12 (2014)

  • K.W. Boyack, R. Klavans, H. Small, L. Ungar: Characterizing the emergence of two nanotechnology topics using a contemporaneous global micro-model of science, J. Eng. Technol. Manag. 32, 147–159 (2014)

  • L. Waltman, K.W. Boyack, G. Colavizza, N.J. van Eck: A principled approach for comparing relatedness measures for clustering publications. In: 16th Int. Conf. Int. Soc. Scientometr. Informetr., Wuhan (2017) pp. 691–702

  • M.-H. Feng, K.-H. Chan, H.-Y. Chen, M.-F. Tsai, M.-Y. Yeh, S.-D. Lin: An efficient solution to reinforce paper ranking using author/venue/citation information – The winner's solution for WSDM Cup 2016. In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)

  • K.W. Boyack, R. Klavans: Measuring science-technology interaction using rare inventor-author names, J. Informetr. 2, 173–182 (2008)

  • S. Reardon: Text-mining offers clues to success, Nature 509, 410 (2014)

  • K.R. McKeown, H. Daume III, S. Chaturvedi, J. Paparrizos, K. Thadani, P. Barrio, O. Biran, S. Bothe, M. Collins, K.R. Fleischmann, L. Gravano, R. Jha, B. King, K. McInerney, T. Moon, A. Neelakantan, D. O'Seaghdha, D. Radev, C. Templeton, S. Teufel: Predicting the impact of scientific concepts using full text features, J. Assoc. Inf. Sci. Technol. 67(11), 2684–2696 (2016)

  • V.I. Torvik, N.R. Smalheiser: Author name disambiguation in MEDLINE, ACM Trans. Knowl. Discov. Data 3(3), 11–40 (2009)

  • G.-C. Li, R. Lai, A. D'Amour, D.M. Doolin, Y. Sun, V.I. Torvik, A.Z. Yu, L. Fleming: Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010), Res. Policy 43, 941–955 (2014)

  • W. Liu, R.I. Dogan, S. Kim, D.C. Comeau, W. Kim, L. Yeganova, Z. Lu, W.J. Wilbur: Author name disambiguation for PubMed, J. Assoc. Inf. Sci. Technol. 65(4), 765–781 (2014)

  • C. Schulz, A. Mazloumian, A.M. Petersen, O. Penner, D. Helbing: Exploiting citation networks for large-scale author name disambiguation, EPJ Data Sci. 3, 11 (2014)

  • E. Caron, N.J. van Eck: Large scale author name disambiguation using rule-based scoring and clustering. In: Proc. Sci. Technol. Indic. Conf., Leiden (2014) pp. 79–86

  • B.D. Fegley, V.I. Torvik: Has large-scale named-entity network analysis been resting on a flawed assumption?, PLoS ONE 8, e70299 (2013)

  • B. Uzzi, S. Mukherjee, M. Stringer, B. Jones: Atypical combinations and scientific impact, Science 342, 468–472 (2013)

  • K.W. Boyack, R. Klavans: Atypical combinations are confounded by disciplinary effects. In: 19th Int. Conf. Sci. Technol. Indicat (2014)

  • V. Larivière, S. Haustein, K. Börner: Long-distance interdisciplinarity leads to higher scientific impact, PLoS ONE 10, e0122565 (2015)

  • L. Kay, N. Newman, J. Youtie, A. Porter, I. Rafols: Patent overlay mapping: Visualizing technological distance, J. Assoc. Inf. Sci. Technol. 65(12), 2432–2443 (2014)

  • E.M. Talley, D. Newman, D. Mimno, B.W. Herr, H.M. Wallach, G.A.P.C. Burns, A.G.M. Leenders, A. McCallum: Database of NIH grants using machine-learned categories and graphical clustering, Nat. Methods 8(6), 443–444 (2011)

  • M. Bertin, I. Atanassova, Y. Gingras, V. Lariviere: The invariant distribution of references in scientific articles, J. Assoc. Inf. Sci. Technol. 67(1), 164–177 (2016)

  • K.W. Boyack, N.J. van Eck, G. Colavizza, L. Waltman: Characterizing in-text citations in scientific articles: A large-scale analysis, J. Informetr. 12(1), 59–73 (2018)

  • S. Emmons, S. Kobourov, M. Gallant, K. Börner: Analysis of network clustering algorithms and cluster quality metrics at scale, PLoS ONE 11(7), e0159161 (2016)

  • R. Klavans, K.W. Boyack: Research portfolio analysis and topic prominence, J. Informetr. 11(4), 1158–1174 (2017)

  • D.J. de Solla Price: Little Science, Big Science (Columbia Univ. Press, New York 1963)

  • K.W. Boyack: Thesaurus-based methods for mapping contents of publication sets, Scientometrics 111, 1141–1155 (2017)

  • R. Klavans, K.W. Boyack: The research focus of nations: Economic vs. altruistic motivations, PLoS ONE 12, e0169383 (2017)

  • J.J. Franklin, R. Johnston: Co-citation bibliometric modeling as a tool for S&T policy and R&D management: Issues, applications, and developments. In: Handbook of Quantitative Studies of Science and Technology, ed. by A.F.J. van Raan (Elsevier, Amsterdam 1988) pp. 325–389

  • R. Klavans, K.W. Boyack: Mapping altruism, J. Informetr. 8, 431–447 (2014)

Author information

Corresponding author

Correspondence to Kevin W. Boyack.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Boyack, K.W., Klavans, R. (2019). Creation and Analysis of Large-Scale Bibliometric Networks. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds) Springer Handbook of Science and Technology Indicators. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-02511-3_8
