Abstract
In the more than a decade since the last Handbook of Quantitative Science and Technology Research [8.1] was published, a sea change has occurred in the creation and analysis of bibliometric networks that describe the Science & Technology (S&T) landscape. Previously, networks were typically restricted in size to hundreds or thousands of objects (papers, journals, authors, etc.) due to lack of data access and computing capacity. However, recent years have seen the increased availability of full databases, increased computing capacity, and the development of partitioning and community detection algorithms that work effectively at large scale. As a result, much larger networks, comprising millions or tens of millions of objects, are being created and analyzed. These large-scale networks have enabled analyses that were simply not possible in the past, namely analyses that require the context of complete networks to give accurate results.
In this chapter, we focus on large-scale, global bibliometric networks, and on the types of analysis that are enabled by these networks. We start by providing a historical perspective that sets the stage for recent advances that have culminated in the ability to create and analyze large-scale bibliographic networks. We then discuss data sources and the methods that are commonly used to create large-scale networks. We review many of these networks, along with the types of unique analyses that they enable, and ways that their results can be effectively communicated. After reviewing the state of the art, we discuss our most recent large-scale topic-level model of science in detail as an example of a global bibliometric model and show how it can be used for various applications.
References
H.F. Moed, W. Glänzel, U. Schmoch (Eds.): Handbook of Quantitative Science and Technology Research (Springer, Dordrecht 2004)
R. Klavans, K.W. Boyack: Toward a consensus map of science, J. Am. Soc. Inf. Sci. Technol. 60(3), 455–476 (2009)
I. Rafols, A.L. Porter, L. Leydesdorff: Science overlay maps: A new tool for research policy and library management, J. Am. Soc. Inf. Sci. Technol. 61(9), 1871–1887 (2010)
T. Velden, K.W. Boyack, J. Gläser, R. Koopman, A. Scharnhorst, S. Wang: Comparison of topic extraction approaches and their results, Scientometrics 111, 1169–1221 (2017)
E. Garfield: Citation indexes for science: A new dimension in documentation through association of ideas, Science 122, 108–111 (1955)
E. Garfield, I.H. Sher, R.J. Torpie: The Use of Citation Data in Writing the History of Science (Institute for Scientific Information, Philadelphia 1964)
H. Small: Co-citation in the scientific literature: A new measure of the relationship between two documents, J. Am. Soc. Inf. Sci. 24, 265–269 (1973)
H. Small, B.C. Griffith: The structure of scientific literatures, I: Identifying and graphing specialties, Soc. Stud. Sci. 4, 17–40 (1974)
Clarivate Analytics: Research Fronts 2016, https://clarivate.com/wp-content/uploads/2017/10/Research_Fronts_2016_Report_EN.pdf (2016)
H. Small, K.W. Boyack, R. Klavans: Identifying emerging topics in science and technology, Res. Policy 43, 1450–1467 (2014)
D. Rotolo, D. Hicks, B. Martin: What is an emerging technology?, Res. Policy 44(10), 1827–1843 (2015)
T.S. Kuhn: The Structure of Scientific Revolutions, 2nd edn. (Univ. Chicago Press, Chicago 1970)
N.C. Mullins: Theories and Theory Groups in Contemporary American Sociology (Harper & Row, New York 1973)
D. Crane: Invisible Colleges. Diffusion of Knowledge in Scientific Communities (Univ. Chicago Press, Chicago 1972)
K.W. Boyack: Investigating the effect of global data on topic detection, Scientometrics 111, 999–1015 (2017)
R. Klavans, K.W. Boyack: Using global mapping to create more accurate document-level maps of research fields, J. Am. Soc. Inf. Sci. Technol. 62(1), 1–18 (2011)
B. Latour: Science in Action (Harvard Univ. Press, Cambridge 1987)
D. Herrmannova, P. Knoth: An analysis of the Microsoft Academic Graph, D-Lib Magazine (2016), https://doi.org/10.1045/september2016-herrmannova
S. Ribas, A. Ueda, R.L.T. Santos, B. Ribeiro-Neto, N. Ziviani: UFMG/LATIN at WSDM Cup 2016: Simplified relative citation ratio for static paper ranking. In: 9th ACM International Conference on Web Search and Data Mining (ACM, San Francisco 2016)
C. Freyman, J. Byrnes, T. Muezzinoglu: Knowledge flows: Linking big data sets. In: OECD Blue Sky III (Ghent 2016), http://de.slideshare.net/innovationoecd/110-freyman-knowledge-flows-linking-big-dataset
A. Breitzman, P. Thomas: The emerging clusters model: A tool for identifying emerging technologies across multiple patent systems, Res. Policy 44, 195–205 (2015)
K. Börner, C. Chen, K.W. Boyack: Visualizing knowledge domains, Annu. Rev. Inf. Sci. Technol. 37, 179–255 (2003)
H. Small: Update on science mapping: Creating large document spaces, Scientometrics 38(2), 275–293 (1997)
K.W. Boyack, R. Klavans: Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?, J. Am. Soc. Inf. Sci. Technol. 61(12), 2389–2404 (2010)
L. Waltman, N.J. van Eck: A new methodology for constructing a publication-level classification system of science, J. Am. Soc. Inf. Sci. Technol. 63(12), 2378–2392 (2012)
N.J. van Eck, L. Waltman: How to normalize cooccurrence data? An analysis of some well-known similarity measures, J. Am. Soc. Inf. Sci. Technol. 60(8), 1635–1651 (2009)
K.W. Boyack, D. Newman, R.J. Duhon, R. Klavans, M. Patek, J.R. Biberstine, B. Schijvenaars, A. Skupin, N. Ma, K. Börner: Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches, PLoS ONE 6(3), e18029 (2011)
S. Fortunato: Community detection in graphs, Phys. Rep. 486(3–5), 75–174 (2010)
L. Šubelj, N.J. van Eck, L. Waltman: Clustering scientific publications based on citation relations: A systematic comparison of different methods, PLoS ONE 11(4), e0154404 (2016)
M. Rosvall, C.T. Bergstrom: Maps of random walks on complex networks reveal community structure, Proc. Natl. Acad. Sci. USA 105(4), 1118–1123 (2008)
S. Bae, D. Halperin, J.D. West, M. Rosvall, B. Howe: Scalable and efficient flow-based community detection for large-scale graph analysis, ACM Trans. Knowl. Discov. Data 11(3), 32 (2017)
V.D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre: Fast unfolding of communities in large networks, J. Stat. Mech. Theory Exp. 10, P10008 (2008)
L. Waltman, N.J. van Eck: A smart local moving algorithm for large-scale modularity-based community detection, Eur. Phys. J. B 86, 471 (2013)
C. Chen, F. Ibekwe-SanJuan, J. Hou: The structure and dynamics of cocitation clusters: A multiple-perspective cocitation analysis, J. Am. Soc. Inf. Sci. Technol. 61(7), 1386–1409 (2010)
C. Chen: CiteSpace: A Practical Guide for Mapping Scientific Literature (Nova Science, New York 2016)
N.J. van Eck, L. Waltman: Software survey: VOSviewer, a computer program for bibliometric mapping, Scientometrics 84(2), 523–538 (2010)
M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: Science mapping software tools: Review, analysis, and cooperative study among tools, J. Am. Soc. Inf. Sci. Technol. 62(7), 1382–1402 (2011)
M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: SciMAT: A new science mapping analysis software tool, J. Am. Soc. Inf. Sci. Technol. 63(8), 1609–1630 (2012)
T. Kamada, S. Kawai: An algorithm for drawing general undirected graphs, Inf. Process. Lett. 31, 7–15 (1988)
T.M.J. Fruchterman, E.M. Reingold: Graph drawing by force-directed placement, Softw. Pract. Exp. 21(11), 1129–1164 (1991)
S. Martin, W.M. Brown, R. Klavans, K.W. Boyack: OpenOrd: An open-source toolbox for large graph layout, Proc. SPIE Int. Soc. Opt. Eng. 7868, 786806 (2011)
A. Mrvar, V. Batagelj: Analysis and visualization of large networks with program package Pajek, Complex Adapt. Syst. Model. 4, 6 (2016)
M. Bastian, S. Heymann, M. Jacomy: Gephi: An open source software for exploring and manipulating networks. In: 3rd Int. AAAI Conf. Weblogs Soc. Media (2009)
K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 1, Inf. Process. Manag. 36(6), 779–808 (2000)
K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 2, Inf. Process. Manag. 36(6), 809–840 (2000)
K.W. Boyack, R. Klavans: Creation of a highly detailed, dynamic, global model and map of science, J. Assoc. Inf. Sci. Technol. 65(4), 670–685 (2014)
K. Börner, R. Klavans, M. Patek, A.M. Zoss, J.R. Biberstine, R.P. Light, V. Larivière, K.W. Boyack: Design and update of a classification system: The UCSD map of science, PLoS ONE 7(7), e39464 (2012)
K.W. Boyack: Using detailed maps of science to identify potential collaborations, Scientometrics 79(1), 27–44 (2009)
R. Klavans, K.W. Boyack: Toward an objective, reliable and accurate method for measuring research leadership, Scientometrics 82(3), 539–553 (2010)
J. Ruiz-Castillo, L. Waltman: Field-normalized citation impact indicators using algorithmically constructed classification systems of science, J. Informetr. 9, 102–117 (2015)
K.W. Boyack, R. Klavans: Including non-source items in a large-scale map of science: What difference does it make?, J. Informetr. 8, 569–580 (2014)
R. Klavans, K.W. Boyack: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?, J. Assoc. Inf. Sci. Technol. 68(4), 984–998 (2017)
J.G. Foster, A. Rzhetsky, J.A. Evans: Tradition and innovation in scientists' research strategies, Am. Sociol. Rev. 80(5), 875–908 (2015)
F. Shi, J.G. Foster, J.A. Evans: Weaving the fabric of science: Dynamic network models of science's unfolding structure, Soc. Netw. 43, 73–85 (2015)
I. Wesley-Smith, C.T. Bergstrom, J.D. West: Static ranking of scholarly papers using article-level Eigenfactor (ALEF). In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)
J.D. West, I. Wesley-Smith, C.T. Bergstrom: A recommendation system based on hierarchical clustering of an article-level citation network, IEEE Trans. Big Data 2(2), 113–123 (2016)
K.W. Boyack, M. Patek, L.H. Ungar, P. Yoon, R. Klavans: Classification of individual articles from all of science by research level, J. Informetr. 8(1), 1–12 (2014)
K.W. Boyack, R. Klavans, H. Small, L. Ungar: Characterizing the emergence of two nanotechnology topics using a contemporaneous global micro-model of science, J. Eng. Technol. Manag. 32, 147–159 (2014)
L. Waltman, K.W. Boyack, G. Colavizza, N.J. van Eck: A principled approach for comparing relatedness measures for clustering publications. In: 16th Int. Conf. Int. Soc. Scientometr. Informetr., Wuhan (2017) pp. 691–702
M.-H. Feng, K.-H. Chan, H.-Y. Chen, M.-F. Tsai, M.-Y. Yeh, S.-D. Lin: An efficient solution to reinforce paper ranking using author/venue/citation information – The winner's solution for WSDM Cup 2016. In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)
K.W. Boyack, R. Klavans: Measuring science-technology interaction using rare inventor-author names, J. Informetr. 2, 173–182 (2008)
S. Reardon: Text-mining offers clues to success, Nature 509, 410 (2014)
K.R. McKeown, H. Daume III, S. Chaturvedi, J. Paparrizos, K. Thadani, P. Barrio, O. Biran, S. Bothe, M. Collins, K.R. Fleischmann, L. Gravano, R. Jha, B. King, K. McInerney, T. Moon, A. Neelakantan, D. O'Seaghdha, D. Radev, C. Templeton, S. Teufel: Predicting the impact of scientific concepts using full text features, J. Assoc. Inf. Sci. Technol. 67(11), 2684–2696 (2016)
V.I. Torvik, N.R. Smalheiser: Author name disambiguation in MEDLINE, ACM Trans. Knowl. Discov. Data 3(3), 11–40 (2009)
G.-C. Li, R. Lai, A. D'Amour, D.M. Doolin, Y. Sun, V.I. Torvik, A.Z. Yu, L. Fleming: Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010), Res. Policy 43, 941–955 (2014)
W. Liu, R.I. Dogan, S. Kim, D.C. Comeau, W. Kim, L. Yeganova, Z. Lu, W.J. Wilbur: Author name disambiguation for PubMed, J. Assoc. Inf. Sci. Technol. 65(4), 765–781 (2014)
C. Schulz, A. Mazloumian, A.M. Petersen, O. Penner, D. Helbing: Exploiting citation networks for large-scale author name disambiguation, EPJ Data Sci. 3, 11 (2014)
E. Caron, N.J. van Eck: Large scale author name disambiguation using rule-based scoring and clustering. In: Proc. Sci. Technol. Indic. Conf., Leiden (2014) pp. 79–86
B.D. Fegley, V.I. Torvik: Has large-scale named-entity network analysis been resting on a flawed assumption?, PLoS ONE 8, e70299 (2013)
B. Uzzi, S. Mukherjee, M. Stringer, B. Jones: Atypical combinations and scientific impact, Science 342, 468–472 (2013)
K.W. Boyack, R. Klavans: Atypical combinations are confounded by disciplinary effects. In: 19th Int. Conf. Sci. Technol. Indic. (2014)
V. Larivière, S. Haustein, K. Börner: Long-distance interdisciplinarity leads to higher scientific impact, PLoS ONE 10, e0122565 (2015)
L. Kay, N. Newman, J. Youtie, A. Porter, I. Rafols: Patent overlay mapping: Visualizing technological distance, J. Assoc. Inf. Sci. Technol. 65(12), 2432–2443 (2014)
E.M. Talley, D. Newman, D. Mimno, B.W. Herr, H.M. Wallach, G.A.P.C. Burns, A.G.M. Leenders, A. McCallum: Database of NIH grants using machine-learned categories and graphical clustering, Nat. Methods 8(6), 443–444 (2011)
M. Bertin, I. Atanassova, Y. Gingras, V. Larivière: The invariant distribution of references in scientific articles, J. Assoc. Inf. Sci. Technol. 67(1), 164–177 (2016)
K.W. Boyack, N.J. van Eck, G. Colavizza, L. Waltman: Characterizing in-text citations in scientific articles: A large-scale analysis, J. Informetr. 12(1), 59–73 (2018)
S. Emmons, S. Kobourov, M. Gallant, K. Börner: Analysis of network clustering algorithms and cluster quality metrics at scale, PLoS ONE 11(7), e0159161 (2016)
R. Klavans, K.W. Boyack: Research portfolio analysis and topic prominence, J. Informetr. 11(4), 1158–1174 (2017)
D.J. de Solla Price: Little Science, Big Science (Columbia Univ. Press, New York 1963)
K.W. Boyack: Thesaurus-based methods for mapping contents of publication sets, Scientometrics 111, 1141–1155 (2017)
R. Klavans, K.W. Boyack: The research focus of nations: Economic vs. altruistic motivations, PLoS ONE 12, e0169383 (2017)
J.J. Franklin, R. Johnston: Co-citation bibliometric modeling as a tool for S&T policy and R&D management: Issues, applications, and developments. In: Handbook of Quantitative Studies of Science and Technology, ed. by A.F.J. van Raan (Elsevier, Amsterdam 1988) pp. 325–389
R. Klavans, K.W. Boyack: Mapping altruism, J. Informetr. 8, 431–447 (2014)
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Boyack, K.W., Klavans, R. (2019). Creation and Analysis of Large-Scale Bibliometric Networks. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds) Springer Handbook of Science and Technology Indicators. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-02511-3_8
DOI: https://doi.org/10.1007/978-3-030-02511-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02510-6
Online ISBN: 978-3-030-02511-3
eBook Packages: Economics and Finance (R0)