Creation and Analysis of Large-Scale Bibliometric Networks

Boyack, Kevin W.; Klavans, Richard

doi:10.1007/978-3-030-02511-3_8


Part of the book series: Springer Handbooks (SHB)


Abstract

In the more than a decade since the last Handbook of Quantitative Science and Technology Research [8.1] was published, a sea change has occurred in the creation and analysis of bibliometric networks that describe the Science & Technology (S&T) landscape. Previously, networks were typically restricted in size to hundreds or thousands of objects (papers, journals, authors, etc.) due to lack of data access and computing capacity. However, recent years have seen the increased availability of full databases, increased computing capacity, and the development of partitioning and community detection algorithms that can work effectively at large scale. As a result, much larger networks, comprising millions or tens of millions of objects, are being created and analyzed. These large-scale networks have enabled analyses that were not possible in the past because they require the context of complete networks to give accurate results.

In this chapter, we focus on large-scale, global bibliometric networks, and on the types of analysis that are enabled by these networks. We start by providing a historical perspective that sets the stage for recent advances that have culminated in the ability to create and analyze large-scale bibliographic networks. We then discuss data sources and the methods that are commonly used to create large-scale networks. We review many of these networks, along with the types of unique analyses that they enable, and ways that their results can be effectively communicated. After reviewing the state of the art, we discuss our most recent large-scale topic-level model of science in detail as an example of a global bibliometric model and show how it can be used for various applications.


References

H.F. Moed, W. Glänzel, U. Schmoch (Eds.): Handbook of Quantitative Science and Technology Research (Springer, Dordrecht 2004)
R. Klavans, K.W. Boyack: Toward a consensus map of science, J. Am. Soc. Inf. Sci. Technol. 60(3), 455–476 (2009)
I. Rafols, A.L. Porter, L. Leydesdorff: Science overlay maps: A new tool for research policy and library management, J. Am. Soc. Inf. Sci. Technol. 61(9), 1871–1887 (2010)
T. Velden, K.W. Boyack, J. Gläser, R. Koopman, A. Scharnhorst, S. Wang: Comparison of topic extraction approaches and their results, Scientometrics 111, 1169–1221 (2017)
E. Garfield: Citation indexes for science: A new dimension in documentation through association of ideas, Science 122, 108–111 (1955)
E. Garfield, I.H. Sher, R.J. Torpie: The Use of Citation Data in Writing the History of Science (Institute for Scientific Information, Philadelphia 1964)
H. Small: Co-citation in the scientific literature: A new measure of the relationship between two documents, J. Am. Soc. Inf. Sci. 24, 265–269 (1973)
H. Small, B.C. Griffith: The structure of scientific literatures, I: Identifying and graphing specialties, Soc. Stud. Sci. 4, 17–40 (1974)
Clarivate Analytics: Research Fronts 2016, https://clarivate.com/wp-content/uploads/2017/10/Research_Fronts_2016_Report_EN.pdf (2016)
H. Small, K.W. Boyack, R. Klavans: Identifying emerging topics in science and technology, Res. Policy 43, 1450–1467 (2014)
D. Rotolo, D. Hicks, B. Martin: What is an emerging technology?, Res. Policy 44(10), 1827–1843 (2015)
T.S. Kuhn: The Structure of Scientific Revolutions, 2nd edn. (Univ. Chicago Press, Chicago 1970)
N.C. Mullins: Theories and Theory Groups in Contemporary American Sociology (Harper & Row, New York 1973)
D. Crane: Invisible Colleges. Diffusion of Knowledge in Scientific Communities (Univ. Chicago Press, Chicago 1972)
K.W. Boyack: Investigating the effect of global data on topic detection, Scientometrics 111, 999–1015 (2017)
R. Klavans, K.W. Boyack: Using global mapping to create more accurate document-level maps of research fields, J. Am. Soc. Inf. Sci. Technol. 62(1), 1–18 (2011)
B. Latour: Science in Action (Harvard Univ. Press, Cambridge 1987)
D. Herrmannova, P. Knoth: An analysis of the Microsoft Academic Graph, D-Lib Magazine (2016), https://doi.org/10.1045/september2016-herrmannova
S. Ribas, A. Ueda, R.L.T. Santos, B. Ribeiro-Neto, N. Ziviani: UFMG/LATIN at WSDM Cup 2016: Simplified relative citation ratio for static paper ranking. In: 9th ACM International Conference on Web Search and Data Mining (ACM, San Francisco 2016)
C. Freyman, J. Byrnes, T. Muezzinoglu: Knowledge flows: Linking big data sets. In: OECD Blue Sky III (Ghent 2016), http://de.slideshare.net/innovationoecd/110-freyman-knowledge-flows-linking-big-dataset
A. Breitzman, P. Thomas: The emerging clusters model: A tool for identifying emerging technologies across multiple patent systems, Res. Policy 44, 195–205 (2015)
K. Börner, C. Chen, K.W. Boyack: Visualizing knowledge domains, Annu. Rev. Inf. Sci. Technol. 37, 179–255 (2003)
H. Small: Update on science mapping: Creating large document spaces, Scientometrics 38(2), 275–293 (1997)
K.W. Boyack, R. Klavans: Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately?, J. Am. Soc. Inf. Sci. Technol. 61(12), 2389–2404 (2010)
L. Waltman, N.J. van Eck: A new methodology for constructing a publication-level classification system of science, J. Am. Soc. Inf. Sci. Technol. 63(12), 2378–2392 (2012)
N.J. van Eck, L. Waltman: How to normalize cooccurrence data? An analysis of some well-known similarity measures, J. Am. Soc. Inf. Sci. Technol. 60(8), 1635–1651 (2009)
K.W. Boyack, D. Newman, R.J. Duhon, R. Klavans, M. Patek, J.R. Biberstine, B. Schijvenaars, A. Skupin, N. Ma, K. Börner: Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches, PLoS ONE 6(3), e18029 (2011)
S. Fortunato: Community detection in graphs, Phys. Rep. 486(3–5), 75–174 (2010)
L. Šubelj, N.J. van Eck, L. Waltman: Clustering scientific publications based on citation relations: A systematic comparison of different methods, PLoS ONE 11(4), e154404 (2016)
M. Rosvall, C.T. Bergstrom: Maps of random walks on complex networks reveal community structure, Proc. Natl. Acad. Sci. USA 105(4), 1118–1123 (2008)
S. Bae, D. Halperin, J.D. West, M. Rosvall, B. Howe: Scalable and efficient flow-based community detection for large-scale graph analysis, ACM Trans. Knowl. Discov. Data 11(3), 32 (2017)
V.D. Blondel, J.-L. Guillaume, R. Lambiotte, E. Lefebvre: Fast unfolding of communities in large networks, J. Stat. Mech. Theory Exp. 10, P10008 (2008)
L. Waltman, N.J. van Eck: A smart local moving algorithm for large-scale modularity-based community detection, Eur. Phys. J. B 86, 471 (2013)
C. Chen, F. Ibekwe-SanJuan, J. Hou: The structure and dynamics of cocitation clusters: A multiple-perspective cocitation analysis, J. Am. Soc. Inf. Sci. Technol. 61(7), 1386–1409 (2010)
C. Chen: CiteSpace: A Practical Guide for Mapping Scientific Literature (Nova Science, New York 2016)
N.J. van Eck, L. Waltman: Software survey: VOSviewer, a computer program for bibliometric mapping, Scientometrics 84(2), 523–538 (2010)
M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: Science mapping software tools: Review, analysis, and cooperative study among tools, J. Am. Soc. Inf. Sci. Technol. 62(7), 1382–1402 (2011)
M.J. Cobo, A.G. Lopez-Herrera, E. Herrera-Viedma, F. Herrera: SciMAT: A new science mapping analysis software tool, J. Am. Soc. Inf. Sci. Technol. 63(8), 1609–1630 (2012)
T. Kamada, S. Kawai: An algorithm for drawing general undirected graphs, Inf. Process. Lett. 31, 7–15 (1988)
T.M.J. Fruchterman, E.M. Reingold: Graph drawing by force-directed placement, Softw. Pract. Exp. 21(11), 1129–1164 (1991)
S. Martin, W.M. Brown, R. Klavans, K.W. Boyack: OpenOrd: An open-source toolbox for large graph layout, Proc. SPIE Int. Soc. Opt. Eng. 7868, 786806 (2011)
A. Mrvar, V. Batagelj: Analysis and visualization of large networks with program package Pajek, Complex Adapt. Syst. Model. 4, 6 (2016)
M. Bastian, S. Heymann, M. Jacomy: Gephi: An open source software for exploring and manipulating networks. In: 3rd Int. AAAI Conf. Weblogs Soc. Media (2009)
K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 1, Inf. Process. Manag. 36(6), 779–808 (2000)
K. Sparck Jones, S. Walker, S.E. Robertson: A probabilistic model of information retrieval: Development and comparative experiments. Part 2, Inf. Process. Manag. 36(6), 809–840 (2000)
K.W. Boyack, R. Klavans: Creation of a highly detailed, dynamic, global model and map of science, J. Assoc. Inf. Sci. Technol. 65(4), 670–685 (2014)
K. Börner, R. Klavans, M. Patek, A.M. Zoss, J.R. Biberstine, R.P. Light, V. Lariviere, K.W. Boyack: Design and update of a classification system: The UCSD map of science, PLoS ONE 7(7), e39464 (2012)
K.W. Boyack: Using detailed maps of science to identify potential collaborations, Scientometrics 79(1), 27–44 (2009)
R. Klavans, K.W. Boyack: Toward an objective, reliable and accurate method for measuring research leadership, Scientometrics 82(3), 539–553 (2010)
J. Ruiz-Castillo, L. Waltman: Field-normalized citation impact indicators using algorithmically constructed classification systems of science, J. Informetr. 9, 102–117 (2015)
K.W. Boyack, R. Klavans: Including non-source items in a large-scale map of science: What difference does it make?, J. Informetr. 8, 569–580 (2014)
R. Klavans, K.W. Boyack: Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge?, J. Assoc. Inf. Sci. Technol. 68(4), 984–998 (2017)
J.G. Foster, A. Rzhetsky, J.A. Evans: Tradition and innovation in scientists' research strategies, Am. Soc. Rev. 80(5), 875–908 (2015)
F. Shi, J.G. Foster, J.A. Evans: Weaving the fabric of science: Dynamic network models of science's unfolding structure, Soc. Netw. 43, 73–85 (2015)
I. Wesley-Smith, C.T. Bergstrom, J.D. West: Static ranking of scholarly papers using article-level Eigenfactor (ALEF). In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)
J.D. West, I. Wesley-Smith, C.T. Bergstrom: A recommendation system based on hierarchical clustering of an article-level citation network, IEEE Trans. Big Data 2(2), 113–123 (2016)
K.W. Boyack, M. Patek, L.H. Ungar, P. Yoon, R. Klavans: Classification of individual articles from all of science by research level, J. Informetr. 8(1), 1–12 (2014)
K.W. Boyack, R. Klavans, H. Small, L. Ungar: Characterizing the emergence of two nanotechnology topics using a contemporaneous global micro-model of science, J. Eng. Technol. Manag. 32, 147–159 (2014)
L. Waltman, K.W. Boyack, G. Colavizza, N.J. van Eck: A principled approach for comparing relatedness measures for clustering publications. In: 16th Int. Conf. Int. Soc. Scientometr. Informetr., Wuhan (2017) pp. 691–702
M.-H. Feng, K.-H. Chan, H.-Y. Chen, M.-F. Tsai, M.-Y. Yeh, S.-D. Lin: An efficient solution to reinforce paper ranking using author/venue/citation information – The winner's solution for WSDM Cup 2016. In: 9th ACM Int. Conf. Web Search Data Mining (ACM, San Francisco 2016)
K.W. Boyack, R. Klavans: Measuring science-technology interaction using rare inventor-author names, J. Informetr. 2, 173–182 (2008)
S. Reardon: Text-mining offers clues to success, Nature 509, 410 (2014)
K.R. McKeown, H. Daume III, S. Chaturvedi, J. Paparrizos, K. Thadani, P. Barrio, O. Biran, S. Bothe, M. Collins, K.R. Fleischmann, L. Gravano, R. Jha, B. King, K. McInerney, T. Moon, A. Neelakantan, D. O'Seaghdha, D. Radev, C. Templeton, S. Teufel: Predicting the impact of scientific concepts using full text features, J. Assoc. Inf. Sci. Technol. 67(11), 2684–2696 (2016)
V.I. Torvik, N.R. Smalheiser: Author name disambiguation in MEDLINE, ACM Trans. Knowl. Discov. Data 3(3), 11–40 (2009)
G.-C. Li, R. Lai, A. D'Amour, D.M. Doolin, Y. Sun, V.I. Torvik, A.Z. Yu, L. Fleming: Disambiguation and co-authorship networks of the U.S. patent inventor database (1975–2010), Res. Policy 43, 941–955 (2014)
W. Liu, R.I. Dogan, S. Kim, D.C. Comeau, W. Kim, L. Yeganova, Z. Lu, W.J. Wilbur: Author name disambiguation for PubMed, J. Assoc. Inf. Sci. Technol. 65(4), 765–781 (2014)
C. Schulz, A. Mazloumian, A.M. Petersen, O. Penner, D. Helbing: Exploiting citation networks for large-scale author name disambiguation, EPJ Data Sci. 1, 11 (2014)
E. Caron, N.J. van Eck: Large scale author name disambiguation using rule-based scoring and clustering. In: Proc. Sci. Technol. Indic. Conf., Leiden (2014) pp. 79–86
B.D. Fegley, V.I. Torvik: Has large-scale named-entity network analysis been resting on a flawed assumption?, PLoS ONE 8, e70299 (2013)
B. Uzzi, S. Mukherjee, M. Stringer, B. Jones: Atypical combinations and scientific impact, Science 342, 468–472 (2013)
K.W. Boyack, R. Klavans: Atypical combinations are confounded by disciplinary effects. In: 19th Int. Conf. Sci. Technol. Indicat (2014)
V. Larivière, S. Haustein, K. Börner: Long-distance interdisciplinarity leads to higher scientific impact, PLoS ONE 10, e122565 (2015)
L. Kay, N. Newman, J. Youtie, A. Porter, I. Rafols: Patent overlay mapping: Visualizing technological distance, J. Assoc. Inf. Sci. Technol. 65(12), 2432–2443 (2014)
E.M. Talley, D. Newman, D. Mimno, B.W. Herr, H.M. Wallach, G.A.P.C. Burns, A.G.M. Leenders, A. McCallum: Database of NIH grants using machine-learned categories and graphical clustering, Nat. Methods 8(6), 443–444 (2011)
M. Bertin, I. Atanassova, Y. Gingras, V. Lariviere: The invariant distribution of references in scientific articles, J. Assoc. Inf. Sci. Technol. 67(1), 164–177 (2016)
K.W. Boyack, N.J. van Eck, G. Colavizza, L. Waltman: Characterizing in-text citations in scientific articles: A large-scale analysis, J. Informetr. 12(1), 59–73 (2018)
S. Emmons, S. Kobourov, M. Gallant, K. Börner: Analysis of network clustering algorithms and cluster quality metrics at scale, PLoS ONE 11(7), e0159161 (2016)
R. Klavans, K.W. Boyack: Research portfolio analysis and topic prominence, J. Informetr. 11(4), 1158–1174 (2017)
D.J. de Solla Price: Little Science, Big Science (Columbia Univ. Press, New York 1963)
K.W. Boyack: Thesaurus-based methods for mapping contents of publication sets, Scientometrics 111, 1141–1155 (2017)
R. Klavans, K.W. Boyack: The research focus of nations: Economic vs. altruistic motivations, PLoS ONE 12, e169383 (2017)
J.J. Franklin, R. Johnston: Co-citation bibliometric modeling as a tool for S&T policy and R&D management: Issues, applications, and developments. In: Handbook of Quantitative Studies of Science and Technology, ed. by A.F.J. van Raan (Elsevier, Amsterdam 1988) pp. 325–389
R. Klavans, K.W. Boyack: Mapping altruism, J. Informetr. 8, 431–447 (2014)


Author information

Authors and Affiliations

SciTech Strategies, Inc., Albuquerque, NM, USA
Kevin W. Boyack
SciTech Strategies, Inc., Wayne, PA, USA
Richard Klavans


Corresponding author

Correspondence to Kevin W. Boyack.

Editor information

Editors and Affiliations

ECOOM and Faculty of Economics and Business, KU Leuven, Leuven, Belgium
Wolfgang Glänzel
Amsterdam, The Netherlands
Henk F. Moed
Competence Center Policy – Industry – Innovation, Fraunhofer Institute for Systems and Innovation Research ISI, Karlsruhe, Germany
Ulrich Schmoch
Faculty of Science and Engineering, University of Wolverhampton, Wolverhampton, UK
Mike Thelwall


About this chapter

Cite this chapter

Boyack, K.W., Klavans, R. (2019). Creation and Analysis of Large-Scale Bibliometric Networks. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds) Springer Handbook of Science and Technology Indicators. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-02511-3_8


DOI: https://doi.org/10.1007/978-3-030-02511-3_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02510-6
Online ISBN: 978-3-030-02511-3
eBook Packages: Economics and Finance, Economics and Finance (R0)
