Abstract
This chapter focuses on the drivers behind advances in the mapping of science and the detection of topics, as commonly applied in scientometrics. It identifies three such drivers: technological innovation resulting in increased computational power, the improved community detection approaches available today, and advances within scientometrics itself in the actual linking of documents through citations or lexical approaches. We show that the first two are the main drivers, with the last one lagging somewhat behind. Next, we describe two severe methodological issues that network science has identified in the application of these community detection techniques: the resolution limit and the degeneracy problem. The last section shows how scientometricians have taken different approaches to creating global maps of science, and how these approaches arrive at comparable results at coarser levels of granularity. The validity of more fine-grained clusters and topics, however, suffers strongly from the problems discussed, which raises serious questions about the applicability of these global techniques to analyses with a strong local focus.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Thijs, B. (2019). Science Mapping and the Identification of Topics: Theoretical and Methodological Considerations. In: Glänzel, W., Moed, H.F., Schmoch, U., Thelwall, M. (eds) Springer Handbook of Science and Technology Indicators. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-02511-3_9
DOI: https://doi.org/10.1007/978-3-030-02511-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-02510-6
Online ISBN: 978-3-030-02511-3
eBook Packages: Economics and Finance (R0)