Volume 102, Issue 3, pp 2223–2245

Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation

  • Michel Zitt


In this position paper, we comment on various approaches to the delineation of scientific fields or domains, a typical prerequisite for a wide class of bibliometric studies. There is growing evidence that this meso level, which lies between the micro targets of typical IR and the large disciplines handled by macro-level bibliometric studies, takes full advantage of hybrid approaches. Firstly, delineation tasks benefit from combining the a priori thinking of traditional IR, which typically involves clearly targeted expectations, with the a posteriori thinking of bibliometric mapping, where decisions build on an external structuring of the domain in a wider context. The combination of these two ways of thinking is far from new: IR increasingly builds on bibliometric networks for query expansion, and bibliometrics builds on IR for evaluating and refining its outcomes. Secondly, delineation benefits from the multi-network perspective, which gives different representations of scientific topics; these representations usually converge all the more when the objects are dense and well separated. Focusing on two basic networks, words and citations, we discuss various sequences or combinations of operations. Bibliometrics and IR, especially when properly combined in multi-network approaches, provide an efficient toolbox for studies of domain delimitation. It should be recalled, however, that the context of such studies is often loaded with policy stakes that call for cautious supervision and consultation processes.


Bibliometrics, Information retrieval, Science mapping, Field delineation, Hybrid textual-citation techniques, Query expansion



The author thanks Alain Lelu, Université de Franche-Comté and Loria, Nancy, Elise Bassecoulard, formerly Inra-Lereco, and anonymous referees for helpful remarks, and Patricia Laurens and Antoine Schoen, ESIEE, Marne la Vallée, for permission to use the genomics map from our previous joint work.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2014

Authors and Affiliations

  1. Lereco U1134, SAE2 Department, INRA, Nantes Cedex 03, France
