Methodological Issues of Webometric Studies

  • Peter Ingwersen
  • Lennart Björneborn

Abstract

This contribution defines webometrics within the framework of informetric studies, bibliometrics, and scientometrics, as belonging to library and information science and as associated with cybermetrics as a generic sub-field. It outlines a consistent and detailed link typology and terminology, and makes explicit the distinction between Web node levels when using the proposed terminological structures. The contribution then presents the meaning, methodology, and problematic issues of the central webometric analysis types, i.e., Web engine and crawler coverage, quality, and sampling issues. It briefly discusses the Web Impact Factor and other link analyses. Finally, it looks into log studies of human Web interaction.

Keywords

Search Engine, Internet Protocol Address, Commercial Search Engine, Engine Coverage, Informetric Study
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Kluwer Academic Publishers 2004

Authors and Affiliations

  • Peter Ingwersen (1)
  • Lennart Björneborn (1)
  1. Department of Information Studies, Royal School of Library and Information Science, Copenhagen, Denmark