Abstract
In October 2005, the number of static web pages was estimated at about 20.3 billion. This figure is the product of the average number of pages per web server reported by three previous studies (200 pages) and the estimated number of web servers on the Internet (101.4 million). However, based on our analysis of the 8.5 billion web pages that we had crawled by October 2005, we estimate the total number of web pages to be 53.7 billion. The difference arises because the number of dynamic web pages has increased rapidly in recent years. We also analyzed the structure of the web using 3 billion of the 8.5 billion crawled pages. Our results indicate that the "CORE," the central component of the bow tie structure, has grown in recent years, especially in the Chinese and Japanese web.
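For readers unfamiliar with the bow tie structure mentioned above, the following is a minimal Python sketch of how a link graph can be split into a CORE (the largest strongly connected component), an IN part (pages that can reach the CORE), and an OUT part (pages reachable from the CORE). It is an illustration only, not the method used in the paper: the function names `reachable` and `bow_tie`, the toy edge list, and the catch-all `OTHER` bucket (tendrils and disconnected pages) are assumptions, and the brute-force component search would not scale to billions of pages.

```python
from collections import defaultdict, deque


def reachable(start_nodes, adjacency):
    """Return every node reachable from start_nodes by breadth-first search."""
    seen = set(start_nodes)
    queue = deque(start_nodes)
    while queue:
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph, given as (source, target) pairs, into bow tie parts."""
    forward = defaultdict(set)   # page -> pages it links to
    backward = defaultdict(set)  # page -> pages linking to it
    nodes = set()
    for src, dst in edges:
        forward[src].add(dst)
        backward[dst].add(src)
        nodes.update((src, dst))

    # Brute-force search for the largest strongly connected component:
    # the component of a node is the intersection of what it can reach
    # and what can reach it. Fine for a toy graph, too slow for a real crawl.
    core = set()
    assigned = set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        component = reachable([node], forward) & reachable([node], backward)
        assigned |= component
        if len(component) > len(core):
            core = component

    out_part = reachable(core, forward) - core   # reachable from the CORE
    in_part = reachable(core, backward) - core   # can reach the CORE
    other = nodes - core - in_part - out_part    # tendrils, disconnected pages
    return {"CORE": core, "IN": in_part, "OUT": out_part, "OTHER": other}


# Hypothetical toy graph: a, b, c form a cycle; i links in; o is linked out.
edges = [
    ("a", "b"), ("b", "c"), ("c", "a"),
    ("i", "a"),
    ("c", "o"),
    ("t", "o"),
    ("x", "y"),
]
result = bow_tie(edges)
for part in ("CORE", "IN", "OUT", "OTHER"):
    print(part, sorted(result[part]))
```

On a real crawl, a linear-time strongly connected components algorithm (for example Tarjan's or Kosaraju's) would replace the brute-force loop. The relative size of the CORE is then the number of CORE pages divided by the number of pages in the graph.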
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hirate, Y., Kato, S., Yamana, H. (2008). Web Structure in 2005. In: Aiello, W., Broder, A., Janssen, J., Milios, E. (eds) Algorithms and Models for the Web-Graph. WAW 2006. Lecture Notes in Computer Science, vol 4936. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78808-9_4
DOI: https://doi.org/10.1007/978-3-540-78808-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78807-2
Online ISBN: 978-3-540-78808-9
eBook Packages: Computer Science (R0)