Basic WWW Technologies

doi:10.1007/978-3-540-77469-3_1

Part of the book series: Studies in Computational Intelligence (SCI, volume 99)

The Internet search service Google recently announced that its indexes span an unprecedented 8 billion Web documents. The total includes a few billion Web pages, a few hundred million images and about a decade’s worth - some 1 billion - of newsgroup messages. Promoting technology that scours this vast collection in less than a second, Google says that obtaining a similar result by hand would take a human searcher 15180 years, assuming someone could be found to scrutinize one document every minute, twenty-four hours a day. If we put 15180 people to this task, it could be done in one year. To do it in a day would take about 5.6 million people; to do it in a minute, about 8 billion, one person per document; and to do it in a second, as Google does, some 480 billion, far more people than there are on the face of the earth. As stark as this illustration may be, it points to the burgeoning scale of Web-based information resources. In a few short years, the Web has become our most compelling technological accomplishment. It’s the world’s largest library and telephone network, the world’s largest jukebox and photo album. It is global, and at the same time perfectly local. It’s the fastest growing enterprise on Earth, and yet no stockbroker, media pundit or taxman can tell us who’s actually in charge. Social scientist Ithiel de Sola Pool [27], in the posthumous 1990 collection “Technologies Without Boundaries,” called the Internet “part of the largest machine that man has ever constructed.” A dozen years later, de Sola Pool appears eloquent in his understatement. As we learn to use this machine for more than merely searching an impending googolplex of indexed terms, our ability to repurpose, aggregate, and cross-reference individual data inexpensively offers new impetus to economic and societal velocity.
Where today the Web presents a loosely coupled periodical file or a waste-bin of one-way screeds, we are discovering ways to unearth semantic clues that yield new knowledge, fresh applications and, perhaps, astounding insights. For enterprises doing business on the Web, the network is also becoming the world’s largest cash register, a development that will only accelerate as we find new ways to mine value from the content that institutions and individuals place there. Information architects use the term Web mining to describe the process of locating and extracting usable information from Web-based repositories on a repeatable, continuous basis. In this book, we describe this new discipline and explain “hubs” and “authorities” on the Web and Google’s Web ranking, with the aim of mining the Web for information more effectively and of helping users better understand their needs.
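
The scaling illustration in the opening paragraph is plain arithmetic and can be checked in a few lines. The sketch below is a minimal one, assuming exactly 8 billion documents and one document per searcher per minute (both taken from the text) and a 365-day year (our own simplification); the constant and function names are illustrative only.

```python
# Assumptions: the document count and the one-document-per-minute rate come
# from the text; the 365-day year and all names here are illustrative choices.
DOCUMENTS = 8_000_000_000
SECONDS_PER_DOCUMENT = 60  # one document per searcher per minute

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def years_for_one_person() -> float:
    """Years a single searcher needs, working around the clock."""
    return DOCUMENTS * SECONDS_PER_DOCUMENT / SECONDS_PER_YEAR


def people_needed(deadline_seconds: int) -> float:
    """Searchers required to cover every document within the deadline."""
    return DOCUMENTS * SECONDS_PER_DOCUMENT / deadline_seconds


if __name__ == "__main__":
    print(f"one searcher: {years_for_one_person():,.0f} years")
    deadlines = [
        ("one year", SECONDS_PER_YEAR),
        ("one day", SECONDS_PER_DAY),
        ("one minute", 60),
        ("one second", 1),
    ]
    for label, seconds in deadlines:
        print(f"{label}: {people_needed(seconds):,.0f} people")
```

Under these assumptions a single searcher needs roughly 15,200 years, close to the quoted 15180 (8 billion is a rounded count), and the required headcount grows in inverse proportion to the deadline.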

Cite this chapter

(2008). Basic WWW Technologies. In: Search Engines, Link Analysis, and User's Web Behavior. Studies in Computational Intelligence, vol 99. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77469-3_1

DOI: https://doi.org/10.1007/978-3-540-77469-3_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77468-6
Online ISBN: 978-3-540-77469-3
eBook Packages: Engineering, Engineering (R0)
