Web Information Resource Discovery: Past, Present, and Future

  • Gultekin Ozsoyoglu
  • Abdullah Al-Hamdani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2869)


In a span of twelve years, the World Wide Web, only a computer and an internet connection away from anybody anywhere, and full of abundant, diverse, and sometimes incorrect, redundant, spam, or otherwise bad information, has become the major information repository for the masses and the world. The web is becoming all things to all people, oblivious to national and continental boundaries, promising mostly free information to all, and quickly growing into a repository in all languages and all cultures. With large digital libraries and increasingly significant educational resources, the web is becoming an equalizer, a balancing force, and an opportunity for all, especially for underdeveloped and developing countries. The web is both exciting and overwhelming, changing the way the world communicates, from the way businesses are conducted to the way the masses are educated, and from the way research is performed to the way research results are disseminated. It is fair to say that the web will only get larger, more diverse, and more chaotic in the near future.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Gultekin Ozsoyoglu (1)
  • Abdullah Al-Hamdani (1)
  1. Dept of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland
