Structure-Based Queries over the World Wide Web

  • Conference paper
Conceptual Modeling – ER ’98 (ER 1998)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1507)

Abstract

With the increasing importance of the World Wide Web as an information repository, locating documents of interest becomes more and more significant. The current practice is to send keywords to search engines. However, these search engines cannot take the structure of the Web into consideration. We thus present a novel query language, NetQL, and its implementation for accessing the World Wide Web. Rather than performing global full-text search, NetQL is designed for local structure-based queries. It not only exploits the topology of web pages given by hyperlinks, but also supports queries involving information inside pages. A novel approach to extracting information from web pages is presented. In addition, methods for controlling the complexity of query processing are addressed in this paper.
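
To make the idea of a local structure-based query concrete, the following is a minimal Python sketch, not NetQL syntax and not the authors' implementation. It assumes a hypothetical in-memory table `PAGES` that maps each URL to its text and outgoing links, and a hypothetical function `local_query` that starts from one page, follows hyperlinks for a bounded number of hops, and keeps pages whose text contains a keyword. The `max_depth` and `max_pages` parameters are illustrative stand-ins for the kind of limits that keep query processing bounded; the paper's actual control methods are not described in the abstract.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
# A real system would fetch and parse pages instead.
PAGES = {
    "http://example.org/": (
        "Department home page",
        ["http://example.org/people", "http://example.org/research"],
    ),
    "http://example.org/people": (
        "Faculty and staff directory",
        ["http://example.org/people/smith"],
    ),
    "http://example.org/research": (
        "Research on web query languages",
        ["http://example.org/"],
    ),
    "http://example.org/people/smith": (
        "Smith works on query languages",
        [],
    ),
}


def local_query(start, keyword, max_depth=2, max_pages=100, pages=PAGES):
    """Breadth-first traversal from `start` along hyperlinks.

    Follows links at most `max_depth` hops, visits at most `max_pages`
    pages, and returns (url, depth) for every visited page whose text
    contains `keyword` (case-insensitive).
    """
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    visited = 0
    while queue and visited < max_pages:
        url, depth = queue.popleft()
        if url not in pages:
            continue  # dangling link: nothing to inspect
        visited += 1
        text, links = pages[url]
        if keyword.lower() in text.lower():
            hits.append((url, depth))
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return hits


if __name__ == "__main__":
    for url, depth in local_query("http://example.org/", "query"):
        print(depth, url)
```

Lowering `max_depth` narrows the query to the immediate neighbourhood of the start page, which is the sense in which such a query is local rather than a global search over an index.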




Copyright information

© 1998 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Guan, T., Liu, M., Saxton, L.V. (1998). Structure-Based Queries over the World Wide Web. In: Ling, TW., Ram, S., Li Lee, M. (eds) Conceptual Modeling – ER ’98. ER 1998. Lecture Notes in Computer Science, vol 1507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49524-6_9

  • DOI: https://doi.org/10.1007/978-3-540-49524-6_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-65189-5

  • Online ISBN: 978-3-540-49524-6

  • eBook Packages: Springer Book Archive
