Abstract
With the growing importance of the World Wide Web as an information repository, locating documents of interest has become increasingly important. The current practice is to send keywords to search engines, but these engines cannot take the structure of the Web into account. We therefore present a novel query language, NetQL, and its implementation for accessing the World Wide Web. Rather than performing global full-text search, NetQL is designed for local structure-based queries. It not only exploits the topology of web pages given by hyperlinks, but also supports queries over information inside pages. We present a novel approach to extracting information from web pages and also address methods for controlling the complexity of query processing.
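To make the idea of a local structure-based query concrete, the following is a minimal sketch in Python, not NetQL itself: the abstract does not give NetQL's syntax or algorithms, so everything here is an assumption made for illustration. The `PAGES` dictionary is a hypothetical in-memory stand-in for a few linked pages, `local_query` is a hypothetical helper, and the `max_depth` and `max_pages` bounds are one plausible way to keep the cost of a query under control.

```python
from collections import deque

# Hypothetical stand-in for a small neighbourhood of the Web:
# URL -> (page text, outgoing hyperlinks). None of these pages are real.
PAGES = {
    "http://example.org/": (
        "Department home page",
        ["http://example.org/people", "http://example.org/research"],
    ),
    "http://example.org/people": (
        "Faculty list: databases, networks",
        ["http://example.org/people/guan"],
    ),
    "http://example.org/research": ("Research on web query languages", []),
    "http://example.org/people/guan": (
        "Home page: web query languages",
        ["http://example.org/"],
    ),
}


def local_query(start, keyword, max_depth, max_pages=100):
    """Breadth-first walk from `start` along hyperlinks.

    Follows links at most `max_depth` hops from `start`, inspects at most
    `max_pages` pages, and returns the URLs whose text contains `keyword`.
    """
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    visited = 0
    while queue and visited < max_pages:
        url, depth = queue.popleft()
        page = PAGES.get(url)
        if page is None:
            continue  # dangling link: nothing to inspect
        visited += 1
        text, links = page
        # Content condition: look inside the page, not only at the topology.
        if keyword.lower() in text.lower():
            hits.append(url)
        # Structure condition: only expand links within the depth bound.
        if depth < max_depth:
            for link in links:
                if link not in seen:
                    seen.add(link)
                    queue.append((link, depth + 1))
    return hits


results = local_query("http://example.org/", "query languages", max_depth=2)
```

Lowering `max_depth` or `max_pages` shrinks the region of the link graph that is explored, which is the sense in which such a query is local rather than global.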
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guan, T., Liu, M., Saxton, L.V. (1998). Structure-Based Queries over the World Wide Web. In: Ling, TW., Ram, S., Li Lee, M. (eds) Conceptual Modeling – ER ’98. ER 1998. Lecture Notes in Computer Science, vol 1507. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-49524-6_9
DOI: https://doi.org/10.1007/978-3-540-49524-6_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65189-5
Online ISBN: 978-3-540-49524-6
eBook Packages: Springer Book Archive