Abstract
Deciding whether a Web page is relevant to a query or a topic is central to serving Web users, and clustering and classifying Web pages require similar decisions. Most existing work relies on positively related terms in one form or another. Once a topic is given, we suggest also using terms that are negative with respect to the topic in the relevance decision. We discuss a method that generates negative terms automatically using DMOZ, Google, and WordNet, and give formulas that decide relevance using these negative terms. Our experiments support the usefulness of negative terms for judging relevance to a topic. The approach also helps with the polysemy problem. Because negative terms can be generated automatically for any topic, this work may benefit many studies that aim to improve Web services.
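The abstract does not reproduce the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple score in which occurrences of positive terms raise relevance and occurrences of negative terms lower it, normalized by page length. The function names, the `penalty` parameter, the term lists, and the sample pages are all hypothetical, chosen to illustrate how negative terms can separate two senses of a polysemous word such as "jaguar".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def relevance(text, positive_terms, negative_terms, penalty=1.0):
    """Illustrative score (an assumption, not the paper's formula):
    (positive hits - penalty * negative hits) / number of tokens."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    pos = sum(counts[t] for t in positive_terms)
    neg = sum(counts[t] for t in negative_terms)
    return (pos - penalty * neg) / total


# Hypothetical term lists for the topic "mammals".
positive = {"jaguar", "cat", "prey", "forest"}
negative = {"car", "dealer", "engine"}

# Hypothetical pages that both mention "jaguar".
animal_page = "The jaguar is a large cat that hunts prey in the forest"
car_page = "The jaguar dealer sells a car with a powerful engine"

animal_score = relevance(animal_page, positive, negative)
car_score = relevance(car_page, positive, negative)
print(animal_score, car_score)
```

With positive terms alone, both pages would match on "jaguar"; under this sketch the negative terms push the car page's score below zero while the animal page stays positive.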
References
Attardi, G., Gulli, A., Sebastiani, F.: Automatic Web Page Categorization by Link and Context Analysis. In: Proc. of THAI 1999, European Symposium on Telematics, Hypermedia and Artificial Intelligence (1999)
Baeza-Yates, R., Ribeiro-Neto, B.: Modern Information Retrieval. Addison-Wesley, Reading (1999)
Kontostathis, A., Pottenger, W.M.: Improving Retrieval Performance with Positive and Negative Equivalence Classes of Terms, TR in Lehigh Univ. (2002)
Hoashi, K., Matsumoto, K., Inoue, N., Hashimoto, K.: Experiments on the TREC-8 Filtering Track. In: Proc. of SIGIR 2000 (2000)
Yu, C.T., Salton, G., Siu, M.K.: Effective Automatic Indexing Using Term Addition and Deletion. JACM 25 (1978)
Chakrabarti, S., van den Berg, M., Dom, B.: Focused crawling: A new approach to topic-specific Web resource discovery. In: Proc. of the 8th International WWW Conference (1999)
Chakrabarti, S., et al.: Automatic Resource Compilation by Analyzing Hyperlink Structure and Associated Text. In: Proc. of the 7th International WWW Conference (1998)
Eguchi, K.: Incremental query expansion using local information of clusters. In: Proc. of the 4th World Multiconference on Systems, Cybernetics and Informatics (2000)
Kleinberg, J.: Authoritative sources in a hyperlinked environment. In: Proc. of ACM-SIAM Symposium on Discrete Algorithms (1998)
Kim, S.: Improving the Performance of an Information Agent for a Specific Domain on the WWW. Master's thesis, Hongik Graduate School (2002)
Menczer, F., Pant, G., Ruiz, M.: Evaluating Topic-Driven Web Crawlers. In: Proc. of SIGIR 2001 (2001)
Menczer, F., Pant, G., Srinivasan, P.: Topical Web Crawlers: Evaluating Adaptive Algorithms. ACM Transactions on Internet Technology V (February 2003)
Miller, G.: WordNet: An online lexical database. International Journal of Lexicography 3 (1997)
Pant, G., Menczer, F.: MySpiders: Evolve Your Own Intelligent Web Crawlers. Autonomous Agents and Multi-Agent Systems 5 (2002)
[DMOZ], http://www.dmoz.org/Kids_and_Teens/School_Time/Science/Living_Thing/Animals/Mammals/
[Google], http://www.google.com/
[Wordnet], http://www.cogsci.princeton.edu/~wn/
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Byun, YT., Choi, YH., Lee, KC. (2005). Automatic Generation and Use of Negative Terms to Evaluate Topic-Related Web Pages. In: Shimojo, S., Ichii, S., Ling, TW., Song, KH. (eds) Web and Communication Technologies and Internet-Related Social Issues - HSI 2005. HSI 2005. Lecture Notes in Computer Science, vol 3597. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527725_23
DOI: https://doi.org/10.1007/11527725_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27830-6
Online ISBN: 978-3-540-31808-8
eBook Packages: Computer Science, Computer Science (R0)