Flexible Category Structure for Supporting WWW Retrieval

Takata, Yoshiaki; Nakagawa, Kokoro; Seki, Hiroyuki

doi:10.1007/3-540-45394-6_15

Conference paper
First Online: 14 December 2001

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1921)

Abstract

A method for supporting WWW retrieval by constructing a flexible category structure adaptable to the user's search intention is proposed. The method uses categorization viewpoints as a priori knowledge, where a categorization viewpoint is a finite set of consistent category names. A set of documents retrieved by initial keywords is decomposed by categorization viewpoints and each decomposition is scored by clearness or entropy. The user selects an appropriate decomposition by considering the score. The decomposition is recursively performed until a category structure of reasonable size is obtained. Experimental results show that the sets of documents decomposed by the proposed method have higher precision than those decomposed by clustering (K-means). It is also shown that both the scores based on clearness and entropy of the decomposition have relatively high correlation with the precision.
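
The decompose-and-score step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the membership rule or the scoring formulas, so the sketch assumes that a document belongs to a category when it contains the category name, and it scores a decomposition by the normalized Shannon entropy of the category sizes. Treating a more even split as better is also an assumption made only for illustration. The names `decompose`, `entropy_score`, `rank_viewpoints`, and the sample documents and viewpoints are hypothetical.

```python
import math


def decompose(docs, viewpoint):
    """Split docs into one bucket per category name in the viewpoint.

    Assumption (not from the paper): a document belongs to a category if
    its text contains the category name; unmatched documents go to "other".
    """
    buckets = {name: [] for name in viewpoint}
    buckets["other"] = []
    for doc in docs:
        text = doc.lower()
        matched = [name for name in viewpoint if name in text]
        if matched:
            for name in matched:
                buckets[name].append(doc)
        else:
            buckets["other"].append(doc)
    return buckets


def entropy_score(buckets):
    """Normalized Shannon entropy of the non-empty bucket sizes, in [0, 1].

    Returns 0.0 when fewer than two buckets are non-empty, i.e. when the
    viewpoint does not actually split the document set.
    """
    sizes = [len(bucket) for bucket in buckets.values() if bucket]
    total = sum(sizes)
    if total == 0 or len(sizes) < 2:
        return 0.0
    h = -sum((s / total) * math.log2(s / total) for s in sizes)
    return h / math.log2(len(sizes))


def rank_viewpoints(docs, viewpoints):
    """Score every viewpoint and list them from highest to lowest score.

    Assumption: a higher (more even) score is shown first; the final
    choice is left to the user, as in the abstract.
    """
    scored = [
        (name, entropy_score(decompose(docs, categories)))
        for name, categories in viewpoints.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical documents retrieved by some initial keywords.
docs = [
    "hotel guide for tokyo",
    "tokyo restaurant reviews",
    "osaka hotel booking",
    "osaka restaurant map",
]

# Hypothetical categorization viewpoints: each is a set of category names.
viewpoints = {
    "city": ["tokyo", "osaka"],
    "facility": ["hotel", "restaurant"],
    "season": ["summer", "winter"],
}

ranked = rank_viewpoints(docs, viewpoints)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

In this toy data the "season" viewpoint matches no document, so every document falls into "other" and the viewpoint scores zero, while "city" and "facility" each split the set evenly. Applying `decompose` again to each resulting bucket with a different viewpoint would give the recursive refinement that the abstract mentions.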

Author information

Authors and Affiliations

Graduate School of Information Science, Nara Institute of Science and Technology, Japan
Yoshiaki Takata, Kokoro Nakagawa & Hiroyuki Seki

Editor information

Editors and Affiliations

School of Accountancy and Information Systems, Brigham Young University Marriott School, 585 TNRB, P.O. Box 23087, 84602-3087, Provo, Utah, USA
Stephen W. Liddle
Institute for Business Informatics and Application Systems, University of Klagenfurt, Universitätsstr. 65-67, 9020, Klagenfurt, Austria
Heinrich C. Mayr
Computer Science Institute, Brandenburg University of Technology at Cottbus, Postfach 101344, 03013, Cottbus, Germany
Bernhard Thalheim

About this paper

Cite this paper

Takata, Y., Nakagawa, K., Seki, H. (2000). Flexible Category Structure for Supporting WWW Retrieval. In: Liddle, S.W., Mayr, H.C., Thalheim, B. (eds) Conceptual Modeling for E-Business and the Web. ER 2000. Lecture Notes in Computer Science, vol 1921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45394-6_15

DOI: https://doi.org/10.1007/3-540-45394-6_15
Published: 14 December 2001
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41073-7
Online ISBN: 978-3-540-45394-9
eBook Packages: Springer Book Archive
