Flexible Category Structure for Supporting WWW Retrieval

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1921))

Abstract

This paper proposes a method for supporting WWW retrieval by constructing a flexible category structure that adapts to the user's search intention. The method uses categorization viewpoints as a priori knowledge, where a categorization viewpoint is a finite set of consistent category names. The set of documents retrieved by the initial keywords is decomposed by each categorization viewpoint, and each decomposition is scored by clearness or entropy. The user selects an appropriate decomposition by considering the scores. Decomposition is performed recursively until a category structure of reasonable size is obtained. Experimental results show that the document sets decomposed by the proposed method have higher precision than those decomposed by clustering (K-means). The results also show that both the clearness-based and the entropy-based scores of a decomposition correlate relatively strongly with precision.
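The decomposition-and-scoring step described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's definitions of the clearness or entropy scores, nor how documents are assigned to categories, so everything below is an assumption made for illustration: documents are matched to a category name by simple substring search, unmatched documents go to a hypothetical `(other)` bucket, and the score is the Shannon entropy of the resulting group sizes. The function names `decompose`, `entropy`, and `score_viewpoints` are likewise hypothetical.

```python
import math


def decompose(documents, viewpoint):
    """Split documents by a viewpoint (a list of category names).

    Assumption: a document belongs to the first category name it contains
    (case-insensitive substring match); otherwise it goes to "(other)".
    """
    groups = {name: [] for name in viewpoint}
    groups["(other)"] = []
    for doc in documents:
        text = doc.lower()
        for name in viewpoint:
            if name.lower() in text:
                groups[name].append(doc)
                break
        else:
            groups["(other)"].append(doc)
    return groups


def entropy(groups):
    """Shannon entropy (in bits) of the distribution of group sizes."""
    total = sum(len(docs) for docs in groups.values())
    if total == 0:
        return 0.0
    h = 0.0
    for docs in groups.values():
        if docs:
            p = len(docs) / total
            h -= p * math.log2(p)
    return h


def score_viewpoints(documents, viewpoints):
    """Return {viewpoint label: entropy score} for the user to inspect."""
    return {
        label: entropy(decompose(documents, names))
        for label, names in viewpoints.items()
    }


if __name__ == "__main__":
    docs = ["Tokyo hotel guide", "Osaka hotel prices", "Tokyo restaurant list"]
    viewpoints = {
        "city": ["Tokyo", "Osaka"],
        "facility": ["hotel", "restaurant"],
    }
    for label, score in score_viewpoints(docs, viewpoints).items():
        print(label, round(score, 3))
```

In an interactive setting, the user would pick one viewpoint from the scored list, and the same procedure would then be applied to each resulting group to build the next level of the category structure.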

Copyright information

© 2000 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Takata, Y., Nakagawa, K., Seki, H. (2000). Flexible Category Structure for Supporting WWW Retrieval. In: Liddle, S.W., Mayr, H.C., Thalheim, B. (eds) Conceptual Modeling for E-Business and the Web. ER 2000. Lecture Notes in Computer Science, vol 1921. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45394-6_15

  • DOI: https://doi.org/10.1007/3-540-45394-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41073-7

  • Online ISBN: 978-3-540-45394-9

  • eBook Packages: Springer Book Archive
