Abstract
Most people store “bookmarks” to web pages. These allow the user to return to a web page later on, without having to remember the exact URL address. People attempt to organise their bookmark databases by filing bookmarks under categories, themselves arranged in a hierarchical fashion. As the maintenance of such large repositories is difficult and time-consuming, a tool that automatically categorises bookmarks is required. This paper investigates how rough set theory can help extract information out of this domain, for use in an experimental automatic bookmark classification system. In particular, work on rough set dependency degrees is applied to reduce the otherwise high dimensionality of the feature patterns used to characterize bookmarks. A comparison is made between this approach to data reduction and a conventional entropy-based approach.
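To make the notion of a rough set dependency degree concrete, the following is a minimal Python sketch, not the system described in the paper. It assumes the standard definition γ_P(Q) = |POS_P(Q)| / |U|: the fraction of objects whose equivalence class under the conditional attributes P maps to a single decision value Q. The toy bookmark table, the keyword attribute names, and the helper names (`partition`, `dependency_degree`, `greedy_reduct`) are all hypothetical, and the greedy forward selection is only one simple way of using the measure to drop attributes.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def dependency_degree(rows, cond_attrs, decision_attr):
    """Return gamma = |positive region| / |universe| for the given attributes."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond_attrs):
        # A block lies in the positive region if all its rows share one decision.
        decisions = {rows[i][decision_attr] for i in block}
        if len(decisions) == 1:
            positive += len(block)
    return positive / len(rows)


def greedy_reduct(rows, cond_attrs, decision_attr):
    """Greedily add the attribute that most increases the dependency degree
    until it matches the degree obtained with all conditional attributes."""
    target = dependency_degree(rows, cond_attrs, decision_attr)
    chosen, best = [], 0.0
    while best < target:
        candidate, candidate_score = None, best
        for a in cond_attrs:
            if a in chosen:
                continue
            score = dependency_degree(rows, chosen + [a], decision_attr)
            if score > candidate_score:
                candidate, candidate_score = a, score
        if candidate is None:  # no attribute improves the degree further
            break
        chosen.append(candidate)
        best = candidate_score
    return chosen


# Hypothetical toy data: keyword presence flags per bookmark plus its category.
bookmarks = [
    {"python": 1, "news": 0, "recipe": 0, "category": "programming"},
    {"python": 1, "news": 1, "recipe": 0, "category": "programming"},
    {"python": 0, "news": 1, "recipe": 0, "category": "news"},
    {"python": 0, "news": 0, "recipe": 1, "category": "cooking"},
    {"python": 0, "news": 1, "recipe": 0, "category": "news"},
]
keywords = ["python", "news", "recipe"]

full_degree = dependency_degree(bookmarks, keywords, "category")
reduct = greedy_reduct(bookmarks, keywords, "category")
reduct_degree = dependency_degree(bookmarks, reduct, "category")
print(full_degree, reduct, reduct_degree)
```

In this toy table the selected subset is smaller than the full keyword set while keeping the same dependency degree, which is the sense in which the measure supports dimensionality reduction. An entropy-based selector would rank attributes by information gain instead.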
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jensen, R., Shen, Q. (2001). A Rough Set-Aided System for Sorting WWW Bookmarks. In: Zhong, N., Yao, Y., Liu, J., Ohsuga, S. (eds) Web Intelligence: Research and Development. WI 2001. Lecture Notes in Computer Science, vol 2198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45490-X_10
DOI: https://doi.org/10.1007/3-540-45490-X_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42730-8
Online ISBN: 978-3-540-45490-8
eBook Packages: Springer Book Archive