Abstract
We present a lightweight text filtering algorithm intended for use with personal Web information agents. Fast response and low resource usage were the key design criteria, in order to allow the algorithm to run on the client side. The algorithm learns adaptive queries and dissemination thresholds for each topic of interest in its user profile. We describe a factorial experiment used to test the robustness of the algorithm under different learning parameters and, more importantly, under limited training feedback. The experiment borrows from standard practice in TREC by using TREC-5 data to simulate a user reading and categorizing documents. Results indicate that the algorithm is capable of achieving good filtering performance, even with little user feedback.
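To make the idea of a per-topic adaptive query and dissemination threshold concrete, the following is a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the update rules, so this sketch assumes a Rocchio-style query update and a simple rule that nudges the threshold toward the score of any misjudged document. The names TopicProfile, query_rate, and threshold_rate, and all default values, are hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn text into a unit-length term-frequency vector (term -> weight)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


class TopicProfile:
    """One topic of interest: a query vector plus a dissemination threshold."""

    def __init__(self, seed_text, threshold=0.2,
                 query_rate=0.3, threshold_rate=0.1):
        self.query = vectorize(seed_text)
        self.threshold = threshold
        self.query_rate = query_rate          # assumed learning rate
        self.threshold_rate = threshold_rate  # assumed learning rate

    def score(self, text):
        return cosine(self.query, vectorize(text))

    def disseminate(self, text):
        """Show the document to the user only if it clears the threshold."""
        return self.score(text) >= self.threshold

    def feedback(self, text, relevant):
        """Adapt query and threshold from one relevance judgment."""
        doc = vectorize(text)
        score = cosine(self.query, doc)
        shown = score >= self.threshold

        # Assumed threshold rule: on a miss or a false alarm, move the
        # threshold a fraction of the way toward the document's score.
        if shown != relevant:
            self.threshold += self.threshold_rate * (score - self.threshold)

        # Rocchio-style query update: move toward relevant documents and
        # away from non-relevant ones, keeping only positive weights.
        sign = 1.0 if relevant else -1.0
        for term, weight in doc.items():
            updated = self.query.get(term, 0.0) + sign * self.query_rate * weight
            if updated > 0.0:
                self.query[term] = updated
            else:
                self.query.pop(term, None)
```

In this sketch a profile is seeded with a few topic words, disseminate decides whether a document is shown, and each call to feedback consumes a single user judgment, which mirrors the limited-feedback setting the abstract describes. Only a dictionary of term weights and one number are stored per topic, in keeping with the low-resource design goal.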
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Somlo, G.L., Howe, A.E. (2001). Adaptive Lightweight Text Filtering. In: Hoffmann, F., Hand, D.J., Adams, N., Fisher, D., Guimaraes, G. (eds) Advances in Intelligent Data Analysis. IDA 2001. Lecture Notes in Computer Science, vol 2189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44816-0_32
DOI: https://doi.org/10.1007/3-540-44816-0_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42581-6
Online ISBN: 978-3-540-44816-7
eBook Packages: Springer Book Archive