Abstract
This paper presents a customizable architecture for software agents that capture and access information in large, heterogeneous, distributed electronic repositories. The key idea is to exploit underlying structure at various levels of granularity to build high-level indices with task-specific interpretations. Information agents construct such indices and are configured as a network of reusable modules called structure detectors and segmenters. We illustrate our architecture with the design and implementation of smart information filters in two contexts: retrieving stock market data from Internet newsgroups, and retrieving technical reports from Internet ftp sites.
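The idea of configuring an agent as a network of segmenters and structure detectors can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names the two kinds of modules but does not give their interfaces. The `Agent` class, the function names, and the "TICKER PRICE" row format are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interfaces: the abstract names "segmenters" and
# "structure detectors" but does not specify their signatures.
Segmenter = Callable[[str], List[str]]
Detector = Callable[[str], bool]


def blank_line_segmenter(text: str) -> List[str]:
    """Split a document into blocks separated by blank lines."""
    return [block.strip() for block in re.split(r"\n\s*\n", text) if block.strip()]


# Assumed row shape, for illustration only: a ticker symbol and a price.
QUOTE_ROW = re.compile(r"^[A-Z]{1,5}\s+\d+(\.\d+)?$")


def quote_table_detector(block: str) -> bool:
    """Return True when every line of the block looks like 'TICKER PRICE'."""
    lines = block.splitlines()
    return bool(lines) and all(QUOTE_ROW.match(line.strip()) for line in lines)


@dataclass
class Agent:
    """A segmenter feeding a set of labelled structure detectors."""
    segmenter: Segmenter
    detectors: Dict[str, Detector]

    def index(self, text: str) -> Dict[str, List[str]]:
        """Map each detector label to the segments that detector accepts."""
        result: Dict[str, List[str]] = {label: [] for label in self.detectors}
        for segment in self.segmenter(text):
            for label, detect in self.detectors.items():
                if detect(segment):
                    result[label].append(segment)
        return result


def build_quote_index(text: str) -> Dict[str, List[str]]:
    """Configure an agent for the (assumed) stock-quote filtering task."""
    agent = Agent(blank_line_segmenter, {"quote_table": quote_table_detector})
    return agent.index(text)
```

Swapping in a different segmenter or adding detectors to the dictionary reconfigures the agent for another task without changing `Agent` itself, which is the kind of reuse the abstract describes.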
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Rus, D., Subramanian, D. (1997). Information retrieval, information structure, and information agents. In: Nicholas, C., Mayfield, J. (eds) Intelligent Hypertext. WIH WIH 1994 1993. Lecture Notes in Computer Science, vol 1326. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0023964
DOI: https://doi.org/10.1007/BFb0023964
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63637-3
Online ISBN: 978-3-540-69622-3
eBook Packages: Springer Book Archive