Information retrieval, information structure, and information agents

  • Daniela Rus
  • Devika Subramanian
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1326)


This paper presents a customizable architecture for software agents that capture and access information in large, heterogeneous, distributed electronic repositories. The key idea is to exploit underlying structure at various levels of granularity to build high-level indices with task-specific interpretations. Information agents construct such indices and are configured as a network of reusable modules called structure detectors and segmenters. We illustrate our architecture with the design and implementation of smart information filters in two contexts: retrieving stock market data from Internet newsgroups, and retrieving technical reports from Internet ftp sites.
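The composition of segmenters and structure detectors described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the blank-line segmentation rule, and the "ticker followed by a decimal price" row pattern are all assumptions chosen for illustration, loosely modelled on the stock-data filtering context mentioned in the abstract.

```python
import re
from typing import Callable, List

# Assumed interfaces (not from the paper): a segmenter splits raw text into
# candidate blocks, and a structure detector decides whether a block has
# the structure of interest.
Segmenter = Callable[[str], List[str]]
Detector = Callable[[str], bool]


def blank_line_segmenter(text: str) -> List[str]:
    """Split text into blocks separated by one or more blank lines."""
    blocks = re.split(r"\n\s*\n", text)
    return [block.strip() for block in blocks if block.strip()]


# Hypothetical row shape: an upper-case ticker symbol, then a decimal price.
_QUOTE_ROW = re.compile(r"^[A-Z]{1,5}\s+\d+\.\d+$")


def quote_table_detector(segment: str) -> bool:
    """Return True when every line of the segment looks like a quote row."""
    lines = segment.splitlines()
    return bool(lines) and all(_QUOTE_ROW.match(line.strip()) for line in lines)


def build_filter(segmenter: Segmenter, detector: Detector) -> Callable[[str], List[str]]:
    """Compose one segmenter and one detector into a simple information filter."""
    def run(text: str) -> List[str]:
        return [segment for segment in segmenter(text) if detector(segment)]
    return run


# A made-up newsgroup post used only to exercise the filter.
post = "Anyone following tech stocks today?\n\nIBM 104.25\nDEC 38.50\n\nThanks,\nPat"
stock_filter = build_filter(blank_line_segmenter, quote_table_detector)
matches = stock_filter(post)
```

Because the filter is just a composition of two functions, swapping in a different segmenter or detector yields a filter for another task without changing the rest of the code, which is the kind of reuse the abstract attributes to the modular design.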


Keywords: Mobile Robot, Information Retrieval, Regular Expression, White Space, Structure Detector





Copyright information

© Springer-Verlag Berlin Heidelberg 1997

Authors and Affiliations

  • Daniela Rus (1)
  • Devika Subramanian (2)
  1. Department of Computer Science, Dartmouth College, Hanover, USA
  2. Department of Computer Science, Rice University, Houston, USA
