Learning to Understand Information on the Internet: An Example-Based Approach

Journal of Intelligent Information Systems

Abstract

The explosive growth of the Web has made intelligent software assistants increasingly necessary for ordinary computer users. Both traditional approaches (search engines, hierarchical indices) and intelligent software agents require significant amounts of human effort to keep up with the Web. As an alternative, we investigate the problem of automatically learning to interact with information sources on the Internet. We report on ShopBot and ILA, two implemented agents that learn to use such resources. ShopBot learns how to extract information from online vendors using only minimal knowledge about product domains. Given the home pages of several online stores, ShopBot autonomously learns how to shop at those vendors. After its learning is complete, ShopBot is able to speedily visit over a dozen software stores and CD vendors, extract product information, and summarize the results for the user. ILA learns to translate information from Internet sources into its own internal concepts. ILA builds a model of an information source that specifies the translation between the source's output and ILA's model of the world. ILA is capable of leveraging a small amount of knowledge about a domain to learn models of many information sources. We show that ILA's learning is fast and accurate, requiring only a small number of queries per information source.
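
To make the idea of learning a translation between a source's output and an internal model concrete, the following is a toy sketch only. It is not the algorithm from the paper, whose details the abstract does not give. It assumes the learner already knows a few entities, queries a source about them, and votes on which output field matches which internal attribute. The names `KNOWN_PEOPLE`, `toy_directory`, and `learn_field_mapping` are hypothetical, and the data is invented for illustration.

```python
from collections import Counter

# Hypothetical internal model: a few entities the learner already knows about.
KNOWN_PEOPLE = {
    "alice": {"last_name": "Ng", "email": "alice@example.org", "phone": "555-0101"},
    "bob": {"last_name": "Ruiz", "email": "bob@example.org", "phone": "555-0102"},
}


def toy_directory(key):
    """Stand-in for an external source (assumption, not a real service).

    The learner is not told the order or formatting of the returned fields.
    """
    person = KNOWN_PEOPLE[key]
    return (person["phone"], person["last_name"].upper(), person["email"])


def normalize(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return value.strip().lower()


def learn_field_mapping(source, known, queries):
    """Guess which output field corresponds to which internal attribute.

    For each queried entity, every output field that equals a known
    attribute value casts one vote for that attribute.
    """
    votes = {}  # field index -> Counter of attribute names
    for key in queries:
        response = source(key)
        for index, field in enumerate(response):
            for attribute, value in known[key].items():
                if normalize(field) == normalize(value):
                    votes.setdefault(index, Counter())[attribute] += 1
    # Keep the attribute with the most votes for each field.
    return {index: counts.most_common(1)[0][0] for index, counts in votes.items()}


mapping = learn_field_mapping(toy_directory, KNOWN_PEOPLE, ["alice", "bob"])
print(mapping)  # {0: 'phone', 1: 'last_name', 2: 'email'}
```

In this sketch two queries are enough to pin down all three fields, which loosely mirrors the abstract's claim that only a small number of queries per source is needed; a real source would require more robust matching than exact string equality.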

Cite this article

Perkowitz, M., Doorenbos, R.B., Etzioni, O. et al. Learning to Understand Information on the Internet: An Example-Based Approach. Journal of Intelligent Information Systems 8, 133–153 (1997). https://doi.org/10.1023/A:1008672508721

  • DOI: https://doi.org/10.1023/A:1008672508721