A Logical Information Retrieval Model Based on a Combination of Propositional Logic and Probability Theory

  • Justin Picard
  • Jacques Savoy
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 50)

Abstract

In addition to handling large numbers of documents, information retrieval has to deal with the uncertainties inherent in all natural languages, including homonymy, synonymy and polysemy. These are major hurdles that every automatic natural language processing system must overcome. To encourage the discovery of better solutions to these problems and to improve understanding of the matching process between query and documents, our approach is to view retrieval as an inference process that involves uncertainty. In this context, a fundamental question is the choice of an adequate framework that allows us to: (1) represent the various types of uncertain knowledge; (2) combine various sources of evidence about query or document content; and (3) provide efficient and sound techniques for making the needed inferences. To meet these criteria, we suggest using probabilistic argumentation systems (PAS), which combine propositional logic with probability theory so that uncertain knowledge can be handled both symbolically and numerically. This chapter presents a model of information retrieval based on PAS. The model provides an original interpretation of van Rijsbergen’s logical uncertainty principle, a foundation for most logical IR models. We also present an example of our logical model that takes hypertext links or other interdocument relationships into account in order to enhance retrieval effectiveness.
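To make the combination of propositional logic and probability concrete, the following is a minimal Python sketch of how a degree of support might be computed in a small probabilistic argumentation system. It is an illustration under stated assumptions, not the chapter's implementation: the function name `degree_of_support`, the propositions `d1`, `d2`, `t`, `q`, the assumptions `a1` to `a4` and all probabilities are hypothetical. Assumptions are treated as independent, and support is computed by brute-force enumeration of scenarios, which is only feasible for toy examples.

```python
from itertools import product


def implies(a, b):
    """Material implication a -> b."""
    return (not a) or b


def degree_of_support(kb, hypothesis, props, assumptions):
    """Probability that the knowledge base entails the hypothesis,
    conditioned on the assumption scenario being consistent with it.

    kb, hypothesis: predicates over a dict mapping names to truth values.
    props: names of ordinary propositions.
    assumptions: dict mapping assumption names to independent probabilities.
    """
    names = list(assumptions)
    p_support = 0.0
    p_consistent = 0.0
    for a_values in product([True, False], repeat=len(names)):
        scenario = dict(zip(names, a_values))
        # Probability of this scenario under independence.
        weight = 1.0
        for name, value in scenario.items():
            p = assumptions[name]
            weight *= p if value else 1.0 - p
        # Collect every world over the propositions that satisfies the KB.
        models = []
        for p_values in product([True, False], repeat=len(props)):
            world = {**scenario, **dict(zip(props, p_values))}
            if kb(world):
                models.append(world)
        if not models:
            continue  # inconsistent scenario: excluded by conditioning
        p_consistent += weight
        if all(hypothesis(m) for m in models):
            p_support += weight  # scenario is an argument for the hypothesis
    if p_consistent == 0.0:
        raise ValueError("knowledge base is inconsistent under every scenario")
    return p_support / p_consistent


# Hypothetical toy knowledge base: document d1 is indexed by term t (if a1),
# t is about the query q (if a2), d1 links to d2 (if a3), d2 is about q (if a4).
def kb(w):
    return (
        w["d1"]
        and implies(w["a1"] and w["d1"], w["t"])
        and implies(w["a2"] and w["t"], w["q"])
        and implies(w["a3"] and w["d1"], w["d2"])
        and implies(w["a4"] and w["d2"], w["q"])
    )


props = ["d1", "t", "d2", "q"]
assumptions = {"a1": 0.8, "a2": 0.5, "a3": 0.6, "a4": 0.5}
support_q = degree_of_support(kb, lambda w: w["q"], props, assumptions)
print(round(support_q, 2))  # 0.58
```

In this toy setting the query is supported either through the indexing term (`a1` and `a2`) or through the link to a second document (`a3` and `a4`), so the link adds evidence beyond the 0.4 obtained from the term alone. This mirrors, in a very simplified form, the idea of combining several sources of evidence about document content.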

Keywords

Information Retrieval; Propositional Logic; Information Item; Inference Process; Probability
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Justin Picard (1)
  • Jacques Savoy (1)
  1. Institut interfacultaire d’informatique, Université de Neuchâtel, Neuchâtel, Switzerland
