Flexible Representation and Retrieval of WEB Documents

Part of the book series: Studies in Fuzziness and Soft Computing ((STUDFUZZ,volume 111))

Abstract

In this paper we present a fuzzy model for representing structured WEB documents in an Information Retrieval System, together with a flexible query language for expressing soft selection conditions. The content of each document is organized into thematic (topical) sections, and index terms play a distinct role in each section. The proposed representation adapts to the user, who can indicate the preferred sections of documents, i.e., those they expect to carry the most interesting information, and can linguistically quantify the number of sections that determine the global potential interest of a document.
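
The abstract does not say how a linguistic quantifier over sections is evaluated. One common way to aggregate per-section scores under a quantifier such as "most" is an Ordered Weighted Averaging (OWA) operator whose weights are derived from the quantifier. The sketch below assumes that approach and is not taken from the chapter; the function names, the breakpoints 0.3 and 0.8 in `most`, and the sample scores are illustrative assumptions.

```python
from typing import Callable, List, Sequence


def most(r: float) -> float:
    """Assumed membership function for the quantifier 'most'.

    Non-decreasing, with most(0) = 0 and most(1) = 1. The breakpoints
    0.3 and 0.8 are an illustrative choice, not values from the chapter.
    """
    if r <= 0.3:
        return 0.0
    if r >= 0.8:
        return 1.0
    return (r - 0.3) / 0.5


def owa_weights(n: int, quantifier: Callable[[float], float]) -> List[float]:
    """Quantifier-derived OWA weights: w_i = Q(i/n) - Q((i-1)/n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [quantifier(i / n) - quantifier((i - 1) / n) for i in range(1, n + 1)]


def owa(scores: Sequence[float], quantifier: Callable[[float], float]) -> float:
    """Aggregate section scores: weights apply to scores sorted in descending order."""
    ordered = sorted(scores, reverse=True)
    weights = owa_weights(len(ordered), quantifier)
    return sum(w * s for w, s in zip(weights, ordered))


if __name__ == "__main__":
    # Hypothetical relevance scores of one document's sections in [0, 1].
    section_scores = [0.9, 0.2, 0.7, 0.4]
    print(owa(section_scores, most))
```

With this construction, a quantifier that jumps to 1 immediately ("at least one") reduces to the maximum section score, and one that stays at 0 until r = 1 ("all") reduces to the minimum, so "most" falls between the two.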

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Bordogna, G., Pasi, G. (2003). Flexible Representation and Retrieval of WEB Documents. In: Szczepaniak, P.S., Segovia, J., Kacprzyk, J., Zadeh, L.A. (eds) Intelligent Exploration of the Web. Studies in Fuzziness and Soft Computing, vol 111. Physica, Heidelberg. https://doi.org/10.1007/978-3-7908-1772-0_3

  • DOI: https://doi.org/10.1007/978-3-7908-1772-0_3

  • Publisher Name: Physica, Heidelberg

  • Print ISBN: 978-3-7908-2519-0

  • Online ISBN: 978-3-7908-1772-0

  • eBook Packages: Springer Book Archive
