Abstract
In the first part of this chapter, we briefly present the concepts of information retrieval systems (IRSs) and information filtering systems (IFSs). We then describe the key characteristics of business information sources on the Web and highlight the main problems with applying existing filtering and retrieval techniques to Internet sources. In response to these problems, the last part of the chapter proposes a new model of an information filtering system. This model is the starting point for later considerations in this book.
© 2002 Springer-Verlag London
About this chapter
Cite this chapter
Abramowicz, W., Kalczyński, P., Węcel, K. (2002). Information Filtering and Retrieval from Web Sources. In: Filtering the Web to Feed Data Warehouses. Springer, London. https://doi.org/10.1007/978-1-4471-0137-6_4
DOI: https://doi.org/10.1007/978-1-4471-0137-6_4
Publisher Name: Springer, London
Print ISBN: 978-1-4471-1107-8
Online ISBN: 978-1-4471-0137-6
eBook Packages: Springer Book Archive