Abstract
An important service for systems that provide access to information is the organization of returned search results. In the vector model, search results may be represented by a sphere in an n-dimensional space. The query is the center of this sphere, whose size is determined either by its radius or by the number of documents it contains. The goal of searching is to have all documents relevant to a query fall within this sphere. It is known, however, that not all relevant documents are present in the sphere, which is why various methods for improving search results, which can be implemented by expanding the original query, have been developed. Our goal is to use the knowledge of document similarity contained in textual databases to obtain a larger number of relevant documents while minimizing the number that must be discarded as irrelevant. In this article we define the concept of a k-path (topical development). For the individual development of vector query results, we propose the SORT-EACH algorithm, which uses the aforementioned methods for acquiring topical development.
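The query sphere described in the abstract can be illustrated with a minimal sketch. It assumes cosine similarity as the closeness measure and dense term-weight vectors stored in a dictionary. The names `cosine_similarity`, `query_sphere`, `min_similarity` and `max_documents` are hypothetical and are not taken from the paper. The sketch shows only the basic sphere, bounded by a radius or by a document count. It does not implement k-path or SORT-EACH, which the abstract names but does not specify.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is similar to nothing
    return dot / (norm_a * norm_b)


def query_sphere(query, documents, min_similarity=None, max_documents=None):
    """Return the ids of documents inside the sphere centered on the query.

    The sphere is bounded by a similarity threshold (the radius), by a
    maximum number of documents, or by both. Assumption: a higher cosine
    similarity means a document lies closer to the center.
    """
    scored = [(cosine_similarity(query, vec), doc_id)
              for doc_id, vec in documents.items()]
    # Most similar first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    if min_similarity is not None:
        scored = [pair for pair in scored if pair[0] >= min_similarity]
    if max_documents is not None:
        scored = scored[:max_documents]
    return [doc_id for _, doc_id in scored]


# Toy collection over a three-term vocabulary (illustrative data only).
documents = {
    "d1": [1.0, 0.0, 0.0],
    "d2": [1.0, 1.0, 0.0],
    "d3": [0.0, 0.0, 1.0],
}
query = [1.0, 0.0, 0.0]

by_radius = query_sphere(query, documents, min_similarity=0.5)
by_count = query_sphere(query, documents, max_documents=1)
```

With this toy data, `by_radius` contains `d1` and `d2`, while `d3` shares no terms with the query and falls outside the sphere. A relevant document that uses different vocabulary would be missed in the same way, which is the gap that query expansion and topical development aim to close.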
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Martinovič, J., Snášel, V., Dvorský, J., Dráždilová, P. (2010). Search in Documents Based on Topical Development. In: Snášel, V., Szczepaniak, P.S., Abraham, A., Kacprzyk, J. (eds) Advances in Intelligent Web Mastering - 2. Advances in Intelligent and Soft Computing, vol 67. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10687-3_15
DOI: https://doi.org/10.1007/978-3-642-10687-3_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10686-6
Online ISBN: 978-3-642-10687-3
eBook Packages: Engineering, Engineering (R0)