
Probabilistic Reuse of Past Search Results

  • Conference paper
Database and Expert Systems Applications (DEXA 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8644)


Abstract

This paper presents a new Monte Carlo algorithm that uses past search results to improve the precision of information retrieval. Experiments compared the proposed algorithm with traditional retrieval on a simulated dataset in which the documents, the queries, and the users’ judgments were all simulated. Exponential and Zipf distributions were used to build the document collections, and a uniform distribution was used to build the queries. A zeta distribution was used to simulate Bradford’s law, which represents the users’ judgments. Empirical results show that our algorithm performs better than traditional retrieval.
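The simulated setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no parameters, so every function name, the exponents `s` and `lam`, the term-overlap baseline ranking, and the way the zeta law is truncated and applied to ranking positions are assumptions made purely for illustration. The sketch draws document terms from a Zipf or exponential distribution, draws query terms uniformly, and picks relevant documents with a probability that decays like a zeta law.

```python
import math
import random
from itertools import accumulate


def term_weights(vocab_size, kind="zipf", s=1.0, lam=0.01):
    """Unnormalised weight per term rank (rank 1 is the most frequent term).
    The parameters s and lam are illustrative assumptions."""
    if kind == "zipf":
        return [1.0 / rank ** s for rank in range(1, vocab_size + 1)]
    if kind == "exponential":
        return [math.exp(-lam * rank) for rank in range(1, vocab_size + 1)]
    raise ValueError(f"unknown distribution: {kind}")


def build_collection(rng, n_docs, doc_len, vocab_size, kind="zipf"):
    """Each document is a list of term ids drawn from the chosen distribution."""
    cum = list(accumulate(term_weights(vocab_size, kind)))
    return [rng.choices(range(vocab_size), cum_weights=cum, k=doc_len)
            for _ in range(n_docs)]


def build_query(rng, query_len, vocab_size):
    """Query terms are drawn uniformly from the vocabulary, without repetition."""
    return rng.sample(range(vocab_size), query_len)


def rank_documents(collection, query):
    """Hypothetical baseline: order documents by query-term occurrences."""
    terms = set(query)
    scores = [sum(t in terms for t in doc) for doc in collection]
    return sorted(range(len(collection)), key=lambda d: -scores[d])


def simulate_judgments(rng, ranking, n_relevant, s=2.0):
    """Pick relevant documents with probability proportional to 1/position**s,
    i.e. a zeta law truncated to the length of the ranking (an assumption)."""
    weights = [1.0 / pos ** s for pos in range(1, len(ranking) + 1)]
    target = min(n_relevant, len(ranking))
    relevant = set()
    while len(relevant) < target:
        relevant.add(rng.choices(ranking, weights=weights, k=1)[0])
    return relevant


def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that were judged relevant."""
    return sum(d in relevant for d in ranking[:k]) / k


# Example run with a fixed seed so the simulation is repeatable.
rng = random.Random(42)
docs = build_collection(rng, n_docs=50, doc_len=30, vocab_size=200)
query = build_query(rng, query_len=3, vocab_size=200)
ranking = rank_documents(docs, query)
relevant = simulate_judgments(rng, ranking, n_relevant=5)
print(precision_at_k(ranking, relevant, k=10))
```

Passing `kind="exponential"` to `build_collection` switches the term distribution. Seeding `random.Random` keeps runs comparable when two retrieval strategies are evaluated on the same simulated data.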




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Gutiérrez-Soto, C., Hubert, G. (2014). Probabilistic Reuse of Past Search Results. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8644. Springer, Cham. https://doi.org/10.1007/978-3-319-10073-9_21


  • DOI: https://doi.org/10.1007/978-3-319-10073-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10072-2

  • Online ISBN: 978-3-319-10073-9

  • eBook Packages: Computer Science (R0)
