Detecting Missing Content Queries in an SMS-Based HIV/AIDS FAQ Retrieval System

Thuma, Edwin; Rogers, Simon; Ounis, Iadh

doi:10.1007/978-3-319-06028-6_21

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8416)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

Automated Frequently Asked Question (FAQ) answering systems use pre-stored sets of question-answer pairs as an information source to answer natural language questions posed by users. The main problem with this kind of information source is that there is no guarantee that a relevant question-answer pair exists for every user query. In this paper, we propose to deploy a binary classifier in an existing SMS-Based HIV/AIDS FAQ retrieval system to detect user queries that have no relevant question-answer pair in the FAQ document collection. Before deploying such a classifier, we first evaluate different feature sets for training in order to determine which sets of features build the model that yields the best classification accuracy. We carry out our evaluation using seven different feature sets generated from a query log before and after retrieval by the FAQ retrieval system. Our results suggest that combining different feature sets markedly improves the classification accuracy.
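
The idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy FAQ collection, the four features (two computed before retrieval, two after), the simple term-overlap scoring, and the nearest-centroid classifier are stand-ins, not the seven feature sets, the retrieval system, or the classifiers evaluated in the paper.

```python
import math
from collections import Counter

# Hypothetical toy FAQ collection (not taken from the paper).
FAQ = [
    "how is hiv transmitted",
    "what are the symptoms of aids",
    "where can i get tested for hiv",
]

def tokenize(text):
    return text.lower().split()

DOCS = [Counter(tokenize(q)) for q in FAQ]

def idf(term):
    # Smoothed inverse document frequency over the FAQ collection.
    df = sum(1 for d in DOCS if term in d)
    return math.log((len(DOCS) + 1) / (df + 1)) + 1.0

def pre_retrieval_features(query):
    # Computed from the query alone: length and average term IDF.
    terms = tokenize(query)
    return [len(terms), sum(idf(t) for t in terms) / (len(terms) or 1)]

def post_retrieval_features(query):
    # Computed from the ranked list: top score and gap to the second score.
    terms = tokenize(query)
    scores = sorted((sum(d[t] * idf(t) for t in terms) for d in DOCS),
                    reverse=True)
    return [scores[0], scores[0] - scores[1]]

def features(query):
    # Combining feature sets is plain concatenation in this sketch.
    return pre_retrieval_features(query) + post_retrieval_features(query)

def train_centroids(examples):
    # examples: (query, label) pairs; label 1 marks a missing content query.
    sums, counts = {}, Counter()
    for query, label in examples:
        f = features(query)
        acc = sums.setdefault(label, [0.0] * len(f))
        for i, v in enumerate(f):
            acc[i] += v
        counts[label] += 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, query):
    # Assign the label of the closest class centroid (squared distance).
    f = features(query)
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(f, c))
    return min(centroids, key=lambda y: dist(centroids[y]))

# Hypothetical labelled query log.
TRAIN = [
    ("how is hiv transmitted", 0),
    ("symptoms of aids", 0),
    ("weather forecast tomorrow", 1),
    ("football results today", 1),
]
MODEL = train_centroids(TRAIN)

print(predict(MODEL, "where can i get tested for hiv"))  # covered by the FAQ
print(predict(MODEL, "cheap flights to london"))         # missing content
```

In a deployed system, a query classified as missing content could be answered with a fallback message instead of an irrelevant FAQ entry. Which features and which classifier work best is the empirical question the paper addresses; this sketch takes no position on it.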

Author information

Authors and Affiliations

School of Computing Science, University of Glasgow, Glasgow, UK
Edwin Thuma, Simon Rogers & Iadh Ounis
Department of Computer Science, University of Botswana, Gaborone, Botswana
Edwin Thuma

Editor information

Editors and Affiliations

University of Amsterdam, Amsterdam, The Netherlands
Maarten de Rijke & Tom Kenter
Centrum Wiskunde en Informatica, Amsterdam, The Netherlands and Delft University of Technology, Delft, The Netherlands
Arjen P. de Vries
University of Illinois at Urbana-Champaign, Urbana, IL, USA
ChengXiang Zhai
University of Twente, Twente, The Netherlands and Erasmus University Rotterdam, Rotterdam, The Netherlands
Franciska de Jong
SalesPredict, Haifa, Israel
Kira Radinsky
Microsoft Research, Cambridge, UK
Katja Hofmann

About this paper

Cite this paper

Thuma, E., Rogers, S., Ounis, I. (2014). Detecting Missing Content Queries in an SMS-Based HIV/AIDS FAQ Retrieval System. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_21

DOI: https://doi.org/10.1007/978-3-319-06028-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science, Computer Science (R0)
