Abstract
Automated Frequently Asked Question (FAQ) answering systems use pre-stored sets of question-answer pairs as an information source to answer natural language questions posed by users. The main problem with this kind of information source is that there is no guarantee that a relevant question-answer pair exists for every user query. In this paper, we propose deploying a binary classifier in an existing SMS-Based HIV/AIDS FAQ retrieval system to detect user queries that have no relevant question-answer pair in the FAQ document collection. Before deploying such a classifier, we first evaluate different feature sets for training in order to determine which sets of features build a model that yields the best classification accuracy. We carry out our evaluation using seven different feature sets generated from a query log before and after retrieval by the FAQ retrieval system. Our results suggest that combining different feature sets markedly improves classification accuracy.
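As a rough illustration of what "features generated before and after retrieval" and "combining feature sets" could look like, the following minimal sketch builds one feature vector per query. Everything in it is an assumption made for illustration: the toy FAQ questions, the function names, the specific features (query length, average IDF, top score, score gap), and the IDF-weighted overlap score are hypothetical and are not the seven feature sets or the retrieval model evaluated in the paper.

```python
import math
from collections import Counter

# Hypothetical toy FAQ collection (not the paper's data).
FAQ_QUESTIONS = [
    "how is hiv transmitted",
    "what are the symptoms of hiv",
    "where can i get tested for hiv",
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


DOCS = [Counter(tokenize(q)) for q in FAQ_QUESTIONS]


def idf(term):
    """Smoothed inverse document frequency over the toy collection."""
    df = sum(1 for d in DOCS if term in d)
    return math.log((len(DOCS) + 1) / (df + 1)) + 1.0


def pre_retrieval_features(query):
    """Features computed from the query alone, before retrieval:
    query length and average IDF of the query terms."""
    terms = tokenize(query)
    if not terms:
        return [0.0, 0.0]
    return [float(len(terms)), sum(idf(t) for t in terms) / len(terms)]


def post_retrieval_features(query):
    """Features computed from the ranked scores, after retrieval:
    top score and the gap between the top two scores.
    The score is a simple IDF-weighted term overlap (an assumption)."""
    terms = set(tokenize(query))
    scores = sorted(
        (sum(idf(t) for t in terms if t in d) for d in DOCS),
        reverse=True,
    )
    return [scores[0], scores[0] - scores[1]]


def combined_features(query):
    """Concatenate both feature sets into one vector for a classifier."""
    return pre_retrieval_features(query) + post_retrieval_features(query)
```

Vectors produced by `combined_features` could then be paired with labels (relevant pair present or missing) and passed to any standard binary classifier. A query with no term overlap with the collection, such as "weather forecast tomorrow", gets a top score of zero in this sketch, which is the kind of signal a missing-content classifier might pick up.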
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Thuma, E., Rogers, S., Ounis, I. (2014). Detecting Missing Content Queries in an SMS-Based HIV/AIDS FAQ Retrieval System. In: de Rijke, M., et al. Advances in Information Retrieval. ECIR 2014. Lecture Notes in Computer Science, vol 8416. Springer, Cham. https://doi.org/10.1007/978-3-319-06028-6_21
DOI: https://doi.org/10.1007/978-3-319-06028-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06027-9
Online ISBN: 978-3-319-06028-6
eBook Packages: Computer Science (R0)