Language Modeling Approach to Retrieval for SMS and FAQ Matching

  • Conference paper
Multilingual Information Access in South Asian Languages

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 7536))

Abstract

The Short Message Service, popularly known as “SMS”, has grown along with the number of mobile phone users. A mobile phone is considered a cheap and easy device for communication, and it is also used to acquire and spread information. The SMS-based FAQ Retrieval task proposed at FIRE 2011 aims to provide the required information from frequently asked questions (FAQs). The challenge is to find the question in a corpus of FAQs that best matches the SMS query. However, SMS queries are noisy because users tend to compress text by omitting letters, using slang, etc. This happens because of the cap on message length (160 characters constitute one SMS) and the lack of screen space, which makes reading large amounts of text difficult. In this paper, we propose a method that uses a language modeling approach to match noisy SMS text with the right FAQ. We extend this framework to match SMS queries with cross-language FAQs. Results are promising for monolingual retrieval on English, Hindi and Malayalam.
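
The abstract does not spell out the scoring model, so the following is only a minimal sketch of one common language modeling approach to retrieval: query likelihood with Dirichlet smoothing. It is not claimed to be the method used in the paper. The function names (`tokenize`, `query_log_likelihood`, `rank_faqs`), the smoothing parameter `mu`, and the toy FAQ strings are all assumptions introduced for illustration, and the sketch assumes the SMS query has already been normalized into ordinary words.

```python
import math
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def query_log_likelihood(query_tokens, doc_tokens, coll_counts, coll_len, mu=10.0):
    # log P(query | document) under a unigram model with Dirichlet smoothing:
    # P(t | d) = (count(t, d) + mu * P(t | collection)) / (len(d) + mu)
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    score = 0.0
    for term in query_tokens:
        p_coll = coll_counts.get(term, 0) / coll_len
        if p_coll == 0.0:
            # Term never occurs in any FAQ: skip it rather than take log(0).
            continue
        score += math.log((doc_counts[term] + mu * p_coll) / (doc_len + mu))
    return score


def rank_faqs(query, faqs, mu=10.0):
    # Return (faq, score) pairs sorted from best to worst match.
    tokenized = [tokenize(faq) for faq in faqs]
    coll_counts = Counter(t for doc in tokenized for t in doc)
    coll_len = sum(coll_counts.values())
    if coll_len == 0:
        return []
    q = tokenize(query)
    scored = [
        (query_log_likelihood(q, doc, coll_counts, coll_len, mu), i)
        for i, doc in enumerate(tokenized)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(faqs[i], s) for s, i in scored]


if __name__ == "__main__":
    # Toy FAQ questions, invented for this example.
    faqs = [
        "how to open a bank account",
        "what is the train schedule to delhi",
        "how to apply for a passport",
    ]
    for faq, score in rank_faqs("train schedule delhi", faqs):
        print(f"{score:.3f}  {faq}")
```

A real system for noisy SMS text would also need a step that maps compressed tokens (for example, words with omitted letters) to candidate dictionary words before or during scoring; that step is deliberately left out here.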

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Mogadala, A., Kothwal, R., Varma, V. (2013). Language Modeling Approach to Retrieval for SMS and FAQ Matching. In: Majumder, P., Mitra, M., Bhattacharyya, P., Subramaniam, L.V., Contractor, D., Rosso, P. (eds) Multilingual Information Access in South Asian Languages. Lecture Notes in Computer Science, vol 7536. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40087-2_12

  • DOI: https://doi.org/10.1007/978-3-642-40087-2_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40086-5

  • Online ISBN: 978-3-642-40087-2

  • eBook Packages: Computer Science, Computer Science (R0)
