A Study on Different Normalization Approaches of Word

  • Conference paper
Advances in Electronics, Communication and Computing

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 443)

Abstract

With the rise of social media, people communicate with each other through SMS (Short Message Service), tweets, and chat messages. The text used in these media differs from standard text: message length is limited, words are often misspelled, and many words appear in abbreviated forms, known as Non-Standard Forms (NSF), that are not found in dictionaries. The aim of this paper is to study the existing approaches to normalizing such text.

Corresponding author

Correspondence to Md. Ruhul Islam.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chitrapriya, N., Ruhul Islam, M., Roy, M., Pradhan, S. (2018). A Study on Different Normalization Approaches of Word. In: Kalam, A., Das, S., Sharma, K. (eds) Advances in Electronics, Communication and Computing. Lecture Notes in Electrical Engineering, vol 443. Springer, Singapore. https://doi.org/10.1007/978-981-10-4765-7_25

  • DOI: https://doi.org/10.1007/978-981-10-4765-7_25

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-4764-0

  • Online ISBN: 978-981-10-4765-7

  • eBook Packages: Engineering, Engineering (R0)
