A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9041)

Abstract

Reduplication is an important phenomenon in language studies, especially in Indian languages. It is defined as the partial or complete repetition of a linguistic unit, i.e. a phoneme, morpheme, word, phrase, clause, or the utterance as a whole, and it conveys a different meaning at the syntactic as well as the semantic level. Reduplicated words play an important role in many natural language processing (NLP) applications, such as machine translation (MT), text summarization, and identification of multiword expressions. This article presents an algorithm for identifying reduplicated words in a text corpus and computes descriptive statistics of the reduplicated words frequently used in Bengali.
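
The abstract does not spell out the algorithm, so the following is only a minimal sketch of what corpus-based detection of reduplicated words might look like, not the authors' method. It assumes whitespace-separated tokens, treats two identical adjacent tokens as complete reduplication, and treats two adjacent tokens that differ only in their first character as partial (echo-word) reduplication. The function names (`tokenize`, `find_reduplications`, `frequency_table`), the `min_len` parameter, and the romanized sample sentence are all hypothetical and chosen for illustration.

```python
import string
from collections import Counter

# ASCII punctuation plus the danda (U+0964), the Bengali sentence terminator.
PUNCT = string.punctuation + "\u0964"


def tokenize(text):
    """Split on whitespace and hyphens, then strip surrounding punctuation."""
    tokens = []
    for raw in text.replace("-", " ").split():
        tok = raw.strip(PUNCT)
        if tok:
            tokens.append(tok)
    return tokens


def find_reduplications(tokens, min_len=2):
    """Return (kind, first, second) for each adjacent reduplicated pair.

    Assumed heuristics (illustrative only):
      complete: both tokens are identical
      partial:  same remainder after the first character, different first character
    """
    found = []
    for first, second in zip(tokens, tokens[1:]):
        if len(first) < min_len or len(second) < min_len:
            continue
        if first == second:
            found.append(("complete", first, second))
        elif first[1:] == second[1:] and first[0] != second[0]:
            found.append(("partial", first, second))
    return found


def frequency_table(pairs):
    """Count how often each reduplicated pair occurs in the corpus."""
    return Counter(pairs)


if __name__ == "__main__":
    sample = "se dhire dhire hete boi-toi kine anlo."
    pairs = find_reduplications(tokenize(sample))
    for (kind, first, second), count in frequency_table(pairs).most_common():
        print(kind, first, second, count)
```

The first-character heuristic is deliberately narrow: it misses echo words whose base begins with a vowel and any reduplication that changes an internal vowel, and a real system would need language-specific rules and a lexicon to separate true reduplication from accidental matches.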



Corresponding author

Correspondence to Apurbalal Senapati.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Senapati, A., Garain, U. (2015). A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2015. Lecture Notes in Computer Science, vol 9041. Springer, Cham. https://doi.org/10.1007/978-3-319-18111-0_34

  • DOI: https://doi.org/10.1007/978-3-319-18111-0_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18110-3

  • Online ISBN: 978-3-319-18111-0

  • eBook Packages: Computer Science, Computer Science (R0)
