A Computational Approach for Corpus Based Analysis of Reduplicated Words in Bengali

  • Apurbalal Senapati
  • Utpal Garain
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9041)


Reduplication is an important phenomenon in language studies, especially in Indian languages. It is defined as the partial or complete repetition of a linguistic unit, i.e., of a phoneme, morpheme, word, phrase, clause, or the utterance as a whole, and it conveys a different meaning at the syntactic as well as the semantic level. Reduplicated words play an important role in many natural language processing (NLP) applications, such as machine translation (MT), text summarization, and the identification of multiword expressions. This article presents an algorithm for identifying reduplicated words in a text corpus and computes descriptive statistics of the reduplicated words frequently used in Bengali.
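The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes whitespace-tokenized text, treats two adjacent identical tokens as complete reduplication, and uses a deliberately simplified, hypothetical rule for partial (echo-word) reduplication: two adjacent tokens of equal length that differ only in their first character. The function names `tokenize`, `find_reduplications`, and `summarize` are illustrative, and the example uses romanized words; real Bengali script would need comparison at the level of grapheme clusters rather than code points.

```python
from collections import Counter

# Characters stripped from token edges; includes the Bengali danda.
PUNCTUATION = ".,;:!?\"'()[]{}\u0964"


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation."""
    tokens = (raw.strip(PUNCTUATION) for raw in text.split())
    return [tok for tok in tokens if tok]


def find_reduplications(tokens):
    """Count adjacent token pairs that look like reduplication.

    Assumption: 'partial' uses a simplified echo-word rule
    (same length, at least 3 characters, identical after the first one).
    A word repeated three times in a row is counted as two pairs.
    """
    counts = Counter()
    for first, second in zip(tokens, tokens[1:]):
        if first == second:
            counts[("complete", first, second)] += 1
        elif (
            len(first) == len(second)
            and len(first) >= 3
            and first[1:] == second[1:]
        ):
            counts[("partial", first, second)] += 1
    return counts


def summarize(tokens, counts):
    """Compute simple descriptive statistics over the detected pairs."""
    by_type = Counter()
    for (kind, _, _), freq in counts.items():
        by_type[kind] += freq
    return {
        "tokens": len(tokens),
        "distinct_pairs": len(counts),
        "total_pairs": sum(counts.values()),
        "complete": by_type["complete"],
        "partial": by_type["partial"],
    }


if __name__ == "__main__":
    sample = "dhire dhire jal tal ano, dhire dhire"
    toks = tokenize(sample)
    found = find_reduplications(toks)
    for pair, freq in found.most_common():
        print(pair, freq)
    print(summarize(toks, found))
```

A rule this simple will over-generate on a real corpus (any two adjacent rhyming words of equal length match the partial rule), so in practice it would need to be filtered with a lexicon or further linguistic constraints.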


Keywords: Reduplication, Bengali, Corpus, Descriptive statistics, Evaluation





Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Central Institute of Technology, BTAD, Kokrajhar, India
  2. Indian Statistical Institute, Kolkata, India
