Abstract
Stemming is an important task in Natural Language Processing applications such as information retrieval and machine translation. This paper presents a derivational stemmer for Sindhi, a resource-poor language, written in Devanagari script, built with a suffix-stripping approach. A dictionary of frequent words is added to reduce over-stemming and under-stemming errors. This is our first attempt at a rule-based derivational stemmer for Sindhi in Devanagari script. We compare its results with those of the inflectional stemmer for Sindhi Devanagari that we developed previously.
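The combination described in the abstract, rule-based suffix stripping guarded by a dictionary of frequent words, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the suffix list, the dictionary entry, the minimum stem length, and the names `SUFFIXES`, `FREQUENT_WORDS`, `MIN_STEM_LEN`, and `stem` are all hypothetical placeholders and do not come from the paper.

```python
# Minimal sketch of dictionary-guarded suffix stripping.
# ASSUMPTION: these suffixes are illustrative placeholders, not the paper's rule set.
SUFFIXES = ["दार", "पण", "ता"]

# ASSUMPTION: hypothetical frequent-word dictionary mapping a word to the stem
# that should be returned for it, bypassing the stripping rules.
FREQUENT_WORDS = {"सरदार": "सरदार"}

# ASSUMPTION: shortest stem (in code points) a rule may leave behind,
# a simple guard against over-stemming.
MIN_STEM_LEN = 2


def stem(word: str) -> str:
    """Return a stem for `word` using a dictionary lookup, then suffix rules."""
    # 1. The dictionary lookup takes priority over every rule.
    if word in FREQUENT_WORDS:
        return FREQUENT_WORDS[word]
    # 2. Try the longest suffixes first so a short suffix cannot shadow a longer one.
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            return word[: -len(suffix)]
    # 3. No rule applied: return the word unchanged.
    return word


print(stem("दुकानदार"))
print(stem("सरदार"))
```

In this sketch the dictionary addresses over-stemming by protecting listed words from the rules; mapping an irregular form to its stem in the same dictionary would be one way to address under-stemming as well.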
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Nathani, B., Joshi, N., Purohit, G.N. (2020). Rule-Based Derivational Stemmer for Sindhi Devanagari Using Suffix Stripping Approach. In: Somani, A.K., Shekhawat, R.S., Mundra, A., Srivastava, S., Verma, V.K. (eds) Smart Systems and IoT: Innovations in Computing. Smart Innovation, Systems and Technologies, vol 141. Springer, Singapore. https://doi.org/10.1007/978-981-13-8406-6_23
DOI: https://doi.org/10.1007/978-981-13-8406-6_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8405-9
Online ISBN: 978-981-13-8406-6
eBook Packages: Intelligent Technologies and Robotics (R0)