Improving Arabic Texts Morphological Disambiguation Using a Possibilistic Classifier

  • Conference paper
Natural Language Processing and Information Systems (NLDB 2014)

Abstract

Morphological ambiguity is an important problem that has been studied through different approaches. In this paper, we investigate several classification methods for disambiguating the morphological features of non-vocalized Arabic texts. We propose an improved possibilistic approach that handles imperfect training and test datasets, and we introduce a data transformation method that converts an imperfect dataset into a perfect one. We compare the disambiguation results of the classification approaches with those of the possibilistic classifier, which handles the imperfect data.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Ayed, R., Bounhas, I., Elayeb, B., Ben Saoud, N.B., Evrard, F. (2014). Improving Arabic Texts Morphological Disambiguation Using a Possibilistic Classifier. In: Métais, E., Roche, M., Teisseire, M. (eds) Natural Language Processing and Information Systems. NLDB 2014. Lecture Notes in Computer Science, vol 8455. Springer, Cham. https://doi.org/10.1007/978-3-319-07983-7_18

  • DOI: https://doi.org/10.1007/978-3-319-07983-7_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07982-0

  • Online ISBN: 978-3-319-07983-7

  • eBook Packages: Computer Science, Computer Science (R0)
