Construction of Computational Lexicon for Malay Language

  • Harshida Hasmy (email author)
  • Zainab Abu Bakar
  • Fatimah Ahmad
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9429)

Abstract

This paper focuses on the construction of a computational lexicon for the Malay language, which involves computational study and the use of electronic lexicons. Constructing the lexicon involves studying the morphological arrangement of the Malay affixation process, which comprises prefixes, suffixes, circumfixes and infixes, with the aim of automatically generating a collection of new Malay words from a single root word. Experiments are conducted on 2101 unique Malay root words found in Malay-translated Quranic documents; these roots are processed with Malay affixation rules using the affixed word analyser. Numerous new words are constructed from a single root word by applying 52 affix rules to it. Finally, each new word is compared with a Malay dictionary to determine whether it is truly a newly generated Malay word. The results of this analysis open an opportunity to construct new Malay word variants to enrich the Malay lexicon.
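
The generate-then-check pipeline described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the affix lists are a small toy subset (not the paper's 52 rules), the functions `generate_candidates` and `split_by_dictionary` are hypothetical, and `TOY_DICTIONARY` is a stand-in for a real Malay dictionary. Affixes are attached by plain concatenation, which ignores the sound changes that some Malay prefixes trigger, so this should not be read as the authors' analyser.

```python
# Toy affix inventory: an illustrative subset, NOT the paper's 52 rules.
PREFIXES = ["ber", "di", "ter"]
SUFFIXES = ["an", "kan", "i"]
CIRCUMFIXES = [("ke", "an"), ("per", "an")]
INFIXES = ["el", "em", "er"]

VOWELS = set("aeiou")


def generate_candidates(root):
    """Return the set of candidate affixed forms built from one root word.

    Assumption: affixes attach by simple string concatenation.
    """
    candidates = set()
    for prefix in PREFIXES:
        candidates.add(prefix + root)
    for suffix in SUFFIXES:
        candidates.add(root + suffix)
    for prefix, suffix in CIRCUMFIXES:
        candidates.add(prefix + root + suffix)
    # Assumption: an infix is inserted right after an initial consonant,
    # so roots that start with a vowel get no infixed forms here.
    if len(root) > 1 and root[0] not in VOWELS:
        for infix in INFIXES:
            candidates.add(root[0] + infix + root[1:])
    return candidates


def split_by_dictionary(candidates, dictionary):
    """Split candidates into those found in the dictionary and those not."""
    attested = sorted(c for c in candidates if c in dictionary)
    unattested = sorted(c for c in candidates if c not in dictionary)
    return attested, unattested


# Hypothetical mini dictionary standing in for a real Malay dictionary.
TOY_DICTIONARY = {"gigi", "bergigi", "geligi", "gerigi", "pergigian"}

attested, unattested = split_by_dictionary(
    generate_candidates("gigi"), TOY_DICTIONARY
)
print("found in dictionary:", attested)
print("not found in dictionary:", unattested)
```

In this sketch the forms absent from the dictionary are only candidates; whether any of them counts as a valid new Malay word is the question the paper's dictionary comparison is meant to address.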

Keywords

Lexicon, Affixes, Root word, Affixed word analyser, Malay lexicon

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Harshida Hasmy (1), email author
  • Zainab Abu Bakar (1)
  • Fatimah Ahmad (2)
  1. Faculty of Computer and Mathematical Sciences, UiTM, Shah Alam, Malaysia
  2. Faculty of Defence Science and Technology, Universiti Pertahanan Nasional Malaysia, Kuala Lumpur, Malaysia
