Detection of Loan Words in Uyghur Texts

Mi, Chenggang; Yang, Yating; Wang, Lei; Li, Xiao; Dalielihan, Kamali

doi:10.1007/978-3-662-45924-9_10

Part of the book series: Communications in Computer and Information Science (CCIS, volume 496)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

For low-resource languages such as Uyghur, data sparseness is a persistent problem in information processing, especially in tasks that rely on parallel texts. To enrich bilingual resources, we detect Chinese and Russian loan words in Uyghur texts using the phonetic similarity between a loan word and its corresponding donor-language word. We propose a novel perceptron-based approach that treats loan word detection in Uyghur as a classification task. Experimental results show that our method detects Chinese and Russian loan words in Uyghur texts effectively.
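
The abstract frames loan word detection as binary classification with a perceptron over similarity evidence. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes candidate words and donor words are already romanized, substitutes a normalized edit-distance score for the paper's phonetic similarity, and uses a plain (unaveraged) perceptron. The names `edit_distance`, `features`, `predict`, `train_perceptron`, and `TOY_PAIRS` are hypothetical, and the word pairs are toy strings chosen only for illustration.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (romanized candidate word, romanized donor word)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def features(candidate: str, donor: str) -> List[float]:
    """Bias term plus a string similarity in [0, 1].

    Assumption: edit-distance similarity is a crude stand-in for the
    phonetic similarity mentioned in the abstract.
    """
    longest = max(len(candidate), len(donor), 1)
    similarity = 1.0 - edit_distance(candidate, donor) / longest
    return [1.0, similarity]


def predict(weights: List[float], x: List[float]) -> int:
    """Return 1 (loan word) if the weighted score is positive, else 0."""
    return 1 if sum(w * v for w, v in zip(weights, x)) > 0 else 0


def train_perceptron(examples: List[Tuple[Pair, int]],
                     epochs: int = 20) -> List[float]:
    """Classic perceptron rule: adjust weights only on misclassified examples."""
    weights = [0.0, 0.0]
    for _ in range(epochs):
        for (candidate, donor), label in examples:
            x = features(candidate, donor)
            error = label - predict(weights, x)
            if error:
                weights = [w + error * v for w, v in zip(weights, x)]
    return weights


# Toy training pairs (illustrative only, not data from the paper).
TOY_PAIRS: List[Tuple[Pair, int]] = [
    (("telefon", "telefon"), 1),
    (("kitab", "telefon"), 0),
    (("doxtur", "doktor"), 1),
    (("su", "mashina"), 0),
]

if __name__ == "__main__":
    w = train_perceptron(TOY_PAIRS)
    for candidate, donor in [("mashina", "mashina"), ("su", "telefon")]:
        print(candidate, donor, predict(w, features(candidate, donor)))
```

On this toy data the classifier learns little more than a similarity threshold; a realistic system would need richer features and labeled data, which the abstract does not specify.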

Author information

Authors and Affiliations

Xinjiang Technical Institute of Physics & Chemistry of Chinese Academy of Sciences, Urumqi, Xinjiang, 830011, China
Chenggang Mi, Yating Yang, Lei Wang, Xiao Li & Kamali Dalielihan
University of Chinese Academy of Sciences, Beijing, 100049, China
Chenggang Mi

Editor information

Editors and Affiliations

National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, 100190, Beijing, China
Chengqing Zong
Dept. of Computer Science and Operations Research, University of Montreal, Montreal, Quebec, Canada
Jian-Yun Nie
Peking University, Beijing, China
Dongyan Zhao
Institute of Computer Science & Technology, Peking University, 100871, Beijing, China
Yansong Feng

About this paper

Cite this paper

Mi, C., Yang, Y., Wang, L., Li, X., Dalielihan, K. (2014). Detection of Loan Words in Uyghur Texts. In: Zong, C., Nie, JY., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2014. Communications in Computer and Information Science, vol 496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-45924-9_10

DOI: https://doi.org/10.1007/978-3-662-45924-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-45923-2
Online ISBN: 978-3-662-45924-9
eBook Packages: Computer Science (R0)
