Abstract
Lexicography is the science and practice of making dictionaries. Its development has led to new techniques for the visual presentation of lexicographic entries. This article focuses on one such technique, photodocumentation, which shows a textual quotation in its natural context. We present a technological system that makes it possible, at relatively low cost, to produce a monolingual dictionary together with quotations and chronologisation, that is, the date at which a given word first appears. We consider the example of Vietnamese. As a preliminary database of material we selected just over 100 books, which we scanned and from which we excerpted quotations to illustrate the natural use of the headwords.
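As a rough illustration of chronologisation, the following minimal Python sketch picks, for each headword, the quotation with the earliest publication year. It is not the system described in the chapter: the `Quotation` record, the `earliest_attestations` function, and the sample headwords, years and sources are all hypothetical placeholders introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quotation:
    """One excerpted quotation (hypothetical record layout)."""
    headword: str
    year: int      # publication year of the scanned source
    source: str    # identifier of the scanned book
    text: str      # the excerpted passage


def earliest_attestations(quotations):
    """Map each headword to its earliest-dated quotation."""
    earliest = {}
    for q in quotations:
        current = earliest.get(q.headword)
        # Keep the quotation with the smallest year; ties keep the first seen.
        if current is None or q.year < current.year:
            earliest[q.headword] = q
    return earliest


# Invented placeholder data, not taken from the corpus.
sample = [
    Quotation("headword_a", 1954, "book_02", "... headword_a ..."),
    Quotation("headword_a", 1932, "book_07", "... headword_a ..."),
    Quotation("headword_b", 1978, "book_02", "... headword_b ..."),
]

first_seen = earliest_attestations(sample)
for word, q in sorted(first_seen.items()):
    print(f"{word}: first attested {q.year} in {q.source}")
```

A date obtained this way is only the earliest attestation within the scanned material, so it can move earlier as more books are added.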
Research reported in this paper was supported by the Polish Ministry of Education under Grant no. 0014/NPRH3/H11/82/2014, Narodowy Fotokorpus Języka Polskiego.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Wierzchoń, P. (2017). The Great National Photocorpus of 20th-Century Vietnamese. Origins, Assumptions and Goals. In: Król, D., Nguyen, N., Shirai, K. (eds) Advanced Topics in Intelligent Information and Database Systems. ACIIDS 2017. Studies in Computational Intelligence, vol 710. Springer, Cham. https://doi.org/10.1007/978-3-319-56660-3_30
DOI: https://doi.org/10.1007/978-3-319-56660-3_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56659-7
Online ISBN: 978-3-319-56660-3
eBook Packages: Engineering, Engineering (R0)