Abstract
Knowing which concepts are substantial to each country can help enhance emotional communication between two countries. As a concrete example of identifying substantial country concepts, we focus on the task of finding latent country words in cross-cultural texts from two countries. We do this by combining word embedding and tensor decomposition: common words that appear in both countries’ texts are selected; their country-specific word embeddings are learned; a three-way tensor consisting of a word factor, a word embedding factor, and a country factor is constructed; and CANDECOMP/PARAFAC decomposition is performed on the three-way tensor while fixing the country factor values of the decomposed result. We tested our method on a motivating example of finding latent country words in J-pop lyrics from Japan and K-pop lyrics from South Korea. We found that J-pop lyrics words feature nature-related motifs such as ‘petal’, ‘cloud’, ‘universe’, ‘star’, and ‘sky’, whereas K-pop lyrics words highlight human-body-related motifs such as ‘style’, ‘shirt’, ‘head’, ‘foot’, and ‘skin’.
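The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch of CANDECOMP/PARAFAC fitted by alternating least squares with the country factor held fixed. This is not the authors' implementation. The function name `cp_als_fixed_country`, the rank of 3, the fixed country matrix `C_fixed` (one component per country plus one shared component), the random stand-in tensor, and the word-ranking rule are all assumptions made for illustration; the abstract does not state which values the country factor is fixed to or how words are ranked.

```python
import numpy as np


def cp_als_fixed_country(X, C, n_iter=200, seed=0):
    """Fit X[w, d, c] ~ sum_r A[w, r] * B[d, r] * C[c, r], keeping C fixed.

    X: (words, embedding dims, countries) tensor.
    C: (countries, rank) country factor, never updated.
    Returns the word factor A, the embedding factor B, and the relative
    reconstruction error recorded after every iteration.
    """
    rng = np.random.default_rng(seed)
    n_words, n_dims, _ = X.shape
    rank = C.shape[1]
    A = rng.standard_normal((n_words, rank))
    B = rng.standard_normal((n_dims, rank))
    CtC = C.T @ C
    errors = []
    for _ in range(n_iter):
        # Least-squares update of the word factor with B and C held fixed.
        M = np.einsum('wdc,dr,cr->wr', X, B, C)
        A = M @ np.linalg.pinv((B.T @ B) * CtC)
        # Least-squares update of the embedding factor with A and C held fixed.
        M = np.einsum('wdc,wr,cr->dr', X, A, C)
        B = M @ np.linalg.pinv((A.T @ A) * CtC)
        X_hat = np.einsum('wr,dr,cr->wdc', A, B, C)
        errors.append(np.linalg.norm(X - X_hat) / np.linalg.norm(X))
    return A, B, errors


# Synthetic stand-in for the (word x embedding dim x country) tensor.
# Real input would stack country-specific embeddings of the common words.
rng = np.random.default_rng(42)
n_words, n_dims = 30, 8

# Assumed fixed country factor: component 0 is active only in country 0,
# component 1 only in country 1, and component 2 is shared by both.
C_fixed = np.array([[1.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0]])

A_true = rng.standard_normal((n_words, 3))
B_true = rng.standard_normal((n_dims, 3))
X = np.einsum('wr,dr,cr->wdc', A_true, B_true, C_fixed)

A, B, errors = cp_als_fixed_country(X, C_fixed)

# Assumed ranking rule: words with the largest loading magnitude on the
# country-0-only component are treated as candidates for that country.
scores = np.abs(A[:, 0]) * np.linalg.norm(B[:, 0])
top_country0 = np.argsort(scores)[::-1][:5]
print("candidate word indices for country 0:", top_country0)
print("final relative error:", errors[-1])
```

Because the country factor is never updated, each component keeps a predetermined relationship to the two countries, which is what makes the word loadings on a country-only component readable as candidate latent country words under the assumptions above.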
This research was supported by the National Research Foundation of South Korea (NRF) grant funded by the South Korean government (NRF-2017R1A2B4011015). This research was partially supported by a Grant-in-Aid for Scientific Research (A) (17H00759, 2017–2020) from Japan Society for the Promotion of Science (JSPS).
Acknowledgments
We thank the anonymous reviewers for their many constructive comments. Heartfelt thanks to Yangjean Cho for revising and expanding the J-pop/K-pop lyrics word alignment dictionary.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Cho, H., Ishida, T. (2019). Discovering Latent Country Words: A Step Towards Cross-Cultural Emotional Communication. In: Nakanishi, H., Egi, H., Chounta, IA., Takada, H., Ichimura, S., Hoppe, U. (eds) Collaboration Technologies and Social Computing. CRIWG+CollabTech 2019. Lecture Notes in Computer Science, vol 11677. Springer, Cham. https://doi.org/10.1007/978-3-030-28011-6_17
DOI: https://doi.org/10.1007/978-3-030-28011-6_17
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28010-9
Online ISBN: 978-3-030-28011-6
eBook Packages: Computer Science, Computer Science (R0)