Abstract
The traditional bag-of-words (BOW) model builds on the distributional hypothesis to represent documents. Its main drawback is high dimensionality, although this can be mitigated by dimensionality reduction techniques such as principal component analysis (PCA) or singular value decomposition (SVD). Neural network-based approaches, by contrast, do not suffer from this problem: they represent documents or words with much shorter, dense vectors. In particular, recurrent neural network (RNN) architectures have attracted considerable attention for representing short sequences. In this study, we compare the traditional BOW representation with RNN-based architectures on the task of sentiment classification. Traditional methods represent text with a BOW approach, producing one-hot encodings, on top of which well-known linear machine learning algorithms such as logistic regression and the Naive Bayes classifier learn a decision boundary over the data points. RNN-based models, on the other hand, take text as a sequence of words and transform it through hidden, recurrent states, ultimately representing the input text as a dense, short vector. A final neural layer then maps this dense representation to a sentiment label. We discuss our findings through several in-depth experiments, comprehensively comparing traditional representations and deep learning models on a Turkish sentiment benchmark dataset spanning five domains, such as books and kitchen products.
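The one-hot BOW representation described above can be sketched as follows. This is an illustrative example, not code from the chapter: the toy corpus, tokenization by whitespace, and presence-based (rather than count-based) encoding are all assumptions made for brevity.

```python
# Minimal sketch of a one-hot bag-of-words encoding (illustrative,
# not taken from the chapter; corpus and tokenizer are assumptions).

def build_vocab(docs):
    """Map each distinct whitespace-separated token to an index."""
    vocab = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(vocab)}

def one_hot_bow(doc, vocab):
    """Encode a document as a 0/1 vector over the vocabulary."""
    vec = [0] * len(vocab)
    for tok in doc.split():
        if tok in vocab:
            vec[vocab[tok]] = 1  # presence only, not term frequency
    return vec

docs = ["good book great story", "bad kitchen product"]
vocab = build_vocab(docs)
print(one_hot_bow("good story", vocab))
```

A linear model such as logistic regression or Naive Bayes would then be trained on these sparse vectors, whereas an RNN would instead consume the word sequence directly and produce a dense representation of its own.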
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Yildirim, S. (2020). Comparing Deep Neural Networks to Traditional Models for Sentiment Analysis in Turkish Language. In: Agarwal, B., Nayak, R., Mittal, N., Patnaik, S. (eds) Deep Learning-Based Approaches for Sentiment Analysis. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-15-1216-2_12
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1215-5
Online ISBN: 978-981-15-1216-2
eBook Packages: Engineering (R0)