Abstract
With the rapid development of e-commerce, large amounts of data, including reviews of many types of products, can be accessed quickly. Opinion mining, and fine-grained opinion mining in particular, is therefore becoming an increasingly effective way to extract valuable information for product design, improvement and brand marketing. However, because opinions are expressed in an unstructured and casual way, this information is not easy to extract. In this paper, we propose an integrated strategy that automatically extracts feature-based information, from which one can readily obtain detailed opinions about particular products. To suit the characteristics of the reviews, the strategy combines a multi-label classification (MLC) of reviews, a binary classification (BC) of sentences, and a sentence-level sequence labelling step based on a deep learning method. In our experiments, the approach achieves 82% accuracy on the final sequence labelling task under 20-fold cross-validation. The strategy can also be readily applied to other reviews, provided that a corresponding amount of labelled data is available to start from.
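To make the three-stage structure concrete, the following is a minimal sketch of one plausible way to chain review-level MLC, sentence-level BC and sentence-level sequence labelling. It is not the authors' implementation: the aspect lexicon, the tag names `TARGET`, `OPINION` and `O`, the function names, and the English whitespace tokenisation are all illustrative assumptions, and simple keyword rules stand in for the trained classifiers and the deep learning labeller.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical aspect lexicon and opinion words (assumptions for illustration).
# A real system would replace every rule below with a trained model.
ASPECT_KEYWORDS: Dict[str, Set[str]] = {
    "fuel": {"fuel", "mileage"},
    "comfort": {"seat", "seats", "noise"},
}
OPINION_WORDS: Set[str] = {"low", "high", "comfortable", "loud", "great", "poor"}


def tokenise(text: str) -> List[str]:
    """Lower-case, split on whitespace and strip basic punctuation."""
    words = (tok.strip(".,!?") for tok in text.lower().split())
    return [w for w in words if w]


def classify_review(review: str) -> List[str]:
    """Stage 1 (MLC stand-in): return every aspect label the review touches."""
    tokens = set(tokenise(review))
    return [aspect for aspect, kws in ASPECT_KEYWORDS.items() if tokens & kws]


def sentence_is_relevant(sentence: str, aspect: str) -> bool:
    """Stage 2 (BC stand-in): does this sentence discuss the given aspect?"""
    return bool(set(tokenise(sentence)) & ASPECT_KEYWORDS[aspect])


def label_tokens(sentence: str, aspect: str) -> List[Tuple[str, str]]:
    """Stage 3 (sequence-labelling stand-in): assign one tag per token."""
    tagged = []
    for word in tokenise(sentence):
        if word in ASPECT_KEYWORDS[aspect]:
            tagged.append((word, "TARGET"))
        elif word in OPINION_WORDS:
            tagged.append((word, "OPINION"))
        else:
            tagged.append((word, "O"))
    return tagged


def extract(review: str) -> Dict[str, List[List[Tuple[str, str]]]]:
    """Run the three stages in sequence and group tagged sentences by aspect."""
    sentences = [s.strip() for s in review.split(".") if s.strip()]
    result: Dict[str, List[List[Tuple[str, str]]]] = {}
    for aspect in classify_review(review):
        result[aspect] = [
            label_tokens(s, aspect)
            for s in sentences
            if sentence_is_relevant(s, aspect)
        ]
    return result


if __name__ == "__main__":
    print(extract("The fuel consumption is low. The seats are comfortable."))
```

The point of the cascade is that each stage narrows the input for the next one, so the comparatively expensive sequence labeller only sees sentences already judged relevant to an aspect.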
Additional information
Foundation item: the National Natural Science Foundation of China (No. 61375053)
About this article
Cite this article
Wang, Y., Wang, M. Fine-Grained Opinion Extraction from Chinese Car Reviews with an Integrated Strategy. J. Shanghai Jiaotong Univ. (Sci.) 23, 620–626 (2018). https://doi.org/10.1007/s12204-018-1961-6
DOI: https://doi.org/10.1007/s12204-018-1961-6
Key words
- opinion extraction
- multi-label classification (MLC)
- binary classification (BC)
- sequence labelling
- recurrent neural network (RNN)