The distributed representation for societal risk classification toward BBS posts

Chen, Jindong; Tang, Xijin

doi:10.1007/s11424-016-5099-z

Published: 29 December 2016

Volume 30, pages 627–644, (2017)

Journal of Systems Science and Complexity

Jindong Chen¹,² & Xijin Tang¹


Abstract

Classifying the risk of BBS posts is important for evaluating the societal risk level within a given period. Using posts collected from the Tianya forum as the data source, the authors adopt societal risk indicators from social psychology and conduct document-level classification of BBS posts into multiple societal risk categories. To capture both the semantics and the word order of documents, Paragraph Vector, a shallow neural network model, is applied to obtain distributed vector representations of the posts. Based on these document vectors, the authors apply the k-nearest neighbor (KNN) classifier to identify the societal risk category of each post. The experimental results show that, in document-level societal risk classification, Paragraph Vector trains much faster than Bag-of-Words and improves F-measures by at least 10%. The performance of Paragraph Vector is also superior to that of the edit distance and Lucene-based search methods. The present work is the first attempt to combine a document embedding method with social psychology research results in the area of public opinion.
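The classification step described in the abstract can be illustrated with a minimal sketch, assuming each post has already been mapped to a fixed-length document vector. The abstract does not state the similarity measure, the value of k, or the category names, so the cosine similarity, `k=3`, the labels `risk_a`, `risk_b`, and `risk_free`, and the toy vectors below are all hypothetical. In practice the vectors would come from a trained Paragraph Vector model (a library implementation such as gensim's `Doc2Vec` is one possible source, which is an assumption and not something the abstract specifies).

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_classify(query, labeled_vectors, k=3):
    """Return the majority label among the k training vectors most similar to `query`.

    `labeled_vectors` is a list of (vector, label) pairs. The similarity
    measure and the default k are illustrative assumptions.
    """
    ranked = sorted(
        labeled_vectors,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    return Counter(top_labels).most_common(1)[0][0]


# Hypothetical 3-dimensional document vectors with made-up category labels.
TRAIN = [
    ([0.9, 0.1, 0.0], "risk_a"),
    ([0.8, 0.2, 0.1], "risk_a"),
    ([0.1, 0.9, 0.2], "risk_b"),
    ([0.0, 0.8, 0.3], "risk_b"),
    ([0.1, 0.1, 0.9], "risk_free"),
]

if __name__ == "__main__":
    # Classify an unseen (toy) post vector by majority vote of its neighbors.
    print(knn_classify([0.85, 0.15, 0.05], TRAIN, k=3))
```

The same `knn_classify` function would work with vectors of any dimensionality, so a Bag-of-Words representation could be swapped in for comparison without changing the classifier.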



Author information

Authors and Affiliations

Institute of Systems Science, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
Jindong Chen & Xijin Tang
China Academy of Aerospace Systems Science and Engineering, Beijing, 100048, China
Jindong Chen

Corresponding author

Correspondence to Jindong Chen.

Additional information

This research is supported by the National Natural Science Foundation of China under Grant Nos. 71171187, 71371107, and 61473284.

This paper was recommended for publication by Editor WANG Shouyang.

About this article

Cite this article

Chen, J., Tang, X. The distributed representation for societal risk classification toward BBS posts. J Syst Sci Complex 30, 627–644 (2017). https://doi.org/10.1007/s11424-016-5099-z

Received: 15 April 2015
Revised: 25 August 2015
Published: 29 December 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s11424-016-5099-z
