Semi-supervised Learning for Cyberbullying Detection in Social Networks

Nahar, Vinita; Al-Maskari, Sanad; Li, Xue; Pang, Chaoyi

doi:10.1007/978-3-319-08608-8_14

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8506)

Included in the following conference series: Australasian Database Conference

Abstract

Current approaches to cyberbullying detection are mostly static: they are unable to handle noisy, imbalanced or streaming data efficiently. Existing studies on cyberbullying detection are mainly supervised learning approaches, which assume that the data is sufficiently pre-labelled. However, this is impractical in real-world situations, where only a small number of labels are available in streaming data. In this paper, we propose a semi-supervised learning approach that augments the training data samples and applies a fuzzy SVM algorithm. The augmented training technique automatically extracts and enlarges the training set from the unlabelled streaming text, while learning is conducted using a very small training set provided as initial input. The experimental results indicate that the proposed augmented approach outperformed all other methods and is suitable for real-world situations, where sufficient labelled instances are not available for training. With the proposed fuzzy SVM approach, we handle complex and multidimensional data generated by streaming text, where the importance of features is discriminated for the decision function. The evaluation conducted on different experimental scenarios indicates the superiority of the proposed fuzzy SVM over all other methods.
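
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes: start from a very small labelled set, pseudo-label confidently scored messages from an unlabelled pool, and retrain. It assumes a generic self-training loop and a linear SVM with a membership-weighted hinge loss. Note that this sketch weights samples, whereas the abstract describes discriminating the importance of features, so it is a simplification and not the authors' fuzzy SVM. All names (`featurize`, `train_weighted_svm`, `self_train`), the confidence threshold, the membership formula and the toy messages are hypothetical.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts for a lowercased, whitespace-tokenised message."""
    return Counter(text.lower().split())


def score(weights, bias, feats):
    """Linear decision value; positive means the 'bullying' class here."""
    return bias + sum(weights.get(tok, 0.0) * n for tok, n in feats.items())


def train_weighted_svm(samples, epochs=50, lr=0.1, reg=0.01):
    """Linear SVM trained by subgradient descent on a weighted hinge loss.

    samples: list of (feats, label, membership) with label in {-1, +1}.
    membership in (0, 1] scales how strongly a sample's margin violation
    moves the model (an assumed stand-in for a fuzzy membership value).
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for feats, label, membership in samples:
            for tok in weights:  # L2 regularisation shrinks every weight
                weights[tok] *= 1.0 - lr * reg
            if label * score(weights, bias, feats) < 1.0:  # margin violated
                step = lr * membership * label
                for tok, n in feats.items():
                    weights[tok] = weights.get(tok, 0.0) + step * n
                bias += step
    return weights, bias


def self_train(labelled, unlabelled, rounds=3, threshold=0.5):
    """Grow the training set with confidently scored unlabelled messages."""
    samples = [(featurize(text), label, 1.0) for text, label in labelled]
    pool, pseudo = list(unlabelled), []
    weights, bias = train_weighted_svm(samples)
    for _ in range(rounds):
        remaining = []
        for text in pool:
            value = score(weights, bias, featurize(text))
            if abs(value) >= threshold:
                label = 1 if value > 0 else -1
                # Assumed heuristic: more confident scores get more weight.
                membership = min(1.0, abs(value) / (2.0 * threshold))
                samples.append((featurize(text), label, membership))
                pseudo.append((text, label, membership))
            else:
                remaining.append(text)
        if len(remaining) == len(pool):  # nothing new was added, so stop
            break
        pool = remaining
        weights, bias = train_weighted_svm(samples)
    return weights, bias, pseudo


# Toy data, invented purely for illustration.
labelled = [("you are stupid loser", 1), ("have a nice day", -1)]
unlabelled = ["you stupid loser go away", "have a nice day friend",
              "see everyone tomorrow"]
weights, bias, pseudo = self_train(labelled, unlabelled)
```

Here `pseudo` lists the messages that were added to the training set together with their assigned label and membership. Lowering `threshold` enlarges the training set faster at the risk of admitting wrongly labelled messages, which is the usual trade-off in self-training.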

Author information

Authors and Affiliations

School of Information Technology and Electrical Engineering, The University of Queensland, Australia
Vinita Nahar, Sanad Al-Maskari & Xue Li
The Australian E-Health Research Center, CSIRO, Australia
Chaoyi Pang

Editor information

Editors and Affiliations

Centre for Applied Informatics (CAI), College of Engineering and Science, Victoria University, Ballarat Road, 8001, Footscray, VIC, Australia
Hua Wang
Faculty of Engineering, Architecture and Information Technology, School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, 4072, Brisbane, QLD, Australia
Mohamed A. Sharaf

Cite this paper

Nahar, V., Al-Maskari, S., Li, X., Pang, C. (2014). Semi-supervised Learning for Cyberbullying Detection in Social Networks. In: Wang, H., Sharaf, M.A. (eds) Databases Theory and Applications. ADC 2014. Lecture Notes in Computer Science, vol 8506. Springer, Cham. https://doi.org/10.1007/978-3-319-08608-8_14

DOI: https://doi.org/10.1007/978-3-319-08608-8_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08607-1
Online ISBN: 978-3-319-08608-8
eBook Packages: Computer Science, Computer Science (R0)