Semi-supervised Learning for Cyberbullying Detection in Social Networks

  • Vinita Nahar
  • Sanad Al-Maskari
  • Xue Li
  • Chaoyi Pang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8506)


Current approaches to cyberbullying detection are mostly static: they are unable to handle noisy, imbalanced or streaming data efficiently. Existing studies on cyberbullying detection are mainly supervised learning approaches that assume the data is sufficiently pre-labelled. However, this is impractical in real-world situations, where only a small number of labels are available in streaming data. In this paper, we propose a semi-supervised learning approach that augments the training data samples and applies a fuzzy SVM algorithm. The augmented training technique automatically extracts and enlarges the training set from the unlabelled streaming text, while learning is conducted using a very small training set provided as an initial input. The experimental results indicate that the proposed augmented approach outperformed all other methods and is suitable for real-world situations, where sufficiently labelled instances are not available for training. With the proposed fuzzy SVM approach, we handle the complex and multidimensional data generated by streaming text, where the importance of features is discriminated for the decision function. The evaluation conducted on different experimental scenarios indicates the superiority of the proposed fuzzy SVM over all other methods.
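The idea of enlarging a small labelled seed set from an unlabelled text stream can be illustrated with a minimal self-training sketch. This is not the method of the paper: a nearest-centroid classifier with cosine similarity stands in for the fuzzy SVM, and the function names, the margin rule, the `threshold` value and the toy messages are all assumptions made for illustration.

```python
from collections import Counter
import math


def vectorize(text):
    """Bag-of-words term counts for a lower-cased, whitespace-tokenised message."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroids(labelled):
    """Sum the term counts of each class (cosine similarity ignores the scale)."""
    cents = {}
    for text, label in labelled:
        cents.setdefault(label, Counter()).update(vectorize(text))
    return cents


def predict(cents, text):
    """Return (label, margin): the best class and its lead over the runner-up."""
    vec = vectorize(text)
    scores = sorted(((cosine(vec, c), label) for label, c in cents.items()),
                    reverse=True)
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up


def augment(seed, stream, threshold=0.3):
    """Self-training: add a streamed message to the training set only when the
    current model labels it with a margin of at least `threshold` (hypothetical)."""
    labelled = list(seed)
    for text in stream:
        label, margin = predict(centroids(labelled), text)
        if margin >= threshold:
            labelled.append((text, label))
    return labelled


# Toy data, invented for this sketch.
seed = [("you are stupid and ugly", "bully"),
        ("have a nice day friend", "normal")]
stream = ["you are so stupid", "nice day today", "the weather report"]
augmented = augment(seed, stream)
```

In this toy run the first two streamed messages are labelled with a clear margin and join the training set, while the third shares no terms with either class and is left out. A real system would need a stronger base classifier and a safeguard against reinforcing its own early mistakes.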


Cyberbullying Detection, Text-Stream Classification, Semi-supervised learning, Social Networks





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Vinita Nahar (1)
  • Sanad Al-Maskari (1)
  • Xue Li (1)
  • Chaoyi Pang (2)
  1. School of Information Technology and Electrical Engineering, The University of Queensland, Australia
  2. The Australian E-Health Research Center, CSIRO, Australia
