Automated Spam Detection in Short Text Messages

  • Gaurav Goswami
  • Richa Singh
  • Mayank Vatsa
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 390)

Abstract

The growing popularity and reach of short text messages have led to their use in propagating unsolicited advertising, promotional offers, and other unwarranted material to users, resulting in a high influx of such spam messages. To protect the interests of users, telecommunication companies have deployed several countermeasures to reduce the volume of such spam. However, some spam messages still evade these measures and cause varying degrees of annoyance to users. In this chapter, an automated spam detection algorithm is proposed to deal with the particular problem of short text message spam. The proposed algorithm performs two-class (spam, ham) classification using stylistic and text features specific to short text messages. The algorithm is evaluated on three databases belonging to diverse demographic settings. Experimental results indicate that the proposed algorithm is highly accurate in detecting spam in short messages and can be utilized by a wide variety of users to reduce the volume of spam messages.
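The abstract does not list the individual features, so the following is only a minimal sketch of what stylistic feature extraction for a short message might look like. The function name `stylistic_features` and every feature in it (character count, word count, digit ratio, uppercase ratio, URL presence) are illustrative assumptions, not the feature set used in the chapter. A two-class (spam, ham) classifier would then be trained on vectors of this kind.

```python
import re
from typing import Dict

# Assumption: a simple pattern for spotting links; not taken from the chapter.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)


def stylistic_features(message: str) -> Dict[str, float]:
    """Compute hypothetical stylistic features for one short text message."""
    words = message.split()
    n_chars = len(message)
    letters = [c for c in message if c.isalpha()]
    n_digits = sum(c.isdigit() for c in message)
    n_upper = sum(c.isupper() for c in letters)
    return {
        "char_count": float(n_chars),
        "word_count": float(len(words)),
        # Share of characters that are digits (prices, phone numbers, codes).
        "digit_ratio": n_digits / n_chars if n_chars else 0.0,
        # Share of letters that are uppercase (shouted promotional text).
        "upper_ratio": n_upper / len(letters) if letters else 0.0,
        # 1.0 if the message contains something that looks like a URL.
        "has_url": 1.0 if URL_PATTERN.search(message) else 0.0,
    }


if __name__ == "__main__":
    # Made-up example messages, used only to show the shape of the output.
    for text in ("WIN 100 now", "Visit www.example.com", "see you at lunch"):
        print(text, stylistic_features(text))
```

The features are left unscaled here; in practice they would likely need normalization before being passed to a classifier, since the raw character count is much larger than the ratio features.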

Keywords

Text Feature, Text Message, Short Message Service, Word Count, Short Message
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgments

The authors would like to thank the authors of [17] for providing the SMS dataset.

References

  1. Almeida, T.A., Hidalgo, J.M.G., Yamakami, A.: Contributions to the study of SMS spam filtering: new collection and results. In: ACM Symposium on Document Engineering, pp. 259–262 (2011)
  2. Chang, C.C., Lin, C.J.: LIBSVM: A library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3) (2011)
  3. Cormack, G.V., Gómez Hidalgo, J.M., Sánz, E.P.: Spam filtering for short messages. In: Conference on Information and Knowledge Management, pp. 313–320 (2007)
  4. Delany, S.J., Zamolotskikh, A.: An assessment of case base reasoning for short text message classification (2004)
  5. Deng, W.W., Peng, H.: Research on a naive Bayesian based short message filtering system. In: International Conference on Machine Learning and Cybernetics, pp. 1233–1237 (2006)
  6. Gómez Hidalgo, J.M., Bringas, G.C., Sánz, E.P., García, F.C.: Content based SMS spam filtering. In: ACM Symposium on Document Engineering, pp. 107–114 (2006)
  7. Jiang, N., Jin, Y., Skudlark, A., Zhang, Z.L.: Understanding SMS spam in a large cellular network: characteristics, strategies and defenses. Res. Attacks Intrusions Defenses 8145, 328–347 (2013)
  8. Junaid, M.B., Farooq, M.: Using evolutionary learning classifiers to do mobile spam (SMS) filtering. In: Genetic and Evolutionary Computation Conference, pp. 1795–1802 (2011)
  9. Liu, W., Wang, T.: Index-based online text classification for SMS spam filtering. J. Comput. 5(6), 844–851 (2010)
  10. Longzhen, D., An, L., Longjun, H.: A new spam short message classification. Int. Workshop Educ. Technol. Comput. Sci. 2, 168–171 (2009)
  11. Murynets, I., Piqueras Jover, R.: Crime scene investigation: SMS spam data analysis. In: ACM Conference on Internet Measurement Conference, pp. 441–452 (2012)
  12. Narayan, A., Saxena, P.: The curse of 140 characters: evaluating the efficacy of SMS spam detection on Android. In: Third ACM Workshop on Security and Privacy in Smartphones and Mobile Devices, pp. 33–42 (2013)
  13. Qian, X., Evan, W.X., Yang, Q.: SMS spam detection using non-content features (2012)
  14. Rafique, M.Z., Farooq, M.: SMS spam detection by operating on byte-level distributions using hidden Markov models (HMMs). In: Virus Bulletin International Conference (2010)
  15.
  16. Xiang, Y., Chowdhury, M., Ali, S.: Filtering mobile spam by support vector machine. In: Conference on Computer Sciences, Software Engineering, Information Technology, E-Business and Applications, pp. 1–4 (2004)
  17. Yadav, K., Kumaraguru, P., Goyal, A., Gupta, A., Naik, V.: SMSAssassin: crowdsourcing driven mobile-based system for SMS spam filtering. In: Mobile Computing Systems and Applications, HotMobile, pp. 1–6 (2011)

Copyright information

© Springer India 2016

Authors and Affiliations

  1. Indraprastha Institute of Information Technology, Delhi, India
