Abstract
Spam classification is an active research topic in natural language processing, particularly as Internet use for social networking grows. That growth has increased spam activity, with spammers sending unsolicited messages for commercial or non-commercial gain. In this paper, we apply deep learning to the problem. We use Long Short-Term Memory (LSTM), a variant of the recurrent neural network (RNN), for spam classification. Unlike traditional classifiers, which rely on hand-crafted features, LSTM can learn abstract features from the data. Before classification, the text is converted into semantic word vectors with the help of word2vec, WordNet and ConceptNet. The results are compared with those of benchmark classifiers: SVM, Naïve Bayes, ANN, k-NN and Random Forest. Two corpora are used for the comparison: the SMS Spam Collection dataset and a Twitter dataset. Results are evaluated using accuracy and F-measure. The evaluation shows that LSTM outperforms the traditional machine learning methods for spam detection by a considerable margin.
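The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the balanced F1 variant of the F-measure with spam as the positive class, which the abstract does not specify; the function name `accuracy_and_f_measure` and the sample labels are hypothetical and are not taken from the paper.

```python
def accuracy_and_f_measure(y_true, y_pred, positive="spam"):
    """Return (accuracy, F-measure) for a binary spam/ham labelling.

    Assumption: F-measure here is the balanced F1 score, computed for
    the class given by `positive`.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts for the positive (spam) class.
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return accuracy, f_measure


# Made-up labels for illustration only; not data from the paper.
gold = ["spam", "ham", "ham", "spam", "ham", "spam"]
pred = ["spam", "ham", "spam", "spam", "ham", "ham"]
acc, f1 = accuracy_and_f_measure(gold, pred)
```

Reporting both numbers is useful on spam corpora because the classes are usually imbalanced: a classifier that labels every message as ham can reach high accuracy while its F-measure for the spam class is zero.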
Cite this article
Jain, G., Sharma, M. & Agarwal, B. Optimizing semantic LSTM for spam detection. Int. j. inf. tecnol. 11, 239–250 (2019). https://doi.org/10.1007/s41870-018-0157-5