Research of Spam Filtering System Based on LSA and SHA

Sun, Jingtao; Zhang, Qiuyu; Yuan, Zhanting; Huang, Wenhan; Yan, Xiaowen; Dong, Jianshe

doi:10.1007/978-3-540-87734-9_38

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5264)

Abstract

Spam is a widespread concern, yet current spam filtering systems suffer from two problems: incomplete semantic analysis and low effectiveness against multi-send (bulk) spam. This paper proposes a spam filtering model based on latent semantic analysis (LSA) and the Secure Hash Algorithm (SHA). LSA is used to mark the latent feature phrases in spam, which brings semantic analysis into the filtering technique. SHA is then applied to the result of the LSA analysis to generate an "e-mail fingerprint" for multi-send spam, which addresses the low effectiveness of filtering on such spam. We designed a spam filtering system based on this model and evaluated it on a selected dataset. Compared with the results of a KNN-based filter, the system based on LSA and SHA outperforms KNN. The experiments produced the expected results and validate the feasibility and advantages of the new spam filtering method.
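The fingerprinting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how the LSA output is fed to the hash, so the function names `email_fingerprint` and `is_known_spam`, the normalization rules, and the choice of SHA-1 as the SHA variant are all assumptions made for illustration. The LSA step is assumed to have already produced a list of feature phrases for each message.

```python
import hashlib


def email_fingerprint(feature_phrases):
    """Return a SHA-1 hex digest over a normalized set of feature phrases.

    `feature_phrases` stands in for the output of the LSA step, which the
    abstract does not specify; the normalization below is an assumption.
    """
    # Lowercase, collapse whitespace, and deduplicate so that ordering and
    # trivial formatting differences do not change the fingerprint.
    normalized = sorted({" ".join(p.lower().split()) for p in feature_phrases})
    # A newline cannot occur inside a normalized phrase, so it is a safe separator.
    payload = "\n".join(normalized).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()


def is_known_spam(feature_phrases, known_fingerprints):
    """Check a message's fingerprint against a set of stored fingerprints."""
    return email_fingerprint(feature_phrases) in known_fingerprints


# Two copies of the same bulk message whose extracted phrases differ only in
# order, spacing, and letter case map to the same fingerprint.
copy_a = ["Free  Offer", "click here", "limited time"]
copy_b = ["limited time", "free offer", "Click Here"]
known = {email_fingerprint(copy_a)}
```

Under these assumptions, `is_known_spam(copy_b, known)` matches the second copy with a single set lookup, which is the kind of cheap exact-match check that makes a fingerprint attractive for repeated bulk messages. Hashing the extracted feature phrases rather than the raw message body is what lets small surface changes between copies leave the fingerprint unchanged.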

Author information

Authors and Affiliations

College of Electrical and Information Engineering, Lanzhou University of Technology, 730050, Lanzhou, China
Jingtao Sun
College of Computer and Communication, Lanzhou University of Technology, 730050, Lanzhou, China
Jingtao Sun, Qiuyu Zhang, Zhanting Yuan & Jianshe Dong
Department of Computer Science and Technology, Shaanxi University of Technology, 723003, Hanzhong, China
Wenhan Huang
Shaanxi Xiyu Highway Corporation Ltd., Hancheng, 715400, Shaanxi, China
Xiaowen Yan

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Fuchun Sun
Institute TAMS (Technical Aspects of Multimodal Systems), Department of Informatics, University of Hamburg, Vogt-Koelln-Straße 30, 22527, Hamburg, Germany
Jianwei Zhang
Intel China Research Center, 8/F, Peking University, Department of Machine Intelligence, 100871, Beijing, China
Ying Tan
Department of Mathematics, Southeast University, 210096, Nanjing, China
Jinde Cao
Departamento de Control Automático, CINVESTAV-IPN, A.P. 14-740, Av.IPN 2508, 07360, México D.F., México
Wen Yu

Cite this paper

Sun, J., Zhang, Q., Yuan, Z., Huang, W., Yan, X., Dong, J. (2008). Research of Spam Filtering System Based on LSA and SHA. In: Sun, F., Zhang, J., Tan, Y., Cao, J., Yu, W. (eds) Advances in Neural Networks - ISNN 2008. ISNN 2008. Lecture Notes in Computer Science, vol 5264. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87734-9_38

DOI: https://doi.org/10.1007/978-3-540-87734-9_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87733-2
Online ISBN: 978-3-540-87734-9
eBook Packages: Computer Science, Computer Science (R0)
