That Ain’t You: Blocking Spearphishing Through Behavioral Modelling

  • Conference paper
Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA 2015)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9148)

Abstract

One of the ways in which attackers steal sensitive information from corporations is by sending spearphishing emails. A typical spearphishing email appears to be sent by one of the victim’s coworkers or business partners, but has instead been crafted by the attacker. A particularly insidious type of spearphishing email not only claims to be written by a certain person, but is also sent from that person’s email account, which has been compromised. Spearphishing emails are very dangerous for companies, because they can be the starting point of a more sophisticated attack or cause intellectual property theft, leading to high financial losses. Currently, there are no effective systems to protect users against such threats. Existing systems leverage adaptations of anti-spam techniques, but these are often inadequate for detecting spearphishing attacks, because spearphishing has very different characteristics from spam and even from traditional phishing. To fight the spearphishing threat, we propose a change of focus in the techniques used for detecting malicious emails: instead of looking for features that are indicative of attack emails, we look for emails that claim to have been written by a certain person within a company, but were actually authored by an attacker. We do this by modelling the email-sending behavior of users over time, and comparing any subsequent email sent from their accounts against this model. Our approach can block advanced email attacks that traditional protection systems are unable to detect, and is an important step towards detecting advanced spearphishing attacks.
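
The idea of modelling a sender's behavior and checking later emails against that model can be illustrated with a minimal sketch. This is not the authors' method: the feature names (`time_of_day`, `recipient_domain`, `has_url`), the `SenderProfile` class, and the frequency-based score are all hypothetical, chosen only to show the general shape of building a per-account profile and comparing a new email against it.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class SenderProfile:
    """Per-account history of categorical feature values (hypothetical structure)."""
    n_emails: int = 0
    counts: dict = field(default_factory=lambda: defaultdict(Counter))

    def update(self, features: dict) -> None:
        # Record one email known to have been sent by the account owner.
        self.n_emails += 1
        for name, value in features.items():
            self.counts[name][value] += 1

    def anomaly_score(self, features: dict) -> float:
        # Mean rarity over features: 1.0 means every value is new for this
        # account, 0.0 means every value appeared in all past emails.
        if self.n_emails == 0 or not features:
            return 1.0
        rarity = [
            1.0 - self.counts.get(name, Counter())[value] / self.n_emails
            for name, value in features.items()
        ]
        return sum(rarity) / len(rarity)


# Build a profile from four past emails (made-up feature values).
profile = SenderProfile()
for slot in ("morning", "morning", "afternoon", "morning"):
    profile.update({"time_of_day": slot,
                    "recipient_domain": "example.com",
                    "has_url": False})

usual = {"time_of_day": "morning", "recipient_domain": "example.com", "has_url": False}
odd = {"time_of_day": "night", "recipient_domain": "unknown.example", "has_url": True}

print(profile.anomaly_score(usual))  # low: matches the account's history
print(profile.anomaly_score(odd))    # high: nothing matches the history
```

An email whose score exceeds some threshold would be held for review. The threshold, like the features, would have to be tuned on real data, and a learned classifier could replace the simple frequency heuristic used here.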

Acknowledgments

This work was supported by a Symantec Research Labs Graduate Fellowship for the year 2012. We would like to thank the anonymous reviewers for their useful comments. We would also like to thank the people at Symantec, in particular Marc Dacier, David T. Lin, Dermot Harnett, Joe Krug, David Cawley, and Nick Johnston, for their support and comments. We would also like to thank Adam Doupé and Ali Zand for reviewing an early version of this paper. Their feedback was very helpful.

Author information

Corresponding author

Correspondence to Gianluca Stringhini.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Stringhini, G., Thonnard, O. (2015). That Ain’t You: Blocking Spearphishing Through Behavioral Modelling. In: Almgren, M., Gulisano, V., Maggi, F. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2015. Lecture Notes in Computer Science, vol 9148. Springer, Cham. https://doi.org/10.1007/978-3-319-20550-2_5

  • DOI: https://doi.org/10.1007/978-3-319-20550-2_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20549-6

  • Online ISBN: 978-3-319-20550-2

  • eBook Packages: Computer Science, Computer Science (R0)
