Forecasting Suspicious Account Activity at Large-Scale Online Service Providers

Halawa, Hassan; Beznosov, Konstantin; Coskun, Baris; Liu, Meizhu; Ripeanu, Matei

doi:10.1007/978-3-030-32101-7_33

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11598)

Included in the following conference series:

International Conference on Financial Cryptography and Data Security

Abstract

In the face of large-scale automated social engineering attacks on large online services, fast detection and remediation of compromised accounts are crucial to limit the spread of an attack and to mitigate the overall damage to users, companies, and the public at large. We advocate a fully automated approach based on machine learning: we develop an early warning system that harnesses account activity traces to predict which accounts are likely to be compromised in the future. We demonstrate the feasibility and applicability of the system through an experiment at a large-scale online service provider using four months of real-world production data encompassing hundreds of millions of users. We show that, even when we limit ourselves to login data only (in order to derive features with low computational cost) and to a basic model selection approach, our classifier can be tuned to achieve good classification precision when used for forecasting. Our system correctly identifies, up to one month in advance, the accounts later flagged as suspicious, with precision, recall, and false positive rates that indicate the mechanism is likely to prove valuable in operational settings to support additional layers of defense.

This work was done when Baris Coskun was with Yahoo! Research.
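
The abstract evaluates the forecasts by precision, recall, and false positive rate. As a reminder of how these three quantities relate, the following minimal Python sketch computes them from a list of forecasts and a list of later outcomes. The function name `forecast_metrics`, its parameters, and the toy data are hypothetical and chosen only for illustration; this is not the authors' implementation and the numbers are not results from the paper.

```python
from typing import Dict, Sequence


def forecast_metrics(predicted: Sequence[bool],
                     later_flagged: Sequence[bool]) -> Dict[str, float]:
    """Compare forecasts with outcomes observed later (hypothetical helper).

    predicted[i]     -- True if account i was forecast as likely to be flagged
    later_flagged[i] -- True if account i was in fact flagged afterwards
    """
    if len(predicted) != len(later_flagged):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(predicted, later_flagged))
    tp = sum(1 for p, y in pairs if p and y)          # forecast and flagged
    fp = sum(1 for p, y in pairs if p and not y)      # forecast, never flagged
    fn = sum(1 for p, y in pairs if not p and y)      # missed by the forecast
    tn = sum(1 for p, y in pairs if not p and not y)  # correctly left alone

    return {
        # Share of forecast accounts that were later flagged.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of later-flagged accounts that were forecast.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of never-flagged accounts that were wrongly forecast.
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Toy data, invented for illustration only.
toy_predicted = [True, True, False, False, True, False]
toy_flagged = [True, False, True, False, True, False]
print(forecast_metrics(toy_predicted, toy_flagged))
```

When accounts that are later flagged make up only a small fraction of a very large user base, a small false positive rate can still correspond to many wrongly forecast accounts, which is one reason to read precision alongside it.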

Notes

1. Where the context makes the notation unambiguous, we skip the prefix and use DW only for training-DW or testing-DW. Similarly for LW.

Author information

Authors and Affiliations

University of British Columbia, Vancouver, Canada
Hassan Halawa, Konstantin Beznosov & Matei Ripeanu
Amazon Web Services, New York, USA
Baris Coskun
Yahoo! Research, New York, USA
Meizhu Liu

Corresponding author

Correspondence to Hassan Halawa.

Editor information

Editors and Affiliations

Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
Ian Goldberg
Tandy School of Computer Science, University of Tulsa, Tulsa, USA
Tyler Moore

About this paper

Cite this paper

Halawa, H., Beznosov, K., Coskun, B., Liu, M., Ripeanu, M. (2019). Forecasting Suspicious Account Activity at Large-Scale Online Service Providers. In: Goldberg, I., Moore, T. (eds) Financial Cryptography and Data Security. FC 2019. Lecture Notes in Computer Science, vol 11598. Springer, Cham. https://doi.org/10.1007/978-3-030-32101-7_33

DOI: https://doi.org/10.1007/978-3-030-32101-7_33
Published: 30 September 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32100-0
Online ISBN: 978-3-030-32101-7
eBook Packages: Computer Science, Computer Science (R0)
