Modeling Server Workloads for Campus Email Traffic Using Recurrent Neural Networks

  • Spyros Boukoros
  • Anupiya Nugaliyadde
  • Angelos Marnerides
  • Costas Vassilakis
  • Polychronis Koutsakis
  • Kok Wai Wong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10638)

Abstract

As email workloads keep rising, email servers need to handle this explosive growth while offering good quality of service to users. In this work, we focus on modeling the workload of the email servers of four universities (two from Greece, one from the UK, and one from Australia). We model all types of email traffic, including user and system emails, as well as spam. We initially tested some of the most popular distributions for workload characterization and used statistical tests to evaluate our findings. The significant differences in prediction accuracy across the four datasets led us to investigate a Recurrent Neural Network (RNN) as a time series model of the server workload, which is a first for such a problem. Our results show that RNN modeling leads in most cases to high modeling accuracy for all four campus email traffic datasets.
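
The distribution-testing step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the choice of an exponential candidate, the Kolmogorov-Smirnov statistic as the goodness-of-fit measure, the synthetic data, and the helper names `fit_exponential_rate` and `ks_statistic` are all assumptions made for illustration only.

```python
import math
import random


def fit_exponential_rate(samples):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    return len(samples) / sum(samples)


def ks_statistic(samples, cdf):
    """Kolmogorov-Smirnov statistic: largest gap between the empirical CDF and `cdf`."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic stand-in for a measured workload quantity (NOT data from the paper).
rng = random.Random(0)
workload = [rng.expovariate(2.0) for _ in range(2000)]

# Candidate 1 (assumed): exponential with the fitted rate.
rate = fit_exponential_rate(workload)
d_exp = ks_statistic(workload, lambda x: 1.0 - math.exp(-rate * x))

# Candidate 2 (assumed, deliberately poor): uniform on [0, max(workload)].
upper = max(workload)
d_uni = ks_statistic(workload, lambda x: min(max(x / upper, 0.0), 1.0))

print(f"fitted rate={rate:.3f}  KS(exponential)={d_exp:.4f}  KS(uniform)={d_uni:.4f}")
```

A smaller statistic indicates a closer fit, so comparing the two values ranks the candidate distributions for this one dataset. Repeating the comparison per dataset is one way the accuracy differences described above could surface.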

Keywords

Email traffic; Model server workload; Recurrent Neural Network; Time series modeling

Notes

Acknowledgements

We would like to sincerely thank Mr. Panagiotis Kontogiannis, Head of the Educational Computational Infrastructure at the Technical University of Crete, Mr. Martin Connell, Senior Systems Engineer at LJMU, and Mr. Mario Pinelli, Manager of Computer Services and IT at Murdoch University. Without their help in collecting the datasets, this research would not have been possible.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Spyros Boukoros (1)
  • Anupiya Nugaliyadde (2)
  • Angelos Marnerides (3)
  • Costas Vassilakis (4)
  • Polychronis Koutsakis (2)
  • Kok Wai Wong (2, corresponding author)
  1. Department of Computer Science, Technische Universität Darmstadt, Darmstadt, Germany
  2. School of Engineering and Information Technology, Murdoch University, Perth, Australia
  3. School of Computing and Communications, Lancaster University, Lancaster, UK
  4. Department of Informatics and Telecommunications, University of Peloponnese, Tripoli, Greece