Cybersecurity in Nigeria, pp 3–22

# Natural Laws (Benford’s Law and Zipf’s Law) for Network Traffic Analysis

## Abstract

Recently, Benford’s law and Zipf’s law, both statistical laws, have been used effectively to distinguish authentic data from fake data. The two laws share several properties: both are classified as natural laws, both are power laws, and distributions that follow Benford’s law are expected to follow Zipf’s law as well. They also differ. Benford’s law relates digit to frequency, whereas Zipf’s law relates rank to frequency. Benford’s law applies to numeric attributes, whereas Zipf’s law applies to both numeric and string attributes. In this chapter, we perform a comparative analysis of the two laws on network traffic data to determine whether the data follow these laws and whether the laws can discriminate between non-malicious and malicious network traffic flows. We observe that both laws effectively detected whether a particular network’s traffic was non-malicious or malicious. Furthermore, we observe that the initial Benford’s law chi-square divergence values appear to be inversely proportional to the Zipf’s law P-values, a relationship that could potentially be exploited in intrusion detection systems. When properly deployed to analyse network traffic data in Nigeria, these passive forensic detection methods can help protect Nigerian cyberspace from malware and related attacks.
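To make the digit-versus-frequency and rank-versus-frequency distinction concrete, the following is a minimal Python sketch, not the chapter's implementation. It assumes the numeric attribute is a positive integer (for example, a hypothetical per-flow byte count) and that the chi-square divergence is computed on first-digit proportions against the standard Benford probabilities P(d) = log10(1 + 1/d). The function names are illustrative. The Zipf part only estimates the log-log slope of the rank-frequency curve and does not reproduce the P-value test mentioned above.

```python
import math
from collections import Counter


def benford_expected():
    """Standard Benford first-digit probabilities P(d) = log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(value):
    """Leading decimal digit of a non-zero integer (assumption: integer data)."""
    return int(str(abs(int(value)))[0])


def benford_chi_square(values):
    """Chi-square divergence between observed and Benford first-digit proportions.

    Assumed form: sum over d of (observed_d - expected_d)^2 / expected_d.
    """
    digits = [first_digit(v) for v in values if abs(int(v)) >= 1]
    if not digits:
        raise ValueError("no non-zero values to analyse")
    counts = Counter(digits)
    n = len(digits)
    return sum(
        (counts.get(d, 0) / n - p) ** 2 / p
        for d, p in benford_expected().items()
    )


def zipf_slope(items):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic Zipf signature. Items may be numbers
    or strings (for example, destination addresses).
    """
    freqs = sorted(Counter(items).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct items")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic stand-ins, not real traffic: powers of two are known to
    # follow Benford's law closely, while 1..999 has uniform first digits.
    benford_like = [2 ** k for k in range(1000)]
    uniform_digits = list(range(1, 1000))
    print("divergence (Benford-like):", benford_chi_square(benford_like))
    print("divergence (uniform digits):", benford_chi_square(uniform_digits))

    # Hypothetical address labels whose frequencies fall off as 1/rank.
    addresses = [f"host{r}" for r in range(1, 6) for _ in range(60 // r)]
    print("Zipf slope:", zipf_slope(addresses))
```

In this sketch a small divergence indicates first digits close to Benford's law, and a larger one indicates departure from it. Any decision threshold for labelling traffic as malicious would have to be chosen from labelled data and is not implied here.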

## Keywords

Benford’s law, Zipf’s law, Network traffic analysis, Cyber space

## Notes

### Acknowledgements

The author would like to specially thank Prof. Anthony T.S. Ho, Prof. Adrian Waller, Prof. Shujun Li, Dr. Norman Poh and Dr. Santosh Tirunagari for their assistance.
