Part of the book series: Studies in Computational Intelligence (SCI, volume 342)

Abstract

Current network measurement systems are becoming highly sophisticated and produce huge amounts of convoluted measurement data and statistics. A very common case is that of networks that report statistics using the NetFlow [15] technology, which can generate several gigabytes of data per day. In addition, these measurements are often very hard to interpret. In this chapter we describe a method that provides linguistic summaries of network traffic measurements, together with a procedure for finding hidden facts in the form of linguistic association rules; in this sense, we address an association rule mining problem. The method is suitable for summarizing and analyzing network measurements at the flow level. As a first step, fuzzy linguistic summaries are applied to analyze NetFlow collections and extract concise, human-consistent summaries from them. Then, a procedure for mining hidden facts in network flow measurements in the form of fuzzy association rules is developed. The method is applied to a wide set of heterogeneous flow measurements and is shown to be of practical use in network operation and traffic engineering [6, 5], where it can help solve a number of current issues.
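To make the two ideas in the abstract concrete, the following is a minimal sketch and not the chapter's implementation. It assumes the classical formulation in which the truth of a summary such as "most flows are short" is the quantifier's membership applied to the mean membership of the records, and it assumes that the support and confidence of a fuzzy rule are computed with the minimum as conjunction. The flow records, the membership breakpoints and all function names are hypothetical.

```python
from typing import Callable, Sequence, Tuple

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear in between."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def summary_truth(values: Sequence[float],
                  summarizer: Membership,
                  quantifier: Membership) -> float:
    """Truth of 'Q records are S': quantifier applied to the mean membership in S."""
    proportion = sum(summarizer(v) for v in values) / len(values)
    return quantifier(proportion)


def rule_support_confidence(xs: Sequence[float],
                            ys: Sequence[float],
                            antecedent: Membership,
                            consequent: Membership) -> Tuple[float, float]:
    """Fuzzy support and confidence of 'x is A => y is B' (min used as AND)."""
    mu_a = [antecedent(x) for x in xs]
    mu_ab = [min(a, consequent(y)) for a, y in zip(mu_a, ys)]
    support = sum(mu_ab) / len(xs)
    total_a = sum(mu_a)
    confidence = sum(mu_ab) / total_a if total_a > 0 else 0.0
    return support, confidence


# Hypothetical flow records: durations in seconds and sizes in bytes.
durations = [0.2, 0.5, 1.0, 3.0, 40.0]
sizes = [100.0, 200.0, 5000.0, 300.0, 1_000_000.0]

# Hypothetical linguistic labels; the breakpoints are illustrative only.
short = trapezoid(0.0, 0.0, 1.0, 5.0)       # "duration is short"
small = trapezoid(0.0, 0.0, 500.0, 1500.0)  # "size is small"
most = trapezoid(0.3, 0.8, 1.0, 1.0)        # relative quantifier "most"

truth = summary_truth(durations, short, most)
support, confidence = rule_support_confidence(durations, sizes, short, small)

print(f"Truth of 'most flows are short': {truth:.3f}")
print(f"Rule 'short => small': support={support:.3f}, confidence={confidence:.3f}")
```

The min operator and the trapezoidal shapes are design choices of this sketch; other t-norms or membership families could be substituted without changing the structure.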


References

  1. Agrawal, R., Imielinski, T., Swami, A.N.: Mining Association Rules Between Sets of Items in Large Databases. In: Buneman, P., Jajodia, S. (eds.) ACM SIGMOD International Conference on Management of Data, pp. 207–216. ACM Press, Washington (1993)

  2. Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.: Fast Discovery of Association Rules. In: Advances in Knowledge Discovery and Data Mining, pp. 307–328. The MIT Press, Cambridge (1996)

  3. Miller, D., et al.: Softflowd - fast software NetFlow probe (2008), http://www.mindrot.org/projects/softflowd/

  4. Anand, S.S., Bell, D.A., Hughes, J.G.: EDM: A General Framework for Data Mining Based on Evidence Theory. Data & Knowledge Engineering 18(3), 189–223 (1996)

  5. Awduche, D., Malcolm, J., Agogbua, J., O’Dell, M., McManus, J.: Requirements for Traffic Engineering Over MPLS. RFC 2702, Internet Engineering Task Force, Network Working Group (1999)

  6. Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., Xiao, X.: Overview and Principles of Internet Traffic Engineering. RFC 3272, Internet Engineering Task Force, Network Working Group, category: Informational (2002)

  7. Boote, J.W., et al.: Towards Multi-Domain Monitoring for the European Research Networks. In: 21st TERENA Networking Conference, Trans-European Research and Education Networking Association, Poznan, Poland (2005)

  8. Borgelt, C.: Efficient Implementations of Apriori and Eclat. In: ICDM Workshop of Frequent Item Set Mining Implementations (FIMI 2003), Melbourne, FL, USA, pp. 24–32 (2003)

  9. Broido, A., Hyun, Y., Gao, R., Claffy, K.C.: Their Share: Diversity and Disparity in IP Traffic. In: 5th Passive and Active Measurement Workshop (PAM), Antibes Juan-Les-Pins, France, pp. 113–125 (2004)

  10. Brownlee, N., Claffy, K.C.: Understanding Internet Traffic Streams: Dragonflies and Tortoises. IEEE Communications Magazine 40(10), 110–117 (2002)

  11. Cai, D., McTear, M.F., McClean, S.I.: Knowledge discovery in distributed databases using evidence theory. International Journal of Intelligent Systems 15(8), 745–761 (2000)

  12. Calyam, P., Krymskiy, D., Sridharan, M., Schopis, P.: Active and Passive Measurements on Campus, Regional and National Network Backbone Paths. In: 14th IEEE International Conference on Computer Communications and Networks (ICCCN 2005), San Diego, California, USA, pp. 537–542 (2005)

  13. Casillas, J., Cordón, O., Herrera, F., Magdalena, L. (eds.): Interpretability Issues in Fuzzy Modeling. Studies in Fuzziness and Soft Computing. Springer, Berlin (2003) ISBN: 978-3-540-02932-8

  14. Choi, B.Y., Bhattacharyya, S.: Observations on Cisco sampled NetFlow. ACM SIGMETRICS Performance Evaluation Review 33(3), 18–23 (2005)

  15. Cisco IOS NetFlow (2007), http://www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html

  16. Claise, B., et al.: Specification of the IPFIX Protocol for the Exchange of IP Traffic Flow Information. Revision 26, Internet Engineering Task Force, IPFIX Working Group, Internet Draft (2007)

  17. Cooperative Association for Internet Data Analysis, CAIDA Visualization Tools (2008), http://www.caida.org/tools/visualization/

  18. Crovella, M.E., Bestavros, A.: Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes. IEEE/ACM Transactions on Networking 5(6), 835–846 (1997)

  19. Crovella, M.E., Krishnamurthy, B.: Internet Measurement: Infrastructure, Traffic and Applications. Wiley, Chichester (2006) ISBN: 978-0470014615

  20. Delgado, M., Marín, N., Sánchez, D., Vila, M.A.: Fuzzy Association Rules: General Model and Applications. IEEE Transactions on Fuzzy Systems 11(2), 214–225 (2003)

  21. Deri, L.: ntop (2008), http://www.ntop.org

  22. Duffield, N.: Sampling for Passive Internet Measurement: A Review. Statistical Science 19(3), 472–498 (2004)

  23. Estan, C., Savage, S., Varghese, G.: Automatically Inferring Patterns of Resource Consumption in Network Traffic. In: ACM SIGCOMM 2003, Karlsruhe, Germany, pp. 137–148 (2003)

  24. Fullmer, M., et al.: flow-tools (2007), http://www.splintered.net/sw/flow-tools/

  25. Guillaume, S.: Designing fuzzy inference systems from data: An interpretability-oriented review. IEEE Transactions on Fuzzy Systems 9(3), 426–443 (2001)

  26. Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York (2003) ISBN: 978-0387952840

  27. Internet2 NetFlow: Weekly Reports (2008), http://netflow.internet2.edu/weekly/

  28. The Internet2 Observatory (2008), http://www.internet2.edu/observatory/

  29. Kacprzyk, J., Yager, R.R.: Linguistic Summaries of Data Using Fuzzy Logic. International Journal of General Systems 30(2), 133–154 (2001)

  30. Kacprzyk, J., Zadrozny, S.: Linguistic Summarization of Data Sets Using Association Rules. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), St. Louis, USA, pp. 702–707 (2003)

  31. Kacprzyk, J., Zadrozny, S.: Linguistic Database Summaries and Their Protoforms: Towards Natural Language Based Knowledge Discovery Tools. Information Sciences 173(4), 281–304 (2005)

  32. Kacprzyk, J., Yager, R.R., Zadrozny, S.: Fuzzy Linguistic Summaries of Databases for an Efficient Business Data Analysis and Decision Support. In: Knowledge Discovery for Business Information Systems. Springer International Series in Engineering and Computer Science, vol. 600, pp. 129–152. Kluwer Academic Publishers, Boston (2001) ISBN: 978079237243

  33. Kotz, D., Henderson, T., Abyzov, I.: CRAWDAD data set dartmouth/campus (v. 2007-02-08) (2007), http://crawdad.cs.dartmouth.edu/dartmouth/campus

  34. Lakhina, A., Papagiannaki, K., Crovella, M.E., Diot, C., Kolaczyk, E.D., Taft, N.: Structural analysis of network traffic flows. In: Joint International Conference on Measurement and Modeling of Computer Systems (ACM SIGMETRICS), New York, NY, USA, pp. 61–72 (2004)

  35. Liétard, L.: A New Definition of Linguistic Summaries of Data. In: 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence, Hong Kong, China, pp. 506–511 (2008)

  36. Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers (I). Interpretations. Fuzzy Sets and Systems 95(1), 1–21 (1998)

  37. Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers (II). Reasoning and applications. Fuzzy Sets and Systems 95(2), 135–146 (1998)

  38. Montesino-Pouzols, F., Barriga, A., Lopez, D.R., Sánchez-Solano, S.: Linguistic Summarization of Network Traffic Flows. In: 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence, Hong Kong, China, pp. 619–624 (2008)

  39. Plonka, D.: FlowScan: A Network Traffic Flow Reporting and Visualization Tool. In: 14th USENIX conference on System administration, New Orleans, Louisiana, USA, pp. 305–318 (2000)

  40. Plonka, D.: Cflow (2005), http://net.doit.wisc.edu/~plonka/Cflow/

  41. Raschia, G., Mouaddib, N.: SAINTETIQ: a fuzzy set-based approach to database summarization. Fuzzy Sets and Systems 129(2), 137–162 (2002)

  42. Rasmussen, D., Yager, R.R.: Finding fuzzy and gradual functional dependencies with SummarySQL. Fuzzy Sets and Systems 106(2), 31–42 (1999)

  43. Rolls, D., Michailidis, G., Hernández-Campos, F.: Queueing Analysis of Network Traffic: Methodology and Visualization Tools. Computer Networks 48(3), 447–473 (2005)

  44. Shalunov, S., Teitelbaum, B.: TCP Use and Performance on Internet2. In: ACM SIGCOMM Internet Measurement Workshop, San Francisco, CA, USA, pp. 147–160 (2001)

  45. Sommers, J., Barford, P., Willinger, W.: SPLAT: A Visualization Tool for Mining Internet Measurements. In: 7th Passive and Active Network Measurement Workshop, pp. 31–40 (2006)

  46. Widely Integrated Distributed Environment (WIDE) Project, MAWI Working Group, Packet traces from wide backbone (2008), http://tracer.csl.sony.co.jp/mawi/

  47. Yager, R.R.: A New Approach to the Summarization of Data. Information Sciences 28, 69–86 (1982)

  48. Yager, R.R.: On Ordered Weighted Averaging Operators in Multicriteria Decision Making. IEEE Transactions on Systems, Man and Cybernetics 18(1), 183–190 (1988)

  49. Yager, R.R.: Database Discovery Using Fuzzy Sets. International Journal of Intelligent Systems 11, 691–712 (1996)

  50. Yager, R.R., Engemann, K.J., Filev, D.P.: On the Concept of Immediate Probabilities. International Journal of Intelligent Systems 10(4), 373–397 (1995)

  51. Yusuf, S., Luk, W., Sloman, M., Dulay, N., Lupu, E.C., Brown, G.: Reconfigurable Architecture for Network Flow Analysis. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 16(2), 57–65 (2008)

  52. Zadeh, L.A.: The concept of a linguistic variable and its application to approximate reasoning. Information Sciences 8(3), 199–249 (1975)

  53. Zadeh, L.A.: A Computational Approach to Fuzzy Quantifiers in Natural Languages. Computers and Mathematics with Applications 9, 149–184 (1983)

  54. Zadeh, L.A.: A Prototype-Centered Approach to Adding Deduction Capability to Search Engines-the Concept of Protoform. In: First International IEEE Symposium on Intelligent Systems, Varna, Bulgaria, vol. 1, pp. 2–3 (2002)

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Pouzols, F.M., Lopez, D.R., Barros, A.B. (2011). Summarization and Analysis of Network Traffic Flow Records. In: Mining and Control of Network Traffic by Computational Intelligence. Studies in Computational Intelligence, vol 342. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18084-2_4

  • DOI: https://doi.org/10.1007/978-3-642-18084-2_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-18083-5

  • Online ISBN: 978-3-642-18084-2

  • eBook Packages: Engineering, Engineering (R0)
