Abstract
Current network measurement systems are becoming highly sophisticated and produce huge amounts of complex measurement data and statistics. A common case is networks that report statistics using NetFlow [15] technology, which can generate several gigabytes of data per day. These measurements are also often very hard to interpret. In this chapter we describe a method that provides linguistic summaries of network traffic measurements, together with a procedure for finding hidden facts in the form of linguistic association rules; the latter amounts to an association rule mining problem. The method is suited to summarizing and analyzing network measurements at the flow level. As a first step, fuzzy linguistic summaries are used to extract concise, human-consistent summaries from NetFlow collections. Then, a procedure is developed for mining hidden facts in network flow measurements in the form of fuzzy association rules. The method is applied to a wide set of heterogeneous flow measurements and is shown to be of practical use in network operation and traffic engineering [6, 5], where it can help solve a number of current issues.
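As a rough illustration of the two steps named in the abstract, the sketch below evaluates one linguistic summary ("most flows are short") and one fuzzy association rule ("short duration implies small size") over a handful of flow records. It is not the chapter's implementation: it assumes the common formulation in which the truth of a quantified summary is the quantifier applied to the mean membership degree, and it uses the minimum as fuzzy conjunction. The field names (`duration_s`, `bytes`), the membership breakpoints, and the sample records are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Flow = Dict[str, float]
Membership = Callable[[float], float]


def falling(lo: float, hi: float) -> Membership:
    """Fuzzy set that is fully true up to `lo` and fully false from `hi` on."""
    def mu(x: float) -> float:
        if x <= lo:
            return 1.0
        if x >= hi:
            return 0.0
        return (hi - x) / (hi - lo)
    return mu


def rising(lo: float, hi: float) -> Membership:
    """Fuzzy set that is fully false up to `lo` and fully true from `hi` on."""
    def mu(x: float) -> float:
        if x <= lo:
            return 0.0
        if x >= hi:
            return 1.0
        return (x - lo) / (hi - lo)
    return mu


# Hypothetical linguistic terms; breakpoints are chosen for illustration only.
short = falling(1.0, 11.0)         # flow duration in seconds
small = falling(1000.0, 11000.0)   # flow size in bytes
most = rising(0.3, 0.8)            # relative quantifier "most"


def summary_truth(flows: List[Flow], field: str,
                  term: Membership, quantifier: Membership) -> float:
    """Truth of 'Q flows are <term>': Q applied to the mean membership degree."""
    if not flows:
        return 0.0
    proportion = sum(term(f[field]) for f in flows) / len(flows)
    return quantifier(proportion)


def rule_support_confidence(flows: List[Flow],
                            a_field: str, a_term: Membership,
                            c_field: str, c_term: Membership) -> Tuple[float, float]:
    """Fuzzy support and confidence of 'A => C', using min as conjunction."""
    both = sum(min(a_term(f[a_field]), c_term(f[c_field])) for f in flows)
    antecedent = sum(a_term(f[a_field]) for f in flows)
    support = both / len(flows) if flows else 0.0
    confidence = both / antecedent if antecedent > 0 else 0.0
    return support, confidence


# Hypothetical flow records (not taken from any real NetFlow collection).
flows: List[Flow] = [
    {"duration_s": 0.5, "bytes": 500.0},
    {"duration_s": 1.0, "bytes": 1000.0},
    {"duration_s": 6.0, "bytes": 6000.0},
    {"duration_s": 20.0, "bytes": 50000.0},
    {"duration_s": 0.2, "bytes": 20000.0},
]

truth = summary_truth(flows, "duration_s", short, most)
support, confidence = rule_support_confidence(
    flows, "duration_s", short, "bytes", small)
print(f"T(most flows are short) = {truth:.3f}")
print(f"short => small: support = {support:.3f}, confidence = {confidence:.3f}")
```

In this formulation a summary with truth close to 1 is a candidate for reporting to an operator, while support and confidence play the same filtering role for rules that they do in crisp association rule mining.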
References
Agrawal, R., Imielinski, T., Swami, A.N.: Mining Association Rules Between Sets of Items in Large Databases. In: Buneman, P., Jajodia, S. (eds.) ACM SIGMOD International Conference on Management of Data, pp. 207–216. ACM Press, Washington (1993)
Agrawal, R., Mannila, H., Srikant, R., Toivonen, H., Verkamo, A.: Fast Discovery of Association Rules. In: Advances in Knowledge Discovery and Data Mining, pp. 307–328. The MIT Press, Cambridge (1996)
Miller, D., et al.: Softflowd - fast software NetFlow probe (2008), http://www.mindrot.org/projects/softflowd/
Anand, S.S., Bell, D.A., Hughes, J.G.: EDM: A General Framework for Data Mining Based on Evidence Theory. Data & Knowledge Engineering 18(3), 189–223 (1996)
Awduche, D., Malcolm, J., Agogbua, J., O’Dell, M., McManus, J.: Requirements for Traffic Engineering Over MPLS. RFC 2702, Internet Engineering Task Force, Network Working Group (1999)
Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., Xiao, X.: Overview and Principles of Internet Traffic Engineering. RFC 3272, Internet Engineering Task Force, Network Working Group, category: Informational (2002)
Boote, J.W., et al.: Towards Multi-Domain Monitoring for the European Research Networks. In: 21st TERENA Networking Conference, Trans-European Research and Education Networking Association, Poznan, Poland (2005)
Borgelt, C.: Efficient Implementations of Apriori and Eclat. In: ICDM Workshop of Frequent Item Set Mining Implementations (FIMI 2003), Melbourne, FL, USA, pp. 24–32 (2003)
Broido, A., Hyun, Y., Gao, R., Claffy, K.C.: Their Share: Diversity and Disparity in IP Traffic. In: 5th Passive and Active Measurement Workshop (PAM), Antibes Juan-Les-Pins, France, pp. 113–125 (2004)
Brownlee, N., Claffy, K.C.: Understanding Internet Traffic Streams: Dragonflies and Tortoises. IEEE Communications Magazine 40(10), 110–117 (2002)
Cai, D., McTear, M.F., McClean, S.I.: Knowledge discovery in distributed databases using evidence theory. International Journal of Intelligent Systems 15(8), 745–761 (2000)
Calyam, P., Krymskiy, D., Sridharan, M., Schopis, P.: Active and Passive Measurements on Campus, Regional and National Network Backbone Paths. In: 14th IEEE International Conference on Computer Communications and Networks (ICCCN 2005), San Diego, California, USA, pp. 537–542 (2005)
Casillas, J., Cordón, O., Herrera, F., Magdalena, L. (eds.): Interpretability Issues in Fuzzy Modeling. Studies in Fuzziness and Soft Computing. Springer, Berlin (2003) ISBN: 978-3-540-02932-8
Choi, B.Y., Bhattacharyya, S.: Observations on Cisco sampled NetFlow. ACM SIGMETRICS Performance Evaluation Review 33(3), 18–23 (2005)
Cisco Systems: Cisco IOS NetFlow (2007), http://www.cisco.com/en/US/products/ps6601/products_ios_protocol_group_home.html
Claise, B., et al.: Specification of the IPFIX Protocol for the Exchange of IP Traffic Flow Information. Revision 26, Internet Engineering Task Force, IPFIX Working Group, Internet Draft (2007)
Cooperative Association for Internet Data Analysis, CAIDA Visualization Tools (2008), http://www.caida.org/tools/visualization/
Crovella, M.E., Bestavros, A.: Self-Similarity in World Wide Web Traffic: Evidence and Possible Causes. IEEE/ACM Transactions on Networking 5(6), 835–846 (1997)
Crovella, M.E., Krishnamurthy, B.: Internet Measurement: Infrastructure, Traffic and Applications. Wiley, Chichester (2006) ISBN: 978-0470014615
Delgado, M., Marín, N., Sánchez, D., Vila, M.A.: Fuzzy Association Rules: General Model and Applications. IEEE Transactions on Fuzzy Systems 11(2), 214–225 (2003)
Deri, L.: ntop (2008), http://www.ntop.org
Duffield, N.: Sampling for Passive Internet Measurement: A Review. Statistical Science 19(3), 472–498 (2004)
Estan, C., Savage, S., Varghese, G.: Automatically Inferring Patterns of Resource Consumption in Network Traffic. In: ACM SIGCOMM 2003, Karlsruhe, Germany, pp. 137–148 (2003)
Fullmer, M., et al.: flow-tools (2007), http://www.splintered.net/sw/flow-tools/
Guillaume, S.: Designing fuzzy inference systems from data: An interpretability-oriented review. IEEE Transactions on Fuzzy Systems 9(3), 426–443 (2001)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer, New York (2003) ISBN: 978-0387952840
Internet2 NetFlow: Weekly Reports (2008), http://netflow.internet2.edu/weekly/
The Internet2 Observatory (2008), http://www.internet2.edu/observatory/
Kacprzyk, J., Yager, R.R.: Linguistic Summaries of Data Using Fuzzy Logic. International Journal of General Systems 30(2), 133–154 (2001)
Kacprzyk, J., Zadrozny, S.: Linguistic Summarization of Data Sets Using Association Rules. In: IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), St. Louis, USA, pp. 702–707 (2003)
Kacprzyk, J., Zadrozny, S.: Linguistic Database Summaries and Their Protoforms: Towards Natural Language Based Knowledge Discovery Tools. Information Sciences 173(4), 281–304 (2005)
Kacprzyk, J., Yager, R.R., Zadrozny, S.: Fuzzy Linguistic Summaries of Databases for an Efficient Business Data Analysis and Decision Support. In: Knowledge Discovery for Business Information Systems. Springer International Series in Engineering and Computer Science, vol. 600, pp. 129–152. Kluwer Academic Publishers, Boston (2001) ISBN: 978079237243
Kotz, D., Henderson, T., Abyzov, I.: CRAWDAD data set dartmouth/campus (v. 2007-02-08) (2007), http://crawdad.cs.dartmouth.edu/dartmouth/campus
Lakhina, A., Papagiannaki, K., Crovella, M.E., Diot, C., Kolaczyk, E.D., Taft, N.: Structural analysis of network traffic flows. In: Joint International Conference on Measurement and Modeling of Computer Systems (ACM SIGMETRICS), New York, NY, USA, pp. 61–72 (2004)
Liétard, L.: A New Definition of Linguistic Summaries of Data. In: 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence, Hong Kong, China, pp. 506–511 (2008)
Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers (I). Interpretations. Fuzzy Sets and Systems 95(1), 1–21 (1998)
Liu, Y., Kerre, E.E.: An overview of fuzzy quantifiers (II). Reasoning and applications. Fuzzy Sets and Systems 95(2), 135–146 (1998)
Montesino-Pouzols, F., Barriga, A., Lopez, D.R., Sánchez-Solano, S.: Linguistic Summarization of Network Traffic Flows. In: 17th IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), IEEE World Congress on Computational Intelligence, Hong Kong, China, pp. 619–624 (2008)
Plonka, D.: FlowScan: A Network Traffic Flow Reporting and Visualization Tool. In: 14th USENIX Conference on System Administration, New Orleans, Louisiana, USA, pp. 305–318 (2000)
Plonka, D.: Cflow (2005), http://net.doit.wisc.edu/~plonka/Cflow/
Raschia, G., Mouaddib, N.: SAINTETIQ: a fuzzy set-based approach to database summarization. Fuzzy Sets and Systems 129(2), 137–162 (2002)
Rasmussen, D., Yager, R.R.: Finding fuzzy and gradual functional dependencies with SummarySQL. Fuzzy Sets and Systems 106(2), 31–42 (1999)
Rolls, D., Michailidis, G., Hernández-Campos, F.: Queueing Analysis of Network Traffic: Methodology and Visualization Tools. Computer Networks 48(3), 447–473 (2005)
Shalunov, S., Teitelbaum, B.: TCP Use and Performance on Internet2. In: ACM SIGCOMM Internet Measurement Workshop, San Francisco, CA, USA, pp. 147–160 (2001)
Sommers, J., Barford, P., Willinger, W.: SPLAT: A Visualization Tool for Mining Internet Measurements. In: 7th Passive and Active Network Measurement Workshop, pp. 31–40 (2006)
Widely Integrated Distributed Environment (WIDE) Project, MAWI Working Group, Packet traces from wide backbone (2008), http://tracer.csl.sony.co.jp/mawi/
Yager, R.R.: A New Approach to the Summarization of Data. Information Sciences 28, 69–86 (1982)
Yager, R.R.: On Ordered Weighted Averaging Operators in Multicriteria Decision Making. IEEE Transactions on Systems, Man and Cybernetics 18(1), 183–190 (1988)
Yager, R.R.: Database Discovery Using Fuzzy Sets. International Journal of Intelligent Systems 11, 691–712 (1996)
Yager, R.R., Engemann, K.J., Filev, D.P.: On the Concept of Immediate Probabilities. International Journal of Intelligent Systems 10(4), 373–397 (1995)
Yusuf, S., Luk, W., Sloman, M., Dulay, N., Lupu, E.C., Brown, G.: Reconfigurable Architecture for Network Flow Analysis. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 16(2), 57–65 (2008)
Zadeh, L.A.: The concept of a linguistic variable and its application to approximate reasoning. Information Sciences 8(3), 199–249 (1975)
Zadeh, L.A.: A Computational Approach to Fuzzy Quantifiers in Natural Languages. Computers and Mathematics with Applications 9, 149–184 (1983)
Zadeh, L.A.: A Prototype-Centered Approach to Adding Deduction Capability to Search Engines-the Concept of Protoform. In: First International IEEE Symposium on Intelligent Systems, Varna, Bulgaria, vol. 1, pp. 2–3 (2002)
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Pouzols, F.M., Lopez, D.R., Barros, A.B. (2011). Summarization and Analysis of Network Traffic Flow Records. In: Mining and Control of Network Traffic by Computational Intelligence. Studies in Computational Intelligence, vol 342. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-18084-2_4
DOI: https://doi.org/10.1007/978-3-642-18084-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-18083-5
Online ISBN: 978-3-642-18084-2
eBook Packages: Engineering, Engineering (R0)