Network Data Mining and Analysis: The $ \mathcal{N}\mathcal{E}\mathcal{M}\mathcal{E}\mathcal{S}\mathcal{I}\mathcal{S} $ Project

Garofalakis, Minos; Rastogi, Rajeev

doi:10.1007/3-540-47887-6_1

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2336)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Modern communication networks generate large amounts of operational data, including traffic and utilization statistics and alarm/fault data at various levels of detail. These massive collections of network-management data can grow on the order of several terabytes per year, and typically hide “knowledge” that is crucial to some of the key tasks involved in effectively managing a communication network (e.g., capacity planning and traffic engineering). In this short paper, we provide an overview of some of our recent and ongoing work in the context of the $ \mathcal{N}\mathcal{E}\mathcal{M}\mathcal{E}\mathcal{S}\mathcal{I}\mathcal{S} $ project at Bell Laboratories, which aims to develop novel data warehousing and mining technology for the effective storage, exploration, and analysis of massive network-management data sets. We first give some highlights of our work on Model-Based Semantic Compression (MBSC), a novel data-compression framework that takes advantage of attribute semantics and data-mining models to perform lossy compression of massive network-data tables. We discuss the architecture and some of the key algorithms underlying $ \mathcal{S}\mathcal{P}\mathcal{A}\mathcal{R}\mathcal{T}\mathcal{A}\mathcal{N} $, a model-based semantic compression system that exploits predictive data correlations and prescribed error tolerances for individual attributes to construct concise and accurate Classification and Regression Tree (CaRT) models for entire columns of a table. We also summarize some of our ongoing work on warehousing and analyzing network-fault data and discuss our vision of how data-mining techniques can be employed to help automate and improve fault management in modern communication networks.
More specifically, we describe the two key components of modern fault-management architectures, namely the event-correlation and the root-cause analysis engines, and propose the use of mining ideas for the automated inference and maintenance of the models that lie at the core of these components, based on warehoused network data.
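
The idea behind model-based semantic compression can be illustrated with a small sketch. It is not the SPARTAN implementation: it assumes a single numeric predictor column and replaces a full CaRT model with a one-split regression tree (a stump). The function names and the sample columns are hypothetical. A target column is replaced by a tiny model plus the rows whose prediction error exceeds the prescribed tolerance, so every reconstructed value stays within that tolerance.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree predicting ys from xs.

    Returns (threshold, left_mean, right_mean). This is a simplified
    stand-in for a CaRT model, used here for illustration only.
    """
    best = None
    # Skip the smallest value so both sides of every split are non-empty.
    for t in sorted(set(xs))[1:]:
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all predictor values identical: a single leaf
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]


def predict(model, x):
    threshold, left_mean, right_mean = model
    return left_mean if x < threshold else right_mean


def compress(xs, ys, tol):
    """Replace column ys by a model plus the rows the model misses by more than tol."""
    model = fit_stump(xs, ys)
    outliers = {
        i: y
        for i, (x, y) in enumerate(zip(xs, ys))
        if abs(predict(model, x) - y) > tol
    }
    return model, outliers


def decompress(xs, model, outliers):
    """Rebuild the column: stored outliers exactly, everything else from the model."""
    return [outliers.get(i, predict(model, x)) for i, x in enumerate(xs)]


# Hypothetical columns: xs is the predictor, ys the column being compressed.
xs = [1, 2, 3, 10, 11, 12]
ys = [100, 102, 98, 500, 504, 496]
model, outliers = compress(xs, ys, tol=3)
restored = decompress(xs, model, outliers)
```

With a tolerance of 3, only the two rows that the stump misses by more than 3 are stored explicitly; the remaining four values are regenerated from the model. A looser tolerance stores fewer outliers at the cost of a larger reconstruction error.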

Author information

Authors and Affiliations

Bell Labs, Lucent Technologies, USA
Minos Garofalakis & Rajeev Rastogi

Editor information

Editors and Affiliations

EE Department, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei, Taiwan, ROC
Ming-Syan Chen
IBM Thomas J. Watson Research Center, 30 Sawmill River Road, Hawthorne, NY, 10532, USA
Philip S. Yu
School of Computing, National University of Singapore, Lower Kent Ridge Road, Singapore, 119260
Bing Liu

About this paper

Cite this paper

Garofalakis, M., Rastogi, R. (2002). Network Data Mining and Analysis: The $ \mathcal{N}\mathcal{E}\mathcal{M}\mathcal{E}\mathcal{S}\mathcal{I}\mathcal{S} $ Project. In: Chen, MS., Yu, P.S., Liu, B. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2002. Lecture Notes in Computer Science, vol 2336. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-47887-6_1

DOI: https://doi.org/10.1007/3-540-47887-6_1
Published: 29 April 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43704-8
Online ISBN: 978-3-540-47887-4
eBook Packages: Springer Book Archive
