Abstract
As became apparent after the tragic events of September 11, 2001, terrorist organizations and other criminal groups increasingly use legitimate means of Internet access to conduct their malicious activities. Such activities cannot be detected by existing intrusion detection systems, which are generally aimed at protecting computer systems and networks from “cyber attacks”. Preparation of an attack against society itself can be detected only by analyzing the content accessed by users. This study aims to develop an innovative methodology for abnormal activity detection that uses web content as the audit information provided to the detection system. The new behavior-based detection method learns normal behavior by applying an unsupervised clustering algorithm to the contents of publicly available web pages viewed by a group of similar users. Page content is represented by the well-known vector space model. The resulting content models of normal behavior are used in real time to reveal deviations from normal behavior at a specific location on the net. The sensitivity of the detection algorithm is controlled by a threshold parameter. The method is evaluated by the trade-off between the detection rate (TP) and the false positive rate (FP).
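The pipeline outlined in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: raw term-frequency weights, a small cosine-based k-means as a stand-in for the unspecified clustering algorithm, seeding with the first k pages, the toy pages, and all function names. A page is flagged as abnormal when its best cosine similarity to any cluster centroid falls below the threshold.

```python
import math
from collections import Counter

def tf_vector(text):
    """Vector space model: sparse term-frequency vector of a page (assumed weighting)."""
    return Counter(w for w in text.lower().split() if w.isalpha())

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    total = Counter()
    for v in vectors:
        total.update(v)  # adds term counts
    return {t: c / len(vectors) for t, c in total.items()}

def cluster(vectors, k, iterations=10):
    """Stand-in clustering: cosine k-means seeded with the first k vectors."""
    centroids = [dict(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            best = max(range(k), key=lambda i: cosine(v, centroids[i]))
            groups[best].append(v)
        # Keep the old centroid if a cluster received no pages.
        centroids = [centroid(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def is_abnormal(page, centroids, threshold):
    """Flag a page whose best similarity to normal content is below the threshold."""
    v = tf_vector(page)
    return max(cosine(v, c) for c in centroids) < threshold

def rates(normal_pages, abnormal_pages, centroids, threshold):
    """Return (TP rate, FP rate) for a given threshold."""
    tp = sum(is_abnormal(p, centroids, threshold) for p in abnormal_pages)
    fp = sum(is_abnormal(p, centroids, threshold) for p in normal_pages)
    return tp / len(abnormal_pages), fp / len(normal_pages)

# Toy "normal" browsing history of a user group (hypothetical data).
training = [
    "python code loops functions",
    "football match goal score",
    "python functions classes code",
    "football goal league match",
]
centroids = cluster([tf_vector(p) for p in training], k=2)

normal_test = ["python classes loops", "football league score"]
abnormal_test = ["explosives detonator synthesis"]
for threshold in (0.3, 0.7):
    print(threshold, rates(normal_test, abnormal_test, centroids, threshold))
```

Raising the threshold makes the detector stricter: on this toy data the abnormal page is caught at both settings, while the two normal test pages start to be flagged at the higher one. This is the TP/FP trade-off that the threshold parameter controls.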
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Last, M., Shapira, B., Elovici, Y., Zaafrany, O., Kandel, A. (2003). Content-Based Methodology for Anomaly Detection on the Web. In: Menasalvas, E., Segovia, J., Szczepaniak, P.S. (eds) Advances in Web Intelligence. AWIC 2003. Lecture Notes in Computer Science, vol 2663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44831-4_13
DOI: https://doi.org/10.1007/3-540-44831-4_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40124-7
Online ISBN: 978-3-540-44831-0
eBook Packages: Springer Book Archive