Content-Based Methodology for Anomaly Detection on the Web

Last, Mark; Shapira, Bracha; Elovici, Yuval; Zaafrany, Omer; Kandel, Abraham

doi:10.1007/3-540-44831-4_13


Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 2663))

Included in the following conference series:

International Atlantic Web Intelligence Conference


Abstract

As became apparent after the tragic events of September 11, 2001, terrorist organizations and other criminal groups increasingly use legitimate means of Internet access to conduct their malicious activities. Such actions cannot be detected by existing intrusion detection systems, which are generally aimed at protecting computer systems and networks from “cyber attacks”. Preparation of an attack against human society itself can only be detected through analysis of the content accessed by users. This study proposes a methodology for abnormal activity detection that uses web content as the audit information provided to the detection system. The behavior-based detection method learns normal behavior by applying an unsupervised clustering algorithm to the contents of publicly available web pages viewed by a group of similar users. In this paper, page content is represented by the well-known vector space model. The resulting content models of normal behavior are used in real time to reveal deviations from normal behavior at a specific location on the net. The sensitivity of the detection algorithm is controlled by a threshold parameter. The method is evaluated by the trade-off between the detection rate (TP) and the false positive rate (FP).
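
The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract names only a vector space model, an unsupervised clustering algorithm, and a threshold parameter, so everything concrete here is an assumption made for illustration: whitespace tokenization with plain term-frequency weights, a simple k-means with cosine similarity seeded from the first k pages, the sample pages, and all function names. It is not the authors' implementation.

```python
import math
from collections import Counter

def tf_vector(text):
    """L2-normalized term-frequency vector for one page (assumed weighting)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {t: c / norm for t, c in counts.items()} if norm else {}

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        for t, w in v.items():
            total[t] += w
    return {t: w / len(vectors) for t, w in total.items()}

def kmeans(vectors, k, iterations=10):
    """Plain k-means (an assumed stand-in for the unspecified clustering step).
    Seeds are the first k vectors, so the result is deterministic."""
    centroids = [dict(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            best = max(range(k), key=lambda i: cosine(v, centroids[i]))
            groups[best].append(v)
        # Keep the previous centroid if a cluster received no pages.
        centroids = [centroid(g) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def is_abnormal(page_text, centroids, threshold):
    """Flag a page whose best similarity to any normal cluster is below threshold."""
    v = tf_vector(page_text)
    return max(cosine(v, c) for c in centroids) < threshold

def rates(labeled_pages, centroids, threshold):
    """Return (TP rate, FP rate) over (text, truly_abnormal) pairs."""
    tp = fp = pos = neg = 0
    for text, abnormal in labeled_pages:
        flagged = is_abnormal(text, centroids, threshold)
        if abnormal:
            pos += 1
            tp += flagged
        else:
            neg += 1
            fp += flagged
    return (tp / pos if pos else 0.0, fp / neg if neg else 0.0)

# Hypothetical pages viewed by a group of similar users (the "normal" profile).
normal_pages = [
    "database systems lecture notes query optimization",
    "machine learning course notes neural networks",
    "query optimization in relational database systems",
    "neural networks and machine learning lecture",
]
centroids = kmeans([tf_vector(p) for p in normal_pages], k=2)

# Hypothetical labeled pages for evaluation: (text, truly_abnormal).
labeled = [
    ("relational database query optimization notes", False),
    ("machine learning neural networks course", False),
    ("cheap flights hotel booking deals", True),
]
low_threshold_rates = rates(labeled, centroids, 0.3)
high_threshold_rates = rates(labeled, centroids, 0.99)
```

Raising the threshold makes the detector more sensitive: more abnormal pages are caught, but more normal pages are flagged as well. Sweeping the threshold and recording the pair returned by `rates` traces out the TP/FP trade-off that the abstract uses as its evaluation criterion.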



Author information

Authors and Affiliations

Department of Information Systems Engineering, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
Mark Last, Bracha Shapira, Yuval Elovici & Omer Zaafrany
Department of Computer Science and Engineering, University of South Florida, 4202 E. Fowler Ave. ENB 118, Tampa, FL, 33620, USA
Abraham Kandel


Editor information

Editors and Affiliations

Facultad de Informática, Universidad Politécnica de Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
Ernestina Menasalvas & Javier Segovia
Institute of Computer Science, Technical University of Lodz, ul.Sterlinga 16/18, 90-217, Lodz, Poland
Piotr S. Szczepaniak

About this paper

Cite this paper

Last, M., Shapira, B., Elovici, Y., Zaafrany, O., Kandel, A. (2003). Content-Based Methodology for Anomaly Detection on the Web. In: Menasalvas, E., Segovia, J., Szczepaniak, P.S. (eds) Advances in Web Intelligence. AWIC 2003. Lecture Notes in Computer Science, vol 2663. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44831-4_13

DOI: https://doi.org/10.1007/3-540-44831-4_13
Published: 30 April 2003
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40124-7
Online ISBN: 978-3-540-44831-0
eBook Packages: Springer Book Archive
