Comparing Supervised Learning Classifiers to Detect Advanced Fee Fraud Activities on Internet

Modupe, Abiodun; Olugbara, Oludayo O; Ojo, Sunday O

doi:10.1007/978-3-642-27317-9_10


Conference paper


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 86)

Abstract

Due to its inherent vulnerability, the internet is frequently abused for criminal activities such as Advanced Fee Fraud (AFF). At present, it is difficult to accurately detect the activities of AFF defrauders on the internet. To address this, we compare the classification accuracies of four learning methods: Binary Logistic Regression (BLR), Back-propagation Neural Network (BNN), Naive Bayesian Classifier (NBC) and Support Vector Machine (SVM). The word clustering method (globalCM) is used to create clusters of words present in the training dataset. A Vector Space Model (VSM) is calculated from the words in each e-mail in the training set. The WEKA data mining framework is used to build supervised learning classifiers from the set of VSMs using these learning methods. Experiments are performed using stratified 10-fold cross-validation to estimate the classification accuracies of the classifiers. Results generally show that the SVM with a polynomial kernel gives the best classification accuracy. This study contributes to the problem of detecting unwanted e-mails. The comparison of different learning methods also helps a decision maker weigh tradeoffs between method accuracy and complexity.
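As a rough illustration of two steps named in the abstract, the sketch below builds a simple vector space model from e-mail text and assigns samples to stratified folds. It is not the authors' implementation: the paper uses WEKA and the globalCM word clustering step, which is omitted here. The function names `build_vsm` and `stratified_folds`, the raw term-count weighting, the whitespace tokenization, and the toy e-mails are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def build_vsm(documents):
    """Return (vocabulary, vectors); each vector holds raw term counts.

    Assumption: raw counts and whitespace tokenization. The abstract does
    not state the weighting scheme or the preprocessing that was used.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        # One dimension per vocabulary term; absent terms count as 0.
        vectors.append([counts[term] for term in vocabulary])
    return vocabulary, vectors


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, preserving class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share
        # of this class.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Hypothetical toy data, invented for illustration only.
emails = ["send money now", "meeting now", "urgent money transfer", "lunch today"]
labels = ["fraud", "legit", "fraud", "legit"]

vocabulary, vectors = build_vsm(emails)
folds = stratified_folds(labels, k=2)
for held_out in folds:
    train = [i for i in range(len(emails)) if i not in held_out]
    # A classifier would be trained on `train` and scored on `held_out` here.
```

With k set to 10, each fold serves once as the held-out test set while the other nine are used for training, and the per-fold accuracies are averaged to estimate classifier accuracy.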

Author information

Authors and Affiliations

Department of Information Technology, Tshwane University of Technology, Pretoria, South Africa
Abiodun Modupe
Department of Information Technology, Durban University of Technology, Durban, South Africa
Oludayo O Olugbara
Faculty of Information Technology, Tshwane University of Technology, Pretoria, South Africa
Sunday O Ojo

Editor information

Editors and Affiliations

Jackson State University, Jackson, MS, USA
Natarajan Meghanathan
University of Calcutta, Calcutta, India
Nabendu Chaki
Wireilla Net Solutions PTY Ltd., Melbourne, VIC, Australia
Dhinaharan Nagamalai

About this paper

Cite this paper

Modupe, A., Olugbara, O.O., Ojo, S.O. (2012). Comparing Supervised Learning Classifiers to Detect Advanced Fee Fraud Activities on Internet. In: Meghanathan, N., Chaki, N., Nagamalai, D. (eds) Advances in Computer Science and Information Technology. Computer Science and Information Technology. CCSIT 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 86. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27317-9_10

DOI: https://doi.org/10.1007/978-3-642-27317-9_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27316-2
Online ISBN: 978-3-642-27317-9
eBook Packages: Computer Science, Computer Science (R0)
