Spyware Prevention by Classifying End User License Agreements

Lavesson, Niklas; Davidsson, Paul; Boldt, Martin; Jacobsson, Andreas

doi:10.1007/978-3-540-79355-7_36

Part of the book series: Studies in Computational Intelligence (SCI, volume 134)

Abstract

We investigate the hypothesis that it is possible to detect from the End User License Agreement (EULA) whether the associated software hosts spyware. We apply 15 learning algorithms to a data set consisting of 100 applications with classified EULAs. The results show that 13 algorithms are significantly more accurate than random guessing. Thus, we conclude that the hypothesis can be accepted. Based on the results, we present a novel tool that can be used to prevent spyware by automatically halting application installers and classifying the EULA, giving users the opportunity to make an informed choice about whether to continue with the installation. We discuss positive and negative aspects of this prevention approach and suggest a method for evaluating candidate algorithms for a future implementation.
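The idea of classifying a EULA as text can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the 15 algorithms or the features used, so the example below assumes a bag-of-words multinomial naive Bayes classifier with add-one smoothing, and the class labels ("good", "bad"), the toy training sentences, and the helper `should_warn` are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesEulaClassifier:
    """Multinomial naive Bayes over word counts with add-one smoothing."""

    def fit(self, documents, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for c in self.classes:
            score = math.log(self.doc_counts[c] / total_docs)
            class_total = sum(self.word_counts[c].values())
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # skip words never seen during training
                count = self.word_counts[c][word]
                score += math.log((count + 1) / (class_total + vocab_size))
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


def should_warn(classifier, eula_text):
    """Hypothetical installer hook: warn when the EULA is classified 'bad'."""
    return classifier.predict(eula_text) == "bad"


# Toy, made-up training data; the study used 100 real applications.
TRAINING_DATA = [
    ("We may collect browsing habits and display targeted advertisements "
     "from third party partners.", "bad"),
    ("The software may install additional components and share usage "
     "information with advertisers.", "bad"),
    ("This license grants you the right to use the program on one "
     "computer.", "good"),
    ("You may not reverse engineer or redistribute the program without "
     "permission.", "good"),
]

classifier = NaiveBayesEulaClassifier().fit(
    [text for text, _ in TRAINING_DATA],
    [label for _, label in TRAINING_DATA],
)
```

In a tool of the kind the abstract describes, a call such as `should_warn(classifier, eula_text)` would run while the installer is halted, and the user would then decide whether to continue. With only four training sentences the sketch says nothing about real-world accuracy; it only shows the shape of the approach.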

Author information

Authors and Affiliations

Dept. of Software and Systems Engineering, Blekinge Institute of Technology, Box 520, SE–372 25, Ronneby, Sweden
Niklas Lavesson, Paul Davidsson, Martin Boldt & Andreas Jacobsson

Editor information

Ngoc Thanh Nguyen, Radoslaw Katarzyniak

About this chapter

Cite this chapter

Lavesson, N., Davidsson, P., Boldt, M., Jacobsson, A. (2008). Spyware Prevention by Classifying End User License Agreements. In: Nguyen, N.T., Katarzyniak, R. (eds) New Challenges in Applied Intelligence Technologies. Studies in Computational Intelligence, vol 134. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79355-7_36

DOI: https://doi.org/10.1007/978-3-540-79355-7_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-79354-0
Online ISBN: 978-3-540-79355-7
eBook Packages: Engineering, Engineering (R0)
