Spyware Prevention by Classifying End User License Agreements

  • Chapter
New Challenges in Applied Intelligence Technologies

Part of the book series: Studies in Computational Intelligence (SCI, volume 134)

Abstract

We investigate the hypothesis that it is possible to detect from the End User License Agreement (EULA) whether the associated software hosts spyware. We apply 15 learning algorithms to a data set consisting of 100 applications with classified EULAs. The results show that 13 of the algorithms are significantly more accurate than random guessing, and we therefore conclude that the hypothesis can be accepted. Based on the results, we present a novel tool that can be used to prevent spyware by automatically halting application installers and classifying the EULA, giving users the opportunity to make an informed choice about whether to continue with the installation. We discuss positive and negative aspects of this prevention approach and suggest a method for evaluating candidate algorithms for a future implementation.
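
As an illustration of the general idea of classifying EULA text, the following is a minimal sketch of a bag-of-words multinomial Naive Bayes classifier. It is not the chapter's implementation: the abstract does not name the 15 algorithms or the feature representation used, so the choice of classifier, the class labels `spyware` and `legitimate`, and the four training snippets are all assumptions invented for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class MultinomialNaiveBayes:
    """Bag-of-words multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing constant; 1.0 is an illustrative default

    def fit(self, documents, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.total_docs = len(labels)
        # Per-class word frequencies accumulated over all training documents.
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = set()
        for counts in self.word_counts.values():
            self.vocabulary.update(counts)
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class for the text."""
        vocab_size = len(self.vocabulary)
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        scores = {}
        for c in self.classes:
            class_total = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / self.total_docs)  # log prior
            for token in tokens:
                numerator = self.word_counts[c][token] + self.alpha
                denominator = class_total + self.alpha * vocab_size
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical toy training data, written for this sketch only.
TRAIN = [
    ("This software may display advertisements and collect browsing "
     "information to share with third party partners.", "spyware"),
    ("You agree that the program may install additional components and "
     "transmit usage data to advertising partners.", "spyware"),
    ("This license grants you the right to install and use one copy of "
     "the software on a single computer.", "legitimate"),
    ("The software is provided as is without warranty and you may not "
     "reverse engineer the program.", "legitimate"),
]

model = MultinomialNaiveBayes().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)

suspicious = ("The program will collect information and display "
              "advertisements from partners.")
benign = ("You may install and use one copy of the software on a "
          "single computer.")

print(model.predict(suspicious))
print(model.predict(benign))
```

With only four training snippets the predictions say nothing about real-world accuracy. A study such as the one described would instead estimate performance on held-out EULAs and compare it against the random-guessing baseline.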

Editor information

Ngoc Thanh Nguyen, Radoslaw Katarzyniak

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lavesson, N., Davidsson, P., Boldt, M., Jacobsson, A. (2008). Spyware Prevention by Classifying End User License Agreements. In: Nguyen, N.T., Katarzyniak, R. (eds) New Challenges in Applied Intelligence Technologies. Studies in Computational Intelligence, vol 134. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-79355-7_36

  • DOI: https://doi.org/10.1007/978-3-540-79355-7_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-79354-0

  • Online ISBN: 978-3-540-79355-7

  • eBook Packages: Engineering, Engineering (R0)