Abstract
In adversarial settings, the interests of modellers and those being modelled are not aligned. Such settings include domains such as law enforcement and counterterrorism but, increasingly, also more mainstream domains such as customer relationship management. The conventional strategy, maximizing the fit of a model to the available data, does not work in adversarial settings because the data cannot all be trusted, and because it makes the results too predictable to adversaries. Some existing techniques remain applicable, others can be used if they are modified to allow for the adversarial setting, and still others must be discarded. General principles for this domain are discussed and the implications for practice outlined.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Skillicorn, D.B. (2012). Knowledge Discovery in Adversarial Settings. In: Holmes, D., Jain, L. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23151-3_10
DOI: https://doi.org/10.1007/978-3-642-23151-3_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23150-6
Online ISBN: 978-3-642-23151-3
eBook Packages: Engineering, Engineering (R0)