A Generative Model for Self/Non-self Discrimination in Strings

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5495)

Abstract

A statistical model is presented as an alternative to negative selection in anomaly detection of discrete data. We extend the use of probabilistic generative models from fixed-length binary strings into variable-length strings from a finite symbol alphabet using a mixture model of multinomial distributions for the frequency of adjacent symbols in a sliding window over a string. Robust and localized change analysis of text corpora is viewed as an application area.
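
The abstract only outlines the approach, so the following is a minimal sketch of how it could look rather than the paper's implementation. It counts adjacent symbol pairs in a sliding window and scores a window under a mixture of multinomial distributions. The function names, the `floor` smoothing constant, and the dictionary layout of the parameters are all assumptions made for illustration. In practice the mixture parameters would have to be estimated from "self" data (for example with the EM algorithm), which is not shown here.

```python
from collections import Counter
import math


def window_bigram_counts(s, window, step=1):
    """Yield a Counter of adjacent-symbol pairs for each sliding window over s."""
    last_start = max(len(s) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = s[start:start + window]
        # zip(chunk, chunk[1:]) pairs every symbol with its right neighbour
        yield Counter(zip(chunk, chunk[1:]))


def mixture_log_likelihood(counts, weights, components, floor=1e-9):
    """Log-likelihood of pair counts under a mixture of multinomials.

    The multinomial coefficient is omitted because it is identical for
    every component. `components` is a list of dicts mapping a symbol pair
    to its probability; unseen pairs get the hypothetical `floor` value.
    """
    per_component = []
    for weight, theta in zip(weights, components):
        ll = math.log(weight)
        for pair, n in counts.items():
            ll += n * math.log(theta.get(pair, floor))
        per_component.append(ll)
    # log-sum-exp keeps the sum numerically stable
    m = max(per_component)
    return m + math.log(sum(math.exp(v - m) for v in per_component))


# Hypothetical hand-set parameters: one component that expects "ab"/"ba" pairs
weights = [1.0]
components = [{("a", "b"): 0.5, ("b", "a"): 0.5}]

normal = next(window_bigram_counts("abab", window=4))
unusual = next(window_bigram_counts("abzz", window=4))
score_normal = mixture_log_likelihood(normal, weights, components)
score_unusual = mixture_log_likelihood(unusual, weights, components)
```

A window whose score falls below a chosen threshold would be flagged as non-self. Because each window is scored separately, a change can be localized to the part of the string where it occurs.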

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Pöllä, M. (2009). A Generative Model for Self/Non-self Discrimination in Strings. In: Kolehmainen, M., Toivanen, P., Beliczynski, B. (eds) Adaptive and Natural Computing Algorithms. ICANNGA 2009. Lecture Notes in Computer Science, vol 5495. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04921-7_30

  • DOI: https://doi.org/10.1007/978-3-642-04921-7_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-04920-0

  • Online ISBN: 978-3-642-04921-7

  • eBook Packages: Computer Science, Computer Science (R0)
