
PLSI: The True Fisher Kernel and beyond

IID Processes, Information Matrix and Model Identification in PLSI
  • Jean-Cédric Chappelier
  • Emmanuel Eckard
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5781)

Abstract

The Probabilistic Latent Semantic Indexing (PLSI) model, introduced by T. Hofmann (1999), has engendered applications in numerous fields, notably document classification and information retrieval. In this context, the Fisher kernel was found to be an appropriate document similarity measure. However, the kernels published so far contain unjustified features, some of which hinder their performance. Furthermore, PLSI is not generative for unknown documents, a shortcoming usually remedied by “folding them in” to the PLSI parameter space.

This paper contributes on both points by (1) introducing a new, rigorous development of the Fisher kernel for PLSI, addressing the role of the Fisher Information Matrix, and uncovering its relation to the kernels proposed so far; and (2) proposing a novel and theoretically sound document similarity, which avoids the problem of “folding in” unknown documents. For both aspects, experimental results are provided on several information retrieval evaluation sets.
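For readers unfamiliar with the "folding in" step mentioned in the abstract, the following is a minimal sketch of the usual procedure as it is commonly described for PLSI: the word-given-topic distributions learned on the training collection are kept fixed, and only the topic mixture of the unseen document is re-estimated by EM. It is not the alternative similarity proposed in this paper. The function name `fold_in`, its parameters, and the toy numbers are hypothetical and chosen only for illustration; the sketch assumes the document contains at least one word with non-zero probability under some topic.

```python
import numpy as np

def fold_in(counts, p_w_given_z, n_iter=50, seed=0):
    """Estimate P(z|q) for an unseen document q by EM, with P(w|z) held fixed.

    counts      -- shape (V,), term counts n(q, w) of the new document
    p_w_given_z -- shape (K, V), each row a distribution over the vocabulary
    """
    counts = np.asarray(counts, dtype=float)
    p_w_given_z = np.asarray(p_w_given_z, dtype=float)
    n_topics = p_w_given_z.shape[0]

    # Random initial topic mixture for the new document.
    rng = np.random.default_rng(seed)
    p_z = rng.random(n_topics)
    p_z /= p_z.sum()

    for _ in range(n_iter):
        # E-step: P(z | q, w) is proportional to P(z|q) * P(w|z).
        joint = p_z[:, None] * p_w_given_z            # shape (K, V)
        denom = joint.sum(axis=0, keepdims=True)      # shape (1, V)
        post = np.divide(joint, denom,
                         out=np.zeros_like(joint), where=denom > 0)
        # M-step: P(z|q) is proportional to sum_w n(q, w) * P(z | q, w).
        p_z = post @ counts
        p_z /= p_z.sum()
    return p_z

# Toy usage: two topics over a three-word vocabulary (made-up numbers).
topics = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.1, 0.9]])
mixture = fold_in([5, 0, 0], topics)
```

In this toy case the new document only uses the first word, which only the first topic can emit, so the estimated mixture puts essentially all its mass on the first topic.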

Keywords

Information Retrieval, Latent Dirichlet Allocation, Fisher Information Matrix, Mean Average Precision, Document Model

References

  1. Ahrendt, P., Goutte, C., Larsen, J.: Co-occurrence models in music genre classification. In: IEEE Int. Workshop on Machine Learning for Signal Processing (2005)
  2. Bast, H., Weber, I.: Insights from viewing ranked retrieval as rank aggregation. In: Proc. of Int. Workshop on Challenges in Web Information Retrieval and Integration (WIRI 2005), pp. 232–239 (2005)
  3. Blei, D., Lafferty, J.: A correlated topic model of Science. Annals of Applied Statistics 1(1), 17–35 (2007)
  4. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. Journal of Machine Learning Research 3, 993–1022 (2003)
  5. Bosch, A., Zisserman, A., Munoz, X.: Scene classification via pLSA. In: Proc. of the European Conf. on Computer Vision (2006)
  6. Gaussier, E., Goutte, C., Popat, K., Chen, F.: A hierarchical model for clustering and categorising documents. In: Proc. of 24th BCS-IRSG Europ. Coll. on IR Research, pp. 229–247 (2002)
  7. Gehler, P.V., Holub, A.D., Welling, M.: The rate adapting Poisson model for information retrieval and object recognition. In: Proc. 23rd Int. Conf. on Machine Learning, pp. 337–344 (2006)
  8. Harman, D.: Overview of the fourth Text REtrieval Conference (TREC–4). In: Proc. of the 4th Text REtrieval Conf., pp. 1–23 (1995)
  9. Hinneburg, A., Gabriel, H.-H., Gohr, A.: Bayesian folding-in with Dirichlet kernels for PLSI. In: Proc. of the 7th IEEE Int. Conf. on Data Mining, pp. 499–504 (2007)
  10. Hofmann, T.: Probabilistic latent semantic indexing. In: Proc. of 22nd Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval, pp. 50–57 (1999)
  11. Hofmann, T.: Learning the similarity of documents: An information-geometric approach to document retrieval and categorization. In: Advances in Neural Information Processing Systems, vol. 12, pp. 914–920 (2000)
  12. Hofmann, T.: Unsupervised learning by probabilistic latent semantic analysis. Machine Learning 42(1), 177–196 (2001)
  13. Jaakkola, T., Haussler, D.: Exploiting generative models in discriminative classifiers. In: Advances in Neural Information Processing Systems, vol. 11, pp. 487–493. MIT Press, Cambridge (1999)
  14. Jin, X., Zhou, Y., Mobasher, B.: Web usage mining based on probabilistic latent semantic analysis. In: Proc. of 10th Int. Conf. on Knowledge Discovery and Data Mining, pp. 197–205 (2004)
  15. Lafferty, J., Zhai, C.: Document language models, query models, and risk minimization for information retrieval. In: Proc. ACM SIGIR Conf. on Research and Development in Information Retrieval (2001)
  16. Lienhart, R., Slaney, M.: PLSA on large-scale image databases. In: Proc. of the 2007 Int. Conf. on Acoustics, Speech and Signal Processing, IEEE (ICASSP 2007), vol. 4, pp. 1217–1220 (2007)
  17. McLachlan, G., Peel, D.: Finite Mixture Models. Wiley, Chichester (2000)
  18. Mei, Q., Zhai, C.: A mixture model for contextual text mining. In: Proc. of 12th Int. Conf. on Knowledge Discovery and Data Mining, pp. 649–655 (2006)
  19. Monay, F., Gatica-Perez, D.: PLSA-based image auto-annotation: Constraining the latent space. In: Proc. ACM Int. Conf. on Multimedia, ACM MM (2004)
  20. Monay, F., Gatica-Perez, D.: Modeling semantic aspects for cross-media image indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2007)
  21. Nyffenegger, M., Chappelier, J.-C., Gaussier, E.: Revisiting Fisher kernels for document similarities. In: Proc. of 17th European Conf. on Machine Learning, pp. 727–734 (2006)
  22. Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In: 21st SIGIR Conf. on Research and Development in Information Retrieval, pp. 275–281 (1998)
  23. Popescul, A., Ungar, L.H., Pennock, D.M., Lawrence, S.: Probabilistic models for unified collaborative and content-based recommendation in sparse-data environments. In: Proc. of the 17th Conf. in Uncertainty in Artificial Intelligence, pp. 437–444 (2001)
  24. Quelhas, P., Monay, F., Odobez, J.-M., Gatica-Perez, D., Tuytelaars, T., Gool, L.V.: Modeling scenes with local descriptors and latent aspects. In: Proc. of ICCV 2005, vol. 1, pp. 883–890 (2005)
  25. Robertson, S.E., Walker, S., Jones, S., Hancock-Beaulieu, M., Gatford, M.: Okapi at TREC–3. In: Proc. of the 3rd Text REtrieval Conf. (1994)
  26. Steyvers, M., Smyth, P., Rosen-Zvi, M., Griffiths, T.: Probabilistic author-topic models for information discovery. In: Proc. 10th Int. Conf. on Knowl. Discovery and Data Mining, pp. 306–315 (2004)
  27. Vinokourov, A., Girolami, M.: A probabilistic framework for the hierarchic organisation and classification of document collections. Journal of Intelligent Information Systems 18(2/3), 153–172 (2002)
  28. Welling, M., Rosen-Zvi, M., Hinton, G.: Exponential family harmoniums with an application to information retrieval. In: Advances in Neural Information Processing Systems, vol. 17, pp. 1481–1488 (2005)
  29. Zhai, C.: Statistical language models for information retrieval: a critical review. Found. Trends Inf. Retr. 2(3), 137–213 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jean-Cédric Chappelier (1)
  • Emmanuel Eckard (1)
  1. School of Computer and Communication Sciences, École Polytechnique Fédérale de Lausanne, Switzerland
