Nonlinear Embedded Map Projection for Dimensionality Reduction

  • Simone Marinai
  • Emanuele Marino
  • Giovanni Soda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5716)

Abstract

We describe a dimensionality reduction method for similarity search and test it on document image retrieval applications. The approach projects data points into a low-dimensional space obtained by merging the layers of a Growing Hierarchical Self-Organizing Map (GHSOM) trained to model the distribution of the objects to be indexed. The low-dimensional space is defined by embedding the GHSOM sub-maps in the space produced by a nonlinear mapping of the neurons of the first-level map. This mapping is computed with the Sammon projection algorithm.
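
As an illustration of the Sammon projection step, the following is a minimal Python sketch. It assumes plain gradient descent with step halving, which is a simplification and not Sammon's original update rule or the authors' implementation. The input vectors stand in for the weight vectors of the first-level map neurons and are assumed to be distinct; the function names, step size, and iteration count are illustrative assumptions.

```python
import math
import random

EPS = 1e-12  # guards against division by zero for coincident points


def pairwise_distances(points):
    """Return the full matrix of Euclidean distances between points."""
    return [[math.dist(p, q) for q in points] for p in points]


def sammon_stress(target, embedded):
    """Sammon stress: squared distance errors weighted by 1 / target distance."""
    n = len(target)
    current = pairwise_distances(embedded)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    scale = max(sum(target[i][j] for i, j in pairs), EPS)
    error = sum(
        (target[i][j] - current[i][j]) ** 2 / max(target[i][j], EPS)
        for i, j in pairs
    )
    return error / scale


def sammon_projection(points, dim=2, iterations=200, step=0.5, seed=0):
    """Embed points in `dim` dimensions by gradient descent on the stress."""
    rng = random.Random(seed)
    n = len(points)
    target = pairwise_distances(points)
    scale = max(sum(target[i][j] for i in range(n) for j in range(i + 1, n)), EPS)
    # Random initial layout (an assumption; other initialisations are possible).
    embedded = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n)]
    stress = sammon_stress(target, embedded)
    for _ in range(iterations):
        current = pairwise_distances(embedded)
        candidate = []
        for i in range(n):
            gradient = [0.0] * dim
            for j in range(n):
                if i == j:
                    continue
                d_star = max(target[i][j], EPS)
                d = max(current[i][j], EPS)
                # Derivative of the stress with respect to point i, pair (i, j).
                weight = -2.0 * (d_star - d) / (scale * d_star * d)
                for k in range(dim):
                    gradient[k] += weight * (embedded[i][k] - embedded[j][k])
            candidate.append(
                [embedded[i][k] - step * gradient[k] for k in range(dim)]
            )
        candidate_stress = sammon_stress(target, candidate)
        if candidate_stress < stress:
            # Accept the step only when it lowers the stress.
            embedded, stress = candidate, candidate_stress
        else:
            # Otherwise keep the current layout and halve the step size.
            step *= 0.5
    return embedded, stress
```

Calling `embedded, stress = sammon_projection(vectors)` returns one 2-D coordinate per input vector together with the final stress. Because a step is accepted only when it lowers the stress, the returned value never exceeds the stress of the initial random layout.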

The dimensionality reduction is used in a similarity search framework that efficiently retrieves similar objects by comparing distances between projected points, each of which corresponds to a high-dimensional feature vector describing an indexed object.
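
The retrieval step can be sketched as a k-nearest-neighbour lookup among projected points. This is a minimal sketch using a linear scan, not the framework described in the paper; the index layout (a list of identifier and projected-point pairs), the function name `nearest_projected`, and the sample coordinates are hypothetical.

```python
import heapq
import math


def nearest_projected(query_point, index, k=3):
    """Return identifiers of the k indexed objects closest to query_point.

    `index` is a list of (object_id, projected_point) pairs; distances are
    Euclidean and are computed in the low-dimensional projected space.
    """
    closest = heapq.nsmallest(
        k, index, key=lambda entry: math.dist(query_point, entry[1])
    )
    return [object_id for object_id, _ in closest]


# Hypothetical projected points for four indexed objects.
index = [
    ("doc-a", (0.0, 0.0)),
    ("doc-b", (1.0, 0.0)),
    ("doc-c", (5.0, 5.0)),
    ("doc-d", (0.1, 0.1)),
]
print(nearest_projected((0.0, 0.0), index, k=2))  # ['doc-a', 'doc-d']
```

For larger collections, a spatial index over the projected points could replace the linear scan; that choice is outside what the abstract specifies.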

We compare the proposed method with other dimensionality reduction techniques by evaluating the retrieval performance on three datasets.

Keywords

  • Dimensionality Reduction
  • Voronoi Diagram
  • Dimensionality Reduction Method
  • Voronoi Region
  • Dimensionality Reduction Technique
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Simone Marinai (1)
  • Emanuele Marino (1)
  • Giovanni Soda (1)
  1. Dipartimento di Sistemi e Informatica, Università di Firenze, Firenze, Italy
