Subspace Mapping of Noisy Text Documents

Soto, Axel J.; Strickert, Marc; Vazquez, Gustavo E.; Milios, Evangelos

doi:10.1007/978-3-642-21043-3_45

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6657)

Included in the following conference series:

Canadian Conference on Artificial Intelligence

Abstract

Subspace mapping methods project high-dimensional data into a subspace in which a specific objective function is optimized. Such dimension reduction removes collinear and irrelevant variables, yielding informative visualizations and task-related data spaces. These specific and generally de-noised subspaces enable machine learning methods to work more efficiently. We present a new and general subspace mapping method, Correlative Matrix Mapping (CMM), and evaluate its abilities for category-driven text organization by assessing neighborhood preservation, class coherence, and classification. The approach is evaluated on the challenging task of processing short and noisy documents.
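As a rough illustration of what assessing neighborhood preservation can mean, the sketch below projects points with a fixed linear map and measures how many of each point's k nearest neighbors are shared between the original space and the subspace. This is not CMM and not necessarily the measure used in the chapter: the function names (`project`, `knn`, `neighborhood_preservation`), the toy data, and the choice of a plain k-nearest-neighbor overlap are all assumptions made for illustration.

```python
import math


def project(points, basis):
    """Linearly map each point onto the direction vectors in `basis`."""
    return [[sum(x * b for x, b in zip(p, row)) for row in basis] for p in points]


def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i] (Euclidean), excluding i."""
    dists = [(math.dist(points[i], q), j) for j, q in enumerate(points) if j != i]
    return {j for _, j in sorted(dists)[:k]}


def neighborhood_preservation(high, low, k):
    """Average fraction of k nearest neighbors shared by both spaces (0 to 1)."""
    n = len(high)
    shared = sum(len(knn(high, i, k) & knn(low, i, k)) for i in range(n))
    return shared / (n * k)


# Hypothetical toy data: the third coordinate stands in for an irrelevant variable.
points = [[0.0, 0.0, 5.0], [0.1, 0.0, -3.0], [1.0, 1.0, 4.0], [1.1, 1.0, -2.0]]
drop_third = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # keep only the first two axes

print(neighborhood_preservation(points, project(points, drop_third), k=1))
```

A score of 1 means every point keeps exactly the same k neighbors after projection. A low score is not necessarily bad: when the discarded dimensions are noise, the neighborhoods in the subspace may be the more meaningful ones, which is why the abstract pairs this criterion with class coherence and classification.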

Author information

Authors and Affiliations

Faculty of Computer Science, Dalhousie University, Canada
Axel J. Soto & Evangelos Milios
Institute for Vision and Graphics, Siegen University, Germany
Marc Strickert
Dept. Computer Science, Univ. Nacional del Sur, Argentina
Gustavo E. Vazquez

Editor information

Editors and Affiliations

Department of Computer Science, University of Regina, 3737 Wascana Parkway, Regina, S4S 0A2, Saskatchewan, Canada
Cory Butz
Department of Mathematics and Computing Science, Saint Mary’s University, B3H 3C3, Halifax, Nova Scotia, Canada
Pawan Lingras

About this paper

Cite this paper

Soto, A.J., Strickert, M., Vazquez, G.E., Milios, E. (2011). Subspace Mapping of Noisy Text Documents. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science, vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_45

DOI: https://doi.org/10.1007/978-3-642-21043-3_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science, Computer Science (R0)
