Abstract
Subspace mapping methods aim to project high-dimensional data into a subspace where a specific objective function is optimized. Such dimension reduction removes collinear and irrelevant variables, yielding informative visualizations and task-related data spaces. These specific and generally de-noised subspaces enable machine learning methods to work more efficiently. We present a new and general subspace mapping method, Correlative Matrix Mapping (CMM), and evaluate its abilities for category-driven text organization by assessing neighborhood preservation, class coherence, and classification. The approach is evaluated on the challenging task of processing short and noisy documents.
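As a rough illustration of what "assessing neighborhood preservation" can mean for a subspace mapping, the sketch below projects synthetic data into two dimensions and measures how many of each point's nearest neighbors survive the projection. It makes several assumptions: plain PCA stands in for the mapping (it is not CMM, whose details are not given in the abstract), the k-nearest-neighbor overlap score is a simple choice that is not necessarily the criterion used by the authors, and the names `knn_indices`, `neighborhood_preservation`, and `pca_project` are hypothetical helpers introduced only for this example.

```python
import numpy as np


def knn_indices(X, k):
    """Return the indices of the k nearest neighbors (Euclidean) of each row of X."""
    sq = np.sum(X ** 2, axis=1)
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def neighborhood_preservation(X_high, X_low, k=5):
    """Mean fraction of each point's k nearest neighbors shared by both spaces."""
    nn_high = knn_indices(X_high, k)
    nn_low = knn_indices(X_low, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap))


def pca_project(X, n_components):
    """Project centered data onto its leading principal directions (stand-in mapping)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


# Synthetic data (assumption): 60 samples with 20 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 20))
Z = pca_project(X, 2)
score = neighborhood_preservation(X, Z, k=5)
print(f"k-NN preservation after projection to 2-D: {score:.3f}")
```

A score of 1 means every point keeps exactly the same k neighbors after the mapping, while a score near 0 means the local structure was largely lost. A supervised mapping would typically be judged on class-related criteria as well, which this sketch does not cover.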
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Soto, A.J., Strickert, M., Vazquez, G.E., Milios, E. (2011). Subspace Mapping of Noisy Text Documents. In: Butz, C., Lingras, P. (eds) Advances in Artificial Intelligence. Canadian AI 2011. Lecture Notes in Computer Science(), vol 6657. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21043-3_45
DOI: https://doi.org/10.1007/978-3-642-21043-3_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21042-6
Online ISBN: 978-3-642-21043-3
eBook Packages: Computer Science (R0)