Feature Selection for Unsupervised Learning

  • Conference paper
Neural Information Processing (ICONIP 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7665)

Abstract

In this paper, we present a methodology for identifying the best features in a large feature space. In a high-dimensional feature space, nearest neighbor search becomes meaningless and suffers from both quality and performance issues. Because many data mining algorithms rely on nearest neighbor search, we need to select the relevant features rather than search over all of them. We propose feature selection using Non-negative Matrix Factorization (NMF) and apply it to nearest neighbor search.

A recent clustering algorithm based on Locally Consistent Concept Factorization (LCCF) improves the quality of document clustering by exploiting the local geometric and discriminating structure of the data. Using our feature selection method, we show a further improvement in clustering performance.
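
The abstract does not state how NMF is turned into a feature ranking, so the following is only a minimal sketch under stated assumptions, not the authors' method. It factors a non-negative data matrix X (samples by features) as X ≈ WH using the multiplicative updates of Lee and Seung, scores each feature by its total loading across the rows of H, keeps the top d features, and then runs a brute-force nearest neighbor search restricted to those features. The scoring rule, the function names (nmf, select_features, nearest_neighbor), and the parameters k, d, iters, and seed are all hypothetical choices made for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Approximate X (n x m, non-negative) as W (n x k) times H (k x m).

    Uses the multiplicative update rules for the squared-error objective.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (X H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H


def select_features(X, k, d, iters=200, seed=0):
    """Return the indices of the d features with the largest total loading in H.

    Assumption: summing a feature's column of H is used as its relevance score.
    """
    _, H = nmf(X, k, iters=iters, seed=seed)
    scores = [sum(H[i][j] for i in range(len(H))) for j in range(len(H[0]))]
    top = sorted(range(len(scores)), key=lambda j: -scores[j])[:d]
    return sorted(top)


def nearest_neighbor(X, query_idx, features):
    """Brute-force nearest neighbor of row query_idx using only the given features."""
    def dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in features)

    others = [i for i in range(len(X)) if i != query_idx]
    return min(others, key=lambda i: dist(X[i], X[query_idx]))
```

A typical call would be features = select_features(X, k=2, d=2) followed by nearest_neighbor(X, 0, features). Because NMF is scale-ambiguous between W and H, a real implementation would likely normalize the factors before scoring; that step is omitted here to keep the sketch short.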

References

  1. Beyer, K., Goldstein, J., Ramakrishnan, R., Shaft, U.: When is "Nearest Neighbor" Meaningful? In: Int. Conf. on Database Theory (1999)

  2. Burges, C.J.C.: A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2, 121–167 (1998)

  3. Cai, D., He, X.F., Han, J.W.: Locally Consistent Concept Factorization for Document Clustering. IEEE Trans. on Knowl. and Data Eng. 23, 902–913 (2011)

  4. Chapelle, O., Keerthi, S.: Multi-class Feature Selection with Support Vector Machines. In: Proceedings of the American Statistical Association (2008)

  5. Cunningham, P., Delany, S.J.: K-nearest Neighbour Classifiers. Technical Report (2007)

  6. Lee, D.D., Seung, H.S.: Algorithms for Non-negative Matrix Factorization. In: Advances in Neural Information Processing Systems, vol. 13, MIT Press (2001)

  7. Xu, W., Gong, Y.H.: Document Clustering by Concept Factorization. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2004. ACM (2004)

  8. Xu, W., Liu, X., Gong, Y.H.: Document Clustering Based on Non-negative Matrix Factorization. In: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR 2003. ACM (2003)

  9. Yang, Y.M., Pedersen, J.O.: A Comparative Study on Feature Selection in Text Categorization. Morgan Kaufmann Publishers (1997)

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Adhikary, J.R., Narasimha Murty, M. (2012). Feature Selection for Unsupervised Learning. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34487-9_47

  • DOI: https://doi.org/10.1007/978-3-642-34487-9_47

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34486-2

  • Online ISBN: 978-3-642-34487-9

  • eBook Packages: Computer Science, Computer Science (R0)
