Document Classification with Hierarchically Structured Dictionaries

  • Conference paper
Intelligent Systems Technologies and Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 385)

Abstract

Fast and efficient classification and clustering of documents, detection of novel documents, and detection of emerging topics are highly relevant today, as the volume of documents generated online is increasing rapidly. Experiments have produced innovative algorithms, methods and frameworks to address these problems. One such method is dictionary learning. We introduce a new 2-level hierarchical dictionary structure for classification, in which the dictionary at the higher level is used to classify the K classes of documents. The results show a recall of around 85% during the classification phase. The model can be extended to a distributed environment, where the higher-level dictionary should be maintained at the master node and the lower-level dictionaries should be kept at the worker nodes.
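The abstract does not spell out how the two levels are built or coded, so the following is only a minimal sketch of one way a 2-level dictionary classifier could look, under assumptions that are not taken from the paper: the lower-level dictionaries are simply the unit-norm training documents of each class, the higher-level dictionary holds one unit-norm mean atom per class, and sparse coding uses plain matching pursuit. All function names, the toy vocabulary and the toy term counts are hypothetical.

```python
import math

def normalize(v):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(atoms, signal, n_nonzero=2):
    """Greedy sparse coding over unit-norm atoms.

    Returns ({atom_index: coefficient}, residual_vector).
    """
    residual = list(signal)
    code = {}
    for _ in range(n_nonzero):
        scores = [dot(atom, residual) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[best]) < 1e-12:
            break  # nothing left that any atom can explain
        code[best] = code.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return code, residual

def build_dictionaries(docs_by_class):
    """Assumed construction (not the paper's): exemplar atoms per class at the
    lower level, one normalized class-mean atom per class at the higher level."""
    labels = sorted(docs_by_class)
    lower = {c: [normalize(d) for d in docs_by_class[c]] for c in labels}
    higher = []
    for c in labels:
        mean = [sum(col) / len(lower[c]) for col in zip(*lower[c])]
        higher.append(normalize(mean))
    return labels, higher, lower

def classify(doc, labels, higher, lower, n_nonzero=2):
    """Pick the class with the higher-level dictionary, then sparse-code the
    document with that class's lower-level dictionary."""
    x = normalize(doc)
    top_code, _ = matching_pursuit(higher, x, n_nonzero=1)
    if not top_code:
        raise ValueError("document has no overlap with any class atom")
    label = labels[next(iter(top_code))]
    code, residual = matching_pursuit(lower[label], x, n_nonzero)
    return label, code, math.sqrt(dot(residual, residual))

# Toy term-count vectors over a hypothetical 6-word vocabulary:
# [goal, match, team, stock, market, price]
train = {
    "sports": [[3, 2, 1, 0, 0, 0], [1, 3, 2, 0, 0, 0]],
    "finance": [[0, 0, 0, 2, 3, 1], [0, 0, 1, 3, 1, 2]],
}
labels, higher, lower = build_dictionaries(train)
label, code, residual_norm = classify([2, 1, 2, 0, 0, 1], labels, higher, lower)
print(label, code, residual_norm)
```

In a distributed reading of this sketch, the routing step over `higher` would run on the master node and the second `matching_pursuit` call over `lower[label]` would run on the worker holding that class. A learned dictionary (for example one trained with K-SVD) could replace the exemplar atoms without changing the control flow.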

Corresponding author

Correspondence to Remya R. K. Menon.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Menon, R.R.K., Aswathi, P. (2016). Document Classification with Hierarchically Structured Dictionaries. In: Berretti, S., Thampi, S., Dasgupta, S. (eds) Intelligent Systems Technologies and Applications. Advances in Intelligent Systems and Computing, vol 385. Springer, Cham. https://doi.org/10.1007/978-3-319-23258-4_34

  • DOI: https://doi.org/10.1007/978-3-319-23258-4_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23257-7

  • Online ISBN: 978-3-319-23258-4

  • eBook Packages: Engineering, Engineering (R0)
