Text Classification with K-Nearest Neighbors Algorithm Using Gain Ratio

  • Conference paper
Progress in Computing, Analytics and Networking

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1119)

Abstract

Text classification is the task of automatically sorting a collection of documents into categories drawn from a predefined set; in other words, it assigns predefined categories to free-text documents. This paper introduces a novel two-phase feature selection method, content classification using data gain (CCDG), where data gain refers to information gain, that guides and evaluates a genetic algorithm on the given dataset. In the first phase of CCDG, each term in a document is ranked by its importance for classification, as measured by information gain. In the second phase, a genetic algorithm (GA) and principal component analysis (PCA) identify and extract the relevant features in decreasing order of impact. In this way, terms of lesser importance can be discarded and only the impactful terms are retained. Experiments show encouraging results for the proposed CCDG, which performs better than conventional methods on all datasets and test conditions considered.
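The first phase described above, ranking terms by information gain, can be illustrated with a minimal sketch. It assumes that each document is reduced to a set of terms and that a term's score is the drop in class entropy from knowing whether the term occurs. The toy corpus, the function names (`entropy`, `information_gain`, `rank_terms`), and the binary presence model are illustrative assumptions, not the authors' implementation; the preview does not give the paper's exact formulation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Drop in class entropy from knowing whether `term` occurs in a document."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = ((len(with_term) / n) * entropy(with_term)
                   + (len(without_term) / n) * entropy(without_term))
    return entropy(labels) - conditional


def rank_terms(docs, labels):
    """Return (score, term) pairs, highest gain first, ties broken by term."""
    vocab = sorted(set().union(*docs))
    scored = [(information_gain(docs, labels, t), t) for t in vocab]
    return sorted(scored, key=lambda pair: (-pair[0], pair[1]))


# Hypothetical toy corpus: each document is a set of terms (assumption).
docs = [
    {"goal", "match", "team"},
    {"team", "score", "goal"},
    {"election", "vote", "team"},
    {"vote", "party", "election"},
]
labels = ["sport", "sport", "politics", "politics"]

ranking = rank_terms(docs, labels)
top_terms = [term for _, term in ranking[:3]]
print(top_terms)
```

In this toy corpus the terms that occur in only one class ("election", "goal", "vote") separate the classes perfectly and rank first, while "team", which appears in both classes, scores lower. A later stage, such as the GA and PCA step the abstract mentions, would then operate on the retained terms.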

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Rathore, M.S., Saurabh, P., Prasad, R., Mewada, P. (2020). Text Classification with K-Nearest Neighbors Algorithm Using Gain Ratio. In: Das, H., Pattnaik, P., Rautaray, S., Li, KC. (eds) Progress in Computing, Analytics and Networking. Advances in Intelligent Systems and Computing, vol 1119. Springer, Singapore. https://doi.org/10.1007/978-981-15-2414-1_3
