Fast Nearest-Neighbor Classification Using RNN in Domains with Large Number of Classes

Singh, Gautam; Dasgupta, Gargi; Deng, Yu

doi:10.1007/978-3-030-17642-6_26

Conference paper
First Online: 10 April 2019

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11434)

Abstract

In text classification scenarios where the number of classes is large (in the tens of thousands) and the training samples for each class are few and often verbose, nearest-neighbor methods are effective but very slow, because they compute a similarity score against the training samples of every class. Machine learning models, on the other hand, are fast at runtime, but they cannot be trained adequately with the few training samples available per class. In this paper, we propose a hybrid approach that cascades (1) a fast but less accurate recurrent neural network (RNN) model and (2) a slow but more accurate nearest-neighbor model using a bag of syntactic features.
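The cascade idea can be illustrated with a minimal sketch. It is not the authors' implementation: the names `cascade_classify`, `fast_scores`, and `jaccard` are hypothetical, the fast model's per-class scores are simply passed in as a dictionary rather than produced by an RNN, and plain Jaccard similarity over token sets stands in for the syntactic-feature similarity.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (assumed stand-in
    for the slow syntactic-feature similarity score)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cascade_classify(query_tokens, fast_scores, train_samples, k, top_n):
    """Two-stage classification sketch.

    fast_scores:   dict mapping class -> score from the fast model
                   (hypothetical placeholder for the RNN output).
    train_samples: dict mapping class -> list of token lists.
    k:             number of candidate classes passed to stage 2.
    top_n:         number of classes returned.
    """
    # Stage 1: keep only the k classes the fast model scores highest.
    candidates = sorted(fast_scores, key=fast_scores.get, reverse=True)[:k]

    # Stage 2: run the slow nearest-neighbor comparison, but only against
    # training samples of the candidate classes, not of every class.
    nn_scores = {
        c: max((jaccard(query_tokens, s) for s in train_samples.get(c, [])),
               default=0.0)
        for c in candidates
    }
    return sorted(nn_scores, key=nn_scores.get, reverse=True)[:top_n]


# Toy data, invented for illustration only.
train = {
    "E1": [["disk", "full", "error"]],
    "E2": [["network", "timeout"]],
    "E3": [["disk", "read", "failure"]],
}
fast = {"E1": 0.2, "E2": 0.1, "E3": 0.7}
query = ["disk", "full"]

result_k2 = cascade_classify(query, fast, train, k=2, top_n=1)
result_k1 = cascade_classify(query, fast, train, k=1, top_n=1)
```

In this toy run, `k=2` lets the slower stage correct the fast model's first choice, while `k=1` leaves it no room to do so. The cost of stage 2 grows with `k`, which is the trade-off the candidate cutoff has to balance.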

Our experiments with the cascaded approach, performed on a data set from IT support services in which customer complaint text must be classified to return the top-N possible error codes, show that the query time of the slow system is reduced to one-sixth of its original value while its accuracy improves. Our approach outperforms an LSH-based baseline for query-time reduction. We also derive a lower bound on the accuracy of the cascaded model in terms of the accuracies of the individual models. In any two-stage approach, choosing the right number of candidates to pass on to the second stage is crucial. We prove a result that aids in choosing this cutoff for the cascaded system.
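To give a feel for what a lower bound of this kind can look like, here is an elementary union-bound argument. It is not necessarily the bound derived in the paper; the symbols \(a_1\), \(a_2\), \(A\), and \(B\) are introduced here for illustration, and the argument assumes that the second stage scores each class independently of which other classes are in the candidate pool.

```latex
\begin{align*}
A &= \{\text{the true class is among the } k \text{ candidates kept by stage 1}\},
  \qquad \Pr(A) = a_1, \\
B &= \{\text{stage 2, run over all classes, ranks the true class in its top } N\},
  \qquad \Pr(B) = a_2, \\
\Pr(\text{cascade correct})
  &\ge \Pr(A \cap B)
   = \Pr(A) + \Pr(B) - \Pr(A \cup B)
   \ge a_1 + a_2 - 1 .
\end{align*}
```

The first inequality holds because, when both events occur, removing other classes from the pool cannot push the true class out of the top \(N\). With illustrative values \(a_1 = 0.95\) and \(a_2 = 0.90\) (not taken from the paper), the cascade would be correct at least 85% of the time.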

Author information

Authors and Affiliations

IBM Research-India, New Delhi, 110070, DL, India
Gautam Singh
IBM Research-India, Bangalore, 560045, KA, India
Gargi Dasgupta
IBM T.J. Watson Research Center, Yorktown Heights, New York, 10598, NY, USA
Yu Deng

Corresponding author

Correspondence to Yu Deng.

Editor information

Editors and Affiliations

Deakin University, Melbourne, VIC, Australia
Xiao Liu
University of Pau and Pays, Pau Cedex, France
Michael Mrissa
Fudan University, Shanghai Shi, China
Liang Zhang
LIRIS Lab, University Lyon 1, IUT, Villeurbanne Cedex, France
Djamal Benslimane
School of IT and Computer Science, University of Wollongong, Wollongong, NSW, Australia
Aditya Ghose
Harbin Institute of Technology, Harbin, China
Zhongjie Wang
Scientific and Technological Hub, Fondazione Bruno Kessler (FBK), Trento, Italy
Antonio Bucchiarone
Macquarie University, Sydney, NSW, Australia
Wei Zhang
Queen’s University, Kingston, ON, Canada
Ying Zou
Rochester Institute of Technology, Rochester, NY, USA
Qi Yu

About this paper

Cite this paper

Singh, G., Dasgupta, G., Deng, Y. (2019). Fast Nearest-Neighbor Classification Using RNN in Domains with Large Number of Classes. In: Liu, X., et al. Service-Oriented Computing – ICSOC 2018 Workshops. ICSOC 2018. Lecture Notes in Computer Science, vol 11434. Springer, Cham. https://doi.org/10.1007/978-3-030-17642-6_26

DOI: https://doi.org/10.1007/978-3-030-17642-6_26
Published: 10 April 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-17641-9
Online ISBN: 978-3-030-17642-6
eBook Packages: Computer Science, Computer Science (R0)
