Abstract
Multitask learning, the concept of solving multiple related tasks in parallel, promises to improve generalization performance over the traditional divide-and-conquer approach in machine learning. The training signals of related tasks induce an inductive bias that helps the learner find better hypotheses. This paper reviews the concept of multitask learning and prior work on it, and presents an experimental evaluation on a large-scale text classification problem. A deep neural network is trained to classify English newswire stories by their overlapping topics in parallel, and the results are compared to the traditional approach of training a separate deep neural network for each topic. The results confirm the initial hypothesis that multitask learning improves generalization.
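As a rough illustration of the setup the abstract describes (not the authors' actual architecture; the vocabulary size, layer sizes, and topic count below are illustrative assumptions), a multitask model in Keras-style code shares its hidden layers across all topics and attaches one sigmoid output head per topic, so the overlapping topic labels are learned in parallel from a common representation:

```python
from tensorflow.keras import layers, models

# Hypothetical dimensions, chosen for illustration only.
VOCAB_SIZE = 10_000   # size of the bag-of-words input features
NUM_TOPICS = 4        # number of overlapping topic labels learned jointly

inputs = layers.Input(shape=(VOCAB_SIZE,))

# Shared hidden layers: the common representation that multitask
# learning exploits across the related topic-classification tasks.
shared = layers.Dense(512, activation="relu")(inputs)
shared = layers.Dropout(0.5)(shared)
shared = layers.Dense(256, activation="relu")(shared)

# One sigmoid head per topic: topics overlap, so each head makes an
# independent binary decision rather than a softmax over classes.
outputs = [
    layers.Dense(1, activation="sigmoid", name=f"topic_{i}")(shared)
    for i in range(NUM_TOPICS)
]

model = models.Model(inputs=inputs, outputs=outputs)
# One binary cross-entropy loss per head; their sum is minimized jointly.
model.compile(optimizer="adam", loss="binary_crossentropy")
```

The single-task baseline the abstract compares against would correspond to training one such network per topic (the same trunk with a single output head), so any generalization gain in the multitask variant comes from the shared hidden layers being shaped by the training signals of all topics at once.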
Acknowledgments
We thank Microsoft Research for supporting this work by providing a grant for the Microsoft Azure cloud platform.
© 2016 Springer International Publishing AG
Cite this paper
Noushahr, H.G., Ahmadi, S. (2016). Multitask Learning for Text Classification with Deep Neural Networks. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXXIII. SGAI 2016. Springer, Cham. https://doi.org/10.1007/978-3-319-47175-4_8
Print ISBN: 978-3-319-47174-7
Online ISBN: 978-3-319-47175-4