Learning Representations for Soft Skill Matching

Sayfullina, Luiza; Malmi, Eric; Kannala, Juho

doi:10.1007/978-3-030-11027-7_15

Luiza Sayfullina²⁶,
Eric Malmi²⁶ &
Juho Kannala²⁶

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11179))

Included in the following conference series:

International Conference on Analysis of Images, Social Networks and Texts

1098 Accesses
9 Citations
3 Altmetric

Abstract

Employers actively look for talents having not only specific hard skills but also various soft skills. To analyze the soft skill demands on the job market, it is important to be able to detect soft skill phrases from job advertisements automatically. However, a naive matching of soft skill phrases can lead to false positive matches when a soft skill phrase, such as friendly, is used to describe a company, a team, or another entity, rather than a desired candidate.

In this paper, we propose a phrase-matching-based approach which differentiates between soft skill phrases referring to a candidate vs. something else. The disambiguation is formulated as a binary text classification problem where the prediction is made for the potential soft skill based on the context where it occurs. To inform the model about the soft skill for which the prediction is made, we develop several approaches, including soft skill masking and soft skill tagging.

We compare several neural network based approaches, including CNN, LSTM and Hierarchical Attention Model. The proposed tagging-based input representation using LSTM achieved the highest recall of 83.92% on the job dataset when fixing a precision to 95%.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
https://github.com/AnonymousWWW18/soft_skills.
2.
The dataset is available at: https://www.kaggle.com/c/job-salary-prediction.
3.
https://www.crowdflower.com.
4.
https://github.com/muzaluisa/soft-skill-matching.
5.
https://nlp.stanford.edu/projects/glove/.
6.
The implementation of the model was adopted from: https://github.com/EdGENetworks/anuvada.

References

Bakhshi, H., Downing, J.M., Osborne, M.A., Schneider, P.: The future of skills employment in 2030. Technical report, Pearson PLC (2017)
Google Scholar
Bastian, M., et al.: LinkedIn skills: large-scale topic extraction and inference. In: Proceedings of the 8th ACM Conference on Recommender systems, pp. 1–8. ACM (2014)
Google Scholar
Chen, D., Manning, C.: A fast and accurate dependency parser using neural networks. In: EMNLP, pp. 740–750 (2014)
Google Scholar
Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Gated feedback recurrent neural networks. In: International Conference on Machine Learning, pp. 2067–2075 (2015)
Google Scholar
Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
Article Google Scholar
Jurafsky, D., Martin, J.H.: Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition (2009)
Google Scholar
Keskar, N.S., Mudigere, D., Nocedal, J., Smelyanskiy, M., Tang, P.T.P.: On large-batch training for deep learning: generalization gap and sharp minima. In: ICLR (2017)
Google Scholar
Kim, Y.: Convolutional neural networks for sentence classification, pp. 1746–1751 (2014)
Google Scholar
Kivimäki, I., et al.: A graph-based approach to skill extraction from text. In: Proceedings of TextGraphs-8 Graph-based Methods for NLP, pp. 79–87 (2013)
Google Scholar
Loper, E., Bird, S.: NLTK: the natural language toolkit. In: Proceedings of the ACL-02 Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics, ETMTNLP 2002 , pp. 63–70 (2002)
Google Scholar
Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: NIPS, pp. 3111–3119 (2013)
Google Scholar
Miller, G.A.: WordNet: a lexical database for English. Commun. ACM 38(11), 39–41 (1995)
Article Google Scholar
Nivre, J.: An efficient algorithm for projective dependency parsing. In: Proceedings of the 8th International Workshop on Parsing Technologies (IWPT). Citeseer (2003)
Google Scholar
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., et al.: Automatic differentiation in PyTorch. In: NIPS-W (2017)
Google Scholar
Řehůřek, R., Sojka, P.: Software framework for topic modelling with large Corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45–50 (2010)
Google Scholar
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533 (1986)
Article Google Scholar
Sayfullina, L., Malmi, E., Liao, Y., Jung, A.: Domain adaptation for resume classification using convolutional neural networks. In: van der Aalst, W.M.P., et al. (eds.) AIST 2017. LNCS, vol. 10716, pp. 82–93. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-73013-4_8
Chapter Google Scholar
Schulz, B.: The importance of soft skills: education beyond academic knowledge (2008)
Google Scholar
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
MathSciNet MATH Google Scholar
Tang, D., Qin, B., Liu, T.: Document modeling with gated recurrent neural network for sentiment classification. In: EMLNP, pp. 1422–1432 (2015)
Google Scholar
Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., Hovy, E.: Hierarchical attention networks for document classification. In: Proceedings of the 2016 Conference of the North American Chapter of the ACL: Human Language Technologies, pp. 1480–1489 (2016)
Google Scholar

Download references

Author information

Authors and Affiliations

Aalto University, Espoo, Finland
Luiza Sayfullina, Eric Malmi & Juho Kannala

Authors

Luiza Sayfullina
View author publications
You can also search for this author in PubMed Google Scholar
Eric Malmi
View author publications
You can also search for this author in PubMed Google Scholar
Juho Kannala
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Luiza Sayfullina .

Editor information

Editors and Affiliations

RWTH Aachen University, Aachen, Germany
Wil M. P. van der Aalst
University of Ljubljana, Ljubljana, Slovenia
Vladimir Batagelj
University of Mannheim, Mannheim, Germany
Goran Glavaš
National Research University Higher School of Economics, Moscow, Russia
Dmitry I. Ignatov
Institute of Mathematics and Mechanics, Yekaterinburg, Russia
Michael Khachay
National Research University Higher School of Economics, Moscow, Russia
Sergei O. Kuznetsov
National Research University Higher School of Economics , Saint Petersburg, Russia
Olessia Koltsova
National Research University Higher School of Economics, Moscow, Russia
Irina A. Lomazova
Moscow State University, Moscow, Russia
Natalia Loukachevitch
Loria, Vandoeuvre lès Nancy, France
Amedeo Napoli
University of Hamburg, Hamburg, Germany
Alexander Panchenko
University of Florida, Gainesville, FL, USA
Panos M. Pardalos
Ca Foscari University of Venice, Venice, Italy
Marcello Pelillo
National Research University Higher School of Economics, Nizhny Novgorod, Russia
Andrey V. Savchenko

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Sayfullina, L., Malmi, E., Kannala, J. (2018). Learning Representations for Soft Skill Matching. In: van der Aalst, W., et al. Analysis of Images, Social Networks and Texts. AIST 2018. Lecture Notes in Computer Science(), vol 11179. Springer, Cham. https://doi.org/10.1007/978-3-030-11027-7_15

Download citation

DOI: https://doi.org/10.1007/978-3-030-11027-7_15
Published: 31 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11026-0
Online ISBN: 978-3-030-11027-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics