Core Competencies Keywords Discovering Algorithm for Employment Advertisements

  • Xiaoping Du
  • Lelai Deng
  • Xingzhi Zhang
  • Qinghong Yang
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 849)


As librarianship evolves, it is important to understand the changes taking place in its core competencies. One good way to do this is to analyze job advertisements (ads) for professional librarian positions. Most related work relies on manual methods, and the existing semi-automatic framework requires a classifier built from manual rule sets as input. In this paper, a framework and a multi-label short text clustering algorithm, ICNTC, are proposed to automatically identify core competencies from job ads. Data from the American Library Association (ALA) JobLIST from 2009 through 2014 are used to validate the method. Analysis of the experimental results shows that the method identifies most core competencies and performs well in evaluating the frequency of each competency. The accuracy of keyword extraction on the ALA dataset is 89 ± 1.3%.


Core competency, Keywords discovering, Job advertisement, Text clustering



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Xiaoping Du (1)
  • Lelai Deng (1, corresponding author)
  • Xingzhi Zhang (1)
  • Qinghong Yang (1)

  1. Beihang University, Beijing, China
