ST-LDA: High Quality Similar Words Augmented LDA for Service Clustering

  • Yi Zhao
  • Keqing He
  • Yu Qiao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11335)


Service discovery is a key problem in services computing and is essential to improving the accuracy and efficiency of both service composition and service recommendation. Service clustering is a major way to facilitate service discovery. The main technical difficulty in solving the service clustering problem lies in the semantic gap among services. Traditional approaches such as LDA perform well in service clustering to some extent, but their performance is still limited by semantic noise words, which are unavoidable in service descriptions. To bridge this gap, we propose a novel solution, ST-LDA (short for “Similar Words and TF-IDF Augmented Latent Dirichlet Allocation”), which approaches the problem from the perspective of similar-word learning and noise-word filtering. Specifically, we adopt Word2Vec to adapt the representation of services and to learn a list of similar words from the service corpus. We further integrate TF-IDF into the similarity calculation to filter out noise words. In this way, LDA is enhanced with a strategy for finding and filtering similar words for service clustering. Extensive experiments on a real-world dataset demonstrate that our approach can improve the efficiency of service clustering.
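The augmentation idea described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact way TF-IDF enters the similarity calculation, so the sketch simply assumes that words with a low TF-IDF weight are treated as noise and are not expanded. The toy documents, the two-dimensional word vectors (standing in for trained Word2Vec embeddings), the function names, and both thresholds are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {word: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(w for doc in docs for w in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {w: (c / len(doc)) * math.log(n / df[w]) for w, c in tf.items()}
        )
    return weights


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def augment(docs, vectors, sim_threshold=0.9, tfidf_threshold=0.05):
    """Append similar words to each document.

    Assumption: a word whose TF-IDF weight is below tfidf_threshold is
    treated as a noise word and is never expanded.
    """
    weights = tf_idf(docs)
    vocab = sorted(vectors)
    augmented = []
    for doc, w in zip(docs, weights):
        extra = []
        for word in dict.fromkeys(doc):  # unique words, original order
            if w[word] < tfidf_threshold or word not in vectors:
                continue
            for cand in vocab:
                if cand == word or cand in doc:
                    continue
                if cosine(vectors[word], vectors[cand]) >= sim_threshold:
                    extra.append(cand)
        augmented.append(doc + list(dict.fromkeys(extra)))
    return augmented


# Hypothetical service descriptions, already tokenized.
docs = [
    ["map", "location", "api", "service"],
    ["photo", "image", "api", "service"],
    ["weather", "forecast", "api", "service"],
]

# Hypothetical 2-D embeddings standing in for trained Word2Vec vectors.
vectors = {
    "map": (1.0, 0.1),
    "geocoding": (0.95, 0.15),
    "photo": (0.1, 1.0),
    "picture": (0.12, 0.98),
    "api": (0.5, 0.5),
    "service": (0.5, 0.52),
}

augmented = augment(docs, vectors)
for doc in augmented:
    print(doc)
```

In this toy corpus, "api" and "service" occur in every description, so their TF-IDF weight is zero and they are never expanded, while "map" and "photo" each gain one nearby word. The augmented documents would then be passed to any LDA implementation in place of the raw descriptions.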


Keywords: TF-IDF, Latent Dirichlet Allocation, Word2vec, Web service clustering



This work was supported by the National Natural Science Foundation of China (Nos. 61672387 and 61702378), and the Natural Science Foundation of Hubei Province of China (Nos. 2018CFB511 and 2017CKB894).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science, Wuhan University, Wuhan, China
