ST-LDA: High Quality Similar Words Augmented LDA for Service Clustering
Service discovery is a key problem in services computing and is essential to improving the accuracy and efficiency of both service composition and recommendation. Service clustering is a major way to facilitate service discovery. The main technical difficulty in solving the service clustering problem lies in the semantic gap among services. Traditional approaches such as LDA perform well in service clustering to some extent. However, their performance is still limited by unavoidable semantic noise words. To bridge this gap, we propose a novel solution, ST-LDA (short for “Similar Words and TF-IDF Augmented Latent Dirichlet Allocation”), which approaches the problem from the perspective of similar-word learning and noise-word filtering to improve service clustering. Specifically, we adopt Word2Vec to adapt the representation of services and to learn a list of similar words from the service corpus. We further integrate TF-IDF into our similarity calculation to filter out noise words. In this way, we enhance LDA with a strategy for finding and filtering similar words for service clustering. We conduct extensive experiments on a real-world dataset, which demonstrate that our approach can improve the efficiency of service clustering.
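The similar-word finding and filtering idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact similarity formula, so the product of cosine similarity and TF-IDF weight used here is an assumption, as are the function names, the toy corpus, and the hand-written two-dimensional vectors standing in for trained Word2Vec embeddings. The LDA step itself is omitted.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return {word: highest TF-IDF score across the tokenized documents}."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = {}
    for doc in docs:
        counts = Counter(doc)
        for word, count in counts.items():
            score = (count / len(doc)) * math.log(n_docs / doc_freq[word])
            scores[word] = max(scores.get(word, 0.0), score)
    return scores


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similar_words(word, vectors, weights, top_n=2, min_weight=0.0):
    """Rank candidates by cosine similarity times TF-IDF weight (an assumed
    combination); candidates whose weight is <= min_weight count as noise."""
    ranked = []
    for candidate, vec in vectors.items():
        if candidate == word:
            continue
        weight = weights.get(candidate, 0.0)
        if weight <= min_weight:
            continue  # filtered out as a noise word
        ranked.append((cosine(vectors[word], vec) * weight, candidate))
    ranked.sort(reverse=True)
    return [candidate for _, candidate in ranked[:top_n]]


# Toy service descriptions (hypothetical data).
docs = [
    ["map", "location", "api", "service"],
    ["geocode", "location", "api", "service"],
    ["payment", "invoice", "api", "service"],
]

# Hypothetical embeddings standing in for trained Word2Vec vectors.
vectors = {
    "map": [0.9, 0.1],
    "location": [0.8, 0.2],
    "geocode": [0.85, 0.15],
    "payment": [0.1, 0.9],
    "invoice": [0.1, 0.8],
    "api": [0.6, 0.5],
    "service": [0.5, 0.5],
}

weights = tf_idf(docs)
result = similar_words("map", vectors, weights)
print(result)
```

In this toy corpus, "api" and "service" occur in every description, so their TF-IDF weight is zero and they are never proposed as similar words, even though their vectors are moderately close to that of "map". The remaining candidates are ranked by the weighted similarity, which here yields "geocode" and "location" for "map".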
Keywords: TF-IDF, Latent Dirichlet Allocation, Word2vec, Web service clustering
This work was supported by the National Natural Science Foundation of China (Nos. 61672387 and 61702378), and the Natural Science Foundation of Hubei Province of China (Nos. 2018CFB511 and 2017CKB894).