Abstract
Service discovery is a key problem in services computing and is essential for improving the accuracy and efficiency of both service composition and service recommendation. Service clustering is a major way to facilitate service discovery. The main technical difficulty in service clustering lies in the semantic gap among services. Traditional approaches such as LDA perform well in service clustering to some extent, but their performance is still limited by unavoidable semantic noise words. To bridge this gap, we propose a novel solution, ST-LDA (short for “Similar Words and TF-IDF Augmented Latent Dirichlet Allocation”), which improves service clustering through similar-word learning and noise-word filtering. Specifically, we adopt Word2Vec to adapt the representation of services and to learn a list of similar words from the service corpus. We then integrate TF-IDF into the similarity calculation to filter noise words. In this way, LDA is enhanced with a strategy for finding and filtering similar words for service clustering. Extensive experiments on a real-world dataset demonstrate that our approach can improve the efficiency of service clustering.
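The idea of finding similar words and then filtering noise words with TF-IDF can be illustrated with a minimal sketch. This is not the ST-LDA implementation: the toy corpus, the hand-written two-dimensional vectors (standing in for trained Word2Vec embeddings), the `min_weight` threshold, and all function names are assumptions made only for illustration. The LDA step itself is omitted.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {word: tf-idf weight} dict per tokenized document."""
    n_docs = len(corpus)
    doc_freq = Counter(word for doc in corpus for word in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similar_words(word, vectors, max_weight, top_n=2, min_weight=0.05):
    """Rank other words by cosine similarity, skipping low TF-IDF (noise) words."""
    candidates = [
        (other, cosine(vectors[word], vec))
        for other, vec in vectors.items()
        if other != word and max_weight.get(other, 0.0) >= min_weight
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in candidates[:top_n]]

# Hypothetical service descriptions, already tokenized.
corpus = [
    ["map", "api", "location", "route"],
    ["weather", "api", "forecast", "location"],
    ["payment", "api", "invoice", "billing"],
]

# Hypothetical embeddings; a real pipeline would train Word2Vec on the corpus.
vectors = {
    "map": [0.90, 0.10], "route": [0.85, 0.15], "location": [0.80, 0.20],
    "api": [0.88, 0.12], "weather": [0.50, 0.50], "forecast": [0.40, 0.60],
    "payment": [0.10, 0.90], "invoice": [0.05, 0.95], "billing": [0.10, 0.85],
}

# Keep each word's highest TF-IDF weight across documents.
max_weight = {}
for doc_weights in tf_idf(corpus):
    for word, weight in doc_weights.items():
        max_weight[word] = max(max_weight.get(word, 0.0), weight)

# "api" occurs in every document, so its IDF (and TF-IDF) is 0 and it is dropped.
filtered = similar_words("map", vectors, max_weight)
unfiltered = similar_words("map", vectors, max_weight, min_weight=0.0)
print(filtered, unfiltered)
```

In this toy setting the word `api` is the closest neighbour of `map` by cosine similarity, yet it carries no discriminative information because it appears in every description. The TF-IDF threshold removes it from the similar-word list before that list would be used to augment a topic model.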
Acknowledgement
This work was supported by the National Natural Science Foundation of China (Nos. 61672387 and 61702378), and the Natural Science Foundation of Hubei Province of China (Nos. 2018CFB511 and 2017CKB894).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhao, Y., He, K., Qiao, Y. (2018). ST-LDA: High Quality Similar Words Augmented LDA for Service Clustering. In: Vaidya, J., Li, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2018. Lecture Notes in Computer Science, vol 11335. Springer, Cham. https://doi.org/10.1007/978-3-030-05054-2_4
DOI: https://doi.org/10.1007/978-3-030-05054-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-05053-5
Online ISBN: 978-3-030-05054-2
eBook Packages: Computer Science, Computer Science (R0)