Abstract
We propose a multi-label classification method for short text based on a similarity graph and a restart random walk model. First, a similarity graph is built with samples and labels as nodes, and the edge weights are computed from external knowledge, which yields an initial matching degree between each sample and the label set. Next, a label dependency graph is built with labels as vertices, and the matching degree from the first step serves as the initial prediction value for iteratively computing the relationship between the sample and each node until the probability distribution becomes stable. The resulting relationship vector is the label probability distribution that the method predicts for the sample. Experimental results show that the proposed method provides a more efficient and reliable algorithm for multi-label short-text classification.
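The iteration described in the abstract can be illustrated with a minimal sketch of a random walk with restart over a small label graph. This is not the authors' implementation: the function name `random_walk_with_restart`, the restart probability of 0.15, the convergence tolerance, and the toy edge weights are all assumptions made for illustration. The update used is the standard form p ← (1 − c)·Wᵀp + c·p₀, where W is the row-normalised weight matrix and p₀ is the normalised initial matching degree.

```python
def random_walk_with_restart(adjacency, initial, restart=0.15,
                             tol=1e-9, max_iter=1000):
    """Iterate p <- (1 - c) * W^T p + c * p0 until p stops changing.

    adjacency: square list of non-negative edge weights between labels
               (hypothetical values; the paper derives its own weights).
    initial:   non-negative initial matching degrees, one per label.
    restart:   restart probability c (0.15 is an assumed default).
    """
    n = len(initial)
    total = sum(initial)
    if total <= 0:
        raise ValueError("initial scores must have a positive sum")
    # Normalise the starting scores into a probability distribution.
    p0 = [x / total for x in initial]

    # Row-normalise the weights into transition probabilities.
    # A node without outgoing edges gets a self-loop so no mass is lost.
    transition = []
    for i, row in enumerate(adjacency):
        s = sum(row)
        if s > 0:
            transition.append([w / s for w in row])
        else:
            transition.append([1.0 if j == i else 0.0 for j in range(n)])

    p = list(p0)
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(p[i] * transition[i][j] for i in range(n))
            + restart * p0[j]
            for j in range(n)
        ]
        # L1 change between successive distributions.
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Toy example: three labels with symmetric, made-up dependency weights.
label_graph = [
    [0.0, 2.0, 1.0],
    [2.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
matching_degree = [0.6, 0.3, 0.1]
label_probabilities = random_walk_with_restart(label_graph, matching_degree)
```

The returned list sums to 1 and can be read as a label probability distribution for the sample. A larger restart probability keeps the result closer to the initial matching degree, while a smaller one lets the label dependency graph dominate.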
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (Nos. 61762078, 61862058, 61967013) and the Youth Teacher Scientific Capability Promoting Project of NWNU (No. NWNU-LKQN-16-20).
Copyright information
© 2020 IFIP International Federation for Information Processing
About this paper
Cite this paper
Li, X., Yang, F., Ma, Y., Ma, H. (2020). Multi-label Classification of Short Text Based on Similarity Graph and Restart Random Walk Model. In: Shi, Z., Vadera, S., Chang, E. (eds) Intelligent Information Processing X. IIP 2020. IFIP Advances in Information and Communication Technology, vol 581. Springer, Cham. https://doi.org/10.1007/978-3-030-46931-3_7
DOI: https://doi.org/10.1007/978-3-030-46931-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-46930-6
Online ISBN: 978-3-030-46931-3
eBook Packages: Computer Science, Computer Science (R0)