Chinese Weblog Pages Classification Based on Folksonomy and Support Vector Machines

Wang, Xiaoyue; Bai, Rujiang; Liao, Junhua

doi:10.1007/978-3-540-72839-9_27

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4476)

Included in the following conference series:

International Workshop on Autonomous Intelligent Systems: Multi-Agents and Data Mining

Abstract

For centuries, classification has been used to provide context and direction in every area of human knowledge. Standard machine learning techniques such as support vector machines and related large-margin methods have been applied successfully to this task. Unfortunately, automatic classifiers often misclassify items. Folksonomy, a new manual classification scheme in which users tag items with freely chosen keywords, can effectively resolve this problem. In a folksonomy, each user attaches tags to an item for their own classification purposes, so the tags reflect many people's viewpoints. Because the tags are drawn from the users' own vocabulary and reflect many viewpoints, the classification results are easy for ordinary users to understand. Although folksonomy scales much better than other manual classification schemes, it cannot cope with a very large number of items, such as all the weblog articles on the Internet. To solve this problem, we propose a new classification method, FSVMC (folksonomy and support vector machine classifier). FSVMC uses a support vector machine as a Tag-agent, a program that determines whether a particular tag should be attached to a weblog page, while folksonomy is used to categorize the weblog articles. In addition, we propose a method for creating a candidate tag database, which is a list of tags that may be attached to weblog pages. Experimental results indicate that our method is more flexible and effective than traditional methods.
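The Tag-agent idea described in the abstract can be illustrated with a small sketch: one binary linear SVM per candidate tag, each deciding whether its tag should be attached to a page. This is not the authors' implementation. The toy pages, the function names (`build_candidate_tags`, `train_tag_agents`, `suggest_tags`), the rule that a candidate tag must appear on at least `min_pages` pages, the bag-of-words features, and the plain subgradient training loop are all assumptions made for illustration. Pages are assumed to be already segmented into whitespace-separated tokens, which real Chinese text would need a word segmenter for.

```python
from collections import Counter

# Toy, pre-tokenised pages with user-assigned tags (hypothetical data).
TAGGED_PAGES = [
    ("python code tutorial loops", {"programming", "misc"}),
    ("python code functions debugging", {"programming"}),
    ("noodle soup recipe pork", {"cooking"}),
    ("dumpling recipe pork cabbage", {"cooking"}),
]

def build_candidate_tags(tagged_pages, min_pages=2):
    """Candidate tag list: tags used on at least `min_pages` pages (assumed rule)."""
    usage = Counter(tag for _, tags in tagged_pages for tag in tags)
    return sorted(tag for tag, n in usage.items() if n >= min_pages)

def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Batch subgradient descent on the L2-regularised hinge loss; y in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def train_tag_agents(tagged_pages):
    """Train one SVM Tag-agent per candidate tag; returns (vocab, agents)."""
    vocab = sorted({w for text, _ in tagged_pages for w in text.lower().split()})
    X = [featurize(text, vocab) for text, _ in tagged_pages]
    agents = {}
    for tag in build_candidate_tags(tagged_pages):
        # Pages carrying the tag are positives, all other pages are negatives.
        y = [1.0 if tag in tags else -1.0 for _, tags in tagged_pages]
        agents[tag] = train_linear_svm(X, y)
    return vocab, agents

def suggest_tags(text, vocab, agents):
    """Attach every tag whose agent gives the page a positive decision value."""
    x = featurize(text, vocab)
    return {tag for tag, (w, b) in agents.items()
            if sum(wj * xj for wj, xj in zip(w, x)) + b > 0}

vocab, agents = train_tag_agents(TAGGED_PAGES)
print(suggest_tags("python code debugging tips", vocab, agents))
print(suggest_tags("pork dumpling soup", vocab, agents))
```

In this sketch the tag "misc" is used on only one page, so it never enters the candidate list and no agent is trained for it. Because each agent decides independently, a page can receive several tags or none, which matches the per-tag role the abstract assigns to a Tag-agent; a library SVM could replace the hand-written trainer without changing the structure.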

Author information

Authors and Affiliations

Shandong University of Technology Library, Zibo 255049, China
Xiaoyue Wang, Rujiang Bai & Junhua Liao

Editor information

Vladimir Gorodetsky, Chengqi Zhang, Victor A. Skormin, Longbing Cao

Cite this paper

Wang, X., Bai, R., Liao, J. (2007). Chinese Weblog Pages Classification Based on Folksonomy and Support Vector Machines. In: Gorodetsky, V., Zhang, C., Skormin, V.A., Cao, L. (eds) Autonomous Intelligent Systems: Multi-Agents and Data Mining. AIS-ADM 2007. Lecture Notes in Computer Science, vol 4476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72839-9_27

DOI: https://doi.org/10.1007/978-3-540-72839-9_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72838-2
Online ISBN: 978-3-540-72839-9
eBook Packages: Computer Science, Computer Science (R0)
