Automatic Document Tagging in Social Semantic Digital Library

Xu, Xiaomei; Niu, Zhendong

doi:10.1007/978-3-642-10684-2_38


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5864)

Abstract

The emergence of Web 2.0 has created a large amount of annotation and personalization information about web resources. Extracting and utilizing this information to enhance the quality of services is a key target of modern digital libraries. In this paper, we present a novel Automatic Document Tagging (ADT) approach for digital libraries. In our approach, the ADT problem is formulated as a variant of the multi-class classification problem. Unlike the standard setting, however, the training data for ADT is collected from the user's historic tags and is only partially labeled. The incompleteness of the training data makes training more challenging. To overcome this problem, an efficient randomized online training algorithm (RPL) is proposed. The RPL algorithm has two phases: (i) random exploitation and (ii) classifier update. Experimental results on both synthetic and real-world data demonstrate the effectiveness of the approach.
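
The abstract gives no details of RPL beyond its two phases, so the following is only a generic illustration of online multi-class training from partial feedback, not the authors' algorithm. It assumes a linear scorer per tag, a hypothetical `explore_prob` parameter that controls how often a random tag is probed, and a single bit of feedback per round (whether the probed tag is among the user's tags). All function and parameter names are made up for this sketch.

```python
import random


def predict(weights, x):
    """Return the index of the tag whose weight vector scores x highest."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    return max(range(len(scores)), key=scores.__getitem__)


def train_partial_feedback(stream, num_tags, dim, explore_prob=0.2, seed=0):
    """Perceptron-style online training with one bit of feedback per round.

    `stream` yields (x, accepted) pairs, where x is a feature list and
    `accepted` is the (possibly incomplete) set of tags the user applied.
    Only membership of the probed tag in `accepted` is used for the update.
    """
    rng = random.Random(seed)
    weights = [[0.0] * dim for _ in range(num_tags)]
    for x, accepted in stream:
        greedy = predict(weights, x)
        # Phase 1 (assumed form): occasionally probe a random tag
        # instead of the current best guess.
        if rng.random() < explore_prob:
            probed = rng.randrange(num_tags)
        else:
            probed = greedy
        # Phase 2: update the classifier from the feedback on the probed tag.
        if probed in accepted:
            if probed != greedy:
                # A confirmed tag the model did not rank first: promote it.
                for i, x_i in enumerate(x):
                    weights[probed][i] += x_i
        else:
            # A rejected tag: demote it for this kind of document.
            for i, x_i in enumerate(x):
                weights[probed][i] -= x_i
    return weights


# Toy usage: two tags, each tied to one feature.
stream = [([1.0, 0.0], {0}), ([0.0, 1.0], {1})] * 20
weights = train_partial_feedback(stream, num_tags=2, dim=2)
print(predict(weights, [1.0, 0.0]), predict(weights, [0.0, 1.0]))
```

The random probing step matters because, with partial labels, a tag that is never proposed never receives feedback; how RPL itself balances this is described in the full paper, not here.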



Author information

Authors and Affiliations

School of Computer Science, Beijing Institute of Technology, Beijing, 100081, PRC
Xiaomei Xu & Zhendong Niu


Editor information

Editors and Affiliations

Department of Electronic Engineering, City University of Hong Kong, Hong Kong
Chi Sing Leung
School of Electrical Engineering and Computer Science, Kyungpook National University, 1370 Sankyuk-Dong, Puk-Gu, 702-701, Taegu, Korea
Minho Lee
School of Information Technology, King Mongkut’s University of Technology Thonburi, 126 Pracha-U-Thit Rd., Bangmod, Thungkru, 10140, Bangkok, Thailand
Jonathan H. Chan


Cite this paper

Xu, X., Niu, Z. (2009). Automatic Document Tagging in Social Semantic Digital Library. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5864. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10684-2_38


DOI: https://doi.org/10.1007/978-3-642-10684-2_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10682-8
Online ISBN: 978-3-642-10684-2
eBook Packages: Computer Science, Computer Science (R0)
