Applying a Lightweight Iterative Merging Chinese Segmentation in Web Image Annotation

Huang, Chuen-Min; Chang, Yen-Jia

doi:10.1007/978-3-642-39712-7_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7988)

Included in the following conference series:

International Workshop on Machine Learning and Data Mining in Pattern Recognition

Abstract

Traditional content-based image retrieval (CBIR) methods rely on visual features to identify objects in an image and use predefined terms to annotate images; as a result, they fail to capture implicit meanings. Recent textual content analysis methods applied to image annotation have been criticized for their computational complexity. In this research, we propose a corpus-free term segmentation method with relatively light computation, "Iterative Merging Chinese Segmentation (IMCS)," which identifies representative terms from a single web page to obtain anecdotes that semantically enrich the target image. Its minimal computational requirements allow characters and words to be shared and used at fine granularities without prohibitive cost. In the experiment, the method achieves a precision rate of 86.02% and acceptance rates of 75% and 68% from expert and user ratings, respectively. In performance testing, it takes only 0.006 seconds to process each image in a test collection of 1,728 items.
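The abstract does not give the merge rule used by IMCS, so the following is only a minimal sketch of the general idea of corpus-free iterative merging, not the authors' algorithm. It assumes a frequency-driven pairwise merge (similar in spirit to byte-pair encoding): the text of a single page is split into characters, and the most frequent adjacent pair of units is repeatedly joined until no pair occurs at least `min_count` times. The function names, the `min_count` and `max_rounds` parameters, and the sample sentence are all hypothetical.

```python
import re
from collections import Counter


def split_segments(text):
    """Split text into runs of CJK ideographs so merges never cross punctuation."""
    return [list(run) for run in re.findall(r"[\u4e00-\u9fff]+", text)]


def merge_pair(units, pair):
    """Join every non-overlapping occurrence of `pair` in `units` into one unit."""
    merged, i = [], 0
    while i < len(units):
        if i + 1 < len(units) and (units[i], units[i + 1]) == pair:
            merged.append(units[i] + units[i + 1])
            i += 2
        else:
            merged.append(units[i])
            i += 1
    return merged


def iterative_merge(text, min_count=2, max_rounds=50):
    """Assumed merge rule (not from the paper): join the most frequent adjacent pair."""
    segments = split_segments(text)
    for _ in range(max_rounds):
        pairs = Counter()
        for seg in segments:
            pairs.update(zip(seg, seg[1:]))
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself.
        best, count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < min_count:
            break
        segments = [merge_pair(seg, best) for seg in segments]
    return segments


def candidate_terms(text, min_count=2):
    """Count the multi-character units that survive merging, most frequent first."""
    units = (u for seg in iterative_merge(text, min_count) for u in seg)
    return Counter(u for u in units if len(u) > 1).most_common()


# Hypothetical page text: the repeated phrase is merged into a single term.
print(candidate_terms("台北動物園很大。我們去台北動物園。"))  # [('台北動物園', 2)]
```

Because the sketch only counts adjacent pairs within the page itself, it needs no dictionary or training corpus, which is the property the abstract emphasizes. A real system would still need a way to rank or filter the resulting terms before using them as annotations.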

Author information

Authors and Affiliations

Department of Information Management, National Yunlin University of Science & Technology, Taiwan, R.O.C.
Chuen-Min Huang & Yen-Jia Chang

Editor information

Editors and Affiliations

Institute of Computer Vision and Applied Computer Sciences, IBaI, Leipzig, Germany
Petra Perner

About this paper

Cite this paper

Huang, CM., Chang, YJ. (2013). Applying a Lightweight Iterative Merging Chinese Segmentation in Web Image Annotation. In: Perner, P. (eds) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_14

DOI: https://doi.org/10.1007/978-3-642-39712-7_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science, Computer Science (R0)