Developing an Efficient Mechanism for Retrieving Documents Using Automatically Generated Metadata

Bandyopadhyay, Sayanti; Bandyopadhyay, Amit Kumar

doi:10.1007/978-3-642-29219-4_9

Part of the book series: Communications in Computer and Information Science (CCIS, volume 269)

Included in the following conference series:

International Conference on Computing and Communication Systems

Abstract

Efficient metadata generation and document retrieval are becoming increasingly challenging as the volume of resources grows. Most document retrieval systems suffer from high recall with low precision. Effective automatic generation of metadata can reduce the time otherwise spent creating metadata manually. This paper attempts to develop a search algorithm that retrieves documents using automatically generated metadata. The fifteen elements of Dublin Core Metadata are discussed, and the addition of a further element consisting of popular search keywords for retrieving the document is suggested. The keywords in this suggested extended metadata element can be grouped by geographic location to personalize the search results. Precision can be increased, and recall reduced, by retrieving only the books of specific topic-ids for which the maximum number of words match the searched keywords. Different stemming mechanisms for the search keywords are also discussed. A flowchart of the proposed algorithm is given at the end.
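
The retrieval idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details and flowchart appear only in the full chapter. The field name `search_keywords` for the suggested extra element, the region codes, the toy suffix list, and the sample records are all assumptions made for illustration; only the fifteen Dublin Core element names are taken from the standard. The sketch stems the query and the metadata terms, counts the overlap, and returns only the records with the maximum number of matching words.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)

# Hypothetical name for the suggested additional element: popular search
# keywords, grouped by geographic region.
SEARCH_KEYWORDS = "search_keywords"

# Toy suffix list: a crude stand-in for a real stemmer such as Porter's.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Lowercase the word and strip at most one suffix from the toy list."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def record_terms(record, region):
    """Stemmed terms from the subject element plus the region's keywords."""
    words = record.get("subject", "").split()
    words += record.get(SEARCH_KEYWORDS, {}).get(region, [])
    return {stem(w) for w in words}


def search(records, query, region):
    """Return identifiers of the records with the maximum term overlap."""
    query_terms = {stem(w) for w in query.split()}
    scores = {
        r["identifier"]: len(query_terms & record_terms(r, region))
        for r in records
    }
    best = max(scores.values(), default=0)
    if best == 0:
        return []  # nothing matched, so nothing is retrieved
    return sorted(doc_id for doc_id, s in scores.items() if s == best)


# Invented sample records, for illustration only.
records = [
    {"identifier": "doc-1", "subject": "metadata generation libraries",
     SEARCH_KEYWORDS: {"IN": ["cataloguing", "indexing"]}},
    {"identifier": "doc-2", "subject": "document retrieval stemming",
     SEARCH_KEYWORDS: {"IN": ["searching"]}},
    {"identifier": "doc-3", "subject": "network routing"},
]

print(search(records, "retrieving documents stemmed", "IN"))  # ['doc-2']
print(search(records, "indexing", "IN"))                      # ['doc-1']
print(search(records, "indexing", "US"))                      # []
```

Keeping only the top-scoring records trades recall for precision, which is the direction the abstract describes. The last two calls show how grouping keywords by region changes the result for the same query.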

Author information

Authors and Affiliations

University Institute of Technology, University of Burdwan, Burdwan, 713104, W.B., India
Sayanti Bandyopadhyay
Department of Library and Information Science, University of Burdwan, Burdwan, 713104, W.B., India
Amit Kumar Bandyopadhyay

Editor information

Editors and Affiliations

School of Computing Science and Engineering, VIT University, 632014, Vellore, TN, India
P. Venkata Krishna
School of Computing and Engineering, VIT University, 632014, Vellore, TN, India
M. Rajasekhara Babu
London Metropolitan University, UK
Ezendu Ariwa

About this paper

Cite this paper

Bandyopadhyay, S., Bandyopadhyay, A.K. (2012). Developing an Efficient Mechanism for Retrieving Documents Using Automatically Generated Metadata. In: Krishna, P.V., Babu, M.R., Ariwa, E. (eds) Global Trends in Computing and Communication Systems. ObCom 2011. Communications in Computer and Information Science, vol 269. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29219-4_9

DOI: https://doi.org/10.1007/978-3-642-29219-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29218-7
Online ISBN: 978-3-642-29219-4
eBook Packages: Computer Science (R0)
