Abstract
This paper addresses the problem of searching literature digital libraries for publications “related” to a given publication. Existing approaches do not take publication topics into account when computing relatedness, which allows topic diffusion across the publications returned by a query. We propose a new way to measure “relatedness” that incorporates the “contexts” (representing topics) of publications. We use existing ontology terms as contexts: each publication is assigned to its relevant contexts, and each context characterizes one or more publication topics. We define three context-based relatedness measures: (a) relatedness between two contexts (context-to-context relatedness), computed from the publications assigned to the contexts and from the structure of the contexts in the context hierarchy; (b) relatedness between a context and a paper (paper-to-context relatedness), which is used to rank the relatedness of contexts with respect to a paper; and (c) relatedness between two papers (paper-to-paper relatedness), computed from both the paper-to-context and the context-to-context relatedness measurements.
Our experiments, which use existing biomedical ontology terms as contexts for genomics-oriented publications, indicate that the context-based approach is accurate and solves the topic diffusion problem by effectively classifying and ranking the papers related to a given paper based on the selected contexts of that paper.
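The abstract does not give the formulas behind the three measures, so the following is only a minimal sketch of how they might fit together, not the authors' method. Everything in it is an assumption made for illustration: the toy context hierarchy, the paper assignments, the given paper-to-context scores, the use of Jaccard overlap and path length for context-to-context relatedness, the equal 0.5 weights, and all function names.

```python
from itertools import product

# Hypothetical toy hierarchy: context -> parent context (None marks the root).
PARENT = {
    "root": None,
    "binding": "root",
    "dna_binding": "binding",
    "transport": "root",
}

# Hypothetical assignment of papers to contexts.
PAPERS_IN = {
    "binding": {"p1", "p2", "p3"},
    "dna_binding": {"p1", "p2"},
    "transport": {"p4"},
}

# Hypothetical paper-to-context relatedness scores in [0, 1], taken as given.
PAPER_CONTEXT = {
    "p1": {"dna_binding": 0.9, "binding": 0.5},
    "p2": {"dna_binding": 0.8},
    "p4": {"transport": 1.0},
}


def ancestors(context):
    """Return the context followed by its ancestors, nearest first."""
    chain = []
    while context is not None:
        chain.append(context)
        context = PARENT[context]
    return chain


def path_length(a, b):
    """Number of edges between two contexts via their nearest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, context in enumerate(up_a):
        if context in up_b:
            return steps_a + up_b.index(context)
    raise ValueError("contexts do not share a root")


def context_to_context(a, b):
    """Assumed measure: average of paper overlap and hierarchy closeness."""
    papers_a = PAPERS_IN.get(a, set())
    papers_b = PAPERS_IN.get(b, set())
    union = papers_a | papers_b
    overlap = len(papers_a & papers_b) / len(union) if union else 0.0
    closeness = 1.0 / (1.0 + path_length(a, b))
    return 0.5 * overlap + 0.5 * closeness


def paper_to_paper(p, q):
    """Assumed measure: context-to-context relatedness averaged over all
    context pairs, weighted by the two paper-to-context scores."""
    scores_p, scores_q = PAPER_CONTEXT[p], PAPER_CONTEXT[q]
    total = 0.0
    norm = 0.0
    for (cp, wp), (cq, wq) in product(scores_p.items(), scores_q.items()):
        total += wp * wq * context_to_context(cp, cq)
        norm += wp * wq
    return total / norm if norm else 0.0


if __name__ == "__main__":
    print(paper_to_paper("p1", "p2"))  # papers sharing a context
    print(paper_to_paper("p1", "p4"))  # papers in distant contexts
```

In this toy setup, two papers that share the `dna_binding` context score higher than two papers whose contexts meet only at the root, which is the behavior a context-based measure is meant to produce when limiting topic diffusion.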
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ratprasartporn, N., Ozsoyoglu, G. (2007). Finding Related Papers in Literature Digital Libraries. In: Kovács, L., Fuhr, N., Meghini, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2007. Lecture Notes in Computer Science, vol 4675. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74851-9_23
DOI: https://doi.org/10.1007/978-3-540-74851-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74850-2
Online ISBN: 978-3-540-74851-9
eBook Packages: Computer Science, Computer Science (R0)