Identification of Biomedical Articles with Highly Related Core Contents

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2017)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10191)

Abstract

Given a biomedical article a, identifying the articles whose core contents (research goals, backgrounds, and conclusions) are similar to those of a is essential for surveying and cross-validating the highly related biomedical evidence presented in a. We therefore present CCSE (Core Content Similarity Estimation), a technique that retrieves these highly related articles by estimating and integrating three kinds of inter-article similarity: goal similarity, background similarity, and conclusion similarity. CCSE works on the titles and abstracts of biomedical articles, which are publicly available. Experimental results show that CCSE performs better than PubMed (a popular biomedical search engine) and typical techniques in identifying the scholarly articles that biomedical experts judge to have core contents focusing on the same gene-disease associations. This contribution is essential for the retrieval, clustering, mining, and validation of biomedical evidence in the literature.
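The idea of estimating three part-wise similarities and integrating them into one score can be illustrated with a minimal sketch. It is not the CCSE method itself, whose estimation and integration steps the abstract does not specify. The sketch assumes that the goal, background, and conclusion text of each article has already been extracted from its title and abstract, uses cosine similarity over raw term counts as a stand-in similarity measure, and integrates the three scores with an unweighted mean. The names `CoreContent`, `core_similarity`, and `rank_candidates` are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class CoreContent:
    """Hypothetical container; the three parts are assumed to be pre-extracted."""
    goal: str
    background: str
    conclusion: str


def term_vector(text: str) -> Counter:
    """Lowercased bag of alphanumeric tokens with raw counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two term-count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def core_similarity(a: CoreContent, b: CoreContent) -> float:
    """Assumed integration: unweighted mean of the three part-wise similarities."""
    pairs = [
        (a.goal, b.goal),
        (a.background, b.background),
        (a.conclusion, b.conclusion),
    ]
    sims = [cosine(term_vector(x), term_vector(y)) for x, y in pairs]
    return sum(sims) / len(sims)


def rank_candidates(
    target: CoreContent, candidates: dict[str, CoreContent]
) -> list[tuple[str, float]]:
    """Return (article id, score) pairs, most similar first."""
    scored = [(key, core_similarity(target, c)) for key, c in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_candidates(target, candidates)` on a target article and a dictionary of candidate articles returns the candidates ordered by the integrated score. A weighted combination, or a different similarity measure for each part, could replace the assumptions made here.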

Notes

  1. Google Scholar is available at https://scholar.google.com.

  2. PubMed is available at http://www.ncbi.nlm.nih.gov/pubmed.

  3. DisGeNET is available at http://www.disgenet.org/web/DisGeNET/menu/home.

  4. GAD is available at http://geneticassociationdb.nih.gov.

  5. CTD is available at http://ctdbase.org.

  6. PMC is available at http://www.ncbi.nlm.nih.gov/pmc.

Acknowledgment

This research was supported by the Ministry of Science and Technology (grant ID: MOST 105-2221-E-320-004) and Tzu Chi University (grant IDs: TCRPP103020 and TCRPP104010), Taiwan.

Author information

Correspondence to Rey-Long Liu.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Liu, RL. (2017). Identification of Biomedical Articles with Highly Related Core Contents. In: Nguyen, N., Tojo, S., Nguyen, L., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2017. Lecture Notes in Computer Science, vol 10191. Springer, Cham. https://doi.org/10.1007/978-3-319-54472-4_21

  • DOI: https://doi.org/10.1007/978-3-319-54472-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-54471-7

  • Online ISBN: 978-3-319-54472-4

  • eBook Packages: Computer Science, Computer Science (R0)
