BioKeySpotter: An Unsupervised Keyphrase Extraction Technique in the Biomedical Full-Text Collection

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 25)

Abstract

Extracting keyphrases from full text is a daunting task because many different concepts and themes are intertwined and extensive term variation exists in full text. In this chapter, we propose a novel unsupervised keyphrase extraction system, BioKeySpotter, which incorporates lexical syntactic features to weigh candidate keyphrases. The main contribution of our study is that BioKeySpotter combines Natural Language Processing (NLP), information extraction, and integration techniques to extract keyphrases from full text. The experimental results demonstrate that BioKeySpotter achieves higher accuracy than supervised learning algorithms.

Author information

Correspondence to Min Song.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Song, M., Tanapaisankit, P. (2012). BioKeySpotter: An Unsupervised Keyphrase Extraction Technique in the Biomedical Full-Text Collection. In: Holmes, D., Jain, L. (eds) Data Mining: Foundations and Intelligent Paradigms. Intelligent Systems Reference Library, vol 25. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23151-3_3

  • DOI: https://doi.org/10.1007/978-3-642-23151-3_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23150-6

  • Online ISBN: 978-3-642-23151-3

  • eBook Packages: Engineering, Engineering (R0)
