Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM

  • Conference paper
Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data (NLP-NABD 2016, CCL 2016)

Abstract

Biomedical named entity recognition is a fundamental step in biomedical information extraction and remains challenging. In recent years, neural networks have been applied to entity recognition to avoid complex hand-designed features derived from various linguistic analyses. However, conventional neural network systems are limited in their ability to exploit long-range dependencies in sentences. In this paper, we adopt a bidirectional recurrent neural network with LSTM units to identify biomedical entities, and add twin word embeddings and a sentence vector to enrich the input information, so that complex feature extraction can be skipped. In the testing phase, the Viterbi algorithm is used to filter out illogical label sequences. Experiments on the BioCreative II GM corpus show that our system achieves an F-score of 88.61%, which outperforms CRF models using complex hand-designed features and is 6.74% higher than RNNs.
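
The Viterbi filtering step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the tagging scheme or how transitions are scored, so the code below assumes a BIO scheme and a hard constraint (an `I` tag may not follow `O` or start a sentence). The names `LABELS`, `allowed`, and `viterbi` are hypothetical, and the per-token scores stand in for the network's output log-probabilities. This is not the authors' implementation.

```python
import math

# Assumption: BIO tagging scheme (the abstract does not name the scheme).
LABELS = ["B", "I", "O"]


def allowed(prev, cur):
    """Hard constraint: an I tag may only follow B or I."""
    return not (cur == "I" and prev == "O")


def viterbi(scores):
    """Return the best label sequence that contains no illogical transition.

    scores: one dict per token, mapping each label to a log-probability
    (a stand-in for the tagger's per-token output).
    """
    if not scores:
        return []
    # A sentence may not start with I, so that state gets -inf.
    best = {lab: (scores[0][lab] if lab != "I" else -math.inf) for lab in LABELS}
    back = []  # one back-pointer table per token after the first
    for tok in scores[1:]:
        new_best, pointers = {}, {}
        for cur in LABELS:
            # Best allowed predecessor for the current label.
            score, prev = max(
                (best[prev], prev) for prev in LABELS if allowed(prev, cur)
            )
            new_best[cur] = score + tok[cur]
            pointers[cur] = prev
        best = new_best
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    path = [max(LABELS, key=lambda lab: best[lab])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy scores for a three-token sentence (made up for illustration).
example = [
    {"B": math.log(0.3), "I": math.log(0.1), "O": math.log(0.6)},
    {"B": math.log(0.2), "I": math.log(0.7), "O": math.log(0.1)},
    {"B": math.log(0.05), "I": math.log(0.05), "O": math.log(0.9)},
]
greedy = [max(tok, key=tok.get) for tok in example]  # per-token argmax
decoded = viterbi(example)  # constrained decoding
```

On the toy input, choosing the best label per token yields `O, I, O`, which contains the illogical transition `O` to `I`, while the constrained decoder returns `B, I, O`. A system could instead use learned transition scores; the hard constraint is used here only to keep the sketch short.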

Notes

  1. http://deeplearning.net/tutorial/rnnslu.html

Acknowledgment

The authors gratefully acknowledge the financial support provided by the National Natural Science Foundation of China under Grant Nos. 61173101 and 61672126. The Tesla K40 used for this research was donated by the NVIDIA Corporation.

Author information

Corresponding author

Correspondence to Lishuang Li.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Li, L., Jin, L., Jiang, Y., Huang, D. (2016). Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_15

  • DOI: https://doi.org/10.1007/978-3-319-47674-2_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47673-5

  • Online ISBN: 978-3-319-47674-2

  • eBook Packages: Computer Science (R0)
