Abstract
As a fundamental step in biomedical information extraction, biomedical named entity recognition remains challenging. In recent years, neural networks have been applied to entity recognition to avoid complex hand-designed features derived from various linguistic analyses. However, the performance of conventional neural network systems is limited by their difficulty in exploiting long-range dependencies in sentences. In this paper, we adopt a bidirectional recurrent neural network with LSTM units to identify biomedical entities, in which twin word embeddings and a sentence vector are added to enrich the input information. As a result, complex feature extraction can be skipped. In the testing phase, the Viterbi algorithm is also used to filter out illogical label sequences. Experimental results on the BioCreative II GM corpus show that our system achieves an F-score of 88.61%, which outperforms CRF models using complex hand-designed features and is 6.74% higher than RNNs.
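The Viterbi filtering step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the BIO label set, the transition rule (an "I" label may not follow "O" or start a sentence), and the per-token log-scores are all assumptions made for illustration, since the abstract does not specify the tagging scheme or how scores are produced.

```python
import math

# Assumed BIO tag set; the abstract does not state the paper's label scheme.
LABELS = ["O", "B", "I"]


def allowed(prev, curr):
    """Assumed constraint: an inside tag cannot follow an outside tag."""
    return not (curr == "I" and prev == "O")


def viterbi(scores):
    """Return the highest-scoring label sequence that obeys `allowed`.

    `scores` is a list with one dict per token, mapping each label to a
    log-score (hypothetical network outputs).
    """
    if not scores:
        return []
    # A sequence may not start with "I", so give that start a score of -inf.
    best = {l: (scores[0][l] if l != "I" else -math.inf) for l in LABELS}
    backpointers = []
    for token_scores in scores[1:]:
        new_best, pointers = {}, {}
        for curr in LABELS:
            # Best legal predecessor for the current label.
            prev_score, prev = max(
                (best[p], p) for p in LABELS if allowed(p, curr)
            )
            new_best[curr] = prev_score + token_scores[curr]
            pointers[curr] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label.
    path = [max(LABELS, key=lambda l: best[l])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With per-token scores whose greedy choice would be the illegal sequence O, I, O, the decoder instead returns the best sequence that satisfies the transition constraint.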
Acknowledgment
The authors gratefully acknowledge the financial support provided by the National Natural Science Foundation of China under Grant Nos. 61173101 and 61672126. The Tesla K40 used for this research was donated by the NVIDIA Corporation.
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Li, L., Jin, L., Jiang, Y., Huang, D. (2016). Recognizing Biomedical Named Entities Based on the Sentence Vector/Twin Word Embeddings Conditioned Bidirectional LSTM. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD CCL 2016 2016. Lecture Notes in Computer Science, vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_15
DOI: https://doi.org/10.1007/978-3-319-47674-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science, Computer Science (R0)