Abstract
Biomedical named entity recognition (Bio-NER) is one of the most fundamental tasks in biomedical information extraction, and its accuracy is crucial to downstream research. This paper presents a named entity recognition method based on the concept of three-way decisions. The method uses conditional random fields (CRFs), a discriminative approach, to construct its models. The models follow the three-way decision rule at every stage: when the available information is incomplete, a model does not make an arbitrary decision but defers until it obtains more information. The experimental results show that our method improves the performance of biomedical named entity recognition compared with other methods.
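The decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the thresholds `alpha` and `beta`, the function names, and the idea of representing each stage as a callable that returns a probability are all assumptions made for illustration. In the paper the probabilities would come from CRF models; here they are stand-in functions.

```python
from typing import Callable, Sequence, Tuple

ACCEPT, REJECT, DEFER = "accept", "reject", "defer"


def three_way_decide(p: float, alpha: float = 0.8, beta: float = 0.2) -> str:
    """Generic three-way rule: accept, reject, or defer.

    alpha and beta are hypothetical thresholds, not values from the paper.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p >= alpha:
        return ACCEPT
    if p <= beta:
        return REJECT
    return DEFER  # information is insufficient, so no decision is forced


def multistage_decide(
    token: str,
    stages: Sequence[Callable[[str], float]],
    alpha: float = 0.8,
    beta: float = 0.2,
) -> Tuple[str, int]:
    """Apply the rule stage by stage; later stages stand for 'more information'.

    Returns the decision and the index of the stage that produced it.
    If every stage defers, the result is (DEFER, index of the last stage).
    """
    if not stages:
        raise ValueError("at least one stage is required")
    decision, index = DEFER, 0
    for index, score in enumerate(stages):
        decision = three_way_decide(score(token), alpha, beta)
        if decision != DEFER:
            break
    return decision, index


# Stand-in scorers (assumptions): a real system would query trained models.
def coarse_stage(token: str) -> float:
    return 0.5  # uncertain with limited features


def richer_stage(token: str) -> float:
    return 0.9  # more confident once extra features are available


result = multistage_decide("IL-2", [coarse_stage, richer_stage])
```

In this sketch the first stage defers because its score falls between the two thresholds, and the second stage, standing in for a model with more information, settles the decision.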
Acknowledgments
This work is partially supported by the National Natural Science Foundation of China (Nos. 61273304 and 61573259) and the program of Further Accelerating the Development of Chinese Medicine Three Year Action of Shanghai (No. ZY3-CCCX-3-6002).
Copyright information
© 2016 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, H., Wei, Z., Sun, L., Zhang, Z. (2016). Biomedical Named Entity Recognition Based on Multistage Three-Way Decisions. In: Tan, T., Li, X., Chen, X., Zhou, J., Yang, J., Cheng, H. (eds) Pattern Recognition. CCPR 2016. Communications in Computer and Information Science, vol 663. Springer, Singapore. https://doi.org/10.1007/978-981-10-3005-5_42
DOI: https://doi.org/10.1007/978-981-10-3005-5_42
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-3004-8
Online ISBN: 978-981-10-3005-5
eBook Packages: Computer Science, Computer Science (R0)