Types of DOI errors of cited references in Web of Science with a cleaning method
Although bibliographic databases such as Web of Science (WoS) have greatly promoted the development of scientometrics and informetrics, they are not free of errors. The main purpose of this work is to determine which types of DOI errors exist in cited references, how often each type of error occurs, and whether these errors can be corrected automatically. After careful analysis, several classic DOI errors of cited references, such as prefix-, suffix- and other-type errors, are identified. A cleaning method based on regular expressions is then put forward. Experimental results on bibliographic data in the gene editing field from the WoS database indicate that our cleaning approach can largely improve the quality of the DOI names of cited references.
Keywords: DOI errors; Cleaning method; Web of Science; Cited references; Regular expression
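To make the idea of regular-expression-based cleaning concrete, the following is a minimal sketch in Python. It is not the rule set developed in this work: the three patterns and the function name clean_doi are hypothetical, chosen only to illustrate how debris before the "10." prefix and after the suffix might be stripped and how the result might then be checked against the general shape of a DOI name.

```python
import re

# Hypothetical rules for illustration only; not the paper's actual rule set.
# Leading debris sometimes found before the "10." directory indicator,
# e.g. a resolver URL or a "doi:" label (possibly repeated).
_PREFIX_JUNK = re.compile(
    r"^\s*(?:https?://(?:dx\.)?doi\.org/|doi\s*[:=]?\s*)+", re.IGNORECASE
)
# Trailing debris after the suffix (whitespace or stray punctuation).
_SUFFIX_JUNK = re.compile(r"[\s.,;:]+$")
# Loose shape of a DOI name: "10.", a registrant code, "/", a non-empty suffix.
_DOI_SHAPE = re.compile(r"^10\.\d{4,9}(?:\.\d+)*/\S+$")


def clean_doi(raw):
    """Return a cleaned DOI name, or None if it still looks malformed."""
    doi = _PREFIX_JUNK.sub("", raw)   # repair prefix-type debris
    doi = _SUFFIX_JUNK.sub("", doi)   # repair suffix-type debris
    return doi if _DOI_SHAPE.match(doi) else None


print(clean_doi("DOI 10.1002/asi.22898."))              # 10.1002/asi.22898
print(clean_doi("https://doi.org/10.1002/asi.22898"))   # 10.1002/asi.22898
print(clean_doi("not a doi"))                           # None
```

Strings that still fail the shape check are returned as None rather than guessed at, so that errors outside these two simple patterns can be set aside for separate handling.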
Our gratitude goes to the anonymous reviewers and the editor for their valuable comments.