Measuring patent similarity with SAO semantic analysis
Patents are not only an important aspect of intellectual property rights but also one of the few ways to protect technological inventions. In recent years, however, the number of patents has increased dramatically and, as a result, both patent applicants and patent examiners are finding it more difficult to conduct the due diligence step of the patent registration process. The lack of a quick and easy way to accurately measure patent similarity has therefore become a significant obstacle to protecting intellectual property. Currently, there are three main ways to measure patent similarity: IPC code analysis, citation analysis, and keyword analysis. None of these approaches fully reflects the semantics of a patent’s content. As an emerging methodology, subject–action–object (SAO) semantic analysis does reflect semantics, but most approaches treat each identified relationship as equally important, which does not necessarily provide an accurate measure of patent similarity. To address this limitation, this article introduces a new indicator called DWSAO, which reflects the weight of each SAO semantic structure. Further, we present a semantic analysis framework that incorporates the DWSAO index to find similar patents based on the weight of each SAO structure in the patent. A case study on the similarity of patents in the field of robotics was used to verify the reliability of the method. The results highlight the detailed meanings derived from the method, the accuracy of the outcomes, and the practical significance of using this approach.
Keywords: Patent similarity measurement; Text mining; Subject–action–object (SAO); Semantic analysis; Robot docking stations
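The idea of weighting SAO structures, rather than treating every extracted relationship as equally important, can be illustrated with a small sketch. This is not the DWSAO indicator itself, whose definition is not given in the abstract: the weights are supplied by hand, the component similarity is a simple token-overlap (Jaccard) stand-in for a real semantic measure, and the names `SAO`, `token_jaccard`, `sao_similarity`, and `weighted_patent_similarity` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SAO:
    """One subject-action-object structure extracted from a patent."""
    subject: str
    action: str
    obj: str


def token_jaccard(a: str, b: str) -> float:
    """Token-overlap similarity; a stand-in for a real semantic measure."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def sao_similarity(x: SAO, y: SAO) -> float:
    """Average the similarity of the three components."""
    return (
        token_jaccard(x.subject, y.subject)
        + token_jaccard(x.action, y.action)
        + token_jaccard(x.obj, y.obj)
    ) / 3


def weighted_patent_similarity(p, q) -> float:
    """Symmetric weighted best-match similarity.

    p and q are lists of (SAO, weight) pairs. The weights are assumed
    inputs here; in the article they would come from the DWSAO index.
    """
    def directed(src, dst):
        total = sum(w for _, w in src)
        if total == 0 or not dst:
            return 0.0
        # Each structure in src is matched to its most similar structure
        # in dst, and the matches are averaged using the src weights.
        return sum(
            w * max(sao_similarity(s, t) for t, _ in dst) for s, w in src
        ) / total

    return (directed(p, q) + directed(q, p)) / 2


docking = SAO("robot", "docks with", "station")
storage = SAO("battery", "stores", "energy")

patent_b = [(docking, 1.0)]
weighted = weighted_patent_similarity([(docking, 3.0), (storage, 1.0)], patent_b)
uniform = weighted_patent_similarity([(docking, 1.0), (storage, 1.0)], patent_b)
print(weighted, uniform)
```

In this toy example, giving the shared docking structure a higher weight raises the similarity score compared with uniform weights, which is the effect the weighting is meant to capture.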
This work is partly supported by the General Program of the National Natural Science Foundation of China (Grant Nos. 71774012, 71673024, 71373019) and the strategic research project of the Development Planning Bureau of the Chinese Academy of Sciences (Grant No. GHJ-ZLZX-2019-42). The findings and observations presented in this paper are those of the authors and do not necessarily reflect the views of the supporters or the sponsors. The authors would like to thank the anonymous reviewers for their constructive input into this paper.