Measuring patent similarity with SAO semantic analysis

Scientometrics

Abstract

Patents are an important form of intellectual property right and one of the few ways to protect technological inventions. In recent years, however, the number of patents has increased dramatically, and both patent applicants and patent examiners are finding it more difficult to conduct the due diligence step of the patent registration process. The lack of a quick and easy way to accurately measure patent similarity has therefore become a significant obstacle to protecting intellectual property. Currently, there are three main ways to measure patent similarity: IPC code analysis, citation analysis, and keyword analysis. None of these approaches fully reflects the semantics of a patent’s content. As an emerging methodology, subject–action–object (SAO) semantic analysis does reflect semantics, but most approaches treat each identified relationship as equally important, which does not necessarily provide an accurate measure of patent similarity. To address this limitation, this article introduces a new indicator called DWSAO that reflects the weight of each SAO semantic structure. Further, we present a semantic analysis framework that incorporates the DWSAO index to find similar patents based on the weight of each SAO structure in the patent. A case study on the similarity of patents in the field of robotics was used to verify the reliability of the method. The results highlight the detailed meanings derived from the method, the accuracy of the outcomes, and the practical significance of using this approach.
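
The idea of weighting SAO structures, rather than counting each one equally, can be illustrated with a minimal sketch. This is not the DWSAO formula, which the abstract does not define. It assumes each patent is already reduced to a mapping from (subject, action, object) triples to weights, and it compares patents with a weighted Jaccard overlap. The function name `weighted_sao_similarity`, the sample triples, and the weights are all hypothetical. Triples are matched by exact equality here, a simplification that ignores semantic similarity between different wordings.

```python
from typing import Dict, Tuple

# An SAO structure as a (subject, action, object) triple.
SAO = Tuple[str, str, str]


def weighted_sao_similarity(a: Dict[SAO, float], b: Dict[SAO, float]) -> float:
    """Weighted Jaccard overlap between two patents' SAO structures.

    Each dict maps an SAO triple to a non-negative weight. How the weights
    are derived is an assumption of this sketch, not the paper's DWSAO index.
    """
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    total = sum(max(a.get(k, 0.0), b.get(k, 0.0)) for k in keys)
    return shared / total if total else 0.0


# Hypothetical robotics patents with made-up weights.
patent_a = {
    ("robot", "clean", "floor"): 0.6,
    ("sensor", "detect", "obstacle"): 0.4,
}
patent_b = {
    ("robot", "clean", "floor"): 0.5,
    ("battery", "power", "motor"): 0.5,
}

similarity = weighted_sao_similarity(patent_a, patent_b)
```

With equal weights this reduces to the plain Jaccard index over the sets of triples, which corresponds to treating every SAO structure as equally important. Unequal weights let a shared high-weight structure count for more than a shared low-weight one.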

Acknowledgements

This work is partly supported by the General Program of the National Natural Science Foundation of China (Grant Nos. 71774012, 71673024, 71373019) and the strategic research project of the Development Planning Bureau of the Chinese Academy of Sciences (Grant No. GHJ-ZLZX-2019-42). The findings and observations presented in this paper are those of the authors and do not necessarily reflect the views of the supporters or the sponsors. The authors would like to thank the anonymous reviewers for their constructive input into this paper.

Author information

Corresponding author

Correspondence to Xuefeng Wang.

Appendix

No.  Patent number     No.  Patent number     No.  Patent number     No.  Patent number
1    FR3046259A1       38   WO2016078517A1    75   GB2513912A        112  KR1179592B1
2    US20170159199A1   39   WO2016080615A1    76   KR2014120437A     113  WO2012086950A2
3    US9672184B1       40   US20160133491A1   77   KR1437778B1       114  KR1151449B1
4    WO2017091066A1    41   KR2016050285A     78   FR3002804A1       115  KR1146907B1
5    US20170105592A1   42   CN105487507A      79   US20140222197A1   116  WO2012064009A1
6    CN206115269U      43   WO2016045593A1    80   GB2509989A        117  US20120060320A1
7    US20170102709A1   44   WO2016038919A1    81   GB2509990A        118  KR2012001510A
8    CN106551659A      45   US20160062362A1   82   GB2509991A        119  WO2011136974A2
9    US20170086325A1   46   US20160057925A1   83   GB2510062A        120  DE102011010205A1
10   US20170075962A1   47   CN105361817A      84   PL402468A1        121  DE102010013297A1
11   US20170072568A1   48   US20160039541A1   85   WO2014105225A1    122  US20110236026A1
12   US20170057760A1   49   KR2016008856A     86   KR1411200B1       123  US20110238214A1
13   US20170050311A1   50   US20160011592A1   87   US20140135991A1   124  US20110214030A1
14   CN106444736A      51   WO2016000622A1    88   US20140122958A1   125  SE201100582A1
15   US20170037648A1   52   US20150356885A1   89   US20140122654A1   126  KR2011056660A
16   US20170020064A1   53   KR1569281B1       90   WO2014058106A1    127  KR2011041721A
17   US9527217B1       54   US20150314437A1   91   US20140100693A1   128  US20110092847A1
18   US20160363933A1   55   US20150314453A1   92   KR2014036653A     129  US20100324736A1
19   WO2016196622A1    56   WO2015166339A1    93   GB2505550A        130  US20100324731A1
20   US20160349756A1   57   WO2015150529A1    94   CN103576678A      131  US20100292884A1
21   CN205734931U      58   SE201500381A1     95   KR2014002841A     132  US20100228421A1
22   WO2016179782A1    59   KR2015105089A     96   EP2662742A1       133  KR2010092807A
23   US20160327959A1   60   WO2015123732A1    97   FR2987689A1       134  KR2010087820A
24   US20160325854A1   61   US9114440B1       98   DE102013101700A1  135  US20100145236A1
25   KR2016129515A     62   US20150228419A1   99   WO2013100938A1    136  US20100106298A1
26   CN205612410U      63   DE102014201203A1  100  US8373391B1       137  KR2010013362A
27   KR1660703B1       64   US20150164599A1   101  US20130035793A1   138  KR2010012351A
28   WO2016148327A1    65   EP2879010A1       102  US20120323365A1   139  KR2010007776A
29   US20160268823A1   66   KR2015053450A     103  US20120303190A1   140  SE200802217A
30   US20160237587A1   67   WO2015067225A1    104  CN102789232A      141  US20090245930A1
31   US20160240405A1   68   EP2870852A1       105  US20120277908A1   142  US20090240370A1
32   US20160236344A1   69   WO2015052588A2    106  KR2012117421A     143  PT104217A
33   US9411337B1       70   CN104416568A      107  US20120265391A1   144  WO2009092166A1
34   JP2016134081A     71   US20150063959A1   108  KR2012113188A     145  KR2009061461A
35   KR2016067351A     72   US20140379129A1   109  CN102692922A      146  KR2009053263A
36   SE201451644A1     73   WO2014201578A2    110  CN102687620A      147  KR2009051319A
37   US20160143500A1   74   KR1467887B1       111  US20120229433A1   148  US20090125174A1

No.  Patent number     No.  Patent number     No.  Patent number     No.  Patent number
149  US20090117011A1   167  KR2007103248A     185  EP1518784A2       203  US6228168B1
150  US20090049640A1   168  JP2007272301A     186  US20050010330A1   204  JP2001033357A
151  US20080275590A1   169  US20070226949A1   187  US20040210346A1   205  US6178361B1
152  WO2008106088A2    170  KR2007095558A     188  US20040204804A1   206  CA2300686A1
153  EP1961358A2       171  KR2007094288A     189  EP1435555A2       207  WO2000033355A2
154  KR2008073626A     172  US20070205215A1   190  US20040055746A1   208  EP997176A2
155  KR2008073628A     173  WO2007089269A2    191  US20040048550A1   209  WO1999065803A1
156  KR2008050278A     174  EP1806086A2       192  US6606784B1       210  US5993132A
157  US20080071417A1   175  EP1806085A2       193  US20020187024A1   211  WO1999059400A1
158  KR814784B1        176  US20070142972A1   194  EP1264935A2       212  WO1999038237A1
159  US20080062558A1   177  KR702147B1        195  US6443543B1       213  WO1999017263A1
160  US20080056933A1   178  US20060277423A1   196  WO2002055271A1    214  DE19738163A1
161  US20080038152A1   179  US20060232236A1   197  US6402846B1       215  WO1998033103A1
162  WO2008001275A2    180  US20060090320A1   198  WO2002044703A2    216  RD374022A
163  KR782863B1        181  US20060013646A1   199  US20020051700A1   217  US5324948A
164  KR2007111628A     182  GB2415252A        200  DE10033680A1      218  CA2054150A1
165  KR2007105477A     183  US20050235076A1   201  WO2002005313A2    219  US4792995A
166  US20070245511A1   184  WO2005074362A2    202  US6325808B1       220  RD246001A

About this article

Cite this article

Wang, X., Ren, H., Chen, Y. et al. Measuring patent similarity with SAO semantic analysis. Scientometrics 121, 1–23 (2019). https://doi.org/10.1007/s11192-019-03191-z

  • DOI: https://doi.org/10.1007/s11192-019-03191-z
