Volume 101, Issue 1, pp 685–704

Identifying technological topics and institution-topic distribution probability for patent competitive intelligence analysis: a case study in LTE technology

  • Bo Wang
  • Shengbo Liu
  • Kun Ding
  • Zeyuan Liu
  • Jing Xu


This paper presents an extended latent Dirichlet allocation (LDA) model for patent competitive intelligence analysis. After part-of-speech tagging, technological words are extracted from patent titles and abstracts using defined noun phrase extraction rules, which allows patent analysis to be performed at the content level. The LDA model is then used to identify underlying topic structures based on the latent relationships among the extracted technological words. This helps to review research hot spots and directions in the subclasses of patented technology in a given field. To extend the traditional LDA model, an institution-topic probability level is added to the original model, so that the distribution probability of directly competing enterprises and their technological positions can be identified in each topic. A case study is then carried out on LTE, one of the core patented technologies in next-generation telecommunications. This empirical study reveals emerging hot spots in LTE technology and finds that the major companies in this field have focused on different technological areas and hold different competitive positions.
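As a rough illustration of what an institution-topic distribution looks like, the sketch below averages per-patent topic distributions over each assignee's patents. This is a post-hoc aggregation and not the extended model described above, which adds the institution level inside LDA itself. The function name `institution_topic_distribution`, the toy topic vectors, and the assignee labels "A" and "B" are all hypothetical; in practice the per-patent distributions would come from a fitted topic model.

```python
from collections import defaultdict


def institution_topic_distribution(doc_topics, doc_institutions):
    """Average per-document topic distributions over each institution's patents.

    Assumption: every patent has exactly one assignee and an already
    inferred topic distribution (a list of probabilities summing to 1).
    """
    sums = {}
    counts = defaultdict(int)
    for theta, inst in zip(doc_topics, doc_institutions):
        if inst not in sums:
            sums[inst] = [0.0] * len(theta)
        for k, p in enumerate(theta):
            sums[inst][k] += p
        counts[inst] += 1
    # Dividing by the patent count keeps each row a probability distribution.
    return {
        inst: [s / counts[inst] for s in total]
        for inst, total in sums.items()
    }


# Hypothetical toy data: four patents, three topics, two assignees.
doc_topics = [
    [0.7, 0.2, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.1, 0.8],
    [0.2, 0.2, 0.6],
]
doc_institutions = ["A", "A", "B", "B"]

dist = institution_topic_distribution(doc_topics, doc_institutions)
for inst, probs in sorted(dist.items()):
    dominant = max(range(len(probs)), key=lambda k: probs[k])
    print(inst, [round(p, 2) for p in probs], "dominant topic:", dominant)
```

Comparing the rows of `dist` across assignees gives a simple view of which topics each institution concentrates on, which is the kind of positioning the abstract refers to.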


Noun phrases extraction; Topic model (LDA); Institution-topic model; Content analysis; Long term evolution (LTE)



This research is supported by the National Natural Science Foundation of China (Grant Number 61272370) and the specialized research fund for doctoral tutor (20110041110034). We thank the following experts for helping us evaluate our experimental results: Bin Peng from Thomson Reuters, who worked as a patent examiner in the State Intellectual Property Office of P.R. China; Maoshu Ni, senior mobile system product manager at Huawei Technology Company; and Bo Wang, an expert in communication technology from Information and Communication Engineering, Dalian University of Technology.



Copyright information

© Akadémiai Kiadó, Budapest, Hungary 2014

Authors and Affiliations

  • Bo Wang (1, 2, 3)
  • Shengbo Liu (1, 2, 3)
  • Kun Ding (1, 2, 3)
  • Zeyuan Liu (1, 2, 3)
  • Jing Xu (4)

  1. WISELab, Dalian University of Technology, Dalian, China
  2. Joint-Institute for the Study of Knowledge Visualization and Science Discovery, Dalian University of Technology, Dalian, China
  3. Joint-Institute for the Study of Knowledge Visualization and Science Discovery, Drexel University, Philadelphia, USA
  4. Sichuan University, Chengdu, China
