Abstract
This paper discusses how to apply latent Dirichlet allocation (LDA), a topic model, in a trend analysis methodology that exploits patent information. First, text mining is used to convert unstructured patent documents into structured data. Next, term frequency-inverse document frequency (tf-idf) values are used for feature selection. After text preprocessing, the number of topics is chosen using the perplexity value. In this study, we used U.S. patent data on technologies that reduce greenhouse gases. We extracted the words of 50 relevant topics and showed that these topics are highly meaningful in explaining technology trends in each period.
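The two selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how tf-idf scores were aggregated or thresholded, so the mean-over-documents score, the `threshold` value, the base-2 logarithm, the simple tokenizer, and the three toy documents are all assumptions made for illustration. The `perplexity` helper only shows the quantity that would be compared across candidate topic counts (lower is better); fitting the LDA model itself is left out.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphabetic runs; a stand-in for fuller patent preprocessing.
    return re.findall(r"[a-z]+", text.lower())


def mean_tfidf(docs):
    """Return {term: tf-idf averaged over the documents that contain the term}."""
    counts = [Counter(tokenize(d)) for d in docs]
    n_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for c in counts for term in c)
    sums = Counter()
    for c in counts:
        total = sum(c.values())
        for term, n in c.items():
            tf = n / total                              # relative term frequency
            idf = math.log2(n_docs / doc_freq[term])    # 0 when the term is in every document
            sums[term] += tf * idf
    return {term: sums[term] / doc_freq[term] for term in sums}


def select_terms(docs, threshold):
    """Keep terms whose mean tf-idf reaches the (assumed) threshold."""
    scores = mean_tfidf(docs)
    return sorted(t for t, s in scores.items() if s >= threshold)


def perplexity(token_log_probs):
    """exp of the negative mean log-probability per held-out token."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical toy documents, not taken from the paper's patent data.
docs = [
    "method for carbon dioxide capture from flue gas",
    "method for methane capture from landfill gas",
    "method for carbon dioxide storage in rock",
]

if __name__ == "__main__":
    print(select_terms(docs, threshold=0.05))
```

In this sketch, words that occur in every document, such as "method", receive an idf of zero and are dropped, which is the effect tf-idf feature selection is meant to have before topic modeling. To choose the number of topics, one would fit a model for each candidate count, score held-out tokens, and compare the resulting perplexity values.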
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Kim, G., Park, S., Jang, D. (2014). Technology Analysis from Patent Data Using Latent Dirichlet Allocation. In: Lee, K., Park, SJ., Lee, JH. (eds) Soft Computing in Big Data Processing. Advances in Intelligent Systems and Computing, vol 271. Springer, Cham. https://doi.org/10.1007/978-3-319-05527-5_8
DOI: https://doi.org/10.1007/978-3-319-05527-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05526-8
Online ISBN: 978-3-319-05527-5
eBook Packages: Engineering, Engineering (R0)