Improvement Method for Topic-Based Path Model by Using Word2vec

  • Ryosuke Saga
  • Shoji Nohara
Conference paper


Identifying the factors that drive purchases is important for product developers in the marketplace. Text data, such as comments from consumers, are a valid source for factor analysis. However, previous research shows that generating a stable factor-analysis model from text data is difficult. We assume that the analysis proceeds more smoothly when the target text data are handled well. This study therefore proposes pre-processing the text data with word2vec before factor analysis. Word2vec represents the words in a text as vectors. The proposed process is effective because the variables in the analysis model are expressed as word frequencies. Experimental results also show that the proposed method helps in generating an analytical model.
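The abstract does not spell out the pre-processing step, so the following is only a minimal sketch of one plausible reading: words whose vectors are close are merged into a single representative before word frequencies are counted, so that near-synonyms feed the same variable. The toy vectors, the `threshold` value, and the helper names `build_canonical_map` and `merged_frequencies` are all assumptions made for illustration and are not taken from the paper; a real pipeline would use vectors learned by word2vec.

```python
import math
from collections import Counter

# Hypothetical hand-made vectors standing in for learned word2vec embeddings.
TOY_VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "inexpensive": [0.85, 0.15, 0.05],
    "fun": [0.1, 0.9, 0.1],
    "enjoyable": [0.15, 0.85, 0.1],
    "battery": [0.0, 0.1, 0.95],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_canonical_map(vectors, threshold=0.95):
    """Map each word to the first earlier word whose vector is similar enough.

    The threshold is an assumed value chosen for this toy data.
    """
    canonical = {}
    representatives = []
    for word, vec in vectors.items():
        for rep in representatives:
            if cosine(vec, vectors[rep]) >= threshold:
                canonical[word] = rep
                break
        else:
            # No close representative found: the word represents itself.
            representatives.append(word)
            canonical[word] = word
    return canonical


def merged_frequencies(tokens, canonical):
    """Count tokens after replacing each one with its representative word."""
    return Counter(canonical.get(token, token) for token in tokens)


tokens = ["cheap", "fun", "inexpensive", "enjoyable", "fun", "battery"]
canonical = build_canonical_map(TOY_VECTORS)
freq = merged_frequencies(tokens, canonical)
print(dict(freq))
```

With these toy vectors, "inexpensive" is counted under "cheap" and "enjoyable" under "fun", so the frequency variables are less fragmented than raw word counts would be.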


Causal analysis, Data mining, Text mining, Topic model, Structural equation modeling, Word2vec



This work was supported by KAKENHI 25240049.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  1. Graduate School of Humanities and Sustainable System Sciences, Osaka Prefecture University, Sakai, Japan
  2. Graduate School of Engineering, Osaka Prefecture University, Sakai, Japan
