Computational Semantics for Asset Correlations

  • Frank Xing
  • Erik Cambria
  • Roy Welsch
Part of the Socio-Affective Computing book series (SAC, volume 9)


This chapter explores the possibility of leveraging semantic knowledge for robust estimation of correlations among financial assets. We introduce a graphical model for high-dimensional stochastic dependence, termed a “vine” structure, which is derived from copula theory. To model the prior semantic knowledge, we use a neural network-based language model to generate distributed semantic representations of financial documents. These representations are used to compute similarities between the assets to which the documents refer. The constructed dependence structure is tested on real-world data. Results suggest that our method, based on semantic vine construction, is superior to the state-of-the-art covariance matrix estimation method, which is based on an arbitrary vine that at least guarantees robustness of the estimated covariance matrix. The effectiveness of using semantic vines for robust correlation estimation in Markowitz’s asset allocation model on a large set of assets (up to 50 stocks) is also shown and discussed.
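
As an illustration of the similarity step described above, the following minimal sketch computes pairwise similarities between assets from their document vectors. It rests on assumptions not stated in the abstract: cosine similarity is used as the measure, each asset is represented by a single vector, and the tickers and three-dimensional vectors are hypothetical toy values standing in for embeddings produced by a trained language model.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Return a nested dict sim[a][b] for every pair of assets."""
    tickers = list(vectors)
    return {
        a: {b: cosine_similarity(vectors[a], vectors[b]) for b in tickers}
        for a in tickers
    }


# Hypothetical toy embeddings; real ones would come from a trained
# document-embedding model and have far more dimensions.
doc_vectors = {
    "AAA": [1.0, 0.0, 1.0],
    "BBB": [1.0, 0.0, 0.9],
    "CCC": [0.0, 1.0, 0.0],
}

sim = similarity_matrix(doc_vectors)
for a in doc_vectors:
    print(a, [round(sim[a][b], 3) for b in doc_vectors])
```

In this toy setting, assets whose documents point in nearly the same direction receive a similarity close to 1, while unrelated assets receive a similarity close to 0. How such similarities are then turned into a vine structure is not specified in the abstract and is not sketched here.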


Asset allocation, Dependence modeling, Robust estimation, Doc2vec, Semantic vine, Correlation matrix, Machine learning



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Frank Xing (1)
  • Erik Cambria (1)
  • Roy Welsch (2)
  1. School of Computer Science and Engineering, Nanyang Technological University, Singapore, Singapore
  2. Sloan School of Management, Massachusetts Institute of Technology, Cambridge, USA
