Electronic Markets, Volume 27, Issue 3, pp 283–296

Market sentiment dispersion and its effects on stock return and volatility

  • Eric W. K. See-To
  • Yang Yang
Research Paper


Behavioral economics has shown that investor sentiment can profoundly affect individual behavior and decision-making. The question is no longer whether investor sentiment affects stock market valuation, but how to measure investor sentiment directly and quantify its effects. Before the era of big data, researchers used proxies to measure investor sentiment indirectly, and the measurement proved elusive because of insufficient data points. In addition, most extant sentiment analysis studies focus on institutional investors rather than individual investors, even though individual investors in the United States have been holding around 50% of the stock market in direct stock investments. To overcome the difficulties in measuring sentiment and to recognize the importance of individual investors, we examine the role of individual sentiment dispersion in the stock market. In particular, we investigate whether sentiment dispersion contains information about future stock returns and realized volatility. Leveraging the development of big data and recent advances in data and text mining techniques, we captured 1,170,414 data points from Twitter, used a text mining method to extract sentiment, and applied both linear regression and Support Vector Regression. We found that individual sentiment dispersion contains information about realized stock volatility and can be used to increase prediction accuracy. We expect our results to contribute to extant theories of financial behavior in electronic markets by directly measuring individual sentiment dispersion, by offering a new perspective for assessing the impact of investor opinion on the stock market, and by recommending a supplementary investing approach that uses user-generated content.
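The abstract does not define its measures, so the following is only an illustrative sketch of the two quantities it names and of the linear-regression step. It assumes that sentiment dispersion is the standard deviation of per-message sentiment scores within a day, and that realized volatility is the square root of the sum of squared intraday log returns. The function names and the toy inputs are hypothetical, and the Support Vector Regression step is omitted.

```python
import math
from statistics import pstdev


def sentiment_dispersion(scores):
    """Assumed definition: population standard deviation of the
    per-message sentiment scores collected for one trading day."""
    if len(scores) < 2:
        return 0.0  # a single message carries no disagreement
    return pstdev(scores)


def realized_volatility(prices):
    """Square root of the sum of squared log returns computed from
    consecutive intraday prices (a common textbook definition)."""
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return math.sqrt(sum(r * r for r in log_returns))


def ols_fit(x, y):
    """Intercept and slope of the least-squares line y = a + b * x,
    e.g. x = daily dispersion, y = next-day realized volatility."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope
```

In this sketch, a day on which messages split evenly between strongly positive and strongly negative scores has high dispersion even though its average sentiment is neutral, which is the distinction between sentiment level and sentiment dispersion that the abstract relies on.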


Keywords

Investor sentiment · Text mining · Return and volatility predictability

JEL Classification

C55 C53 C52 



Copyright information

© Institute of Applied Informatics at University of Leipzig 2017

Authors and Affiliations

  1. The Hong Kong Polytechnic University, Hong Kong
