Computational Economics, Volume 50, Issue 4, pp 669–685

Can Sentiment Analysis and Options Volume Anticipate Future Returns?



This paper evaluates whether sentiment extracted from social media and options volume anticipates future asset returns. The research used both text-based data and a call-put ratio derived from market data, collected between July 2009 and September 2012. It shows that: (1) features derived from market data and a call-put ratio can improve model performance, (2) sentiment derived from StockTwits, a social media platform for the financial community, further enhances model performance, (3) aggregating all features together also improves performance, and (4) sentiment from social media and market data can be used as risk factors in an asset pricing framework.
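
The abstract does not spell out how the call-put ratio is constructed or how features are aligned with returns. The following is only a minimal sketch, assuming the ratio is daily call volume divided by daily put volume and that the prediction target is the sign of the next day's return. The function names, the handling of zero put volume, and the sample numbers are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def call_put_ratio(call_volume: float, put_volume: float) -> Optional[float]:
    """Call volume divided by put volume; None when put volume is zero.

    Assumption: the paper's exact ratio definition is not given in the abstract.
    """
    if put_volume == 0:
        return None
    return call_volume / put_volume


def lagged_pairs(features: List[Optional[float]],
                 returns: List[float]) -> List[Tuple[float, int]]:
    """Pair the feature on day t with the direction of the return on day t+1.

    Direction is 1 for a positive next-day return and 0 otherwise.
    Days with a missing feature are skipped.
    """
    pairs = []
    for t in range(len(features) - 1):
        if features[t] is None:
            continue
        pairs.append((features[t], 1 if returns[t + 1] > 0 else 0))
    return pairs


# Hypothetical daily data, invented purely for illustration.
calls = [1200.0, 900.0, 1500.0, 800.0]
puts = [600.0, 900.0, 0.0, 1600.0]
rets = [0.010, -0.004, 0.007, -0.012]

ratios = [call_put_ratio(c, p) for c, p in zip(calls, puts)]
dataset = lagged_pairs(ratios, rets)
```

Pairing the day-t feature with the day-(t+1) outcome keeps the feature strictly earlier than the return it is meant to anticipate. A sentiment score could be lagged the same way before any classifier is fitted.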


Keywords: Social media, Investor sentiment, Behavioral finance, Machine learning



The authors would like to thank StockTwits for providing the messages. The authors also thank Shu-Heng Chen, Blake LeBaron, Jon Kaufman, David Starer, Hamed Ghoddusi, Khaldoun Khashanah, and three anonymous referees for suggestions and informal discussions about this research. The opinions presented are the exclusive responsibility of the authors.



Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. Stevens Institute of Technology, Hoboken, USA
