Can Sentiment Analysis and Options Volume Anticipate Future Returns?
This paper evaluates whether sentiment extracted from social media, together with options volume, anticipates future asset returns. The study uses both text-based data and a call-put ratio derived from market data, collected between July 2009 and September 2012. It shows that: (1) features derived from market data and a call-put ratio can improve model performance, (2) sentiment derived from StockTwits, a social media platform for the financial community, further enhances model performance, (3) aggregating all features also improves performance, and (4) sentiment from social media and market data can be used as risk factors in an asset pricing framework.
Keywords: Social media, Investor sentiment, Behavioral finance, Machine learning
The authors would like to thank StockTwits for providing the messages. The authors also thank Shu-Heng Chen, Blake LeBaron, Jon Kaufman, David Starer, Hamed Ghoddusi, Khaldoun Khashanah, and three anonymous referees for suggestions and informal discussions about this research. The opinions presented are the exclusive responsibility of the authors.