Towards Stock Market Data Mining Using Enriched Random Forests from Textual Resources and Technical Indicators

Maragoudakis, Manolis; Serpanos, Dimitrios

doi:10.1007/978-3-642-16239-8_37

Manolis Maragoudakis and Dimitrios Serpanos

Conference paper

Part of the book series: IFIP Advances in Information and Communication Technology (IFIPAICT, volume 339)

Abstract

This paper presents a specialized Random Forest data mining technique designed to alleviate the significant issue of high dimensionality in volatile and complex domains such as stock market prediction. Since it is widely accepted that media affect the behavior of investors, information from both technical analysis and textual data from various online financial news resources is considered. Several experiments are carried out to evaluate different aspects of the problem, returning satisfactory results. The results show that trading strategies guided by the proposed data mining approach generate higher profits than the buy-and-hold strategy, as well as those guided by the level-estimation-based forecasts of standard linear regression models and by other machine learning classifiers such as Support Vector Machines, ordinary Random Forests and Neural Networks.
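
To make the profit comparison in the abstract concrete, the following is a minimal sketch of how a signal-guided strategy might be compared against buy-and-hold. It is not the authors' evaluation procedure: the function names, the long-or-cash trading rule, the absence of transaction costs, and the toy prices and signals are all assumptions chosen for illustration.

```python
from typing import Sequence


def buy_and_hold_return(prices: Sequence[float]) -> float:
    """Return from buying at the first price and selling at the last."""
    return prices[-1] / prices[0] - 1.0


def signal_strategy_return(prices: Sequence[float], signals: Sequence[int]) -> float:
    """Compound return of holding the asset only over flagged intervals.

    signals[t] == 1: hold the asset from day t to day t + 1.
    signals[t] == 0: stay in cash (zero return) over that interval.
    Transaction costs are ignored (an assumption of this sketch).
    """
    if len(signals) != len(prices) - 1:
        raise ValueError("need exactly one signal per price interval")
    wealth = 1.0
    for t, signal in enumerate(signals):
        if signal == 1:
            wealth *= prices[t + 1] / prices[t]
    return wealth - 1.0


# Hypothetical prices and classifier outputs; NOT data from the paper.
prices = [100.0, 102.0, 99.0, 101.0, 104.0]
signals = [1, 0, 1, 1]  # e.g. "up" predictions from some classifier

print(f"buy-and-hold:  {buy_and_hold_return(prices):.4f}")
print(f"signal-guided: {signal_strategy_return(prices, signals):.4f}")
```

In this toy setup the signals could come from any of the classifiers the abstract names. A realistic comparison would also need out-of-sample predictions and trading costs.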

Author information

Authors and Affiliations

Department of Information and Communication Systems Engineering, University of Aegean, Samos, 82000, Greece
Manolis Maragoudakis
Science Park building Platani, I.S.I. - Industrial Systems Institute Patras, PATRAS, Greece, 26504
Manolis Maragoudakis & Dimitrios Serpanos
Department of Electrical and Computer Engineering, University of Patras, Rion, 26500, Greece
Dimitrios Serpanos

Editor information

Editors and Affiliations

Computer Science and Engineering Department, Frederick University, 1013, Nicosia, Cyprus
Harris Papadopoulos
Department of Electrical Engineering and Information Technology, Cyprus University of Technology, 3603, Limassol, Cyprus
Andreas S. Andreou
School of Computing, University of Portsmouth, PO1 2UP, Portsmouth, UK
Max Bramer

Cite this paper

Maragoudakis, M., Serpanos, D. (2010). Towards Stock Market Data Mining Using Enriched Random Forests from Textual Resources and Technical Indicators. In: Papadopoulos, H., Andreou, A.S., Bramer, M. (eds) Artificial Intelligence Applications and Innovations. AIAI 2010. IFIP Advances in Information and Communication Technology, vol 339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16239-8_37

DOI: https://doi.org/10.1007/978-3-642-16239-8_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16238-1
Online ISBN: 978-3-642-16239-8
eBook Packages: Computer Science, Computer Science (R0)