Abstract
Including domain knowledge in the process of knowledge discovery in databases is complex, but it is an important part of successful knowledge discovery solutions. In real-life data mining development, domain knowledge is mostly applied in unstructured form, in the data preparation phase and in the final interpretation/evaluation phase. This paper presents an experiment in integrating domain knowledge directly into the algorithm that searches the data for interesting patterns. In the context of stock market prediction, a recent rule induction algorithm, PA3, was adapted to include domain theories directly in its internal rule development. Tests on several Portuguese stocks show a significant increase in prediction performance over the same process using the standard version of PA3. We believe that a similar methodology can be applied to other symbolic induction algorithms and to other domains to improve the efficiency of prediction (or classification) in knowledge-intensive data mining tasks.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Almeida, P. (2001). Direct Domain Knowledge Inclusion in the PA3 Rule Induction Algorithm. In: Cheung, D., Williams, G.J., Li, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2001. Lecture Notes in Computer Science, vol 2035. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45357-1_45
DOI: https://doi.org/10.1007/3-540-45357-1_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41910-5
Online ISBN: 978-3-540-45357-4
eBook Packages: Springer Book Archive