Direct Domain Knowledge Inclusion in the PA3 Rule Induction Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2035)

Abstract

Including domain knowledge in the process of knowledge discovery in databases is a complex but very important part of successful knowledge discovery solutions. In real-life data mining development, domain knowledge tends to enter in unstructured form, mainly in the data preparation phase and in the final interpretation/evaluation phase. This paper presents an experiment in integrating domain knowledge directly into the algorithm that searches for interesting patterns in the data. In the context of stock market prediction work, a recent rule induction algorithm, PA3, was adapted to include domain theories directly in its internal rule development. Tests performed on several Portuguese stocks show a significant increase in prediction performance over the same process using the standard version of PA3. We believe that a similar methodology can be applied to other symbolic induction algorithms and in other working domains to improve the efficiency of prediction (or classification) in knowledge-intensive data mining tasks.

Copyright information

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

de Almeida, P. (2001). Direct Domain Knowledge Inclusion in the PA3 Rule Induction Algorithm. In: Cheung, D., Williams, G.J., Li, Q. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2001. Lecture Notes in Computer Science, vol 2035. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45357-1_45

  • DOI: https://doi.org/10.1007/3-540-45357-1_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41910-5

  • Online ISBN: 978-3-540-45357-4

  • eBook Packages: Springer Book Archive
