Using Feature Selection with Bagging and Rule Extraction in Drug Discovery

  • Ulf Johansson
  • Cecilia Sönströd
  • Ulf Norinder
  • Henrik Boström
  • Tuve Löfström
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 4)


This paper investigates different ways of combining feature selection with bagging and rule extraction in predictive modeling. Experiments on a large number of data sets from the medicinal chemistry domain, using standard algorithms implemented in the Weka data mining workbench, show that feature selection can lead to significantly improved predictive performance. When combining feature selection with bagging, applying feature selection to each bootstrap sample yields the best results. When using decision trees for rule extraction, the effect of feature selection can actually be detrimental, unless the transductive approach oracle coaching is also used. Employing oracle coaching, however, leads to significantly improved performance, and the best results are obtained when performing feature selection before training the opaque model. The overall conclusion is that exactly how feature selection is used in conjunction with other techniques can make a substantial difference for predictive performance.
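The best-performing bagging variant above, applying feature selection separately to each bootstrap sample rather than once to the full training set, can be sketched as follows. This is a minimal illustration in Python with scikit-learn, not the paper's Weka setup; the data set, the choice of univariate selection, and all parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a QSAR-style data set: many descriptors,
# few of them informative. Purely illustrative.
X, y = make_classification(n_samples=300, n_features=40,
                           n_informative=5, random_state=0)

rng = np.random.RandomState(0)
n_estimators, k = 25, 10
members = []  # (selector, tree) pairs, one per ensemble member

for _ in range(n_estimators):
    idx = rng.randint(0, len(X), len(X))   # draw a bootstrap sample
    # Feature selection is fitted on THIS bootstrap only, so each
    # member may end up with a different feature subset.
    sel = SelectKBest(f_classif, k=k).fit(X[idx], y[idx])
    tree = DecisionTreeClassifier(random_state=0).fit(
        sel.transform(X[idx]), y[idx])
    members.append((sel, tree))

def predict(X_new):
    # Majority vote; each member applies its own feature subset first.
    votes = np.stack([t.predict(s.transform(X_new)) for s, t in members])
    return (votes.mean(axis=0) >= 0.5).astype(int)

acc = (predict(X) == y).mean()  # resubstitution accuracy, for illustration
```

The alternative the paper compares against would fit `SelectKBest` once on the full training set and reuse that single subset for every member; the per-bootstrap variant adds diversity because members can disagree on which features to keep.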


Feature Selection · Bagging · Rule Extraction





Copyright information

© Springer Berlin Heidelberg 2010

Authors and Affiliations

  • Ulf Johansson (1)
  • Cecilia Sönströd (1)
  • Ulf Norinder (2)
  • Henrik Boström (3)
  • Tuve Löfström (1)

  1. CSL@BS Research Group, School of Business and Informatics, University of Borås, Sweden
  2. AstraZeneca R&D, Södertälje, Sweden
  3. Department of Computer and Systems Sciences, Stockholm University, Sweden
