Combining Decision Trees and Neural Networks for Drug Discovery

  • William B. Langdon
  • S. J. Barrett
  • B. F. Buxton
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2278)


Genetic programming (GP) offers a generic method of automatically fusing together classifiers using their receiver operating characteristics (ROC) to yield superior ensembles. We combine decision trees (C4.5) and artificial neural networks (ANN) on a difficult pharmaceutical data mining (KDD) drug discovery application: predicting inhibition of a P450 enzyme. Training data came from high throughput screening (HTS) runs. The evolved model may be used to predict the behaviour of virtual (i.e. yet to be manufactured) chemicals. Measures to reduce overfitting are also described.
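To make the idea of fusing classifiers through their ROC curves concrete, the following is a minimal sketch of the ROC convex hull of two classifiers. It is not the GP method of the paper, only a simple baseline showing why a combination can offer operating points that neither classifier offers alone. The operating points for the "tree" and the "network" are made-up illustrative values, and the function names `roc_convex_hull` and `area_under` are hypothetical.

```python
def roc_convex_hull(points):
    """Upper convex hull of (false positive rate, true positive rate) points.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the line
        # joining the point before it to the new point p.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (p[1] - y1) - (y2 - y1) * (p[0] - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull


def area_under(points):
    """Trapezoidal area under a piecewise-linear ROC curve."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Illustrative (assumed) operating points, not data from the paper.
tree_points = [(0.1, 0.6), (0.4, 0.8)]       # e.g. a decision tree
network_points = [(0.2, 0.75), (0.5, 0.85)]  # e.g. a neural network

tree_hull = roc_convex_hull(tree_points)
network_hull = roc_convex_hull(network_points)
fused_hull = roc_convex_hull(tree_points + network_points)

print("tree area:   ", area_under(tree_hull))
print("network area:", area_under(network_hull))
print("fused area:  ", area_under(fused_hull))
```

Points on the line segments of the fused hull can be realised by choosing at random between the two classifiers at its end points. The area under the fused hull is never smaller than the area under either individual hull, so this hull is a natural baseline for a fused classifier to improve on.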


Receiver Operating Characteristic, Decision Tree, Convex Hull, Genetic Programming
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • William B. Langdon (1)
  • S. J. Barrett (2)
  • B. F. Buxton (1)
  1. Computer Science, University College, London, UK
  2. GlaxoSmithKline Research and Development, Essex, UK
