Classifying the World Anti-Doping Agency’s 2005 Prohibited List Using the Chemistry Development Kit Fingerprint

  • Conference paper
Computational Life Sciences II (CompLife 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4216)

Abstract

We used the freely available Chemistry Development Kit (CDK) fingerprint to classify 5235 representative molecules taken from ten banned classes in the World Anti-Doping Agency’s (WADA) 2005 prohibited list, including molecules taken from the corresponding activity classes in the MDL Drug Data Report (MDDR). We used both Random Forest and k-Nearest Neighbours (kNN) algorithms to generate classifiers. The kNN classifiers with k = 1 gave a very slightly better Matthews Correlation Coefficient than the Random Forest classifiers; the latter, however, predicted fewer false positives. The performance of kNN classifiers tended to decline with increasing k. The performance of the CDK fingerprint is essentially equivalent to that of Unity 2D. Our results suggest that it will be possible to use freely available chemoinformatics tools to aid the fight against drugs in sport, while minimising the risk of wrongfully penalising innocent athletes.
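As a rough illustration of the two ideas the abstract relies on, the following Python sketch shows a kNN vote over binary fingerprints and the Matthews Correlation Coefficient computed from a confusion matrix. It is not the authors' implementation: the use of Tanimoto similarity to rank neighbours, the representation of a fingerprint as a set of on-bit indices, the function names and the toy data are all assumptions made for this example.

```python
from math import sqrt


def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices.

    Assumption: the abstract does not state which similarity measure was used.
    """
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def knn_predict(query: set, training: list, k: int = 1) -> bool:
    """Predict membership of a class by strict majority vote of the k nearest neighbours.

    `training` is a list of (fingerprint, label) pairs with boolean labels.
    """
    ranked = sorted(training, key=lambda item: tanimoto(query, item[0]), reverse=True)
    votes = [label for _, label in ranked[:k]]
    return sum(votes) * 2 > len(votes)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)),
    returned as 0.0 when the denominator is zero.
    """
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: True marks a member of one prohibited class.
toy_training = [
    ({1, 2, 3, 4}, True),
    ({1, 2, 3, 9}, True),
    ({7, 8, 9}, False),
    ({6, 7, 8}, False),
]

if __name__ == "__main__":
    print(knn_predict({1, 2, 4}, toy_training, k=1))
    print(mcc(tp=90, tn=880, fp=10, fn=20))
```

MCC ranges from -1 to 1 and uses all four confusion-matrix counts, so it stays informative when the prohibited class is much smaller than the rest of the data set. The false-positive count `fp` is the quantity tied to the abstract's concern about wrongly penalising innocent athletes.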

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Cannon, E.O., Mitchell, J.B.O. (2006). Classifying the World Anti-Doping Agency’s 2005 Prohibited List Using the Chemistry Development Kit Fingerprint. In: Berthold, M.R., Glen, R.C., Fischer, I. (eds) Computational Life Sciences II. CompLife 2006. Lecture Notes in Computer Science, vol 4216. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11875741_17

  • DOI: https://doi.org/10.1007/11875741_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45767-1

  • Online ISBN: 978-3-540-45768-8

  • eBook Packages: Computer Science, Computer Science (R0)
