pKa measurements for the SAMPL6 prediction challenge for a set of kinase inhibitor-like fragments

Journal of Computer-Aided Molecular Design

Abstract

Determining the net charge and protonation states populated by a small molecule in an environment of interest, or the cost of altering those protonation states upon transfer to another environment, is a prerequisite for predicting its physicochemical and pharmaceutical properties. The environment of interest can be aqueous, an organic solvent, a protein binding site, or a lipid bilayer. Predicting the protonation state of a small molecule is essential to predicting its interactions with biological macromolecules using computational models. Incorrectly modeling the dominant protonation state, shifts in dominant protonation state, or the population of significant mixtures of protonation states can lead to large modeling errors that degrade the accuracy of physical modeling. Low accuracy hinders the use of physical modeling approaches for molecular design. For small molecules, the acid dissociation constant (pKa) is the primary quantity needed to determine the ionic states populated by a molecule in an aqueous solution at a given pH. As part of the SAMPL6 community challenge, we organized a blind pKa prediction component to assess the accuracy with which contemporary pKa prediction methods can predict this quantity, with the ultimate aim of assessing the expected impact on modeling errors this would induce. While a multitude of approaches for predicting pKa values currently exist, predicting the pKas of drug-like molecules can be difficult due to challenging properties such as multiple titratable sites, heterocycles, and tautomerization. For this challenge, we focused on a set of 24 small molecules selected to resemble selective kinase inhibitors, an important class of therapeutics replete with titratable moieties.
Using a Sirius T3 instrument, which performs automated acid–base titrations, we made UV absorbance-based pKa measurements to construct a high-quality experimental reference dataset of macroscopic pKas, which was used in the SAMPL6 pKa challenge to evaluate computational pKa prediction methodologies. For several compounds in which the microscopic protonation states associated with macroscopic pKas were ambiguous, we performed follow-up NMR experiments to disambiguate the microstates involved in the transition. This dataset provides a useful standard benchmark for the evaluation of pKa prediction methodologies on kinase inhibitor-like compounds.
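
The link between macroscopic pKas and the ionic states populated at a given pH can be illustrated with a short calculation. The following is a minimal Python sketch and is not part of the original study: it assumes ideal dilute-solution behavior (activities equal to concentrations), and the function names and the example pKa values (4.0 and 9.0) are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def macrostate_populations(pkas: Sequence[float], ph: float) -> List[float]:
    """Fractional populations of macroscopic protonation states at a given pH.

    pkas: macroscopic pKa values of one molecule (any order).
    Returns a list of length len(pkas) + 1; index k is the state that has
    lost k protons relative to the fully protonated form.
    """
    ordered = sorted(pkas)
    # Relative weight of state k (in log10 units): sum over i <= k of (pH - pKa_i).
    log_weights = [0.0]
    for pka in ordered:
        log_weights.append(log_weights[-1] + (ph - pka))
    # Shift by the maximum before exponentiating to avoid overflow.
    shift = max(log_weights)
    weights = [10.0 ** (lw - shift) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]


def mean_charge(pkas: Sequence[float], ph: float,
                charge_fully_protonated: int) -> float:
    """Population-weighted net charge; each lost proton lowers the charge by 1."""
    pops = macrostate_populations(pkas, ph)
    return sum(p * (charge_fully_protonated - k) for k, p in enumerate(pops))


if __name__ == "__main__":
    example_pkas = [4.0, 9.0]  # hypothetical values, not measured data
    for ph in (2.0, 4.0, 7.4, 9.0, 11.0):
        pops = macrostate_populations(example_pkas, ph)
        charge = mean_charge(example_pkas, ph, charge_fully_protonated=1)
        formatted = ", ".join(f"{p:.3f}" for p in pops)
        print(f"pH {ph:4.1f}: populations [{formatted}], mean charge {charge:+.3f}")
```

At a pH equal to a well-separated macroscopic pKa, the two states linked by that transition are equally populated, which is why an error of one pKa unit near the pH of interest can change the predicted dominant charge state.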

Fig. 1

Here, the illustrative diagram style for microstates was adopted from [24], and NMR-determined microscopic pKas for cetirizine were taken from [25]

Fig. 2

Subfigure of cysteine microscopic pKas was reproduced based on [19]

Abbreviations

SAMPL: Statistical Assessment of the Modeling of Proteins and Ligands
pKa: \(-\log_{10}\) of the acid dissociation equilibrium constant
psKa: \(-\log_{10}\) of the apparent acid dissociation equilibrium constant in the presence of cosolvent
DMSO: Dimethyl sulfoxide
ISA: Ionic-strength adjusted
SEM: Standard error of the mean
TFA: Target factor analysis
LC–MS: Liquid chromatography–mass spectrometry
NMR: Nuclear magnetic resonance spectroscopy
HMBC: Heteronuclear multiple-bond correlation
TFA-d: Deutero-trifluoroacetic acid

References

  1. Mobley DL, Chodera JD, Isaacs L, Gibb BC (2016) Advancing predictive modeling through focused development of model systems to drive new modeling innovations. UC Irvine: Department of Pharmaceutical Sciences, UCI. https://escholarship.org/uc/item/7cf8c6cr. Accessed 16 May 2018

  2. Drug Design Data Resource, SAMPL. https://drugdesigndata.org/about/sampl. Accessed 16 May 2018

  3. Nicholls A, Mobley DL, Guthrie JP, Chodera JD, Bayly CI, Cooper MD, Pande VS (2008) Predicting small-molecule solvation free energies: an informal blind test for computational chemistry. J Med Chem 51(4):769–779. https://doi.org/10.1021/jm070549+

  4. Guthrie JP (2009) A blind challenge for computational solvation free energies: introduction and overview. J Phys Chem B 113(14):4501–4507

  5. Skillman AG, Geballe MT, Nicholls A (2010) SAMPL2 challenge: prediction of solvation energies and tautomer ratios. J Comput Aided Mol Des 24(4):257–258. https://doi.org/10.1007/s10822-010-9358-0

  6. Geballe MT, Skillman AG, Nicholls A, Guthrie JP, Taylor PJ (2010) The SAMPL2 blind prediction challenge: introduction and overview. J Comput Aided Mol Des 24(4):259–279. https://doi.org/10.1007/s10822-010-9350-8

  7. Skillman AG (2012) SAMPL3: blinded prediction of host–guest binding affinities, hydration free energies, and trypsin inhibitors. J Comput Aided Mol Des 26(5):473–474. https://doi.org/10.1007/s10822-012-9580-z

  8. Geballe MT, Guthrie JP (2012) The SAMPL3 blind prediction challenge: transfer energy overview. J Comput Aided Mol Des 26(5):489–496. https://doi.org/10.1007/s10822-012-9568-8

  9. Muddana HS, Varnado CD, Bielawski CW, Urbach AR, Isaacs L, Geballe MT, Gilson MK (2012) Blind prediction of host–guest binding affinities: a new SAMPL3 challenge. J Comput Aided Mol Des 26(5):475–487. https://doi.org/10.1007/s10822-012-9554-1

  10. Guthrie JP (2014) SAMPL4, a blind challenge for computational solvation free energies: the compounds considered. J Comput Aided Mol Des 28(3):151–168. https://doi.org/10.1007/s10822-014-9738-y

  11. Mobley DL, Wymer KL, Lim NM, Guthrie JP (2014) Blind prediction of solvation free energies from the SAMPL4 challenge. J Comput Aided Mol Des 28(3):135–150. https://doi.org/10.1007/s10822-014-9718-2

  12. Muddana HS, Fenley AT, Mobley DL, Gilson MK (2014) The SAMPL4 host–guest blind prediction challenge: an overview. J Comput Aided Mol Des 28(4):305–317. https://doi.org/10.1007/s10822-014-9735-1

  13. Mobley DL, Liu S, Lim NM, Wymer KL, Perryman AL, Forli S, Deng N, Su J, Branson K, Olson AJ (2014) Blind prediction of HIV integrase binding from the SAMPL4 challenge. J Comput Aided Mol Des 28(4):327–345. https://doi.org/10.1007/s10822-014-9723-5

  14. Yin J, Henriksen NM, Slochower DR, Shirts MR, Chiu MW, Mobley DL, Gilson MK (2017) Overview of the SAMPL5 host–guest challenge: are we doing better? J Comput Aided Mol Des 31(1):1–19. https://doi.org/10.1007/s10822-016-9974-4

  15. Bannan CC, Burley KH, Chiu M, Shirts MR, Gilson MK, Mobley DL (2016) Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge. J Comput Aided Mol Des 30(11):1–18. https://doi.org/10.1007/s10822-016-9954-8

  16. Bannan CC, Burley KH, Chiu M, Shirts MR, Gilson MK, Mobley DL (2016) Blind prediction of cyclohexane–water distribution coefficients from the SAMPL5 challenge. J Comput Aided Mol Des 30(11):927–944. https://doi.org/10.1007/s10822-016-9954-8

  17. Rustenburg AS, Dancer J, Lin B, Feng JA, Ortwine DF, Mobley DL, Chodera JD (2016) Measuring experimental cyclohexane–water distribution coefficients for the SAMPL5 challenge. J Comput Aided Mol Des 30(11):945–958. https://doi.org/10.1007/s10822-016-9971-7

  18. Pickard FC, König G, Tofoleanu F, Lee J, Simmonett AC, Shao Y, Ponder JW, Brooks BR (2016) Blind prediction of distribution in the SAMPL5 challenge with QM based protomer and pKa corrections. J Comput Aided Mol Des 30(11):1087–1100. https://doi.org/10.1007/s10822-016-9955-7

  19. Bodner GM (1986) Assigning the pKa’s of polyprotic acids. J Chem Educ 63(3):246

  20. Darvey IG (1995) The assignment of pKa values to functional groups in amino acids. Wiley, New York

  21. Bezençon J, Wittwer MB, Cutting B, Smieško M, Wagner B, Kansy M, Ernst B (2014) pKa determination by 1H NMR spectroscopy–an old methodology revisited. J Pharm Biomed Anal 93:147–155. https://doi.org/10.1016/j.jpba.2013.12.014

  22. Elson EL, Edsall JT (1962) Raman spectra and sulfhydryl ionization constants of thioglycolic acid and cysteine. Biochemistry 1(1):1–7

  23. Elbagerma MA, Edwards HGM, Azimi G, Scowen IJ (2011) Raman spectroscopic determination of the acidity constants of salicylaldoxime in aqueous solution. J Raman Spectrosc 42(3):505–511. https://doi.org/10.1002/jrs.2716

  24. Rupp M, Korner R, Tetko IV (2011) Predicting the pKa of small molecules. Comb Chem High Throughput Screen 14(5):307–327

  25. Marosi A, Kovács Z, Béni S, Kökösi J, Noszál B (2009) Triprotic acid–base microequilibria and pharmacokinetic sequelae of cetirizine. Eur J Pharm Sci 37(3–4):321–328. https://doi.org/10.1016/j.ejps.2009.03.001

  26. Sober HA, Company CR (1970) Handbook of biochemistry: selected data for molecular biology. Chemical Rubber Company, Cleveland

  27. Benesch RE, Benesch R (1955) The acid strength of the -SH group in cysteine and related compounds. J Am Chem Soc 77(22):5877–5881. https://doi.org/10.1021/ja01627a030

  28. Tam KY, Takács-Novák K (2001) Multi-wavelength spectrophotometric determination of acid dissociation constants: a validation study. Anal Chim Acta 434(1):157–167

  29. Allen RI, Box KJ, Comer JEA, Peake C, Tam KY (1998) Multiwavelength spectrophotometric determination of acid dissociation constants of ionizable drugs. J Pharm Biomed Anal 17(4):699–712

  30. Comer JEA, Manallack D (2014) Ionization constants and ionization profiles. In: Reedijk J (ed) Reference module in chemistry, molecular sciences and chemical engineering. Elsevier, New York. https://doi.org/10.1016/B978-0-12-409547-2.11233-8

  31. Avdeef A, Box KJ, Comer JEA, Gilges M, Hadley M, Hibbert C, Patterson W, Tam KY (1999) pH-metric logP 11. pKa determination of water-insoluble drugs in organic solvent–water mixtures. J Pharm Biomed Anal 20(4):631–641

  32. Cabot JM, Fuguet E, Rosés M, Smejkal P, Breadmore MC (2015) Novel instrument for automated pKa determination by internal standard capillary electrophoresis. Anal Chem 87(12):6165–6172. https://doi.org/10.1021/acs.analchem.5b00845

  33. Wan H, Holmén A, Någård M, Lindberg W (2002) Rapid screening of pKa values of pharmaceuticals by pressure-assisted capillary electrophoresis combined with short-end injection. J Chromatogr A 979(1–2):369–377

  34. Reijenga J, van Hoof A, van Loon A, Teunissen B (2013) Development of methods for the determination of pKa values. Anal Chem Insights 8:ACI.S12304. https://doi.org/10.4137/ACI.S12304

  35. Sterling T, Irwin JJ (2015) ZINC 15 - ligand discovery for everyone. J Chem Inf Model 55(11):2324–2337. https://doi.org/10.1021/acs.jcim.5b00559

  36. Baell JB, Holloway GA (2010) New substructure filters for removal of pan assay interference compounds (PAINS) from screening libraries and for their exclusion in bioassays. J Med Chem 53(7):2719–2740. https://doi.org/10.1021/jm901137j

  37. Saubern S, Guha R, Baell JB (2011) KNIME workflow to assess PAINS filters in SMARTS format. Comparison of RDKit and Indigo Cheminformatics Libraries. Mol Inf 30(10):847–850. https://doi.org/10.1002/minf.201100076

  38. eMolecules Database Free Version. https://www.emolecules.com/info/products-data-downloads.html. Accessed 01 July 2017

  39. OEChem Toolkit Version 2017.Feb.1. OpenEye Scientific Software, Santa Fe, NM. http://www.eyesopen.com

  40. Shelley JC, Cholleti A, Frye LL, Greenwood JR, Timlin MR, Uchimaya M (2007) Epik: a software program for pKa prediction and protonation state generation for drug-like molecules. J Comput Aided Mol Des 21(12):681–691. https://doi.org/10.1007/s10822-007-9133-z

  41. Schrödinger Release 2016-4: Epik Version 3.8. Schrödinger, LLC, New York, 2016

  42. OEMolProp Toolkit Version 2017.Feb.1. OpenEye Scientific Software, Santa Fe, NM. http://www.eyesopen.com

  43. Wishart DS (2006) DrugBank: a comprehensive resource for in silico drug discovery and exploration. Nucleic Acids Res 34(90001):D668–D672. https://doi.org/10.1093/nar/gkj067

  44. Pence HE, Williams A (2010) ChemSpider: an online chemical information resource. J Chem Educ 87(11):1123–1124. https://doi.org/10.1021/ed100697w

  45. NCI Open Database, August 2006 Release. https://cactus.nci.nih.gov/download/nci/. Accessed 8 Aug 2017

  46. Enhanced NCI Database Browser 2.2. https://cactus.nci.nih.gov/ncidb2.2/. Accessed 8 Aug 2017

  47. Kim S, Thiessen PA, Bolton EE, Chen J, Fu G, Gindulyte A, Han L, He J, He S, Shoemaker BA, Wang J, Yu B, Zhang J, Bryant SH (2016) PubChem substance and compound databases. Nucleic Acids Res 44(D1):D1202–D1213. https://doi.org/10.1093/nar/gkv951

  48. NCI/CADD Chemical Identifier Resolver. https://cactus.nci.nih.gov/chemical/structure. Accessed 8 Aug 2017

  49. Bemis GW, Murcko MA (1996) The properties of known drugs. 1. Molecular frameworks. J Med Chem 39(15):2887–2893

  50. OEMedChem Toolkit Version 2017.Feb.1. OpenEye Scientific Software, Santa Fe. http://www.eyesopen.com

  51. Sirius T3 User Manual, v1.1. Sirius Analytical Instruments Ltd, East Sussex (2008)

  52. Yasuda M (1959) Dissociation constants of some carboxylic acids in mixed aqueous solvents. Bull Chem Soc Japan 32(5):429–432

  53. Shedlovsky T (1962) The behaviour of carboxylic acids in mixed solvents. In: Pesce B (ed) Electrolytes. Pergamon Press, New York, pp 146–151

  54. Avdeef A, Comer JEA, Thomson SJ (1993) pH-Metric log P. 3. Glass electrode calibration in methanol-water, applied to pKa determination of water-insoluble substances. Anal Chem 65(1):42–49. https://doi.org/10.1021/ac00049a010

  55. Takács-Novák K, Box KJ, Avdeef A (1997) Potentiometric pKa determination of water-insoluble compounds: validation study in methanol/water mixtures. Int J Pharm 151(2):235–248. https://doi.org/10.1016/S0378-5173(97)04907-7

  56. Szakacs Z, Beni S, Varga Z, Orfi L, Keri G, Noszal B (2005) Acid–base profiling of imatinib (gleevec) and its fragments. J Med Chem 48(1):249–255. https://doi.org/10.1021/jm049546c

  57. Szakacs Z, Kraszni M, Noszal B (2004) Determination of microscopic acid–base parameters from NMR–pH titrations. Anal Bioanal Chem 378(6):1428–1448. https://doi.org/10.1007/s00216-003-2390-3

  58. Dozol H, Blum-Held C, Guédat P, Maechling C, Lanners S, Schlewer G, Spiess B (2002) Inframolecular acid–base studies of the tris and tetrakis myo-inositol phosphates including the 1, 2, 3-trisphosphate motif. J Mol Struct 643(1–3):171–181

  59. OEDepict Toolkit Version 2017.Feb.1. OpenEye Scientific Software, Santa Fe. http://www.eyesopen.com

  60. Fraczkiewicz R (2013) In silico prediction of ionization. In: Reedijk J (ed) Reference module in chemistry, molecular sciences and chemical engineering. Elsevier, New York. https://doi.org/10.1016/B978-0-12-409547-2.02610-X

Acknowledgements

MI, ASR, and JDC acknowledge support from the Sloan Kettering Institute. JDC acknowledges support from NIH grant P30 CA008748. MI, JDC, ASR, and DLM gratefully acknowledge support from NIH grant R01GM124270 supporting SAMPL blind challenges. MI acknowledges support from a Doris J. Hutchinson Fellowship. DLM appreciates financial support from the National Institutes of Health (1R01GM108889-01) and the National Science Foundation (CHE 1352608). IEN acknowledges support from the MRL Postdoctoral Research Program. The authors are extremely grateful for the assistance and support from the MRL Preformulations and NMR Structure Elucidation groups for materials, expertise, and instrument time, without which this SAMPL challenge would not have been possible. MI and DL are grateful to Pion/Sirius Analytical for their technical support in the planning and execution of this study. We are especially thankful to Karl Box (Sirius Analytical) for guidance on the optimization and interpretation of pKa measurements with the Sirius T3, as well as feedback on the manuscript. We thank Brad Sherborne (MRL; ORCID: 0000-0002-0037-3427) for his valuable insights at the conception of the pKa challenge and for connecting us with TR and DL, who were able to provide resources for experimental measurements. We acknowledge Paul Czodrowski (Merck KGaA; ORCID: 0000-0002-7390-8795), who provided feedback on multiple stages of this work: challenge construction, purchasable compound selection, and the manuscript. We acknowledge contributions from Caitlin Bannan, who provided feedback on experimental data collection and the structure of the pKa challenge from a computational chemist's perspective. We are also grateful to Marilyn Gunner (CCNY) for her feedback on this manuscript. We thank the anonymous reviewers for their input and constructive comments that improved this manuscript. MI, ASR, and JDC are grateful to OpenEye Scientific for providing a free academic software license for use in this work.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Author information

Contributions

Conceptualization, MI, JDC, TR, ASR, DLM; Methodology, MI, DL, IEN; Software, MI, ASR; Formal Analysis, MI; Investigation, MI, DL, IEN, HW, XW, MR; Resources, TR, DL; Data Curation, MI; Writing - Original Draft, MI, JDC, IEN; Writing - Review and Editing, MI, DL, ASR, IEN, HW, XW, MR, GEM, DLM, TR, JDC; Visualization, MI, IEN; Supervision, JDC, TR, DLM, GEM, AAM; Project Administration, MI; Funding Acquisition, JDC, DLM, TR, MI.

Corresponding authors

Correspondence to Timothy Rhodes or John D. Chodera.

Ethics declarations

Conflict of interest

JDC was a member of the Scientific Advisory Board for Schrödinger, LLC during part of this study. JDC and DLM are current members of the Scientific Advisory Board of OpenEye Scientific Software. The Chodera laboratory receives or has received funding from multiple sources, including the National Institutes of Health, the National Science Foundation, the Parker Institute for Cancer Immunotherapy, Relay Therapeutics, Entasis Therapeutics, Silicon Therapeutics, EMD Serono (Merck KGaA), AstraZeneca, the Molecular Sciences Software Institute, the Starr Cancer Consortium, Cycle for Survival, a Louis V. Gerstner Young Investigator Award, and the Sloan Kettering Institute. A complete list of funding can be found at http://choderalab.org/funding.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (PDF 3731 KB)

Supplementary material 2 (ZIP 70025 KB)

About this article

Cite this article

Işık, M., Levorse, D., Rustenburg, A.S. et al. pKa measurements for the SAMPL6 prediction challenge for a set of kinase inhibitor-like fragments. J Comput Aided Mol Des 32, 1117–1138 (2018). https://doi.org/10.1007/s10822-018-0168-0

  • DOI: https://doi.org/10.1007/s10822-018-0168-0
