Automatic Control and Computer Sciences, Volume 51, Issue 5, pp 321–330

Prediction of soil adsorption coefficient based on deep recursive neural network

  • Xinyu Shi
  • Shengwei Tian
  • Long Yu
  • Li Li
  • Shuangyin Gao


Measuring the soil adsorption coefficient (logKoc) of compounds with traditional methods is expensive and time consuming, and some existing models have low accuracy. To address these problems, this paper proposes a deep learning (DL) method based on an undirected graph recursive neural network (UG-RNN). First, the structures of molecules are represented as directed acyclic graphs (DAGs) using the RNN model. Second, a number of such neural networks are bundled together to form a multi-level, weight-sharing deep neural network that extracts the features of the molecules. Third, the logKoc values of the compounds are predicted using a back-propagation neural network. The experimental results show that the UG-RNN model predicts better than some shallow models. Under five-fold cross validation, the root mean square error (RMSE) is 0.46, the average absolute error (AAE) is 0.35, and the squared correlation coefficient (R²) is 0.86.
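
The three reported metrics and the five-fold split can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and R² is assumed here to be the squared Pearson correlation between observed and predicted logKoc values, since the abstract does not give its exact definition.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def aae(y_true, y_pred):
    """Average absolute error between observed and predicted values."""
    n = len(y_true)
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n


def r_squared(y_true, y_pred):
    """Squared Pearson correlation (an assumed definition of R2)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov ** 2 / (var_t * var_p)


def five_fold_indices(n_samples, seed=0):
    """Yield (train, test) index lists for five disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every fifth shuffled index goes to the same fold.
    folds = [indices[k::5] for k in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

In a five-fold run, a model would be trained on each `train` list and scored on the matching `test` list, and the three metrics would then be computed on the held-out predictions.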


Keywords: deep learning, recursive neural network, logKoc, Pearson correlation coefficient, molecular descriptors





Copyright information

© Allerton Press, Inc. 2017

Authors and Affiliations

  • Xinyu Shi (1)
  • Shengwei Tian (1)
  • Long Yu (2)
  • Li Li (3)
  • Shuangyin Gao (1)

  1. School of Software, Xinjiang University, Urumqi, China
  2. Network Center, Xinjiang University, Urumqi, China
  3. College of Engineering, Xinjiang Medical University, Xinjiang Uygur Autonomous Region, Urumqi, China
