Molecular Basis of Food Classification in Traditional Chinese Medicine

  • Xiaosong Han
  • Haiyan Zhao
  • Hao Xu
  • Yun Yang
  • Yanchun Liang
  • Dong Xu (email author)
Part of the Emerging Topics in Statistics and Biostatistics book series (ETSB)


Traditional Chinese Medicine (TCM) began considering the medicinal and health effects of food thousands of years ago. TCM labels foods as cold, neutral, or hot, much as it classifies Chinese herbal medicines. However, it is unclear whether this classification has any molecular or biochemical basis, or how it relates to the nutrient composition of food. To answer these questions, we collected a large dataset in which each type of food has both a TCM label and molecular composition records, and used it for statistical analyses and machine-learning predictions. We applied machine-learning methods that use the molecular composition of a food to predict its hot, neutral, or cold label, and achieved more than 80% accuracy, which clearly indicates that TCM labels have a significant molecular basis. We also applied ANOVA to identify the main factors contributing to the TCM labels. The ANOVA shows that some molecular/biochemical compositions and categories, such as Energy, Fat, Protein, Water, and Selenium (Se), are the most strongly associated with the TCM labels of food. To the best of our knowledge, this study represents the first effort to quantitatively explore the relationship between TCM labels and the molecular composition of food.
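As a rough illustration of the ANOVA step described in the abstract, the sketch below computes a one-way ANOVA F statistic for a single nutrient across the three TCM labels. It is not the chapter's code: the function name `one_way_anova_f` and the energy values are hypothetical and chosen only to make the example self-contained.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic.

    `groups` maps a label (e.g. "cold") to a list of numeric measurements.
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups (labels)
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of individual observations around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2
        for values in groups.values()
        for v in values
    )

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical energy values (kcal per 100 g), NOT taken from the chapter's dataset.
energy_by_label = {
    "cold": [20.0, 35.0, 30.0, 25.0],
    "neutral": [110.0, 90.0, 120.0, 100.0],
    "hot": [250.0, 300.0, 280.0, 270.0],
}

f_stat = one_way_anova_f(energy_by_label)
print(f"F statistic for Energy across TCM labels: {f_stat:.2f}")
```

A large F statistic means the nutrient's mean differs between label groups by more than the within-group spread would suggest. Repeating the calculation for each nutrient gives one way to rank nutrients by how strongly they separate the labels.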


Keywords: Traditional Chinese medicine; Zheng; Food composition; Health effect of food; Machine learning



This work has been partially supported by the National Natural Science Foundation of China (61503150, 61972174), the Jilin Scientific and Technological Development Plan (20160520012JH, 20170204057GX), the Guangdong Key-Project for Applied Fundamental Research (Grant 2018KZDXM076), the Guangdong Premier Key-Discipline Enhancement Scheme (Grant 2016GDYSZDXK036), and the Paul K. and Dianne Shumaker Endowed Fund at the University of Missouri.

Supplementary material

478473_1_En_10_MOESM1_ESM.xlsx (3.4 MB)
Data 1 (XLSX, 3521 kB)


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Xiaosong Han (1)
  • Haiyan Zhao (2)
  • Hao Xu (1)
  • Yun Yang (3)
  • Yanchun Liang (4, 5)
  • Dong Xu (6, 7), email author
  1. Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun, China
  2. Centre for Artificial Intelligence, FEIT, University of Technology Sydney (UTS), Broadway, Australia
  3. Philocafe, San Jose, USA
  4. College of Computer Science and Technology, Jilin University, Changchun, China
  5. Zhuhai Laboratory of Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Department of Computer Science and Technology, Zhuhai College of Jilin University, Zhuhai, China
  6. Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, USA
  7. Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, USA
