A Chinese Conceptual Semantic Feature Dataset (CCFD)


Memory and language are important high-level human cognitive functions, and studying how the human brain represents concepts is a key approach to revealing the principles of cognition. However, this research is often constrained by the availability of stimulus materials, because work on concept representation usually depends on a standardized, large-scale database of conceptual semantic features. Although Western scholars have established a variety of English conceptual semantic feature datasets, a comprehensive Chinese counterpart is still lacking. In the present study, a Chinese Conceptual Semantic Feature Dataset (CCFD) was established, covering 1,410 concepts together with their semantic features and the similarity between concepts. The concepts were manually grouped into 28 subordinate categories and seven superordinate categories. The results showed that concepts within the same category were closer to each other, whereas concepts from different categories were farther apart. The CCFD can provide stimulus materials and data support for related research fields. All data and supplementary materials are available at https://osf.io/ug5dt/.
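The within-category versus between-category comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each concept is represented by a vector of feature production frequencies and that similarity is measured by cosine similarity; the abstract does not state which measure the CCFD uses, so this is an illustration rather than the authors' method. The concepts, features, frequencies, and category labels below are hypothetical toy values, not entries from the dataset.

```python
import math
from itertools import combinations

# Hypothetical toy data (NOT taken from the CCFD):
# concept -> {feature: production frequency}
features = {
    "dog":    {"has_legs": 9, "is_animal": 8, "barks": 7},
    "cat":    {"has_legs": 8, "is_animal": 9, "meows": 6},
    "hammer": {"is_tool": 9, "has_handle": 7, "made_of_metal": 5},
    "saw":    {"is_tool": 8, "has_handle": 6, "made_of_metal": 6},
}
# Hypothetical category labels for the toy concepts.
category = {"dog": "animal", "cat": "animal", "hammer": "tool", "saw": "tool"}


def cosine(a, b):
    """Cosine similarity between two sparse feature vectors (dicts)."""
    dot = sum(a[f] * b[f] for f in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean(values):
    """Arithmetic mean; returns 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def within_between_similarity(features, category):
    """Mean pairwise similarity within and between categories."""
    within, between = [], []
    for c1, c2 in combinations(sorted(features), 2):
        sim = cosine(features[c1], features[c2])
        if category[c1] == category[c2]:
            within.append(sim)
        else:
            between.append(sim)
    return mean(within), mean(between)


within_mean, between_mean = within_between_similarity(features, category)
print(f"within-category mean similarity:  {within_mean:.3f}")
print(f"between-category mean similarity: {between_mean:.3f}")
```

On these toy values the within-category mean exceeds the between-category mean, which is the pattern the abstract reports for the real dataset. With the actual data from https://osf.io/ug5dt/, the feature dictionaries would be replaced by the dataset's concept-feature table, whose file layout is not described here.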





This study was supported by the Fundamental Research Funds for the Central Universities (Grant Nos. CUC200A004, CUC18A001, CUC18A003-3 and CUC2019B079), the High-Quality and Cutting-Edge Disciplines Construction Project for Universities in Beijing (Internet Information, Communication University of China), Ministry of Education of the PRC (20YJAZH085), and the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing (Grant No. 2020A09).

Author information



Corresponding authors

Correspondence to Yaling Deng or Lihong Cao.

Additional information

Data availability statement

The datasets generated and analyzed during the current study are available in the supplementary materials (https://osf.io/ug5dt/). If you find any mistakes in the dataset, we would appreciate it if you contacted us so that we can correct them.

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Deng, Y., Wang, Y., Qiu, C. et al. A Chinese Conceptual Semantic Feature Dataset (CCFD). Behav Res (2021). https://doi.org/10.3758/s13428-020-01525-x


Keywords

  • concept
  • semantic feature
  • dataset
  • Chinese