Mining the Temporal Structure of Thought from Text

  • Mei Mei
  • Zhaowei Ren
  • Ali A. Minai
Conference paper
Part of the Springer Proceedings in Complexity book series (SPCOM)


Thinking is a self-organized dynamical process and, as such, interesting to characterize. However, direct, real-time access to thought at the semantic level is still very limited; the best that can be done is to look at spoken or written expression. The question we address in this research is the following: Is there a characteristic time-scale of thought? To begin answering this complex question, we look at text documents from several large corpora at the sentence level – i.e., using sentences as the units of meaning – and consider each document to be the result of a random process in semantic space. Given a large corpus of multi-sentence documents, we build a lexical association network representing associations between words in the corpus. This network is used to induce a semantic similarity metric between sentences, and each document is segmented into multi-sentence semantically coherent blocks (SCBs) with occasional connecting text between the blocks. Based on this segmentation, the process of document generation is modeled as a sticky Markov chain at the sentence level. We show that most documents are sequences of blocks with a very consistent mean length of 6.4 sentences across all the corpora, suggesting that 6–7 sentences may be the typical length of a single coherent thought in text. We also describe several ways of visualizing the semantic structure of documents in space and time.
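The pipeline outlined above (association network → sentence similarity → segmentation into coherent blocks) can be illustrated with a minimal sketch. The paper's actual similarity metric and segmentation algorithm are not given in this abstract, so the within-sentence co-occurrence weighting, the pairwise-mean similarity, and the greedy threshold rule below are illustrative assumptions, not the authors' method:

```python
from collections import defaultdict
from itertools import combinations

def build_association_network(docs):
    """Weight word pairs by how often they co-occur within a sentence.
    `docs` is a list of documents, each a list of sentence strings."""
    weight = defaultdict(float)
    for doc in docs:
        for sentence in doc:
            words = sorted(set(sentence.lower().split()))
            for a, b in combinations(words, 2):
                weight[(a, b)] += 1.0
    return weight

def sentence_similarity(s1, s2, weight):
    """Similarity induced by the network: mean association strength
    over cross-sentence word pairs (an assumed, simple metric)."""
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    pairs = [(min(a, b), max(a, b)) for a in w1 for b in w2 if a != b]
    if not pairs:
        return 0.0
    return sum(weight.get(p, 0.0) for p in pairs) / len(pairs)

def segment_into_blocks(doc, weight, threshold=0.5):
    """Greedy segmentation: start a new block whenever the similarity
    between adjacent sentences drops below the threshold."""
    blocks = [[doc[0]]]
    for prev, cur in zip(doc, doc[1:]):
        if sentence_similarity(prev, cur, weight) >= threshold:
            blocks[-1].append(cur)
        else:
            blocks.append([cur])
    return blocks
```

Given such a segmentation, the statistic reported in the paper corresponds to the mean block length, i.e. total sentences divided by number of blocks, averaged over documents.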


Keywords: Semantic dynamics · Text analysis · Text segmentation



This work was supported in part by National Science Foundation INSPIRE grant BCS-1247971 to Ali Minai.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, USA
