1 Introduction

Humans have a capability known as interconnection: the ability to link separate pieces of knowledge and draw new conclusions from them. For example, if A is B’s brother and B is C’s brother, we know immediately that A is C’s brother. A machine, however, cannot reach this conclusion unless the knowledge is stated explicitly. Interconnection lets people comprehend through inference, and it rests on the capability of understanding. It is because of understanding that humans can progress from simple to complex mathematical problems, and from babbling to the creation of literary masterpieces.

In the field of natural language processing (NLP), however, the dominant approach is statistical. For more demanding tasks such as language translation and problem solving, statistical methods are not a magic bullet. By contrast, a method based on understanding can handle such tasks with ease.

Unlike the concepts used in psychology, the theory of natural language understanding [1, 2] defines perception as the minimum element of feeling and uses it as the basic concept of the physical theory of understanding. Because the qualitative aspect of perception is abstract, it need not be fully simulated at the current stage. Instead, the qualitatively invariant part of a perception is extracted and identified with a symbol. This achieves the same effect as a full simulation. We denote a perception element by \( p \).

Perception: a perception is the unit of consciousness. It is the minimum element that can be felt and also the minimum element of meaning; its qualitatively invariant part is denoted by \( p \).

2 Understanding of Logic and Physical Model of Interconnection

Papers [1, 2] showed that understanding includes not only the understanding of meaning but also the understanding of logic. Understanding is the sense of certainty that arises when a stimulus matches the set of perceptual patterns within the cognitive system. It is formalized as:

$$ b_{t} = u(x) = w(P_{g}^{x} ,m^{x} ) $$
(1)

Here \( x \) is the external stimulus and \( u(x) \) is the understanding of \( x \). The corresponding perceptual pattern is \( m^{x} \), the perceptual subset is \( P_{g}^{x} \), and the assembly perception set is \( P_{g} \), with \( P_{g}^{x} \subset P_{g} \). The function \( w \) gives the match between \( x \) and \( m^{x} \) together with its confidence, and \( b_{t} \) is the resulting feeling of certainty.
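Formula (1) can be illustrated with a minimal sketch. The source does not specify the form of \( w \); the version below assumes, purely for illustration, that perceptions are string labels, that a perceptual pattern is a set of such labels, and that \( w \) is the fraction of the pattern covered by the perceptual subset. The names `match_confidence` and `understand` and the sample perceptions are hypothetical.

```python
from typing import FrozenSet, Set

Perception = str  # assumption: each perception p is identified by a symbol


def match_confidence(subset: Set[Perception],
                     pattern: FrozenSet[Perception]) -> float:
    """Hypothetical w: fraction of the pattern m^x covered by the subset P_g^x."""
    if not pattern:
        return 0.0
    return len(subset & pattern) / len(pattern)


def understand(stimulus: Set[Perception],
               assembly_set: Set[Perception],
               pattern: FrozenSet[Perception]) -> float:
    """b_t = u(x) = w(P_g^x, m^x).

    Assumption: P_g^x is the part of the assembly set P_g evoked by x.
    """
    perceptual_subset = stimulus & assembly_set  # P_g^x, a subset of P_g
    return match_confidence(perceptual_subset, pattern)


# Illustrative data only.
P_g = {"moon", "bright", "light", "bed"}
m_x = frozenset({"moon", "bright"})
certainty = understand({"moon", "bright", "frost"}, P_g, m_x)
print(certainty)
```

Under these assumptions, a stimulus that evokes every perception in the pattern yields the highest certainty, and a stimulus that evokes none of them yields zero.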

A stimulus consists of several parts, and understanding it involves understanding every part. Paper [1] gives a formula for this comprehensive understanding. For a stimulus \( x \), let \( p \) be the perception it produces and \( u_{c} (x) \) its comprehensive understanding. Suppose \( x \) contains \( n \) assembly perception sets and \( n \) perceptual patterns, denoted \( (P_{g}^{x} )_{i} \) and \( (M^{x} )_{i} \;(i = 1,2, \ldots ,n) \) respectively. Then the comprehensive understanding of \( x \) is:

$$ u_{c} (x) = \prod\limits_{i = 1}^{n} {u((P_{g}^{x} )_{i} )} = \prod\limits_{i = 1}^{n} {w((P_{g}^{x} )_{i} ,(M^{x} )_{i} )} $$
(2)
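A small numerical sketch may make the product in formula (2) concrete. As before, the form of \( w \) is not given in the source; the function `overlap` below is an assumed stand-in, and the sample parts are hypothetical.

```python
import math
from typing import Callable, Sequence, Set, Tuple

# One part of the stimulus: ((P_g^x)_i, (M^x)_i)
Part = Tuple[Set[str], Set[str]]


def overlap(subset: Set[str], pattern: Set[str]) -> float:
    """Hypothetical w, assumed for illustration: share of the pattern matched."""
    return len(subset & pattern) / len(pattern) if pattern else 0.0


def comprehensive_understanding(
    parts: Sequence[Part],
    w: Callable[[Set[str], Set[str]], float] = overlap,
) -> float:
    """u_c(x): product of the per-part understandings w((P_g^x)_i, (M^x)_i)."""
    return math.prod(w(subset, pattern) for subset, pattern in parts)


# Illustrative data only.
parts = [
    ({"moon", "bright"}, {"moon", "bright"}),  # fully matched part
    ({"light"}, {"light", "glow"}),            # half matched part
]
u_c = comprehensive_understanding(parts)
print(u_c)
```

Because the parts are combined by a product, a single part that is not understood at all (value 0) makes the comprehensive understanding of the whole stimulus 0, whatever the other parts contribute.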

To reveal the physical law of natural language understanding comprehensively and systematically, paper [2] presented the reliability and integrity theorems of the axiom system, which are essential to pragmatic implication inference.

The key theorem on pragmatic implication, presented and proved in paper [2], also builds on the integrity theorem. Inference of pragmatic implication is important to semantic computing because it provides a formal operating procedure. The pragmatic implication of a sentence in its context is:

$$ (p_{h}^{S} )_{j} = (M^{r} - \bigcup\limits_{i = 1,i \ne j}^{n} {(p^{s} )_{i} } - (p_{s}^{s} )_{j} ) \cup (p_{t}^{s} )_{j} $$
(3)

Here an understood context contains a collection of sentences \( S = \bigcup\limits_{j = 1}^{n} {s_{j} } \), with the associated rule set \( M^{r} \; \subseteq \;S_{c} \). The pragmatic meaning of sentence \( s \) is \( P_{h}^{S} \), the pragmatic meaning of sentence \( s_{j} \) is \( (P_{h}^{s} )_{j} \), its sentence meaning is \( (P^{s} )_{j} \), its literal meaning is \( (P_{s}^{s} )_{j} \), and its deduced meaning is \( (P_{t}^{s} )_{j} \). As the expression shows, the pragmatic implication is a value deduced within the sentence group context G. Fixing G therefore determines the pragmatic extrapolation value in that context.
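The set operations in formula (3) can be sketched directly. The sketch assumes that every meaning is represented as a set of string-labelled perceptions and that sentences are indexed from 0 rather than 1; the function name `pragmatic_implication` and all sample values are hypothetical.

```python
from typing import Sequence, Set


def pragmatic_implication(rule_set: Set[str],
                          sentence_meanings: Sequence[Set[str]],
                          literal_meanings: Sequence[Set[str]],
                          deduced_meanings: Sequence[Set[str]],
                          j: int) -> Set[str]:
    """Formula (3) for sentence j (0-based index, an assumption of this sketch).

    Removes from the rule set M^r the meanings of all other sentences and the
    literal meaning of sentence j, then adds the deduced meaning of sentence j.
    """
    others: Set[str] = set()
    for i, meaning in enumerate(sentence_meanings):
        if i != j:
            others |= meaning
    return (rule_set - others - literal_meanings[j]) | deduced_meanings[j]


# Illustrative data only.
M_r = {"r1", "r2", "r3", "r4"}
sentence_meanings = [{"r1"}, {"r2"}]
literal_meanings = [{"r1"}, {"r2"}]
deduced_meanings = [{"t1"}, {"t2"}]

implication_0 = pragmatic_implication(
    M_r, sentence_meanings, literal_meanings, deduced_meanings, j=0
)
print(sorted(implication_0))
```

In this toy context, what remains for the first sentence is the part of the rule set not already expressed by any sentence, together with the meaning deduced for that sentence.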

2.1 Understanding of Logic is the Prerequisite of Interconnection

Understanding of logic and interconnection are closely related. In paper [3], we showed how to implement reasoning understanding (understanding of logic) by applying a perception semantics dictionary to the semantic sequence corresponding to the language input, and thereby acquire new understandable beliefs. These results further support the theory of natural language understanding that we proposed.

Table 1 shows one of the experimental examples from paper [3]: a line from a famous Tang Dynasty poem by Li Bai. The example assumes that the machine already knows the words “in front of the bed”, “bright moon”, and “light” together with their perception semantic values, and that it can understand the semantics of these words according to the definition of understanding. Here Mr is the pattern set generated after the context is understood, and the filter function new(Mr) generates the new pattern set M.

Table 1. Computing example of learning based on understanding.

Table 1 shows the formal procedure by which the machine learns “bright moon light in front of the bed” through understanding. The first question is whether “bright moon” and “light” can be combined. The implication of “bright moon” is that it gives out “light”, so “bright moon light” means the “light” shed by the “bright moon” and is therefore understandable. The second question is what “in front of the bed” and “bright moon light” mean together, that is, whether there can be “bright moon light” “in front of the bed”. When the “bright moon” shines in the sky, its “light” can reach everywhere, including “in front of the bed”. Therefore “bright moon light in front of the bed” is understandable. Storing what has been understood is learning, and understanding is both the basis and one of the purposes of learning. Reasoning in logic understanding includes deduction and induction based on perception semantics. New knowledge can be acquired only when logic understanding is carried out, and only then can the effects of interconnection be obtained.
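The two reasoning steps described above can be sketched as follows. This is not the procedure of Table 1 or of paper [3]; it is a toy illustration that assumes a perception semantics dictionary mapping each known phrase to a set of string-labelled perceptions. The dictionary entries, the function `learn_phrase`, and the rule that a combination is understandable when each part carries the required perceptions are all hypothetical.

```python
from typing import Dict, Optional, Set

# Hypothetical perception semantics dictionary (illustrative entries only).
DICTIONARY: Dict[str, Set[str]] = {
    "bright moon": {"emits light", "shines everywhere"},
    "light": {"is light"},
    "in front of the bed": {"is a place"},
}


def learn_phrase(memory: Dict[str, Set[str]],
                 left: str, right: str,
                 required_left: Set[str],
                 required_right: Set[str]) -> Optional[str]:
    """Store 'left right' only if the combination is understandable.

    The combination counts as understandable when both parts are already
    known and each carries the perceptions required of it. Storing the
    understood phrase is the learning step. Returns the phrase or None.
    """
    if left not in memory or right not in memory:
        return None
    if not (required_left <= memory[left] and required_right <= memory[right]):
        return None
    phrase = f"{left} {right}"
    memory[phrase] = memory[left] | memory[right]
    return phrase


memory = dict(DICTIONARY)

# Step 1: the bright moon emits light, so "bright moon light" is understandable.
step1 = learn_phrase(memory, "bright moon", "light",
                     {"emits light"}, {"is light"})

# Step 2: moonlight reaches everywhere, so it can reach a place near the bed.
step2 = learn_phrase(memory, "bright moon light", "in front of the bed",
                     {"is light", "shines everywhere"}, {"is a place"})

# A combination with no supporting perception is rejected and not stored.
rejected = learn_phrase(memory, "in front of the bed", "light",
                        {"emits light"}, {"is light"})
print(step1, "|", step2, "|", rejected)
```

The second step succeeds only because the first step stored “bright moon light” together with the perceptions inherited from its parts, which mirrors the claim that new knowledge is acquired through understanding and then reused.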

Compared with statistics-based machine learning, the understanding-based approach has several advantages. First, its learning results are robust, because they are fully logical. Second, its learning outcomes are grounded, because they are in line with reality. Third, its outcomes are understandable to human beings. Fourth, it is a small-sample learning method. The prerequisite is a careful description of common sense and a comprehensive understanding, and the process of comprehensive understanding is more complicated and more time-consuming than statistical methods. The study is still in its infancy, and much further research is needed.

2.2 The Physical Model of Interconnection and its Meaning

Interconnection fuses certain perception material and then generates new correct beliefs; it is therefore also known as fusion. A model of interconnection is presented in paper [2] and shown in Fig. 1. In papers [2, 4], interconnection is built on understanding: without understanding as its basis, a machine cannot implement interconnection, and its knowledge remains isolated hard knowledge.

Fig. 1. Extended (pulped) Turing Machine Model processed information based on perception semantics

Because a knowledge base system is built on concepts, its knowledge cannot be interconnected well. Differentiating, understanding, and fusing concepts is the fundamental route to overcoming this problem. Understanding these concepts supports automatic learning and automatic use of knowledge.

In Fig. 1, when the cognitive system observes the Real World, it forms perceived Images of the Real World. Through the Instinct Mechanism of machine understanding, it then judges whether an image forms any function, that is, whether it has value. If the function rule is true (i.e. \( e(x) = 1 \)), the cognitive system differentiates and segments out the Function Rules, including Concepts (C01, C02,…, Cmn)/Knowledge(K01, K02,…, Kmn).

In the cognitive system, all Concepts (C01, C02,…, Cmn)/Knowledge(K01, K02,…, Kmn) are fused based on the Perception Semantics of Concepts \( ( {\text{PS}}_{01}^{\text{C}} \,{\text{PS}}_{02}^{\text{C}} \ldots \ldots {\text{PS}}_{{0{\text{n}}}}^{\text{C}} ,\;{\text{PS}}_{11}^{\text{C}} \,{\text{PS}}_{12}^{\text{C}} \ldots \ldots {\text{PS}}_{{1{\text{n}}}}^{\text{C}} , \ldots \ldots ,{\text{PS}}_{{{\text{m}}1}}^{\text{C}} {\text{PS}}_{\text{m2}}^{\text{C}} \ldots \ldots {\text{PS}}_{\text{mn}}^{\text{C}} ) \) and the Perception Semantics of Knowledge \( ( {\text{PS}}_{01}^{\text{K}} \,{\text{PS}}_{02}^{\text{K}} \ldots \ldots {\text{PS}}_{{0{\text{n}}}}^{\text{K}} ,\;{\text{PS}}_{11}^{\text{K}} \,{\text{PS}}_{12}^{\text{K}} \ldots \ldots {\text{PS}}_{{1{\text{n}}}}^{\text{K}} , \ldots \ldots ,\;{\text{PS}}_{{{\text{m}}1}}^{\text{K}} \,{\text{PS}}_{\text{m2}}^{\text{K}} \ldots \ldots {\text{PS}}_{\text{mn}}^{\text{K}} ) \). These concepts and knowledge are in a fused, linked state whenever the cognitive system is thinking or using them.

3 Prospects of Understanding Theory

During language translation, suppose the machine already has some basic facts and knowledge. If it has the ability of interconnection, it will also have the ability of analogy. It can interconnect old material (content) again through perception elements, generate new beliefs according to new requirements, and thus continually extract new experience from old material. This means that once the machine grasps a typical interconnection, it can comprehend the whole category by analogy. The process resembles paper making: paper fiber is pounded, mixed with water, and fused into fine pulp, from which paper products meeting various requirements are manufactured. Because ready-made, suitable knowledge is often lacking, machine-oriented interconnection is extremely significant.

On the other hand, the Turing Machine Model places no limit (constraint) on the granularity of the concepts represented by a variable or state. A formalization system based on the Turing Machine Model is therefore normally local, and fusing such systems is difficult. This limitation shows up in programming. A program is generally a formalization system. When programming, we first define variables of various data structures, then define the logical flow, then perform mathematical operations or formal transformations on the variables, and finally store the results in result variables. A program thus consists of variables and a finite sequence of operation instructions. It is a formalization system that halts within a finite number of steps.

The granularity of a variable (or state) can vary. Its unit can be a single element, or it can be as large as a combined concept composed of multiple concepts. There can be a huge number of such concepts, and the concepts of different systems are independent, so the number of variable definitions can also be huge. Humans therefore need to write a huge number of programs and build a huge number of formalization systems in order to adapt to changes in the real world.

In short, the limitation of the Turing Machine Model is that it cannot fuse the knowledge it uses. Interconnection built on understanding will increase the utilization and usability of knowledge. We therefore expect our natural language understanding theory to have a significant impact on software engineering, especially on software production automation.

The theory of natural language understanding can also be applied to network content security, spoken dialogue systems, information retrieval, verification code recognition, voice content retrieval, and other areas [5,6,7,8,9,10,11,12], and can thus incrementally fulfill the dream of Alan Turing’s Imitation Game.

4 Conclusion

Natural language understanding theory summarizes one of the important laws of human language phenomena. As in physics, the theory can be summarized and extracted from psychological and physical phenomena. The Turing Machine is the foundation of computer science, and its original intention, which is also its nature, is to simulate thinking computationally. As analyzed in paper [2], the interconnection ability of the original Turing Machine Model is limited. Natural language understanding theory [1, 2] is an extension of the Turing Machine Model. Just as the Turing machine liberated human beings from cumbersome computation, we believe that natural language understanding theory will liberate human beings from the cumbersome simulation of thinking that forces them to build endless formalization systems on the Turing Machine. This is the motivation for research on natural language understanding theory.