
1 Introduction

In recent years, with the breakthroughs of AlphaGo and convolutional neural networks in pattern recognition, Deep Learning has become a hot research topic. Nevertheless, some basic and very important problems in it have not yet been resolved. First of all, the basic operation of a neural unit is classification, so it is called a classifier in general textbooks; pattern recognition is implemented in the neural network by an iterative algorithm, so that deep learning achieves pattern recognition by adjusting the connection weight parameters between different levels and different classifiers (neurons), and there is a lot of uncertainty, such as probability and fuzziness, in this process. In [1, 2, 3], a new kind of computer, the Attribute Grid Computer (AGC) based on Qualitative Mapping (QM), was introduced, and it was shown that artificial intelligence methods such as the Expert System, the Artificial Neural Network, and the Support Vector Machine can be fused and unified in the framework of the qualitative criterion transformations of QM and the AGC. The basic operation of QM is covering; its mechanism is the conversion of the quantity of an attribute into the quality of the attribute. What is the principle of pattern recognition? Why can the Neural Network and the AGC recognize a pattern? What is the relation between classification and covering? Is there any link between the probability and the fuzziness in the ANN and the AGC?

In this paper, the envelope of qualitative criteria is subdivided in more detail, so that the probability of each classified sample falling into a subdivision grid cell can be counted separately. In this way, not only can any classified sample be recognized by the grid-based AGC in detail, but an explicit link between the probability and the degree of (fuzzy) conversion can also be given.

For the sake of discussion, let us first give the definitions of qualitative mapping and the Attribute Grid Computer.

2 Qualitative Mapping of Conjunction Property Judgment

Definition 1.

Let \( a(u) = \mathop \wedge \limits_{i = 1}^{n} a_{i} (u) \) be the conjunction attribute of an object u whose n factor attributes are \( a_{i}(u) \), \( i = 1, \ldots, n \); let \( x = (x_{1}, \ldots, x_{n}) \) be the quantity vector of \( a(u) \), \( x_{i} \in X_{i} \subseteq R \) the quantity of \( a_{i}(u) \), and \( p_{i}(u) \) the quality, or property, of \( a_{i}(u) \). Let \( \Gamma = \{ [\upalpha_{\text{i}} ,\upbeta_{\text{i}} \text{]|}[\upalpha_{\text{i}} ,\upbeta_{\text{i}} ] \,{\text{is}}\,{\text{the}}\,{\text{qualitative}}\,{\text{criterion}}\,{\text{of}}\,{\text{p}}_{\text{i}} \left( {\text{u}} \right)\} \) be the collection of qualitative criteria of the properties \( p_{i}(u) \), and let the n-dimensional parallelotope \( [\upalpha,\upbeta] = \{ {\text{x}}|{\text{x}} \in [\upalpha_{1} ,\upbeta_{1} ] \times \ldots \times [\upalpha_{\text{n}} ,\upbeta_{\text{n}} ]\} \) be the qualitative criterion of \( p(u) = \mathop \wedge \limits_{i = 1}^{n} p_{i} (u) \). The mapping \( \tau : X \times \Gamma \to \{ 0, 1 \} \) is called the Qualitative Mapping (QM) with criterion \( [\upalpha,\upbeta] \) if, for any \( x \in X \), there is a \( [\upalpha,\upbeta] \in \Gamma \) and a conjunction property \( p(u) \), whose qualitative criterion is \( [\upalpha,\upbeta] \), such that

$$ \tau (x,[\alpha ,\beta ]) = x\mathop \in \limits_{?} [\alpha ,\beta ] = \left\{ {\begin{array}{*{20}l} 1 \hfill & {x \in [\alpha ,\beta ]} \hfill \\ 0 \hfill & {x \notin [\alpha ,\beta ]} \hfill \\ \end{array} } \right. $$
(1)
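As a sketch, the membership test of Eq. (1) can be written directly in code. The function name `qualitative_mapping` and the list-of-pairs representation of the criterion are illustrative choices, not notation from the paper.

```python
# Sketch of the qualitative mapping tau(x, [alpha, beta]) of Eq. (1): the
# criterion is an n-dimensional parallelotope given as per-dimension
# (alpha_i, beta_i) pairs; the output is the truth value of p(u).

def qualitative_mapping(x, criterion):
    """Return 1 if the quantity vector x lies inside the parallelotope
    [alpha, beta] = [a1,b1] x ... x [an,bn], else 0."""
    return int(all(a <= xi <= b for xi, (a, b) in zip(x, criterion)))

# Example: a 2-dimensional criterion [0, 1] x [2, 5]
criterion = [(0.0, 1.0), (2.0, 5.0)]
print(qualitative_mapping((0.5, 3.0), criterion))  # inside  -> 1
print(qualitative_mapping((0.5, 6.0), criterion))  # outside -> 0
```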

For convenience of discussion, we introduce the definition of the trivial artificial neuron.

Definition 2.

Let \( a_{i}(u) \) be the attributes of an object u, \( i = 1, \ldots, n \), and \( x_{i} \in X_{i} \) the quantitative attribute of \( a_{i}(u) \). Let \( p_{ij}(u) \) be the j-th qualitative attribute of \( a_{i}(u) \), \( j = 1, \ldots, m \), with qualitative criterion \( [\upalpha_{\text{ij}} ,\upbeta_{\text{ij}} ] \subseteq {\text{X}}_{\text{i}} \), and let \( \Gamma = \{ [\upalpha_{\text{ij}} ,\upbeta_{\text{ij}} ]\} \) be the cluster of qualitative criteria, which satisfies \( [\upalpha_{\text{ij}} ,\upbeta_{\text{ij}} ] \cap [\upalpha_{\text{il}} ,\upbeta_{\text{il}} ] = \varnothing \), \( l = 1, \ldots, m \), \( l \ne j \), and \( X_{i} = \bigcup\limits_{j = 1}^{m} {[\alpha_{ij} ,\beta_{ij} ]} \). Let \( a(u) = \mathop \wedge \limits_{i = 1}^{n} a_{i} (u) \) be the conjunction attribute of the \( a_{i}(u) \), let \( x = (x_{1}, \ldots, x_{n}) \in X = X_{1} \times \ldots \times X_{n} \subseteq R^{n} \) be a quantitative attribute of \( a(u) \), and, for \( i_{k} \in \{ 1, \ldots, n \} \), \( j_{l} \in \{ 1, \ldots, m \} \), let \( [\alpha_{\nu } ,\beta_{\nu } ]_{m}^{n} = [\alpha_{{i_{1} j_{1} }} ,\beta_{{i_{1} j_{1} }} ] \times \cdots \times [\alpha_{{i_{k} j_{l} }} ,\beta_{{i_{k} j_{l} }} ] \times \cdots \times [\alpha_{{i_{n} j_{m} }} ,\beta_{{i_{n} j_{m} }} ] \) be the hyper-rectangular parallelepiped constructed from n qualitative criteria \( [\alpha_{{i_{k} j_{l} }} ,\beta_{{i_{k} j_{l} }} ] \), one per dimension.
Here, \( (i_{1} j_{1}, \ldots, i_{k} j_{l}, \ldots, i_{n} j_{m}) \) is a combination of the indices \( i_{k} \) and \( j_{l} \), and \( v = v(i_{1} j_{1}, \ldots, i_{k} j_{l}, \ldots, i_{n} j_{m}) \) is its order number. Since every \( i_{k} \) admits m different choices of \( j_{l} \), there are \( m^{n} \) combinations in total, so \( v \in \{ 1, \ldots, m^{n} \} \). Let \( p_{v} (u) = \mathop \wedge \limits_{\begin{subarray}{l} k = 1 \\ l = 1 \end{subarray} }^{{n,{\kern 1pt} {\kern 1pt} {\kern 1pt} m}} p_{{i_{k} j_{l} }} (u) \) be the conjunction property of the object u with qualitative criterion \( [\alpha_{v} ,\beta_{v} ]_{m}^{n} \), let \( \Gamma^{n} = \{ [\alpha_{v} ,\beta_{v} ]_{m}^{n} \} \) be the collection of all qualitative criteria \( [\alpha_{v} ,\beta_{v} ]_{m}^{n} \), and let \( ([\alpha_{v} ,\beta_{v} ]_{m}^{n} ) = \left( {\begin{array}{*{20}c} {[\alpha_{11} ,\beta_{11} ]} & \cdots & {(\alpha_{1m} ,\beta_{1m} ]} \\ \vdots & {(\alpha_{{i_{k} j_{l} }} ,\beta_{{i_{k} j_{l} }} ]} & \vdots \\ {[\alpha_{n1} ,\beta_{n1} ]} & \cdots & {(\alpha_{nm} ,\beta_{nm} ]} \\ \end{array} } \right) \) be the grid constructed from the \( m^{n} \) different n-dimensional hyper-rectangular parallelepipeds. Then the qualitative mapping τ with the qualitative criterion grid \( ([\alpha_{v}, \beta_{v}]) \) can be written as \( \tau : X \times \Gamma^{n} \to \{ 0, 1 \} \): for any \( x \in X \), there exists a property \( p_{v}(u) \) with \( [\alpha_{v} ,\beta_{v} ]_{m}^{n} \in \Gamma^{n} \), such that

$$ \begin{aligned} & {\rm T}\left( {(x_{1} , \cdots ,x_{n} ),\left( {\begin{array}{*{20}c} {[\alpha_{11} ,\beta_{11} ]} & \cdots & {(\alpha_{1m} ,\beta_{1m} ]} \\ \vdots & {(\alpha_{{i_{k} j_{l} }} ,\beta_{{i_{k} j_{l} }} ]} & \vdots \\ {[\alpha_{n1} ,\beta_{n1} ]} & \cdots & {(\alpha_{nm} ,\beta_{nm} ]} \\ \end{array} } \right)} \right) \\ & = \mathop {\overline{ \vee } }\limits_{{j_{l} = 1}}^{m} \mathop \wedge \limits_{{i_{k} = 1}}^{n} \{ (x_{1} , \cdots ,x_{n} )\mathop \in \limits_{?} [\alpha_{{i_{1} j_{1} }} ,\beta_{{i_{1} j_{1} }} ] \times \cdots \times (\alpha_{{i_{k} j_{l} }} ,\beta_{{i_{k} j_{l} }} ] \times \cdots \times (\alpha_{{i_{n} j_{m} }} ,\beta_{{i_{n} j_{m} }} ]\} \\ & = \mathop {\overline{ \vee } }\limits_{{j_{l} = 1}}^{m} \{ \cdots \{ \mathop \wedge \limits_{{i_{k} = 1}}^{n} \tau_{{v(i_{1} j_{1} , \cdots ,i_{k} j_{l} , \cdots ,i_{n} j_{m} )}} (x)\} \} \\ \end{aligned} $$
(2)

Here,

$$ \tau_{{_{{\nu (i_{1} j_{1} , \cdots ,i_{k} j_{l} , \cdots ,i_{n} j_{m} )}} }} (x) = \left\{ {\begin{array}{*{20}l} 1 \hfill & {iff} \hfill & {x \in [\alpha_{v} ,\beta_{v} ]} \hfill \\ 0 \hfill & {iff} \hfill & {x \notin [\alpha_{v} ,\beta_{v} ]} \hfill \\ \end{array} } \right. $$
(3)

Mapping (2) is a qualitative mapping that judges whether the property \( p_{v}(x, u) \) of an object u with quantity vector x is true or not.
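A minimal sketch of the grid mapping of Eqs. (2)–(3), under the assumption that each axis is partitioned into m half-open intervals so the cells are disjoint: `cell_index` returns the index tuple of the unique cell \( [\alpha_v, \beta_v] \) containing x, i.e., the one v for which \( \tau_v(x) = 1 \). All names are illustrative, not from the paper.

```python
# Grid qualitative mapping sketch: each of the n axes is covered by m
# disjoint half-open intervals; exactly one product cell contains x.

def cell_index(x, partitions):
    """partitions[i] is a list of m half-open intervals (a, b) covering axis i.
    Return the tuple (j_1, ..., j_n) of interval indices whose product cell
    contains x, or None if some coordinate lies outside its axis."""
    idx = []
    for xi, intervals in zip(x, partitions):
        j = next((k for k, (a, b) in enumerate(intervals) if a <= xi < b), None)
        if j is None:
            return None
        idx.append(j)
    return tuple(idx)

# 2 axes, each split into m = 3 intervals -> 3**2 = 9 cells
partitions = [[(0, 1), (1, 2), (2, 3)]] * 2
print(cell_index((0.5, 2.5), partitions))  # -> (0, 2)
```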

Fig. 1. The grid of a 3-dimensional qualitative criterion

The input of the qualitative mapping \( \uptau({\text{x}},[\upalpha,\upbeta]) \) is an n-dimensional data vector, its criterion \( [\alpha_{v} ,\beta_{v} ]_{m}^{n} \) is an n-dimensional grid, and its output is a truth value of the property p(u). From the mathematical point of view, the computation of \( \tau_{p} (x,[\alpha_{v} ,\beta_{v} ]_{m}^{n} ) \) is a conversion from the quantity x into the quality p(u), so we call it the qualitative mapping from quantity into quality.

3 Attribute Grid Computer Based on Qualitative Mapping

It is obvious that, according to the relation between the input x and the output p(x) of qualitative mapping (2), a Qualitative Mapping logical unit, or electronic circuit unit, can easily be designed, and the feature extraction and feature conjunction of the attributes of an object can be implemented by it.

An example of a 2-ary Qualitative Mapping unit for judging the truth value of a 2-ary conjunction property, whose input is 2 variables, whose output is 9 conjunction properties, and whose qualitative criterion is a 3 × 3 grid, is shown in Fig. 2; it contains a number of feedback circuits whose purpose is the adjustment of the qualitative mapping criterion.
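A toy software version of such a 2-ary unit can be sketched as follows: two inputs, a 3 × 3 grid criterion, and 9 conjunction-property outputs, of which at most one fires. This is only an illustration; the feedback adjustment of the criterion described above is omitted, and the names are mine.

```python
# 2-ary qualitative mapping unit sketch: the criterion is a 3 x 3 grid given
# by 4 boundary values per axis; the output is the 9-dimensional truth vector
# of the conjunction properties, in row-major cell order.

def qm_unit_2ary(x1, x2, cuts1, cuts2):
    """cuts1/cuts2 are the 4 grid boundaries per axis (3 intervals each).
    Return a list of 9 truth values, one per grid cell; at most one is 1."""
    def interval_of(x, cuts):
        return next((k for k in range(3) if cuts[k] <= x < cuts[k + 1]), None)
    i, j = interval_of(x1, cuts1), interval_of(x2, cuts2)
    out = [0] * 9
    if i is not None and j is not None:
        out[3 * i + j] = 1
    return out

print(qm_unit_2ary(1.5, 2.5, [0, 1, 2, 3], [0, 1, 2, 3]))
# -> cell (1, 2) fires: [0, 0, 0, 0, 0, 1, 0, 0, 0]
```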

By the conjunction or disjunction of a number of Qualitative Mapping units, an Attribute Computing Network can be assembled. Not only can a series of artificial intelligence approaches, such as the Expert System, the Artificial Neural Network, and the Support Vector Machine, be simulated by the Attribute Computing Network, but they can also be transformed into each other by adjusting the integration mode (conjunction or disjunction), the hierarchical construction, the feedback learning of the connection weights, etc. (Fig. 2).

Fig. 2. Logic computing unit and attribute grid computer induced by qualitative mapping

It has been shown that, since qualitative mappings and artificial neurons can be defined in terms of each other, and a series of artificial intelligence approaches can be fused into qualitative mapping by various transformations of the qualitative criterion, the Attribute Network Computing based on Qualitative Mapping proposed here is a mathematical model in which many intelligent methods are fused.

4 Attribute Grid Computer for Pattern Recognition

The recognition of patterns that vary with time t or a variable x, such as an electrocardiogram (ECG), can be considered as the recognition of the graph of a function \( {\text{y}} = {\text{f}}\left( {\text{x}} \right) \). So it is a basic problem whether a method or a model for recognizing the graph of a function \( {\text{y}} = {\text{f}}\left( {\text{x}} \right) \) can be found (Fig. 3).

Fig. 3. Comparison between the pattern of computed values of the function y = f(x) and its graph

As shown in this paper, the computed value \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) \) of the function \( {\text{y}} = {\text{f}}\left( {\text{x}} \right) \) at a point \( {\text{x}}_{\text{j}} \) does not equal the value f(xj), that is, \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) \ne {\text{f}}\left( {{\text{x}}_{\text{j}} } \right) \), because the memory of a computer is finite. But the computed value \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) \) is taken as the function value f(xj); indeed, the pattern \( {\text{P}}\left( {\left\{ {{\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)} \right\}} \right) \) constructed from the set \( \left\{ {({\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right))} \right\} \) is taken to be the image of the function f(x). Why can this be done? What is the principle by which we can do it? If such a principle exists and could be used for general pattern recognition, the question is very important for us.

Definition 3.

Let X, Y be two sets. If for each \( {\text{x}} \in {\text{X}} \) there is a rule f and exactly one \( {\text{y}} \in {\text{Y}} \) such that \( {\text{y}} = {\text{f}}\left( {\text{x}} \right) \), then the rule f is a function from the set X to the set Y, denoted by \( {\text{f}}:{\text{X}} \to {\text{Y}} \); X is called the domain of the function f, and \( \{ {\text{y}}|{\text{y}} = {\text{f}}\left( {\text{x}} \right),{\text{x}} \in {\text{X}}\} \subseteq {\text{Y}} \) is called the range of f.

In particular, let the function \( {\text{f}}:{\text{X}} \to {\text{Y}} \) be a one-to-one correspondence between the domain \( \left[ {{\text{a}},{\text{b}}} \right] \subseteq {\text{X}} \) and the range \( \left[ {\text{c,d}} \right] \subseteq {\text{Y}} \); that is, if there are \( {\text{x}}_{1} ,{\text{x}}_{2} \in \left[ {\text{a,b}} \right] \subseteq {\text{X}} \) such that \( {\text{y}} = {\text{f}}\left( {{\text{x}}_{1} } \right) = {\text{f}}\left( {{\text{x}}_{2} } \right) \), then x1 = x2.

Let \( {\text{x}}_{\text{j}} \in \left[ {\text{a,b}} \right] \), \( {\text{j}} = 0, \ldots ,{\text{m}} \), be the m+1 points of [a, b] taken by the computer, \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) \) the computed value of the function \( {\text{y}} = {\text{f}}({\text{x}}) \) at the point xj, and \( {\text{P}}(\{ \left( {{\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)\} } \right) \) the pattern constructed from the set \( \left\{ {({\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right))} \right\} \) of ordered pairs of the variables {xj} and the computed values \( \left\{ {{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)} \right\} \) in the 2-dimensional coordinate system. Suppose there are q points \( \left\{ {{\text{x}}_{\text{k}} } \right\} \), \( {\text{k}} = {\text{j}}_{1} \ldots ,{\text{j}}_{\text{q}} \), whose computed values are not equal to their function values, that is, \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{k}} } \right) \ne {\text{f}}\left( {{\text{x}}_{\text{k}} } \right) \); then the pattern \( {\text{P}}\left( {\left\{ {{\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)} \right\}} \right) \) is not the image of the function \( {\text{y}} = {\text{f}}({\text{x}}) \), denoted by \( {\text{P}}\left( {\{ {\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)\} } \right) \ne {\text{P}}\left( {{\text{f}}\left( {\text{x}} \right)} \right) \).

From the point of view of program design and computer algorithms, the step length \( \Delta {\text{x}}_{\text{j}} = {\text{x}}_{{{\text{j}} + 1}} - {\text{x}}_{\text{j}} \) must be chosen first, before the values of the function \( {\text{y}} = {\text{f}}({\text{x}}) \) are computed; if the step \( \Delta {\text{x}}_{\text{j}} \) is too long to refine the result, it must be shortened. Second, an error threshold θ that stops the operation of the computer must be selected by the designer, such that if \( \left| {{\text{y}}^{{({\text{n}} + 1)}} \left( {{\text{x}}_{\text{j}} } \right) - {\text{y}}^{{({\text{n}})}} \left( {{\text{x}}_{\text{j}} } \right)} \right| <\uptheta \) after the n-th computation, and we get \( \left| {{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) - {\text{f}}\left( {{\text{x}}_{\text{j}} } \right)} \right| <\upvarepsilon \), where \( \upvarepsilon > 0 \) is an arbitrarily small positive number, then we set \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) = {\text{y}}^{{({\text{n}} + 1)}} \left( {{\text{x}}_{\text{j}} } \right) \) and the machine stops.
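The stopping rule just described can be sketched in code. The paper leaves the iterated function abstract, so a Newton iteration for \( \sqrt{2} \) is used here purely as my illustrative choice: iterate until successive computed values differ by less than θ, then accept the last value as the computed value \( y'(x_j) \) standing in for \( f(x_j) \).

```python
# Iterate a step function until |y^(n+1) - y^(n)| < theta, the designer's
# stopping threshold, then return y' = y^(n+1) as the computed value.

import math

def iterate_until_stable(step, y0, theta=1e-10, max_iter=100):
    y = y0
    for _ in range(max_iter):
        y_next = step(y)
        if abs(y_next - y) < theta:   # |y^(n+1) - y^(n)| < theta: machine stops
            return y_next
        y = y_next
    return y

# Newton step for y^2 = 2, starting from y = 1
y_prime = iterate_until_stable(lambda y: 0.5 * (y + 2.0 / y), 1.0)
print(abs(y_prime - math.sqrt(2)) < 1e-9)  # True: |y' - f(x)| < epsilon
```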

From the above we can see that the basic principle, or theorem, explaining why the computed value can be taken to equal the function value, \( {\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) = {\text{f}}\left( {{\text{x}}_{\text{j}} } \right) \), and why the pattern \( {\text{P}}\left( {\{ {\text{x}}_{\text{j}} ,{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right)\} } \right) \) can be considered to be the image of f(x), is the following.

Basic Theorem.

For two given arbitrarily small positive numbers \( \updelta > 0 \) and \( \upvarepsilon > 0 \), and all j, j = 0, …, m, if there are \( \delta_{m} = \mathop {\hbox{max} }\limits_{j = 0}^{m} \{ |x_{j + 1} - x_{j} |\} {\kern 1pt} \) and \( \varepsilon_{m} = \mathop {\hbox{max} }\limits_{j = 0}^{m} \{ |y^{{\prime }} (x_{j} ) - f(x_{j} )|\} \) such that \( \left| {{\text{x}}_{{{\text{j}} + 1}} - {\text{x}}_{\text{j}} } \right| <\updelta_{\text{m}} <\updelta \) and \( \left| {{\text{y}}^{{\prime }} \left( {{\text{x}}_{\text{j}} } \right) - {\text{f}}\left( {{\text{x}}_{\text{j}} } \right)} \right| <\upvarepsilon_{m} <\upvarepsilon \), then we get the following limit

$$ \mathop {\lim }\limits_{m \to \infty } y^{{\prime }} (x_{0} , \cdots ,x_{m} ) = (y^{{\prime }} (x_{0} ), \cdots ,y^{{\prime }} (x_{m} )) = (f(x_{0} ), \cdots ,f(x_{m} )) = y(x) $$
(4)

This shows that, because the program designer has constructed a new coordinate system, whose axes \( y_{j} |_{{x = x_{j} }} {\kern 1pt} \) come from the lines \( {\text{x}} = {\text{x}}_{\text{j}} \) of the X-Y coordinate system, and a hypercube \( {\text{N}}({\text{f}}\left( {\text{x}} \right),\upvarepsilon) = {\text{N}}({\text{f}}\left( {{\text{x}}_{ 0} } \right),\upvarepsilon_{0} ) \times \ldots \times {\text{N}}({\text{f}}\left( {{\text{x}}_{\text{m}} } \right),\upvarepsilon_{\text{m}} ) \), whose components come from the same lines, the qualitative mapping (3), which is a model for pattern recognition, has also been given.

Let ECGu be the electrocardiogram of u. Since it can be considered as a function \( {\text{y}}:\left[ {{\text{t}}_{ 0} ,{\text{t}}_{\text{m}} } \right] \to {\text{Y}} \) from the interval \( \left[ {{\text{t}}_{ 0} ,{\text{t}}_{\text{m}} } \right] \) to the current set Y, for any \( {\text{t}} \in \left[ {{\text{t}}_{0} ,{\text{t}}_{\text{m}} } \right] \) there is a \( {\text{y}}_{\text{u}} \in {\text{Y}} \) such that \( {\text{t}} \to {\text{y}}_{\text{u}} \left( {\text{t}} \right) \), and the coordinates of any point of ECGu are \( \left( {{\text{t}},{\text{y}}_{\text{u}} \left( {\text{t}} \right)} \right) \). Let t = tj, j = 0, …, m, be a sampling series of \( \left[ {{\text{t}}_{0} ,{\text{t}}_{\text{m}} } \right] \); then an (m+1)-dimensional vector \( {\text{y}}_{\text{u}} \left( {{\text{t}}_{0} , \ldots ,{\text{t}}_{\text{m}} } \right) = \left( {{\text{y}}_{\text{u}} \left( {{\text{t}}_{0} } \right), \ldots ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{m}} } \right)} \right) \) can be obtained from the m+1 values of the function \( {\text{y}} = {\text{y}}_{\text{u}} \left( {\text{t}} \right) \).

As shown in Fig. 4, let \( {\text{P}}\left( {\{ \left( {{\text{t}}_{\text{j}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{j}} } \right)} \right)\} } \right) = (\left( {{\text{t}}_{0} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{0} } \right)} \right), \ldots ,\left( {{\text{t}}_{\text{m}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{m}} } \right)} \right)) \) be the pattern constructed from the set \( \left\{ {\left( {{\text{t}}_{\text{j}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{j}} } \right)} \right)} \right\} \), whose components are the m+1 points of \( {\text{ECG}}_{\text{u}} = {\text{y}}_{\text{u}} \left( {\text{t}} \right) \), in the 2-dimensional coordinate system T-Y.
When m goes to infinity, the vector \( {\text{y}}_{\text{u}} \left( {{\text{t}}_{0} , \ldots ,{\text{t}}_{\text{m}} } \right) \) tends to \( {\text{y}}_{\text{u}} \left( {\text{t}} \right) \), i.e., \( {\text{y}}_{\text{u}} \left( {{\text{t}}_{0} , \ldots ,{\text{t}}_{\text{m}} } \right) = \left( {{\text{y}}_{\text{u}} \left( {{\text{t}}_{0} } \right), \ldots ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{m}} } \right)} \right) \approx {\text{y}}_{\text{u}} \left( {\text{t}} \right) \), and the pattern \( {\text{P}}\left( {\{ \left( {{\text{t}}_{\text{j}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{j}} } \right)} \right)\} } \right) \) approximately equals the electrocardiogram ECGu, so we get \( {\text{ECG}}_{\text{u}} \approx {\text{P}}\left( {\{ \left( {{\text{t}}_{\text{j}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{j}} } \right)} \right)\} } \right) = (\left( {{\text{t}}_{0} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{0} } \right)} \right), \ldots ,\left( {{\text{t}}_{\text{m}} ,{\text{y}}_{\text{u}} \left( {{\text{t}}_{\text{m}} } \right)} \right)) \).

Fig. 4. Envelope of a collection of normal electrocardiograms (ECG) converted into Hilbert space (Color figure online)

This is to say that each point of ECGu can be represented as a pair of t and \( {\text{y}}_{\text{u}} \left( {\text{t}} \right) \), namely \( \left( {{\text{t}},{\text{y}}_{\text{u}} \left( {\text{t}} \right)} \right) \).

Let \( {\text{E}} = \left\{ {{\text{ECG}}_{\text{i}} ,{\text{i}} = 1, \ldots ,{\text{n}}} \right\} \) be a set of normal electrocardiograms \( {\text{ECG}}_{\text{i}} \), and let \( {\text{Top}}\left( {{\text{ECG}}_{\text{i}} } \right) = {\text{Max}}\left\{ {{\text{ECG}}_{\text{i}} } \right\} \) and \( {\text{Down}}\left( {{\text{ECG}}_{\text{i}} } \right) = {\text{Min}}\left\{ {{\text{ECG}}_{\text{i}} } \right\} \) be, respectively, the upper and lower limits of the n \( \left\{ {{\text{ECG}}_{\text{i}} } \right\} \), and \( {\text{N}}\left( {{\text{ECG}}_{\text{i}} } \right) \) the neighborhood whose boundaries are \( {\text{Top}}\left( {{\text{ECG}}_{\text{i}} } \right) \) (red line in Fig. 4), \( {\text{Down}}\left( {{\text{ECG}}_{\text{i}} } \right) \) (green line), \( {\text{t}} = {\text{t}}_{0} \), and \( {\text{t}} = {\text{t}}_{\text{m}} \). If \( ({\text{t}}_{\text{j}} ,\upbeta_{\text{j}} ) \) and \( ({\text{t}}_{\text{j}} ,\upalpha_{\text{j}} ) \) are the current values of Top(ECGi) and Down(ECGi) at the time \( {\text{t}} = {\text{t}}_{\text{j}} \), then \( [\upalpha_{\text{j}} ,\upbeta_{\text{j}} ] \) is the qualitative criterion for judging whether the value ECGu (tj) of the electrocardiogram of u at the time \( {\text{t}} = {\text{t}}_{\text{j}} \) is normal or not, and we get the following qualitative mapping

$$ \tau (ECG_{u} (t_{j} ),[\alpha_{j} ,\beta_{j} ]) = ECG_{u} (t_{j} )\mathop \in \limits_{?} [\alpha_{j} ,\beta_{j} ] = \left\{ {\begin{array}{*{20}l} 1 \hfill & {ECG_{u} (t_{j} ) \in [\alpha_{j} ,\beta_{j} ]} \hfill \\ 0 \hfill & {ECG_{u} (t_{j} ) \notin [\alpha_{j} ,\beta_{j} ]} \hfill \\ \end{array} } \right. $$
(5)

In Fig. 4, we show that in the new coordinate system, a Hilbert space whose axes are the sampling lines y|t=tj, the qualitative criterion is the hypercube [α, β] = [α1, β1] × … × [αm, βm], so that whether the electrocardiogram ECGu of u is normal or not can be represented by the following qualitative mapping.

$$ \tau (ECG_{u} (t),[\alpha ,\beta ]) = ECG_{u} (t)\mathop \in \limits_{?} [\alpha ,\beta ] = \left\{ {\begin{array}{*{20}l} 1 \hfill & {ECG_{u} (t) \in [\alpha ,\beta ]} \hfill \\ 0 \hfill & {ECG_{u} (t) \notin [\alpha ,\beta ]} \hfill \\ \end{array} } \right. $$
(6)

Example 1.

A training algorithm and a recognition algorithm based on the Attribute Grid Computer for classifying electrocardiograms are presented. Taking 600 normal cardiograms as an example, first, 1000 amplitudes Aj(cari) of each electrocardiogram cari, \( {\text{i}} = 1, \ldots ,600 \), \( {\text{j}} = 1, \ldots 1000 \), are sampled. Let \( \upalpha_{\text{j}} = { \hbox{min} }\{ {\text{A}}_{\text{j}} ({\text{car}}_{\text{i}} )\} \) denote the lower threshold of the 600 normal electrocardiograms and \( \upbeta_{\text{j}} = { \hbox{max} }\{ {\text{A}}_{\text{j}} ({\text{car}}_{\text{i}} )\} \) the upper threshold; then a strip between the two red lines is described by the 1000 qualitative criteria \( [\upalpha_{\text{j}} ,\upbeta_{\text{j}} ] \), as shown in Fig. 5.
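The training and recognition steps of Example 1 can be sketched at toy size (3 training ECGs of 5 sample points instead of 600 × 1000; all names are illustrative): training takes the per-sample-point min/max envelope over the normal set, and recognition declares an ECG normal exactly when every amplitude stays inside its criterion \( [\alpha_j, \beta_j] \).

```python
# Envelope training and strip-membership recognition, as in Example 1,
# shrunk to toy size.

def train_envelope(normal_ecgs):
    """Per-sample-point (alpha_j, beta_j) = (min, max) over the normal set."""
    return [(min(col), max(col)) for col in zip(*normal_ecgs)]

def is_normal(ecg, envelope):
    """Normal iff every amplitude lies inside its criterion [alpha_j, beta_j]."""
    return all(a <= y <= b for y, (a, b) in zip(ecg, envelope))

normal = [[0.0, 1.0, 0.4, 0.2, 0.0],
          [0.1, 1.2, 0.5, 0.1, 0.0],
          [0.0, 1.1, 0.3, 0.2, 0.1]]
envelope = train_envelope(normal)

print(is_normal([0.05, 1.05, 0.45, 0.15, 0.05], envelope))  # True
print(is_normal([0.05, 2.00, 0.45, 0.15, 0.05], envelope))  # False: breaks strip
```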

Fig. 5. Classification of normal electrocardiograms (ECG) by the attribute grid computer

Second, as shown in Fig. 5, a qualitative criterion, the 1000-dimensional parallelepiped \( [\upalpha,\upbeta] = [\upalpha_{1} ,\upbeta_{1} ] \times \ldots \times [\upalpha_{1000} ,\upbeta_{1000} ] \) for identifying normal electrocardiograms, is created by transforming the criteria \( [\upalpha_{\text{j}} ,\upbeta_{\text{j}} ] \) in the sampling space into the Hilbert space spanned by the 1000 sub-qualitative mappings in the feature space.

We see that a normal cardiogram sandwiched in the strip of the qualitative criterion is transformed into a point in the 1000-dimensional parallelepiped \( [\upalpha,\upbeta] = [\upalpha_{1} ,\upbeta_{1} ] \times \ldots \times [\upalpha_{1000} ,\upbeta_{1000} ] \). But at least one point of a deviant cardiogram breaks through the strip of the qualitative criterion, as shown in Fig. 5, so the deviant cardiogram is identified as abnormal by the qualitative mapping.

Distinguishing between normal and abnormal ECGs is a typical classification operation. Identifying, from an abnormal or faulty electrocardiogram, which disease the patient is suffering from is a diagnostic or identification operation. Pattern recognition is the most basic function of the human brain, and the success of pattern recognition by neural-network-based deep learning is considered a breakthrough achievement of artificial intelligence. Since the basic function of an artificial neuron is classification, it is also called a classifier in general textbooks. Deep learning therefore realizes the function of pattern recognition by adjusting the connection weight parameters between different levels and different classifiers (neurons).

5 Relation Between Probability and (Fuzzy) Degree of Conversion Function

Let X be a set of N normal electrocardiograms \( y^{i} \left( t \right) \), as shown in Fig. 6(a), and let \( y\left( {t_{j} } \right) \) be the value of an electrocardiogram \( y = y\left( t \right) \) at the time \( t = t_{j} ,j = 1, \ldots ,m \). The green dot \( \alpha_{j} \) and the red dot \( \beta_{j} \) are, respectively, the lowest and the highest points of the set at the time \( t = t_{j} \), and the green and red lines formed by \( \left\{ {\alpha_{j} } \right\} \) and \( \left\{ {\beta_{j} } \right\} \) constitute the envelope of X. Let \( e_{j} \left( {t = t_{j} } \right) \) be the sampling line of the electrocardiogram \( y = y\left( t \right) \) at time \( t = t_{j} \), and \( \left\{ {e_{j} } \right\} \) the set of m sampling lines, as shown in Fig. 6(b). A coordinate transformation F turns the sampling lines of this set, made perpendicular to each other \( \left( {e_{j} \bot e_{k} ,k \ne j} \right) \), into m-dimensional coordinates. On the electrocardiogram sampling line \( e_{j} \left( {t = t_{j} } \right) \), \( \left[ {\alpha_{j} ,\beta _{j} } \right] \) is the qualitative criterion for judging whether the value \( y\left( {t_{j} } \right) \) is normal, i.e., the value \( y\left( {t_{j} } \right) \) is normal if and only if \( y\left( {t_{j} } \right) \in \left[ {\alpha_{j} ,\beta _{j} } \right] \). Under the coordinate transformation F, each \( \left[ {\alpha_{j} ,\beta _{j} } \right] \subseteq e_{j} \left( {t = t_{j} } \right) \) is carried onto the j-th coordinate axis, and the Cartesian product of the m qualitative criteria \( \left[ {\alpha_{j} ,\beta _{j} } \right] \) forms the m-dimensional cuboid \( \left[ {\alpha ,\beta } \right] = \left[ {\alpha_{1} ,\beta _{1} } \right] \times \ldots \times \left[ {\alpha_{m} ,\beta _{m} } \right] \).
And under the coordinate transformation F, the m values \( \left\{ {y\left( {t_{j} } \right)} \right\} \) constituting the normal electrocardiogram y = y(t) are converted into \( y_{j} = F\left( {y\left( {t_{j} } \right)} \right) \), respectively, and form an m-dimensional vector \( y = y\left( {y_{1} , \ldots ,y_{m} } \right) \in \left[ {\alpha ,\beta } \right] \) embedded in the cuboid \( \left[ {\alpha ,\beta } \right] \).

Fig. 6. Classification of normal electrocardiograms (ECG) by the attribute grid computer (Color figure online)

Since the coordinate transformation F can be regarded as transforming the continuous function \( y = f\left( x \right) \) into an m-dimensional vector, or point, in Hilbert space, the analysis proceeds through the sampling lines \( \left\{ {e_{j} \left( {t = t_{j} } \right)} \right\} \) at each moment \( t = t_{j} \). The m-dimensional coordinate system is a Hilbert space \( {\text{H}}\left( {e_{j} } \right) \), and the transform F maps the pattern space \( X \) into the Hilbert space \( {\text{H}}\left( {e_{j} } \right) \).

Then the judgment problem of whether the electrocardiogram \( y = y\left( t \right) \) is normal is converted, under the Hilbert transform F, into the problem of whether the vector \( y = \left( {y_{1} , \ldots ,y_{m} } \right) \), which corresponds in that space to the set \( \left\{ {y\left( {t_{j} } \right)} \right\} \) of m values constituting the electrocardiogram \( y = y\left( t \right) \), belongs to the qualitative criterion \( \left[ {\alpha ,\beta } \right] \), i.e., the problem \( y = y\left( {y_{1} , \ldots ,y_{m} } \right) \in \left[ {\alpha ,\beta } \right] \). Therefore, we have the following proposition:

Corollary.

The function \( y = y\left( t \right) \) is a normal electrocardiogram, if and only if, \( y = y\left( {y_{1} , \ldots ,y_{m} } \right) \in \left[ {\alpha ,\beta } \right] \).

In order to obtain a finer recognition algorithm than plain classification, one of the simplest ways is to give a refinement map, or operator, \( F_{1} \) which divides each criterion \( \left[ {\alpha_{j} ,\beta _{j} } \right] \) into n sub-criteria, \( F_{1} \left( {\left[ {\alpha_{j} ,\beta _{j} } \right]} \right) = \bigcup\nolimits_{k = 1}^{n} {\left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right]} \), so that the electrocardiogram set (envelope diagram) is divided into a grid of n × m small lattices, as shown in Fig. 6(a), namely:

$$ \left[ {\alpha ,\beta } \right] = \left\{ {\left[ {\alpha_{{1_{1} }} , \beta_{{1_{1} }} } \right] \times \ldots \times \left[ {\alpha_{{1_{n} }} , \beta_{{1_{n} }} } \right]} \right\} \times \ldots \times \left\{ {\left[ {\alpha_{{m_{1} }} , \beta_{{m_{1} }} } \right] \times \ldots \times \left[ {\alpha_{{m_{n} }} , \beta_{{m_{n} }} } \right]} \right\} $$
(7)

Let \( y_{j} = F\left( {y\left( {t_{j} } \right)} \right) \in \left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \subseteq \left[ {\alpha_{j} , \beta_{j} } \right] \) be the image of \( y\left( {t_{j} } \right) \in \left[ {\alpha_{j} ,\beta _{j} } \right] \) under the refined mapping \( F\left( { = F_{1} \circ F} \right) \). Let \( \Omega = \{ \omega |\left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \subseteq {\text{Y}}\} \) be the set of random events that a normal electrocardiogram falls into the interval \( \left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \), let N be the total number of normal ECGs, and let \( N_{{j_{k} }} \) be the number of normal ECGs that fall, at the time \( t = t_{j} \), into the sub-grid \( \left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \subseteq \left[ {\alpha_{j} ,\beta _{j} } \right] \). Then the probability that a normal ECG \( y^{s} \left( t \right) \) falls into the sub-grid \( \left[ {\alpha_{jk} , \beta_{jk} } \right] \) is:

$$ p\left( {\left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right]} \right) = p_{k} \left( {j^{s} } \right) = \frac{{N_{{j_{k} }}^{s} }}{N} $$
(8)
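The empirical probability of Eq. (8) can be sketched as a counting routine over the sub-grid at a fixed time \( t_{j} \) (names and the half-open-cell convention are assumptions for illustration):

```python
# Assumed helper: count how many of the N normal ECG sample values at a
# fixed time t_j fall into each sub-grid cell, giving p_k(j) = N_{j_k}/N.
def cell_probabilities(values, cells):
    """values: the N sample values y^s(t_j); cells: the n sub-intervals for t_j."""
    N = len(values)
    counts = [0] * len(cells)
    for v in values:
        for k, (lo, hi) in enumerate(cells):
            # half-open cells avoid double counting on shared boundaries
            if lo <= v < hi:
                counts[k] += 1
                break
    return [c / N for c in counts]

p = cell_probabilities([0.1, 0.2, 0.6, 0.8],
                       [(0.0, 0.25), (0.25, 0.5), (0.5, 0.75), (0.75, 1.0)])
```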

Then, the envelope refinement map \( F\left( { = F_{1} \circ F} \right) \) induces a probability map \( G:\Omega \to {\text{H}}\left( {e_{j} } \right) \), which assigns the probability \( p_{k} \left( {j^{s} } \right) = \frac{{N_{{j_{k} }}^{s} }}{N} \) with which a normal ECG falls into the grid to the corresponding sub-grid of the space, i.e.:

(9)

If \( \left\{ {y\left( {t_{j} } \right)} \right\} \) is the set of m values that constitute Zhang San’s electrocardiogram, then \( F\left(\left\{ {y\left( {t_{j} } \right)} \right\} \right)= y = y\left( {y_{1} , \ldots ,y_{m} } \right) \in {\text{H}}\left( {e_{j} } \right) \) is the image of Zhang San’s electrocardiogram under F. It can be seen from Eq. (8) that the probability \( p_{k} \left( j \right) = \frac{{N_{{j_{k} }} }}{N} \) is mapped, under the composite map \( H = G \circ F \), to the coefficient of \( y_{j} \in \left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \subseteq \left[ {\alpha_{j} ,\beta _{j} } \right] \subseteq {\text{H}}\left( {e_{j} } \right) \); that is, between X and \( {\text{H}}\left( {e_{j} } \right) \) a composite map is induced:

$$ H = G \circ F\left( {y\left( t \right)} \right) = \sum\nolimits_{j = 1}^{m} {p_{k} \left( j \right)} y_{j} = \sum\nolimits_{j = 1}^{m} {\frac{{N_{{j_{k} }} }}{N}y_{j} } $$
(10)
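The composite map of Eq. (10) amounts to a probability-weighted sum of the refined values; a minimal sketch (with illustrative names) is:

```python
# Assumed helper: evaluate H = G o F of Eq. (10), weighting each refined
# value y_j by the empirical probability p_k(j) of the cell it falls into.
def composite_map(ys, ps):
    """ys: refined values y_1..y_m; ps: matching probabilities p_k(1)..p_k(m)."""
    return sum(p * y for p, y in zip(ps, ys))

h = composite_map([1.0, 2.0, 3.0], [0.5, 0.25, 0.25])
```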

Similarly, let \( y^{{\prime }} = y^{{\prime }} \left( {y_{1}^{{\prime }} , \ldots ,y_{m}^{{\prime }} } \right) \) be the image of Li Si’s electrocardiogram. By Eq. (10), its image under the composite map is:

$$ H\left( {y^{{\prime }} \left( t \right)} \right) = G \circ F\left( {y^{{\prime }} \left( t \right)} \right) = \sum\nolimits_{j = 1}^{m} {p_{k}^{{\prime }} \left( j \right)y_{j}^{{\prime }} } = \sum\nolimits_{j = 1}^{m} {\frac{{N_{{j_{k} }}^{{\prime }} }}{N}y_{j}^{{\prime }} } $$
(11)

Where \( p^{{\prime }} \left( {j_{k} } \right) = \frac{{N_{{j_{k} }}^{{\prime }} }}{N} \) is the probability that the electrocardiogram \( y^{{\prime }} \left( t \right) \) of Li Si falls on the sub-grid \( \left[ {\alpha_{{j_{k} }} ,\beta _{{j_{k} }} } \right] \) at the time \( t = t_{j} \), and it is mapped to the coefficient of \( y_{j}^{{\prime }} \) by the composite mapping \( H = G \circ F \).

Therefore, as long as the two probability vectors satisfy \( p^{{\prime }} = p^{{\prime }} \left( {p_{1}^{{\prime }} , \ldots ,p_{m}^{{\prime }} } \right) \ne p = p\left( {p\left( 1 \right), \ldots ,p\left( m \right)} \right) \), Zhang San’s and Li Si’s electrocardiograms can be distinguished. In this way, we obtain a judgment criterion (or algorithm) that distinguishes two different ECGs:

Proposition (or algorithm): Zhang San’s and Li Si’s electrocardiograms \( y = y\left( {y_{1} , \ldots ,y_{m} } \right) \) and \( y^{{\prime }} = y^{{\prime }} \left( {y_{1}^{{\prime }} , \ldots ,y_{m}^{{\prime }} } \right) \) are different if and only if \( p^{{\prime }} = p^{{\prime }} \left( {p_{1}^{{\prime }} , \ldots ,p_{m}^{{\prime }} } \right) \ne p = p\left( {p\left( 1 \right), \ldots ,p\left( m \right)} \right) \).
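The proposition above can be sketched as a decision rule over the two probability vectors; note that the tolerance parameter below is a practical assumption added here, whereas the paper states exact inequality:

```python
# Sketch of the distinguishing criterion: two ECGs are judged different
# iff their probability vectors p and p' differ (beyond a tolerance,
# which is our addition for floating-point robustness).
def ecgs_differ(p, p_prime, tol=1e-9):
    return any(abs(a - b) > tol for a, b in zip(p, p_prime))

same = ecgs_differ([0.5, 0.25, 0.25], [0.5, 0.25, 0.25])
diff = ecgs_differ([0.5, 0.25, 0.25], [0.4, 0.35, 0.25])
```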

6 From Classification Model to Recognition Model

Let \( y_{j}^{*} = Ey_{j} \) be the mathematical expectation of the values of all electrocardiograms at the moment \( t = t_{j} \), i.e.:

$$ Ey_{j} = \sum\nolimits_{s = 1}^{N} {\frac{{N_{{j_{k} }}^{s} }}{N}y_{j}^{s} } $$
(12)
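Under one reading of Eq. (12), the reference value \( y_{j}^{*} \) is a probability-weighted mean of the sample values at \( t_{j} \), each weighted by the frequency \( \frac{{N_{{j_{k} }}^{s} }}{N} \) of the cell it falls into. A minimal sketch with illustrative names:

```python
# Assumed reading of Eq. (12): y*_j as a weighted mean of the sample
# values y^s_j, with weights N^s_{j_k}/N taken from the sub-grid counts.
def reference_value(values, weights):
    """values: y^s_j for s = 1..N; weights: the frequency of each sample's cell."""
    return sum(w * v for w, v in zip(weights, values))

y_star = reference_value([1.0, 1.2, 0.8], [0.4, 0.3, 0.3])
```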

If all the values \( y\left( {t_{j} } \right) \in \left[ {\alpha_{j} , \beta_{j} } \right] \) are evaluated, the mathematical expectation \( y_{j}^{*} = Ey_{j} \) can be regarded as the highest-scoring (or most ideal) value among all normal electrocardiograms at the time \( t = t_{j} \). In terms of the fuzzy (set) membership degree, \( y_{j}^{*} = Ey_{j} \) belongs to the most normal ECG atlas with membership degree equal to 1, i.e.: \( \mu_{norm} \left( {y_{j}^{*} } \right) = 1 \). In terms of the quantity-quality conversion degree function, the mathematical expectation \( y_{j}^{*} = Ey_{j} \) can be called the qualitative reference value of the most normal electrocardiogram. Thus, \( y^{*} = y^{*} \left( {y_{1}^{*} , \ldots ,y_{m}^{*} } \right) \) can be regarded as the most normal electrocardiogram. Conversely, as long as some value \( y_{j} \) deviates from \( y_{j}^{*} = Ey_{j} \), i.e. the distance \( d\left( {y_{j} ,y_{j}^{*} } \right) = \left| {y_{j} - y_{j}^{*} } \right| \ne 0 \), the electrocardiogram \( y = y\left( t \right) \) can be characterized as “non-optimal”.

In other words, only the electrocardiogram whose value is \( y_{j} = y_{j}^{*} = Ey_{j} \) at all times \( t = t_{j} ,j = 1, \ldots ,m \), i.e. \( y^{*} = y^{*} \left( {y_{1}^{*} , \ldots ,y_{m}^{*} } \right) \), has membership degree \( \mu_{norm} (y^{*} ) = 1 \) in the normal ECG set; otherwise, its membership degree is not equal to 1. It is not difficult to see that this provides us with an idea or method for designing a membership function.

If \( y = y\left( t \right) \) is the electrocardiogram of Zhang San, with value \( y_{j} \left( { \ne y_{j}^{*} } \right) \) at the moment \( t = t_{j} ,j = 1, \ldots ,m \), let \( H\left( {y\left( t \right)} \right) = G \circ F\left( {y\left( t \right)} \right) = \sum\nolimits_{j = 1}^{m} {p\left( {j_{k} } \right)y_{j} } \) and \( H\left( {y^{*} \left( t \right)} \right) = G \circ F\left( {y^{*} \left( t \right)} \right) = \sum\nolimits_{j = 1}^{m} {p^{*} \left( {j_{k} } \right)y_{j}^{*} } \) be the images in Hilbert space of the electrocardiogram y = y(t) and of the mathematical expectation \( y^{*} = y^{*} \left( t \right) \), respectively, so that \( \left[ {H\left( {y\left( t \right)} \right) - H\left( {y^{*} \left( t \right)} \right)} \right] \) is the distance between them. If \( \sigma \left( y \right) = \sqrt {\left[ {H\left( {y\left( t \right)} \right) - H\left( {y^{*} \left( t \right)} \right)} \right]^{2} } = \sqrt {\sum\nolimits_{j = 1}^{m} {[p\left( {j_{k} } \right) - p^{*} \left( {j_{k} } \right)]^{2} } } \) is taken as the variance, then the (fuzzy) membership degree with which the electrocardiogram \( y = y\left( t \right) \) belongs to the normal electrocardiogram set can be defined as follows:

$$ \mu_{norm} \left( y \right) = \mu_{norm} \left[ {\sigma \left( y \right)} \right] = \frac{1}{{\sqrt {2\pi } \sigma \left( y \right)}}e^{{ - \frac{{\left( {y - y^{*} } \right)^{2} }}{{2\sigma^{2} }}}} $$
(13)

Obviously, the Gaussian function (13) is a fuzzy membership function.
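A minimal sketch of Eq. (13), with illustrative variable names: \( \sigma(y) \) is computed as the distance between the probability vectors of y and of the reference \( y^{*} \), and the membership degree is the Gaussian of the value deviation. Note that the case \( \sigma(y) = 0 \) (the reference itself, where the paper assigns \( \mu_{norm}(y^{*}) = 1 \)) would need special-casing in practice:

```python
import math

# Sketch of Eq. (13): sigma(y) from the probability vectors, then the
# Gaussian membership degree of the value deviation (y - y*).
def sigma(p, p_star):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, p_star)))

def membership(y, y_star, p, p_star):
    s = sigma(p, p_star)  # assumed nonzero; sigma = 0 needs a special case
    return (1.0 / (math.sqrt(2 * math.pi) * s)) * math.exp(-((y - y_star) ** 2) / (2 * s * s))

mu = membership(1.1, 1.0, [0.5, 0.5], [0.6, 0.4])
```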