Introduction

A stochastic process is a probability model employed to describe the evolution in time of a random event. Hidden Markov models (HMMs) constitute a rich and flexible class of stochastic processes that have been used successfully in a broad range of applied problems. The hidden Markov model (HMM) is an extension of a Markov chain whose states are hidden. It is a doubly stochastic model \(\lbrace (H_k,O_k) \rbrace \), where \(\lbrace H_k \rbrace \) denotes the hidden state sequence, which is a finite-state Markov chain. Given \(\lbrace H_k \rbrace \), the observations \(O_k\) are conditionally independent, and the conditional distribution of \(O_n\) depends on \(\lbrace H_k \rbrace \) only through \(H_n\). HMMs have many applications in different fields such as speech recognition [1], hand gesture recognition [2], source coding [3], seismic hazard assessment [4], traffic prediction [5], wireless networks [6,7,8], protein structure prediction [9] and finance [10]. The semi-hidden Markov models (SHMMs) are stochastic models related to HMMs; they have been discussed recently in [11, 12]. A principal characteristic of these models is the notion of statistical inertia, which allows the generation and analysis of observation sequences containing frequent runs. The SHMMs cause a substantial reduction in the model parameter set; therefore, in most cases these models are computationally more efficient than HMMs. Since wireless channels contain long stretches of error-free transmission, the corresponding runlength vector is much shorter than the original binary data, and the simulation runtime decreases considerably.

In this paper, we first give the definition of the SHMM and the modified Baum–Welch algorithm for its parameter estimation, and then discuss the order estimation criteria for these models. Next, we present an application of the SHMM in wireless communication for modeling the error sequence generated in a CDMA system. Finally, conclusions are given.

The semi-hidden Markov model

The SHMMs are advanced stochastic models connected to HMMs. They can model the behavior of symbolic sequences with inertia, memory and long runs. If the state change points are known, the sequence of states can be recovered from these sequences; hence, the semi-hidden Markov model (SHMM) is not completely hidden. The SHMMs switch among distinct states and generate symbols of the alphabet in the same way as the HMMs. The input sequence is given in terms of a runlength vector, and therefore its length is greatly reduced, especially when the sequences contain long stretches of identical symbols.

In this section, we present a modified version of the forward–backward Baum–Welch algorithm (BWA) that is employed to estimate the parameters of the SHMMs. Figure 1 shows the flowgraph of the SHMM.

Fig. 1 The semi-hidden Markov flowgraph

The parameters of SHMMs can be estimated by algorithms similar to the Baum–Welch Algorithm (BWA). Let \({\mathbf{{x}}^T_1} \) be the observation sequence, described by the sequence of distinct observation values \(X_1,X_2,\ldots ,X_\mathrm{N} \) and the numbers of repetitions \(m_1,m_2,\ldots ,m_\mathrm{N} \). Let \(f({\mathbf{{x}}^T_1})\) be the multidimensional distribution of the process. We can write the likelihood function as

$$\begin{aligned} g({\mathbf{{x}}^T_1},\tau )=h({{\varvec{\upsilon }}^N_1,\tau )}= {\varvec{\pi }}{\prod _{i=1}^{N}} \mathbf{{P}}^{m_i}(X_i,\tau )\mathbf{{1}} \end{aligned}$$
(1)

where \({\upsilon }_i=(X_i, m_i) \), N is the number of subsegments containing identical observations, and \(\tau \) is the model parameter vector, which is omitted from equations in which its actual value is not important. To calculate \(\mathbf{{P}}^{m}(X)\) using the fast exponentiation algorithm, we express the power as a binary number

$$\begin{aligned} m=b_1+2b_2+\cdots +2^{k-1}b_{k} \end{aligned}$$
(2)

\(\mathbf{{P}}^{m}(X)\) is found as a result of the following recursion:

Algorithm:

Initialize:

$$\begin{aligned} \mathbf{{R_1}}=\mathbf{{P}}(X)\,\,\,\mathbf{{Q}}_0=\mathbf{{I}} \end{aligned}$$
(3)

For \(i=1,2,\ldots ,k \)

Begin

$$\begin{aligned} \mathbf{{Q}}_i=\mathbf{{Q}}_{i-1}{\mathbf{{R}}^{b_i}_i}, \,\,\,\mathbf{{R}}_{i+1}={\mathbf{{R}_i}^2} \end{aligned}$$
(4)

End
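
As an illustration, the following minimal Python sketch implements this square-and-multiply recursion for a block \(\mathbf{{P}}(X)\); the helper name fast_matrix_power is hypothetical and not part of the paper.

```python
import numpy as np

def fast_matrix_power(P, m):
    """Compute P**m by binary (square-and-multiply) exponentiation, cf. Eqs. (2)-(4)."""
    Q = np.eye(P.shape[0])        # Q_0 = I
    R = np.array(P, dtype=float)  # R_1 = P(X)
    while m > 0:
        if m & 1:                 # current binary digit b_i equals 1
            Q = Q @ R             # Q_i = Q_{i-1} R_i
        R = R @ R                 # R_{i+1} = R_i^2
        m >>= 1
    return Q
```

This reduces the cost of raising \(\mathbf{{P}}(X)\) to the power m from about m matrix products to roughly \(\log_2 m\) squarings plus at most \(\log_2 m\) multiplications.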

We obtain the fast forward algorithm

$$\begin{aligned} {\varvec{\alpha }}_i={\varvec{\alpha }}_{i-1}{\mathbf{{R}}^{b_i}_i}, \,\,\,\,\mathbf{{R}}_{i+1}={\mathbf{{R}_i}^2}, i=1,\ldots ,k \end{aligned}$$
(5)

which can be used to calculate the forward probabilities

$$\begin{aligned} {\varvec{\alpha }}({{\varvec{\upsilon }}^{n_t}_1,\tau )}={\varvec{\pi }}{{\varvec{\displaystyle }\prod _{i=1}^{{n_{t-1}}}{} \mathbf{{P}}^{m_i}(X_i,\tau )}} \end{aligned}$$
(6)

where \(n_t \) is the number of distinct observation subsegments up to time t.
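
As a minimal sketch (not from the paper), the likelihood of Eq. (1), and the forward quantities of Eq. (6) if the accumulation is stopped early, can be computed directly from the runlength representation. The mapping P_of from an observation value to its block matrix is a hypothetical interface introduced here for illustration.

```python
import numpy as np

def shmm_likelihood(pi, P_of, segments):
    """Eq. (1): pi * prod_i P(X_i)^{m_i} * 1, accumulated over runlength segments.

    pi       : (1, S) initial row vector
    P_of     : dict mapping an observation value X to its (S, S) block P(X)
    segments : list of (X_i, m_i) runlength pairs
    """
    alpha = np.atleast_2d(np.asarray(pi, dtype=float))
    for X, m in segments:
        # matrix_power (or the fast_matrix_power sketch above) implements Eqs. (2)-(4)
        alpha = alpha @ np.linalg.matrix_power(np.asarray(P_of[X], dtype=float), m)
    return float(alpha @ np.ones(alpha.shape[1]))   # multiply by the all-ones vector
```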

Similarly, we obtain the fast backward algorithm

$$\begin{aligned} {\varvec{\beta }}_i={\mathbf{{R}}^{b_i}_i}{\varvec{\beta }}_{i-1}, \,\,\,\,\mathbf{{R}}_{i+1}={\mathbf{{R}_i}^2}, i=1,\ldots ,k \end{aligned}$$
(7)

for calculating the backward probabilities

$$\begin{aligned} {\varvec{\beta }}({{\varvec{\upsilon }}^{N}_1,\tau )}={{\varvec{\displaystyle }\prod _{i={n_t}}^{{N}}}{\varvec{P}}^{m_i}(X_i,\tau )\mathbf {1}} \end{aligned}$$
(8)
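
The backward quantity of Eq. (8) can be accumulated analogously from right to left; this sketch uses the same hypothetical P_of mapping as above.

```python
import numpy as np

def shmm_backward(P_of, segments, n_t):
    """Eq. (8): prod_{i = n_t..N} P(X_i)^{m_i} * 1 as a column vector (n_t is 0-based here)."""
    S = next(iter(P_of.values())).shape[0]
    beta = np.ones((S, 1))
    for X, m in reversed(segments[n_t:]):   # right-to-left accumulation
        beta = np.linalg.matrix_power(np.asarray(P_of[X], dtype=float), m) @ beta
    return beta
```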

We have

$$\begin{aligned} h({{\varvec{\upsilon }}^N_1,\tau )}= {\varvec{\pi }_{X_1}}{\varvec{\displaystyle }\prod _{i=1}^{N-1}}[\mathbf{{P}}^{m_i-1} _{{X_i}{X_i}}{} \mathbf{{P}}_{{X_i}{X_{i+1}}}]\mathbf{{P}}^{m_N-1} _{{X_N}{X_N}}{} \mathbf{{1}} \end{aligned}$$
(9)

To express the forward and backward algorithms for calculating \(h({{\varvec{\upsilon }}^N_1,\tau )}\) in block-vector form, we define

$$\begin{aligned} {\varvec{\alpha }_0}({{\varvec{\upsilon }}^{t}_1,\tau )}= & {} {\varvec{\pi }_{X_1}}{{\varvec{\displaystyle }\prod _{i=1}^{{{t-1}}}[\mathbf{{P}}^{m_i-1}_{{X_i}{X_i}}{} \mathbf{{P}}_{{X_i}{X_{i+1}}}}}] \end{aligned}$$
(10)
$$\begin{aligned} {\varvec{\alpha }}({{\varvec{\upsilon }}^{t}_1,\tau )}= & {} {\varvec{\pi }_{X_1}} {{\varvec{\displaystyle }\prod _{i=1}^{{{t-1}}}[\mathbf{{P}}^{m_i-1}_{{X_i}{X_i}}{} \mathbf{{P}}_{{X_i}{X_{i+1}}}}}]{\mathbf{{P}}^{m_t-1}_{{X_t}{X_t}}} \end{aligned}$$
(11)

These vectors can be computed recursively by the following forward algorithm:

$$\begin{aligned} {\varvec{\alpha }_0}({{\varvec{\upsilon }}^{0}_1,\tau )}= & {} {\varvec{\pi }_{X_1}},\nonumber \\ {\varvec{\alpha }}({{\varvec{\upsilon }}^{t}_1,\tau )}= & {} {\varvec{\alpha }_0}({{\varvec{\upsilon }}^{t}_1,\tau )}\,{\mathbf{{P}}^{m_t-1}_{{X_t}{X_t}}},\nonumber \\ {\varvec{\alpha }_0}({{\varvec{\upsilon }}^{t+1}_1,\tau )}= & {} {\varvec{\alpha }}({{\varvec{\upsilon }}^{t}_1,\tau )}\,\mathbf{{P}}_{{X_t}{X_{t+1}}} \end{aligned}$$
(12)

The backward variable is defined as

$$\begin{aligned} {\varvec{\beta }_0}({{\varvec{\upsilon }}^{N}_t,\tau )}= & {} {{\varvec{\displaystyle }\prod _{i=t}^{{{N}}}[\mathbf{{P}}_{{X_{i-1}}{X_{i}}}}}{} \mathbf{{P}}^{m_i-1}_{{X_i}{X_i}}]\mathbf {1} \end{aligned}$$
(13)
$$\begin{aligned} {\varvec{\beta }}({{\varvec{\upsilon }}^{N}_t,\tau )}= & {} {\mathbf{{P}}^{m_t-1}_{{X_t}{X_t}}} {{\varvec{\displaystyle }\prod _{i=t+1}^{{{N}}}[\mathbf{{P}}_{{X_{i-1}}{X_{i}}}}}{} \mathbf{{P}}^{m_i-1}_{{X_i}{X_i}}]\mathbf {1} \end{aligned}$$
(14)

We obtain the block-form backward algorithm as

$$\begin{aligned} {\varvec{\beta }_0}({{\varvec{\upsilon }}^{N}_{N+1},\tau )}= & {} \mathbf{{1}},\nonumber \\ {\varvec{\beta }}({{\varvec{\upsilon }}^{N}_t,\tau )}= & {} {\mathbf{{P}}^{m_t-1}_{{X_t}{X_t}}}\,{\varvec{\beta }_0}({{\varvec{\upsilon }}^{N}_{t+1},\tau )},\nonumber \\ {\varvec{\beta }_0}({{\varvec{\upsilon }}^{N}_t,\tau )}= & {} \mathbf{{P}}_{{X_t}{X_{t+1}}}\,{\varvec{\beta }}({{\varvec{\upsilon }}^{N}_{t+1},\tau )} \end{aligned}$$
(15)
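
As an illustration of the block-form forward recursion of Eqs. (10)–(12) (the backward recursion of Eqs. (13)–(15) is analogous, run in reverse), the following sketch assumes hypothetical mappings P_diag for the within-segment blocks \(\mathbf{{P}}_{XX}\) and P_trans for the between-segment blocks \(\mathbf{{P}}_{XX'}\); these names are not from the paper.

```python
import numpy as np

def block_forward(pi_X1, P_diag, P_trans, segments):
    """Block-form forward recursion of Eq. (12); returns alpha(v_1^t) for t = 1..N.

    pi_X1    : (1, S) initial row vector pi_{X_1}
    P_diag   : dict X -> within-segment block P_{XX}
    P_trans  : dict (X, X') -> between-segment block P_{XX'}
    segments : list of (X_t, m_t) runlength pairs
    """
    alpha0 = np.atleast_2d(np.asarray(pi_X1, dtype=float))
    alphas = []
    for t, (X, m) in enumerate(segments):
        # alpha(v_1^t) = alpha_0(v_1^t) P_{X_t X_t}^{m_t - 1}, Eq. (11)
        alpha = alpha0 @ np.linalg.matrix_power(np.asarray(P_diag[X], dtype=float), m - 1)
        alphas.append(alpha)
        if t + 1 < len(segments):
            X_next = segments[t + 1][0]
            alpha0 = alpha @ np.asarray(P_trans[(X, X_next)], dtype=float)  # alpha_0(v_1^{t+1})
    return alphas
```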

Next, suppose that \(q_t\) is the state at time t. In order to estimate the model \(\Lambda \), define

$$\begin{aligned} \Gamma _t(i,j)=Pr(q_t=i,q_{t+1}=j|{\mathbf{{x}}^T_1}, \Lambda ) \end{aligned}$$
(16)

This can be expressed in terms of the forward and backward variables as

$$\begin{aligned} \Gamma _t(i,j)=\frac{\alpha _t(i)\Lambda _{{X_N}{X_{N+1}}}{\Lambda _{{X_N}{X_{N+1}}}}^{m(X_{N+1})-1}\beta _t(j)}{P({\mathbf{{x}}^T_1}|\Lambda )} \end{aligned}$$
(17)

\(\Gamma _t(i,j)=[0]\) unless \(i={X_N}\) and \(j={X_{N+1}}\), where [0] denotes the all-zero matrix. The transition probability matrix \(\varvec{P}\) is estimated by computing the expected number of transitions from i to j as \({\sum _{t=1}^{N-1}\Gamma _t(i,j)}\).
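
Once the \(\Gamma _t(i,j)\) matrices have been computed, the transition matrix can be re-estimated by summing the expected transition counts and normalizing the rows; a minimal sketch follows (variable names are illustrative, not from the paper).

```python
import numpy as np

def reestimate_transitions(gammas):
    """Re-estimate P from the expected transition counts sum_t Gamma_t(i, j).

    gammas : list of (S, S) arrays, gammas[t][i, j] = Gamma_t(i, j) as in Eq. (16).
    """
    counts = np.sum(gammas, axis=0)                 # expected number of i -> j transitions
    row_sums = counts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0.0] = 1.0                 # guard against empty rows
    return counts / row_sums                        # row-stochastic estimate of P
```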

One of the most important problems with SHMMs is order estimation. Information criteria penalize the number of model parameters to account for model uncertainty; they are based on the maximized log-likelihood (L) of the model and aim to choose a model order with small generalization error. Three common criteria for this purpose are as follows, where k and n are the numbers of parameters and observations, respectively (a small computational sketch follows the list):

  1.

    The Akaike information criterion (AIC) was proposed by Akaike (1973) [13]. It is defined as

    $$\begin{aligned} \mathrm{AIC}=-\,2\mathrm{log}(L)+2k \end{aligned}$$
    (18)
  2.

    The Bayesian information criterion (BIC) was introduced by Schwarz (1978) [14]. It is computed as

    $$\begin{aligned} \mathrm{BIC}=-\,2\mathrm{log}(L)+k\mathrm{log}n \end{aligned}$$
    (19)
  3.

    The Hannan–Quinn information criterion (HQC) is calculated as

    $$\begin{aligned} \mathrm{HQC}=-\,2\mathrm{log}(L)+2k \mathrm{log}(\mathrm{log}(n)) \end{aligned}$$
    (20)
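
A minimal sketch of the three criteria of Eqs. (18)–(20), assuming log_L holds the maximized log-likelihood of the fitted SHMM:

```python
import math

def information_criteria(log_L, k, n):
    """AIC, BIC and HQC of Eqs. (18)-(20).

    log_L : maximized log-likelihood of the model
    k     : number of free parameters
    n     : number of observations
    """
    aic = -2.0 * log_L + 2.0 * k
    bic = -2.0 * log_L + k * math.log(n)
    hqc = -2.0 * log_L + 2.0 * k * math.log(math.log(n))
    return aic, bic, hqc
```

The model order that minimizes these values is preferred.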

An application of the SHMM in wireless communication

Modeling wireless communication errors is essential for simulation-based performance assessment of network protocols and for utilizing information about these error characteristics within a protocol. Discrete channel models (DCMs) have been employed in wireless systems such as code-division multiple access (CDMA) [15], orthogonal frequency division modulation (OFDM) [16] and the global system for mobile communication (GSM) [17]. HMMs are powerful, highly accurate tools that are employed as discrete channel models (DCMs) for stochastic processes, and they have been applied to the precise simulation of errors in wireless systems [18,19,20,21,22]. In this section, the SHMM is used as the DCM for modeling the errors of the CDMA system. Order estimation is useful for interpreting the model and is vital for ensuring stability. The estimation of the order of the HMM was investigated in [23]. Here, we discuss the optimal order estimation of the SHMM for the error sequences generated by the CDMA system.

The CDMA specification

The CDMA is a channel access method employed by various radio communication technologies and has considerable advantages over similar technologies. It offers a high throughput rate and can overcome strong interference, which leads to an increase in system capacity. The importance of the CDMA makes its modeling necessary for assessing and analyzing the error control pattern. These errors are obtained by comparing the detected symbols with the desired symbols and can be modeled by the SHMM. Refer to [24, 25] for more details about CDMA. The block diagram of the SHMM-based simulation model of a CDMA link is given in Fig. 2, which shows a multiuser CDMA system considered to increase the capacity of the system. In this system, K users are considered, one of which is the desired user and the other K-1 are interfering users; all users have the same blocks. All users transmit simultaneously in the same frequency band and are distinguished at the receiver by user-specific spreading codes. The interfering users appear as interference to the desired user because of nonzero cross-correlation values between the spreading codes. The desired user, the interfering users and additive noise combine in the wireless channel.

Fig. 2 Block diagram of the SHMM implementation for the CDMA

Analysis of the CDMA system with the SHMM

The simulation of a CDMA system is performed in this section. It includes interference and thermal noise and operates in a multipath/fading environment. Rayleigh fading is applied to each multipath component. Each MAI signal uses the same PN-sequence as the signature/spreading sequence. BPSK modulation is assumed, and pulse shaping is ignored. All MAI signals and the desired signal are chip synchronized at the receiver. The input parameters are KfactordB = 0, SF = 63, number of interferers (NoI) = 30 and Mpathdelay\(=[2\ 25\ 85]\), with 12,000 symbols. The run test is performed and yields a p value of 0.295, indicating that at a significance level of \(\alpha =0.05\) the error trace produced by the waveform-level simulation is randomly distributed. The slow fading causes this non-stationarity. Moreover, the presence of inertia, which is the characteristic of the SHMM, is evident. The original error trace is separated into lossy and error-free traces. The first trace consists of \(1's\) and \(0's\) with its first element being a 1, and the second contains the zeros in runlength vector form. Therefore, two random processes with state space \(S=\{0,1,2,\ldots \}\) are defined as follows (a sketch of this separation is given after the list):

  • \(\{L_n |n\ge {0}\}\): The lossy state length process, where \(L_n\) shows the length of the nth state.

  • \(\{G_n |n\ge {0}\}\): The error-free state length process, where \(G_n\) shows the number of elements in the nth error-free state.
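
A simplified sketch of this separation is given below; the rule used here, in which a run of zeros longer than a threshold C is treated as an error-free state, is an assumption (the paper's actual rule is based on the change-of-state constant C introduced in the next paragraph).

```python
from itertools import groupby

def runlengths(bits):
    """Runlength encoding of a binary error trace: list of (symbol, run_length) pairs."""
    return [(sym, sum(1 for _ in run)) for sym, run in groupby(bits)]

def split_states(bits, C):
    """Simplified split into lossy state lengths (L_n) and error-free state lengths (G_n)."""
    L, G = [], []
    lossy_len = 0                        # length of the lossy state being built
    for sym, length in runlengths(bits):
        if sym == 0 and length > C:      # long error-free gap -> new error-free state
            if lossy_len:
                L.append(lossy_len)
                lossy_len = 0
            G.append(length)
        else:                            # 1-runs and short 0-gaps stay inside the lossy state
            lossy_len += length
    if lossy_len:
        L.append(lossy_len)
    return L, G
```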

The length of a lossy trace is determined by the change-of-state constant (C), which is the sum of the mean and the standard deviation of the error burst runlengths. The run test is performed on the lossy trace, and the two-tailed p value smaller than 0.0001 supports the stationarity hypothesis at the 0.05 significance level. Different SHMMs are trained to model the lossy trace, and the SHMM with 2 states is observed to be the best model. The sample autocorrelation function (ACF) comparison of the lossy data trace with the 2- to 6-state SHMMs is shown in Fig. 3; furthermore, the MSEs between the ACFs are calculated to simplify the comparison (Table 1).
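
The change-of-state constant and the ACF comparison can be sketched as follows; sample_acf and acf_mse are illustrative helpers, not functions from the paper.

```python
import numpy as np

def change_of_state_constant(error_burst_runlengths):
    """C = mean + standard deviation of the error burst runlengths."""
    r = np.asarray(error_burst_runlengths, dtype=float)
    return r.mean() + r.std()

def sample_acf(x, max_lag):
    """Sample autocorrelation function of a sequence up to max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom for k in range(max_lag + 1)])

def acf_mse(original, simulated, max_lag=50):
    """MSE between the ACFs of the original lossy trace and a trace simulated from a candidate SHMM."""
    return float(np.mean((sample_acf(original, max_lag) - sample_acf(simulated, max_lag)) ** 2))
```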

Fig. 3 The ACF comparison of lossy data with different states of SHMM

Table 1 MSE ACFs of different SHMMs for the lossy trace

The AIC, BIC and HQC values of the different SHMMs are given in Table 2. All three criteria select the 2-state SHMM as the best model. The parameters of this model are estimated as follows:

$$\begin{aligned} A= \left( \begin{array}{ccc} 0 &{}\,1\\ 0.1271 &{}\, 0.8729\end{array} \right) \qquad B= \left( \begin{array}{ccc} 0.2692&\,0.7308 \end{array} \right) \end{aligned}$$
(21)
Table 2 AIC, BIC and HQC values of different SHMMs for the lossy trace

The best-fitting distributions for the runlengths of the error-free and lossy traces are the generalized Pareto distribution (GPD) and the two-parameter Gamma distribution, respectively. Figures 4 and 5 show the cumulative distribution functions (CDFs) of \(G_n\) and \(L_n\) along with their empirical counterparts. The CDFs of the GPD and the Gamma\((\alpha ,\beta )\) distributions are as follows:

$$\begin{aligned} F(x)= {\left\{ \begin{array}{ll} 1-(1+k\frac{(x-\mu )}{\sigma })^{-\frac{1}{k}}, &{}\,\, {\text {if}}\ k\ne 0 \\ 1-\mathrm{exp}(-\frac{(x-\mu )}{\sigma }), &{}\,\, {\text {if}}\; k=0 \end{array}\right. } \end{aligned}$$
(22)

with domain:

$$\begin{aligned} {\left\{ \begin{array}{ll} \mu \le x\le \infty , &{} \,\,\,{\text {for}}\ k\ge 0 \\ \mu \le x\le \mu -\frac{\sigma }{k} , &{}\,\,\, {\text {for}}\; k<0 \end{array}\right. } \end{aligned}$$
(23)

and

$$\begin{aligned} F(x)=\frac{\Gamma _x(\alpha )}{\Gamma (\alpha )} \end{aligned}$$
(24)

where \(\Gamma \) is the Gamma function and \(\Gamma _x\) is the lower incomplete Gamma function.

The parameters of the fitted distributions are \(k=0.05811,\,\sigma =5.3891,\,\mu =1.6524\) for the GPD and \(\alpha =2.1861,\,\beta =0.70809 \) for the Gamma distribution.
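
For reference, the fitted CDFs of Eqs. (22) and (24) with these parameters can be evaluated, e.g., with scipy.stats; treating \(\beta \) as a scale parameter is an assumption, since the text does not state whether \(\beta \) is a rate or a scale.

```python
from scipy.stats import genpareto, gamma

# Fitted parameters reported in the text
k, sigma, mu = 0.05811, 5.3891, 1.6524     # GPD for the error-free runlengths G_n
a_gam, b_gam = 2.1861, 0.70809             # Gamma for the lossy runlengths L_n

gpd_cdf = genpareto(c=k, loc=mu, scale=sigma).cdf      # Eq. (22)
gamma_cdf = gamma(a=a_gam, scale=b_gam).cdf            # Eq. (24), beta used as scale (assumption)

print(gpd_cdf(10.0), gamma_cdf(2.0))                   # CDF values at example runlengths
```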

Fig. 4 The error-free state runlength distribution

Fig. 5 The lossy state runlength distribution

The artificial binary data are produced by generating the lossy trace using the 2-state SHMM and by generating the runlengths of the error-free trace from the GPD, as follows (a sketch of the full procedure is given after the list):

  1.

    Choose the number of lossy and error-free frames (N) to generate in the artificial trace.

  2.

    Identify the length of an error-free state \((g_{len})\) from the \(G_n\) using the inverse CDF method.

  3.

    Generate a sequence of \(g_{len}\) zeros to form an error-free burst.

  4.

    Identify the length of lossy state \((l_{len})\) from the \(L_n\) using the inverse CDF method.

  5.

    Generate a burst of length \(l_{len}\) consisting of lossy or error-free frames based on the 2-state SHMM.

  6.

    Append the two sequences to the artificial trace.

  7.

    Stop if all N frames have been generated; otherwise, return to step 2.

The original and artificial error traces are compared according to the ACF in Fig. 6. It is evident that the two plots match closely. Moreover, from Fig. 7, it can be concluded that the distributions of error-free intervals of the two sequences agree, which validates the accuracy of our mathematical model.
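
Putting the seven steps together, a minimal sketch of the generation loop is given below; sample_gpd_runlength and sample_lossy_burst stand in for the inverse-CDF sampler of the fitted GPD and the 2-state SHMM burst generator, and drawing the burst symbols from the row of B in Eq. (21) is a simplifying assumption rather than a full SHMM simulation.

```python
import numpy as np
from scipy.stats import genpareto, gamma

rng = np.random.default_rng(0)

def sample_gpd_runlength():
    # Step 2: inverse-CDF sample of an error-free state length from the fitted GPD
    u = rng.uniform()
    return max(1, int(round(genpareto(c=0.05811, loc=1.6524, scale=5.3891).ppf(u))))

def sample_lossy_burst():
    # Steps 4-5: state length from the fitted Gamma; symbols drawn per B of Eq. (21)
    # (placeholder for a full 2-state SHMM burst simulation)
    l_len = max(1, int(round(gamma(a=2.1861, scale=0.70809).ppf(rng.uniform()))))
    return rng.choice([0, 1], size=l_len, p=[0.2692, 0.7308])

def generate_trace(N):
    trace = []
    for _ in range(N):                                       # Steps 1 and 7
        trace.extend([0] * sample_gpd_runlength())           # Steps 2-3: error-free burst
        trace.extend(int(b) for b in sample_lossy_burst())   # Steps 4-6: lossy burst
    return trace
```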

Fig. 6 The ACF comparison of the original and artificial error traces

Fig. 7 The error-free interval distribution comparison of the original and artificial error traces

Conclusion

Analyzing network protocols and their performance depends on the methods used to model and simulate channel conditions. Wireless channels usually face bursty errors. In this paper, we demonstrated that the SHMM, used as a discrete channel model, can accurately model the errors of a CDMA system. The simulation included the effects of multipath, additive white Gaussian noise and multiple access interference to generate the error sequence. The original error trace exhibited non-stationary behavior; therefore, we divided the data into lossy and error-free traces to obtain stationary behavior. The SHMM was used to model the lossy trace. The AIC, BIC, HQC and sample autocorrelation criteria identified the best model as a 2-state SHMM. The best-fitting runlength distributions for the lossy and error-free traces were the two-parameter Gamma distribution and the generalized Pareto distribution, respectively. An artificial binary error trace was generated by combining the 2-state SHMM and the generalized Pareto distribution according to the algorithm described above. The original error trace matched the artificial one closely according to the sample autocorrelation function. All in all, the semi-hidden Markov model is a reliable stochastic model for symbolic sequences with long runs and statistical inertia, and it provides a precise mathematical tool for modeling the error traces generated by wireless channels.