
1 Introduction

Handheld devices such as smartphones, tablets, iPad and A4 Take Note are now available at reasonable cost, and society increasingly depends on them because of their many applications that make life easier. One interesting property of such devices is that people can write on them freely, and the written information can be saved as online information: the pixel coordinates along the pen trajectory together with the pen up/down status. By adopting such devices, people not only reduce the chance of the typing errors that may arise with a keyboard but also save the extra time that typing the same information would have required. In online handwriting, the information is stored as real-time coordinate data points. In offline handwriting recognition, by contrast, the information is saved as images. In the latter case the data are prone to quality degradation, which leads to a high noise level, whereas in the former case the data are far less prone to such quality issues and are therefore more efficient to work with. These benefits make Online Handwriting Recognition (OHR) an emerging research domain. Although a substantial number of research publications are available for Devanagari script [18], the same cannot be said for Bangla script, for which only a limited number of works exist in the literature. Hence, the Bangla OHR domain deserves more attention from researchers. Bhattacharya et al. [9] have shown the advantage of using direction code features to recognize the symbols of the Bangla alphabet set. The authors of [10] have presented an approach for recognizing handwritten Bangla characters in a 2D plane that does not depend on the writing direction.
In terms of implementation, the authors assume that the structural shape of a character can be represented by the different skeletal convexities of its component strokes. In [11], the authors follow a different strategy in which the component strokes are first extracted at character level. Stroke-level sequential and dynamic information is then computed to extract the feature values for stroke recognition. The target character is constructed from the recognized strokes by matching the stroke sequences against a stored database. In [12], the authors describe the benefits of a customized version of the Distance Based Feature extraction technique for the recognition of handwritten Bangla characters. At the pre-processing stage, characters are segmented into N divisions, and distance values are then computed from each segment point to the rest of the points. The authors of [13] have proposed a stroke based recognition scheme in which each stroke is described by a string of shape features. The Dynamic Time Warping (DTW) technique is then adopted to recognize an unknown stroke by comparing it with a previously prepared stroke database. Sen et al. [14] have examined the individual as well as the combined impact of global and local information of an image. The authors of [15] have reported the success rate obtained by combining some online (point based and structural) and offline (quad tree based longest run and convex hull) features for the recognition of online handwritten Bangla characters. Parui et al. [16] have grouped the strokes of the Bangla alphabet set into 54 stroke classes based on grapheme-level shape similarity. Stroke-level HMMs are constructed to recognize the constituent strokes, and 50 look-up tables are then prepared to identify the target character from the recognized strokes. R. Ghosh [17] has divided all the constituent strokes of the sample character into nine rectangular zones and built a feature vector from the standard deviation of the x and y coordinates, writing direction, curvature, slope and curliness at stroke level. The resulting feature vectors are then fed to a Support Vector Machine (SVM) classifier for recognition.

In this paper, the authors concentrate on some pre-processing steps and on estimating a strongly discriminating feature set for the recognition of handwritten Bangla characters.

2 Data Collection and Pre-processing Step

For the current work, 100 different persons in West Bengal, India, of different professional and educational backgrounds, ages and genders, have contributed 200 handwritten samples for each Bangla character. Considering the 50 character symbols of the Bangla script (see Fig. 1), the size of the present database is 10,000. No strict instructions were given during the data collection process; contributors were only asked to write the constituent strokes of the characters as part of the basic stroke database [11]. In this experiment, at the beginning of the pre-processing step, duplicate points are removed from the character sample to minimize redundancy, and the pixel points are then scaled to fit into a window of size 512 × 512 in order to cope with size variability [12]. The scaled pixel points are then rearranged, by applying the 4-connected Bresenham’s algorithm, to obtain a new sequence of points that are unit distance apart. A character image with irregularly spaced scaled coordinates and the same image with unit-distance coordinates are shown in Fig. 2.

Fig. 1. Basic symbols in the Bangla alphabet set

Fig. 2. (a) Scaled character showing non-uniform pixel intervals due to speed variation (b) same character with unit-distance pixels after applying the 4-connected Bresenham’s algorithm
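The pre-processing steps of this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, only consecutive duplicate points are dropped, scaling is assumed to preserve the aspect ratio, a single stroke is assumed (pen up/down status is ignored), and the line routine is one common 4-connected Bresenham-style variant.

```python
def remove_duplicates(points):
    """Drop consecutive repeated points (assumption: only consecutive ones)."""
    out = []
    for p in points:
        if not out or out[-1] != p:
            out.append(p)
    return out


def scale_to_window(points, size=512):
    """Scale points into a size x size window (assumption: aspect ratio kept)."""
    if not points:
        return []
    min_x = min(x for x, _ in points)
    min_y = min(y for _, y in points)
    span_x = max(x for x, _ in points) - min_x
    span_y = max(y for _, y in points) - min_y
    factor = (size - 1) / (max(span_x, span_y) or 1)
    return [(round((x - min_x) * factor), round((y - min_y) * factor))
            for x, y in points]


def line_4connected(p0, p1):
    """Integer points from p0 to p1 using only horizontal/vertical unit steps."""
    (x, y), (x1, y1) = p0, p1
    dx, dy = abs(x1 - x), abs(y1 - y)
    sx = 1 if x1 > x else -1
    sy = 1 if y1 > y else -1
    err = 0
    pts = [(x, y)]
    for _ in range(dx + dy):
        e_x, e_y = err + dy, err - dx   # error after an x step / a y step
        if abs(e_x) < abs(e_y):
            x, err = x + sx, e_x
        else:
            y, err = y + sy, e_y
        pts.append((x, y))
    return pts


def resample_unit_distance(points):
    """Join consecutive points with 4-connected lines (unit-distance output)."""
    if not points:
        return []
    out = [points[0]]
    for p, q in zip(points, points[1:]):
        out.extend(line_4connected(p, q)[1:])
    return out


def preprocess(points, size=512):
    """Duplicate removal, scaling and unit-distance resampling, in that order."""
    return resample_unit_distance(scale_to_window(remove_duplicates(points), size))
```

Because only horizontal and vertical unit steps are taken, every pair of consecutive output points is exactly one pixel apart, which is the property illustrated in Fig. 2(b).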

3 Feature Extraction

In this paper, a quad-tree based image segmentation approach is followed to design three different feature extraction techniques for the recognition of online handwritten Bangla characters. In the first step, the sample character is divided into four (2 × 2) rectangular blocks. One area based feature, computed using the composite Simpson’s rule, and two local features, namely mass distribution and chord length, are then computed from each block. At each subsequent step, the depth of the quad-tree is increased by one by dividing each rectangular block into four sub-blocks, which gives a structure of 4 × 4 blocks at depth two. The features are then estimated again from each block to observe the character at a finer scale than before. In the current experiment, observations are recorded by segmenting the image into up to 256 blocks (16 × 16, i.e. a quad-tree of depth four). Figure 3 shows the quad-tree based segmentation of the same character sample as the depth varies from one to four. In this way, each feature produces 4, 16, 64 and 256 values at quad-tree depths one, two, three and four respectively. The feature extraction methodologies are described in the following subsections:

Fig. 3. Illustration of image segmentation using quad-tree based approach (a) original image. (b-e) segmented images produced at different depths of the tree
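The block structure described in this section can be illustrated with a short sketch. It assumes, as a simplification that the source does not confirm, that the blocks form a regular grid over the 512 × 512 window and that all coordinates lie in the range 0 to 511; `block_index` and `split_into_blocks` are hypothetical helper names.

```python
def block_index(x, y, depth, size=512):
    """Return (row, col) of the block containing pixel (x, y) at a given depth."""
    n = 2 ** depth              # blocks per side: 2, 4, 8, 16 for depths 1..4
    side = size / n             # side length of one block in pixels
    col = min(int(x // side), n - 1)
    row = min(int(y // side), n - 1)
    return row, col


def split_into_blocks(points, depth, size=512):
    """Group an ordered point sequence by block, keeping the writing order."""
    n = 2 ** depth
    blocks = {(r, c): [] for r in range(n) for c in range(n)}
    for x, y in points:
        blocks[block_index(x, y, depth, size)].append((x, y))
    return blocks
```

At depths one to four this grid has 4, 16, 64 and 256 blocks, matching the number of values each feature produces.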

3.1 Area Feature

Since the structural patterns of the Bangla characters differ from one another, it can be assumed that, for different characters, the pixel patterns that constitute the samples are dissimilar across the blocks. For a particular character sample, some blocks may contain no pixels at all. Therefore, when the area under the curve is calculated block-wise, the resulting values are distinct for different character patterns. This can be observed in Fig. 4 from the positions of the selected blocks for the character samples ই and ক. It is therefore inferred that these values could be useful for classifying the characters. The procedure for computing the area feature is presented in Algorithm 1. Equation (1) is the composite Simpson’s rule for finding the area under a curve formed by a set of points.

Fig. 4. Calculation of area under curve in the respective blocks for two different character samples using composite Simpson’s rule in a quad-tree based image segmentation at depth two

  • Step I: BEGIN

  • Step II: for i = 1 to N do,

  • Step III: for j = 1 to N do,

  • Step IV: Find area covered by the curve in \( \text{block}_{i,j} \) using Eq. (1)

  • Step V: End for

  • Step VI: End for

  • Step VII: End

Algorithm 1: Steps to compute area feature using composite Simpson’s rule when sample character is divided into N × N blocks

$$ \text{Area} = \sum_{k = 1}^{n - 1} \frac{h}{3} \left( y_{k} + 4\, y_{\text{middle}} + y_{k + 1} \right) $$
(1)

where each rectangular block contains a varying number n of pixels, with y coordinates \( y_{1} \) to \( y_{n} \), and \( (y_{k}, y_{k+1}) \) denotes the y coordinate values of two consecutive pixel points in that block. The values of h and \( y_{\text{middle}} \) are calculated as follows:

$$ \begin{aligned} h &= \frac{\left| x_{k + 1} - x_{k} \right|}{2} \\ y_{\text{middle}} &= \frac{y_{k} + y_{k + 1}}{2} \end{aligned} $$
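
As a side observation that is not stated in the source, substituting these two definitions into a single term of Eq. (1) suggests that each term reduces to the area of a trapezoid between two consecutive points. The step below assumes that mod(·) in the definition of h denotes the absolute value.

$$ \begin{aligned} \frac{h}{3}\left( y_{k} + 4\, y_{\text{middle}} + y_{k + 1} \right) &= \frac{h}{3}\left( y_{k} + 2\,(y_{k} + y_{k + 1}) + y_{k + 1} \right) \\ &= h\,(y_{k} + y_{k + 1}) \\ &= \left| x_{k + 1} - x_{k} \right| \cdot \frac{y_{k} + y_{k + 1}}{2} \end{aligned} $$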

After execution of Algorithm 1, a total of N × N area values are generated when the character image is divided into N × N rectangular blocks. These values are taken as feature values in this experiment. Blocks that do not contain any pixel are assumed to return zero as their area value.
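A minimal sketch of Algorithm 1 and Eq. (1) is given below. It is an illustration under assumptions rather than the authors' code: the names `simpson_block_area` and `area_features` are hypothetical, the points of each block are assumed to be available in writing order as a dictionary keyed by (i, j), mod(·) is read as the absolute value, and y values are used as given (no block-relative baseline is subtracted).

```python
def simpson_block_area(block_points):
    """Area under the curve in one block, following Eq. (1) term by term."""
    area = 0.0
    for (xk, yk), (xk1, yk1) in zip(block_points, block_points[1:]):
        h = abs(xk1 - xk) / 2            # half-width between consecutive points
        y_middle = (yk + yk1) / 2        # mean of the two y values
        area += (h / 3) * (yk + 4 * y_middle + yk1)
    return area                          # 0.0 for blocks with fewer than 2 points


def area_features(blocks, n):
    """Return the N*N area values in row-major order (Steps II to VI)."""
    return [simpson_block_area(blocks.get((i, j), []))
            for i in range(n) for j in range(n)]
```

For example, `area_features({(0, 0): [(0, 0), (2, 2)]}, 2)` gives one non-zero value for the first block and zero for the three empty blocks.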

3.2 Local Feature

  • Mass Distribution

Depending on the structural pattern of the character, certain rectangular blocks are densely populated by data pixels. Block-wise mass distribution may therefore play an important role in distinguishing different character patterns. Here, mass distribution denotes the pixel count inside a block produced by the quad-tree image segmentation approach. Figure 5 shows the mass distribution of the character samples ই and ক when the images are segmented into 16 blocks; the blue points are the pixels in the respective blocks. These figures show that, for different character patterns, a given block contains a different number of data pixels, which in turn yields a discriminative feature for online handwritten Bangla character recognition.

Fig. 5. Mass distribution of two character samples ই and ক in a quad-tree based image segmentation at depth two (for each image, pixel counts are shown for a particular block)
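The mass distribution feature amounts to counting pixels per block, which can be sketched as below. The function name `mass_distribution` is hypothetical, and the sketch assumes a regular grid over the 512 × 512 window with coordinates in the range 0 to 511.

```python
def mass_distribution(points, depth, size=512):
    """Count the pixels falling in each block; returns 4**depth counts, row-major."""
    n = 2 ** depth                       # blocks per side
    side = size / n                      # block side length in pixels
    counts = [0] * (n * n)
    for x, y in points:
        col = min(int(x // side), n - 1)
        row = min(int(y // side), n - 1)
        counts[row * n + col] += 1
    return counts
```

At depth two this yields the 16 block-wise counts of the kind illustrated in Fig. 5.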

  • Chord Length

In contrast to mass distribution, this approach considers the length of the chord contributed by the character within each block. The character sample is divided into a number of small chords/segments, and the block-wise chord lengths are stored as feature values. These values are useful for this pattern classification problem because the lengths vary significantly for different character patterns. Algorithm 2 describes the steps to compute the block-wise chord length feature.

  • Step-I: BEGIN

  • Step-II: for i = 1 to N do,

  • Step-III: for j = 1 to N do,

  • Step-IV: \( \text{Chord\_Length}_{i,j} = \sum_{n = 1}^{m - 1} \sqrt{(x_{n} - x_{n + 1})^{2} + (y_{n} - y_{n + 1})^{2}} \)

  • Step-V: End for

  • Step-VI: End for

  • Step-VII: End

Algorithm 2: Steps to compute block-wise chord length feature when sample character is divided into N × N blocks.

Assume that each rectangular block contains a varying number m of pixels, from \( (x_{1}, y_{1}) \) to \( (x_{m}, y_{m}) \), as shown in Fig. 5. To find the chord length of \( \text{block}_{i,j} \), the Euclidean distances between consecutive pairs of pixels in that block are summed, as indicated in Step-IV of Algorithm 2. Figure 5 shows that the length of the chord in a particular block changes markedly for different characters, which helps the classifier to differentiate online handwritten Bangla characters.
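
Step-IV of Algorithm 2 can be sketched in a few lines. As before, this is an illustration under assumptions: the names `chord_length` and `chord_features` are hypothetical, and the points of each block are assumed to be supplied in writing order as a dictionary keyed by (i, j).

```python
import math


def chord_length(block_points):
    """Sum of Euclidean distances between consecutive points in one block."""
    return sum(math.hypot(x0 - x1, y0 - y1)
               for (x0, y0), (x1, y1) in zip(block_points, block_points[1:]))


def chord_features(blocks, n):
    """Return the N*N chord lengths in row-major order (empty blocks give 0)."""
    return [chord_length(blocks.get((i, j), []))
            for i in range(n) for j in range(n)]
```

One consequence worth noting: if the trajectory leaves a block and later re-enters it, this simple version also counts the jump between the exit and re-entry points, since the source does not say how such cases are handled.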

4 Result and Discussion

In this paper, the strengths of the individual feature sets described earlier, as well as of their possible combinations, have been observed for the recognition of online handwritten Bangla characters. Table 1(a-b) reports the recognition accuracies achieved with some well-known classifiers, namely Multi-Layer Perceptron (MLP), SMO, Random Forest, NaiveBayes and BayesNet, when the images are segmented at different quad-tree depths. The authors have used a 5-fold cross-validation scheme on the total dataset. Table 1 shows that, irrespective of the feature set and classifier applied, the success rates gradually increase as the depth of the quad-tree used for segmenting the image increases. This holds for depths up to three. When the depth of the tree is increased beyond this level, the performance of the classifiers starts to decrease.
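
For readers unfamiliar with the evaluation scheme, a 5-fold split of the 10,000 samples can be sketched as below. The source does not say whether the folds were stratified by class or which tool produced them; stratification, the fixed seed and the function names are assumptions made only for this illustration.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)   # deal indices round-robin
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), using each fold once for testing."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

With 50 classes of 200 samples each, every fold holds 2,000 samples, so each round trains on 8,000 samples and tests on the remaining 2,000.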

Table 1. (a-b). Success rates (in %) of different classifiers for the recognition of online Bangla characters when individual and various combinations of estimated feature sets are applied for different levels of quad-tree based image segmentation (represented by block size)

From Fig. 3, it can be seen that increasing the quad-tree depth gives a closer view of the character sample, which in turn decreases the size of the component parts in the respective blocks. Up to depth three, the features estimated on the resulting, relatively small segments of the character sample are informative enough to classify them. When the character is divided into a larger number of blocks by increasing the depth further, the block-wise components become very small. As a result, the features estimated from these segments become less informative and fail to identify the character samples properly.

The gray cells in Table 1 mark the quad-tree depth at which the classifiers show their best performance for the different feature combinations. Bold values indicate the feature extraction procedure that gives the top recognition rate at a particular depth. Table 1 shows that, in the present experiment, SMO outperforms all other classifiers, yielding a recognition accuracy of 98.5 % (shown in bold and colored red) when all three feature sets are combined and the depth of the quad-tree is three. Figure 6 presents the outcomes of the different experiments graphically. The blue, red, green and violet lines show the behavior of the classifiers when features are extracted from images segmented at quad-tree depths one, two, three and four respectively. Figure 6 shows that the green line is always at the top and that the SMO classifier gives the best outcome at this depth. Table 2 lists the number of features fed to the classifiers for every possible combination at the various depths of the tree.

Fig. 6. Graphical behavior of the said classifiers for all combinations of feature estimation procedures considering different depths of the quad-tree (Color figure online)

Table 2. Number of feature counts for all possible combination at various quad-tree depths
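
The feature counts can be derived from the block counts given in Section 3. The sketch below assumes that combining feature sets means simply concatenating their vectors, which the source does not state explicitly; the names `FEATURES` and `feature_counts` are hypothetical.

```python
from itertools import combinations

FEATURES = ("area", "mass", "chord")


def feature_counts(max_depth=4):
    """Map each feature combination to its vector length at depths 1..max_depth."""
    table = {}
    for r in range(1, len(FEATURES) + 1):
        for combo in combinations(FEATURES, r):
            # each feature contributes one value per block, i.e. 4**depth values
            table[combo] = [r * 4 ** depth for depth in range(1, max_depth + 1)]
    return table
```

Under this assumption a single feature gives 4, 16, 64 and 256 values, while the combination of all three gives 12, 48, 192 and 768 values at depths one to four.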

Although the present system works well for the recognition of online handwritten Bangla characters, certain misclassifications have been noticed. Table 3 shows the character pairs that were difficult for the proposed system to handle. After analysis, the authors concluded that, because of the strong structural resemblance between the characters in each pair, almost identical feature values are estimated, which causes the classifiers to make mistakes during recognition.

Table 3. Most confusing character pairs

To demonstrate the effectiveness of the present system, Table 4 compares it with some past works.

Table 4. Comparative study of some recently published works along with the proposed work

5 Conclusion

In the present work, the authors set out to observe the effects of three feature extraction strategies, along with their possible combinations, on the recognition of online handwritten Bangla characters when applied to different local regions of a character image. The authors have also tried to find the optimal depth of the quad-tree for segmenting a character image to estimate feature information. Although some misclassifications between character pairs have been observed, a reasonably large number of samples of the Bangla alphabet set are identified correctly by this approach. Hence, this technique could be applied in the OHR domain to solve pattern classification problems. In future, this feature extraction procedure will also be tested for stroke based character and word recognition.