1 Introduction

Potato is the fourth largest staple crop in the world and in China after rice, wheat and maize, and it is also one of the most promising high-yield crops in China [1]. With the rapid development of global agriculture, potato and its processed products have great market potential and have become a major part of global agricultural trade. Consumers care more and more about information on the products they buy, and high-quality products with good information are more likely to arouse consumers’ interest and stimulate their desire to purchase [2]. For fruits and vegetables, appearance is one of the most important sensory quality attributes; it not only influences packaging and retail price, but also affects consumers’ preferences and choices. Products with good appearance are favored by consumers and have better sales appeal [3, 4]. Diced potato, a semi-manufactured product in trade, has strict requirements on size and shape. An irregular shape not only degrades its appearance but also lowers its market value. Therefore, it is necessary to classify diced potatoes according to their 3D information before packaging or trading. In practice, diced potatoes are classified by human visual inspection, and manual inspection of 3D information is labor-intensive and time-consuming. Manual processing also poses problems in maintaining consistency and uniformity in grading [5, 6].

Currently, with the improvement of image processing, quality control by computer vision has become an important technology [7, 8]. To meet the requirements of speed and real-time operation, computer vision systems have been developed as a significant part of quality detection and evaluation [9, 10]. Over the past several years, on-line inspection systems based on computer vision have been widely used for automatic detection of many different agricultural products, covering both external and internal quality, but automatic classification of diced potato according to its 3D information is still not available. There is an increasing demand from food manufacturers for on-line detection equipment based on computer vision that can mimic human grading and classify diced potato accurately against a unified standard, addressing the above issues with human visual inspection.

With the rapid development of photo-electronics, image processing and computer techniques, structured light vision technology has been successfully and widely used in computer vision systems for 3D reconstruction and automatic measurement [11]. Several techniques with quite different characteristics have been proposed for accurate measurement. To measure the height of seedlings, Feng et al. designed an automatic inspection system based on structured light vision, and the results showed that the height measurement error was less than 5 mm for normally straight seedlings [12]. However, this detection precision cannot satisfy the requirements of diced potato inspection. Binocular vision has also been extensively used for 3D size measurement. Liu et al. constructed a binocular structured-light scanner to acquire the surface detail of work-pieces, and it met the precision and efficiency demands of large work-piece measurement [13]. However, 3D measurement based on binocular stereovision is not suitable for on-line detection of diced potato because of the time consumption and complexity of stereo matching. Coded structured light systems are commonly used for real-time acquisition of 3D surface data and are considered an important and widely used active shape acquisition technique [14, 15]. The encoded pattern plays a dominant role in such a system and affects all aspects of measurement performance, including time consumption and accuracy. Xu et al. proposed a one-shot pattern method for 3D shape measurement, and the results showed that the accuracy can reach 0.18 mm. With this system, moving objects can be inspected, and it can be implemented on automotive production lines [16]. Chen et al. [17] proposed a principle of uniquely color-encoded pattern projection to design a color matrix that improves reconstruction efficiency in a coded structured light system. With such a light pattern, 3D reconstruction can be achieved from a single image, giving reliable and accurate measurement of scene objects, and the method can also be used in dynamic environments for real-time applications. However, an encoded structured light system is complex, and the light pattern has an important effect on measurement accuracy.

To realize automatic on-line classification of diced potatoes, a grading system based on computer vision with high accuracy and reliability is needed. For this purpose, intensive research is being conducted to design and build a flexible, reliable and effective computer vision system using a stationary monocular camera and near-infrared linear-array structured lighting.

2 Objectives

The objective of this study is to develop a real-time grading system for diced potato classification using computer vision and near-infrared linear-array structured lighting. To achieve this objective, several steps have to be completed: (1) synchronous acquisition of RGB and NIR images through the same optical path of the camera for 3D information inspection; (2) 2D size measurement in RGB images for width and length inspection; (3) extraction of a 2D shape feature (rectangle degree) from the RGB images for contour shape inspection; (4) construction of height maps (pseudo-color and gray level images) from the height information extracted from NIR images using near-infrared light; (5) evenness evaluation using a Gaussian distribution in the height maps (gray level images); and (6) development of an efficient image processing algorithm, based on the computer vision system and near-infrared structured light, to classify diced potatoes as either regular or irregular.

3 Materials and Methods

3.1 Diced Potato Samples

Fresh potatoes from a market in Beijing were selected and processed into diced potatoes for the study. The shape of some diced potatoes was influenced by the surface of the potato; as a result, the contour of these pieces could be triangular or otherwise irregular, and the surface could be slanted or rugged. Diced potatoes were classified as either regular or irregular according to their 3D information and the criteria required by the packinghouse. Regular diced potatoes, which are traded at a higher price, meet all the geometric requirements, whereas irregular ones fail at least one of the criteria. In our study, 17 potatoes were processed into 400 diced potatoes (length, width and height were about 15 mm for regular samples; the geometry of irregular samples varied), comprising 270 regular and 130 irregular samples.

3.2 Computer Vision System

All the samples were inspected and classified using the vision system shown in Fig. 1(a). The vision system used in our study is the same as the system described in [2]. It mainly consists of a 2CCD camera (JAI AD-080GE), a near-infrared linear-array structured light source (800 nm, 200 mW), a lighting system (LED lights), a computer, and a conveyor belt driven by a stepper motor. The multi-spectral camera, installed directly above the conveyor belt, was connected to the network ports of the computer with two RJ45 twisted-pair cables as shown in Fig. 1(b). It can acquire NIR (800 nm) and RGB images simultaneously through the same optical path. The structured light source was mounted at the upper left of the conveyor belt, in the same horizontal plane as the camera. Two visible LED light sources were placed symmetrically above both sides of the conveyor for supplementary lighting. The whole system was placed in a black box to prevent interference from outside. The software for image acquisition, image processing, final classification of diced potatoes and the conveyor belt control panel was developed in MFC combined with OpenCV.

Fig. 1. Schematic illustration of the computer vision system: (a) diagram of the computer vision system used in this research; (b) connection diagram of the monocular camera and computer

The conveyor belt, made of a matt (low-smoothness) material and driven by a stepper motor, was used to convey diced potatoes for on-line detection. The matt surface of the belt significantly reduces specular reflection of the projected light strip, which makes the light strip easier to extract and reduces the inspection error. The stepper motor was controlled by a driver and controller connected to the computer through an RS232 serial port, which set the speed and other motion parameters of the conveyor belt.

3.3 2D Shape Inspection Methods

Contour shape detection plays a dominant role in 2D shape extraction. The majority of diced potatoes are regular, and the contour of a qualified diced potato is a regular rectangle. Others are influenced by the surface of the potato, and their contour is irregular. If the contour was not a rectangle, the diced potato under detection was judged irregular directly and no other shape information was needed. Rectangle degree is a significant index of the contour shape of rectangular objects. It reflects how fully an object fills its enclosing rectangle, and this rectangle factor can be used to decide whether the target region is a rectangle [18]. Therefore, in this paper the rectangle degree, computed with the minimum circumscribed rectangle method, was used to judge the contour shape.

$$ R = S_{0} /S_{MER} $$
(1)

Where R represents the rectangle factor, \( S_{0} \) represents the area of the diced potato’s surface imaged by the camera, and \( S_{MER} \) represents the area of the minimum circumscribed rectangle. For rectangular objects the rectangle factor is very close to 1; for other shapes it lies between 0 and 1.
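
As an illustration of Eq. (1), the following minimal sketch computes the rectangle degree of a contour given as a list of (x, y) vertices. It is not the authors' implementation (their software was built with MFC and OpenCV); the function names and the pure-Python minimum-rectangle search are assumptions made for this example. The search relies on the known property that one side of the minimum-area enclosing rectangle lies along an edge of the convex hull.

```python
import math


def polygon_area(points):
    """Shoelace formula: area S0 of a simple polygon given by its vertices."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_enclosing_rect_area(points):
    """Area S_MER: try a rectangle aligned with every hull edge, keep the smallest."""
    hull = convex_hull(points)
    if len(hull) < 3:
        raise ValueError("contour is degenerate (fewer than 3 non-collinear points)")
    best = float("inf")
    n = len(hull)
    for i in range(n):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % n]
        length = math.hypot(x2 - x1, y2 - y1)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length
        # Project hull points onto the edge direction and onto its normal.
        along = [px * ux + py * uy for px, py in hull]
        across = [-px * uy + py * ux for px, py in hull]
        best = min(best, (max(along) - min(along)) * (max(across) - min(across)))
    return best


def rectangle_degree(contour):
    """Eq. (1): R = S0 / S_MER."""
    return polygon_area(contour) / min_enclosing_rect_area(contour)


# Illustrative contours (pixel coordinates are made up for the example).
square = [(0, 0), (15, 0), (15, 15), (0, 15)]
triangle = [(0, 0), (15, 0), (0, 15)]
```

A square contour gives R = 1, while any triangle gives R = 0.5, since a rectangle built on one side of a triangle has twice the triangle's area.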

3.4 3D Size Inspection Methods

The inspection principle of height information based on triangulation theory is shown in Fig. 2. The conveyor belt is regarded as the reference plane, and it is configured to be parallel with the baseline of the camera and laser projector. The theory of height measurement can be explained by similar triangles △ABP and △CDP. According to the triangle similarity, height of object can be calculated by:

$$ \frac{h}{L - h} = \frac{d}{s} $$
(2)

Fig. 2. The theory of height measurement based on the triangulation principle

Where h represents the distance from point Q to C, which is also the height of the object; d represents the distance between points C and D, which can be extracted by image processing; L represents the vertical distance from the baseline of the camera and laser projector to the conveyor belt; and s represents the baseline distance from the laser projector to the CCD camera. L and s can be measured directly. Equation (2) can be rearranged as:

$$ h = \frac{L - h}{s}d $$
(3)

Generally, the height of a diced potato is less than 20 mm, whereas the distance from the baseline of the camera and laser projector to the conveyor belt exceeds 500 mm, so h is small compared with L. Eq. (3) can therefore be simplified as:

$$ h = \frac{L}{s}d $$
(4)
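
To see how much the simplification costs, Eq. (2) can also be solved for h exactly, giving h = L d / (s + d), and compared with Eq. (4). The sketch below does this; the function names and the numeric values of s and d are assumptions chosen only for illustration (the text states only that L exceeds 500 mm).

```python
def height_exact(d, L, s):
    """Exact solution of Eq. (2), h / (L - h) = d / s, solved for h."""
    return L * d / (s + d)


def height_approx(d, L, s):
    """Simplified Eq. (4), valid when h is much smaller than L."""
    return L * d / s


# Hypothetical geometry: L = 500 mm, s = 100 mm, offset d = 3 mm.
L_mm, s_mm, d_mm = 500.0, 100.0, 3.0
h_exact = height_exact(d_mm, L_mm, s_mm)
h_approx = height_approx(d_mm, L_mm, s_mm)
relative_error = (h_approx - h_exact) / h_exact  # equals d / s
```

The relative overestimate of Eq. (4) equals d/s, which is the same as h/(L - h), so it stays at a few percent when h is below 20 mm and L is above 500 mm.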

For height measurement, the distortion distance \( d^{\prime} \) caused by the height of the diced potato must be extracted in order to determine the offset distance d. It can be calculated as the distance between the projected light stripes in the inspected and reference images. Because the light strip is parallel to the y-axis of the camera image plane, the x-coordinate of point Q(x, y) is the same as that of point C(x, y), and it can be obtained before inspection. Because points B, P and D are collinear, the reference-plane point D and the inspected-part point P map to the same pixel in the camera image plane, so point P can be easily detected by image processing. A reference image with a straight light strip should be acquired before inspection to obtain the reference coordinates Q(x, y), whose position is fixed. During height measurement, when diced potatoes are conveyed through the camera’s field of view, NIR images with the distorted light strip are acquired to obtain the detection coordinates P(x, y). Since points P and Q have the same y-coordinate, the difference between their x-coordinates represents the distortion distance \( d^{\prime} \).

Both RGB and NIR images were used for 3D size detection. RGB images were processed for the 2D size measurement: with the minimum circumscribed rectangle method, the width and length of a diced potato could be measured simply and with small error. NIR images were processed for height measurement. Figure 3 shows NIR images acquired by the monocular camera. When nothing is under detection, the near-infrared structured light falls only on the reference plane, so the light strip is continuous and undistorted, as shown in Fig. 3(a). When a diced potato is conveyed through the vision system, part of the light strip is projected onto its surface and the strip appears broken, as shown in Fig. 3(b). The offset distance of the light strip changes with the height of the diced potato, so the relationship between the actual height and the offset distance of the light strip must be established for height measurement.

Fig. 3. NIR images acquired by the AD-080GE: (a) reference image; (b) detection image

Before height measurement, threshold segmentation was used to binarize the NIR image. The light strip in the binary image was then thinned for pixel coordinate extraction. A suitable thinning algorithm must not only provide high inspection accuracy but also be efficient enough to finish processing within the interval between two acquired images (the maximum frame rate of the AD-080GE is 30 fps). In our research, the centerline extraction method was used for image thinning. The center point coordinates \( O\left( {x_{0} ,y_{0} } \right) \) of the light strip in each line of the binary image can be calculated using Eq. (5):

$$ x_{0} = \frac{1}{A}\sum\nolimits_{(x,\,y)\, \in \,R} x, \qquad y_{0} = \frac{1}{A}\sum\nolimits_{(x,\,y)\, \in \,R} y $$
(5)

Where R represents the set of light-strip pixels in each line of the binary image; A represents the number of pixels in the set; x and y represent the x-coordinate and y-coordinate of each pixel in the light strip.
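
A minimal sketch of the row-wise centerline extraction in Eq. (5), together with the difference between reference and inspected centerlines, is given below. It assumes the binary image is a list of rows of 0/1 values and that the light strip runs roughly parallel to the image y-axis, so each row yields one center x-coordinate; the function names and the tiny example images are illustrative only.

```python
def strip_centerline(binary):
    """Return {row index: mean x of strip pixels} for rows that contain the strip."""
    centers = {}
    for y, row in enumerate(binary):
        cols = [x for x, value in enumerate(row) if value]
        if cols:
            # x0 = (1/A) * sum of x over the strip pixels in this row
            centers[y] = sum(cols) / len(cols)
    return centers


def distortion_distance(reference_centers, inspected_centers):
    """Row-wise difference: reference centerline minus inspected centerline."""
    return {
        y: reference_centers[y] - inspected_centers[y]
        for y in reference_centers
        if y in inspected_centers
    }


# Made-up 2 x 10 binary images: the strip is shifted and widened in row 1.
reference = [
    [0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
]
inspected = [
    [0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
]
d_prime = distortion_distance(strip_centerline(reference), strip_centerline(inspected))
```

Because each row is reduced to a single averaged coordinate, a strip that is widened by scattering still yields a one-pixel-wide centerline with sub-pixel position.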

NIR images were scanned line by line to obtain the pixel coordinates of the centerline, which represents the position of the light strip. After thinning of the binary images, the centerline coordinates were extracted, and the distortion distance \( d^{\prime} \) was calculated by:

$$ d^{\prime} = d_{0} - d_{1} $$
(6)

Where, \( d_{0} \) represents the centerline coordinates of reference image, \( d_{1} \) represents the centerline coordinates of inspected image.

In our research, pieces of model material (50 mm long, 30 mm wide and 2 mm high) were stacked layer by layer to form different heights for data sampling. Cubic spline interpolation, carried out in Matlab 2010, was used to establish the relationship between the offset distance d and the distortion distance \( d^{\prime} \). Once \( d^{\prime} \) was measured by the visual inspection system, the offset distance of the light strip was obtained from the fitted function that represents this relationship. Finally, the real height h of the diced potato was calculated with the triangulation theory described above.
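
The calibration step can be sketched as a cubic spline that maps a measured \( d^{\prime} \) (in pixels) to an offset distance d (in mm). The version below is a plain-Python natural cubic spline; the natural end conditions, the function name and the calibration pairs are all assumptions for illustration, and the end conditions may differ from those used in the authors' Matlab fit.

```python
import bisect


def natural_cubic_spline(xs, ys):
    """Build a natural cubic spline through (xs, ys); xs must be strictly increasing."""
    n = len(xs) - 1
    if n < 1:
        raise ValueError("need at least two calibration points")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    alpha = [0.0] * (n + 1)
    for i in range(1, n):
        alpha[i] = 3.0 * (ys[i + 1] - ys[i]) / h[i] - 3.0 * (ys[i] - ys[i - 1]) / h[i - 1]
    # Solve the tridiagonal system for the second-order coefficients c.
    diag = [1.0] * (n + 1)
    mu = [0.0] * (n + 1)
    z = [0.0] * (n + 1)
    for i in range(1, n):
        diag[i] = 2.0 * (xs[i + 1] - xs[i - 1]) - h[i - 1] * mu[i - 1]
        mu[i] = h[i] / diag[i]
        z[i] = (alpha[i] - h[i - 1] * z[i - 1]) / diag[i]
    b = [0.0] * n
    c = [0.0] * (n + 1)
    d = [0.0] * n
    for j in range(n - 1, -1, -1):
        c[j] = z[j] - mu[j] * c[j + 1]
        b[j] = (ys[j + 1] - ys[j]) / h[j] - h[j] * (c[j + 1] + 2.0 * c[j]) / 3.0
        d[j] = (c[j + 1] - c[j]) / (3.0 * h[j])

    def evaluate(x):
        # Pick the interval containing x (clamped to the first/last interval).
        j = min(max(bisect.bisect_right(xs, x) - 1, 0), n - 1)
        t = x - xs[j]
        return ys[j] + b[j] * t + c[j] * t * t + d[j] * t * t * t

    return evaluate


# Hypothetical calibration pairs, NOT measured data: d' in pixels, d in mm.
d_prime_px = [0.0, 10.0, 20.0, 30.0, 40.0]
offset_mm = [0.0, 0.41, 0.80, 1.22, 1.60]
offset_from_d_prime = natural_cubic_spline(d_prime_px, offset_mm)
```

Calling `offset_from_d_prime(25.0)` would then return the interpolated offset for a distortion of 25 pixels, which can be passed to Eq. (4) to obtain the height.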

3.5 Surface Evenness Detection

For further analysis, height maps, including pseudo-color and gray level images, were constructed from the distortion distance \( d^{\prime} \) obtained during height measurement. The pseudo-color images make it easier for the human eye to evaluate the height image, and the gray level images were used to detect the surface evenness of the diced potato. Surface evenness was assessed from the gray scale distribution of the height map. The average and mean square error of \( d^{\prime} \) were calculated from the gray scale distribution, which was extracted by image processing of the gray level image. A Gaussian model was used for surface evenness detection, and the evenness was determined by the upper control limit (ucl) and lower control limit (lcl) of \( d^{\prime} \) using Eqs. (7) and (8):

$$ ucl = \mu + \sigma $$
(7)
$$ lcl = \mu - \sigma $$
(8)

Where μ represents the average of \( d^{\prime} \) and σ represents the mean square error of \( d^{\prime} \). Because each distortion distance \( d^{\prime} \) corresponds to a gray scale in the gray level image, μ and σ can be determined using Eqs. (9) and (10):

$$ \mu = \mathop \sum \limits_{i = 1}^{255} i*p_{(i)} $$
(9)
$$ \sigma = \sqrt {\frac{1}{255}\mathop \sum \limits_{i = 1}^{255} (i - \mu )^{2} } $$
(10)

Where i represents the gray scale of the gray level image, from 1 to 255 (gray scale 0 is excluded because it is defined as the black background of the height image), and \( p_{\left( i \right)} \) represents the probability with which each gray scale appears in the height map image.

As shown in Fig. 4, the shaded area represents the gray scales between lcl and ucl. If the total probability of the gray scales in the shaded area is higher than a specified threshold, the surface evenness is judged as acceptable.

Fig. 4. Control limits for surface evenness detection
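
The evenness check of Eqs. (7) to (9) can be sketched as follows. The function name and the example data are assumptions. One further assumption should be noted: σ is computed here as the usual histogram standard deviation, with squared deviations weighted by p(i), whereas Eq. (10) as printed uses a uniform 1/255 factor.

```python
import math


def evenness_probability(gray_values):
    """Return (mu, sigma, probability of gray scales inside [lcl, ucl]).

    gray_values: iterable of integers in 0..255 taken from the gray level
    height map; 0 is treated as background and ignored.
    """
    hist = [0] * 256
    for g in gray_values:
        hist[g] += 1
    total = sum(hist[1:])
    if total == 0:
        raise ValueError("height map contains only background pixels")
    p = [0.0] + [hist[i] / total for i in range(1, 256)]
    mu = sum(i * p[i] for i in range(1, 256))                 # Eq. (9)
    sigma = math.sqrt(sum((i - mu) ** 2 * p[i] for i in range(1, 256)))
    lcl, ucl = mu - sigma, mu + sigma                          # Eqs. (7), (8)
    inside = sum(p[i] for i in range(1, 256) if lcl <= i <= ucl)
    return mu, sigma, inside


# Made-up height maps: a perfectly flat top and a uniformly slanted top,
# each surrounded by some background (0) pixels.
flat_surface = [0] * 50 + [120] * 400
slanted_surface = [0] * 50 + list(range(50, 150))
```

For an ideal Gaussian distribution the band μ ± σ contains about 68 % of the probability, which gives a natural reference value for the threshold; a flat surface concentrates all its probability inside the band, while a uniform ramp leaves only about 58 % inside it.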

3.6 Whole Classification Algorithm for 3D Size and Shape Inspection

In this paper, the 3D geometric characteristics of diced potatoes were extracted using a combination of the structured lighting method, triangulation theory and the minimum circumscribed rectangle method. The image processing and shape classification algorithm mainly includes the following steps: (1) Calibration: an NIR image in which the light strip is uninterrupted was acquired before shape detection to obtain the reference pixel coordinates. (2) 2D size measurement: the RGB image was processed to obtain the length and width in pixels with the minimum circumscribed rectangle method; the length and width of the diced potato were then computed from the camera parameters and other parameters. (3) 2D shape feature extraction: RGB images were processed to obtain the rectangle degree of the diced potato, which was used to judge whether its contour was a regular rectangle. (4) Height map construction: pseudo-color images and gray level images representing the height information of the diced potato were constructed simultaneously from the distortion distance \( d^{\prime} \) between the projected light stripes in the inspected and reference images. (5) Surface evenness evaluation: the gray histogram of the height map (gray level image) was created to obtain the average and mean square error of \( d^{\prime} \), and the Gaussian model was then used to judge whether the surface of the diced potato was even. (6) Grade judgment: all the shape parameters measured in the above steps were compared with the 3D characteristic parameters set according to the factory requirements. If the geometric information measured by the computer vision system satisfied the requirements, Flag was set to TRUE and the diced potato under detection was regarded as regular. Otherwise, Flag was set to FALSE and the diced potato was regarded as irregular. Finally, diced potatoes were classified as either regular or irregular. The flowchart of the whole classification algorithm is shown in Fig. 5.

Fig. 5. Flowchart of the classification algorithm for diced potato
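
The grade judgment in step (6) can be sketched as a short decision function. This is an illustration rather than the authors' code: the class and parameter names are invented, the 15 mm nominal size comes from the sample description, the 0.9 and 0.68 thresholds are the values reported in the results section, and the size tolerance is a hypothetical placeholder for the factory requirement.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    length_mm: float
    width_mm: float
    height_mm: float
    rectangle_degree: float
    gaussian_probability: float


def is_regular(m, nominal_mm=15.0, tol_mm=2.0, min_rect=0.9, min_gauss=0.68):
    """Return True (Flag = TRUE, regular) or False (Flag = FALSE, irregular)."""
    # Contour check first: a non-rectangular contour is rejected directly.
    if m.rectangle_degree < min_rect:
        return False
    # 3D size check against the nominal dimension (tolerance is hypothetical).
    sizes = (m.length_mm, m.width_mm, m.height_mm)
    if any(abs(v - nominal_mm) > tol_mm for v in sizes):
        return False
    # Surface evenness check from the Gaussian probability of the height map.
    return m.gaussian_probability >= min_gauss


# Made-up measurements for illustration.
good = Measurement(15.2, 14.8, 15.1, 0.95, 0.80)
too_low = Measurement(15.2, 14.8, 8.0, 0.95, 0.80)
slanted = Measurement(15.2, 14.8, 15.1, 0.95, 0.50)
triangular = Measurement(15.2, 14.8, 15.1, 0.55, 0.80)
```

Checking the rectangle degree first mirrors the rule that a diced potato with a non-rectangular contour is judged irregular without needing any other shape information.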

4 Results and Discussion

The automatic control software based on computer vision is shown in Fig. 6. On-line inspection of diced potatoes was conducted according to the classification algorithm described above. Inspection results, including 3D size and classification information, were displayed in real time. Meanwhile, the height map images were constructed as the diced potatoes moved, and pixels were assigned colors according to their height and the color bar.

Fig. 6. The inspection system developed in VC++ 2010 for diced potato grading

4.1 Results of Light Strip Thinning and Smoothing

In height measurement, it is important to extract the centerline of the light strip in the NIR image. The height of each pixel in the light stripe can then be calculated using the triangulation principle. The results of light strip thinning and smoothing are shown in Fig. 7. Figure 7(a) is the NIR image with the light strip projected on the surface of a diced potato: a distorted light stripe is projected onto the scene and imaged by the camera, and the strip is clearly wider on the surface of the diced potato because of light scattering. Figure 7(b) shows the result of light strip thinning: with the centerline extraction method, the light strip is reduced to a line only one pixel wide. Figure 7(c) shows the distortion distance \( d^{\prime} \) of each pixel in the centerline. The 3D profile of a diced potato can be constructed by connecting the distortion distances of all pixels in the centerline. Veining defects were often observed around the light strip on the reference plane or the diced potato because of slight light scattering from the conveyor belt and instability of the laser projector. As shown in Fig. 7(c), the data after image thinning were not smooth, which sometimes influenced the detection results. For this reason, median filtering was used to smooth the raw distortion distance \( d^{\prime} \), because the algorithm is simple and performed better than other processing methods. Figure 7(d) shows the data after median filtering: most of the burrs are eliminated, especially in the area of the reference plane.

Fig. 7. The result of light strip thinning and data smoothing: (a) detection image; (b) result of image thinning; (c) original data of distortion distance \( d^{\prime} \); (d) result data after median filtering
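
The smoothing step can be illustrated with a one-dimensional median filter applied to the profile of \( d^{\prime} \) values along the centerline. The window size and the shrinking-window treatment of the two ends are assumptions for this sketch; the text does not state which settings were used.

```python
import statistics


def median_filter(values, window=5):
    """Sliding-window median; the window shrinks at both ends of the profile."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(statistics.median(values[lo:hi]))
    return smoothed


# Made-up d' profile: a single-pixel burr on the reference plane (index 3)
# followed by a flat plateau where the strip falls on the diced potato.
profile = [0, 0, 0, 9, 0, 0, 12, 12, 12, 12, 0, 0]
smoothed = median_filter(profile, window=3)
```

The isolated burr is removed while the edges of the plateau stay sharp, which is the property that makes a median filter attractive for this kind of profile compared with a moving average.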

4.2 Detection Result

Contour shape and surface evenness inspection were used to judge whether a diced potato that satisfied the dimensional requirements was regular. To detect surface evenness, pseudo-color images and gray level images were constructed as shown in Fig. 8. Figure 8(a) is the RGB image of a regular diced potato; Fig. 8(b) and (c) are the pseudo-color image and gray level image constructed by the vision system from the 3D information; and Fig. 8(d) is the gray histogram, which indicates the gray scale distribution of the gray level image. The images in the second row of Fig. 8 are the corresponding images of an irregular diced potato. As shown in Fig. 8(f) and (g), the height map was distorted by the slanted or rugged surface, and the surface area of the irregular diced potato was larger than that of the regular one. Comparing Fig. 8(d) with (h), it is clear that for a regular diced potato the gray scale distribution is relatively concentrated and its trend is similar to a Gaussian distribution, whereas for an irregular diced potato with an uneven surface the gray scale distribution is dispersed.

Fig. 8. Detection results of regular and irregular diced potato: (a) RGB image of regular diced potato; (b) pseudo-color image of regular diced potato; (c) gray level image of regular diced potato; (d) gray histogram of regular diced potato; (e) RGB image of irregular diced potato; (f) pseudo-color image of irregular diced potato; (g) gray level image of irregular diced potato; (h) gray histogram of irregular diced potato

To observe the classification effect and verify the detection algorithm proposed in this paper, inspection results for diced potatoes of different classes were randomly selected. The measurement error was obtained by comparing the 3D size detected by the grading system with the real value measured with a digital caliper. The measured length, width and height differed only slightly (within 1 mm) from the real values.

Figure 9 shows the detection results for the geometric parameters. Figure 9(a), (b) and (c) show the relationship between real and measured values of length, width and height, respectively. The corresponding coefficients of determination \( R^{2} \) are 0.9072, 0.8939 and 0.8968, which meets the requirements for classification of diced potatoes. The rectangle degree and Gaussian probability were the other two important parameters for judging whether a diced potato was regular. For most irregular diced potatoes, these two parameters were influenced by the surface of the potato. There is a relationship between contour shape and surface evenness, which can be represented by the correlation between rectangle degree and Gaussian probability. As shown in Fig. 9(d), the rectangle degree of regular diced potatoes is higher than 0.9. By contrast, the rectangle degree of irregular diced potatoes is scattered because of their varied contour shapes, and for the majority of them it is lower than 0.9. The Gaussian probability of regular diced potatoes is higher than 0.68, and it is lower than 0.68 for most irregular diced potatoes because of their uneven surfaces. Therefore, 0.68 can be used as the decision threshold for surface evenness. For a minority of irregular diced potatoes the Gaussian probability is higher than 0.68, but their 3D size and rectangle degree do not meet the requirements, so combining the criteria greatly reduces the probability of inspection error. In Fig. 9(d), many regular diced potatoes have the same rectangle degree and Gaussian probability, so some overlap of points was also observed.

Fig. 9. Detection results of geometrical characteristics: (a) comparison of measured length and real length; (b) comparison of measured width and real width; (c) comparison of measured height and real height; (d) distribution of Gaussian probability and rectangular ratio for regular and irregular diced potato

The image processing mainly includes minimum circumscribed rectangle detection and height map construction, aimed at obtaining the 3D geometric characteristics of the diced potato. The intermediate result images are shown in Fig. 10. Row (a) shows diced potatoes with different shapes: all the geometric information of samples 1 and 2 is regular and satisfies the criteria, and the other samples are irregular. The images in Row (b) are the detection results of the minimum circumscribed rectangle method. The height maps constructed during height measurement, comprising pseudo-color and gray level images, are shown in Row (c) and Row (d), where different heights are represented by different pixel values.

Fig. 10. Results of diced potatoes with different shapes: (a) RGB images; (b) results of minimum circumscribed rectangle method; (c) pseudo-color images; (d) gray level images

The irregular diced potatoes mainly include three types: (1) The geometric features are regular, but the 3D size does not satisfy the requirements, as shown by sample 3. Although the contour is a regular rectangle and the surface is even, the height does not meet the requirements. This is obvious when the height map of sample 3 is compared with that of sample 1 or 2, especially in the pseudo-color image. (2) Neither the contour shape nor the surface evenness satisfies the requirements, as shown by sample 4; such a piece is classified as irregular directly by rectangle degree detection. (3) The contour of the diced potato is a regular rectangle, but the surface is slanted or rugged, as shown by samples 5 and 6. The trend of the surface height change is easy to observe in the height map.

5 Conclusions

A real-time, on-line, low-cost computer vision system is urgently needed to replace manual inspection for the classification of diced potato. An automatic grading system based on computer vision and near-infrared linear-array structured lighting was proposed in this paper. To reduce the interference of external conditions and make the inspection results more precise and stable, visible and near-infrared linear-array structured lighting were compared in contrast tests to select the projector with higher detection speed and accuracy. A conveyor belt of matt (low-smoothness) material was used to convey the diced potatoes. The matt surface significantly reduces specular reflection of the projected light strip, which makes the light strip easier to extract. In this research, RGB and NIR images acquired simultaneously by a monocular camera were processed to extract 3D geometric characteristics, including physical dimensions, contour shape and surface evenness, which were used as the main decision conditions for classification. The minimum circumscribed rectangle method and the structured light measurement method were used for 3D size inspection, and height maps, including pseudo-color and gray level images, were constructed from the height of the diced potato for surface evenness judgment with the Gaussian model. The results demonstrate that the detection error of the grading system was within about 1 mm and the classification accuracy reached 98 %. This preliminary research verified the feasibility of using computer vision combined with the structured light measurement method for the classification of diced potato.