1 Introduction

The beginning of the twenty-first century brought with it new challenges related to organizing and implementing production processes. The practical implementation of concepts associated with the digital factory and the fourth industrial revolution forces manufacturers to take advantage of innovative production techniques that increase the automation and flexibility of production while simultaneously increasing its quality and profitability. A technique that fully meets these expectations is machine vision, relying on digital photography or optical scanners, especially when working in tandem with state-of-the-art industrial robots. Machine vision is a branch of systems engineering which encompasses a large number of integrated techniques, software and hardware solutions, activities, methods and knowledge. Computer vision is a branch of computer science that deals with automatically processing and analysing images in order to extract the desired information from them. On the basis of data obtained in this way, a computer is able to make decisions, draw conclusions, compile statistics, etc. The task of recognizing images may be highly complex, as the extraction and identification of objects are often attempted on the basis of crowded and chaotic digital images in which objects may appear in various positions. Specialized and highly diverse approaches are devoted to solving problems of this category [6]. In contrast, tasks performed in the area of machine vision are much simpler, as the scene is organized and the background removed by choosing the appropriate optics and lighting and by subjecting the image to preliminary processing.

There are a number of different recognition methods focusing on analysing the external shape of an object. Generally, these methods are divided into global and local ones. Global methods, such as global shape indices or Fourier descriptors, are resistant to the noise present in an image, but they may overlook details that are significant for the process of recognition. Local methods focus on features detected in limited areas of an image or, alternatively, on the mutual relationships between these features. An overview, classification and discussion of the properties of techniques for describing and representing the shapes of objects, including their computational complexity, may be found in [19, 20]. Methods of extracting shape-based properties of objects are described in [11]. A general overview of methods for analysing the shape of objects may be found in an earlier work [8].

The present article proposes an approach based on the mutual matching of two ordered, finite sets of points determining the contours of a reference object and a classified object. Such approaches are already known from the scientific and technical literature. A range of varieties of this method may be found in [18]: bottleneck matching, minimum weight matching, uniform matching, minimum deviation matching, Hausdorff distance and transformation space subdivision. The presented method is global in nature because it takes into account an ordered set of 360 contour points distributed evenly in angle around the contour’s centroid. It therefore differs significantly from local methods aimed at finding the key points of a contour and the properties associated with them, as in [1].

2 Technical conditions for the use of machine vision in the engineering industry

The technique of machine vision has been known and applied in engineering practice for several dozen years; however, it has only recently reached the level of technical and economic parameters that allowed its application to become widespread. The contributing factors include the wide range of products offered by manufacturers of machine vision equipment, affordable costs of investments in this area, well-rounded and reliable software for intelligent cameras, solutions to the problem of two-way communication between the vision system and the industrial robot, ergonomic user interfaces for configuring and calibrating the components of vision systems, as well as the knowledge and skills of the users themselves regarding the possible ways of using vision systems and solving problems associated with them. As a result, the global market for vision systems is one of the most rapidly developing segments among the means of production, recently growing at an average of 10% per annum. The automotive segment is the biggest customer of machine vision. It is responsible for 20% of the total turnover, and here the technology is used in applications such as automatic process monitoring and control or intelligent robot guidance. The justification for using machine vision includes the possibility of performing non-contact measurements, automatic acquisition of huge amounts of data, speed and efficiency exceeding human abilities, additional possibilities of optimizing processes, active prevention of production errors and uninterrupted workflow. Compared to a human operator, machine vision systems are faster, more precise, more repeatable and objective. Hence, in the automotive and machine industries, the use of machine vision is currently becoming mandatory, especially in tasks associated with monitoring and supervising production.

The architecture of machine vision systems constitutes a combination of the hardware structure and the visual information processing chain. A typical machine vision system consists of several components: a digital camera along with its optics, a built-in processor or an external computer, input/output devices, a specialist light source, software for processing the image and detecting the properties of objects, synchronizing sensors and executive elements for manipulating the objects. Several types of vision systems are available on the market, from simple vision sensors, through intelligent cameras and camera–computer systems, to specialist integrated platforms (PXI). Each of these solutions is characterized by a different level of flexibility and efficiency, as well as a different price. In terms of the number of dimensions of the analysed image, vision systems are divided into one-dimensional, two-dimensional, two-and-a-half-dimensional and three-dimensional ones. In industry, two-dimensional systems are used most commonly, but it is three-dimensional systems that are developing most rapidly.

Raster images of appropriate quality or models of objects extracted from a point cloud make it possible to automatically interpret objects and extract the correct information about them, which results in a reliable, repeatable and efficient recognition system. The lighting conditions prevalent in a camera’s field of view, as well as the applied optics, including the filters and the parameters of the camera, are of crucial importance for the correct operation of a vision system. Factors which contribute to the lighting conditions include the type of light sources, their number, the light’s angle of incidence and its colour. Improper lighting of the camera’s field of view results in an irrecoverable loss of information at the stage of image processing. In the case of moving parts, stroboscopic light synchronized with the camera’s shutter should be used. The range of available options means that selecting the appropriate lighting is an art which requires considerable experience and preliminary trials run at the stage of configuring a vision system. Appropriate lighting is of key importance for the successful implementation of machine vision, and it should be the first factor considered when designing such a system. The selected lighting should emphasize the contrast of the object features to be extracted while minimizing the contrast of all other elements. High contrast increases the reliability of feature extraction. The proper colour of light serves to eliminate from the image the background and those features of the photographed object that are irrelevant to the recognition task. The correct optics ensure the elimination of an array of possible optical defects such as distortions, aberrations and vignetting. Band-pass filters allow users to control what the camera sees with greater contrast, lowering the cost of image processing. A very important parameter of a camera, one that affects the quality of the recognition task, is the resolution of its image sensor. It should be kept in mind, however, that attempting to obtain the perfect image expands the hardware structure, makes its calibration difficult, extends the processing chain at the stage of image processing and requires greater competencies from the system’s operator. A higher number of pixels, a greater colour depth of the analysed image or an excessive number of scanned points all extend the processing time. Thus, when designing a vision system, a rational compromise should be reached in this regard. An intelligent vision system is one which, based on a simplified image of an object, allows for the correct conclusions to be drawn in a short time, without extending the production cycle time.

The vast majority of the machine vision systems currently in use revolve around the recognition of simple elements with symmetrical shapes and small sizes, such as rings, sleeves, discs, fasteners and simple prismatic elements. In such cases, there are no problems with the quality of object recognition or with the system’s short response time. Cases where the objects subjected to recognition are characterized by complex shapes, asymmetrical perimeter contours and large sizes, and where they may appear in various areas of the image, are a different problem altogether. In such a case, it is necessary to use much more sophisticated methods, dedicated to a given machine part and oriented towards properties which are specific to it. Such parts include the rear wheel pin, which often appears in various forms and is one of the more irregular elements found in the construction of a car. In order to determine the position, location and orientation of the pin on a belt conveyor unambiguously (Fig. 1), we should specify: the inclination angle A0 of the pin’s axis relative to the XY plane associated with the conveyor belt surface, the orientation Axy of the pin as the angle between the positive direction of the X-axis and the vertical projection of the pin’s axis onto the XY plane, as well as the point of intersection of the pin’s axis and the XY plane (Fig. 2), which is where the origin of the local coordinate system should be situated.

Fig. 1
figure 1

Test stand: 1-rear wheel pin, 2-industrial robot, 3-camera, 4-lighting, 5-control cabinet

Fig. 2
figure 2

Parameters determining the position, location and orientation relative to the XY plane

The different placements of the pin on the conveyor belt stem from the complex shape of its perimeter, which results in the six possible positions shown in Fig. 3. In such a case, the standard functions of vision systems have significant trouble with recognizing the position accurately, especially in situations when the pin may additionally be displaced relative to the axis of the optical system.

Fig. 3
figure 3

Possible positions of the pin on the conveyor belt

3 The method of recognizing the position, location and orientation of complex objects

A typical machine vision system performs three basic tasks:

  • Detection—detecting the presence of a transported object,

  • Classification—determining which object is in the field of view of the camera,

  • Determination of coordinates—determining the position and orientation of the detected object relative to the adopted coordinate system.

Thus, the basic functions of the software of such a system include image acquisition, preliminary image processing (filtering, binarization, segmentation, dilation, erosion, skeletonization) to a form which facilitates further processing, extraction of the characteristic properties of an object, classification, and final assessment and/or decision. In this work, the author proposes a classification method based on the principle of “minimum calculation, maximum information”. The successive steps of this method are presented below. To show its advantages, a monochromatic image of a pin which has not been subjected to digital processing, featuring shadows and a number of light reflections on the surface of the recognized object, was chosen as the point of departure (Fig. 4a).

Fig. 4
figure 4

Preliminary processing of an image: a unprocessed image, b image after binarization, c fragment of the perimeter before filtration, d fragment of the perimeter after filtration

3.1 Image binarization

The initial transformation of an image consists of its binarization (Fig. 4a–b) and filtration (Fig. 4c–d). Binarization is one of the basic preliminary transformations of an image. It is a process of converting the points of colour or monochromatic images into binary images containing only white or black pixels. A binary image is represented by a matrix containing only zeros and ones. Binarization is most often executed by thresholding as a built-in function of an intelligent camera. A current presentation of effective thresholding methods is included in [3]. For the task in question, the loss of information that follows binarization is not significant for the quality of the recognition process; however, it substantially shortens the time it takes to perform the subsequent stages of image processing. Determining the threshold T is most often done manually while calibrating the vision system. It may, however, be done automatically, for example, by measuring entropy [2]. The priority in determining the threshold is to render the entire object black, even at the expense of including small, shaded background areas whose automatic removal from the image of the object is difficult. An improperly chosen binarization threshold may lead to determining the external shape of the object incorrectly. Therefore, in the section devoted to verifying the proposed method, tests of its sensitivity to an improperly fixed outline of the recognized object are presented.
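
As an illustration of automatic, entropy-based threshold selection, a minimal sketch of Kapur’s classic entropy criterion is given below. It is offered only as an example in the spirit of [2], whose exact method may differ; Python with NumPy is assumed, with the image supplied as an 8-bit greyscale array.

```python
import numpy as np

def kapur_threshold(gray: np.ndarray) -> int:
    """Pick T maximizing the summed entropies of the background
    (a <= T) and object (a > T) grey-level distributions."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    best_t, best_h = 0, -np.inf
    for t in range(1, 255):
        pb, pf = p[:t].sum(), p[t:].sum()
        if pb == 0 or pf == 0:
            continue
        b = p[:t][p[:t] > 0] / pb          # background distribution
        f = p[t:][p[t:] > 0] / pf          # object distribution
        h = -(b * np.log(b)).sum() - (f * np.log(f)).sum()
        if h > best_h:
            best_t, best_h = t - 1, h      # so that a > T marks the object
    return best_t
```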

Transformations with the use of filters consist of modifying individual pixels depending on their states as well as the states of their surroundings. After binarization, the presented method necessitates the use of simple filters which enable the removal of single white pixels or strings of white points forming a line. Such points, when located in the immediate vicinity of the object’s perimeter, may affect its determined contour. The condition for changing the colour of a single white point is the lack of other white points in its von Neumann neighbourhood (2). The condition for changing the colour from white to black is the presence of more than five black points in its Moore neighbourhood (3). The integrated threshold binarization, along with the filter eliminating single white points and linear strings of white points from the image, is given by relation (1).

$$ b(i,j) = \begin{cases} 1 & \text{if } a(i,j) > T \\ 1 & \text{if } \forall\, p(x,y) \in N^{v}_{(i,j)}:\ a(x,y) > T \\ 1 & \text{if } \sum_{k=1}^{8} b(x,y) > 5,\ p(x,y) \in N^{M}_{(i,j)} \\ 0 & \text{otherwise} \end{cases} $$
(1)

where a(i, j) is the value of a pixel in the original image, b(i, j) the value of a pixel in the binary image (1 for image points, 0 for background points), p(x, y) the point with coordinates x, y, \( N^{v}_{(i,j)} \) the von Neumann neighbourhood with a radius of 1 of the point with coordinates i, j, \( N^{M}_{(i,j)} \) the Moore neighbourhood with a radius of 1 of the point with coordinates i, j, and T the threshold value.

A von Neumann neighbourhood with the range of r is defined as:

$$ N^{v}_{(x_{0},y_{0})} = \left\{ (x,y) : \left| x - x_{0} \right| + \left| y - y_{0} \right| \le r \right\}. $$
(2)

A Moore neighbourhood with the range of r is defined as:

$$ N^{M}_{(x_{0},y_{0})} = \left\{ (x,y) : \left| x - x_{0} \right| \le r,\ \left| y - y_{0} \right| \le r \right\}. $$
(3)
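
A direct, single-pass reading of relations (1)–(3) can be sketched as follows; border pixels and the question of whether the filter is applied iteratively are simplifying assumptions made here for brevity.

```python
import numpy as np

def binarize_and_filter(a: np.ndarray, T: int) -> np.ndarray:
    """Integrated thresholding and filtering, Eq. (1); 1 marks object
    (black) points, 0 marks background (white) points."""
    b = (a > T).astype(np.uint8)
    out = b.copy()
    h, w = b.shape
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            if b[i, j]:
                continue
            # Eq. (2) condition: an isolated white point whose whole
            # von Neumann neighbourhood is above the threshold turns black
            if b[i - 1, j] and b[i + 1, j] and b[i, j - 1] and b[i, j + 1]:
                out[i, j] = 1
                continue
            # Eq. (3) condition: more than five black points in the Moore
            # neighbourhood turn a white point black (removes thin lines)
            if b[i - 1:i + 2, j - 1:j + 2].sum() > 5:
                out[i, j] = 1
    return out
```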

Currently, for the purposes of binarization and simple image filtering, specialist processors that perform hardware-only parallel processing of all the points of an image are used increasingly commonly.

As may be seen in Fig. 4, binarization removed the vast majority of distortions in the form of shadows. Small areas of the background, in the form of overlapping shadows assigned to the object, are only found in the concave fragments of the perimeter, which has a very limited impact on the quality of the recognition process in the proposed method. Meanwhile, part of the object’s interior has been treated as the background, which would, of course, affect the calculation of the object’s area, and especially the centre of gravity, determined relative to all the points of the image which are not the background. The centre of gravity is also known as the centroid (Fig. 5b). However, the suggested method does not take into account the area of the object, which, in the opinion of its author, is one of its advantages. The centre of gravity is determined based on a set of the perimeter’s points; thus, it is the centre of gravity of the external contour.

Fig. 5
figure 5

a Analysed object, b contour of the object and its centre of gravity

3.2 Determining the contour of an object

Detecting the external contour of an object is one of the most important techniques used in image segmentation and the identification of objects present in an image. Edge detection as an image processing operation is sometimes difficult because images may contain areas characterized by different degrees of noise, uniformity of edges, blurriness and clarity. There is a plethora of techniques used to detect the external contour of recognized objects, and they differ significantly from one another in terms of their computational complexity and effectiveness. The vast majority of algorithms for detecting the edges of photographed objects are also suitable for this task [21]. They are comprehensively described in [15] and remain a frequent topic of current scientific publications [4, 7, 12, 17]. The advantages and disadvantages of the most popular of these methods were assessed, among others, in [9]. The majority of suggested approaches are based on the use of effective filters [10]. It is worth noting that the applied filters should detect edges and simultaneously smooth out their contours to a sufficient degree [13]. Here, it should be emphasized that most of these methods are too complex to be used in the relatively simple problem of detecting the external contour in a binary image, with which we are dealing at this stage of processing.

A frequently recommended method of determining the contour of a binary image has been proposed in [14: alg. 5.6]. For this algorithm to run without interruption, the edges of the binary image must be sufficiently smoothed beforehand; otherwise, its operation terminates prematurely. Such a case is illustrated in Fig. 6.

Fig. 6
figure 6

a Fragment of a binary image, b unfinished contour

In this paper, an original method of detecting the external contour, along with determining the contour’s centre of gravity and simultaneously smoothing it out, has been proposed. The method is defined on the foundation of the theory of cellular automata [5]. In the first step of this method, any point belonging to the external contour of an object should be detected. The subsequent point of the contour, adjacent to the one that was detected previously, is determined on the basis of four simple principles:

  • The point belongs to the Moore neighbourhood (3) of the point of contour that was last determined;

  • The point does not belong to the inside of the contour, i.e. its von Neumann neighbourhood (2) has at least one white pixel;

  • The point is black;

  • The Moore neighbourhood of the analysed point, apart from containing points that have already been verified as belonging to the contour, also contains at least two black points belonging to the recognized object.

The state function s(x, y) of a cell c is defined as:

$$ s(x,y) = \begin{cases} 1, & \text{if black} \\ 0, & \text{if white} \end{cases} $$
(4)

A cell c(x, y) belongs to the outer contour of an object if the conditions 5, 6, 7, 8 are met:

$$ c(x,y) \in N^{M}_{(x_{0},y_{0})}, $$
(5)
$$ \sum_{k=1}^{4} s(x_{k},y_{k}) < 4, \quad c(x_{k},y_{k}) \in N^{v}_{(x,y)}, $$
(6)
$$ s(x,y) = 1, $$
(7)
$$ \sum_{k=1}^{8} s(x_{k},y_{k}) \ge 2, \quad c(x_{k},y_{k}) \in N^{M}_{(x,y)},\ c(x_{k},y_{k}) \notin C, $$
(8)

where C is a set of cells that have already been included in the contour.

The above procedure should be repeated until the contour is closed. The effect of smoothing out the perimeter is observed in Fig. 7. Black pixels “weakly” associated with the object were removed from the perimeter because they did not meet condition (8).

Fig. 7
figure 7

Effect of smoothing out the perimeter: a part of the perimeter, b corresponding fragment of the contour

In parallel with detecting the contour, the number of contour points K and the sums of their coordinates are calculated; these are used to calculate the coordinates Xc and Yc of the contour’s centre of gravity, determined in accordance with relations (9) and (10):

$$ X_{\text{c}} = \frac{1}{K}\sum_{k = 1}^{K} x_{k}, $$
(9)
$$ Y_{\text{c}} = \frac{1}{K}\sum_{k = 1}^{K} y_{k}, $$
(10)

where K is the number of all the determined points of the contour.
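
A minimal sketch of the contour-following rules (5)–(8), together with the centroid calculation (9)–(10), is given below. The detection of the starting point, the scanning order of the Moore neighbourhood and the closing test are simplifying assumptions, and the object is assumed not to touch the image border.

```python
MOORE = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def trace_contour(img, start):
    """Trace the external contour from a known starting contour point,
    applying conditions (5)-(8), and compute the centroid (9)-(10).
    img: binary array (1 = object point), indexed as img[x, y]."""
    contour, in_contour = [start], {start}
    current = start
    while True:
        nxt = None
        for dx, dy in MOORE:                              # condition (5)
            x, y = current[0] + dx, current[1] + dy
            if (x, y) in in_contour or img[x, y] != 1:    # condition (7)
                continue
            if all(img[x + a, y + b] for a, b in VON_NEUMANN):
                continue                                  # condition (6): interior
            support = sum(img[x + a, y + b] for a, b in MOORE
                          if (x + a, y + b) not in in_contour)
            if support < 2:                               # condition (8): weakly
                continue                                  # attached pixel, skip
            nxt = (x, y)
            break
        if nxt is None:                                   # contour closed
            break
        contour.append(nxt)
        in_contour.add(nxt)
        current = nxt
    K = len(contour)
    xc = sum(p[0] for p in contour) / K                   # Eq. (9)
    yc = sum(p[1] for p in contour) / K                   # Eq. (10)
    return contour, (xc, yc)
```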

3.3 Determining the position pattern

In order to determine the position pattern, the obtained contour is rotated about its centre of gravity in a sequence of 360 one-degree steps (Fig. 8a). The pattern Pp, specific to one of the six possible positions p of the rear wheel pin, is an ordered set of 360 points (Fig. 8b):

$$ P^{p} = \left\{ \left( A_{0}, X_{\max_{0}} \right), \left( A_{1}, X_{\max_{1}} \right), \ldots, \left( A_{359}, X_{\max_{359}} \right) \right\}, $$
(11)

where A is the angle of rotation of the pin’s outline in the XY plane, and Xmax the radial coordinate of the rightmost point of the contour.

Fig. 8
figure 8

a Principle of determining Xmax, b radial chart of the position pattern

Before saving the pattern in the database, two simple transformations should be performed: offsetting and normalizing the points. Offsetting the points makes it possible to determine the angular position Axy of the pin during the recognition process. In a simple case, it consists of moving the points in the list, while preserving their previous order, in such a way that the point with the minimal Xmax value is located at the beginning of the list. This type of solution is only allowed if an unambiguous minimum is present in the set of Xmax values. In cases where the minimal value may be assigned to several different points of the contour, the more complex method of global matching should be applied, which consists of shifting the set of points by the offset that gives the highest value of the similarity index.

Normalization consists of dividing all the Xmax coordinates by the minimal value, corresponding to the first position on the offset list, in accordance with (12).

$$ \breve{X}_{\max_{i}} = \frac{X_{\max_{i}}}{\min\left\{ X_{\max_{0}}, X_{\max_{1}}, \ldots, X_{\max_{359}} \right\}}, $$
(12)

where \( \breve{X}_{\max_{i}} \) is the normalized coordinate of the maximal declination for the i-th angular value.

Normalization allows for objects to be compared while taking into account the different scales of their images resulting, for instance, from a change in the height at which the camera’s lens is located, or applying a higher raster resolution when recording the image.
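
The construction of the pattern, its offsetting and its normalization (12) can be sketched as follows. Here Xmax is taken, by assumption, as the largest x-extent of the rotated contour measured from the centroid (one reading of Fig. 8a), and the simple argmin offset is used; the global-matching variant is sketched in Sect. 3.4.

```python
import numpy as np

def position_pattern(contour, centroid):
    """360-point position pattern of Eq. (11), offset and normalized.
    contour: iterable of (x, y) contour points; centroid: (Xc, Yc)."""
    pts = np.asarray(contour, dtype=float) - np.asarray(centroid, dtype=float)
    r = np.hypot(pts[:, 0], pts[:, 1])
    theta = np.arctan2(pts[:, 1], pts[:, 0])
    # rightmost extent of the contour after rotation by each angle A
    xmax = np.array([np.max(r * np.cos(theta - np.deg2rad(a)))
                     for a in range(360)])
    xmax = np.roll(xmax, -int(np.argmin(xmax)))   # offsetting step
    return xmax / xmax[0]                         # normalization, Eq. (12)
```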

3.4 Determining similarity to the pattern

In order to compare the similarity of the image of an analysed object to one of the patterns stored in the database, the set Pr should be determined in the same way as the sets Pp were determined for the patterns. Figure 9 illustrates an example of the mutual distribution of normalized maximal declination coordinates for a pattern and a recognized object, as well as their absolute differences for subsequent angular positions. Similarity is intuitively understood as the minimization of these differences.

Fig. 9
figure 9

Example distribution of normalized maximal declination coordinates

If we were to treat the maximal declination coordinates generated for the pattern and the recognized object as two points in a 360-dimensional, homogeneous Euclidean space, then the length of the vector connecting these points (13), that is, the Euclidean metric, may serve as a similarity measure:

$$ \vec{V}_{pr} = \sqrt{\sum_{i = 0}^{359} \left( X_{\max_{pi}} - X_{\max_{ri}} \right)^{2}}, $$
(13)

where \( \vec{V}_{pr} \) is the length of the vector between the positions of the pattern p and the recognized object r in the multidimensional Euclidean space. A similarity measure expressed as the length of a vector in a multidimensional state space still does not tell the user much. Similarity should instead be treated in terms of a membership function of a fuzzy set. Hence, it was decided that the similarity index Is should be determined in accordance with relation (14), illustrated in Fig. 10:

$$ I_{\text{s}} = \frac{1}{\exp\left( 0.5\,\vec{V}_{pr} \right)} $$
(14)
Fig. 10
figure 10

Behaviour of the function of belonging to the set of similar objects depending on the length of the vector \( \vec{V}_{pr} \)

Such a form of the similarity index means Is = 1 for identical images and Is close to 0 for images that do not demonstrate traits of similarity. For the example in Fig. 9, the calculated similarity index equalled 0.25. Such a value of Is should be interpreted as indicating that some similarity is present, but that it is not sufficient for the object to be included in the category of the pattern.
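
A sketch of the comparison step, Eqs. (13) and (14), combined with the global-matching search over all cyclic offsets (Sects. 3.3 and 4.4), might look as follows:

```python
import numpy as np

def similarity_index(pattern: np.ndarray, recognized: np.ndarray):
    """Similarity of two 360-point normalized signatures; returns
    (best Is, best offset in degrees)."""
    best_is, best_off = 0.0, 0
    for off in range(360):
        v = np.linalg.norm(pattern - np.roll(recognized, off))  # Eq. (13)
        i_s = 1.0 / np.exp(0.5 * v)                             # Eq. (14)
        if i_s > best_is:
            best_is, best_off = i_s, off
    return best_is, best_off
```

Classification then amounts to evaluating this index against all six stored patterns and selecting the highest Is; the returned offset simultaneously indicates the angular orientation of the object, as in the 270° example of Sect. 4.4.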

The approach proposed above distinguishes objects based on the external, convex parts of the perimeter. In cases where the external fragments of the perimeter are not sufficiently diverse, the same method may be applied to an inverse image of the perimeter’s outline (Fig. 11), for which subsequent points of the contour are determined from relation (15):

$$ R_{k}^{i} = R_{\max} - R_{k}, $$
(15)

where \( R_{k}^{i} \) is the radius of the k-th point of the contour in the inverse image, \( R_{\max} \) the maximal declination of a contour point from the centre of gravity, and \( R_{k} \) the radius of the k-th point of the contour.

Fig. 11
figure 11

Image of the object (a) and its inverse image (b)

Such an approach allows the geometrical properties of the object associated with the internal, concave fragments of the contour to be taken into account. If such a need arises, both approaches may be aggregated by applying a soft logical product of their Is indices.
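
A sketch of the inverse signature (15) and of the aggregation of the two indices is given below; the algebraic product is assumed here as the soft logical product, which is only one of the possible fuzzy conjunctions.

```python
import numpy as np

def inverse_signature(r: np.ndarray) -> np.ndarray:
    """Radial signature of the inverse image, Eq. (15)."""
    return r.max() - r

def aggregate_similarity(is_outer: float, is_inner: float) -> float:
    """Soft logical product of the Is indices for the outer and inverse
    contours (algebraic product assumed here as the fuzzy AND)."""
    return is_outer * is_inner
```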

4 Verifying the method

4.1 Recognizing the position of a rear wheel pin

In order to verify the developed method, an element of a car in the form of a rear wheel pin was considered. The method was not tested on simple geometric elements such as triangles and rectangles because, often, the results of tests for simple elements do not translate into the application of the method for complex elements photographed in real production plant conditions. Six reference categories were established (Fig. 12), corresponding to the possible positions of the pin on the conveyor belt as shown in Fig. 3.

Fig. 12
figure 12

Reference outlines of the contour for the six pin positions

Figure 13 shows charts depicting the behaviour of the maximal declination coordinates for all six positions, expressed in pixels (Fig. 13a) and in normalized, dimensionless maximal declination coordinates (Fig. 13b). It may be concluded that in both cases a very good pattern separation was obtained, which promises good recognition quality. Such separation could not be obtained using simple methods such as calculating the area of an object or the length of its external contour. It may also be noted that the mutual relationships between the patterns change after normalization.

Fig. 13
figure 13

Diagrams illustrating the patterns of positions of the rear wheel pin a expressed in pixels, b expressed in normalized maximal declinations

Mutual similarity indices, calculated for the six reference position patterns, are depicted in Fig. 14. All the calculated indices have values of less than 0.25, which means that the developed method clearly differentiates between the tested images.

Fig. 14
figure 14

Mutual similarity indices for the six patterns (colours in accordance with Fig. 13)

4.2 Impact of the image’s resolution and the angular position of the object

The subsequent test verified the sensitivity of the method to rotating the elements and changes in the scale. The results of the tests for the selected position 5 are depicted in Fig. 15. The results clearly indicate that the method is not sensitive to either rotations or changes in the scale. The differences obtained result from the raster nature of the image.

Fig. 15
figure 15

Tests of the method’s sensitivity: a original image, b image rotated by 30°, c image reduced by 25%, d image rotated by 30° and reduced by 25%

4.3 Comparison of the results with tests based on shape coefficients

The results obtained were compared with the classic Blair-Bliss (16), Danielsson (17) and Haralick (18) shape coefficients [16]:

$$ R_{\text{B}} = \frac{S}{\sqrt{2\pi \sum_{i} r_{i}^{2}}}, $$
(16)
$$ R_{\text{d}} = \frac{S^{3}}{\left( \sum_{i} l_{i} \right)^{2}}, $$
(17)
$$ R_{\text{H}} = \sqrt{\frac{\left( \sum_{i} d_{i} \right)^{2}}{n \sum_{i} d_{i}^{2} - 1}}, $$
(18)

where S is the surface area of the object, ri the distance of the i-th object pixel from the object’s centre of gravity, i the index of an object pixel, li the minimal distance of the i-th object pixel from the object’s contour, di the distance between the i-th pixel of the object’s contour and the object’s centre of gravity, and n the number of the contour’s pixels.
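
A direct, unoptimized transcription of relations (16)–(18) is sketched below; the object is assumed to be given as a binary mask and its contour as an N×2 array of (x, y) points, with the per-pixel minimal distances li computed by brute force.

```python
import numpy as np

def shape_coefficients(mask: np.ndarray, contour: np.ndarray):
    """Blair-Bliss (16), Danielsson (17) and Haralick (18) coefficients.
    mask: binary object image; contour: N x 2 array of (x, y) points."""
    ys, xs = np.nonzero(mask)
    S = len(xs)                                           # object area
    cx, cy = xs.mean(), ys.mean()                         # centre of gravity
    r2 = (xs - cx) ** 2 + (ys - cy) ** 2
    rb = S / np.sqrt(2.0 * np.pi * r2.sum())              # Eq. (16)
    li = np.array([np.min(np.hypot(contour[:, 0] - x, contour[:, 1] - y))
                   for x, y in zip(xs, ys)])              # brute-force l_i
    rd = S ** 3 / li.sum() ** 2                           # Eq. (17)
    di = np.hypot(contour[:, 0] - cx, contour[:, 1] - cy)
    n = len(contour)
    rh = np.sqrt(di.sum() ** 2 / (n * (di ** 2).sum() - 1.0))  # Eq. (18)
    return rb, rd, rh
```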

Table 1 shows the calculated shape coefficients for the six possible positions of the pin and, additionally, for the fifth position after rotation, after re-scaling, and after simultaneous rotation and re-scaling.

Table 1 Shape coefficient values

The results clearly indicate that none of the three considered shape coefficients classifies perfectly. The Danielsson coefficient is the best at differentiating the assessments, but it struggles with classifying positions 4 and 5. The Blair-Bliss coefficient is very sensitive to the size of the area occupied by the image of an object, while the Haralick coefficient differentiates between the assessments to a very slight degree and, similarly to the Danielsson coefficient, struggles in cases of simultaneous rotation and change of scale. It should also be emphasized that the calculation time for the developed approach, similarly to the Haralick coefficient, falls within the time needed to determine the contour of the object, whereas the calculation times of the Danielsson and Blair-Bliss coefficients are 5 to 7 times longer.

4.4 Sensitivity of the method to errors in determining the contour of an object

The developed method is dedicated to recognizing complex machine parts under the real conditions of a production plant. Such machine parts are usually produced as castings or forgings with complex surfaces of variable stereometry and, additionally, convex or concave inscriptions. Images obtained in such conditions are thus characterized by a certain variability, resulting from differences in the relative positions of the light source, the camera lens and the photographed object. Light reflections from the metal surfaces of machine parts, whose complete removal is difficult, mean that the possibility of the contour of the photographed object being determined incorrectly needs to be taken into account. In particular, a case of the contour “spilling over” into the inside of the object may appear, which is difficult to diagnose. Such a situation greatly increases the requirements posed for methods of preliminary image processing and classification. For this reason, tests of the proposed method’s sensitivity to this case have been carried out.

A change in the location of the contour’s centre of gravity, which appears in the case of an erroneously generated contour of the photographed object, is a significant interference. As a consequence, the point with the minimal value \( \breve{X}_{\max_{i}} \) often changes and, as a further consequence, the sets of points generated in the polar system become mutually shifted. The similarity index Is calculated in such a case reaches a low value; for the case from Fig. 16a, Is equals 0.2. The solution to this problem is to offset the set of points by the value which gives the highest similarity index (Fig. 16b). The Is index calculated after the offset reached a value of 0.55.

Fig. 16
figure 16

Effect of shifting the pair of comparison graphs by the maximal declination caused by a change in the location of the minimal value

An example of how this approach works under different lighting conditions and camera sensitivity parameters is shown below for three cases. Figures 17a, 18a and 19a show three original images from which only the blue background (the colour of the conveyor belt) has been removed. Figures 17b, 18b and 19b show the images after binarization and filtering. Figures 17c, 18c and 19c show the determined contours. As may be seen, the contour in Fig. 17c reproduces the complete perimeter of the photographed object. The contour in Fig. 18c contains slight differences, whereas the contour in Fig. 19c contains significant deviations from the ideal contour. Figures 20, 21 and 22 show the respective graphs of the behaviour of the similarity index Is relative to the reference pattern of Fig. 12a, depending on the size of the offset.

Fig. 17
figure 17

Correctly determined contour: a original image, b image after binarization and filtration, c determined contour and its centroid

Fig. 18
figure 18

Determined contour with minor errors: a original image, b image after binarization and filtration, c determined contour and its centroid

Fig. 19
figure 19

Determined contour with large errors: a original image, b image after binarization and filtration, c determined contour and its centroid

Fig. 20
figure 20

Formation of the IS index in the case of a correctly determined contour

Fig. 21
figure 21

Formation of the IS index in the case of a contour containing minor errors

Fig. 22
figure 22

Formation of the IS index in the case of a contour containing significant errors

For the complete contour, the maximum Is index is high and equals 0.77 for an offset of 270. The reference pattern from Fig. 12a was developed on the basis of another photograph with a four times lower resolution and is rotated; hence, Is is less than 1. For the contour with minor errors, the maximum Is index equals 0.66 (Fig. 21); it is thus only slightly smaller than in the previous case and still much larger than the maximum similarity indices for the other reference position patterns. For the contour with large errors, the maximum similarity index reached a value of 0.197, which is an unacceptably low level. It should be noted, however, that it is still the highest of all the Is values determined for this case; furthermore, it indicates the correct rotation of the object by 270°.

5 Conclusion

Each non-trivial task of image recognition requires an individual approach. The present work introduces a solution to the problem of detecting the position and orientation of machine parts for which the standard methods have failed. A method of shape recognition was sought which would efficiently differentiate between objects with complex shapes, would be sensitive to a change in shape and would not be sensitive to a change in the representation of the object. The diverse behaviours of \( \breve{X}_{\max} \) depending on the angle A prove that the aim of developing distinct patterns for the positions of a rear wheel pin has been achieved. The developed approach is characterized by its low sensitivity to changes in scale and rotations of the image, unlike the classic shape coefficients. The method yields correct recognition results even in cases when the determined external outline of an object contains incorrect fragments. The short calculation time is practically identical to the time it takes to generate the contour points. The presented method of determining the contour requires slightly more time for calculations compared to the best solutions in this respect. This slight increase in calculation time results mainly from the integration of the contour determination function with smoothing and with determining the contour’s centre of gravity.

The developed method is not a universal one. It may be applied in cases when the external properties of a contour contain sufficient information for the purpose of differentiating between objects in the recognition task. Therefore, its application is limited to the classification of complex, asymmetrical objects whose properties related to the external outline are decisive. The method may be classified as intelligent because, on the basis of easily accessible, incomplete information about an object, it enables the appropriate classification of the photographed object, as well as determining its position and orientation. The applied algorithm for calculating the similarity index is suited to parallel calculations, which should be a property of all methods classified as intelligent and prospective. Further directions for researching the presented method include reaching a rational compromise between the required resolution of the analysed image and the quality of the obtained results. Another interesting issue would be to examine the effectiveness of the developed method in comparison with methods of processing point models obtained by way of optical triangulation.