Granular computing in mosaicing of images from capsule endoscopy
Abstract
This article introduces methods for modeling compound granules used in algorithms that can successfully construct a mosaic from the images coming from an endoscope capsule. In order to apply the algorithm, the combined images must have a common area in which the correspondence of points is determined. This allows the transformation parameters to be determined that compensate for the movement of the capsule occurring between the moments when the mosaic images were acquired. The developed algorithm for capsule endoscopy images has proved to be faster than, and comparably accurate to, the commercial GDBICP algorithm.
Keywords
Granular computing · Capsule endoscopy · Image registration · Image mosaic · Keypoints matching
1 Introduction
1.1 Endoscope capsule
The process of endoscopic examination by means of the capsule goes as follows. First, an antenna device is placed on the patient’s belly, making it possible to receive the visual signal transmitted by the capsule. The patient swallows an endoscope capsule, which travels through the whole human gastrointestinal tract taking 2 images per second, which gives, according to the producer, over 50,000 images during the examination. The capsule is disposable. The images sent to an external device are stored on the disc of a computer with the Rapid application, which serves as a browser for the images.
1.2 Image mosaicing and image registration
Image registration generally consists of the following four steps:
1. Feature extraction. The important and distinctive structures are extracted (areas with closed boundaries, edges, outlines, intersections of lines, corners, etc.). For further processing, these features can be represented by points (centers of gravity, line endings, distinctive points), which in the literature are called control points (see Zitova and Flusser 2003).
2. Feature matching. A correspondence is found between the features in the processed image and the features found in the reference image. For this purpose, a variety of feature descriptors and similarity measures, together with the spatial relationships among features, are used.
3. Estimation of the transformation model. The type and model of the mapping function between the overlaid image and the reference image is determined. The mapping function parameters are calculated using the established feature correspondences.
4. Resampling and image transformation. The relevant image is transformed using the mapping function. The values of the image at points with non-integer coordinates are computed by a suitable interpolation technique.
The keypoint selection and 2-nearest-neighbors matching technique comes from the SIFT algorithm, which is the standard and most popular technique for matching images. However, a slightly modified 2-nearest-neighbors method (finding the N best matches by sorting them in terms of the ratio of the distances to the 2 nearest neighbors), which comes from the GDBICP algorithm, was used.
2 The developed algorithm for image mosaicing
This section presents the results of research which led to the creation of an original algorithm for mosaicing images from the endoscopy capsule, as well as a complete description of the algorithm. The algorithm is the result of a synthesis of techniques and algorithms described in the literature and designed for its needs. It has been called the “Quadruple keypoints matching and perspective transformation testing (QKMPTT)” algorithm (Maciura 2012). In this paper we present this algorithm as a kind of granular computing.
2.1 Preprocessing
In the developed algorithm, the preprocessing of the input images (two initial granules) plays an important role. It allows, at a further stage (i.e., during the extraction of keypoints), the selection of those points that will be more useful in the search for corresponding pairs. The first step of image preprocessing is the detection of noise, which should not be taken into account when identifying and matching keypoints, since it is related to the content moving along the gastrointestinal tract. The detection of the noise involves thresholding the saturation (S) channel in the HSV color space (Palus and Bereska 1995). What is used here is the fact that the ingesta has a much lower saturation level than the wall of the gastrointestinal tract.
Finally, using logical operations on binary images, a binary image is created whose pixels do not belong to the noise and at the same time belong to the main edges and corners. This image will be used for the subsequent rejection of keypoints that do not lie in the right areas.
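The saturation-based noise detection described above can be sketched as follows. This is an illustrative NumPy re-implementation, not the authors' code; the threshold value `s_thresh` is an assumed parameter, as the paper does not state the one used:

```python
import numpy as np

def saturation_noise_mask(rgb, s_thresh=0.2):
    """Mark low-saturation pixels (likely ingesta/noise) in an RGB image.

    Returns a boolean mask that is True where the HSV saturation of a
    pixel falls below s_thresh, exploiting the fact that the ingesta has
    a much lower saturation than the gastrointestinal wall.
    """
    f = rgb.astype(np.float64) / 255.0
    cmax = f.max(axis=2)
    cmin = f.min(axis=2)
    # HSV saturation: S = (max - min) / max, defined as 0 for black pixels
    sat = np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-12), 0.0)
    return sat < s_thresh
```

The mask can then be combined, with logical operations, with an edge/corner mask to obtain the binary image used for keypoint rejection.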
2.2 The keypoints of input images
The next step of this algorithm is to extract the keypoints of the input images (the collection of keypoints of an image is called a KP granule). As a result of a comparison of the available algorithms in terms of their characteristics, as well as preliminary studies comparing keypoint registration techniques, the SIFT technique was selected as the matching algorithm for single keypoints in the developed algorithm. Thanks to the SIFT technique we obtain, for any keypoint, a vector of qualities (features), called a vector of SIFT descriptors. Such vectors are computed on the basis of the keypoint's surroundings and are used for further calculations. In this way the KP granules are complemented by two VKP granules, which are the collections of vectors calculated for all keypoints of the corresponding KP granules. The next step is the removal of the keypoints (from the KP granules and from the VKP granules) that lie on the edge of the screen and those that lie in the regions identified as noise caused by gastric contents (whose detection was described in Sect. 2.1).
At the same time, the keypoints that do not lie in the areas around the major edges and corners are discarded, based on the binary image described previously.
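The three rejection criteria (screen border, noise regions, edge/corner neighbourhoods) can be combined in a small filter. A minimal sketch, assuming the masks are boolean arrays and the keypoints are (x, y) coordinates; the border margin is an illustrative parameter:

```python
import numpy as np  # the masks are assumed to be NumPy boolean arrays

def filter_keypoints(points, noise_mask, edge_region_mask, border=10):
    """Keep only keypoints that lie away from the image border, outside
    the noise mask, and inside the areas surrounding major edges/corners."""
    h, w = noise_mask.shape
    kept = []
    for x, y in points:
        xi, yi = int(round(x)), int(round(y))
        if xi < border or yi < border or xi >= w - border or yi >= h - border:
            continue  # too close to the edge of the screen
        if noise_mask[yi, xi]:
            continue  # lies in a region identified as gastric-content noise
        if not edge_region_mask[yi, xi]:
            continue  # not in the neighbourhood of a dominant edge/corner
        kept.append((x, y))
    return kept
```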
2.3 The technique for finding the four best correspondences
The technique for finding the four best correspondences has been developed for this algorithm and is one of its main components. It is based on comparing all possible quadruple combinations of the N best-rated matches (granules of pairs of VKP granules), rated in terms of the ratio of the distance of the matched descriptor to its two nearest neighbors (in the sense of similarity), and finding the best quadruples. The algorithm (and the number N) for finding the N best-rated matches comes from the initial step of the generalized dual-bootstrap iterative closest point (GDBICP) algorithm (Yang et al. 2007; Yang 2007) and from our own experiments.
After sorting all quadruple combinations in terms of their evaluation function, the correct solution (if any) should be found among the top quadruples. This technique works best under the assumption that the overall perspective transformation between the images is similar to an affine transformation.
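The quadruple search can be sketched as a brute-force enumeration over the N best matches. This is an illustrative sketch, not the published implementation; each match is assumed to carry its nearest-to-second-nearest distance ratio (lower is better), and the quadruple score is taken here simply as the sum of these ratios:

```python
from itertools import combinations

def best_quadruples(matches, n_best=30, m_top=5):
    """Enumerate all quadruples among the n_best best-rated matches and
    return the m_top quadruples with the lowest combined ratio score.

    matches: list of (ratio, src_pt, dst_pt) tuples, where ratio is the
    distance to the nearest descriptor divided by the distance to the
    second nearest (the SIFT 2-nearest-neighbours criterion).
    """
    pool = sorted(matches, key=lambda m: m[0])[:n_best]
    scored = sorted(combinations(pool, 4), key=lambda q: sum(m[0] for m in q))
    return scored[:m_top]
```

Note that the enumeration visits C(N, 4) combinations, which is why keeping N small (as borrowed from GDBICP's initial step) matters for speed.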
2.4 Setting the optional perspective transformations between images
The next step is to determine different versions of the perspective transformation between the pair of images. This is done repeatedly in the main loop of the program, where different versions of the transformation are calculated from the \(M\) best SIFT correspondence quadruples. Finally, the best version of the transformation is considered valid and used to create the mosaic. The transformation is computed using an algorithm that determines a perspective transformation from the correspondence of four points. By analyzing the perspective transformation matrix, it is possible to reject, already at this stage, erroneous transformations resulting from incorrect quadruple matches: one can check whether certain transformation parameters have reasonable values. This significantly speeds up the algorithm, as incorrect transformations need not be analyzed further and the algorithm can proceed to the next iteration of the main loop.
2.5 Estimation of perspective transformation matrix using four matched keypoints
This algorithm is implemented in the OpenCV library in the function cvGetPerspectiveTransform. The result of this algorithm is a \(3 \times 3\) perspective transformation matrix (Eq. 2). Four matched keypoints are the minimal number needed to calculate a perspective transformation matrix.
2.6 Transformation of image for evaluation of transformation matrix
In order to select the best transformation matrix, each candidate transformation must first be performed and evaluated. When the transformation is known in the form of a transformation matrix, performing it is straightforward. The transformation is carried out using the perspective transformation matrix (Eq. 2) and a bilinear interpolation technique (Goshtasby 2005).
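The warping scheme (inverse mapping through the perspective matrix plus bilinear interpolation) can be sketched in NumPy for a single-channel image; this is an illustrative re-implementation of the scheme, not the OpenCV routine actually used:

```python
import numpy as np

def warp_perspective(img, H, out_shape):
    """Warp a 2-D (grayscale) image by the perspective matrix H.

    Every output pixel is mapped back through H^-1 and the source image
    is sampled with bilinear interpolation; pixels mapping outside the
    source are set to 0.
    """
    h_out, w_out = out_shape
    h, w = img.shape
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs, ys, np.ones_like(xs)], axis=-1).reshape(-1, 3).T
    src = np.linalg.inv(H) @ pts.astype(float)
    sx, sy = src[0] / src[2], src[1] / src[2]
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    fx, fy = sx - x0, sy - y0
    valid = (x0 >= 0) & (y0 >= 0) & (x0 < w - 1) & (y0 < h - 1)
    x0c, y0c = np.clip(x0, 0, w - 2), np.clip(y0, 0, h - 2)
    # bilinear interpolation between the four neighbouring source pixels
    top = img[y0c, x0c] * (1 - fx) + img[y0c, x0c + 1] * fx
    bot = img[y0c + 1, x0c] * (1 - fx) + img[y0c + 1, x0c + 1] * fx
    return np.where(valid, top * (1 - fy) + bot * fy, 0.0).reshape(out_shape)
```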
2.7 Finding the best transformation
The next step in the main loop of the program is the analysis of the performed transformations in order to select the matrix of the best-evaluated transformation. This matrix will eventually be used to create the mosaic. The search for the best transformation matches the edges of the transformed target image against those of the reference image and counts the edge points that overlap in the common part of both images and have a similar orientation (angle of inclination of the edge). The number of these points is the evaluation of the analyzed transformation. The edges of the images are extracted using the Canny edge detector (Canny 1986). In addition to the edges themselves, the orientation of the image points must be calculated; it is used later to compare the matched edges in terms of the difference in their orientation angles. The orientation is computed from the partial derivatives of the image brightness function. The evaluation function counts the number of overlapping edge points in both binary images that have a similar orientation: the more such points, the better the transformation.
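The evaluation described above reduces to counting co-located edge pixels with similar gradient orientation. A minimal sketch, assuming binary edge maps and per-pixel orientation maps in radians; the 10° tolerance is an assumed value, as the paper does not state the threshold used:

```python
import numpy as np

def score_transformation(edges_ref, edges_warped, ori_ref, ori_warped,
                         max_diff=np.deg2rad(10)):
    """Return the number of pixels that are edges in both binary edge
    maps and whose gradient orientations differ by less than max_diff.
    The higher the count, the better the tested transformation."""
    overlap = edges_ref & edges_warped
    # wrap the orientation difference into [-pi, pi] before comparing
    diff = np.abs(np.angle(np.exp(1j * (ori_ref - ori_warped))))
    return int(np.count_nonzero(overlap & (diff < max_diff)))
```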
2.8 Normalization of windows
During the mosaicing process, a very important element is the creation of suitable windows for the transformed image and for the resulting mosaic. It often happens that, as a result of a transformation, the transformed image and the mosaic are bigger than the input images and shifted with respect to them. If the same windows were used for the transformed image and the mosaic as for the input images, the result might not fit in the window. Apart from increasing the size of the window, the input images must also be shifted so that the result of the transformation does not exceed the window on the left side or at the top. The window normalization itself starts after the best transformation has been found. At the beginning of the window normalization, the algorithm, using the equation of a circle, determines which coordinates \(P(x, y)\) lie on the circular border of the field of view of the capsule endoscope image (given by the first parameter of the program). After determining the boundary points, the algorithm calculates the new coordinates at which they will land after the image transformation. These coordinates are calculated on the basis of the transformation matrix. The next step is to find the four most extreme points in the obtained set of new boundary-point coordinates: the points located furthest to the left, right, top and bottom. From these, the required size of the window and the required shift vector of the images can be calculated (so that the resulting image does not exceed the left or the upper edge of the window).
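The window normalization step can be sketched as follows: the field-of-view boundary points are pushed through the transformation matrix, their extremes are found, and the window size and shift vector are derived. An illustrative NumPy sketch, not the authors' code:

```python
import numpy as np

def normalize_window(boundary_pts, H, base_size):
    """Compute the output window size and the image shift vector.

    boundary_pts: (x, y) points on the circular border of the field of
    view; H: 3x3 perspective matrix; base_size: (width, height) of the
    input images. The shift moves the images right/down just enough so
    that the transformed result does not cross the left or top edge.
    """
    pts = np.asarray(boundary_pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]).T
    t = H @ homog
    xs, ys = t[0] / t[2], t[1] / t[2]          # transformed boundary points
    w, h = base_size
    shift = (max(0.0, -xs.min()), max(0.0, -ys.min()))
    width = int(np.ceil(max(xs.max(), w) + shift[0]))
    height = int(np.ceil(max(ys.max(), h) + shift[1]))
    return (width, height), shift
```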
2.9 The final transformation and the creation of mosaic
The best transformation matrix, determined on the basis of the corresponding points in the images after normalization (Sect. 2.8), is used for the final transformation of the normalized target image. To complete the transformation with resampling, the function cvWarpPerspective from the OpenCV library was used. It performs the transformation on the basis of the perspective transformation matrix with bilinear interpolation. The last step of creating the mosaic is the so-called image fusion (resulting in the final granule). In general, image fusion means an appropriate combination of the information of two or more images. A broader concept of fusion covers the registration of images followed by combining the corresponding pixels. The research established the following image fusion algorithms (left for the user to choose): fusion by the arithmetic mean of the RGB channels, fusion by maximizing the value of the RGB channels, fusion by color mixing, and fusion combined with noise reduction. Fusion by the arithmetic mean inserts into the output image pixels obtained by averaging the corresponding RGB channels of the corresponding pixels of the pair of images. Fusion by maximization inserts pixels obtained by taking the channel-wise maximum of the corresponding pixels. Fusion by color mixing inserts pixels obtained as a weighted average of the RGB channels of the corresponding pixels; the weighting factors are calculated from the ratio of the distances of a pixel from the boundaries of the shared part of the images within each image. The result is a smooth transition from the shared part of the images to the individual images. Fusion combined with noise removal selects for the output image, from the two corresponding input pixels, the one with the higher color saturation (the S channel in the HSV color space). This exploits the fact that the noise and the black background around the proper image have a much lower saturation level.
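Three of the four fusion variants can be sketched directly in NumPy (the color-mixing variant additionally needs per-pixel distances to the overlap boundary and is omitted for brevity). An illustrative sketch, not the authors' code:

```python
import numpy as np

def fuse_average(a, b):
    """Arithmetic mean of the corresponding RGB channels."""
    return ((a.astype(np.uint16) + b) // 2).astype(np.uint8)

def fuse_max(a, b):
    """Channel-wise maximum of the corresponding pixels."""
    return np.maximum(a, b)

def fuse_denoise(a, b):
    """Per pixel, keep the input with the higher HSV saturation; noise
    and the black background have a much lower saturation level."""
    def saturation(img):
        f = img.astype(np.float64)
        cmax, cmin = f.max(axis=2), f.min(axis=2)
        return np.where(cmax > 0, (cmax - cmin) / np.maximum(cmax, 1e-12), 0.0)
    pick_a = saturation(a) >= saturation(b)
    return np.where(pick_a[..., None], a, b)
```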
2.10 A comprehensive presentation of the developed algorithm of image mosaicing
1. Reading of the target image and the reference image (initial granules),
2. Determination of the uncertain parts of the images,
3. Detection of the reference image edges using the Canny technique and removal of those caused by noise,
4. Calculation of the gradient orientation for the points of the reference image,
5. Designation of the areas surrounding the dominant edges and corners,
6. Determination of the SIFT keypoints (KP and VKP granules) in the areas identified in the previous step, excluding the uncertain parts of the image (determined in step 2),
7. Finding matches (pairs of VKP granules) of the SIFT keypoints between the images,
8. Sorting of the matches in non-decreasing order of the ratio of the distances to their two nearest neighbors (which orders the matches from best to worst),
9. If there are fewer than \(L\) matches, the algorithm stops and a notice about an insufficient number of matches is displayed,
10. If there are more than \(N\) matches, only the subset of the \(N\) best-rated matches is considered; otherwise all matches are taken into account,
11. Forming and evaluating all possible quadruple combinations of the \(N\) SIFT matches (determined in the previous step),
12. Sorting of the quadruples by their evaluation,
13. For the first \(M\) quadruples (best in terms of evaluation):
(a) determination of the transformation matrix on the basis of the quadruple of matches,
(b) if the computed transformation matrix is incorrect, moving to the next iteration of the loop,
(c) performing the transformation of the target image,
(d) detection of the edges of the transformed target image with the Canny technique,
(e) calculation of the gradient orientation for the points of the transformed target image,
(f) calculation of the number of overlapping edge points with similar orientation,
(g) if the evaluation of the analyzed transformation is better than the current maximum rating, saving of the transformation matrix and the keypoints from which it was determined (granule of the mosaic); the maximum rating becomes the evaluation of the analyzed transformation,
14. Final phase (performed when the maximum rating is greater than 0; otherwise the program reports that the mosaic cannot be created):
(a) normalization of the windows (transformation of the input image windows with their shift, and the creation of windows of an appropriate size for the transformed image and the resulting mosaic),
(b) if the window parameters are improper, the program stops and a message about the impossibility of completing the mosaic is displayed,
(c) creation of a new transformation matrix from the saved keypoints if the images were shifted while normalizing the windows,
(d) performing the transformation based on the best transformation matrix,
(e) completing the mosaic (final granule).
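The selection loop of step 13 can be sketched as follows. The solver and scoring function are passed in by the caller; `transform_is_plausible` and its `max_scale` threshold are illustrative assumptions, since the paper does not specify which matrix parameters are checked in step 13(b):

```python
import numpy as np

def transform_is_plausible(H, max_scale=4.0):
    """Cheap sanity check on a perspective matrix (step 13b): reject
    non-finite matrices and those whose linear part scales the image
    implausibly. The exact checks used by QKMPTT are not published,
    so this criterion is an assumption."""
    if not np.isfinite(H).all():
        return False
    scale = np.abs(np.linalg.det(H[:2, :2])) ** 0.5
    return 1.0 / max_scale < scale < max_scale

def select_best_transformation(quadruples, solve_h, score):
    """Steps 13(a)-(g): estimate H for each candidate quadruple, skip
    implausible matrices, and keep the best-scoring transformation."""
    best_H, best_score = None, 0
    for quad in quadruples:
        H = solve_h(quad)
        if not transform_is_plausible(H):
            continue  # step 13(b): move to the next iteration
        s = score(H)
        if s > best_score:
            best_H, best_score = H, s
    return best_H, best_score
```

Rejecting implausible matrices before warping and scoring is what makes the early exit in step 13(b) a cheap speed-up.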
3 Experiments and results
The created “Quadruple keypoints matching and perspective transformation testing” algorithm is compared with the GDBICP algorithm proposed by Yang et al. (2007) in terms of accuracy and running time. The two algorithms are completely different, although they are similar in the initial stage: the technique for determining the N best SIFT correspondences was borrowed from the GDBICP algorithm. GDBICP calculates an approximate transformation from a single SIFT correspondence and then increases the accuracy of the transformation estimate by matching points with another method while growing the region around the matched SIFT keypoints. In contrast, the created algorithm calculates and tests perspective transformations using quadruples of matched SIFT keypoints selected from the set of the N best matches.

The accuracy of the resulting mosaics was assessed on the following scale:

+++: perfect or near-perfect mosaic,

++: visible mosaic containing some transformation errors but quite correct,

+: mosaic containing large errors in the transformation, but approximately correct,

0: mosaic totally incorrect or missing output file.
The selected pairs of images come from different parts of the gastrointestinal tract. It is worth noticing that, in the case of capsule endoscopy, many images are such that even a human would not be able to match successive frames (not to mention algorithms working automatically). On the other hand, in some places of the gastrointestinal tract the capsule retracts or moves very slowly (so that successive images show no difference). Therefore, the opinion of medical experts was used; they helped point out seven interesting pairs of images which, on the one hand, differ significantly from each other and, on the other hand, are representative examples of pairs that can be matched automatically.
Results of experiments with the QKMPTT and GDBICP algorithms:

| Pair | QKMPTT accuracy | QKMPTT time (s) | GDBICP accuracy | GDBICP time (s) |
|------|-----------------|-----------------|-----------------|-----------------|
| 1    | +++             | 6               | +++             | 177             |
| 2    | 0               | –               | 0               | 166             |
| 3    | ++              | 4               | +++             | 232             |
| 4    | +++             | 5               | ++              | 289             |
| 5    | ++              | 3               | +               | 290             |
| 6    | ++              | 4               | 0               | 208             |
| 7    | +++             | 7               | +++             | 37              |
In conclusion, in the discussed series of experiments the designed “Quadruple keypoints matching and perspective transformation testing” algorithm works, in terms of accuracy, equally well as the GDBICP algorithm. Undoubtedly, the advantage of the designed algorithm is its speed: in all cases it operated faster than GDBICP, in the best case 96.7 times faster and in the worst case 5.3 times faster.
The execution time of both algorithms is given in seconds, not in order to determine their complexity, but in order to compare their speed. The GDBICP algorithm was available on the Internet only as an executable file, and on the basis of the publications about this algorithm it is not possible to determine its complexity precisely. The described QKMPTT algorithm uses a variety of image processing algorithms, for example the Canny edge detector (Canny 1986), whose complexity is not clearly described in the literature. Therefore, a formal computation of the complexity of the QKMPTT algorithm seems very difficult, or even impossible, here.
4 Conclusions
We have discussed methods for modeling compound granules used in algorithms that can successfully construct a mosaic from the images coming from an endoscope capsule. The research examined existing algorithms for the registration and mosaicing of images; a selection of algorithms was improved, and an algorithm was developed that is able to effectively construct a mosaic of images from the endoscope capsule. The presented algorithm is called the “Quadruple keypoints matching and perspective transformation testing (QKMPTT)” algorithm. An algorithm has also been developed to eliminate noise in capsule endoscopy images during image fusion. The final experimental studies showed that, for capsule endoscopy images, the developed algorithm is many times faster than the commercial GDBICP algorithm and, at the same time, comparably accurate. It should also be noted that GDBICP was the only algorithm found by the author (other than the one presented in this paper) that handled the mosaicing of these images. The running time of the developed algorithm gives hope for a real-time implementation through the use of existing hardware capabilities (e.g., parallelization of the algorithm on graphics processors with the CUDA technology). The parallelization of the algorithm would also open the possibility of adapting it to the registration of capsule endoscopy images and CT images. The aim of the work presented in this article was not to create a diagnostic tool, but only to perform the tests that could make it possible to create such a tool. It should be noted that the research in this area, and in particular the results obtained during it, are original. Expert judgment in this phase of the research was limited to determining whether the obtained theoretical results promise the possibility of practical applications.
Acknowledgments
This work was partially supported by the Polish National Science Centre Grant DEC-2013/09/B/ST6/01568 and by the Centre for Innovation and Transfer of Natural Sciences and Engineering Knowledge of the University of Rzeszów, Poland.
References
 Bay H, Ess A, Tuytelaars T, Van Gool L (2008) Speeded-up robust features (SURF). Elsevier, Amsterdam
 Canny J (1986) A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell PAMI-8:679–698
 Chen J, Tian J (2006) Rapid multimodality pre-registration based on SIFT descriptor. In: Proceedings of the 28th IEEE EMBS annual international conference, New York City, USA
 Cunha JPS, Coimbra M, Campos P, Soares JM (2008) Automated topographic segmentation and transit time estimation in endoscopic capsule exams. IEEE Trans Med Imaging 27:19–27
 Doherty P, Łukaszewicz W, Skowron A, Szałas A (2006) Knowledge engineering: a rough set approach. Springer, Heidelberg
 Goshtasby AA (2005) 2-D and 3-D image registration for medical, remote sensing and industrial applications. Wiley, New York
 Hammer med. Company, Polish distributor of capsule endoscopy. http://pillcam.hammer.pl/
 Harris C, Stephens M (1988) A combined corner and edge detector. The Plessey Company plc., UK
 Kanazawa Y, Kanatani K (2004) Image mosaicing by stratified matching. Image Vis Comput 22:93–103
 Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60:91–110
 Maciura Ł (2012) Mozaikowanie obrazów z kapsuły endoskopowej [Mosaicing of images from the endoscope capsule] (in Polish). Silesian University of Technology, Studia Informatica
 Palus H, Bereska D (1995) The comparison between transformations from RGB colour space to IHS colour space, used for object recognition. Image Process Appl
 Pedrycz W, Skowron A, Kreinovich V (eds) (2008) Handbook of granular computing. Wiley, Chichester
 Yang G (2007) Towards general-purpose image registration. Ph.D. thesis, Rensselaer Polytechnic Institute, Troy, New York, USA
 Yang G, Stewart CV, Sofka M, Tsai CL (2007) Registration of challenging image pairs: initialization, estimation, and decision. IEEE Trans Pattern Anal Mach Intell 29:1973–1989
 Yue W, YunDong W, Hui W (2008) Free image registration and mosaicing based on TIN and improved Szeliski algorithm. In: The international archives of the photogrammetry, remote sensing and spatial information sciences. ISPRS Congress, Beijing
 Zitova B, Flusser J (2003) Image registration methods: a survey. Department of Image Processing, Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
Copyright information
Open Access. This article is distributed under the terms of the Creative Commons Attribution License, which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.