1 Introduction

Endoscopic diagnosis and treatment of the digestive tract have become increasingly accepted. For example, in the diagnosis of early-stage gastric tumors, the size of the tumor is one of the most important factors in the choice of treatment. However, size is currently evaluated either by manipulation with forceps or by visual assessment alone, both of which are time consuming and prone to human error. For this reason, an easy-to-deploy, accurate tumor-size estimation technique is necessary for endoscopic diagnosis systems.

Recently, several papers have proposed 3D endoscope systems to measure the shapes and sizes of living tissue [1–9]. Among them, we have adopted active stereo for developing 3D endoscopic systems because of its stability, accuracy and cost effectiveness [5, 7]. These systems use micro-sized pattern projectors with the endoscope cameras, and have successfully reconstructed several ex vivo human tumor samples. To implement the micro-sized pattern projector, a micro chip holding the pattern and a lens are used to project a focused pattern image onto the target surface. One significant limitation of such systems is that a lens-based projection method can project a clear pattern only within a narrow depth range. This is because the pattern projector is based on a single lens, so off-focus blurring as well as aberrations in the periphery of the field of view inevitably occur. Another critical problem for an active stereo endoscope is the strong subsurface scattering effect, which is common for internal tissue. This not only blurs the projected pattern, making it more difficult to detect, but also diminishes its brightness. One more important problem is the sparse reconstruction of the system. Since the human body moves dynamically, we are required to scan the target object within a short period of time; such a system is also known as a oneshot scan [10–12]. Usually, the resolution of a oneshot scanning system is low because a certain area of the pattern is required to embed the projector's positional information. Consequently, the resolution of a oneshot active stereo system for endoscopy is low.

In this paper, we propose a three-part solution to the aforementioned problems. The first is the use of a special optical device called a Diffractive Optical Element (DOE). Since the DOE can project sharp patterns regardless of depth, it solves the narrow depth-of-field problem. Furthermore, the light efficiency of a DOE is usually more than 90 %, which helps to maximize the pattern detection accuracy by preserving pattern visibility. The second is a novel line-based grid pattern with gap coding. Since high-frequency information is easily lost in the presence of strong scattering effects, low-frequency patterns are generally more robust. However, a low-frequency pattern is not suitable for encoding rich information. We address this issue by intentionally adding gaps between adjacent lines in the grid pattern, which creates implicit, unambiguous higher-level label structures that can be easily detected under strong scattering effects. The third is a multiple-shape alignment algorithm for the grid-like shapes reconstructed using the line-based grid pattern. With this method, multiple sparse shapes of the grid patterns are effectively aligned using the grid information and merged into a finer shape.

By using our DOE micro pattern projector with a line-based grid pattern, we can achieve an efficient and accurate reconstruction of tissue in metric 3D with a wide depth of field using an ordinary endoscopic system. In the experiments, we show the effectiveness of our technique with several tests using a projector-camera system, and demonstrate the reconstruction of an ex vivo tumor sample imaged at several distances from the camera using the endoscopic system.

2 Related Work

For 3D reconstruction using endoscopes, techniques based on Shape from Shading (SfS) [13–16] have been proposed. However, SfS techniques often make stringent assumptions about the images that can be processed, such as known or uniform diffuse reflectance of the target surface; thus, precise size measurement is generally difficult. 3D endoscopes based on binocular stereo [17, 18] are also actively being researched at present. For the binocular stereo algorithm, which is a typical passive stereo technique, the correspondence problem is often very difficult, especially on textureless surfaces. Visual SLAM has also been applied to endoscope images [6], but the 3D reconstruction is only up to scale. Thus, it cannot be directly applied to measuring the real sizes of 3D tissues. As an example of active stereo in endoscopy, in the work of Grasa et al. [1], a single-line laser scanner attached to the scope head was used to measure tissue shape; however, the scope head had to be actuated in a direction parallel to the target, which limited the practical applicability of the technique. Other vision techniques apply special cameras to endoscopes, such as Shape from Polarization (SfP) [2], which uses an endoscope with a rotating polarizing filter on the light source, and ToF sensors [4]. For laparoscope systems, computer-guided surgery has been actively researched, such as by Penne et al. [8] and Kunert et al. [9]. However, the 3D system used in a laparoscope is not necessarily small and therefore cannot be used with an endoscope. Recently, Furukawa et al. proposed a structured light system for endoscopy [5, 7], which allows users to upgrade a common endoscope system without any reconfiguration. However, there are several problems with that system, and our technique provides practical solutions to them.

Structured-light-based 3D scanning systems have been studied for several decades, and these techniques are well summarized by Salvi et al. [19]. Based on that analysis, the techniques can be largely categorized into two methods: temporal encoding and spatial encoding. Multiple patterns are required for a temporal encoding method [20], whereas just a single static pattern is required for a spatial encoding method. Because of this difference, one of the most important advantages of a spatial encoding method is that it can capture a moving object or, in other words, the sensor can be moved during the scan. Another important benefit is its potential for compact implementation. Note that both advantages are relevant for endoscopic systems. Based on these advantages, spatial encoding techniques have been intensively studied [11, 21–23]. To increase their stability and accuracy, most techniques use color information; however, it is difficult to use multiple colors in endoscopic systems because of space limitations. Koninckx et al. proposed a single-color method using parallel lines [24], but it has several limitations in practical usage. Sagawa et al. proposed a single-color method using a wave-shaped pattern to encode additional information into the phase of the wave [25]. Kinect is another successful implementation, using a random dot pattern [12]. However, those patterns are not robust under strong subsurface scattering and are thus not suitable for endoscopic systems.

Our third contribution is based on a rigid registration algorithm. Rigid registration algorithms estimate the translation and rotation of an object from two point sets, with the ICP algorithm [26] and its extension to multiple point sets [27] being the two best-known approaches. Since then, improved techniques have been intensively researched for realtime registration [28], large-scale simultaneous registration [29, 30] and color-compensated registration [31]. However, since they all assume a large overlap of dense shapes, they generally cannot be used when the shape is sparse, such as a grid-based reconstruction [10, 11, 23]. Recently, an ICP for sparse point sets was proposed [32]; however, since the technique is still based on correspondences of closest points, lines in the same direction are inevitably pulled together, and thus all the grid-based shapes are bundled into a single grid-like shape. Banno et al. proposed a method to align multiple 3D curves reconstructed by the light-sectioning method into a single consistent shape [33]; however, they assumed a base shape with holes captured in advance, with the 3D curves aligned to it to fill the holes. Therefore, the technique cannot be applied to data consisting of independent curves only. Another approach to robust registration of multiple shapes is based on 3D features extracted from the input shapes [34–38]. However, stable 3D features are usually extracted only from dense 3D points and cannot be applied to grid-based shapes, whose points are sparse and unevenly distributed.

3 DOE Projector for Endoscopy

3.1 System Configuration

A projector-camera system is constructed by installing a micro pattern projector on a standard endoscope as shown in Fig. 1(a). For our system, we used a FujiFilm VP-4450HD system coupled with an EG-590WR scope. The DOE-based pattern projector is inserted into the endoscope through the instrument channel; it protrudes slightly from the endoscope head and emits structured light. The light source of the projector is a green laser module with a wavelength of 517 nm. The laser light is transmitted through a single-mode optical fiber to the head of the DOE projector. In the head, the light is collimated by a GRIN lens and passes through the DOE, which generates the pattern by diffraction of the laser light.

Fig. 1. System configuration: (a) system components, (b) DOE micro projector, (c) the projected pattern (top), and embedded codewords with S colored in red, L in blue, and R in green (bottom). S means the end-points on the left and right sides have the same height, L means the left side is higher, and R means the right side is higher. (Color figure online)

3.2 DOE Projector for Endoscopy

In previous work [7], a lens with a mask pattern was used for pattern projection. From our experience, such an optical system has a generally narrow depth of field, e.g., approximately an 8 mm depth for a working distance of 40 mm. Another problem with such optical systems is brightness efficiency, which is important since the light exposure of endoscope cameras is low. To solve both problems, we created a micro-pattern projector consisting of a DOE, a GRIN lens with a single-mode optical fiber, and a laser light source as shown in Fig. 1(b). The DOE can project a fine, complex pattern over a large depth range without requiring lenses, and the energy loss is less than 5 %. The actual specifications of the micro pattern projector are as follows. To lead the micro DOE projector into the head of the endoscope through the instrument channel, its dimensions needed to be \(2.8\,\mathrm{mm}\) in diameter and \(12\,\mathrm{mm}\) in length. The working distance, valid depth range, and projection area of the pattern projector are \(30\,\mathrm{mm}\), \(-10\,\mathrm{mm}\) to \(+40\,\mathrm{mm}\), and \(30\,\mathrm{mm}^2\), respectively.

3.3 Design of Projected Pattern

Avoidance of Subsurface Scattering. Since the reflectance conditions inside the body are very different from the ordinary environments for which active scanners are built, we tailored an original pattern design specifically for the intra-operative environment. One significant cause of degradation in the endoscopic environment is strong subsurface scattering at the surface of internal organs. In previous work [7], a grid pattern consisting of waved lines was used. We found that, under strong subsurface scattering, some of the important information in the waved grid pattern, such as the wave curvature, is difficult to extract or sometimes lost completely. To avoid losing important detailed information, we considered a pattern with a larger, low-frequency structure. Existing patterns of this kind include sparse dots or straight-line-based patterns with wide intervals. However, sparse dots are difficult to decode with wide baselines and large windows, because the pattern is heavily distorted under such conditions. On the other hand, a simple line-based pattern cannot encode distinctive information efficiently [10]. Instead, we propose a line-based pattern with large intervals and a new encoding technique that is robust against the scattering effect.

Our proposed pattern consists only of line segments, as shown in Fig. 1(c). The vertical lines of the pattern are all connected and straight, whereas the horizontal segments are designed to leave a small, variable vertical gap between adjacent horizontal segments at their intersections with the same vertical line. With this configuration, a higher-level ternary code emerges from the design with the following three codewords: S (the end-points of both sides have the same height), L (the end-point of the left side is higher), and R (the end-point of the right side is higher). In our actual implementation, we assign the S code to every other horizontal line, because it increases the robustness of the line detection process. The final codes of the pattern of Fig. 1(c) (top) are shown by color in Fig. 1(c) (bottom) and in graph representation in Fig. 3 (left).
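
As an illustration, classifying the gap code at a single intersection reduces to comparing the image-space heights of the two end-points against a small tolerance. The following is a minimal Python sketch; the function name and the tolerance value are our own illustrative choices, not part of the actual implementation.

```python
def gap_code(left_y: float, right_y: float, tol: float = 1.5) -> str:
    """Classify the vertical gap between the two horizontal end-points
    meeting at a vertical line into the ternary code S/L/R.
    left_y/right_y are image-space heights (pixels); tol is an assumed
    tolerance for 'same height'. Note that in image coordinates the
    y axis points down, so a smaller y means a higher end-point."""
    if abs(left_y - right_y) <= tol:
        return "S"                            # both sides at the same height
    return "L" if left_y < right_y else "R"   # L: left higher, R: right higher
```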

Eliminating the Singular Rotation Angle of the Pattern. As shown in Fig. 1(a), since the DOE pattern projector cannot be fixed to the head of the endoscope, the rotation angle of the pattern has some freedom, such as \(\pm 30^{\circ }\). If the rotation angle is near 0 degrees, the epipolar lines, which are drawn on the pattern image of Fig. 1(c), nearly coincide with the horizontal lines, and the number of candidate points on the pattern for an intersection in the captured image increases; such a condition increases ambiguity and results in low reconstruction accuracy. To mitigate the instability at this singular rotation angle, each set of horizontal line segments in the same column is slightly inclined by a specific angle, e.g., according to a piecewise long-wavelength sinusoid as shown in Fig. 1(c).

4 3D Reconstruction

4.1 Detection of Line Patterns

The source image is first geometrically corrected for fisheye lens distortion. Noise in the image is suppressed at the same time using Gaussian or median filters. Figure 2(a) shows an example of an input image. Then, the vertical lines in the captured image are detected, because vertical lines projected onto the objects remain connected if the object surface is smooth; recall that the vertical lines are straight, whereas the horizontal lines are small segments that are frequently disconnected. Figure 2(b) shows the detected vertical lines.
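
For concreteness, the preprocessing step could look as follows with OpenCV. This is only a sketch: the intrinsics K, the fisheye distortion coefficients D, the filter sizes, and the file name are placeholder assumptions rather than our calibrated values.

```python
import cv2
import numpy as np

# Hypothetical intrinsics and fisheye distortion coefficients from a
# prior endoscope calibration (values are placeholders).
K = np.array([[400.0, 0.0, 320.0],
              [0.0, 400.0, 240.0],
              [0.0, 0.0, 1.0]])
D = np.array([0.1, -0.05, 0.0, 0.0])  # k1..k4 of the fisheye model

img = cv2.imread("endoscope_frame.png", cv2.IMREAD_GRAYSCALE)

# Geometric correction of the fisheye distortion.
undistorted = cv2.fisheye.undistortImage(img, K, D, Knew=K)

# Noise suppression before line detection; either filter may be used.
denoised = cv2.medianBlur(undistorted, 3)
denoised = cv2.GaussianBlur(denoised, (5, 5), sigmaX=1.0)
```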

Fig. 2. Example of grid graph detection: (a) source image (a partial region), (b) identified vertical lines (violet dots) and candidate end-points (blue dots), (c) initial candidates of the horizontal edges, (d) identified horizontal edges (blue line segments), and (e) detected grid graph with gap codes (colors represent the same codes as in Fig. 1(c)). (Color figure online)

Next, the horizontal segments connecting the vertical lines are extracted. In this stage, the intensities on both sides of the detected vertical lines are traced and the peak values on each side are measured. The positions of these peaks are candidates for the end-points of the horizontal segments, shown in Fig. 2(b) as blue dots. From all line segments that connect the candidate end-points, those within a predefined range of lengths are then selected as initial candidates for the horizontal edges, shown in Fig. 2(c) as red lines. Then, to correct small positional errors of the initial edge candidates, every pixel on an edge is moved to the local peak position along the vertical direction from the original pixel. Finally, all distances between the corrected edge candidates are calculated, and if a pair of edge candidates lies too close together, the candidate with the smaller average intensity is removed. The final horizontal edges are shown in Fig. 2(d).
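
A sketch of the end-point candidate search is given below: it samples the intensity profile a few pixels to one side of a vertical line and takes its local peaks as candidate end-points. The fixed column offset (real detected lines are curves, straightened here for brevity), the intensity threshold, and the function name are illustrative assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def endpoint_candidates(img, vline_x, side_offset=3, min_intensity=40):
    """Return row indices of candidate horizontal-segment end-points on
    one side of a (simplified, straight) vertical line at column
    vline_x; offsets and thresholds are illustrative assumptions."""
    profile = img[:, vline_x + side_offset].astype(float)
    peaks, _ = find_peaks(profile, height=min_intensity)
    return peaks
```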

Fig. 3. Grid graph construction: (left) structure of the projected pattern, and (right) grouping of end-points of the identified lines.

4.2 Constructing Grid Graphs

From the identified line patterns, a grid-graph structure is constructed. The grid-graph structure of the proposed pattern is shown in Fig. 3 (left). Note that the vertical lines are identified first (e.g., \(v_1\) and \(v_2\) in the figure), and the horizontal edges are then extracted between them (e.g., \(e_1\) and \(e_2\)). Pairs of horizontal edges on both sides of a vertical line may then be classified as 'continuous' or 'non-continuous'. Here, 'continuous' horizontal edges are not only those that are geometrically continuous, such as \(e_3\) and \(e_4\); edges \(e_5\) and \(e_6\) are also considered continuous because their end-points are regarded as a single node \(n_1\). Thus, classifying the continuity of horizontal segments is an important process.

As shown in Fig. 3 (left), nodes carry the ternary S/L/R gap codes. For example, on the vertical line \(v_2\), nodes \(n_3\) and \(n_5\) have S codes, node \(n_2\) has an L code, and node \(n_4\) has an R code. Taking this property into account, we first label nodes formed from continuous horizontal edges with S, by selecting pairs whose end-points on consecutive horizontal segments lie within a small distance threshold (end-point groups shown by rectangles in Fig. 3 (right)). Then, the remaining edge pairs with a larger vertical distance between end-points are selected; if the horizontal lines above or below have continuous nodes, the pair is labeled as belonging to the same horizontal piecewise sinusoid and joined together as a single node (end-point groups shown by triangles in Fig. 3 (right)).

Each node is connected by vertical or horizontal edges to its up, down, left, or right adjacent nodes, such as \(n_6\). Some horizontal edges might be missing because of mis-identification. In this case, the node will have only either a left or a right edge, which may be matched later by looking at other connectivity in the grid graph. Figure 2(e) shows an example of the identified vertical and horizontal patterns with estimated gap codes.

Fig. 4. Matching the detected grid graph and the projected pattern using LSGPs.

4.3 Finding Correspondences Using Sub-graph Patterns

Let the detected grid-graph be G, and let the grid-graph of the pattern in Fig. 1(c) be P. Note that graph G may lack some edges, or have undesired false edges, missing labels, or false S/L/R labels, as shown in Fig. 4. To match G to P while allowing for such topological errors, we exploit the notion of local sub-graph patterns (LSGPs). We define an LSGP as a sub-graph of a grid-graph that can be used as a template for matching common local topologies of G to P, as shown in Fig. 4. Given a dictionary of LSGPs, G may be matched to P robustly against missing or false edges. By providing multiple LSGPs and trying to match G to P using each of them, flexible matching can be realized. In our implementation, an LSGP is represented by a path that traces all of its edges. To merge the matching results of all LSGPs, voting is used.

The matching algorithm is as follows. From a node n of G, the path of an LSGP is traced, checking for missing edges; the traced sub-graph is denoted \(G_0\). Then, a corresponding sub-graph \(P_0\) of P is searched for, under the conditions that the topology of \(P_0\) matches that of \(G_0\), that all nodes of \(P_0\) fulfill the epipolar constraints with the corresponding nodes of \(G_0\), and that the S/L/R labels of the nodes of \(P_0\) and the corresponding nodes of \(G_0\) match to at least some pre-defined agreement ratio. Since our proposed pattern structure yields a low number of candidate nodes fulfilling the epipolar constraints (at most 10, depending on the point), the dictionary search can be performed efficiently and with a low degree of ambiguity.

If a \(P_0\) satisfying the above conditions is found, all the nodes of \(P_0\) receive votes as candidate matches for the nodes of \(G_0\). The above process is repeated for all nodes of G with all the pre-defined LSGPs. After the iterations finish, each node of G is checked against predefined thresholds on the minimum number and minimum percentage of votes. If the thresholds are fulfilled, the node is matched with the corresponding node of P that received the maximum votes.
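
The final vote-thresholding step can be sketched as below, assuming the votes have already been accumulated over all LSGPs as described. The threshold values and the data layout are hypothetical.

```python
def select_matches(votes, min_votes=3, min_ratio=0.6):
    """Sketch of the vote-thresholding step. `votes` maps each detected
    node to a dict {pattern node: vote count}; thresholds on the
    minimum count and minimum vote share are illustrative."""
    matches = {}
    for g_node, tally in votes.items():
        p_best, v = max(tally.items(), key=lambda kv: kv[1])
        if v >= min_votes and v / sum(tally.values()) >= min_ratio:
            matches[g_node] = p_best
    return matches

# Example: node 'g7' is matched, 'g8' fails the minimum-vote threshold.
votes = {"g7": {"p12": 5, "p30": 1}, "g8": {"p13": 1}}
print(select_matches(votes))   # {'g7': 'p12'}
```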

Once the correspondence of the captured image to the pattern is obtained, the points on the vertical and horizontal lines are reconstructed in 3D using a light-sectioning method.
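
Light sectioning itself amounts to intersecting the camera viewing ray of a pixel with the calibrated light plane of the matched pattern line. A minimal sketch, assuming the plane parameters come from projector calibration:

```python
import numpy as np

def light_section(pixel, K, plane):
    """Triangulate one point by the light-sectioning principle (sketch).
    pixel : (u, v) of a point on an identified pattern line.
    K     : 3x3 camera intrinsics (assumed calibrated).
    plane : (n, d) of the corresponding light plane in camera
            coordinates, satisfying n.x + d = 0."""
    n, d = plane
    ray = np.linalg.inv(K) @ np.array([pixel[0], pixel[1], 1.0])
    t = -d / (n @ ray)   # intersection parameter along the viewing ray
    return t * ray       # 3D point in camera coordinates
```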

4.4 Taking Consensus of Vertical and Horizontal Line Positions

In a real system, the calibration of the camera and the projector includes errors. In the presence of calibration errors, the vertical and horizontal lines reconstructed by the light-sectioning method generally do not intersect in 3D space (i.e., they become skew lines). These inconsistencies between vertical and horizontal lines are undesirable for obtaining a consistent shape of the target surface.

The direct cause of an inconsistency between the vertical and horizontal lines at an intersection is the displacement of the intersection point from the corresponding epipolar line. Thus, to solve this problem, we propose deforming the local segments of both detected lines around the intersection in 2D image space so that the intersection moves strictly onto the epipolar line. This process can be done for each intersection after the identified grid-graph has been mapped to the corresponding projected grid pattern. Figure 5 illustrates the approach.

This deformation of the lines is done locally, so that the correction of one grid-graph node does not interfere with adjacent grid-graph nodes. A further policy of the deformation is to move the points of the lines only in the direction perpendicular to the epipolar line. The deformation is realized by first calculating, at each intersection, the displacement vector perpendicular to the epipolar line that moves the intersection onto it, and then shifting each point on the lines by the weighted mean of the displacement vectors at the two adjacent nodes along both line directions. The weights are determined by the distances from the two nodes.
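
The two operations, projecting an intersection perpendicularly onto its epipolar line and distributing the displacement to nearby line points, can be sketched as follows; the function names and the normalized-position parameterization are our own illustration.

```python
import numpy as np

def project_to_epipolar(p, line):
    """Move a 2D intersection point p onto the epipolar line
    line = (a, b, c), with ax + by + c = 0, along the line normal,
    i.e., perpendicular to the epipolar line (a sketch)."""
    a, b, c = line
    n = np.array([a, b])
    return p - (n @ p + c) / (n @ n) * n

def interpolate_shift(s, d_prev, d_next):
    """Shift an on-line point by the distance-weighted mean of the
    displacement vectors of its two adjacent corrected nodes;
    s in [0, 1] is the normalized position between the nodes."""
    return (1.0 - s) * d_prev + s * d_next
```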

Fig. 5. Correction of the positions of grid nodes.

Fig. 6. Process of grid ICP.

4.5 Registration of Reconstructed Grid Patterns

Once the correspondence from the captured image to the pattern is obtained, the points on the vertical and horizontal lines are reconstructed as 3D curves using a light-sectioning method. Since the intervals between parallel lines are wide enough to avoid mis-detection caused by the subsurface scattering effect, the shapes can be only coarsely reconstructed. To increase the density of the sparse grid-shaped 3D points, one solution is to capture the object multiple times while moving the sensor, and then align and integrate the results.

The ICP algorithm is the most widely used solution for shape alignment between 3D shapes of a static object. The algorithm consists of two steps: (1) searching for the closest point \(q_i\) of the scene object from each point \(p_i\) belonging to the target object, and (2) estimating a rigid transformation R, t by minimizing \(\sum_{i} \Vert p_i - (R\,q_i + t)\Vert^2\). The final parameters R, t are obtained by iterating the two steps until convergence.
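
For reference, this two-step loop can be sketched compactly using the standard SVD (Kabsch) solution for step (2); the KD-tree correspondence search, array shapes, and fixed iteration count (no convergence test, for brevity) are illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(P, Q):
    """Closed-form R, t minimizing sum_i ||p_i - (R q_i + t)||^2
    (Kabsch/SVD solution; P and Q are corresponding N x 3 arrays)."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (Q - cq).T @ (P - cp)                 # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    S = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ S @ U.T                        # reflection-safe rotation
    return R, cp - R @ cq

def naive_icp(target, scene, iters=30):
    """Two-step ICP sketch: (1) closest-point search, (2) rigid re-fit."""
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    for _ in range(iters):
        moved = scene @ R.T + t
        _, idx = tree.query(moved)            # step (1)
        R, t = rigid_fit(target[idx], scene)  # step (2)
    return R, t
```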

However, such a naive ICP algorithm does not work properly on sparse grid shapes, because the closest points to the vertical/horizontal lines of the scene object are usually found on a line in the same direction in the target shape; such incorrect correspondences are pulled together during minimization, producing an incorrect shape. Notably, if multiple shapes are captured with small translational motions, the grid lines tend to be bundled together.

In this paper, we propose a new ICP algorithm to solve this problem. Figure 6 shows the process of our algorithm. We first divide the grid shape into two sets of lines depending on the line direction, i.e., the vertical set and the horizontal set. Then, the closest point \(q^v_i\) in the vertical line set is searched for from each point \(p^h_i\) in the horizontal line set. Similarly, the closest point \(q^h_j\) in the horizontal line set is found from each point \(p^v_j\) in the vertical line set. Finally, the rigid transformation parameters R, t are estimated by minimizing \(\sum_{i} \Vert p^h_i - (R\,q^v_i + t)\Vert^2 + \sum_{j} \Vert p^v_j - (R\,q^h_j + t)\Vert^2\). The final result is obtained by iterating these steps until convergence. With this scheme, the grid lines of the final shape are evenly distributed, realizing a dense reconstruction of the object surface.
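
A sketch of the modified correspondence search is given below; it reuses `rigid_fit` from the previous sketch, and the variable names and fixed iteration count are illustrative. The key difference from the naive loop is that closest points are searched only across line directions.

```python
import numpy as np
from scipy.spatial import cKDTree

def grid_icp(tgt_v, tgt_h, src_v, src_h, iters=30):
    """Grid ICP sketch: correspondences are searched only *across*
    line directions, so parallel grid lines cannot be pulled onto
    each other. tgt_v/tgt_h: vertical/horizontal line points (N x 3)
    of one shape; src_v/src_h: the same for the shape being aligned."""
    tree_v, tree_h = cKDTree(tgt_v), cKDTree(tgt_h)
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        sv, sh = src_v @ R.T + t, src_h @ R.T + t
        _, iv = tree_v.query(sh)   # horizontal points -> vertical lines
        _, ih = tree_h.query(sv)   # vertical points -> horizontal lines
        P = np.vstack([tgt_v[iv], tgt_h[ih]])
        Q = np.vstack([src_h, src_v])
        R, t = rigid_fit(P, Q)     # minimizes the two-term cost above
    return R, t
```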

Fig. 7. Captured images: (a) the projected patterns (top: proposed, bottom: wave pattern), (b) the measurement scene, (c) the scene with the wave pattern projected, and (d) the scene with the proposed pattern projected.

5 Experiments

5.1 3D Reconstruction Based on Gap-Coded Grid Pattern

To confirm the effectiveness of our gap-coded grid pattern, which is designed to be robust for objects with strong subsurface scattering and complicated textures, we prepared a scene with various materials. In addition, to compare our technique with existing state-of-the-art techniques, we reconstructed the same scene using Kinect 1 and the wave pattern [7, 39], which are also single-color, oneshot scanning techniques. The projected patterns are shown in Fig. 7(a). For the ground truth, we also captured the same scene with a time-coded technique, i.e., Gray code [20]. We used a video projector and a CCD camera for this experiment; the actual captured scene is shown in Fig. 7(b)-(d). Figure 8 shows the reconstructed shapes, and the results are summarized in Fig. 9. For the evaluation, we segmented the scene by object, as shown in different colors in Fig. 8, and then calculated the error for each segment. From the results, we confirm that our technique reconstructed the shapes with higher density and accuracy than the previous techniques, especially on the object with strong subsurface scattering (sponge) and the object with complicated texture (camel figurine).
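
The per-segment error metric can be sketched as below, assuming the reconstructed and ground-truth depths have already been aligned per point and each point carries an object label. This is our illustration of the evaluation, not the exact script used.

```python
import numpy as np

def rmse_per_segment(recon, gt, labels):
    """Per-object RMSE (mm) between reconstructed and ground-truth
    depths; `labels` assigns each point to an object segment as in
    Fig. 8 (arrays are assumed to be aligned element-wise)."""
    return {s: float(np.sqrt(np.mean((recon[labels == s] - gt[labels == s]) ** 2)))
            for s in np.unique(labels)}
```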

Fig. 8. Reconstruction results. (Color figure online)

Fig. 9. RMSE (mm) of the wave pattern and the proposed method.

5.2 Evaluation of Grid-Based ICP

Next, we evaluated the grid-based ICP technique using 15 frames as input, captured with slight movements of the device. We also used a common ICP for comparison. Figure 10(a) shows an example of a reconstructed shape from the first frame. Figure 10(b) shows the 3D shape reconstructed with a time-encoded technique. The registration results with the grid-based ICP and a common ICP are shown in Fig. 10(c) and (d), and all the results are summarized in Fig. 11. In the figure, the number of points is calculated by projecting all the points onto the image plane of the camera and counting their pixels. From the figure and the graph, we confirm that the 3D points integrated with our grid-based ICP are more evenly distributed than those with the general ICP. The RMSE is also improved by 22 % compared to the common ICP.

Fig. 10. Registration results.

Fig. 11. Registration comparison results.

Fig. 12. Registration of shapes measured by the 3D endoscope: (top row) input images, (middle row) reconstructed 3D points, and (bottom row) registration results.

5.3 3D Reconstruction Using Endoscope Images

We first evaluated the proposed 3D endoscopic system by measuring a 3D object with a known shape. The target was printed by a 3D printer from the 3D shape model of the Stanford bunny; we used this object because its ground-truth data is available. It was first scanned with the 3D endoscope by moving the object, as shown in Fig. 12 (top row), and each frame was reconstructed as shown in Fig. 12 (middle row). Finally, the reconstructed frames were registered with both a general ICP algorithm and our grid ICP algorithm, and compared with the ground-truth shape data. The registration results fitted to the ground-truth shape are shown in Fig. 12 (bottom row). The results show that, with the general ICP, the grid lines of multiple frames were pulled together and aggregated, whereas with our grid ICP, the grid lines were uniformly distributed. The RMSEs between the registered 3D shapes and the ground truth were 1.54 mm for the general ICP and 1.20 mm for our grid ICP.

Fig. 13. 3D reconstruction of real tissue from a human stomach measured at distances of about 25 mm (top row) and about 15 mm (bottom row). (From left to right) The appearance of the sample, the captured image, the identified grid graph with gap codes, and the reconstructed shapes (front and side views).

Fig. 14. 3D reconstruction of (top row) the inside of a human mouth (palate), and (bottom row) a piece of cattle intestine.

To evaluate the system under realistic conditions, a biological specimen extracted from a human stomach during an endoscopy operation was measured, as shown in Fig. 13. We captured the tissue from two distances, about 25 mm and 15 mm. Note that, since the projector is based on a DOE, the projected patterns in both images are sharp despite the different distances. Using our proposed algorithm, both images were almost fully reconstructed, except for regions affected by bright specular highlights; avoiding the effects of these specular highlights is left for future work. Apart from those regions, the correspondences between the detected grid points and the projected pattern were estimated accurately, and the shape of the specimen was successfully reconstructed.

We also measured the inside of a human mouth (palate) and a piece of cattle intestine with the endoscopic system, and the results were registered with the proposed grid ICP, as shown in Fig. 14. From the figures, we confirm that the shapes are densely reconstructed with our gap-coded DOE pattern projector and grid-based ICP technique.

6 Conclusion

We proposed a 3D endoscopic system based on active stereo, in which the pattern projector consists of a DOE that generates a special line-based grid pattern. By using a DOE, which is free from blurring effects, sharp patterns are projected over a wide depth range while keeping a strong intensity (usually more than 90 % light efficiency). In addition, by using a line-based grid pattern, patterns severely blurred by the subsurface scattering effect can still be robustly detected and decoded. We also proposed a new reconstruction algorithm for the gap coding embedded in the pattern. Since shapes reconstructed from a grid pattern are usually sparse, an ICP algorithm specialized for grid patterns was also proposed. The potential of the technique was verified by intensive experiments using projector-camera systems, and demonstrated by reconstructing the shapes of biological tissues, such as the surface of a human palate and a specimen extracted from the stomach of a human subject, at various distances from the endoscope. Our future work is to test the system in real diagnoses.