1 Introduction

In the Cultural Heritage (CH) field, 3D reconstructed models provide important support for procedures such as historical reconstruction, analysis of artifacts and architecture, documentation of archaeological sites, preservation, and museum marketing [1, 10]. The 3D models of cultural heritage objects are generated using dedicated graphics software, image-based methods, or specialized sensors [2].

Many 3D digital models of cultural heritage objects such as buildings, statues and historical places are developed using image-based methods, active sensors like laser scanners, total stations, or a combination of them [2, 11, 13]. Although laser scanners are more accurate than image-based systems, the latter are preferred on the strength of their cost, ease of use and time savings. As mentioned in [11], the disadvantages of laser scanning include the many blind spots it generates and the great deal of time required to obtain the data.

The possibility of reconstructing reliable 3D models using only low-cost consumer devices is a great opportunity. Smartphones are widely used and give access to a variety of applications for accomplishing a large number of tasks. Owing to improved camera features and the use of multiple cameras, smartphones have become a common device for image acquisition.

To perform 3D reconstruction, the user can use a smartphone equipped with a depth sensor [4], attach an external depth sensor to the smartphone, or simply take photos of the target, upload them to a cloud server, and run a web-based application to create the 3D reconstruction.

To obtain high accuracy for the 3D reconstructed object, a selection of the most representative frames is uploaded to the cloud, where the frames are processed and the 3D model is obtained [15]. For volumetric 3D models of realistic scenes, such as whole buildings, many agents using augmented reality smartphones collaborate through online pose optimization [9]. In [7], multiple users with smartphones collaborate to provide the best frames for a 3D reconstruction pipeline; the frames are uploaded to a cloud-based server and processed through Structure from Motion (SfM) and Dense Image Matching (DIM) procedures.

Another method of reconstructing large outdoor scenes is to compute depth maps at interactive frame rates on the GPU of a Google Project Tango Tablet 4 through motion stereo [14]. A comparison between the Tango tablet and the ZED camera in 3D reconstruction is presented in [16]. The Tango tablet offers better accuracy of the generated point clouds for outdoor environments and a lower average error. The ZED camera performs better indoors because it has to be connected to a computer while scanning.

Because the lack of an appropriate light source can be a disadvantage for 3D reconstruction using smartphones, [6] presented a 3D structured light scanning approach that combines 3D reconstruction and registration on a Lenovo Yoga Tab 3 Pro to obtain a 3D point cloud model.

Because of unfavorable environmental conditions or object materials (e.g. poor lighting, shiny surfaces, homogeneous textures), some authors have proposed improved approaches for point cloud generation [14], but there is still room for improving the algorithms and the technology used in order to overcome some limitations, such as transparent surfaces, which cannot be detected by the devices.

It should be noted that the point cloud’s resolution depends on the user’s needs and the application in CH: the reconstructed 3D models can be used for documentation and analysis, for creating digital archives, for providing digital replicas for exhibitions and so on. Nevertheless, the quality of the resulting model should be as good as possible in all cases.

The aim of this research study is to investigate the accuracy of smartphones that incorporate 3D depth sensors for 3D reconstruction of cultural heritage artifacts, compared with a photogrammetry-based approach. Using these two techniques, two 3D measurements were carried out, and the goal was to analyze whether the first method, based on Google’s Tango technology, is suitable for reproducing the geometrical volumes of objects in the CH preservation sector, using photogrammetry data as reference.

2 Android Applications (Apps) for 3D Reconstruction

The following is a brief overview of the 3D reconstruction Android apps available on the Google Play store, with examples of their output. The most significant applications are presented, but the overview is not exhaustive: many other applications can be used for 3D scanning of objects.

2.1 Tango® Constructor

Tango Constructor [8, 19] allows users to scan their surroundings and visualize the reconstructed 3D textured mesh models (Fig. 1) directly on the mobile device. The processing of the 3D mesh is performed on the smartphone. The application exports its output in the OBJ format.

Fig. 1. Provence House 3D reconstructed using Tango Constructor [20]

2.2 Open Constructor for Tango

Open Constructor [21, 22] allows real-time 3D reconstruction with textured models (Fig. 2) using a Tango-enabled mobile phone. The 3D scanned models are exported in the OBJ format and can be visualized using external applications. It also supports uploading models to Sketchfab.

Fig. 2. 3D model reconstructed using Open Constructor for Tango [23]

2.3 RTAB-Map

RTAB-Map (Real-Time Appearance-Based Mapping) [17] allows online 3D scanning/mapping of the environment (Fig. 3) based on a multi-session incremental appearance-based loop closure detector [12]. RTAB-Map can export the 3D reconstructed model in PLY or OBJ format (with textures up to 720p).

Fig. 3. 3D model reconstructed using RTAB-Map [18]

2.4 Matterport Scenes

Matterport Scenes [24] is another free 3D scanning application for Tango-enabled devices. Its useful features include the ability to trim a 3D scan and crop the object of interest, to make real-time measurements, to take photos while scanning, and to share 3D scans (Fig. 4).

Fig. 4. 3D model reconstructed using Matterport Scenes [25]

2.5 Scandy Pro

Scandy Pro [26] captures 3D meshes with a Tango device, producing high-resolution and accurate scans of objects by combining Tango scanning technology with its own 3D scanning algorithms. It is able to provide a maximum resolution of 1 mm and to render a 3D mesh on-device within seconds [27] (Fig. 5).

Fig. 5. 3D model reconstructed using Scandy Pro [28]

3 Materials and Data Collection

With the advent of image-based 3D data capture methodologies, new scenarios are opening up for the construction of true 3D models of artifacts, statues, buildings or other objects belonging to cultural heritage. The input images were acquired in different experimental configurations, with different cameras, and the proposed procedure aims to compare the 3D models obtained from the two kinds of input data (Tango and photogrammetry).

Project Tango is a platform developed by Google’s Advanced Technology and Projects (ATAP) group that uses computer vision to give devices like smartphones and tablets the ability to understand their position relative to the environment. The software works by integrating three types of functionality: motion tracking, area learning, and depth perception [29]. Fig. 6 presents the diagram of the Tango offline 3D reconstruction process, with the techniques used at each stage.

Fig. 6. Tango 3D reconstruction process [29]

For depth perception, Tango devices use three technologies: structured light, time-of-flight and stereo. The first two require an infrared emitter and sensor: structured light estimates depth from the distortion of a projected infrared pattern, whereas time-of-flight estimates the distance to the surrounding objects from the time difference between the emission of the infrared wave and its return to the sensor after reflection. The stereo technique takes pictures with two cameras and estimates depth from the disparity between the two views, much as human eyes do. Project Tango’s developers provide APIs that return point clouds from the depth sensor in the form of xyz coordinates.
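The time-of-flight and stereo principles described above reduce to two short formulas. The following is a minimal sketch of those textbook relations, not of the Tango API: the function names and the numeric inputs are hypothetical, and the stereo formula assumes a rectified camera pair.

```python
# Illustrative depth formulas; function names and inputs are hypothetical examples.

SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum


def tof_distance(round_trip_time_s: float) -> float:
    """Time-of-flight: the wave travels to the surface and back, hence the division by 2."""
    return SPEED_OF_LIGHT_M_S * round_trip_time_s / 2.0


def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Rectified stereo pair: depth Z = f * B / d (f and d in pixels, B in metres)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    print(tof_distance(10e-9))              # a 10 ns round trip is roughly 1.5 m
    print(stereo_depth(1000.0, 0.1, 50.0))  # made-up camera parameters
```

Both relations show why depth error grows with range: a fixed timing or disparity error translates into a larger distance error for far surfaces.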

A statue of Horea from the statuary group representing Horea, Cloşca and Crişan, located on Griviţei Boulevard, in front of Building V of Transylvania University of Brașov, was selected as the subject for analysis (Fig. 7). The statuary group, which represents three personalities of Romanian national history, was subjected this year to a process of cleaning and restoration of its mosaic tiles. These personalities were the leaders of the 1784 Romanian Peasants’ Revolt of Transylvania. They were executed by the Hungarian authorities by breaking on the wheel in 1785.

Fig. 7. The statuary group “Horia, Cloșca și Crișan” (a) and the bust of Horia (b)

For the photogrammetry method, the images were collected using a Samsung Galaxy S7 smartphone, which employs an ISOCELL S5K2L1 sensor (1.4 µm pixel size) with a maximum resolution of 12 effective megapixels (4032 × 3024). To reduce the errors of the 3D model, the smartphone camera was placed close to the object (about 1.5 m). The images were captured at different angles with respect to the statue surface. The smartphone camera was rotated horizontally and vertically in steps of 10 degrees, maintaining a 70–80% overlap between consecutive photos. With this overlap, 173 images were enough to obtain a good 3D model of the statue.

The image sets were imported into Agisoft PhotoScan [30], a commercial tool for processing digital images and generating 3D spatial data, to carry out the 3D modeling process, from the sparse point cloud to the final 3D textured model (Fig. 8a).

Fig. 8. Point cloud obtained by photogrammetry (a) and using Tango Constructor (b)

The Tango model (Fig. 8b) was obtained using a Lenovo Phab2, a large Android smartphone with several cameras and infrared sensors that supports Google’s Tango software. It has a 16 MP rear camera, a depth sensor, a motion tracking camera and a 6.4” QHD display. The phablet not only tracks motion but also perceives depth, thanks to an IR emitter and other sensors, and remembers the space around it. Other features include an accelerometer, a gyroscope, an ambient light sensor, a compass, a Qualcomm Snapdragon 652 processor optimized for Tango, and an 8 MP front camera. The device can even be used to measure distances in the environment.

The image sets were acquired in different sessions, keeping the same procedural steps in order to obtain comparable results for the two methods. The models were first imported into MeshLab [31], a free and open-source advanced 3D mesh processing system, to improve the quality of the meshes.

4 Results

The 3D point clouds generated with the two recording techniques were analyzed for variations in their quality, to determine which method is more appropriate for generating 3D models easily and quickly.

In a first stage, the .obj files obtained from Tango Constructor and from PhotoScan were imported into CloudCompare and converted to .pts files. When comparing two models obtained with different methods, the first problem that arises is the difference in resolution between the two corresponding point clouds. The RGB model acquired with the Samsung camera has about 400 times more points in the dense cloud than the one obtained with the Tango camera (5530620 and 13855 points, respectively). Before the comparison, each mesh was sampled at 3000000 points.
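Sampling both meshes to a similar number of points puts the two models on a comparable footing despite the roughly 400-fold difference in native point count. The sketch below illustrates one common way to do this, area-weighted uniform sampling of a triangle mesh. It is an assumption about the general technique, not CloudCompare’s implementation, and the function names are hypothetical.

```python
import math
import random


def triangle_area(a, b, c):
    """Area of a 3D triangle: half the norm of the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)


def sample_mesh(vertices, faces, n_points, seed=0):
    """Draw n_points uniformly over the surface of a triangle mesh.

    Triangles are picked with probability proportional to their area, then a
    point is drawn uniformly inside the chosen triangle.
    """
    rng = random.Random(seed)
    tris = [(vertices[i], vertices[j], vertices[k]) for i, j, k in faces]
    areas = [triangle_area(*t) for t in tris]
    chosen = rng.choices(tris, weights=areas, k=n_points)
    points = []
    for a, b, c in chosen:
        r1, r2 = rng.random(), rng.random()
        if r1 + r2 > 1.0:  # reflect the sample back into the triangle
            r1, r2 = 1.0 - r1, 1.0 - r2
        points.append(tuple(a[i] + r1 * (b[i] - a[i]) + r2 * (c[i] - a[i]) for i in range(3)))
    return points


if __name__ == "__main__":
    # Unit square in the z = 0 plane, split into two triangles.
    square = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
    print(sample_mesh(square, [(0, 1, 2), (0, 2, 3)], 3))
```

Weighting by area keeps the point density uniform over the surface, so large and small triangles contribute to the later distance statistics in proportion to the surface they cover.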

Before comparing the 3D models, the meshes must be carefully aligned (Fig. 9a). For this purpose, we used the open-source CloudCompare software [32], an advanced 3D data processing tool for quickly detecting changes and comparing 3D point cloud data. The point-pair-based alignment tool was first used for rough alignment; the automatic registration method was then used to finely register the two datasets with the Iterative Closest Point (ICP) algorithm. The quality of the obtained alignment depends on choosing good pairs of corresponding points in the two datasets. The alignment parameters were optimized beforehand by minimizing the ICP alignment error for two identical point clouds.
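The fine-registration step can be illustrated with a toy version of ICP. The sketch below is deliberately simplified and is not CloudCompare’s implementation: it works in 2D (where the best rotation has a closed form), uses brute-force nearest-neighbor matching, and assumes the clouds are already roughly aligned. All function names are hypothetical.

```python
import math


def transform(p, theta, tx, ty):
    """Rotate p by theta (radians) about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * p[0] - s * p[1] + tx, s * p[0] + c * p[1] + ty)


def nearest(p, cloud):
    """Brute-force nearest neighbour of p in cloud (acceptable for toy sizes only)."""
    return min(cloud, key=lambda q: math.dist(p, q))


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    s_cos = s_sin = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay, bx, by = px - sx, py - sy, qx - dx, qy - dy
        s_cos += ax * bx + ay * by  # dot products of centred pairs
        s_sin += ax * by - ay * bx  # 2D cross products of centred pairs
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def icp_2d(moving, reference, iterations=20):
    """Alternate nearest-neighbour matching and rigid fitting; return cloud and RMS error."""
    current = list(moving)
    for _ in range(iterations):
        matches = [nearest(p, reference) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = [transform(p, theta, tx, ty) for p in current]
    rms = math.sqrt(sum(math.dist(p, nearest(p, reference)) ** 2 for p in current) / len(current))
    return current, rms


if __name__ == "__main__":
    reference = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
    moving = [transform(p, 0.05, 0.05, -0.03) for p in reference]
    _, rms = icp_2d(moving, reference)
    print(rms)
```

Because each iteration trusts the current nearest-neighbor matches, ICP only converges to the right pose when the starting pose is already close, which is why a rough point-pair alignment precedes it.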

Fig. 9. Aligned point clouds (a) and absolute distances between the photogrammetry and Tango models (b)

After registering the two 3D clouds, a quantitative and qualitative evaluation of their difference was performed by calculating the Hausdorff distance and the Cloud-to-Cloud (C2C) distance between the two models. The C2C distance computation is a nearest-neighbor distance: for each point of the compared cloud, the software looks for the nearest point in the reference cloud and computes their distance.
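The nearest-neighbor definition of the C2C distance can be written down directly. The following is a minimal brute-force sketch with hypothetical function names and made-up points. Its cost grows with the product of the two cloud sizes, so for clouds of millions of points a spatial index such as an octree or k-d tree would be needed instead.

```python
import math
import statistics


def c2c_distances(compared, reference):
    """For each point of `compared`, the distance to its nearest point in `reference`."""
    return [min(math.dist(p, q) for q in reference) for p in compared]


def summarize(distances, threshold):
    """Mean, standard deviation, and share of distances strictly below `threshold`."""
    below = sum(1 for d in distances if d < threshold)
    return {
        "mean": statistics.fmean(distances),
        "std": statistics.pstdev(distances),
        "share_below": below / len(distances),
    }


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    compared = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
    print(summarize(c2c_distances(compared, reference), threshold=1.5))
```

Note that the measure is asymmetric: swapping the compared and reference clouds generally gives different distances, so the choice of reference should be stated with the results.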

Fig. 10 presents the histogram of absolute distances between vertices of the two point clouds. We compared the Tango and photogrammetry clouds, keeping the photogrammetry cloud as reference. The average difference between the models is around 0.07 cm (Fig. 9b), with a standard deviation of 0.07. The scalar field shows distances between points ranging from 0 to 4.5 mm. The number of points for each model was 2999976, of which more than 85% show a difference smaller than 1.5 mm.

Fig. 10. Histogram of the absolute distance

The models are very similar, with the larger deviations on the edges of the model, where no perfect match was achieved. The model produced by photogrammetry is of higher quality than the Tango-based one, and this can easily be seen by visual comparison of the two models in terms of geometrical detail and lack of noise. This is due to the surface properties of the statue: there are no large changes in color (the surface is almost white), and reflections or shaded areas occur in some places. With more distinctive surface features, the accuracy would be better.

On the other hand, comparing the two models, it is clear that the one obtained with the depth sensor device does not show the outline of the statue’s details as well as the photogrammetry-based one. Details are more prominent with the second technique, while the Tango mesh has a smoother texture that makes some parts of the model more difficult to distinguish (for instance, the ornament carved on the statue’s base, Fig. 11a, b). Thus, the Tango-based model does not render depth in as much detail as the photogrammetry-based one: the details can be distinguished, but they are less well shaped. This is due to the considerably lower number of points captured with the Tango device. As the manufacturers mention, the device can construct a “rough” 3D model in real time, and post-processing tools can then be applied to obtain a realistic 3D model of an object or a space.

Fig. 11. Photogrammetry reconstructed 3D model (a), Tango reconstructed 3D model (b), and Hausdorff distance between the photogrammetry and Tango models, computed for the Tango-based model (c)

The one-sided Hausdorff distance was also computed in order to analyze the differences and similarities between the reconstructed 3D models [3]. The geometric difference between the 3D models was obtained by calculating the distance between each point of one model and the other model, and then expressing it in millimeters. The smaller the distance between two corresponding points, the more similar the model surfaces are at that point. To compute the Hausdorff distance, we used the feature integrated into MeshLab under the filter Sampling -> Hausdorff Distance.

To better visualize the error, the results of the calculated Hausdorff distance were displayed using a red-green-blue colormap, in which red indicates the maximum error and blue the minimum error. The geometric differences obtained by applying the Hausdorff distance are presented in Fig. 11c. The maximum error between the 3D models is 0.96% and the average error is 0.16%. Thus, the models generated using Tango technology and photogrammetry are approximately similar geometrically.
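The one-sided Hausdorff measure and a relative error in percent can be sketched as follows. This is a brute-force illustration with hypothetical function names, not MeshLab’s filter. The text does not state what the percentages are relative to, so the normalization by the bounding-box diagonal used here is an assumption, chosen because it is a common convention for reporting such distances.

```python
import math


def one_sided_hausdorff(sampled, target):
    """Max and mean of the nearest-neighbour distances from `sampled` to `target`."""
    nearest = [min(math.dist(p, q) for q in target) for p in sampled]
    return max(nearest), sum(nearest) / len(nearest)


def bbox_diagonal(points):
    """Length of the diagonal of the axis-aligned bounding box of a 3D point set."""
    mins = [min(p[i] for p in points) for i in range(3)]
    maxs = [max(p[i] for p in points) for i in range(3)]
    return math.dist(mins, maxs)


def as_percent_of_diagonal(value, points):
    """Express a distance as a percentage of the bounding-box diagonal (assumed convention)."""
    return 100.0 * value / bbox_diagonal(points)


if __name__ == "__main__":
    target = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    sampled = [(0.0, 0.0, 1.0), (3.0, 4.0, 0.0)]
    d_max, d_mean = one_sided_hausdorff(sampled, target)
    print(as_percent_of_diagonal(d_max, target), as_percent_of_diagonal(d_mean, target))
```

The measure is one-sided: it reports how far the sampled model strays from the target, and the symmetric Hausdorff distance would be the larger of the two one-sided maxima.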

However, a critical aspect of Tango scanning is the influence of lighting conditions and object material. We chose an object whose surface is only slightly reflective, and the captures were made in proper conditions, with enough light and under a cloudy sky. As shown in [16, 33], there is an obvious difference in the performance of a Tango device between low-light and high-light conditions. When lighting is insufficient, the obtained models have a reduced geometric quality. Moreover, a Tango-based device is not able to detect transparent surfaces or surfaces that are directly illuminated by sunlight. Even with these limitations, this method gives restorers a quick way to record vast amounts of data, combined with sufficient accuracy and ease of use.

The results show that it is possible to obtain morphometrically comparable 3D models using both image-based techniques. The photogrammetry technique produced clearer relief details of the artifact, and their depth appears much greater than in the model captured with the depth sensor. However, the analysis shows that the differences between the two models are very small and can be considered acceptable.

As mentioned above, transparent, shiny, reflective, and also dark surfaces are problematic for image-based 3D reconstruction. They influence the quality of the resulting model, but improvements in both hardware and algorithms are expected to allow future devices to overcome these unfavorable conditions and to provide solutions for the creation of 3D models of artifacts in the CH field. In this way, scientists and conservators can easily compare the information provided by different techniques.

Moreover, even if the native models produced by Tango are not sufficiently rich in information, there are various approaches in the literature that can enrich them with semantic and topological information [5]. The method is well suited to cases where time is a critical factor.

5 Conclusions

3D reconstruction is very important for analyzing, documenting and preserving Cultural Heritage. The aim of this paper was to investigate the application of an image-based reconstruction methodology based on Google Tango to the 3D reconstruction of historical artifacts. We attempted to evaluate the quality of the data generated by Google’s Tango Constructor Android application, compared with photogrammetry-based 3D reconstruction using commercial software, for a relatively large artifact with few surface features.

The 3D data comparison indicates that the Tango-based method is an efficient way to perform 3D reconstruction of historical artifacts and is able to provide morphometric data comparable with photogrammetry-based data. The analysis was performed by comparing the point clouds obtained using the two methods; the variation between them in terms of distances between points is relatively small, so the Tango-based model is quite accurate.

The performance of the free standalone Tango Constructor application was evaluated for an outdoor artifact that was considered suitable for 3D reconstruction due to the bright color and low reflectivity of its surface. Tango Constructor allows users to capture and view 3D models of objects or of their surroundings, and the models can be shared as mesh files. This is a quick way of creating 3D models, which can be used in other applications or software environments.

The presented method has a series of advantages, such as reduced data collection time, fast generation of the 3D model, cost savings, less maintenance, and ease of use. However, the method has certain limitations: the accuracy of the 3D reconstructed model is influenced by object materials and lighting conditions. Another limitation is related to the size of the object. If the object to be scanned is small or very small, it is reconstructed with very few details, is distorted, or cannot even be identified in the scanned scene.

It is certain that mobile devices equipped with sensors and appropriate software for depth sensing can find their application in the CH field. Laser scanners have already proven their usefulness and potential for acquiring high-quality 3D data, and photogrammetry or image-based techniques have their own advantages, but a combination of them leads to better results [34].

In conclusion, owing to its versatility in terms of cost, time and ease of use, Google Tango represents a viable solution for conservators and restorers, widening their choice of 3D reconstruction instruments. However, as this was a preliminary test, the study needs to be extended in order to assess whether Tango devices are adequate for specific applications and to determine whether this low-cost solution has the potential to make a difference in 3D reconstruction.

As future work, we plan to continue the investigation using different scenarios (e.g. indoor, outdoor), more objects with different shapes, and different lighting conditions. We also intend to conduct a study to evaluate the performance of various reconstruction applications that use Google Tango technology, some of which were presented in Sect. 2 of this article.