1 Introduction

Over the past two decades, various attempts have been made to incorporate AR and VR technologies into mobile applications for the purpose of digitally enhancing on-site experiences at cultural heritage sites. However, although these attempts considered how the technological elements of a mobile application should be placed and implemented for a better user experience, they did not address what type of content should be given to visitors or how such content should be organized atop the technology. For example, the applications mostly focused on experimenting with visual overlay technologies such as marker detection and feature tracking. Furthermore, no platform has been available that can process retrieved data within a single system and effectively bind and link those data to suit the current context of the visitor. To address these issues, we designed and implemented the K-Culture Time Machine (KCTM) application, in which retrieved and structured data are presented according to the user's geo-context through an augmented and virtual reality visualization system.

The ‘K-Culture Time Machine’ project is a research and development project for culture and technology hosted by the Ministry of Culture, Sports and Tourism in Korea from 2014 to 2016. It aims to develop technology for collecting cultural contents with spatial and temporal information, establishing semantic correlations among them, and visualizing them on AR and VR service platforms. To carry out the project, we first developed data processing technologies that create, store, and retrieve cultural content based on the semantic relationships among the data. Through these studies, the diverse cultural contents of different organizations can be structured into new metadata-based cultural contents with temporal and spatial information, which can be provided to the tourism and IT industries and to the public. This allows users to semantically search and utilize diverse cultural heritage contents.

Next, based on robust and context-aware visualization techniques, users can experience more immersive content in the real or virtual world. In augmented reality visualization, the system enables real-time 6-DOF camera tracking and pose estimation by utilizing a 3-dimensional trackable map. This allows the user to track and identify PoIs (points of interest) and OoIs (objects of interest) in real time in the field, and to retrieve and view related information in AR. Virtual reality visualization is based on 360\(^{\circ }\) panoramic image data obtained from the field. The system renders various contents while taking the camera pose of the user's wearable device into account, so that the sense of presence is improved even in the virtual environment. The developed technology was launched as the ‘K-Culture Time Machine’ application on the iOS App Store on May 23, 2017. The application runs on a trial basis at Changdeokgung Palace and is undergoing improvements in technology and services through user validation and feedback.

In the system evaluation, the processing speeds of the AR and VR visualization modules were measured in terms of FPS. In each measurement the FPS averaged more than 50, and the augmented reality visualization module showed an average tracking speed of over 40 FPS. For the user evaluation, we conducted in-depth interviews during two days of open demonstration sessions held in the laboratory from November 2 to 3, 2017. A total of 43 participants experienced the application set in VR mode and were asked to freely state their impressions of the experience. Most of the participants expressed interest in viewing real-life locations and culturally meaningful monuments in a virtually and realistically recreated space. However, they stated that it was difficult to approach and acquire the multiple hierarchical layers of information on PoIs and OoIs, and that it would be more effective if such layers of information were organically linked and delivered in a more engaging way.

The remaining sections are organized as follows. In Sect. 2, we examine how augmented reality and virtual reality have been applied to cultural heritage sites and the problems encountered. Section 3 describes the overall system of the application and details the data processing and visualization modules, while Sect. 4 presents the system evaluation and user evaluation. Section 5 gives the conclusion and plans for future work.

2 Related Work

In the case of AR, the heritage industry has for some time considered the technology to be a key component in attracting and engaging visitors in unprecedented and innovative ways, thereby creating novel values [1]. Střelák et al. [2] showed through extensive user evaluations that AR technology is well received overall as a platform for digital guides at cultural heritage sites. However, the design and implementation of AR mobile applications for cultural heritage sites have mostly lacked a holistic perspective and a systematic approach to maximizing the effect of AR-specific features while retrieving existing content and providing it to visitors at timely moments during their individual, personalized experiences.

Casella et al. [3] conducted a state-of-the-art review of AR mobile applications for cultural heritage sites through observations held both online and offline. Before this work, available applications mostly focused on introducing the technology itself and experimenting with basic components of AR for cultural heritage sites, such as visual overlay, marker detection, feature tracking, GPS, PoIs, location-based narrative, contextual data, and personalization. Starting with the Touring Machine made by Feiner, MacIntyre, and Höllerer in 1997, they noted the development of AR as a platform that could potentially reinvent methods of cultural mediation. The more recent but nonetheless outdated examples they reviewed, such as Archeoguide and iTacitus, laid the groundwork for the now recurring concepts of mobile AR applications that could only be realized with technology as yet unavailable. As mobile AR technology became more usable, museums began launching applications such as Streetmuseum, Londinium, and Phillyhistory, with a common focus on visualizing related 2D and 3D contents of historic value over the physical places to which they refer.

More recent applications in the field have moved on from this exploratory stage to examine various design principles that can take full advantage of the AR technologies at hand and thus provide a better sense of engagement to visitors at heritage sites. de los Rios et al. [4] investigated how AR coupled with gamification, storytelling, and social media can engage visitors more actively through participation using a mobile device; to define the appropriate functions that the mobile application should provide, they referred to actual use cases from the TAG CLOUD project. Rattanarungrot et al. [5] proposed a system for presenting cultural objects on an AR platform by accessing an aggregated RCH data repository. Although these works considered how the technological elements of a mobile application should be placed and implemented for a better user experience, they did not address what type of content should be given to visitors or how such content should be organized using the technology. Furthermore, although the systematic handling and delivery of data should be a priority for mobile AR services at cultural heritage sites, the relations among the retrieved contents have not been sufficiently considered.

In the case of VR, various studies have been conducted to provide immersive experiences without visiting cultural heritage sites in person. For example, inside a virtual cave, users can experience various multimedia such as 2D and 3D animations, as well as 3D cinematography related to ‘the Mogao Grottoes at Dunhuang’, in an immersive and interactive way [6]. Students could explore a 360\(^{\circ }\) video displaying a cultural heritage site and solve problems designed with a gamification method [7]. Visitors could also experience real-scale graphics on an immersive stereoscopic display with hand-gesture interaction [8]. However, these projects mostly focused on presenting an immersive display of cultural heritage sites. They did not utilize the various multimedia scattered across databases related to the heritage site, and they did not provide personalized experiences that take the user's context into account.

3 KCTM Mobile Application

3.1 System Overview

The application provides mobile-based cultural heritage AR guides using vision-based tracking of cultural heritage sites. Through the embedded camera of a smartphone, the application can identify a cultural heritage site and provide related information and contents about it. At the same time, it can provide a remote experience across time and space for cultural heritage and relics through wearable, 360\(^{\circ }\) video-based VR. Users can remotely experience cultural heritage sites through 360\(^{\circ }\) video by installing the app on a smartphone mounted in an HMD, and can search for information on historical figures, places, and events related to the site. Furthermore, users can experience 3D reconstructions of lost cultural heritage.

Fig. 1. KCTM application system diagram

We designed the system diagram of the K-Culture Time Machine application as shown in Fig. 1. It is based on the modified AR reference model [9], a supplementary model derived from the AR reference model, an ISO standard, which provides a standardized workflow for interoperability with other augmented reality applications. As shown in Fig. 1, the application obtains sensor information about the real or virtual world and the user through embedded sensors. An AR tracker then performs user localization and PoI recognition from this information. The user can select a PoI by touching the corresponding UI button, and the related PoI contents are retrieved through the data retrieval process. Finally, the content renderer generates the AR or VR scene to present the content of the selected PoI to the user.
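As a rough illustration of this workflow, the following Python sketch traces the sensor-to-renderer pipeline of Fig. 1. All class and function names (e.g., GeoContextTracker, retrieve_contents) are hypothetical stand-ins rather than the actual application code, which is implemented on iOS.

```python
# Minimal sketch of the Fig. 1 workflow (hypothetical names, not the shipped app code).
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PoI:
    poi_id: str
    name: str
    contents: List[Dict] = field(default_factory=list)  # retrieved multimedia records


class GeoContextTracker:
    """Estimates the user's pose and recognizes nearby PoIs from sensor input."""

    def localize(self, camera_frame, imu_sample) -> dict:
        # In the real system this runs vision-based 6-DOF tracking (Sect. 3.2).
        return {"position": (0.0, 0.0, 0.0), "orientation": (0.0, 0.0, 0.0, 1.0)}

    def recognize_pois(self, pose) -> List[PoI]:
        return [PoI("poi-001", "Injeongjeon Hall")]


def retrieve_contents(poi: PoI) -> PoI:
    # Placeholder for the SPARQL/MySQL retrieval step (Sect. 3.3).
    poi.contents.append({"type": "text", "value": "Description of " + poi.name})
    return poi


def render_scene(pose, poi: PoI, mode: str = "AR") -> str:
    # The content renderer composes the AR or VR scene for the selected PoI.
    return f"{mode} scene at {pose['position']} with {len(poi.contents)} item(s) for {poi.name}"


# Usage: sensor input -> tracking -> PoI selection -> retrieval -> rendering
tracker = GeoContextTracker()
pose = tracker.localize(camera_frame=None, imu_sample=None)
selected = tracker.recognize_pois(pose)[0]  # the user taps a PoI button in the UI
print(render_scene(pose, retrieve_contents(selected)))
```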

3.2 Geo-Context Tracker

AR Tracker. In order to visualize heterogeneous AR contents through the application, real-time camera tracking and pose estimation techniques are necessary. In our previous study [10], we proposed an all-in-one mobile outdoor AR framework and a real-time camera tracking prototype for cultural heritage sites. Through this AR framework, we addressed how to acquire a 3-dimensional trackable map and how to enable robust 6-DOF camera tracking and pose estimation. An ORB [11] keypoint-based standard SfM (structure-from-motion) pipeline was applied to reconstruct keypoints and the camera poses of keyframes. To stabilize real-time camera tracking on a mobile device, a multi-threading technique was applied: the foreground thread tracks the inter-frame movement of ORB keypoints and estimates the camera pose by solving 3D-2D correspondences, while the background thread collects new candidate keypoints, which are evaluated by a keypoint matching procedure. In our implementation, both threads work concurrently. Finally, this AR tracking module was stabilized and applied to the application. Figure 2 shows the overall procedure of our AR tracking module.
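The pose-estimation step of the foreground thread can be illustrated with a simplified, single-threaded OpenCV sketch. The deployed module runs natively on the mobile device with a prebuilt SfM map and a multi-threaded pipeline, so the function below is only an approximation of the described procedure.

```python
# Simplified single-threaded sketch of ORB matching against the trackable map
# followed by 6-DOF pose recovery from 3D-2D correspondences.
import numpy as np
import cv2


def estimate_camera_pose(frame_gray, map_keypoints_3d, map_descriptors, K):
    """Match ORB keypoints of the current frame against the 3D trackable map
    and recover the camera pose (rvec, tvec) via RANSAC-based PnP."""
    orb = cv2.ORB_create(nfeatures=1000)
    kps, descs = orb.detectAndCompute(frame_gray, None)
    if descs is None:
        return None

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(map_descriptors, descs)
    if len(matches) < 6:
        return None

    # 3D map points and their matched 2D image observations.
    obj_pts = np.float32([map_keypoints_3d[m.queryIdx] for m in matches])
    img_pts = np.float32([kps[m.trainIdx].pt for m in matches])

    # RANSAC rejects outlier correspondences before pose estimation.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    return (rvec, tvec) if ok else None
```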

Fig. 2. System workflow diagram of the real-time camera tracking module. (left) Image acquisition and SfM process, (middle) generated trackable map, and (right) real-time camera pose estimation and tracking demonstration on a mobile device.

VR Tracker. The virtual reality visualization module was developed to remove the spatial constraint that the user must be in the field and to provide an immersive experience on a wearable platform. Virtual reality visualization is based on 360\(^{\circ }\) panoramic image data obtained from the field, allowing users to experience cultural heritage in a virtual environment. This module is implemented on iOS, and camera position estimation (6-DOF) is applied through a 3D spatial feature map and image matching on the wearable device. The module covers a wide range of user positions by tracking the direction and field of view (FoV) of the wearable camera, and it applies real-time content rendering that considers the camera pose of the user's wearable device to improve the sense of presence.
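A minimal sketch of the orientation-driven part of this module is shown below. It assumes an equirectangular panorama and a device orientation quaternion; the function names and conventions are illustrative rather than the shipped iOS implementation.

```python
# Map the wearable device's orientation to a viewport on an equirectangular
# 360-degree panorama (illustrative sketch only).
import math


def orientation_to_view_angles(qx, qy, qz, qw):
    """Convert a device orientation quaternion to yaw (heading) and pitch."""
    yaw = math.atan2(2.0 * (qw * qz + qx * qy), 1.0 - 2.0 * (qy * qy + qz * qz))
    pitch = math.asin(max(-1.0, min(1.0, 2.0 * (qw * qy - qz * qx))))
    return yaw, pitch


def viewport_center(yaw, pitch, pano_width, pano_height):
    """Pixel coordinates of the viewport centre in the equirectangular image."""
    u = (yaw / (2.0 * math.pi) + 0.5) * pano_width
    v = (0.5 - pitch / math.pi) * pano_height
    return int(u) % pano_width, int(v)


# Example: an identity orientation looks at the centre of the panorama.
yaw, pitch = orientation_to_view_angles(0.0, 0.0, 0.0, 1.0)
print(viewport_center(yaw, pitch, pano_width=8192, pano_height=4096))
```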

3.3 Data Processing and Delivery

The process of content retrieval for the selected PoI is divided into two parts. One retrieves information about the cultural heritage itself and multimedia contents from the integrated ontology, which aggregates the existing cultural heritage databases. The other retrieves multimedia contents created in modern times and related to the cultural heritage concerned, such as movies and dramas. For the cultural heritage content part, the application uses Korean cultural heritage web databases to deliver various cultural information and content related to the heritage site. First, we selected five cultural heritage databases that provide tangible and intangible cultural heritage information: ‘Cultural Heritage Administration of Korea’, ‘Museum Portal of Korea’, ‘National Palace Museum of Korea’, ‘Encyclopedia of Korean Culture’, and ‘Culture Content’. Then, we employed the Korean Cultural Heritage Data Model (KCHDM) [12] to aggregate heterogeneous data from these five web databases.

KCHDM comprises 5 super classes (Actor, Object, Place, Time, and Event) and 78 properties, so that it can represent the context of cultural heritage in terms of ‘who, when, where, what, and how’. To construct the semantic information model of the targeted heritage site, we used the information modeling framework proposed by Kim et al. [13]. In the first phase of information modeling, we established a knowledge base comprising cultural heritage entities and their relationships, which represent the cultural heritage features associated with PoIs according to the classes and inference rules of KCHDM. Then, we linked four types of web resources (i.e., text, audio, image, and video contents) to each cultural heritage entity. In the last phase, the information model was redesigned for mobile application users so that they can access it without unnecessary steps or duplicated contents.
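The following rdflib sketch illustrates how entities typed by the five super classes and their linked web resources might be expressed as a knowledge base. The namespace URI, property names, and example entities are hypothetical placeholders, not the actual KCHDM vocabulary.

```python
# Illustrative KCHDM-style triples (placeholder namespace and example entities).
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

KCHDM = Namespace("http://example.org/kchdm#")  # placeholder namespace
g = Graph()

# Entities typed by the five super classes: Actor, Object, Place, Time, Event.
g.add((KCHDM.Changdeokgung, RDF.type, KCHDM.Place))
g.add((KCHDM.KingJeongjo, RDF.type, KCHDM.Actor))
g.add((KCHDM.RoyalProcession, RDF.type, KCHDM.Event))

# Relationships expressing "who" and "where" for the event.
g.add((KCHDM.RoyalProcession, KCHDM.tookPlaceAt, KCHDM.Changdeokgung))
g.add((KCHDM.RoyalProcession, KCHDM.hasParticipant, KCHDM.KingJeongjo))

# A web resource (text, audio, image, or video) linked to a heritage entity.
g.add((KCHDM.Changdeokgung, KCHDM.hasWebResource,
       Literal("http://example.org/media/changdeokgung_intro.mp4")))

print(g.serialize(format="turtle"))
```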

The media asset database was built to contain multimedia content, such as historical movies and dramas, related to the cultural heritage sites. In designing the metadata elements and schema, we referred to three existing metadata sets: the ‘W3C core set of multimedia metadata [14]’, ‘Metadata Element and Format for Broadcast Content Distribution [15]’, and ‘Metadata Schema for Visualization and Sharing of the Augmented Reality Contents [9]’. Based on these references, we proposed newly modified and extended metadata elements and a schema [16], which aim to systematically manage the spatial and temporal information of video in the AR system. The media asset database was implemented in MySQL, and video resources from the media assets were incorporated into AR content when requested by the app. The KCTM app thus contains information on a PoI as well as information on the various tangible and intangible cultural heritage items linked to the PoI.
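As a rough illustration of how such a media asset record might be stored and looked up by PoI, the sketch below uses SQLite as a self-contained stand-in for MySQL. The table and column names are hypothetical, not the actual KCTM schema.

```python
# Hypothetical media asset table with spatial (PoI) and temporal links,
# plus a lookup of video assets for the PoI selected in the app.
import sqlite3  # stand-in for MySQL so the example runs without a server

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE media_asset (
        asset_id    INTEGER PRIMARY KEY,
        title       TEXT,
        media_type  TEXT,   -- e.g. 'video'
        uri         TEXT,
        poi_id      TEXT,   -- spatial link to a PoI
        period      TEXT    -- temporal information of the depicted scene
    )
""")
conn.execute(
    "INSERT INTO media_asset VALUES (1, 'Historical drama clip', 'video', "
    "'http://example.org/media/drama_clip.mp4', 'poi-001', 'Joseon dynasty')"
)

rows = conn.execute(
    "SELECT title, uri FROM media_asset WHERE poi_id = ? AND media_type = 'video'",
    ("poi-001",),
).fetchall()
print(rows)
```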

Fig. 3. Revised MAR contents metadata schema [17]

To retrieve cultural heritage information from the ontology and the relational database, we implemented SPARQL queries for the KCTM application. In the case of offline retrieval, the application retrieves the related PoI contents during the initialization process through SPARQL queries and then provides the contents of the selected PoI to the user without an additional retrieval step. In the case of online retrieval, the application retrieves only the contents related to the selected PoI, on demand and without the initialization process. Users can choose the appropriate retrieval method according to the network bandwidth and content volume. The retrieved content is stored according to a standardized metadata structure (Fig. 3) [17]. According to previous studies on augmented reality metadata for cultural heritage [18], various types of AR contents can be integrated, and the reusability of other AR contents that follow this standardized metadata structure can be guaranteed.
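The two retrieval modes can be sketched as follows. The SPARQL endpoint URL, the kchdm:hasWebResource property, and the caching logic are illustrative assumptions rather than the application's actual queries.

```python
# Hedged sketch of offline (cache at initialization) vs. online (on-demand) retrieval.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "http://example.org/kctm/sparql"  # placeholder endpoint


def query_contents(poi_uris):
    """Fetch web resources linked to the given PoI entities via SPARQL."""
    values = " ".join(f"<{uri}>" for uri in poi_uris)
    sparql = SPARQLWrapper(ENDPOINT)
    sparql.setQuery(f"""
        PREFIX kchdm: <http://example.org/kchdm#>
        SELECT ?poi ?resource WHERE {{
            VALUES ?poi {{ {values} }}
            ?poi kchdm:hasWebResource ?resource .
        }}
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [(b["poi"]["value"], b["resource"]["value"])
            for b in results["results"]["bindings"]]


# Offline mode: fetch contents for the PoIs once at initialization and cache them.
offline_cache = {}

def init_offline(all_poi_uris):
    for poi, res in query_contents(all_poi_uris):
        offline_cache.setdefault(poi, []).append(res)


# Online mode: query only the PoI the user has just selected, on demand.
def on_poi_selected(poi_uri):
    return offline_cache.get(poi_uri) or query_contents([poi_uri])
```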

3.4 User Interface and Content

The user interface has a wearable mode and a mobile mode, and mode switching is possible in both the real and the virtual space, as shown in Fig. 4. Although there is no functional difference between the modes, both are provided so that users can select the optimal experience mode according to the situation. To move to the next place in the virtual space, users touch an arrow pointing in the direction of movement, in a manner similar to general 3D map services such as Kakao Map road view and Naver Map, and the visualizer then displays the corresponding 360\(^{\circ }\) image. When a user touches a PoI in this environment, related information and multimedia contents can be viewed. The wearable mode provides a virtual reality experience using a smartphone HMD such as the Samsung Gear VR; in this mode, the user selects a target by resting their gaze on it for a certain period of time.
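The dwell-based gaze selection in wearable mode can be sketched as follows. The two-second threshold and the class interface are assumptions for illustration, not the values used in the shipped app.

```python
# Minimal dwell-time gaze selection sketch (hypothetical parameters).
import time

DWELL_SECONDS = 2.0  # assumed dwell threshold


class GazeSelector:
    def __init__(self, dwell=DWELL_SECONDS):
        self.dwell = dwell
        self.current_target = None
        self.gaze_start = None

    def update(self, target_under_gaze, now=None):
        """Call every frame with the UI element under the gaze cursor (or None).
        Returns the target once the gaze has rested on it long enough."""
        now = now if now is not None else time.monotonic()
        if target_under_gaze != self.current_target:
            self.current_target = target_under_gaze
            self.gaze_start = now
            return None
        if target_under_gaze is not None and now - self.gaze_start >= self.dwell:
            self.gaze_start = now  # reset so the selection does not repeat every frame
            return target_under_gaze
        return None


# Example: the user keeps looking at the 'poi-001' button for over two seconds.
selector = GazeSelector()
selector.update("poi-001", now=0.0)
print(selector.update("poi-001", now=2.5))  # -> 'poi-001' is selected
```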

Fig. 4. Wearable mode (left) and mobile mode (right)

Fig. 5. Multimedia information of a PoI in Changdeokgung Palace

The user can experience various multimedia information, such as text, sound, photos, video, and 3D content, for a specific object or place through interaction with the 2D UI of the visualization module, as shown in Fig. 5. In particular, 3D content was designed and developed by optimizing 3D models from existing cultural heritage databases, referenced from national cultural heritage portals such as ‘E-museum’ and ‘Cultural contents’, for the HMD mobile visualization module. Lightweight, resolution-conscious textures were produced so that the 3D content could run on mobile devices, and the optimized 3D models were stored and managed in the multimedia database. Users can experience lost cultural heritage in augmented reality and virtual reality through this content. For example, the ‘Seungjungwon’ building was a government office through which all documents, such as the king's letters and the officials' writings to the king, had to pass. It played a major role as the king's secretarial agency, sometimes exercising power over other institutions. However, the building has been lost, and visitors cannot see it in reality. Thus, based on the Dongwoldo, a painting of the palaces of the late Joseon dynasty, it was restored as 3D content and provided to users, as shown in Fig. 6.

Fig. 6. Virtual restoration of the ‘Seungjungwon’ building

4 Evaluation

4.1 System Evaluation

The AR tracker's processing speed was measured by the change in FPS during real-time tracking. The VR tracker's processing speed was measured by the change in FPS while loading related data when moving through the virtual space and while loading information when manipulating the UI. In each measurement the FPS averaged more than 50, and the augmented reality visualization module showed an average tracking speed of over 40 FPS (Figs. 7 and 8).
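For reference, a frame-rate curve like those in Figs. 7 and 8 can be produced with a simple measurement loop such as the sketch below; this is illustrative only and not the measurement code used in the evaluation.

```python
# Log FPS over time by counting frames in fixed-size windows.
import time


def measure_fps(render_frame, duration_s=10.0, window=30):
    """Run the render loop for duration_s seconds; report FPS every `window` frames."""
    samples, count, t_window = [], 0, time.perf_counter()
    t_end = t_window + duration_s
    while time.perf_counter() < t_end:
        render_frame()
        count += 1
        if count == window:
            now = time.perf_counter()
            samples.append(window / (now - t_window))
            count, t_window = 0, now
    average = sum(samples) / len(samples) if samples else 0.0
    return samples, average


# Usage with a dummy frame function standing in for the AR/VR visualization step.
samples, avg = measure_fps(lambda: time.sleep(0.016))  # ~60 FPS workload
print(f"average FPS: {avg:.1f}")
```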

Fig. 7. FPS change in the AR tracker

Fig. 8. FPS change in the VR tracker

4.2 User Evaluation

During two days of open demonstration sessions conducted in an indoor lab environment, we observed the behavior and responses of various users as they experienced our mobile application. The demonstration sessions were held from November 2 to 3, 2017 at the KAIST UVR Lab. A total of 43 participants voluntarily used the app, set in VR mode, wearing VR headsets adjusted to their preferences. The participants ranged in age from elementary school students and teenagers to adults in their thirties and forties. The average duration of the trials was approximately 5 min. Overall, the participants moved a great deal, turning their heads and bodies to fully grasp the 360\(^{\circ }\) space of the reconstructed parts of Sejongro and Changdeokgung Palace. They generally tried to interact with all the points of interaction presented as pointers or menu buttons within their field of view, which could be selected by aligning the red dot at the center of their view with the target and holding their gaze on it. Through this interaction, they tried to proceed through all the layers of information enabled by the ontology-based structure of the data system (Fig. 9).

After the participants completed the VR part of the application, we carried out in-depth interviews in which they were asked to freely state their impressions of and opinions on the experience they had just had. Most of them expressed interest in viewing real-life locations and culturally meaningful monuments in a virtually and realistically recreated space. They took note of the fact that an entirely fictional place was included within the real space. However, the common factor overriding their experiences was the pronounced difficulty in navigating and following the content provided with each response they were required to make with their gaze. Because most of the participants had not used VR headsets prior to the demonstration session, controlling the hardware itself was the initial hurdle they encountered while trying to ease into the experience provided by the application. They also felt that accessing multiple layers of information on the PoIs within their field of view was not intuitive enough and required some strained effort on their part. Most importantly, they felt that the text and images that were augmented virtually in consecutive order had no direct relation or meaning with respect to what came before or after.

Fig. 9. Introduction of the KCTM app for the user evaluation

5 Conclusion and Future Work

The ‘K-Culture Time Machine’ application uses real-time tracking of the user's location and pose through AR and VR visualization technology, and provides various existing multimedia related to cultural heritage based on a structured metadata schema and ontology. This is significant in that it experimented with how to utilize and visualize existing data on AR and VR platforms, compared with existing apps that simply focused on creating and showing new virtual content related to cultural heritage sites. However, as mentioned in the user evaluation, one critical weakness of the current system is that the ontology-based data organized through our own data structure cannot yet be delivered in coherent, cohesive, and meaningful ways that also consider the context of the user and the paths they are given to navigate multiple PoIs. As a result, users felt that the information provided to them was often fragmented and lacked a clear point of view to sufficiently guide them through an unfamiliar site or to reorient them to a site of historic value that they already knew.

To address this issue, in a follow-up study [19] we devised a framework that utilizes the merits of our ontology- and metadata-based multimedia retrieval system to remediate each content item in a new context through storytelling. The framework focuses on bringing together contextual information and related multimedia content of different PoIs within a single heritage site, referring to principles of experience design and gamification, in order to produce guide routes that connect multiple PoIs in a given order. After designing a themed narrative that embeds video data related to the PoIs, retrieved and filtered through the metadata system we created, we conducted an extensive between-group user evaluation at Changdeokgung Palace to compare the detection- and search-based experience of the current KCTM application with an AR mobile application built around the proposed narrative. The results showed that storytelling, as a content-remediating and route-creating method, significantly enhances the immersion and interest of users who consult additional digital material in an AR and/or VR environment, thereby improving their overall experience of a cultural heritage site.

In a follow-up project, we plan to include multiple narratives in an upgraded version of the application in order to provide various route options that structure and arrange our data in meaningful ways while effectively linking the various PoIs within the site. The subsequent project began in June 2017 under the title ‘Technology development for building cultural content creation and production infrastructure utilizing spatial data’. The objective of this research project is to reduce the cost of producing content that uses real-world spatial information. To this end, we have started developing technology that converts the mass of nationally produced spatial information into specialized spatial assets for cultural content in the cultural industry. The motivation for these studies is as follows: there is growing interest in using high-precision real-world spatial data in games, movies, art, tourism, advertising, and entertainment, and the costs associated with deploying such data are increasing; it is therefore necessary to provide a data acquisition infrastructure to reduce costs in these areas. The project consists of four main parts: creating spatial content, end-user services, standardization, and content demonstration services.

First, the creation of spatial culture content concerns the technology linking multi-layered space with diverse multimedia, including technologies for creating user-generated spatial content in addition to conventional multimedia content. The second part allows users to semantically search multimedia based on temporal and spatial information and to visualize the content on an end-user system; it also includes technology that applies such content to commercial advertising and personalized services. The spatial content standardization part involves standardizing cultural content related to multi-layered space and guaranteeing its copyright. Through these studies, affordable cultural content built on highly accurate spatial assets can be served to the public. Finally, in the content demonstration services, we will implement a more user-centered service by adopting the multiple narratives mentioned above in order to provide a more meaningful user experience across the various PoIs.