
1 Introduction

Augmented Reality (AR) and its applications have progressively gained the attention of both academia and industry, especially during the past two decades. AR works by placing virtual information or objects over the physical environment captured with a video camera, leading to a mixed reality that combines virtual and physical environments in a meaningful context [1]. The environment surrounding the user thus becomes interactive and can be manipulated digitally using AR technology [2].

AR-based tracking can be broadly categorized into two groups: marker-based and marker-less tracking [3]. Marker-less tracking uses feature-based or model-based approaches to calculate the camera’s pose and orientation, while marker-based techniques employ fiducial markers positioned in the real environment and tracked with a camera. These are usually passive markers with no electronics, printed with a variety of patterns on plain paper. ARToolkit [4], ARToolkit Plus [5], and ARTag [6] are a few popular marker-based AR systems.

Indoor navigation has always been challenging for visually impaired people carrying out their routine tasks. According to the World Health Organization’s fact file, about 285 million people have impaired vision [7]; of these, around 13% are completely blind and 87% have low vision. The white cane and guide dogs have been effective in many scenarios for helping blind people with mobility. The white cane works well for obstacles within its range, i.e. about a meter away. Guide dogs can assist in already known places, but are unacceptable in some societies [8].

Indoor positioning systems currently use various technologies for user localization. Wireless methods comprise GPS-based [9,10,11], infrared-based [8], NFC-based [12], Bluetooth-based [13, 14], and RFID-based [15, 16] techniques. A major drawback of these systems is that physical infrastructure must be installed in the target environment, e.g. Wi-Fi routers, RFID sensors, and Bluetooth beacons [17]. Even so, such solutions are prone to localization errors and inaccurate results [18]. In contrast, previous studies have shown that computer vision techniques can be effective for navigation systems and indoor positioning [19].

Despite the various approaches proposed by researchers, no existing application helps visually impaired people navigate easily inside large indoor buildings. The primary objectives of this research are:

  • An automated system for generating and augmenting paths in an indoor environment using marker-based computer vision methods with the help of a smartphone camera.

  • Development of an Android application that enables a visually impaired person to navigate easily in a large indoor environment using merely a smartphone.

2 Related Work

For outdoor navigation, GPS has been the de facto solution for positioning and user tracking. For indoor environments, however, no single technology has yet emerged to solve the problem. To address the challenge, various approaches have been proposed in the literature, and commercial solutions have been introduced in the market that utilize various sensors and hardware of the smartphone to localize the user’s current position. Such solutions include (a) dead reckoning systems, which employ the accelerometer, gyroscope, and magnetometer of the smartphone [20]; (b) received signal strength indication systems such as Wi-Fi, Bluetooth, and RFID; and (c) computer vision-based systems, which use the high computational capabilities and high-performance cameras of smartphones in either marker-based or marker-less approaches to calculate the user’s location and orientation in indoor environments.

The authors in [21] have proposed a marker-based navigation system using ARToolkit markers, which uses image sequences to locate the user’s position and overlays the user’s view with location information. The video stream, obtained from a camera mounted on the user’s head and connected to a tablet, is transmitted wirelessly to a remote computer, which performs ARToolkit marker detection, location recognition, and image sequence matching. This location information is then transmitted back to the user’s tablet. The system does not store any pre-defined map of the indoor environment, so the shortest path to the destination cannot be calculated. It relies heavily on a Wi-Fi network infrastructure deployed in the building for connecting to the remote server. The image recognition process is also very slow, since each input image is matched against a buffer of 64 images to calculate the user’s location.

In [22], the authors deployed ARToolkit markers at various positions in an indoor environment, detected using a camera attached to a laptop. The laptop displays a pre-defined 2D map of the indoor environment. A route planner algorithm calculates the current location of the user using a pre-defined matrix that represents the links between any two locations on the map. It assists the user both with an audio clip associated with the current location and by displaying navigational information over the video stream using AR techniques. The route planner algorithm cannot calculate the shortest path. Moreover, the user needs to carry a laptop with a connected camera.

Subakti and Jiang in [23] have used a combination of hardware and software to build a guidance and navigation system that lets new students experience the indoor building of a university. They used an HMD for the augmented display, an Android application for guidance and navigation, and microcontrollers deployed in the building for sensing light, temperature, and sound. BLE beacons are deployed at various locations in the building to broadcast location packets that the Android application senses. The system works in two modes: marker-based, using location-aware QR codes, and with invisible markers, using BLE packets for navigational purposes. The map of the building is created as a graph of BLE beacons and QR codes in which the shortest path can be calculated with Dijkstra’s algorithm [24]. The system works well, but its deployment requires a complex infrastructure of BLE beacons and microcontroller sensors.

Yin et al. [25] proposed a peer-to-peer (P2P) indoor navigation system that works without predefined maps or connectivity to a location service. Previous travelers record the path along which they navigate in a building and share it with new travelers. The existing path is merged with Wi-Fi measurements and other key points such as turns and stairs to create consolidated path information. A smartphone application, ppNav, assists a new user in following the path traces of the reference path generated by previous users. The authors of [26] used ultrasonic sensors and visual markers to assist blind users in navigating indoor environments. Obstacles are sensed with ultrasonic modules attached to a pair of glasses; an RGB camera is also attached to the glasses to detect markers in the environment. The map of the building is stored manually in the software, which makes it difficult to modify or edit paths.

Zeb et al. [8] developed a desktop application using the ARToolkit library to detect markers with a webcam attached to a laptop. Markers are deployed inside a building and their connectivity is entered manually as hardcoded entries in the application’s database, along with auditory information about each marker. A blind user can then navigate through the building by detecting the markers with the webcam and receiving audio information through headphones. The solution addresses the problem well but requires the user to carry a laptop. Moreover, hardcoding the path manually into the application makes it harder to extend or update the current path setup.

In [27], the authors developed an indoor navigation system comprising a laptop attached to the user’s back, an HMD for displaying augmented information, and a camera and inertial tracker attached to the user’s head. A wrist-mounted touchscreen device displays a UI for application monitoring and tracking. ARToolkit markers are deployed in the building, tracked by the head-mounted camera, and fed into the laptop for comparison against the pre-stored map of the building. The results are displayed on the HMD along with navigational aids using AR techniques. The system works well but is bulky, and under low-light conditions it does not identify the markers accurately. Map generation and storage also require manual coordinate editing.

Al-Khalifa and Al-Razgan in [28] developed a system named Ebsar, which uses a Google Glass connected to a smartphone to assist a visually impaired person in indoor navigation and positioning. The building is prepared with the help of a sighted person, called a map builder, who moves around the indoors of the building and explores different paths. The map builder marks every room, office, etc. with QR codes generated by Ebsar installed on a smartphone. The distance and direction between the QR codes are determined with the help of the smartphone’s accelerometer and compass sensors. All the information gathered is used to create a floor plan graph, with each node representing a checkpoint in the building such as a room, office, or stairs, and edges representing the number of steps and direction between checkpoints. The map is then uploaded to a central web server, which is available to any user with Ebsar installed on a smartphone. On first entering the building, the Google Glass worn by a visually impaired user detects the QR code, and the application automatically downloads the corresponding map file of the building to the user’s phone. The user can then use voice commands for both input and output of information about the current location. The system has been evaluated for performance and accuracy with several sighted and blind users, yielding acceptable results. However, it relies heavily on the smartphone’s accelerometer, which can introduce a margin of error in counting steps, and the user must constantly wear a Google Glass connected via Wi-Fi to the phone.

Another research effort [29] developed an indoor navigation system using a smartphone, custom 2D colored markers, and the accelerometer for step detection. Colored markers printed on plain paper are displayed at the entrance and other key intersection points inside the building. However, the exact position of each marker in the building has to be recorded offline, i.e. the system lacks automatic buildup of indoor paths. The distance between the markers is measured with the phone’s accelerometer. The system proves scalable and simple, yet has several drawbacks: poor detection of colored markers in low-light conditions, inability to work in multi-floor buildings, and inaccuracy in measuring steps using the accelerometer.

The authors in [30] propose an indoor navigation system using a smartphone, a newer version of Bluetooth known as Bluetooth Low Energy (BLE), and visual 2D markers. The building is split into multiple logical regions, each equipped with a BLE beacon. The visual markers, ArUco [31], are pasted on the floors of the building and detected by pointing the phone camera toward the floor. The location information decoded from a marker is used by the smartphone application, along with the beacon’s data, to localize the user in the environment. However, the markers are not inter-related; each provides information about the current position only. The system provides efficient and accurate positioning, but requires a beacon infrastructure deployed throughout the building, and no path calculation algorithm is proposed.

3 Proposed System

3.1 System Design

An indoor navigation system should be designed to ease the path generation and path augmentation processes, as well as to provide robust and accurate user localization in the environment. Such a system should also be flexible enough to support path editing and map extension.

With these goals in mind, we have designed a system that automatically detects fiducial markers and creates a floor plan, augments the markers with localization information, and at the same time provides an intuitive way to assist a visually impaired person in indoor navigation.

The system’s major function is to assist a visually impaired person with indoor navigation inside hospitals, universities, shopping malls, museums, and other large buildings. It facilitates navigation with auditory information augmented onto the real-world video stream. We have used ARToolkit markers printed on plain paper. These markers can be detected using an average-quality camera under the normal lighting conditions of an indoor environment [32]. The person has to carry a smartphone and headphones connected to the phone. The phone should have both a rear camera with a flashlight and a front-facing camera, and is installed with our indoor navigation application developed using the ARToolkit SDK (Fig. 1).

Fig. 1. (a) Path generation process. (b) Path augmentation process

3.2 Path Generation

The following steps are carried out to prepare markers that identify the paths inside a building.

  • The path generation process starts with registering ARToolkit markers in a library.

  • The required number of markers is prepared for all of the possible paths in a single-floor or multi-floor building.

  • The markers are printed on plain paper and pasted on the ceilings of the identified paths inside the building, i.e. in front of each point of interest such as a room, office, or lab.

  • When this is done, the user scans the building with the help of the Android application on a smartphone, detecting each marker with the phone camera.

  • As a marker is detected, a node for it is created in a graph data structure, storing information such as the marker’s unique identifier and its direction.

  • Upon detection of the next marker, the application connects it to the previously detected marker using an adjacency matrix, according to the following algorithm (a code sketch follows this list):

    • Suppose we have detected the first marker m1 and created its node in the graph.

    • Upon detection of the next marker m2, the application checks the angle θ between the y-axis of the camera and the y-axis of the marker.

    • If θ = 0°, m2 is straight ahead of m1.

    • If θ = 90°, m2 is to the right of m1.

    • If θ = 180°, m2 lies behind m1.

    • If θ = 270°, m2 is to the left of m1.

    • We take a ±45° range around each direction, because the camera’s y-axis need not be precisely aligned with the marker’s y-axis. For instance, to connect m2 straight ahead of m1, we accept any angle from −45° (i.e. 315°) to 45°.

    • Similarly, for connecting in the right-side direction we take the range 45° to 135°.

  • This way all of the hallways and corridors of the building are covered and a graph of the entire vicinity is built up in the application’s database.
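
To make the direction logic concrete, the following is a minimal sketch of the quantization and graph-linking step in Java. It assumes the marker identifier and the camera-to-marker angle θ are supplied by the marker detection callback; the names (MarkerGraph, quantize, connect) are illustrative, not taken from our implementation.

  // Minimal sketch of the direction quantization and graph-linking step.
  // The angle theta between the camera's y-axis and the marker's y-axis
  // is assumed to come from the ARToolkit detection callback.
  import java.util.HashMap;
  import java.util.Map;

  enum Direction { STRAIGHT, RIGHT, BACK, LEFT }

  class MarkerGraph {
      // adjacency: marker id -> (direction -> neighbouring marker id)
      private final Map<Integer, Map<Direction, Integer>> adjacency = new HashMap<>();

      // Quantize theta into one of four directions, each covering a +/-45 degree range.
      static Direction quantize(double thetaDegrees) {
          double t = ((thetaDegrees % 360) + 360) % 360;     // normalize to [0, 360)
          if (t >= 315 || t < 45) return Direction.STRAIGHT; // around 0 degrees
          if (t < 135)            return Direction.RIGHT;    // around 90 degrees
          if (t < 225)            return Direction.BACK;     // around 180 degrees
          return Direction.LEFT;                             // around 270 degrees
      }

      // Link a newly detected marker to the previously detected one.
      void connect(int prevId, int newId, double thetaDegrees) {
          adjacency.computeIfAbsent(prevId, k -> new HashMap<>())
                   .put(quantize(thetaDegrees), newId);
      }
  }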

3.3 Path Augmentation

After the path generation process is completed, another process, path augmentation, is carried out through the following steps:

  • A sighted operator with a mobile phone that has the application installed traverses the building again, holding the phone camera so as to capture a video stream of the ceiling and intersection points.

  • When a marker is detected, the application asks the operator for auditory information to be augmented with the marker. The operator records audio information for the marker and its corresponding location inside the building, such as Room No. 4, Office, or Lab.

  • Here the application also gives an option to add textual information for the detected marker, which the application can translate into other languages when desired by the user.

  • This way all of the hallways and corridors of the building are traversed and the application database is populated with auditory and textual information for all of the markers.
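
As a minimal illustration of this step, the sketch below attaches the recorded audio clip and the optional text label to the node of the detected marker; all names (AugmentedNode, PathAugmenter, augment) are hypothetical and only convey the data flow, not our actual implementation.

  // Hypothetical sketch: attaching auditory and textual information
  // to the nodes created during path generation.
  import java.util.HashMap;
  import java.util.Map;

  class AugmentedNode {
      final int markerId;
      String audioClipPath;  // e.g. a recording saying "Room No. 4"
      String label;          // textual info, translatable on demand

      AugmentedNode(int markerId) { this.markerId = markerId; }
  }

  class PathAugmenter {
      private final Map<Integer, AugmentedNode> nodes = new HashMap<>();

      // Called when the operator records information for a detected marker.
      void augment(int markerId, String audioClipPath, String label) {
          AugmentedNode node = nodes.computeIfAbsent(markerId, AugmentedNode::new);
          node.audioClipPath = audioClipPath;
          node.label = label;
      }
  }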

4 Technical Assessment and Discussion

For testing the system and the proposed algorithms for path generation and path augmentation, we designed several experiments. The experiments were carried out on the first floor of the Academic Block, University of Malakand. The actual floor plan of the selected building is shown in Fig. 2.

Fig. 2. Floor plan of the selected building and markers deployment

4.1 Path Selection

We selected four different paths for testing the path generation and path augmentation algorithms. Path-1 passes linearly along the corridor of the CS department from the Research Lab to the HOD office, while the other paths were selected according to the marker distribution in the department hallway, as shown in Fig. 3.

Fig. 3. Distribution of markers in the building and path directions

4.2 Experiment 1 – Path Generation

The primary objective of this experiment is to find the average time taken to detect markers using the smartphone camera, identify them, and connect them with each other to define a pathway inside the building. We also check how accurately the interconnections between the detected markers match the actual deployment of the markers inside the building.

The time taken to scan each path and subsequently generate its graph in the application’s database is shown in Table 1.

Table 1. Time taken in path generation

On comparing the floor graph created by the application for each path with the actual path inside the building, no errors were found in the marker interconnections.

4.3 Experiment 2 – Path Augmentation

In this experiment, we calculated the time taken to augment the selected paths with auditory and textual information. The average time taken by the application for this task, about half a minute, appears efficient. The results are shown in Table 2.

Table 2. Time taken in path augmentation

4.4 Experiment 3 – Path Extension

In this experiment, we extended an already stored path graph with additional markers. This situation arises when new markers are added to a building whose paths have already been generated. Considering Path-1 among the selected paths, we wish to extend the path and attach the markers with ids 29, 28, and 27. We start by selecting marker id 2 and moving along the new path, scanning with the phone camera until the last marker (id 27); the final path becomes (Fig. 4):

Fig. 4. Path-1 after extension to include new markers
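
Assuming the MarkerGraph sketch from Sect. 3.2, the extension amounts to re-running the same linking step starting from the existing marker. The marker ids below follow the experiment, while the angle values and class name are purely illustrative.

  // Hypothetical usage: extending an already generated path graph.
  class PathExtensionDemo {
      public static void main(String[] args) {
          MarkerGraph graph = new MarkerGraph(); // assume already populated for Path-1

          // Start at the existing marker (id 2) and link each newly
          // deployed marker to its predecessor as it is detected.
          graph.connect(2, 29, 90.0);  // marker 29 detected to the right of marker 2
          graph.connect(29, 28, 0.0);  // marker 28 straight ahead of marker 29
          graph.connect(28, 27, 0.0);  // marker 27 straight ahead of marker 28
      }
  }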

5 Conclusion

After reviewing several proposals and implementations presented in various studies for assisting visually impaired people in indoor navigation and localization, we have proposed a novel approach to the same problem. The solution has been implemented as an Android application and tested in an indoor environment for efficiency and effectiveness. It has a vital advantage over other solutions in that it requires only a smartphone with the application installed, using the camera to detect and identify plain markers and thus localize the user inside an indoor environment. It presents an automated path generation algorithm that simplifies the creation of pathways inside a building by merely detecting and connecting pre-deployed markers. Similarly, the path augmentation algorithm adds auditory and textual information to the path graph. Both algorithms have been tested in a real scenario, and the experiments have shown acceptable results. We have also tested the path extension algorithm, with which the existing path graph can be efficiently extended to include newly deployed markers.