
1 Introduction

Controlling air traffic from somewhere other than a local tower is the core of Remote Tower Operations (RTO). Thanks to a visual reproduction of the out-of-the-window view in a digital video panorama, one or more aerodromes can be controlled remotely from a Remote Tower Center (RTC). Originally conceived at the German Aerospace Center (DLR) Brunswick [1] to provide air traffic control (ATC) at a better cost-efficiency ratio, the idea of remotely controlled air traffic spread quickly and was first put into operation in 2015 [2]. Especially for regional airports struggling with financial issues, RTO represents an efficient solution. Beyond economic benefits, RTO could even outperform conventional tower control, thanks to assistance systems that could support air traffic control officers (ATCOs) in the future. In this context, the German research project “INVIDEON” investigates how to assist ATCOs in RTO. More specifically, it concentrates on augmented reality based on the fusion of visual spectrum (VS) and infrared (IR) images in the digital video panorama (output). A second research question investigates how to support this new work environment with adapted input devices (e.g. control of the fusion level, extended use of the pan-tilt-zoom function). In this paper, we first give a theoretical background on augmented reality in ATC, on the advantages of VS and IR images, and on input devices used in RTO environments. We then describe the methods and contents of the three INVIDEON workshops. Finally, we present current results and give an outlook on further research.

2 Augmented Reality in Air Traffic Control

Without any system providing augmented reality, ATCOs perceive only what their biological senses allow. By contrast, augmented reality lets users perceive more than they otherwise would by adding supplementary information (e.g. visual cues) about their environment. For ATCOs, who rely especially on their visual faculties to perform their daily tasks at work [3], augmented reality has the potential to provide valuable assistance. Past research has already developed concepts for augmented reality in conventional tower control. Through head-mounted displays [4,5,6,7,8,9] or holographic screens [10], ATCOs can be provided with supplementary information they would not see through the out-of-the-window view. Concerning RTO, implementing augmented reality seems to be even easier than in a conventional tower environment. Given that RTO is already based on the visual presentation of an aerodrome in a digital video panorama, features such as aircraft detection and identification or aerodrome information such as weather, wind or stop bars can be integrated directly into the video panorama [11]. Thus, the latency between the occurrence of a stimulus and the display response, which is likely to appear with optical see-through displays, can be reduced [12]. With “Head-up Only”, Papenfuss and Friedrich [13] designed a concept aiming to increase visual attention through additional information in the video panorama (e.g. approach radar, pan-tilt-zoom camera (PTZ), electronic flight strips, coupled radio frequency, weather data). Due to decreased head-down times in such a working environment, ATCOs are expected to work more efficiently, since the eyes no longer have to re-accommodate between head-down and head-up views. The anticipated benefits become even more pertinent when visual information is degraded or even unavailable due to bad weather conditions or at night. This is where INVIDEON comes in.

3 INVIDEON

In the context of the further development of RTO, INVIDEON aims to improve the design of the video panorama through augmented reality, using optical sensors only. Currently, the standard set-up for RTO is a VS video panorama that reproduces the out-of-the-window view from a conventional tower. Furthermore, ATCOs use a PTZ as a replacement for conventional binoculars to magnify distant objects of interest. As an extension to the standard set-up, some RTCs present IR information on extra screens to provide supplementary information when visual conditions are degraded. To exploit the advantages of having both sources of visual information available when needed, the next paragraphs point out the characteristics of VS images and IR images.

3.1 Characteristics of VS Images

The visual output of an RTO video panorama based on VS images is oriented towards the visual faculties of the human eye. Color vision is a faculty that helps humans to distinguish objects from each other, as it increases the contrast between them if their colors differ. Objects of interest can therefore be detected, recognized and identified more easily [14]. Humans perceive colors because surrounding objects reflect electromagnetic waves that are captured by dedicated photoreceptors on the retina (cones) if their wavelength lies within the spectrum from 380 nm to 780 nm [15]. However, color vision works only in daylight or under artificial light, and best under good visibility conditions, since cones are only activated in the presence of visible light of sufficient intensity. This also applies to visual acuity, another faculty of the human visual system. Thanks to visual acuity, the texture of an object of interest can be perceived in detail and the object thus recognized and identified. More precisely, the perceived decrease in the size of texture elements provides important information about the depth of a scene [16]. Without depth perception, ATCOs could not correctly assess the speed, distance or size of an object in space. In summary, the video panorama with its transmitted VS images provides ATCOs with almost everything they would see from a conventional tower. Consequently, the visual environment, with almost all of its advantages, is one they are already used to. As stated above, the perception of information through VS images works best under good light conditions because regular VS sensors only detect reflected sunlight or artificial light. Therefore, detection, recognition and identification based on color and depth perception are strongly impaired under bad visibility conditions and become impossible in the dark.

3.2 Characteristics of IR Images

For almost eighty years, military institutions have been using IR sensors to detect targets even in the dark [17, 18]. IR technologies are able to detect electromagnetic waves beyond the visible spectrum: IR wavelengths range from 780 nm to 1 mm and are therefore not visible to the human eye. Next to thermal detectors, photon detectors are among the best-performing IR technologies [19]. More precisely, they capture the radiation of an object of interest; by interacting with electrons on the optical sensor, this radiation generates an electrical output signal [19]. These signals are transformed and displayed as an IR picture, which humans perceive as a poorly textured, black-and-white image. As described by Planck’s law, all object surfaces emit electromagnetic radiation with wavelengths corresponding to their temperature. For usual ambient temperatures, the maximum of the emitted radiation lies in the IR spectral band. In contrast to VS camera sensors, which detect light reflected by objects, IR sensors detect this self-emitted thermal radiation of surrounding objects. Therefore, warmer surfaces (e.g. aircraft engines, humans or birds), usually displayed brighter, can be distinguished in high contrast from cooler ones (e.g. ground, sky). This temperature-based contrast, compared to the color-based contrast of the VS image, makes it easier to detect and track objects in the IR image. As IR imaging does not need sunlight or artificial light to display objects, night vision is possible, and the longer wavelengths improve vision under bad weather conditions (e.g. snow, fog and rain).
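For illustration, the relation between surface temperature and emitted wavelengths can be summarized by Planck’s law and Wien’s displacement law (standard radiometry, not specific to the INVIDEON material): at a typical ambient temperature of about 300 K, the emission peak lies near 10 µm, i.e. in the long-wave IR band rather than in the 380–780 nm visible band.

```latex
% Planck's law (spectral radiance) and Wien's displacement law
B_\lambda(\lambda, T) = \frac{2hc^{2}}{\lambda^{5}}\,
  \frac{1}{\exp\!\left(\dfrac{hc}{\lambda k_{B} T}\right) - 1},
\qquad
\lambda_{\mathrm{max}} = \frac{b}{T},\quad b \approx 2898\ \mu\mathrm{m\,K}
\;\Rightarrow\; \lambda_{\mathrm{max}} \approx 9.7\ \mu\mathrm{m} \text{ at } T = 300\ \mathrm{K}.
```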

3.3 Workplace to Enable Fusion of VS and IR Images

As the previous paragraphs about VS and IR images have emphasized, there are noticeable advantages to using both optical modes. The permanent availability of VS and IR camera information could help ATCOs in specific situations. However, if this information is presented separately, it could also lead to higher head-down times and therefore to lower situation awareness or increased workload. The first goal of INVIDEON therefore consists of developing a demonstrator able to display VS and IR camera images simultaneously, merged into one video panorama. As a second goal, this fusion needs to be controlled by adapted input devices that are tested with end users. In addition, the integration of the PTZ function into the merged RTO environment is to be tested, as well as its associated control modalities. This paper focuses on the development of a rapid prototype of such a system and gives an outlook on further research within INVIDEON.

4 Methods

A user-experience-focused approach formed the methodological framework for the three explorative workshops carried out within INVIDEON. For adequate human-machine interaction (HMI) design, rapid prototyping methods were applied with the aim of providing user-centered systems. Therefore, the users’ perception of a merged VS/IR video panorama, of adapted input devices and of the PTZ control was taken into account before, during and after the prototyping process. In this section, the generally applied methods are described first, followed by the detailed methods of each workshop.

4.1 Participants

A total of seven ATCOs (all male) took part in the three workshops. Four ATCOs joined each of the first and second workshops; three ATCOs took part in the third workshop. Three ATCOs were present at two workshops; one ATCO participated in all three workshops. Their professional responsibilities included runway and ground control at regional airports. They participated voluntarily and were recruited by DFS Aviation Services, an INVIDEON partner.

4.2 Material

Input Device Material.

For workshop 1, three input devices to control the PTZ camera were provided: a 3D-mouse, eye-tracking glasses and a touch input device via tablet. The 3D-mouse is a device that allows ATCOs to control the PTZ camera in a three-dimensional manner. More specifically, ATCOs can click on an object of interest, magnify it with a zoom function gradually or stepwise on different levels, and track it manually. With the eye-tracking glasses, the PTZ camera can be controlled by the captured eye fixations and by nodding. Reflective targets at the glasses’ edges reflect infrared radiation back to sensors attached to the RTO test platform. When the ATCO fixates an object of interest and nods, the requested object is magnified on a screen. By means of the tablet touch input, ATCOs control the PTZ via an airport map and a miniature panorama of the exterior view displayed on the tablet. Some areas of interest are tagged on the map; they can be selected by tapping on the tablet, whereupon the PTZ automatically focuses on these hotspots. Furthermore, the size of objects of interest can be increased. Independent of the input device, the PTZ video was displayed on a separate monitor and not yet included in the video panorama.
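As an illustration of the hotspot-based control, the following sketch shows how a tapped map position could be converted into pan and tilt angles for the PTZ camera. This assumes a simple geometry in a local camera-centered frame; the function and coordinate names are hypothetical and do not describe the actual INVIDEON implementation.

```python
import math

# Hypothetical sketch: convert a hotspot position (metres, in a local
# east/north/up frame centred on the PTZ camera head) into pan/tilt angles.
def hotspot_to_pan_tilt(east: float, north: float, up: float):
    """Return (pan, tilt) in degrees; pan 0 = north, clockwise positive."""
    pan = math.degrees(math.atan2(east, north))
    ground_range = math.hypot(east, north)
    tilt = math.degrees(math.atan2(up, ground_range))  # negative = below the horizon
    return pan, tilt

# Example: a hotspot 850 m east, 850 m north and 20 m below the camera
pan_deg, tilt_deg = hotspot_to_pan_tilt(850.0, 850.0, -20.0)
print(f"pan {pan_deg:.1f} deg, tilt {tilt_deg:.1f} deg")  # pan 45.0 deg, tilt -1.0 deg
```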

For workshop 2, a 3D-mouse to control the VS/IR fusion was provided. With this input device, ATCOs can control, gradually or stepwise, to what extent the video panorama is displayed in the IR, fully merged or VS range.
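A minimal sketch of such a gradual overlay is given below, assuming a simple linear blend of co-registered VS and IR frames; the actual fusion algorithm used in INVIDEON (provided by Fraunhofer IOSB, cf. Sect. 4.2) is not described in this paper, so this only illustrates the idea of a continuous fusion level with optional fixed steps.

```python
import numpy as np

# Minimal sketch (assumption, not the Fraunhofer IOSB algorithm): linear blend
# of co-registered VS and IR frames of identical shape and 8-bit depth.
# `level` is the fusion degree set by the input device: 0.0 = pure VS, 1.0 = pure IR.
def blend_vs_ir(vs_frame: np.ndarray, ir_frame: np.ndarray, level: float) -> np.ndarray:
    level = float(np.clip(level, 0.0, 1.0))
    mixed = (1.0 - level) * vs_frame.astype(np.float32) + level * ir_frame.astype(np.float32)
    return mixed.astype(np.uint8)

# Fixed presets for stepwise "jumps" between modes, plus free gradual control in between
PRESET_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]
```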

Image and Video Material.

For workshop 1, single IR and VS video streams as well as single IR and VS images and two merged VS/IR images (one closer to VS and one in “pseudo-colors”) were available. The VS video panorama and the IR video showed scenarios from Braunschweig Wolfsburg Airport (BWE). The single VS/IR and merged image material was selected by project partners. Both the video and the image material were provided to show ATCOs the characteristics of IR and VS and to highlight their respective advantages. Two versions of merged VS/IR images were prepared to give an impression of how a VS/IR fusion could be displayed.

In preparation for workshop 2, several hours of traffic were recorded simultaneously with VS and IR cameras. The videos were taken on March 7th 2017 at BWE under visual meteorological conditions with a mobile camera carrier belonging to Rheinmetall Defence Electronics. The video material contained regular traffic (IFR & VFR) and commissioned flights (VFR) to provide a variety of elements that an ATCO would normally have to handle at a regional airport. The selected scenario covered a 20-min period with a maximum of occurrences and additionally contained other events, for instance bird flocks. For workshop 2, a fully merged IR and VS panorama was provided by Fraunhofer IOSB (cf. Fig. 1).

Fig. 1. VS and IR mode in a fully merged version.

Simulation Material.

In preparation for workshops 1 and 3, different scenarios were created on the simulation platform at DLR. The content of each scenario was created step by step based on the project goals and the ATCOs’ feedback from the previous workshop.

For workshop 3, a rapid prototype (cf. Fig. 2) was created relying on the feedback and findings from workshops 1 and 2. A head-up display of the PTZ and the merged VS/IR video panorama represent the output core of the prototype. Accordingly, a chart with integrated hotspots, a 3D-mouse-inspired digital PTZ control and a digital slide bar to control the overlay formed the basis of the ATCO’s control monitor (cf. Fig. 3).

Fig. 2. Prototype of ATCO workplace in workshop 3

Fig. 3. Control monitor with interactive chart for PTZ function (1), digital 3D-mouse inspired PTZ-control input device (2) and slide control input device for VS/IR overlay (3)

Data Collection Material.

The data collection was based on qualitative methods such as active brainstorming, open discussions and semi-structured interviews. Furthermore, quantitative data were collected with the System Usability Scale (SUS) [20]. The use of an adapted Cooper-Harper scale represented a mixed qualitative and quantitative approach.
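For reference, standard SUS scoring converts the ten 5-point items into a 0–100 score; the sketch below follows the conventional scoring rule (how exactly the scores were aggregated in INVIDEON is an assumption here, and the example answers are made up).

```python
# Standard SUS scoring (ten items, each answered 1-5). Odd-numbered items
# contribute (response - 1), even-numbered items contribute (5 - response);
# the raw 0-40 sum is scaled to 0-100 by multiplying with 2.5.
def sus_score(responses):
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS expects ten answers on a 1-5 scale")
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # indices 0, 2, ... are the odd-numbered items
        for i, r in enumerate(responses)
    ]
    return sum(contributions) * 2.5

# Hypothetical example: moderately positive answers yield a score of 75.0
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```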

4.3 Workshop 1

Goals of Workshop 1.

The first goal of workshop 1 was to present single VS and IR video streams as well as single VS and IR images and two differently merged VS/IR images, in order to obtain the ATCOs’ feedback on the perceived advantages and disadvantages of the VS and IR modes and their first impression of merged VS/IR material. Secondly, workshop 1 aimed at testing three different input devices for controlling the PTZ function integrated into a video panorama.

Procedure of Workshop 1.

In part one, ATCOs evaluated single VS/IR video streams and images as well as two differently merged VS/IR images. In cooperation with human factors specialists, the ATCOs were invited to compare the two display modes and to point out the advantages and disadvantages of each in relation to their daily ATC practice. Furthermore, they were asked to give feedback on the two differently merged VS/IR images.

In part two, ATCOs used the simulation platform at DLR to test the three PTZ control input devices by means of a prepared traffic scenario under visual meteorological conditions. The input devices were a 3D-mouse, eye-tracking glasses and a touch input device via tablet. Only one input device was tested per scenario. After each run, ATCOs completed a SUS questionnaire [20] that first evaluated the utility and usability of the tested input modality on a 5-point Likert scale (1 = totally disagree to 5 = totally agree). At the end of the questionnaire, they were asked about advantages, disadvantages, possible improvements and supplementary comments they associated with the tested input device. After all three runs, ATCOs were debriefed and interviewed about their experiences with the different input devices during the experiment.

4.4 Workshop 2

Goals of Workshop 2.

Workshop 2 focused on presenting a fully merged VS and IR video stream to the ATCOs in order to receive their feedback, from an operational point of view, on advantages, disadvantages and possible improvements to the VS/IR control device.

Procedure of Workshop 2.

The DLR simulation platform was used to show the fully merged video stream of the real-time traffic scenario shown in Fig. 1. The ATCOs were reminded of the advantages of both VS and IR modes and asked to manually control the VS/IR fusion degree with a 3D-mouse, depending on the visual cues they wanted to detect and recognize. With the 3D-mouse, ATCOs were able to shift smoothly in gradual steps from IR to VS by turning the input device, or to make bigger progressive steps by tapping on it. While the participants were watching the video and testing the VS/IR control features, the experimenter encouraged them to change the display mode between VS and IR at specific events in the video (e.g. a grey plane in front of a grey sky). Thus, all ATCOs saw the same situations in both modes of presentation as well as at different fusion degrees. At the end of the scenario, the experimenter asked the ATCOs questions in a semi-structured interview about their opinions on object detection, weather and light, input modalities and usability.

4.5 Workshop 3

Goals of Workshop 3.

Workshop 3 aimed at testing the elements elaborated in the previous workshops combined in one prototype. This set-up includes a head-up PTZ camera display controlled by a 3D-mouse-inspired digital input device and a slide bar for VS/IR fusion control. The ATCOs’ feedback on the tested prototype was to be provided to the project partners so that they could adapt it better to the operators’ needs.

Procedure of Workshop 3.

Two ATCOs participated in the study at the prototype test platform at the same time. One had the role of executing ATCO-relevant tasks while the other acted as an expert observer; each ATCO performed both roles. The complete exercise run took two hours in total. ATCOs began with a 30-min training session, which was followed by traffic scenarios under CAVOK, foggy and night conditions; each scenario took 30 min. During the exercise run, the expert observer completed adapted Cooper-Harper scales to estimate the management of the traffic situation depending on visibility conditions, the use of the VS/IR fusion tools and the PTZ control. After each run, the active ATCO completed a SASHA questionnaire [22], rating their perceived situation awareness on a 7-point Likert scale (1 = totally disagree to 7 = totally agree), as well as the utility and usability (SUS) of the previously tested system. In a debriefing phase, ATCOs could add comments, opinions and further suggestions on the exercise and the setting.

5 Results

Due to the low number of participants, the recorded data were analyzed descriptively. In the following sections, the results are described separately for each workshop.

5.1 Results of Workshop 1

Feedback on Single VS/IR Images and Merged VS/IR Images.

The feedback on the single VS/IR video streams as well as on the single VS/IR images showed that the ATCOs perceived the difference in information they obtained from each display mode. The idea of having access to additional visual cues through an IR overlay in lower visibility conditions was perceived positively. From an operational point of view, the ATCOs pointed out requirements to prevent a loss of realism and false interpretations (e.g. jet wash that looks like fire in IR), and raised liability questions.

Concerning the two differently merged VS/IR images, they preferred the version that was closer to the VS mode over the one relying on “pseudo-colors”.

Test of PTZ Control Input Devices.

The 3D-mouse attained a total score of 61 out of a maximum of 100. According to Bangor et al. [21], this score indicates that the utility and usability of this device was rated as “ok”. The ATCOs stated that they especially appreciated the intuitive handling of the 3D-mouse, but criticized the latency in system reaction during manual object tracking, which could result in increased head-down times.

The eye-tracking glasses achieved a total score of 51, which suggests a rather poor utility and usability rating. Even though the ATCOs were very fond of the idea of not having to control the PTZ manually, disadvantages emerged from a practical perspective: the ATCOs criticized that nodding was rather cumbersome and that the eye-tracking function was not as accurate as expected. Above all, the ATCOs stated that the glasses were not comfortable to wear. As an improvement, they endorsed the idea of an accurate eye-tracking instrument without glasses that magnifies an object of interest by other means.

The touch input via tablet attained a score of 43, which also indicates poor perceived utility and usability. Despite the positive aspect of having a good overview of the hotspots, the ATCOs criticized the amount of hand movement necessary to operate the PTZ. Furthermore, not being able to swipe over the touchpad but having to tap constantly was perceived as an obstacle to active object tracking.

In summary, the touch input via tablet scored lowest (N = 4; SD = .48), followed by the eye-tracking glasses (N = 4; SD = .43); the perceived utility and usability was rated highest for the 3D-mouse (N = 4; SD = .6) (cf. Fig. 4).

Fig. 4. SUS-Score of tested PTZ control input devices

5.2 Results of Workshop 2

The results of workshop 2 show a variety of first impressions of the fully merged video stream.

Generally, the ATCOs hesitated to take a clear position on aircraft identification and on the correct assessment of an aircraft’s speed, acceleration and heading. Moreover, they expressed that they would like to see other visibility conditions, such as night and fog scenarios.

Concerning the VS/IR control input device, one positive aspect was the possibility to “jump” from IR over fixed overlay degrees to VS. Nevertheless, other ATCOs preferred the gradual movement they could apply to smoothly overlay the IR range with the VS range. As an additional result, the suggestion emerged to replace the 3D-mouse with a digital slide control device.

5.3 Results of Workshop 3

The results of workshop 3 will be described in terms of situation awareness, perceived utility and usability as well as in terms of estimated traffic situation management.

Concerning perceived situation awareness, the ATCOs achieved the highest score in the foggy condition (N = 3; SD = .1), followed by the night condition (N = 3; SD = .48) and the CAVOK condition (N = 3; SD = .25) (cf. Fig. 5). The mean perceived situation awareness ranged from good (CAVOK) to very good (night and fog conditions).

Fig. 5. Perceived situation awareness per experimental condition

The perceived utility and usability attained a mean score of 85. According to Bangor et al. [21], this score indicates that the utility and usability of the system was rated as “excellent”.

The results show that the observer perceived no major impairment while the active ATCO performed ATC-relevant tasks and operated the VS/IR fusion and PTZ control functions.

6 Discussion

Throughout the three workshops, an RTO prototype equipped with augmented reality features such as VS/IR fusion and a head-up PTZ display with adapted input devices was developed with a user-centered approach. Starting from a general and abstract idea, human factors specialists, project partners and end users worked out a concept that improved gradually.

Beginning with the discovery of the advantages of VS and IR modes in ATC, concrete ideas were developed in the first two workshops to redefine requirements. In fact, presenting more information to the ATCOs than they currently perceive under restricted visibility conditions would influence their work methods. Certainly, they could work at a rather constant workload if air traffic does not decrease due to bad weather conditions. Nevertheless, it has to be clarified what happens in terms of communication and liability when ATCOs see more than pilots. Concerning the VS/IR fusion modalities, ATCOs had a clear preference for a merged image that is closer to what they see in VS. However, preferring a more “realistic” image is not surprising considering the ATCOs’ work methods.

Regarding perceived situation awareness in workshop 3, the ATCOs rated it highest in the foggy condition (N = 3; SD = .1), followed by the night condition (N = 3; SD = .48). The CAVOK condition (N = 3; SD = .25) scored lowest, but still as “good”. These results can be explained in two different ways. On the one hand, it is possible that the ATCOs detected objects of interest better due to the predominant use of IR, which results in higher perceived contrast thanks to sharp-edged contours. On the other hand, a training effect may have occurred: the ATCOs were already better trained in the fog and night conditions than in the CAVOK condition, which was the first condition after training.

Compared to the utility and usability ratings of the PTZ control input devices tested in workshop 1, the perceived utility and usability of the final prototype increased to “excellent”. The relatively high score can be explained by the results and the progress throughout the three workshops, but it also has to be considered that the user-centered approach might have had an impact on the results. Integrating the final users’ suggestions into the construction cycle and adapting the object under development to their specific needs is fundamental for successful HMI design. Such an approach is predictive of higher user acceptance and satisfaction. As past research suggests, letting final users participate in change processes reduces their resistance to change [23]. Therefore, ATCOs should continue to be included in future studies and workshops. In this case, the created prototype could inspire ATCOs towards further implementation strategies. In a next step of INVIDEON, the concept will be tested by means of live video material. Especially when integrating new features into an RTO environment that involve VS/IR video fusion, it is necessary to test them with real videos rather than in a simulation environment only. Another planned activity is to develop automatic IR tracking as an ATCO assistance extension to PTZ object following.