1 Introduction

LFV, the Swedish Air Navigation Service Provider, focuses on the digitalization of its services, in particular the deployment of digital air traffic control services for small and medium-sized airports. The Swedish airports Örnsköldsvik and Sundsvall were put into remote operation from the remote tower center (RTC) in Sundsvall in 2015. The airports Linköping, Malmö, Umeå, Östersund and Kiruna are to follow within an ongoing implementation program.

It is therefore important to investigate how a digitalized tower working position may influence the tower controller’s sensing and recognition of safety-relevant visual information for decision making after the transition from the conventional tower. The purpose of this paper is to conduct a proof-of-concept evaluation that tests and assesses a new method of classifying visual scan activities through a case study of the visual scan patterns of three air traffic controllers at one airport. Our approach is to analyze a baseline in the conventional tower that identifies visual scan patterns characteristic of the tower controller’s work. The identified characteristics will be used for a subsequent comparison with the digital remote tower environment. We use episode analysis, involving an area-of-interest (AoI) analysis, and so-called dwell-time-share diagrams, which we developed specifically to identify characteristics in the form of recurring visual scan patterns and which include the out-the-window (OTW) view as the primary information source.

The remote provision of control, information, weather observation and alerting services relies primarily on the video presentation that substitutes the conventional OTW view. The video presentation is enhanced by automatic assistance systems in an integrated platform solution from the industry (SAAB) such as wind sensors, cloud ceiling and weather radar that is used to support the weather observation. Nevertheless, the tower controller’s capability to monitor and assess the conditions in the control zone and on the runway is safety-relevant for decision-making. Necessary actions for separating aircraft, such as instrument flight rule (IFR) and visual flight rule (VFR) based movements or recognizing runway conditions during landing and departing situations rely on the tower controller’s capability to visually search, find and assess cues using the visual information provided.

A considerable number of research studies have addressed related behavioral phenomena by investigating the visual activities in tower control; some of them are briefly presented here. Ellis and Liston [1] identified a list of 28 “visual features” from discussions with 24 controllers. Using the example of aircraft decelerating on the runway after landing, they analyzed visual velocities and features used to anticipate the aircraft with regard to the tower controller’s ability to perceive speed from the visual change. According to the results, the tower controller’s ability to judge a change of speed is a learned viewing strategy. Complementary to this study, the visual cues perceived by tower controllers were investigated by means of a questionnaire answered by seven tower controllers [2]. A list of OTW-relevant visual cues was ranked according to range of perceptibility and importance, with the cue “vehicle on maneuvering area” ranked as the most important to detect.

An empirical study on the use of the out-the-window view revealed that tower controllers identify aircraft visually to verify the information provided by the flight strips [3]. Additionally, the airport is monitored occasionally to permit an immediate reaction to unexpected events. The sequence of scanning working instruments and areas of interest of the OTW view was investigated using an episode analysis [4]. The comparison of tower controllers’ work patterns regarding system interaction in multiple- and single-airport working environments revealed individual variance between tower controllers and an interdependency between work patterns, the design of the environment and the use of implicit communication. The findings reveal the existence of visual working patterns that are characteristic of the individual tower controller and that depend on the operational context, such as weather and traffic.

Safety-relevant effects on the working behavior of tower controllers may arise from the fact that the video and visualization technique alters the picture perceived “in situ” in a conventional tower. These implications might arise because the video presentation is based on camera and visualization technology that, although state-of-the-art, is nevertheless a reproduction. The design of the video presentation equipment tends to be dominated by questions about display resolution minima, where 85% of the population is able to discriminate visually 1 arcsec⁻¹ [2]. For a fair comparison of camera and human visual capabilities, a more diverse range of aspects has to be considered, of which two are presented briefly. For example, the human eye is able to see a huge range of intensities, from daylight levels of around 10⁸ cd/m² to night luminances of approximately 10⁻⁶ cd/m² [5]. It is capable of working in visual environments with a large luminance range due to a process called “adaptation”. At the retina level, eye adaptation is highly localized, allowing us to see both dark and bright regions in the same high-dynamic-range environment. Motion detection in the human eye is performed by amacrine cells, which report salient features of the visual world to the brain [6]. This is an important feature of peripheral vision that allows the tower controller to keep track of movements on and around the runway, including the instantaneous detection of non-authorized movements. Taking into account the findings on controllers’ work patterns, these are indispensable capabilities for collecting visual evidence to build up situational awareness and make decisions in a safety-critical work environment.

In the forthcoming transition to digitalized remote towers, the direct visual contact of the human eye with the operational environment is replaced by a video presentation. This changes the physical origin of visual stimulation. It becomes operationally relevant if the substitution affects the capabilities of the tower controller mentioned above, and thereby the activities of searching for and establishing visual contact with operationally relevant objects in time. A safety-relevant question arises because the early identification of threatening situations such as a runway incursion relies on the timely provision of all visual cues, taking into account the physiological visual capabilities of the human.

Our first step in investigating possible implications of the video presentation is to create a baseline consisting of characteristics of scanning behavior for a later comparison with the digital remote tower, which we plan for 2019. Accounting for the diversity of characteristic scanning behavior [4], we distinguish two key areas that may be affected when changing to a video presentation:

  • Group-specific characteristics of visual scan patterns that tower controllers share

  • Individual characteristics of a certain tower controller (visual signature).

Both points are interrelated since visual signatures and group-specific scanning behavior exclude each other implicitly.

For the baseline in the conventional tower, we identify characteristics consisting of a systematic and repeated sequence, and the related timing, of the scanning pattern during a specific activity of the controller. Identification then proceeds by comparing gaze data samples of approach situations from tower controllers in live operations. The approach of an aircraft, including final approach and landing, is a high-risk situation in which 49% of the accidents in commercial aviation occur [7]. All operational processes on and around the runway rely on the complete understanding and situational awareness of the controller. Erroneous judgments might be caused by incomplete or corrupted scanning patterns of the controller.

Equal conditions of comparison are an essential prerequisite for identification that requires distinguishing and classifying the traffic situation, weather and the activity. The latter refers to the intended task as defined for the tower controller that is assigned to follow the rules as defined by ICAO doc. 4444 PANS-ATM [8]. The intention to undertake a certain control task is a key feature of explaining the variance of the actual scanning activity (based on empirical observations). This is important due to the trained ability of the controller to handle multiple tasks at a time that are serialized by switching the task according to the current demand [9]. By such a task-related classification of the activity, we expect to distinguish and identify even small features of characteristics in the scanning patterns since they feature the same intention of the tower controller.

In the scope of this approach, we present here the results of a proof-of-concept study that has the following objectives:

  • Evaluating the method of classifying activities of equal intention

  • Identification of the baseline characteristics in the conventional tower.

In the following, we present the setup of the observation study for collecting eye tracking data in live operations. Further, we introduce our verbal coding method, which is used to distinguish intentions and the related activities in order to understand the visual scan patterns. For the analysis, we use the dwell-time-share diagrams to compare the observed scan patterns. The discussion highlights aspects of the results such as the conditions of the recordings and the characteristics found in the scan patterns. Finally, we summarize the main conclusions that can be drawn from the results obtained so far.

2 Method

2.1 Observation Study

The field study was conducted at the “SAAB” Linköping Tower over two days and involved three tower controllers. Figure 1 provides an overview of the tower working position, including the areas of interest (AoIs) (see Sect. 2.3 Episode Analysis for further details). Controller A is 30 years old with 2.5 years of operational experience, controller B is 29 years old with 4 years, and controller C is 42 years old with 22 years. The tower environment was selected because Linköping Tower is planned to be remotely controlled in spring 2019. Eye point of gaze (an indicator of visual attention) in the conventional tower was measured using Tobii Pro Glasses eye-tracking equipment. The Tobii glasses provided eye gaze movement data at a sampling rate of 50 Hz, supplemented by audio and video recordings of the scene. The eye tracking-related terminology follows the definitions in [7]. The analysis was performed using Tobii Pro Lab 1.76 and custom tools written in Java.

Fig. 1. Areas of interest of Linköping Tower

2.2 Verbal Coding

A usual practice in eye gaze analysis is to use radio voice communications to disclose intent and thus to classify the observed activities (e.g. empirical observations of the controller’s work). In contrast, the use of the window view is in most situations characterized by radio silence and thus does not provide sufficient cues for inferring the actual intention and classifying the related activities.

A key requirement in our approach was to relate the intent of the tower controllers to the observed visual activities. The aim was to classify episodes of activities and thus to identify similar situations of using the window view. Our solution to the issue of identifying intent was to conduct an “in situ” verbal coding that extends the recording through the active support of the tower controller. Besides providing tower control services, the controller was asked to clearly utter a code indicating the current visual activity. A list of verbal codes was evaluated by the tower controllers and reduced to a basic and simple set, all of which refer to the use of the out-the-window view:

  • “Check”: The controller checks the runway for obstacle clearance.

  • “Birds”: The controller checks for birds on or around the runway.

  • “Search”: The controller searches for the expected approaching aircraft in the airspace.

  • “Contact”: The controller establishes visual contact to the expected approaching aircraft.

The verbal coding provides an important subjective reference time for the true event of establishing visual contact. The chosen approach thus permits narrowing the selected time period of the recordings down to the desired search activities.

2.3 Episode Analysis

As mentioned in the introduction, our analysis aims to identify characteristics in the sequence and timing of visually scanning the working environment. The chosen analysis method is therefore the “episode analysis”, which allows larger audio and video data sets to be divided into shorter episodes, or sub-episodes, for in-depth transcription [4, 10, 11].

To narrow down the times of interest, we divide the audio and eye-gaze video recordings into episodes of interest for in-depth description. The episodes of interest are arrivals of aircraft at the airport, from the initial call on entering the control zone until the touchdown. The purpose of this analysis is to identify the times of activities that are related to the visual search for the expected aircraft and other OTW-related activities. The visual search might be embedded within the task of handling an approaching aircraft alongside activities such as note taking on the flight strip or communications with the aircraft. Thus, the episodes help us to determine the periods of visual search and to infer the intention of the tower controller independently of the verbal coding.

To determine visual scan patterns, areas of interest were defined as shown in Fig. 1. It shows the view over the runway from the working position and surrounding equipment such as the radar, the clock and an additional video view. The 15 AoIs are complemented by the flight strip AoI and the vicinity of the runways, divided into a lower and an upper part each (18 AoIs in total). On the basis of the areas of interest and the episodes, the related dwell times are calculated, indicating the share of attention across the areas of interest. The dwell-time-share diagrams developed for this purpose highlight the visual scan patterns of the three tower controllers during approach situations in an easily understandable way.
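The dwell-time share described above can be sketched as follows. This is a minimal illustration, not the authors’ Java tooling: the input format (tuples of start time, end time and AoI name), the function name `dwell_time_share` and the choice to normalize by episode duration are all assumptions made for the example.

```python
from collections import defaultdict


def dwell_time_share(fixations, episode_start, episode_end):
    """Return the percentage of episode time spent on each AoI.

    fixations: iterable of (start_s, end_s, aoi_name) tuples.
    This input format is hypothetical and chosen for illustration only.
    """
    duration = episode_end - episode_start
    if duration <= 0:
        raise ValueError("episode must have a positive duration")
    totals = defaultdict(float)
    for start, end, aoi in fixations:
        # Clip each fixation to the episode boundaries before summing.
        overlap = min(end, episode_end) - max(start, episode_start)
        if overlap > 0:
            totals[aoi] += overlap
    # Assumption: shares are relative to the full episode duration.
    return {aoi: 100.0 * t / duration for aoi, t in totals.items()}
```

Normalizing by the summed fixation time instead of the episode duration would be an equally plausible choice; the source does not state which one was used.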

3 Results

The analysis is based on eight hours of eye tracking recordings from the three tower controllers. The recordings were made in August and September 2017 and in February 2018, between 8 am and 2 pm, when higher volumes of traffic including VFR and IFR were expected. During all recordings, the visibility was above 12 km with scattered to broken clouds. The runway in use was 29, meaning that aircraft approached from the east side, on the controller’s right. From all recorded situations, four samples per tower controller, each containing one approach situation, were chosen for the episode analysis.

3.1 Dwell Time Share

The dwell-time-share diagrams (Figs. 2, 3 and 4) show the temporal distribution of fixations on the 18 specifically defined AoIs, indicated by the color of the surface. We defined a fixation as having a minimum duration of 60 ms and a gaze velocity not exceeding 30 deg/s.
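The fixation criterion above can be illustrated with a simple velocity-threshold sketch. It assumes that angular gaze velocities (in deg/s) have already been computed per sample at the 50 Hz rate of the glasses; the function name and input format are hypothetical, and the filter actually applied in Tobii Pro Lab may differ in detail.

```python
def detect_fixations(velocities_deg_s, sample_rate_hz=50.0,
                     max_velocity=30.0, min_duration_ms=60.0):
    """Return (start_index, end_index_exclusive) for each fixation.

    A fixation is a run of consecutive samples whose velocity does not
    exceed max_velocity and which lasts at least min_duration_ms.
    """
    sample_ms = 1000.0 / sample_rate_hz
    fixations = []
    run_start = None
    # A sentinel with infinite velocity closes a run that reaches the end.
    for i, v in enumerate(list(velocities_deg_s) + [float("inf")]):
        if v <= max_velocity:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if (i - run_start) * sample_ms >= min_duration_ms:
                fixations.append((run_start, i))
            run_start = None
    return fixations
```

At 50 Hz each sample covers 20 ms, so under these assumptions a run needs at least three consecutive slow samples to count as a fixation.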

Fig. 2. Controller A

Fig. 3. Controller B

Fig. 4. Controller C

The diagrams cover the chosen episode from 4 min before the touchdown (or touch and go) until 30 s after it. The 4 min lead time was chosen to match the time at which the aircraft had entered the control zone. This provides a complete picture of the working context including the entire approach situation. The graphs show fixations on the OTW view as surfaces above the horizontal axis, whereas head-down fixations lie below it. The AoI states are originally binary, which in the majority of cases results in a highly fragmented picture of the context. Therefore, we applied a smoothing filter using a sliding window with a size of 2000 ms (symmetric range of 1000 ms). This allows us to bring out the long-term work pattern by reducing the noise of fragmented AoI fixations caused by high-frequency changes. In addition, we use a fixation time metric, drawn as a black graph, that indicates the relative mean length of a fixation within the sliding window. The graph indicates situations in which AoIs were scanned more intensively than others. An example with low fixation times is the runway check, where the controller sweeps visually across the runway using saccadic eye movements. This is in contrast to the use of the radar screen, which often exhibits long fixation times.
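The smoothing step can be sketched as a centered moving average over the binary AoI state. The sketch below assumes one AoI label per gaze sample and truncates the window at the episode boundaries; both the data layout and the edge handling are assumptions for illustration, not details taken from the study.

```python
def smoothed_share(aoi_per_sample, aoi, sample_rate_hz=50.0,
                   half_window_ms=1000.0):
    """Smooth the binary 'gaze is on this AoI' signal with a sliding window.

    Returns, for every sample, the fraction of samples inside the
    symmetric window (plus/minus half_window_ms) that fall on the AoI.
    """
    half = int(round(half_window_ms * sample_rate_hz / 1000.0))
    n = len(aoi_per_sample)
    # Prefix sums make each window sum an O(1) lookup.
    prefix = [0]
    for label in aoi_per_sample:
        prefix.append(prefix[-1] + (1 if label == aoi else 0))
    shares = []
    for i in range(n):
        lo = max(0, i - half)          # window is truncated at the edges
        hi = min(n, i + half + 1)
        shares.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return shares
```

Computing this share for every AoI and stacking the results over time would give the kind of surface plot that the diagrams describe, with the window width controlling how much high-frequency switching is suppressed.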

The diagrams have an additional label on the upper axis indicating the times at which the tower controller used the verbal codes. These are complemented by event labels such as the moment of giving a clearance to the aircraft, landing, and touch-and-go events. Some graphs are truncated on the left due to operational limitations in initiating and calibrating the eye tracking device while providing control services.

The diagrams show the labels “search”, “contact” and “check” in all approach samples. The code “birds” was used 8 times by 2 of the 3 controllers. All “search” and “contact” codes were used during periods of fixating the OTW view. Correspondingly, the code “check” was used while fixating the runway. The code “birds” gave mixed results, as it was not clearly associated with a certain area; it can only be stated that the point of fixation was on the runway or in its vicinity.

3.2 Statistics for the Areas of Interests and Verbal Coding

Times of Verbal Codes

Regarding the times of verbal coding, first visual contact was established at a mean of 81.9 s (SD 19.21 s) before the moment of touchdown. Controller A differed from the other controllers by establishing visual contact early in sample 1 (124.1 s) and sample 4 (113.2 s). Controller A also showed the highest dwell times on the east airspace, with between 18.6 and 52.1% within the episode (Table 1). Controller B showed the latest establishment of first visual contact, with times between 49.8 s and 73.1 s. The mean time between calling out “search” and “contact” was 10.2 s (SD 9.6 s).
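The timing statistics above can be derived from the time stamps of the verbal codes and the touchdown event. The following sketch shows one way to do this; the record layout with the keys `contact_s` and `touchdown_s` is hypothetical, and the use of the sample standard deviation is an assumption, since the source does not state which estimator was applied.

```python
import statistics


def lead_times(events):
    """Seconds between the 'contact' call-out and touchdown, per sample.

    events: list of dicts with the hypothetical keys 'contact_s' and
    'touchdown_s', both given in seconds on the recording time axis.
    """
    return [e["touchdown_s"] - e["contact_s"] for e in events]


def summarize(values):
    """Return (mean, sample standard deviation) of the given values."""
    return statistics.mean(values), statistics.stdev(values)
```

The same two helpers would apply to the interval between the “search” and “contact” call-outs by substituting the respective time stamps.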

Table 1. Dwell times of selected AoIs in percent

Area of Interest statistics

Comparing the overall means of the dwell times in Table 1, the radar exhibits the highest overall share of fixations with 22.8%, followed by the east airspace (22.3%) and the west runway (10.5%). This so-called AoI “triangle” consists of the three most used AoIs, which the controllers balance individually in terms of the total amount of attention as well as the timing of switching between these AoIs. According to Table 2, Controller A shows a focus on the OTW view in the episodes, while Controller B has a rather radar-dominated pattern. Controller C shows a tradeoff with an increased tendency towards the runway compared to Controller A.

Table 2. Tradeoff runway, airspace east and radar per controller

4 Discussion

The results show the visual work patterns of 12 approach samples from three tower controllers by means of the dwell-time-share diagrams. The eye gaze data capture how the control tasks are de facto executed in a conventional tower. The 12 samples therefore contain a very large amount of information in the form of raw gaze data. In its raw form, such data is neither readable nor understandable for investigating the work patterns of tower controllers using the OTW view. In this regard, both the AoI analysis of the episodes and the application of a sliding window filter helped to structure the gaze data and thus to increase readability and understandability. The analysis was additionally labelled with contextual information on the operational situation and the intention of the tower controllers while using the OTW view. The resulting dwell-time-share diagrams provide an overview of the work patterns that offers the prerequisites for identifying characteristics in the sequence of AoIs and the related dwell times.

Based on the work patterns in the dwell-time-share diagrams, the verbal coding and the statistics, we found several indications of group-specific systematic working shared by the three tower controllers.

  • As explained initially, the intention to undertake a certain control task is a key feature in explaining the variance of the actual scanning activity (based on empirical observations). Within the episodes, we were able to distinguish the periods of control tasks and the related intentions, which we define as follows:

    • Entry of aircraft into control zone: While the aircraft is approaching the airport, having entered the control zone but still too far from the tower to be visually noticeable in the sky, the controller’s intention is to plan and prioritize the runway usage using the flight strip system. However, there were also several short periods of using the OTW view, which we explain by a need to scan the environment for indications that help to anticipate upcoming runway usage in terms of expected departure and arrival movements. On the ground side, the monitoring activities include aircraft on the apron preparing for departure. These activities might aim at visual cues such as boarding or refueling, using the apron camera and the apron sight. On the air side, VFR traffic located in the controlled airspace is observed using the radar or the OTW view. In general, the predominant visual sources are AoIs located on the instrument panel.

    • Visual search for approaching aircraft: The initiation of visual search activities is indicated by increased dwell times on the approach airspace. Most likely, the initiation is triggered by the traffic situation presented on the radar. The moment when the controller switches attention from the radar to the approach airspace is indicated in the dwell-time-share diagrams (Figs. 2, 3 and 4) by the flag labeled “1”. In 10 of 12 cases, this moment is accompanied by a direct change from radar to approach airspace. Subsequently, the triangle of radar, approach airspace and runway checks is established. In most cases, the runway checks are performed after the search in the approach airspace.

    • Established first visual contact: Statistical analysis of the verbal code times showed that visual contact was established at a mean of 81.9 s before the touchdown event, with a rather low standard deviation of 19.2 s. The landing aircraft types varied from a small P28A to an Embraer 190, with quite different visual cross-sections and thus different prerequisites for detecting the airframe. An explanation for the nevertheless homogeneous time of establishing first visual contact might be the correlation between the size of an aircraft and its approach speed. Smaller aircraft are detected closer to the runway, which is counterbalanced by their lower speed. After contact is established, the triangle pattern remains. In some cases, the focus might shift to an intensified monitoring of the runway vicinity, including the taxiways to the runway, indicated by “apron” and “lower west runway vicinity”.

    • Full stop landing: The landing event is indicated by fixations on the clock and the flight strip, as the controller notes the time of landing. The landed aircraft is followed visually on the runway while the taxiways to the runway are monitored.

    • Touch & go: The touch and go is accompanied by following the aircraft visually in the airspace west. The most likely explanation is that the controller watches the aircraft turning into a right-hand aerodrome circuit, as is usually cleared at this airport.

  • The runway check is indicated by short fixation duration and a rather high number of saccades during the scan.

  • The landing clearance was announced in 11 out of 12 cases by the fixation of the wind info.
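The observation that the visual search usually starts with a direct change from the radar to the approach airspace can be checked on an AoI sequence as sketched below. This is a simplified illustration: it treats the first visit to the approach airspace as the start of the search, which ignores the short earlier OTW glances described above, and the AoI names and function names are hypothetical labels.

```python
def collapse(aoi_sequence):
    """Merge consecutive repeats so each entry is one visit to an AoI."""
    visits = []
    for aoi in aoi_sequence:
        if not visits or visits[-1] != aoi:
            visits.append(aoi)
    return visits


def search_starts_from_radar(aoi_sequence, source="radar",
                             target="approach airspace"):
    """Check whether the first visit to `target` directly follows `source`.

    Returns True or False, or None if `target` is never visited.
    """
    visits = collapse(aoi_sequence)
    if target not in visits:
        return None
    first = visits.index(target)
    return first > 0 and visits[first - 1] == source
```

Counting the samples for which the function returns True would give a tally comparable in spirit to the 10 of 12 cases reported, provided the search onset is defined in the same way.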

Within the work patterns, the dwell-time-share diagrams (Figs. 2, 3 and 4) show several observations that point to individual signatures in how the task is carried out. For example, controllers A and C focused mainly on the airspace for an early establishment of visual contact and embedded short episodes of checking the runway for obstacles. In contrast, controller B tended to use the radar instead of following the aircraft visually before establishing visual contact, and at the same time already planned the next traffic movements. Differences in the effort spent on establishing visual contact with aircraft on final are clearly indicated by the time spent on the approach sector. This corresponds to an individual trade-off between planning and the prioritization of monitoring movements. More specifically, the controllers used the flight strips, radar and clock for planning activities on the one hand, and the runway, position fetching on the radar and the window view for separation activities on the other hand, along with occasional monitoring for unexpected obstacles on or near the runway. The following list summarizes the distinctive features of the three tower controllers that we consider characteristic and systematic for the individual:

  • The timing of the task switching

  • The tradeoff of directing the visual attention between the runway, approach airspace and radar

  • The runway checks involving the visual check of the taxiways and runway holding points individually

  • The bird check.

The discussion relies on 12 samples from a field study that was conducted under rather constant operational conditions in terms of weather and air traffic movements. Nevertheless, the remaining variability of the conditions does not allow the results to be generalized, since the sample size does not account for the related complexity of the operations. This concerns especially the confounding effect of other air traffic on the ground or in the control zone, as well as planned activities of the airport operator that might influence the controller’s activities. The results are also affected by the possibility that the controllers did not verbalize their current intention at every opportunity during the recordings. Rather, the visual scan pattern that can be related to a certain intention provides a template that allows similar periods in the episode to be identified.

5 Conclusion

The paper presents a proof-of-concept study that tests and assesses our method of classifying visual scan activities. The verbal coding helped us to understand the observed visual scan patterns and relate them to the controller’s intention, which allowed us to identify the times of switching between tasks during the approach. In this way, we were able to classify the periods of executing certain control tasks and to compare them in order to identify characteristics of the tower controllers. The dwell-time-share diagrams showed clear distinctions between the tower controllers’ scan patterns for gathering visual information within the chosen periods. The differences appeared in the time and effort spent on specific control activities and in the related sequences of focusing on specific visual cues. This concerns in particular the tactics the controllers use to search visually for the aircraft in the controlled airspace and on final.

The results show that our visual scan pattern analysis method succeeds in classifying activities during the execution of a control task and in identifying differences between controllers. The method, which will be used to prove safety-relevant implications during the transition to digital tower control operations, is however still under development. Future research will therefore focus on evaluating robust metrics that indicate the statistical significance of the signatures found so far.