Abstract
Three-dimensional motion capture systems such as Vicon have been used to validate commercial electronic performance and tracking systems. However, three-dimensional motion capture cannot be used for large capture areas such as a full football pitch due to the need for many fragile cameras to be placed around the capture volume and a lack of suitable depth of field of those cameras. There is a need, therefore, for a hybrid testing solution for commercial electronic performance and tracking systems using highly precise three-dimensional motion capture in a small test area and a computer vision system in other areas to test for full-pitch coverage by the commercial systems. This study aimed to establish the validity of the VisionKit computer vision system against three-dimensional motion capture in a stadium environment. Ten participants undertook a series of football-specific movement tasks, including a circuit, small-sided games and a 20 m sprint. There was strong agreement between VisionKit and three-dimensional motion capture across each activity undertaken. The root mean square difference for speed was 0.04 m·s−1 and for position was 0.18 m. VisionKit had strong agreement with the criterion three-dimensional motion capture system for football-related movements tested in stadium environments. VisionKit can thus be used to establish the concurrent validity of other electronic performance and tracking systems in circumstances where three-dimensional motion capture cannot be used.
1 Introduction
Electronic performance and tracking systems (EPTS) measure the location and speed of movement of athletes during competition and training. Speed and location information can in turn be used to describe the physical and tactical behaviour of players [1]. The electronic performance and tracking system market is highly competitive and estimated to grow to be worth up to USD 7 billion by 2023 [2]. In this market, where football department staff are often end users, the assumption is that electronic performance and tracking systems would be independently evaluated prior to purchase. However, in many cases electronic performance and tracking systems are not independently evaluated prior to entering the market, or indeed prior to manufacturers gaining lucrative contracts in professional sport. The Fédération Internationale de Football Association (FIFA) introduced a quality standard for electronic performance and tracking systems in 2019, against which commercial tracking systems are tested. This new quality standard should both accelerate research and development in the electronic performance and tracking system industry, and add important accountability in the accuracy of systems.
There are three main types of electronic performance and tracking systems currently used in sport: global navigation satellite systems (GNSS) [3, 4], local positioning systems (LPS) [5, 6] and optical systems [7]. Each system is able to provide the location of a player on Earth, either relative to satellites or nodes in a stadium [3, 5, 6] or calibrated areas on a pitch [7, 8]. From location, speed can be derived, or with GPS the Doppler shift in the signal from satellites can be used as an indirect marker of speed [3].
The differences between electronic performance and tracking systems make for potential variations in how validation studies are conducted. For example, GNSS requires clear access to the sky to enable satellite reception, precluding testing indoors. LPS requires an instrumented stadium or laboratory, and optical systems require sufficient height, line of sight and field of view to place the cameras that capture images, typically meaning a stadium-like environment. As a result of the differences in systems, and the improvement in the ability to use motion capture systems outdoors in the past 5 years, electronic performance and tracking systems have been concurrently validated against various other systems. Thus, the concurrent validation of electronic performance and tracking systems has been determined outdoors against timing gates [4, 8], or radar [9] in the absence of a true criterion measure.
Three-dimensional motion capture systems, in the cases below Vicon, have been used by us to validate many of the major commercial electronic performance and tracking systems available in today’s market including those offered by Hawk-Eye Innovations Limited, Track160 Ltd, Fitogether Inc, Catapult Sports, Realtrack Systems SL, and Chyron Hego AB (for details see [10]). Three-dimensional motion capture has also been used to validate two optical (STATS SportVU, [7] and Chyron Hego [11]), one local and one GPS system (Inmotio, GPSPortsSPI Pro X [7]). Despite the relatively widespread use of three-dimensional motion capture systems, it is not currently possible to use them on a full football pitch due to the need for many fragile cameras to be placed around the capture volume and a lack of suitable depth of field of those cameras.
There is a need for a method to test electronic performance and tracking systems in stadia to ensure ecological validity of the test environment. With no true criterion measure that allows full-pitch movements and therefore comparisons with electronic performance and tracking systems, a hybrid solution is required. One component of the hybrid solution is a three-dimensional motion capture system operating in a small test area in one section of the pitch. These three-dimensional motion capture systems are considered the gold standard and have been reported to produce sub-millimetre errors [7]. The second component is a computer vision system accurate enough for the concurrent validity of electronic performance and tracking systems to be tested for activities in areas outside the three-dimensional motion capture space. It is important that electronic performance and tracking systems are assessed on a full pitch to reflect how they are required to operate in a football match. A reference system used in an area outside the three-dimensional motion capture space would enable a check of whether systems are trained on the capture space only, and thus potentially misrepresent their true full-pitch accuracy. Further, the reference system for comparison should not be commercially available, and thus should not be in commercial conflict with the electronic performance and tracking systems tested. Before a computer vision system could be used, it too should be validated against a highly accurate motion capture system. The aim of this study was therefore to test the accuracy of a bespoke computer vision system [8] against three-dimensional motion capture in a stadium environment.
2 Method
2.1 Test environment and participants
The test was conducted in two separate stadia. The first test venue was a stadium used at the time for national level football competition. The stadium had a regulation football pitch, and grandstands with seating for 15,000 people. The second test venue was a stadium currently used for national and international football matches. This stadium had a regulation football pitch, and grandstands with seating for 100,000 people. Participants (n = 60) were members of an elite youth football academy, attached to a professional football team, or active community-level footballers each of whom gave written informed consent to participate.
The test area consisted of a 30 × 30-m area in which participant movements were captured simultaneously with a 3-D motion capture system and a computer vision system (each detailed below). The test area was set up in one of four possible quadrants on consecutive days, originating from the centre circle of the pitch. The computer vision system was thus tested against 3-D motion capture across an area of 3600 m² (four quadrants of 900 m² each), representative of over half the area of a standard football pitch, and four times greater than the largest area any other EPTS has been tested in against 3-D motion capture [7, 11].
The authors received human research ethics approval from the Institutional Ethics Committee to conduct this work (HRE 16–278).
2.2 Movement activities
Within the test area, participants conducted a series of activities to simulate common movements in football. The activities included a circuit with pre-determined movements, 2v2 and 5v5 small-sided games and a 20-m sprint commencing outside the test area and finishing within it.
The circuit, designed as demonstrated in Fig. 1A, within the test area with dimensions 30 × 30 m, included the following activities: self-paced walking; self-paced jogging; maximal accelerations; changes of direction. Each participant completed 4 min of circuit activities. Each small-sided game was 4 min in duration.
Data were collected on four consecutive days at the smaller stadium, with a total of 7459 samples (individual frames of video data, with the subsequent speed and position of players) collected for the circuit, 4207 for 2v2 and 30,967 for 5v5. Data were collected on two consecutive days at the larger stadium, with a total of 39,486 samples collected for the circuit, 3707 for 2v2, 43,518 for 5v5 and 781 for the sprint.
2.3 Three-dimensional motion capture system
Participant position and movement were determined by a large-scale three-dimensional motion capture system. To track participants, five 38-mm retro-reflective spherical markers were placed on specific landmarks: one on each shoulder, and three on the pelvis (Fig. 1B). The mid-position of the three pelvis markers was determined for each data frame to approximate the centre of mass of the player [7]. Shoulder markers aided in identification of individual participants.
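As a minimal sketch of the marker reduction described above, the mid-position of the three pelvis markers can be taken as their arithmetic mean in each frame. The arithmetic mean is an assumption, since the text does not define "mid-position" further, and the function name pelvis_midpoint is hypothetical rather than part of Vicon Nexus or Visual3D.

```python
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]


def pelvis_midpoint(markers: Sequence[Point3D]) -> Point3D:
    """Mean (x, y, z) of the three pelvis markers for a single frame.

    Assumption: "mid-position" is read as the arithmetic mean of the
    three marker coordinates.
    """
    if len(markers) != 3:
        raise ValueError("expected exactly three pelvis markers")
    xs, ys, zs = zip(*markers)
    return (sum(xs) / 3.0, sum(ys) / 3.0, sum(zs) / 3.0)


# Example frame with three markers at a height of 1 m.
frame = [(0.0, 0.0, 1.0), (3.0, 0.0, 1.0), (0.0, 3.0, 1.0)]
midpoint = pelvis_midpoint(frame)
```

Applying the function frame by frame gives one approximate centre-of-mass trajectory per participant.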
The playing volume was reconstructed into three-dimensional space from 36 Vicon Vantage cameras (Oxford Metrics Group Plc [OMG], Oxford, UK) with a sampling frequency of 100 Hz. The cameras were positioned around the 30 × 30-m test area used for both the circuit and small-sided games (Fig. 2A).
Data for each marker were manually labelled in Vicon Nexus motion capture processing software and then transferred to Visual3D biomechanics analysis software (C-Motion Inc., Germantown, MD, USA). Data were interpolated where necessary using the interpolation function in Visual3D with a maximum window ranging from 10 to 100 frames depending on the section of data missing. Data were then smoothed using a dual-pass Butterworth digital filter. The cutoff frequency of 2.5 Hz was based on the results of wavelet analysis, residual analysis, and visual inspection of the effects of different cutoffs on the data (particularly around the maxima and minima). The lower end of the range indicated by these analyses (2.5 to 5.0 Hz) was chosen as it served to reduce the effects of the intra-step velocity fluctuation, thereby providing a better estimate of overall velocity. This approach of overall velocity estimation is directly relevant to the method used by practitioners to quantify running velocities, in which they use bands (e.g. distance run within a certain velocity band). In addition, our experience in broader validation work with the manufacturers indicates that these systems apply smoothing to their data to eliminate arbitrary fluctuations.
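The smoothing step can be illustrated with a short standard-library sketch of a dual-pass (forward then backward, zero-lag) Butterworth low-pass filter. This is not the Visual3D implementation. The second-order design per pass, the edge handling and the absence of any cutoff correction for the second pass are all assumptions, since the text gives only the filter type and the 2.5 Hz cutoff at 100 Hz. All function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def butter2_lowpass_coeffs(cutoff_hz: float, fs_hz: float) -> Tuple[tuple, tuple]:
    """Second-order Butterworth low-pass coefficients via the bilinear transform."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b = (b0, 2.0 * b0, b0)
    a = (2.0 * (k * k - 1.0) * norm,
         (1.0 - math.sqrt(2.0) * k + k * k) * norm)
    return b, a


def _single_pass(x: Sequence[float], b: tuple, a: tuple) -> List[float]:
    # Assumption: the filter state is initialised to the first sample so
    # that a constant signal passes through unchanged.
    x1 = x2 = y1 = y2 = x[0]
    out = []
    for xn in x:
        yn = b[0] * xn + b[1] * x1 + b[2] * x2 - a[0] * y1 - a[1] * y2
        out.append(yn)
        x2, x1 = x1, xn
        y2, y1 = y1, yn
    return out


def dual_pass_lowpass(x: Sequence[float], cutoff_hz: float = 2.5,
                      fs_hz: float = 100.0) -> List[float]:
    """Filter forward, then backward, to cancel the phase lag."""
    if not x:
        return []
    b, a = butter2_lowpass_coeffs(cutoff_hz, fs_hz)
    forward = _single_pass(list(x), b, a)
    backward = _single_pass(forward[::-1], b, a)
    return backward[::-1]


# A 20 Hz oscillation sampled at 100 Hz is almost entirely removed.
wobble = [math.sin(2.0 * math.pi * 20.0 * n / 100.0) for n in range(500)]
smoothed = dual_pass_lowpass(wobble)
```

Running the filter in both directions doubles the effective order and removes the time shift that a single pass would introduce, which matters when filtered positions are later aligned with a second system.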
Data (X,Y coordinates) were reduced from 100 to 25 Hz and cropped to the start and finish line (circuit) or the kick off (small-sided games) to allow for aligning coordinates temporally with those from the computer vision system.
2.4 Computer vision system
Activities were recorded using four stationary high-definition video cameras (Panasonic AW-UE70KEJ) genlocked via a remote-control panel (Panasonic AW-RP50E) that provided a view of the entire pitch for each discrete task (see Fig. 2B for details). The resultant video footage was imported into the tracking software (VisionKit, Australian Institute of Sport, Canberra, Australia) and each camera’s video image was calibrated to the capture area via association of known points from a rigid calibration rig in the field of view of each camera, so that a pixel represented a known unit of measurement. A set of player detection observations was then generated in VisionKit where each observation consisted of an X,Y ground location and a timestamp [8, 12]. VisionKit samples raw detections at 25 frames per second. Individual detections were then aggregated into temporal sequences using the low- and medium-level hierarchical association methods [13]. A piecewise cubic polynomial was fitted to the continuous player tracking using the midpoint for each 1-s epoch. Coordinates (x,y) for players were then estimated by solving the cubic polynomial at each time point.
2.5 Statistical analysis
Three-dimensional motion capture raw position data were differentiated to obtain horizontal plane speed using a three-point finite central difference formula [14]. Three-dimensional motion capture position and speed data were then down-sampled to 10 Hz. VisionKit raw position data were differentiated to obtain horizontal plane speed using the same three-point finite central difference formula [14]. VisionKit data were then up-sampled to 50 Hz using linear interpolation and then down-sampled to 10 Hz. The up-sampling was required as this study formed part of a larger project, and we needed to be able to make comparisons between the three-dimensional motion capture data and EPTS at both 10- and 25-Hz sample rates. A fourth-order 1-Hz low-pass Butterworth filter was then applied to position and speed data for both VisionKit and three-dimensional motion capture. This filter was selected after wavelet and residual analyses, as per the Vicon data analysis described above, and has been used in previous research examining player movements over a similarly sized capture space [11].
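The differentiation and resampling steps above can be sketched as follows. The three-point central difference is used at interior samples. The one-sided differences at the first and last sample are an assumption, as the text does not say how the end points were treated. The function names are hypothetical, and the 25 Hz input rate is the VisionKit rate stated in Sect. 2.4.

```python
import math
from typing import List, Sequence, Tuple


def horizontal_speed(xy: Sequence[Tuple[float, float]], fs_hz: float) -> List[float]:
    """Horizontal-plane speed from (x, y) positions sampled at fs_hz."""
    n = len(xy)
    if n < 3:
        raise ValueError("need at least three samples")
    dt = 1.0 / fs_hz
    speeds = []
    for i in range(n):
        lo = max(i - 1, 0)        # one-sided at the first sample (assumption)
        hi = min(i + 1, n - 1)    # one-sided at the last sample (assumption)
        span = (hi - lo) * dt
        vx = (xy[hi][0] - xy[lo][0]) / span
        vy = (xy[hi][1] - xy[lo][1]) / span
        speeds.append(math.hypot(vx, vy))
    return speeds


def resample_linear(values: Sequence[float], fs_in: float, fs_out: float) -> List[float]:
    """Resample by linear interpolation between neighbouring samples."""
    if len(values) < 2:
        raise ValueError("need at least two samples")
    n_out = int(math.floor((len(values) - 1) * fs_out / fs_in + 1e-9)) + 1
    out = []
    for k in range(n_out):
        pos = k * fs_in / fs_out
        i = min(int(math.floor(pos)), len(values) - 2)
        frac = pos - i
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out


def visionkit_to_10hz(values_25hz: Sequence[float]) -> List[float]:
    """Up-sample 25 Hz data to 50 Hz, then keep every fifth sample (10 Hz)."""
    return resample_linear(values_25hz, 25.0, 50.0)[::5]


# A player moving at a constant 5 m/s along x, sampled at 25 Hz for 1 s.
track = [(0.2 * i, 0.0) for i in range(26)]
speed_25hz = horizontal_speed(track, 25.0)
speed_10hz = visionkit_to_10hz(speed_25hz)
```

Passing through 50 Hz is what makes the reduction to 10 Hz a whole-number decimation, because 25 is not an integer multiple of 10 but 50 is.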
Three-dimensional motion capture and VisionKit data were time synchronized using cross-correlation of position data [15]. Once synchronized, data were trimmed for time on the field, combined and extracted into individual data files. VisionKit position data were then rotated through 360° to find the rotation giving the lowest mean absolute error for position. Once the closest degree was found, the data were further rotated 2° either side in 0.01° increments to find the alignment of VisionKit with three-dimensional motion capture that resulted in the lowest mean absolute error. Velocity and position data were then compared by root mean square deviation (RMSD): the sample standard deviation of the differences between three-dimensional motion capture and VisionKit. The stabilization of the error was also calculated to determine if sufficient data were obtained for effective comparison in each activity [16] and is presented in Table 1.
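The alignment and comparison steps can be sketched in a few functions. This is an illustration under stated assumptions, not the authors' code. The lag search uses a plain, unnormalised cross-correlation over a limited window. The coarse rotation search uses 1° steps about the coordinate origin. Position error is taken as the Euclidean distance between paired points. The RMSD is written in its conventional root-mean-square form, which approximates the standard deviation of the differences when the mean difference is close to zero. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def best_lag(ref: Sequence[float], test: Sequence[float], max_lag: int) -> int:
    """Integer lag maximising cross-correlation, so test[i + lag] matches ref[i]."""
    r_mean = sum(ref) / len(ref)
    t_mean = sum(test) / len(test)
    r = [v - r_mean for v in ref]
    t = [v - t_mean for v in test]
    best, best_score = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = sum(r[i] * t[i + lag]
                    for i in range(len(r)) if 0 <= i + lag < len(t))
        if score > best_score:
            best, best_score = lag, score
    return best


def rotate(points: Sequence[Point], angle_deg: float) -> List[Point]:
    """Rotate points anticlockwise about the origin (assumed rotation centre)."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def mean_abs_error(ref: Sequence[Point], test: Sequence[Point]) -> float:
    """Mean Euclidean distance between paired positions."""
    return sum(math.hypot(rx - tx, ry - ty)
               for (rx, ry), (tx, ty) in zip(ref, test)) / len(ref)


def best_rotation(ref: Sequence[Point], test: Sequence[Point]) -> float:
    """Coarse search in 1 degree steps, then +/- 2 degrees in 0.01 degree steps."""
    def error(angle: float) -> float:
        return mean_abs_error(ref, rotate(test, angle))
    coarse = min(range(360), key=error)
    fine = [coarse + k * 0.01 for k in range(-200, 201)]
    return min(fine, key=error)


def rmsd(ref: Sequence[float], test: Sequence[float]) -> float:
    """Root mean square deviation between paired samples."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref))


# Synthetic check: test positions are the reference rotated by -37.25 degrees.
reference = [(1.0, 0.0), (0.0, 2.0), (-3.0, 1.0), (2.0, 2.0)]
misaligned = rotate(reference, -37.25)
angle = best_rotation(reference, misaligned)
```

The two-stage search evaluates 360 coarse and 401 fine candidates rather than the 36,000 that a single 0.01° sweep of the full circle would need.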
3 Results
The root mean square deviation for speed was 0.04 m·s−1 and the mean absolute error for position was 0.15 m. The distribution of error for speed and position is shown in Fig. 3. The error for speed and position by activity and relative to position in the test area is shown in Fig. 4.
Adequate data were collected to determine the error within all velocity bands, with stabilization of the error for position occurring after about 7 s, with the exception of the moderate speed band where stabilization occurred after 24 s (Table 1).
4 Discussion
This study established the accuracy of a bespoke computer vision system for tracking footballers in large stadia. The computer vision system had strong agreement for both speed and position of participants across various activities that included high-speed movements, rapid changes of direction and speed, and potential for occlusion with multiple participants moving in the relatively small (30 × 30 m) test area.
It is difficult to position the accuracy of the computer vision system tested here in the scientific literature, as only two published studies have tested electronic performance and tracking systems in stadium environments using three-dimensional motion capture as the criterion measure [7, 11]. Further, there are no established criteria for what constitutes acceptable accuracy. Compared with EPTS tested in a similar way, VisionKit had better speed accuracy than STATS SportVU (optical), Inmotio (LPS) and GPSPortsSPI Pro X (GPS), which had reported speed accuracy of 0.41 ± 0.08, 0.25 ± 0.06 and 0.28 ± 0.07 m·s−1 (mean ± SD), respectively [7]. Further, the computer vision system here was more accurate for speed than either the Gen 4 or Gen 5 Chyron Hego TRACAB system (0.09 and 0.08 m·s−1, respectively [11]). It should also be noted that VisionKit was tested over 3600 m² of the pitch, an area four times greater than the area covered by the commercial systems [7, 11].
The stated trueness compared to three-dimensional motion capture of the systems above for speed of movement would enable each to be used for typical movement “performance” applications. Speed data are typically used to describe the mean or peak movement of players in a given epoch (see for example [17, 18]). Speed data can be further differentiated to include measures of acceleration [19, 20] or combined with skill measures to aid in understanding and prescription of training [21]. In each of the measures, speed accuracy less than 0.5 m·s−1 is likely satisfactory if that error is understood and incorporated into analyses and subsequent applications such as describing player movement in competition [17] or comparisons between levels of competition [18] or in training [21].
The computer vision system here also had better positional trueness relative to three-dimensional motion capture than the STATS SportVU, Inmotio and GPSPortsSPI Pro X systems (0.56 ± 0.16, 0.23 ± 0.07 and 0.96 ± 0.49 m, respectively) [7]. The Chyron Hego TRACAB Gen 4 and Gen 5 systems, however, were approximately 11 cm better for positional trueness relative to three-dimensional motion capture than the computer vision system here.
The differences in trueness of position listed above raise the question of how positionally accurate an electronic performance and tracking system needs to be in order to be effective for quantifying player movement in matches. Optical tracking systems typically track the trunk of a player, and limb length, and by association the capacity to contact and control the ball, is far larger than trunk width. Positional accuracy within approximately 20 cm is therefore likely sufficient. Certainly for common metrics associated with position such as x- and y-axis centroid [22], length, width, and surface area [22], player dyads [23] or occupancy maps [24], this level of accuracy would suffice.
Many other factors make it difficult to place the results here in the broader electronic performance and tracking system context. First, and importantly, most studies attempting to establish the validity of electronic performance and tracking systems have not used a criterion measure. Most studies have used timing gates to time the movement of participants between two points, rather than directly comparing the position or speed accuracy [4, 25–27]. Whilst adding incremental advances in the knowledge of electronic performance and tracking systems, the results of these studies do not truly reflect the accuracy of these systems without comparison to a criterion measure. Second, few studies test systems in the environment in which they will be used. In the case of outdoor team sports, systems should be tested in a stadium used for official competition. Third, most studies do not include game-specific tasks. Fourth, many studies use aggregated measures such as total distance, or distance at a certain velocity, rather than instantaneous position or speed [4, 28]. The use of aggregated measures takes the validation away from first principles of what an electronic performance and tracking system measures and also results in lower degrees of freedom for comparison. Finally, there is a lack of agreement on the statistical method for comparison of systems. Many, including some of our previous work, favour typical error expressed as a coefficient of variation [4, 5], whilst we have used root mean square deviation here. Root mean square deviation and mean absolute error allow results to be presented in the units used in the field, and thus can be easily interpreted by end users.
Ideally, the accuracy of electronic performance and tracking systems would be established in a stadium environment during actual competition on full-sized pitches. Unfortunately, three-dimensional motion capture systems are not yet able to be used on a whole pitch, and certainly not in full competition due to the need for fragile infrared cameras to be positioned on the sideline, and lack of depth of field of cameras. A second option for use in validation studies in selected areas of stadia is the use of a non-commercial research quality system that has strong agreement with 3-D motion capture. The VisionKit system tested here has strong levels of agreement with 3-D motion capture for speed and position, is not available commercially, and thus is not in direct conflict with any other commercial electronic performance and tracking system. VisionKit either alone or in conjunction with a smaller test area captured by 3-D motion capture thus offers a viable validation standard.
5 Conclusions
VisionKit has strong agreement with the criterion three-dimensional motion capture system for football-related movements tested in stadium environments. VisionKit can thus be used to establish the concurrent validity of other electronic performance and tracking systems in circumstances where three-dimensional motion capture cannot be used.
Data availability
Nil.
Code availability
Nil.
References
1. Carling C (2013) Interpreting physical performance in professional soccer match-play: should we be more pragmatic in our approach? Sports Med 43(8):655–663
2. MarketsandMarkets. Player Tracking Market by Component, Solution (Wearables, Opticals, Application-Based), End User (Individual Sports, Team Sports), Application (Fitness Tracking, Performance Tracking, Fraud Detection, Player Safety), and Region - Global Forecast to 2023. https://www.marketsandmarkets.com/Market-Reports/player-tracking-market-213362941.html. 2019
3. Aughey RJ (2011) Applications of GPS technologies to field sports. Int J Sports Physiol Perform 6(3):295–310
4. Jennings D, Cormack S, Coutts AJ, Boyd L, Aughey RJ (2010) The validity and reliability of GPS units for measuring distance in team sport specific running patterns. Int J Sports Physiol Perform 5(3):328–341
5. Serpiello FR, Hopkins WG, Barnes S, Tavrou J, Duthie GM, Aughey RJ et al (2018) Validity of an ultra-wideband local positioning system to measure locomotion in indoor sports. J Sports Sci 36(15):1727–1733
6. Luteberget LS, Spencer M, Gilgien M (2018) Validity of the catapult clearsky t6 local positioning system for team sports specific drills, in indoor conditions. Front Physiol 9:115
7. Linke D, Link D, Lames M (2018) Validation of electronic performance and tracking systems EPTS under field conditions. PLoS One 13(7):e0199519
8. Mara JK, Morgan S, Pumpa K, Thompson K (2017) The accuracy and reliability of a new optical player tracking system for measuring displacement of soccer players. Int J Comput Sci Sport 16(3):175–184
9. Varley MC, Fairweather IH, Aughey RJ (2012) Validity and reliability of GPS for measuring instantaneous velocity during acceleration, deceleration, and constant motion. J Sports Sci 30(2):121–127
10. Fédération Internationale de Football Association (2019) Resource Hub: Certified Product Database. Fédération Internationale de Football Association, Zurich, Switzerland. https://football-technology.fifa.com/en/resource-hub/certified-product-database/football-technologies/epts/certified-systems/
11. Linke D, Link D, Lames M (2020) Football-specific validity of TRACAB's optical video tracking systems. PLoS One 15(3):e0230179
12. Carr P, Sheikh Y, Matthews I (2012) Monocular object detection using 3D geometric primitives. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), pp 864–878
13. Liu J, Carr P, Collins RT, Liu Y (2013) Tracking sports players with context-conditioned motion models. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
14. Gilat A, Subramaniam V (2011) Numerical methods for engineers and scientists: an introduction with applications using Matlab. Wiley
15. Buck JR, Daniel MM, Singer A (2002) Computer Explorations in Signals and Systems Using MATLAB. Prentice Hall
16. Bates BT, Osternig LR, Sawhill JA, James SL (1983) An assessment of subject variability, subject-shoe interaction, and the evaluation of running shoes using ground reaction force data. J Biomech 16(3):181–191
17. Aughey RJ (2010) Australian football player work rate: Evidence of fatigue and pacing? Int J Sports Physiol Perform 5(3):394–405
18. Aughey RJ (2013) Widening margin in activity profile between elite and sub-elite Australian football: a case study. J Sci Med Sport 16(4):382–386
19. Aughey RJ (2011) Increased high-intensity activity in elite australian football finals matches. Int J Sports Physiol Perform 6(3):367–379
20. Delaney JA, Cummins CJ, Thornton HR, Duthie GM (2018) Importance, reliability, and usefulness of acceleration measures in team sports. J Strength Cond Res 32(12):3485–3493
21. Corbett DM, Bartlett JD, O’connor F, Back N, Torres-Ronda L, Robertson S (2018) Development of physical and skill training drill prescription systems for elite Australian Rules football. Science and Medicine in Football 2(1):51–57
22. Alexander JP, Spencer B, Mara JK, Robertson S (2019) Collective team behaviour of Australian rules football during phases of match play. J Sports Sci 37(3):237–243
23. Gonçalves B, Coutinho D, Travassos B, Folgado H, Caixinha P, Sampaio J (2018) Speed synchronization, physical workload and match-to-match performance variation of elite football players. PLoS One 13(7):e0200019
24. Alexander JP, Spencer B, Sweeting AJ, Mara JK, Robertson S (2019) The influence of match phase and field position on collective team behaviour in Australian Rules football. Journal of Sports Sciences
25. Johnston RJ, Watsford ML, Kelly SJ, Pine MJ, Spurrs RW (2014) Validity and interunit reliability of 10 Hz and 15 Hz GPS units for assessing athlete movement demands. J Strength Cond Res 28(6):1649–1655
26. Waldron M, Worsfold P, Twist C, Lamb K (2011) Concurrent validity and test–retest reliability of a global positioning system (gps) and timing gates to assess sprint performance variables. J Sports Sci 29(15):1613–1619
27. Coutts AJ, Duffield R (2010) Validity and reliability of GPS devices for measuring movement demands of team sports. J Sci Med Sport 13(1):133–135
28. Beato M, Coratella G, Stiff A, Iacono AD (2018) The validity and between-unit variability of GNSS units (STATSports apex 10 and 18 Hz) for measuring distance and peak speed in team sports. Frontiers in Physiology 9(SEP)
Acknowledgements
The authors acknowledge the Vicon engineers Jack, Nick, Bob and Shannon from Victoria University for engineering and operating a solution for large-scale outdoor 3-D motion capture data collection.
Funding
This research was funded by a grant from the Football Technology Innovation Subdivision, Fédération Internationale de Football Association (FIFA) (Grant number: 0001).
Author information
Contributions
RA, KB, SR, GD, FS, NE and JB conceived and designed the study; KB, GD, SR, BS, SE, JH, and EC analysed the data; RA, KB, SR, GD, NE interpreted the results; RA drafted the manuscript and prepared the tables; BS and JH prepared the figures. All authors contributed to the article and approved the submitted version.
Ethics declarations
Conflict of interest
Sam Robertson and Johsan Billingham serve as Guest Editors for the Football Research topical collection in Sports Engineering and Nicolas Evans serves on the Editorial Board of Sports Engineering, and they were not involved in the blind peer review process of this paper.
Additional information
This article is a part of Topical Collection in Sports Engineering on Football Research, edited by Dr. Marcus Dunn, Mr. Johsan Billingham, Prof. Paul Fleming, Prof. John Eric Goff and Prof. Sam Robertson.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Aughey, R.J., Ball, K., Robertson, S.J. et al. Comparison of a computer vision system against three-dimensional motion capture for tracking football movements in a stadium environment. Sports Eng 25, 2 (2022). https://doi.org/10.1007/s12283-021-00365-y