Background

Autism spectrum disorder (ASD) is a neurodevelopmental condition characterised by impairments in social interaction, disrupted communication and repetitive behaviours [1]. Although these features remain the primary diagnostic markers of ASD, the presence of sensory symptoms has recently been given a more central diagnostic role. This change in symptom emphasis reflects the observation that over 90% of ASD individuals experience hyper- and/or hypo-sensitive sensory perception [2, 3]. It has been suggested that differences in low-level sensory processing contribute to the atypical developmental trajectories of higher-level cognitive functions in autism [4]. An understanding of the neural circuits involved will therefore prove fruitful for ASD research, and could even facilitate the identification of earlier, brain-based diagnostic markers [5, 6].

Dysregulated neural oscillations are a promising neural correlate of atypical sensory processing in ASD. In particular, atypicalities in high frequency gamma-band oscillations (30-80 Hz) have been reported in ASD across visual, auditory and somatosensory domains [5, 7, 8, 9, 10, 11]. Gamma oscillations are generated through excitatory-inhibitory (E-I) neuronal coupling [12], which facilitates periods of pre- and post-synaptic excitability alignment, thereby promoting efficient neural communication [13]. Findings of atypical gamma oscillations in ASD may therefore reflect disrupted E-I interactions within cortical micro-circuits [14], and concomitant effects on local and global brain connectivity [15].

Within the context of auditory processing, dysregulated gamma-band oscillations in ASD have been previously reported [5]. One prevalent approach to study auditory gamma-band activity non-invasively is through amplitude modulated tones called “clicktrains”. Such stimuli produce two distinct gamma-band responses. First, a transient gamma-band response (tGBR) is generated within 0.1 s of stimulus onset [16]. This tGBR spans frequencies from 30-60 Hz and is generated in primary and secondary auditory cortices. Second, clicktrain stimuli produce an auditory steady-state response (ASSR), in which neural populations in primary auditory regions are entrained to the modulation frequency for the duration of the clicktrain [17]. In adults, the entrainment in primary auditory cortex is greatest for clicktrains modulated at 40 Hz [18]. Measures of inter-trial coherence (ITC) can also be used to measure the ASSR, by quantifying the degree of phase consistency across trials [19]. One advantage of ASSRs is their high test-retest reliability, which approaches an intra-class correlation of 0.96, even with a relatively small number of trials [20, 21]. Furthermore, ASSRs are modulated by neural development, increasing by approximately 0.01 in ITC value per year until early adulthood [22, 23]. This increase has been linked with the maturation of superficial cortical layers [24, 25]. This makes the ASSR an ideal tool for studying auditory function in developmental conditions such as ASD.

Two studies have measured ASSRs in an ASD context, that is, in ASD participants and in the first-degree relatives of people diagnosed with ASD. Wilson and colleagues reported a reduction in left-hemisphere auditory ASSR power in a group of 10 autistic adolescents, using an early 37-channel MEG system [26]. The second study reported reduced ITC in first-degree relatives of people diagnosed with ASD, with maximal reductions at 40 Hz across both hemispheres [27]. Reductions in the ASSR could therefore be an ASD-relevant endophenotype. Additionally, the finding of reduced ITC suggests that dysregulated phase dynamics in bilateral primary auditory cortex could underlie reductions in the ASSR in ASD. However, measures of ITC have not been applied to study the ASSR directly in a group of autistic participants. Additionally, it remains unclear whether reductions in ASSRs are bilateral [27] or unilateral [26] in nature.

As discussed above, auditory stimuli also elicit a more broadband, transient gamma-band response (tGBR) within 0.1 s post-stimulus onset [16]. Previously, using auditory clicktrains, Rojas and colleagues [27] reported equivalent tGBRs between the first-degree relatives of people diagnosed with ASD, and controls. However, using sinusoidal auditory tones, several studies have found reduced tGBRs in ASD [6, 28, 29]. It therefore remains unclear whether both early evoked and later sustained gamma-band activity are dysregulated in ASD (also see Kessler, Seymour and Rippon [5], for a review). We therefore opted to analyse tGBRs alongside ASSRs using clicktrain stimuli in a group of autistic participants. However, the primary focus for this study was on the sustained 40 Hz response, given the ASD-related differences, previously reported using clicktrain stimuli [26, 27].

This study attempted to replicate and extend previous findings showing differences in ASSRs and tGBRs in autism [26, 27], within an adolescent population (aged 14-20). We focused on this age range because adolescence is a crucial period for brain maturation [30, 31] and ASSR power increases with age [22, 23]. Therefore, we reasoned that ASD-related differences in the ASSR would be more pronounced for an adolescent versus adult population. We also opted to recruit adolescents rather than children for this study, given the fixed (adult) size of most MEG helmets and the higher levels of compliance in adolescent populations. Data were collected from a group of 18 ASD participants and 18 typically developing controls using a 306-channel MEG system (Elekta Neuromag). An auditory clicktrain stimulus was presented binaurally to participants, to elicit bilateral ASSRs at 40 Hz. To investigate prolonged neural entrainment, clicktrain stimuli were presented for a total of 1.5 s, rather than 0.5 s as in previous studies [26, 27]. ASSRs were analysed over time, in order to investigate transient changes in 40 Hz power and inter-trial coherence. It was hypothesised that, compared with the control group, the ASD group would show reduced ASSR power and ITC at 40 Hz for the duration of clicktrain presentation [26, 27]. In contrast, it was hypothesised that tGBRs would be equivalent between groups, as previously reported by Rojas and colleagues [27].

Methods

Participants

Data were collected from 18 participants diagnosed with ASD and 18 age-matched typically developing controls, see Table 1. ASD participants had a confirmed clinical diagnosis of ASD or Asperger’s syndrome from a paediatric psychiatrist. Participants were excluded from participating if they were taking psychiatric medication or reported epileptic symptoms. Control participants were excluded from participating if a sibling or parent was diagnosed with ASD. MEG data from a further 9 participants were collected but excluded, due to intolerance to MEG resulting in experimental attrition (2 ASD); movement over 0.5 cm (2 ASD, 2 control, see MEG acquisition); metal artefacts (1 ASD, 1 control); AQ score over 30 (1 control). The movement and AQ thresholds were defined before data collection began.

Table 1 Participant demographic and behavioural data

Behavioural assessments

The severity of autistic traits was assessed using the Autism Quotient (AQ) [32] and sensory traits using the Glasgow Sensory Questionnaire (GSQ) [33]. Independent samples t tests showed that both AQ scores, t (34) = 9.869, p < 0.001, and GSQ scores, t (34) = 3.533, p = 0.001, were significantly higher in the ASD group than in the control group (see Table 1).

Breaking down the GSQ scores further, we observed that our ASD participants showed a heterogeneous pattern of sensory symptoms, with mixtures of hypo- and hyper-sensitivities, across sensory domains (see Supporting Figure 4). Interestingly, auditory scores were the highest amongst any sensory modality, with a mean of 13.9/24. This means that the ASD participants, on average, answered between “Sometimes” and “Often”, when reporting atypical auditory processing on the GSQ.

General non-verbal intelligence was assessed using the Raven’s matrices task [34]. Using an independent samples t test, it was shown that there were no significant group differences in the Raven matrices score, t (34) = −1.372, p = 0.179.

Participants also completed the mind in the eyes test [35]; however, there were no group differences for this test, t (34) = −1.615, p = 0.116. The mind in the eyes test has been recently criticised for measuring emotion recognition rather than an autism-specific deficit in mental state attribution [36], and therefore these scores were not used to investigate correlations between brain patterns and questionnaire measures.

Paradigm

Whilst undergoing MEG, participants performed an engaging sensory task. Each trial started with a randomised fixation period (1.5, 2.5 or 3.5 s), followed by the presentation of a visual grating or auditory binaural clicktrain stimulus (see Fig. 1). The visual and auditory stimuli were presented randomly, rather than in separate experimental blocks. Only the auditory clicktrain data will be described in this article (please see Seymour et al., 2019 [11] for analysis of the visual grating data). The auditory clicktrain was created from auditory square wave clicks, each of 2 ms duration delivered every 25 ms for a total of 1.5 s. Clicktrains were presented at 80 dB (verified using a decibel meter after pneumatic transduction and transmission) binaurally through Etymotic MEG-compatible ear tubes. To keep participants engaged with the task, cartoon pictures of aliens or astronauts were presented after the auditory clicktrain, for a maximum of 0.5 s. Participants were instructed to press a response pad as soon as they were presented with a picture of an alien, but not if they were presented with a picture of an astronaut (maximum response duration allowed was 1.0 s). Correct versus incorrect responses were conveyed through 0.5 s-long audio-visual feedback (correct: green box, high auditory tone; incorrect responses: red box, low auditory tone). Prior to MEG acquisition, the nature of the task was fully explained to participants and several practice trials were performed. MEG recordings lasted 12-13 min and included 64 trials with auditory clicktrain stimuli.
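The clicktrain timing described above (2 ms square clicks delivered every 25 ms for 1.5 s, i.e. a 40 Hz click rate) can be illustrated with a minimal sketch. This is not the authors' stimulus code: the function name, the 48 kHz audio sampling rate and the unit click amplitude are assumptions made purely for illustration.

```python
def make_clicktrain(duration_s=1.5, click_s=0.002, period_s=0.025, fs=48000):
    """Return a list of samples: 1.0 during each click, 0.0 otherwise.

    fs (audio sampling rate) and the unit amplitude are illustrative
    assumptions; only the click duration, period and total duration
    come from the text.
    """
    n_samples = int(round(duration_s * fs))
    click_len = int(round(click_s * fs))
    n_clicks = int(round(duration_s / period_s))
    samples = [0.0] * n_samples
    for k in range(n_clicks):
        start = int(round(k * period_s * fs))
        for i in range(start, min(start + click_len, n_samples)):
            samples[i] = 1.0
    return samples


train = make_clicktrain()
# Count click onsets (transitions from silence to click).
onsets = sum(
    1 for i, s in enumerate(train)
    if s == 1.0 and (i == 0 or train[i - 1] == 0.0)
)
click_rate_hz = onsets / 1.5
```

With these parameters the train contains 60 clicks over 1.5 s, which corresponds to the 40 Hz modulation frequency targeted by the ASSR analysis.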

Fig. 1

Experimental procedure. Participants performed an audiovisual task, consisting of a 1.5–3.5 s baseline period followed by presentation of an auditory clicktrain stimulus for a duration of 1.5 s. After this, participants were presented with a cartoon alien or astronaut picture and instructed to only respond when an alien was presented (response time up to 1.0 s), followed by a green or a red framed box for a correct or an incorrect response, respectively. The alien/astronaut stimuli served to maintain attention and do not form part of the analysed data

MEG acquisition

MEG data were acquired using a 306-channel Neuromag MEG scanner (Vectorview, Elekta, Finland) made up of 102 triplets of two orthogonal first-order planar gradiometers and one magnetometer. Acquisition was conducted in a MaxShield™ magnetically shielded room using MaxShield’s™ patented “four layers in one shell” construction. All data were recorded at a sampling rate of 1000 Hz, using default on-line filters between 0.1 and 330 Hz. Internal active shielding (MaxShield) was turned off for the recording due to numerous artefacts generated by this technology. Five head position indicator (HPI) coils were applied for continuous head position tracking, and visualised post-acquisition using an in-house Matlab script. Any participant who moved more than a conservative threshold of 5 mm in any one direction (x, y or z) was excluded from subsequent analysis. For MEG-MRI coregistration purposes, the locations of three anatomical landmarks (nasion, left and right pre-auricular points), the locations of the HPI coils and 300-500 points from the head surface were acquired using a Polhemus Fastrak digitizer.

Structural MRI

A structural T1 brain scan was acquired for source reconstruction using a Siemens MAGNETOM Trio 3 T scanner with a 32-channel head coil (TE = 2.18 ms, TR = 2300 ms, TI = 1100 ms, flip angle = 9°, 192 or 208 slices depending on head size, voxel size = 0.8 × 0.8 × 0.8 mm).

MEG-MRI coregistration and cortical mesh construction

MEG data were co-registered with participants’ structural MRIs by matching the digitised head-shape data with surface data from the structural scan [37]. Two control participants did not complete a T1 structural MRI and therefore their digitised head-shape data were matched with a database of 95 structural MRIs from the Human Connectome database [38], using an iterative closest points (ICP) algorithm. The head shape-MRI pair with the lowest ICP error was then used as a ‘pseudo-MRI’ for subsequent steps. This procedure has been shown to improve source localisation performance, in situations where a subject-specific anatomic MRI is not available [39, 40]. The aligned MRI-MEG images were used to create a forward model based on a single-shell description of the inner surface of the skull [41] (3000 vertices), using the segmentation function in SPM8 [42]. The cortical mantle was then extracted to create a cortical mesh, using Freesurfer v5.3 [43], and registered to a standard fs_LR mesh, based on the Conte69 brain [44], using an interpolation algorithm from the Human Connectome Project [45] (also see: https://goo.gl/3HYA3L). Finally, the mesh was downsampled to 4002 vertices per hemisphere.

MEG pre-processing

Following data inspection, four MEG channels containing large amounts of non-physiological noise (low-frequency drift) were removed from the data, and not included in any of the subsequent pre-processing steps. No channel interpolation was performed. MEG data were pre-processed using Maxfilter (temporal signal space separation, correlation limit of 0.9), which attenuates external sources of noise from outside the head [46]. Further pre-processing steps were performed in Matlab 2014b using the Fieldtrip toolbox v20161024 [47]. Firstly, for each participant the entire recording was band-pass filtered between 0.5 and 250 Hz (Butterworth filter, low-pass order 4, high-pass order 3) and band-stop filtered (49.5-50.5 Hz; 99.5-100.5 Hz) to remove residual 50 Hz power-line contamination and its harmonic. Data were epoched into segments of 4 s (1.5 s pre, 1.5 s post-stimulus onset, with 0.5 s of padding either side), and each trial was demeaned and detrended. Trials were inspected if the trial-by-channel (magnetometer) variance exceeded 8 × 10⁻²³ (threshold determined from pilot testing), and those containing artefacts (SQUID jumps, eye-blinks, head movement, muscle) were removed. This resulted in the rejection, on average, of 3.4 trials per participant, leaving a mean of 60.6 trials per participant for analysis (minimum = 54, maximum = 64). Finally, data were resampled to 200 Hz to aid computation time.

Source-level spectral power

Source analysis was conducted using a linearly constrained minimum variance beamformer [48], which applies a spatial filter to the MEG data at each vertex of the cortical mesh. Only data from orthogonal first-order planar gradiometers were used for source analysis due to current controversy over combining both magnetometers and gradiometers (which have different levels of noise). A covariance matrix for each participant was constructed from the non-averaged, filtered (see below) data, rather than trial-averaged data (as sensor-level data will be made more ‘correlated’ by averaging over trials). Beamformer weights were calculated by combining the gradiometer covariance matrix with lead-field information, with data pooled across baseline and clicktrain periods (−1.5 s to 1.5 s). Based on recommendations for optimisation of MEG beamforming [49], a regularisation parameter of lambda 5% was applied, due to the rank deficiency of the data following the Maxfilter procedure.

Whilst the tGBR and ASSR both originate from primary auditory cortex, the two responses have different frequency ranges and underlying neural generators [23]. Therefore, we opted to use separate spatial filters, rather than a single spatial filter based on the M100 as used in previous studies [26, 27, 50]. This decision was based on recent work suggesting that beamformer weights should be optimised for specific data of interest [51].

To localise the ASSR, data were band-pass filtered (Butterworth filter, fifth order) between 35 and 45 Hz. A period of 0.0-1.5 s following stimulus onset was compared with a 1.5 s baseline period (1.5 to 0.0 s before clicktrain onset, also see Fig. 2). To localise the tGBR, data were band-pass filtered between 30 and 60 Hz, and a period of 0.0-0.1 s following clicktrain onset was compared with a 0.1 s baseline period (see Fig. 2).

Fig. 2

Procedure for source analysis. For ASSR beamforming, a common spatial filter was computed using data pooled across ASSR and baseline data. This common filter was then used to localise ASSR/baseline data separately. This process was repeated for tGBR data, but the common spatial filter was computed using a different time and frequency band of interest. ASSR, auditory steady state response. tGBR, transient gamma-band response

ROI definition

Regions of interest (ROI) were selected in bilateral primary auditory (A1) cortices to investigate ASSRs and tGBRs in greater detail. ROIs were defined using a multi-modal parcellation from the Human Connectome Project (Supporting Figure 1, [52]). To obtain a single spatial filter for each ROI (right A1 and left A1 separately), we performed principal components analysis on the concatenated filters of each ROI, multiplied by the sensor-level covariance matrix, and extracted the first component, see [53]. Broadband (0.5-250 Hz) sensor-level data were multiplied by this spatial filter to obtain ‘virtual electrodes’.

A1 spectral power

A1 gamma power (ASSR, tGBR) was analysed using the multi-taper method, as implemented in the Fieldtrip toolbox [47], using discrete prolate spheroidal sequences (Slepian functions). This has been shown to offer an optimal trade-off between time and frequency resolution, and is preferred to Morlet wavelets for high-frequency gamma-band activity [54, 55]. Oscillatory power was calculated from 35-45 Hz using a 0.5 s sliding window (step size 0.02 s) with ± 5 Hz frequency smoothing. Due to the narrow frequency range under investigation, power values were averaged between 35 and 45 Hz. Finally, the percentage change in ASSR power was calculated, using the baseline time period, i.e. 1.5 s before clicktrain presentation. For tGBR power, power values were averaged between 30 and 60 Hz and across time (0-0.1 s versus a baseline window 0.1 s before clicktrain presentation). From this, the percentage change in tGBR power was calculated.
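The baseline normalisation described above can be sketched as follows. This is a minimal illustration, not the authors' Fieldtrip code: the function name is hypothetical, and it assumes that baseline power is first averaged over the baseline window before the percentage change is computed for each post-stimulus time point.

```python
def percent_change(power, baseline_power):
    """Percentage change of each power value relative to mean baseline power.

    Assumption: the baseline is collapsed to its mean before normalising;
    the text does not specify this detail.
    """
    baseline_mean = sum(baseline_power) / len(baseline_power)
    return [100.0 * (p - baseline_mean) / baseline_mean for p in power]


# Toy values (arbitrary units), not data from the study.
post_stimulus = [2.0, 3.0, 4.0]
baseline = [1.0, 3.0]
change = percent_change(post_stimulus, baseline)
```

For the toy values, the mean baseline power is 2.0, so the three post-stimulus values correspond to changes of 0%, 50% and 100%.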

A1 inter-trial coherence

To assess band-limited phase consistency across trials, we calculated inter-trial coherence (ITC). An ITC value of 0 indicates complete absence of phase consistency, whereas a value of 1 indicates perfect phase consistency across trials [19]. ITC values were converted to Z values, as recommended by Maris and colleagues [56], to ensure a normal distribution for the statistical analysis (see below).
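As an illustration of the measure, ITC at a given time-frequency point is commonly computed as the length of the mean unit phase vector across trials. The sketch below follows that standard definition. The Z conversion shown is the Rayleigh statistic (number of trials multiplied by squared ITC), which is one common choice; whether it matches the exact transform used in this study is an assumption. Function names are hypothetical.

```python
import cmath
import math


def inter_trial_coherence(phases):
    """Length of the mean unit phase vector across trials (0 to 1)."""
    n_trials = len(phases)
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / n_trials
    return abs(mean_vector)


def rayleigh_z(itc, n_trials):
    """Rayleigh Z statistic; assumed here as the ITC-to-Z conversion."""
    return n_trials * itc ** 2


# Identical phases across trials give perfect consistency.
itc_locked = inter_trial_coherence([0.3] * 10)
# Phases spread evenly around the circle cancel out.
itc_spread = inter_trial_coherence([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

Because every trial contributes a vector of unit length, the measure depends only on phase alignment and not on single-trial amplitude.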

Statistical analysis

For MEG data, statistical analysis was performed using cluster-based permutation tests as implemented in the Fieldtrip toolbox, which have been shown to adequately control the type-I error rate for electrophysiological data [57]. Cluster permutation tests consist of two parts: first an uncorrected independent t test is performed (two-tailed), and all values exceeding a 5% significance threshold are grouped into clusters. The maximum t value within each cluster is carried forward. Second, a null distribution is obtained by randomising the data labels (e.g. ASD/control and time) 10,000 times and calculating the largest cluster-level t value for each permutation. The maximum t value within each original cluster is then compared against this null distribution, with values exceeding a 5% significance threshold (corrected across both tails, i.e. p < 0.025 for each tail) deemed significant. Given that only two other MEG studies have used auditory clicktrains to study ASD-related differences [26, 27], we adopted a conservative statistical approach and used two-tailed tests in all instances.
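The two-step logic of a cluster-based permutation test over time can be sketched in a few lines. This is a simplified illustration under stated assumptions and not the Fieldtrip implementation: function names are hypothetical, the critical t value is passed in by hand, and the cluster statistic is the summed absolute t value within each cluster (cluster mass), a common choice, rather than the maximum t value described in the text.

```python
import math
import random


def t_stat(a, b):
    """Independent-samples t value (pooled variance) for two lists."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled = ss / (na + nb - 2)
    if pooled == 0:
        return 0.0
    return (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))


def max_cluster_mass(group_a, group_b, t_crit):
    """Largest summed |t| over runs of adjacent supra-threshold time points."""
    best = current = 0.0
    prev_sign = 0
    for i in range(len(group_a[0])):
        t = t_stat([s[i] for s in group_a], [s[i] for s in group_b])
        sign = (t > t_crit) - (t < -t_crit)
        if sign == 0:
            current = 0.0            # below threshold: cluster ends
        elif sign == prev_sign:
            current += abs(t)        # same-sign neighbour: extend cluster
        else:
            current = abs(t)         # new cluster starts
        prev_sign = sign
        best = max(best, current)
    return best


def cluster_permutation_p(group_a, group_b, t_crit, n_perm=200, seed=0):
    """Permutation p value for the largest observed cluster."""
    rng = random.Random(seed)
    observed = max_cluster_mass(group_a, group_b, t_crit)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)          # randomise group labels
        if max_cluster_mass(pooled[:n_a], pooled[n_a:], t_crit) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic toy data: 8 subjects per group, 20 time points,
# with a group effect at time points 5 to 14.
gen = random.Random(1)
controls = [[gen.gauss(0, 1) + (3.0 if 5 <= t < 15 else 0.0) for t in range(20)]
            for _ in range(8)]
patients = [[gen.gauss(0, 1) for t in range(20)] for _ in range(8)]
observed, p_value = cluster_permutation_p(controls, patients, t_crit=2.145)
```

The threshold of 2.145 is the two-tailed 5% critical t value for 14 degrees of freedom, matching the toy group sizes. Because only the largest cluster statistic from each permutation enters the null distribution, the comparison is corrected for the multiple time points tested.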

For both left and right A1, the following within-group planned statistical contrasts were performed: ASSR power (0.0 s to 1.5 s) versus baseline (−1.5 s to 0.0 s); ASSR ITC (0.0 s to 1.5 s) versus baseline (−1.5 s to 0.0 s). In addition, the following between-group planned statistical contrasts were performed: control versus ASD ASSR power (0-1.5 s post-clicktrain onset); control versus ASD ITC (0-1.5 s post-clicktrain onset); control versus ASD tGBR power (0.0 s to 0.1 s).

Results

ASSR—power

Whilst ASSRs are known to originate from bilateral primary auditory cortex [17, 58], in order to confirm successful source localisation with our pipeline, ASSR power (35-45 Hz) was localised on a cortical mesh, using an LCMV beamformer (see the “Methods” section). We then calculated the percentage change in 35-45 Hz power between 0.0 and 1.5 s post-clicktrain onset versus a 1.5 s baseline period (−1.5 to 0.0 s). As expected, the control group showed maximal increases in power for regions overlapping bilateral primary auditory cortex (Fig. 3a) [18, 21]. For the ASD group, there were increases in ASSR power for right, but not left, auditory regions, albeit with lower average values than controls (Fig. 3b). For an alternative visualisation of results featuring unthresholded whole-brain statistical maps, see Supporting Information, Fig. 3.

Fig. 3

ASSR power analysis. Top panels (a, b): ASSR beamformer localization. The percentage change in ASSR power (35-45 Hz) is presented on a 3D cortical mesh, thresholded at values greater than 10% (white dotted line on colour scale) for illustrative purposes (for unthresholded images, see Supporting Figure 3), separately for control (a) and ASD (b) groups. Bottom panels (a, b): ASSR in regions of interest (ROIs). ROIs were defined in left and right A1 (see Supporting Figure 1) and ASSR oscillatory power was calculated between 35-45 Hz. The time-period 0-1.5 s post-clicktrain onset was statistically compared with a 1.5 s baseline period. Data are plotted separately for (a) the control group and (b) the ASD group. Dotted lines under the graph indicate times passing a p < 0.05 threshold (two-tailed) compared to baseline, with different colours corresponding to right A1 (red) and left A1 (green). ASSR, auditory steady state response

The use of beamforming for bilateral auditory responses has been questioned, due to the potential for mis-localisations resulting from correlated neural sources [48]. However, as noted by Van Veen and colleagues [48] and later by Sekihara and colleagues [59], complete suppression of brain activity only occurs when the cross correlation of sources exceeds 0.9. When realistic sources of noise are added to simulated MEG data, complete suppression does not occur [60]. Instead, Quraan and Cheyne [60] have shown that for correlated sources at realistic signal-to-noise ratios, beamformers produce a single localisation directly in-between the two sources. Given the clear separation between bilateral auditory sources in our data, as shown in Fig. 3, we argue that systematic mis-localisation is unlikely to have occurred. Furthermore, following source analysis, we used the online Neurosynth and Neurovault tools to compute the spatial correlation between unthresholded group-level, whole-brain images (see Fig. 3; Supporting Figure 3) and several ‘concept-based meta-analysis maps’, generated from over 10,000 neuroimaging studies [61]. Results, reported in Supporting Table 1, showed the highest correlation with the term ‘auditory’ for both the control (r = 0.635) and ASD group (r = 0.471).

ROIs were defined in bilateral auditory cortex (see Supporting Figure 1), to investigate time-frequency responses in greater detail. Oscillatory power was calculated in steps of 0.02 s using the multitaper method, and post-stimulus periods (0 to 1.5 s) were statistically compared to baseline periods (−1.5 to 0 s). Control participants showed increased 35-45 Hz power from 0.24-1.5 s for left A1 and 0.21-1.5 s for right A1 (Fig. 3a bottom panel, times passing a p < 0.05, two-tailed, threshold are indicated with a dotted line). In contrast, the ASD group showed increased 35-45 Hz power for much shorter time windows: in right A1 between 0.31-0.72 s and 0.97-1.35 s; and for left A1 between 0.41-0.63 s (Fig. 3b bottom panel, times passing a p < 0.05, two-tailed, threshold are indicated with a dotted line). Next, we statistically compared ASSR 35-45 Hz power between groups, for both ROIs. It was found that the control group had greater 35-45 Hz power in both right A1, 0.50-1.5 s (Fig. 4a) and left A1, 0.59 to 1.5 s (Fig. 4b), compared with the ASD group.

Fig. 4

ASSR power between groups. ASSR 35-45 Hz power was statistically compared between groups for (a) right A1 and (b) left A1. It was found that the control group had greater ASSR power than the ASD group from 0.52-1.5 s in right A1 and 0.60-1.5 s for left A1. The black dotted line under the graph indicates times passing a p < 0.05 threshold (two-tailed) for the control > ASD contrast.

ASSR—inter-trial coherence

Next, inter-trial coherence (ITC) was calculated for the A1 ROIs, using the same time-frequency approach as for power. ITC values were Z-scored for statistical analysis [56]. First, we statistically compared the post-clicktrain time-period (0 to 1.5 s) with the baseline time period (−1.5 to 0 s). The control group showed statistically significant, p < 0.05, increases in ITC from 0.1-1.48 s for left A1, and 0.14-1.39 s for right A1 (Fig. 5a). The ASD group showed statistically significant, p < 0.05, increases in ITC from 0.18-1.50 s for left A1, and between 0.12-0.60 s, 0.80-0.90 s, and 1.08-1.5 s for right A1 (Fig. 5b).

Fig. 5

ASSR ITC analysis. a-b 35-45 Hz ASSR inter-trial coherence (ITC) results. The time period 0-1.5 s post-clicktrain onset was statistically compared with a 1.5 s baseline period. Data are plotted separately for (a) the control group and (b) the ASD group. Dotted lines under the graph indicate times passing a p < 0.05 threshold (two-tailed) compared to baseline, with different colours corresponding to right A1 (red) and left A1 (green). c-d Between group 35-45 Hz ASSR ITC results. It was found that the control group had greater ASSR ITC than the ASD group from 0.64-0.82 s in right A1 (c) and 1.04-1.22 s for left A1 (d). The black dotted line under the graph indicates times passing a p < 0.05 threshold (two-tailed) for the control (blue) > ASD (red) contrast

Statistical comparison of ITC between groups showed that the control group had higher ITC in both right A1 (Fig. 5c, p < 0.05) and left A1 (Fig. 5d, p < 0.05), but only within short time-windows from 0.64-0.82 s (right A1, Fig. 5c) and 1.04-1.22 s (left A1, Fig. 5d).

ASSR—behavioural data

Next, we investigated whether ASSR responses in the ASD group were correlated with the behavioural questionnaire data collected from participants. ASSR power and ITC values were averaged, separately, over those times showing a significant difference (p < 0.05) between the control and ASD group, as reported in the previous sections (also, see Figs. 4 and 5c-d, black dotted lines). Furthermore, given the similar time course of group differences across right and left A1, ASSR power values were averaged across the ROIs. This was not the case for ITC Z values, however, which were not averaged across right and left A1. These data were correlated with Autism Quotient (AQ) and Glasgow Sensory Questionnaire (GSQ) scores, for the ASD group only. There were no significant correlations for ASSR power (Fig. 6a, top (AQ) r = 0.01, p = 0.96; bottom (GSQ) r = −0.17, p = 0.50), or ITC Z values (Fig. 6b, top (AQ, left A1) r = −0.36, p = 0.14; top (AQ, right A1) r = 0.11, p = 0.65; bottom (GSQ, left A1) r = −0.35, p = 0.14; bottom (GSQ, right A1) r = 0.02, p = 0.92). The correlation analysis was repeated for GSQ scores summed across the six auditory questions only; however, no significant correlations were found (p > 0.05, see Supporting Figure 5).

Fig. 6

ASSR—behaviour relationships. Scatter plots to show the relationship between ASSR power, averaged across left/right A1 (a), or ITC Z values (b), with Autism Quotient (AQ) and Glasgow Sensory Questionnaire (GSQ) scores. There were no significant (p > 0.05) correlations for any brain-behaviour relationship. The shaded region indicates 95% confidence intervals. ITC, inter-trial coherence; ASSR, auditory steady state response

tGBR—source level

Transient gamma-band responses to the auditory clicktrain were localised using a beamforming approach (see the “Methods” section). As for the ASSR analysis, we first confirmed that the cortical generator(s) of the tGBR originated in bilateral auditory cortex. We calculated the percentage change in 30-60 Hz power from 0.0-0.1 s post-clicktrain onset compared with a 0.1 s baseline period [57]. As expected, both groups showed maximal increases in tGBR power for regions overlapping with bilateral primary auditory cortex (Fig. 7a) [18, 21].

Fig. 7

tGBR analysis. a The percentage change in transient gamma-band response, tGBR, power (30-60 Hz) is presented on a 3D cortical mesh, thresholded at t > 3.6% (white dotted line) for illustrative purposes, separately for control (top) and ASD (bottom) groups. b ROIs were defined in left and right A1 (see Supporting Figure 1). The percentage change in tGBR was plotted separately across ROIs (ASD: blue bar; controls: orange bar). Solid black lines indicate 95% confidence intervals. There were no significant differences in tGBR between groups (p > 0.05)

Paralleling the ASSR analysis, ROIs were defined in left and right A1. For each ROI and participant, we calculated the percentage change in tGBR power from 30-60 Hz, between 0.0-0.1 s post-clicktrain onset and a 0.1 s baseline period. Statistically comparing groups, it was found that there were no significant differences, p > 0.05, in tGBR power (see Fig. 7b).

Discussion

This study examined the oscillatory basis of auditory steady state responses (ASSRs) and transient gamma-band responses (tGBR) in a group of 18 autistic adolescents and 18 typically developing controls. We utilised robust source-localisation methods and analysed auditory responses across time. Compared to the ASSR in the control group, we found reduced ~ 40 Hz power for the ASD group, for regions of interest defined in the left and right primary auditory cortices. Furthermore, there was reduced inter-trial coherence for the autistic group at 40 Hz, suggesting that phase dynamics in A1 were less consistent over time. Our results corroborate the notion that auditory brain responses in autism are locally dysregulated [5], especially during sustained gamma-band entrainment (> 0.5 s post-stimulus onset).

Auditory steady state responses in autism

Our results are largely consistent with two previous studies which show reduced ASSRs in autistic adolescents [26] and first-degree relatives of people diagnosed with autism [27]. Whilst our study shows reductions in 40 Hz power across both hemispheres (Figs. 3, 4), Wilson and colleagues observed a selective left-hemisphere reduction in power [26]. This might be due to the monaural stimulation approach, used by Wilson and colleagues, producing larger hemispheric asymmetries as compared to binaural auditory stimulation [62]. Future work is clearly needed to clarify hemispheric asymmetries in ASSR power for ASD populations [62].

Our results build on the previous literature in several ways. Firstly, by examining sustained ASSRs from 0-1.5 s, we found that group differences emerged beyond 0.5 s post-stimulus onset (Figs. 3, 4), suggesting that, when driven at gamma frequencies, A1 becomes increasingly dysregulated in ASD compared to controls in a time-dependent manner. This raises the intriguing possibility that sustained, rather than transient, oscillatory activity at gamma frequencies is affected in autism, perhaps reflecting synaptic dysfunction and an imbalance between excitatory and inhibitory populations of neurons [14]. To investigate this further, future work could parametrically modulate clicktrain duration, intensity, and variability (e.g. perfect 40 Hz vs 38-42 Hz). Secondly, we also found group differences in inter-trial coherence (ITC), with reductions in the autistic group for two short time periods between 0.5 and 1.2 s post-stimulus onset (Fig. 6). Importantly, measures of ITC are normalised by amplitude and have been shown to be more robust for data with lower signal-to-noise ratios [21]. The reduction in ITC for the autistic group may reflect reduced phase consistency across trials and more idiosyncratic neural responses in autism [63, 64], as previously reported for evoked data [65]. However, the reductions in ITC could also have emerged through differences in ASSR power between groups [66]. That said, the time-course of group differences does diverge between ITC and power (see Supporting Figure 6): maximum ITC group differences do not coincide with maximum ASSR power differences, as would be expected if the latter fully drove the former. In any case, our findings strengthen the claim of reduced ASSRs in autism.
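To make the amplitude normalisation of ITC concrete, the following Python sketch computes ITC from complex-valued single-trial signals (for example, the output of a wavelet or Hilbert transform at 40 Hz). It is an illustration of the standard definition, not the pipeline used in this study; the function name and the (trials x time points) array layout are assumptions.

```python
import numpy as np


def inter_trial_coherence(analytic):
    """Inter-trial coherence per time point.

    analytic : complex array of shape (n_trials, n_times), ASSUMED to hold
               the complex band-limited signal for each trial. Samples with
               exactly zero amplitude are not handled in this sketch.
    Returns a real array of shape (n_times,) with values between 0 and 1.
    """
    analytic = np.asarray(analytic, dtype=complex)

    # Divide each sample by its own amplitude so that only phase remains.
    unit_vectors = analytic / np.abs(analytic)

    # Length of the mean unit vector across trials: 1 means identical phase
    # on every trial, values near 0 mean phases are spread around the circle.
    return np.abs(unit_vectors.mean(axis=0))
```

Because each trial is reduced to a unit vector before averaging, trials with large and small amplitudes contribute equally, which is why ITC is described as normalised by amplitude. In practice, however, phase estimates become noisier when power is low, so ITC and power are not fully independent.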

Transient gamma-band responses in autism

Unlike ASSRs, there were no group differences in the transient gamma-band (30-60 Hz) responses to the clicktrain stimulus (Fig. 7). Whilst one previous study using sinusoidal tones reported decreased tGBRs for the first-degree relatives of autistic people, a later study using auditory clicktrains found no group differences in either power or ITC [27]. More generally, findings of transient/evoked gamma-band power across sensory domains are mixed, with both increases and decreases reported (reviewed in [5]). The divergence between steady-state and transient gamma in this study has implications for potential oscillopathies in ASD, as differences in gamma power may depend on the time period under investigation as well as the underlying neural circuits generating gamma oscillations [23].

ASSRs as markers of dysregulated local activity

There has been recent interest in characterising atypical patterns of gamma-band oscillations in autism, due to their link with local cortical function and connectivity [5]. The precise E-I mechanisms underlying gamma generation are well characterised (for a review, see [12]). Of particular importance is the functional inhibition of pyramidal neurons by fast-spiking interneurons via binding of the neurotransmitter gamma-aminobutyric acid (GABA) [12, 67]. Relatedly, there is emerging evidence showing GABA dysfunction in autism [67]. Reduced gamma-band steady-state responses in autism may therefore reflect dysregulated neuronal inhibition, resulting in E-I imbalance [14]. As argued by Kessler, Seymour and Rippon [5], this local dysregulation could result in both hyper- and hypo-sensitivities in ASD, depending on the particular sensory input, and the degree of top-down modulatory processes employed by individuals [11]. To quantify the precise mechanisms underlying reduced gamma-band ASSRs, future studies could utilise dynamic causal modelling of A1 neuronal circuits [68], combined with parametric modulations of ASSRs (e.g. duration, frequency) and participant attention [69]. It would also be interesting to use more naturalistic auditory stimuli, for example, speech stimuli [70, 71], to investigate whether neural entrainment is affected more generally in ASD.

It should also be noted that ASSRs are not simply generated via the linear accumulation of transient evoked responses [18, 72, 73]. Instead, the ASSR may reflect a sustained non-linear neural response at the input stimulation frequency and its harmonics, peaking at the system’s preferred modulation rate [18]. In support of this, Edgar and colleagues (2016) report that in children, ASSRs are difficult to detect, despite measurable auditory evoked responses [23]. Similarly, our data show intact auditory evoked fields (see Supporting Figure 2) and transient gamma-band responses in autism (Fig. 7), in the presence of a reduced ASSR (Fig. 4). Rather than a generalised gamma-band dysfunction in autism, our data suggest a more nuanced reduction in the non-linear dynamics underlying steady-state auditory gamma [27]. Interestingly, a MEG study examining somatosensory processing in ASD showed reduced frequency harmonics at 50 Hz [10], while Vilidaite and colleagues reported a reduction in harmonic EEG responses during visual steady-state stimulation in autistic adults [74]. Furthermore, two MEG studies revealed reduced alpha-gamma phase-amplitude coupling in the visual system in ASD [12, 16]. Overall, this suggests that non-linear aspects of local cortical processing could be dysregulated across sensory domains in ASD [6].
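The link between non-linearity and harmonics can be illustrated with a toy signal-processing example. The Python sketch below is not a model of A1 and is not derived from the data in this study: it simply shows that a linear (scaled) copy of a 40 Hz sinusoid contains no energy at 80 Hz, whereas a non-linear transform of the same sinusoid (here half-wave rectification, chosen arbitrarily) does. The sampling rate, duration and helper name are assumptions.

```python
import numpy as np

fs = 1000                      # sampling rate in Hz (assumed)
t = np.arange(fs) / fs         # 1 s of samples, exactly 40 cycles at 40 Hz
drive = np.sin(2 * np.pi * 40 * t)


def amplitude_at(signal, freq, fs):
    """Spectral magnitude (|FFT| / n_samples) at the bin nearest `freq`."""
    spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return spectrum[np.argmin(np.abs(freqs - freq))]


# Linear system: output is a scaled copy of the input, so no new frequencies.
linear = 0.5 * drive

# Non-linear system: half-wave rectification introduces even harmonics.
rectified = np.maximum(drive, 0.0)

linear_80 = amplitude_at(linear, 80, fs)
rectified_80 = amplitude_at(rectified, 80, fs)
```

Here `linear_80` is zero up to numerical precision, while `rectified_80` is clearly non-zero. Reduced harmonic responses in steady-state paradigms are therefore one possible signature of altered non-linear processing.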

ASSRs are developmentally relevant, increasing by approximately 0.01 ITC value per year [22, 23, 50]. This trajectory may reflect the continuing development of superficial layers of cortex where gamma-band oscillations predominantly originate [25]. We hypothesise that the ASD-related reduction in ASSRs reported in this study results from an atypical trajectory of gamma-band maturation, in line with developmental disconnection theories of autism [75]. Given that the 40 Hz ASSRs continue to mature throughout late adolescence and adulthood [50], it remains to be established whether the development of ASSRs in ASD is simply delayed, or whether reductions persist throughout life. To investigate this further, future studies should use high-powered longitudinal ASD samples and age-appropriate MEG systems [76] to characterise ASSR development throughout childhood, adolescence and into adulthood [77]. If confirmed, divergent ASSR trajectories could act as important autism-relevant markers of intervention efficacy [78].

Limitations

A formal clinical ASD assessment of our participants (e.g. the ADOS [79]) was not performed in this study. We therefore implemented strict participant exclusion criteria, only including autistic participants with a confirmed clinical diagnosis of ASD or Asperger’s syndrome. Between groups, there were significant differences in autistic and sensory traits, measured using two self-report questionnaires (Table 1). However, upon closer inspection of behavioural data (see Supporting Figure 5), the ASD group showed a mixture of hyper- and hypo-sensitive traits across different sensory modalities, making precise brain-behaviour correlations problematic. This may explain the lack of relationship between ASSR power/ITC and AQ/GSQ scores in ASD (Fig. 6). Brain-behaviour relationships might be better quantified using MEG in combination with psychophysical tests of auditory perception and formal clinical assessments.