1 Introduction

Listeners with impaired audiograms likely suffer from a combination of pathologies that may interact to affect their speech intelligibility in adverse listening conditions. The well-studied cochlear gain loss aspect of hearing loss due to outer-hair-cell deficits is known to impact audibility of sound and yields reduced frequency selectivity as well as cochlear compression loss. Cochlear neuropathy is associated with a reduction in the number and types of auditory nerve (AN) fibers responsible for robust afferent transmission of sound (Kujawa and Liberman 2009). Whereas cochlear gain loss affects the whole dynamic range of sound intensities, cochlear neuropathy is thought to affect sound encoding at supra-threshold levels as high-threshold AN fibers are most sensitive to noise exposure (Furman et al. 2013). Indeed, a recent study has demonstrated that listeners with normal hearing thresholds can show supra-threshold hearing deficits (e.g., in envelope ITD, AM detection threshold tasks) that are related to temporal coding fidelity in the auditory brainstem while being uncorrelated to audiometric or distortion-product (DPOAE) thresholds (Bharadwaj et al. 2015). DPOAE thresholds offer an objective and purely peripheral correlate to the hearing threshold (Dorn et al. 2001) that is not influenced by AN deficits in afferent transmission.

Even though a temporal coding deficit may influence auditory perception, it is currently not known how cochlear neuropathy interacts with the cochlear gain loss aspect of hearing loss, or whether it is equally important for auditory perception. On the one hand, AM detection is expected to improve when cochlear compression is reduced (Moore et al. 1996); on the other hand, cochlear neuropathy may degrade the fidelity with which temporal envelopes are coded (Bharadwaj et al. 2014, 2015). To study the interaction between these components, and to quantify their contributions in listeners who may suffer from both aspects of hearing loss, we tested amplitude-modulation detection (100 Hz) in quiet and in the presence of noise maskers. To force the system to rely on redundancy of coding within a single auditory filter, we determined AM thresholds in a fixed-level narrowband noise masker (NB; 40 Hz) condition. The difference between the AM threshold in quiet and with the NB noise masker might be a metric that is free from cochlear compression and sensitive to the temporal coding fidelity aspect of hearing loss. A second differential measure uses a fixed-level broadband noise masker (BB; 1 octave) to test whether auditory filter widening due to cochlear gain loss is more detrimental than reduced temporal coding fidelity when processing temporal modulations in the presence of background noise. To help separate the hearing deficits in the psychoacoustic measures, AM thresholds were compared to DPOAE thresholds (cochlear gain loss) and EFR measures targeting temporal coding fidelity.

2 Methods

The test population consisted of 10 subjects (4 male, 6 female), aged from 20 to 32 (mean: 25.9), who had near-normal hearing thresholds (< 15 dB HL, flat); 4 subjects (3 male, 1 female), aged from 25 to 35 (mean: 25.9), with a normal 4 kHz threshold but a slightly sloping audiogram in the higher frequencies; and 7 subjects (3 male, 4 female), aged from 39 to 75 (mean: 60.7), with a loss of nearly 25 dB at 4 kHz and a mild hearing loss at the higher frequencies. We chose listeners with mild hearing losses to ensure that they would have measurable DPOAEs.

Sound delivery for OAEs, EFRs, and AM detection threshold measurements was provided by ER-2 insert earphones attached to a TDT-HB7 headphone driver and Fireface UCX soundcard. Stimuli were generated in Matlab and calibrated using a B&K type 4157 ear simulator and sound level meter. OAEs were recorded using the OLAMP software and an ER10B+ microphone; EFRs were recorded using a 32-channel Biosemi EEG amplifier with a custom-built triggerbox and analyzed using the ANLFFR and Matlab software.

2.1 Amplitude Modulation Detection Thresholds

AM detection thresholds for 4-kHz, 500-ms pure-tone carriers were obtained using a 3AFC method (1-up, 2-down) in quiet and in masking noise. Thresholds were obtained from the last six reversals at the smallest step size; 4 repetitions were measured, of which the first run was discarded. The initial modulation depth value was ‑6 dB (20 · log10(m), m = 50 %), and varied adaptively with step sizes of 10, 5, 3 and 1 dB. The modulation frequency was set to 100 Hz to target auditory brainstem processing, based on the shorter EFR group delays reported for modulation frequencies above ~ 80 Hz (Purcell et al. 2004), while keeping the modulation within a single equivalent rectangular bandwidth (ERB). The pure-tone target level was 60 dB SPL for the NH listeners, and was adjusted to a level corresponding to 25 categorical loudness units (approx. 45–50 dB SL) to allow for equal sensation in the very slight and mild hearing-impaired groups. The two on-frequency maskers were presented at a spectral level (in dB SPL) of the stimulus level ‑15 dB for the narrowband condition (NB) and the stimulus level ‑50 dB for the broadband condition (BB). The spectral level calibration refers to a method in which the noise levels were calibrated relative to the level of the stimulus in the frequency spectrum rather than from the rms of the time-domain waveform. Noise bandwidths were 40 Hz (NB) and 1 octave (BB) centered around 4 kHz.
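The adaptive track described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names, the rule that the step size shrinks at each reversal until the 1-dB step is reached, the cap at 0 dB (m = 100 %), and the simulated listener are all assumptions made for illustration. Only the 1-up, 2-down rule, the ‑6 dB starting depth, the step sizes, and the averaging of the last six reversals at the smallest step come from the text.

```python
def depth_db_to_m(depth_db):
    """Convert a modulation depth in dB, 20*log10(m), to the linear index m."""
    return 10.0 ** (depth_db / 20.0)


def run_staircase(respond, start_db=-6.0, steps_db=(10.0, 5.0, 3.0, 1.0),
                  n_final_reversals=6):
    """1-up, 2-down track on modulation depth (dB).

    `respond(depth_db)` returns True for a correct trial (hypothetical
    callback standing in for the 3AFC trial). Assumption: the step size
    moves to the next smaller value at every reversal.
    """
    depth = start_db
    step_idx = 0
    n_correct = 0
    last_dir = 0            # -1: depth was lowered, +1: depth was raised
    final_reversals = []    # depths at reversals taken with the smallest step
    while len(final_reversals) < n_final_reversals:
        if respond(depth):
            n_correct += 1
            if n_correct < 2:
                continue    # need two correct in a row before going down
            n_correct = 0
            direction = -1
        else:
            n_correct = 0
            direction = +1
        if last_dir != 0 and direction != last_dir:
            if step_idx == len(steps_db) - 1:
                final_reversals.append(depth)
            else:
                step_idx += 1
        last_dir = direction
        # Assumption: depth is capped at 0 dB, i.e. m = 100 %.
        depth = min(0.0, depth + direction * steps_db[step_idx])
    return sum(final_reversals) / len(final_reversals)


# Hypothetical deterministic listener who is correct whenever depth >= -20 dB.
threshold_db = run_staircase(lambda d: d >= -20.0)
```

With this idealized listener the track settles into alternating reversals at ‑20 and ‑21 dB, so the estimate lies between the two values. A real listener responds probabilistically, and a 1-up, 2-down rule then converges on the 70.7 %-correct point of the psychometric function.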

2.2 Envelope Following Responses

EFRs were obtained for 4 kHz 1-octave wide noise carriers presented at 70 dB SPL for a modulation frequency of 120 Hz and modulation depths (20log10(m)) of ‑8, ‑4 and 0 dB relative to m = 100 %. 600 repetitions of 600 ms modulated epochs were recorded on 32 channels, and after filtering (60–600 Hz), artefact rejection (100 μV), epoching and baseline correction, the FFT of the averaged epochs in each of the 32 channels was calculated. EFR strength (in dB) was determined as the spectral level at the frequency of the modulator and averaged across the 32 channels. EFR strength of the 100 % modulated condition was measured along with the slope of the EFR strength as a function of modulation depth reduction. The latter slope measure was proposed to reflect temporal coding fidelity to supra-threshold sounds (Bharadwaj et al. 2015). Slopes were only considered when at least the 0 and ‑4 dB EFR levels were above the noise floor; if the ‑8 dB EFR was not present, the level of the noise floor was used as the level of the ‑8 dB EFR strength. The slope was thus calculated using a linear fit to 3 datapoints: 0, ‑4 and ‑8 dB.
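The EFR strength and slope computations described above can be sketched as follows. This is an illustrative Python outline, not the ANLFFR or Matlab analysis used in the study: the function names, the amplitude reference for the dB scale, the sampling rate in the usage lines, and the sign convention of the slope (x axis is the modulation depth reduction in dB, so a declining EFR gives a negative slope) are assumptions. Filtering, artefact rejection, and averaging across the 32 channels are omitted.

```python
import cmath
import math


def average_epochs(epochs):
    """Sample-by-sample mean across equally long epochs."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]


def spectral_level_db(epoch, fs, freq):
    """Level in dB (re unit amplitude, an assumption) of the component at freq."""
    n = len(epoch)
    acc = sum(x * cmath.exp(-2j * math.pi * freq * i / fs)
              for i, x in enumerate(epoch))
    amplitude = 2.0 * abs(acc) / n
    return 20.0 * math.log10(amplitude)


def efr_slope(levels_db, noise_floor_db, reductions_db=(0.0, 4.0, 8.0)):
    """Linear-fit slope of EFR level versus modulation depth reduction.

    levels_db holds the EFR levels for the 0, -4 and -8 dB depths.
    Returns None unless the 0 and -4 dB levels exceed the noise floor;
    a -8 dB level at or below the floor is replaced by the floor level.
    """
    l0, l4, l8 = levels_db
    if l0 <= noise_floor_db or l4 <= noise_floor_db:
        return None
    ys = [l0, l4, max(l8, noise_floor_db)]
    xs = list(reductions_db)
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Usage with a synthetic 600-ms epoch (fs = 2000 Hz is an assumed value).
fs = 2000
epoch = [0.5 * math.sin(2 * math.pi * 120 * i / fs) for i in range(1200)]
level = spectral_level_db(average_epochs([epoch, epoch]), fs, 120.0)
```

Because a 600-ms epoch holds an integer number of 120-Hz cycles, the modulator frequency falls exactly on a DFT bin and no windowing is needed in this sketch.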

2.3 DPOAE Thresholds

DPOAE-based hearing thresholds were derived from 2f1‑f2 DPOAE I/O measurements using the DPOAE sweep method (Long et al. 2008). The primary frequencies were exponentially swept up (2 s/octave) over a 1/3 octave range around the geometric mean of 4 kHz at a constant frequency ratio f2/f1 of 1.2. Using a sufficiently sharp least-squares fit filter (here ca. 2.2 Hz), the distortion component (DCOAE) can be extracted from the DPOAE recording (Long et al. 2008). The DCOAE is generated around the characteristic site of f2 and thus predominantly provides information about the f2 site without being influenced by the DPOAE fine structure that is known to distort I/O functions (Mauermann and Kollmeier 2004). DCOAE I/O functions were computed as the average over 34 DCOAE I/O functions across the measured frequency range. A matched cubic function \(L_{DP} = a + \frac{1}{q}\cdot (L_{2}-b)^{3}\) with parameters a, b, and q was fit to the data points. DPOAE thresholds were determined as the level of L2 at which the extrapolated fitting curve would reach a level of ‑25 dB SPL (~ 0 Pa).
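Once a, b, and q have been fitted, the threshold follows in closed form by setting the cubic function equal to the ‑25 dB SPL criterion and solving for L2, which gives L2 = b + (q · (‑25 ‑ a))^(1/3). A minimal Python sketch of this extrapolation step is given below; the function names and the parameter values in the usage lines are hypothetical, and the fitting itself is not shown.

```python
import math


def cubic_io(l2_db, a, b, q):
    """Matched cubic I/O function: L_DP = a + (L2 - b)**3 / q."""
    return a + (l2_db - b) ** 3 / q


def dpoae_threshold(a, b, q, criterion_db=-25.0):
    """L2 level (dB SPL) at which the fitted cubic reaches criterion_db."""
    target = q * (criterion_db - a)
    # Real cube root that also handles negative arguments.
    return b + math.copysign(abs(target) ** (1.0 / 3.0), target)


# Hypothetical fit parameters, chosen only to give round numbers.
threshold_l2 = dpoae_threshold(a=-17.0, b=40.0, q=1000.0)
check_ldp = cubic_io(threshold_l2, a=-17.0, b=40.0, q=1000.0)
```

With these illustrative parameters the extrapolated threshold is an L2 of 20 dB SPL, and evaluating the cubic at that level returns the ‑25 dB SPL criterion.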

3 Results

Figure 1 depicts correlations between the DPOAE threshold and the EFR measures for NH (blue circles), very slight (black circles) and mild (red squares) HI listeners. There were no significant correlations between the DPOAE measures and the EFR measures, indicating that the EFR measures at 70 dB SPL reflect more aspects of hearing loss than are captured by auditory threshold measures alone. Note that the audiometric hearing threshold and the DP threshold were highly correlated (p < 1e‑6), suggesting that they both reflect the perceived threshold of hearing. The 100 % modulated EFR strength is likely affected by a combination of cochlear gain loss, temporal coding fidelity, and potential head-size differences. Because the EFR slope measure was not correlated to EFR strength in the same listeners, it may be that the differential slope measure to modulation depth reduction is more sensitive to temporal coding fidelity, as suggested earlier by Bharadwaj et al. (2015).

Fig. 1

Correlations between DPOAE thresholds and EFR strength (m = 100 %) (panel A) and EFR slope measures as a function of modulation depth reduction (panel B). Panel C shows that the EFR slopes and strength measure do not correlate indicating they reflect different aspects of auditory coding. The blue circles indicate NH listeners, the black circles represent NH listeners with a very slight sloping high-frequency hearing loss, and the mild HI listeners (red squares) had elevated hearing thresholds at 4 kHz. Because these figures reflect general correlations between objective metrics, data from additional listeners (Verhulst et al. 2015) were added to this analysis

Figure 2 shows the psychoacoustic AM detection thresholds in quiet and in the presence of broadband (panel A) and narrowband noise (panel B). Average AM detection thresholds were similar for NH and mild HI listeners supporting the observation that temporal modulation detection is not determined by the hearing threshold (Moore et al. 1996). In fact, the variation in thresholds for the NH listeners was significantly correlated to the EFR slope metric (panel C) demonstrating that temporal coding fidelity predicts performance when hearing thresholds are normal (see also Bharadwaj et al. 2015). For the mild HI listeners, this relationship is more complex as the reduced temporal coding fidelity in those listeners (i.e., only 2 out of 6 listeners had EFR responses down to ‑4 dB AM depth) would predict worse AM detection thresholds. The observation that HI listeners had normal AM performance despite their reduced temporal coding fidelity suggests that cochlear gain loss can compensate for temporal coding deficits to yield normal AM detection thresholds.

Fig. 2

Psychoacoustic amplitude modulation detection thresholds for the NH (blue and black) and mild HI listeners in the quiet condition and in the presence of a fixed level broadband (panel A) and narrowband (panel B) noise masker. The black symbols reflect those NH listeners that had normal hearing thresholds but a slightly sloping audiogram at the high frequencies. Panel C shows the relation between EFR slope measure and the AM detection thresholds in quiet, indicating that temporal coding fidelity predicts performance in this task for listeners with normal hearing thresholds

AM detection thresholds in background noise worsened for all listeners. Whereas the broadband noise had a variable effect on the NH listeners, the mild HI listeners were only mildly affected (Fig. 2a). In contrast, the narrowband noise impacted AM detection performance in HI listeners significantly more strongly than did broadband noise (Fig. 2a, 2b). This difference between the masker conditions was absent for the NH listeners.

Degradation in AM detection performance due to the presence of background noise is depicted in Fig. 3 for the broadband (panels A&C) and narrowband noise conditions (panels B&C) and compared to the objective DPOAE and EFR measures.

Fig. 3

Degradation in AM detection performance after addition of masking noise (i.e. degradation = AM threshold in noise minus AM threshold in quiet) and its relation to objective measures in the same listeners: DPOAE thresholds and EFR slopes. Correlations were calculated for the whole population (solid; All) and for the listeners with normal hearing thresholds (dashed; NH)

The amount by which AM detection thresholds worsened for the NH listeners was significantly correlated to the DPOAE threshold, indicating that those listeners with the widest auditory filters and steepest compression slopes were less impacted by the addition of the noise. Because this correlation was found for both the NB and BB noise conditions, cochlear compression, and to a lesser extent the width of the auditory filters, might be responsible for this result. The EFR slope metric was not significantly correlated to the degradation in AM detection thresholds in the NH listeners. Because the EFR slope was correlated to the AM detection performance in quiet for the NH listeners, the degradation measure plotted in Fig. 3 may have factored out its influence. Unfortunately, despite the normal EFR strengths at 100 % modulation depth, the tested HI listeners had poor EFR strength for the ‑4 and ‑8 dB modulation depths, making EFR slope estimates and associated correlations for this subgroup impossible.
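The differential measure and its comparison with an objective metric can be sketched in a few lines of Python. The sketch assumes a Pearson correlation coefficient, which the text does not specify, and all numbers in the usage lines are made-up placeholders rather than data from this study.

```python
import math


def degradation(thr_noise_db, thr_quiet_db):
    """Per-listener degradation: AM threshold in noise minus threshold in quiet."""
    return [n - q for n, q in zip(thr_noise_db, thr_quiet_db)]


def pearson_r(xs, ys):
    """Pearson correlation coefficient (assumed statistic, see lead-in)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Placeholder values for three hypothetical listeners (not measured data).
thr_quiet = [-20.0, -22.0, -21.0]      # AM thresholds in quiet, dB
thr_noise = [-12.0, -15.0, -10.0]      # AM thresholds in noise, dB
dpoae_thr = [10.0, 20.0, 5.0]          # DPOAE thresholds, dB SPL
deg = degradation(thr_noise, thr_quiet)
r = pearson_r(dpoae_thr, deg)
```

Because the quiet threshold is subtracted per listener, any factor that shifts both thresholds by the same amount cancels in the degradation measure, which is the rationale for using it as a differential metric.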

4 Discussion

The present study offers a window into how cochlear neuropathy and cochlear gain loss interact to affect perception of fast temporal modulations important for speech perception in adverse conditions. AM detection thresholds and EFR strength to 100 % modulated stimuli were normal in the mild HI listeners we tested. These findings are in line with other studies that show normal AM detection thresholds (Moore et al. 1996) and EFR strengths (Zhong et al. 2014) for subjects with elevated hearing thresholds. However, interactions between hearing deficits were apparent from correlating the psychoacoustic results with objective measures in the same listeners. Whereas AM detection for NH listeners was correlated to their EFR slopes as a measure for temporal coding fidelity, cochlear gain loss was shown to compensate for reduced temporal coding in the HI listeners to yield near normal AM detection thresholds.

Because AM detection thresholds did not correlate to hearing threshold measures and reflected interactions between temporal coding fidelity and cochlear gain, it is informative to use differential metrics to tease apart different subcomponents of sensorineural hearing loss. Adding a fixed-level broadband noise masker is expected to impact AM detection performance in two ways: (i) wider auditory filters would pass through more noise and degrade performance accordingly, and (ii) AM information within the auditory filter would be more noisy in each coding channel (e.g., in each auditory nerve fiber), such that a sufficient number of AN fibers needs to be present to perform well in the task. Because the bandwidth of the NB masker fell within an ERB, this condition was expected to be more detrimental to listeners that suffer from reduced temporal coding fidelity irrespective of the width of their auditory filters. An important third factor that could influence all the psychoacoustic results is the individual amount of cochlear compression that is present at the tested frequency. Because cochlear compression loss would enhance perceived modulation depth both in the quiet and noise masking conditions, it was assumed that the differential metric would be able to parse out this effect.

Comparison between AM detection thresholds in quiet and noise demonstrated that the mild HI listeners were heavily impacted by the NB noise. This finding is in line with the idea that the NB noise targets temporal coding fidelity within a single auditory filter, especially because 4 out of 6 HI listeners did not have EFRs at the ‑8 dB modulation depth. Also the NH listeners showed a large variability in how the NB noise impacted their performance with a trend (p = 0.09) towards degraded AM detection performance in NB noise for those listeners with steeper EFR slopes. Even though the NB condition was designed to target within auditory filter aspects of temporal coding fidelity, it is possible that another mechanism could also explain these results. For example, because the temporal envelope of the NB noise waveform is much more fluctuating than the BB noise envelope, it is possible that the quality of a modulation coding mechanism in the brainstem and not the numbers of auditory-nerve fibers responsible for a robust coding of temporal envelopes could also explain the NB results.

Lastly, it is interesting to observe that AM detection performance in the mild HI listeners was significantly more impacted by the NB than by the BB noise. This difference was not observed for the NH listeners. If the NB condition reflects a temporal coding deficit, then this deficit did not appear to dictate performance in the BB condition for the HI listeners. This suggests that the overall loss of compression due to cochlear gain loss may enhance modulation sensitivity such that a fixed-level BB noise did not degrade performance substantially, despite the reduced temporal coding observed in the EFR measures. It is too early to draw strong conclusions regarding the underlying mechanisms based on the present dataset: additional data and additional metrics that reflect the individual listener's cochlear compression should be added to further tease apart the psychoacoustic results. In this respect, both categorical loudness scaling metrics and DPOAE compression slope estimates could be included.

To conclude, differential psychoacoustic and EFR methods are promising candidates for separating different aspects of hearing loss in listeners with mixed sensorineural pathologies. Differential diagnostic metrics that separate cochlear gain loss from temporal coding deficits are necessary to understand the contributions of different interacting pathologies to perceptual performance.