Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners
Maddox RK., Atilgan H., Bizley JK., Lee AKC.
In noisy settings, listening is aided by correlated dynamic visual cues gleaned from a talker's face, an improvement often attributed to visually reinforced linguistic information. In this study, we aimed to test the effect of audio-visual temporal coherence alone on selective listening, free of linguistic confounds. We presented listeners with competing auditory streams whose amplitude varied independently and a visual stimulus with varying radius, while manipulating the cross-modal temporal relationships. Performance improved when the auditory target's timecourse matched that of the visual stimulus. The fact that the coherence was between task-irrelevant stimulus features suggests that the observed improvement stemmed from the integration of auditory and visual streams into cross-modal objects, enabling listeners to better attend the target. These findings suggest that in everyday conditions, where listeners can often see the source of a sound, temporal cues provided by vision can help listeners to select one sound source from a mixture.