Auditory selective attention is enhanced by a task-irrelevant temporally coherent visual stimulus in human listeners
Maddox RK., Atilgan H., Bizley JK., Lee AK.
In noisy settings, listening is aided by correlated dynamic visual cues gleaned from a talker's face—an improvement often attributed to visually reinforced linguistic information. In this study, we aimed to test the effect of audio–visual temporal coherence alone on selective listening, free of linguistic confounds. We presented listeners with competing auditory streams whose amplitude varied independently and a visual stimulus with varying radius, while manipulating the cross-modal temporal relationships. Performance improved when the auditory target's timecourse matched that of the visual stimulus. The fact that the coherence was between task-irrelevant stimulus features suggests that the observed improvement stemmed from the integration of auditory and visual streams into cross-modal objects, enabling listeners to better attend the target. These findings suggest that in everyday conditions, where listeners can often see the source of a sound, temporal cues provided by vision can help listeners to select one sound source from a mixture.