Project Details

TRR 31: The Active Auditory System

Subject Area Biology, Medicine
Term from 2005 to 2017
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 14945932
 
Final Report Year 2018

Final Report Abstract

In the real world, the auditory system constantly has to cope with a mixture of sounds from many acoustic sources. Furthermore, the output of the different sound sources in a complex acoustic scene varies dynamically over time, as do the features that characterize a specific sound signal. Once a sound wave has reached the receiver’s ears, it is no longer physically available for additional analysis. The auditory system therefore has to integrate sounds from the same source over time, which requires a memory of what has been presented, and to segregate the sounds from the different sources that are active simultaneously. This must be achieved even if the sources produce signals that change dynamically over time. Despite all these difficulties, the auditory system appears to solve this complex task with ease. How this is done was the central research question of the CRC/TR 31 “The Active Auditory System”, which studied humans and animal models.

The approach taken to understand the processing of sounds in auditory scene analysis combined a range of methods: results obtained in neurophysiological and imaging studies and in the psychophysical investigation of perception were compared with the predictions of models simulating the perceptual mechanisms. Based on this comparison, it is possible to conclude whether the processes observed in the experimental evaluation of the mechanisms underlying auditory scene analysis are sufficient to explain the brain’s ability to parse the signals from the different sources in natural acoustic scenes. Simulating the mechanisms underlying auditory scene analysis can pave the way to better technical hearing devices and human-computer interfaces.

The research focused on a set of questions that were pursued in depth:
(1) How does active listening support the selection (i.e., weighting) of features of sounds from a specific source that are represented by the patterns of neuronal activity in the auditory pathway?
(2) How do binaural processing mechanisms support the segregation of static and moving sources?
(3) How are acoustic features dynamically integrated over time?
(4) How does combined audio-visual processing affect source analysis in complex acoustic scenes?
(5) How does the bottom-up feature analysis interact with top-down processes, or more specifically, how do the hypotheses generated in the bottom-up analysis and the state of the brain affect perception?
(6) How do these mechanisms affect speech perception in normal-hearing and hearing-impaired human subjects?

The CRC has made significant progress on these research questions. Its work identified the relevance of various sound features in different listening situations. Many monaural sound features, e.g., the common onset of frequency components and the harmonic relations between them, interact in perception and determine the binding of sound components from one source. Binaural feature analysis, e.g., of interaural time and intensity differences and of the coherence of the signals reaching the two ears, determines the ability to segregate sources. The processes underlying feature analysis and integration operate simultaneously on multiple time scales. Time scales in the millisecond range are relevant for binaural unmasking and for the separation of signals based on spectro-temporal fluctuations of signal levels. Time scales in the range of seconds affect the sequential evaluation of a stream of signals, allowing signals from different sources to be segregated.
Analysis at intermediate time scales affects audio-visual integration. Processes operating on intermediate or long time scales are characterized by the action of top-down processes that dynamically modify the bottom-up analysis. By experimental intervention, e.g., transcranial electric stimulation, we were able to modify this analysis. The models developed for the processing of sounds in complex acoustic scenes provided a good prediction of the speech perception of normal-hearing and hearing-impaired human subjects. The improved prediction allows a better adjustment of hearing devices, and with algorithms derived from the physiological mechanisms the function of hearing devices can be further improved. Finally, by incorporating the knowledge obtained from studying the physiological mechanisms underlying auditory scene analysis into automatic speech recognition systems, robustly functioning human-computer interfaces can be constructed.
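As an aside, the binaural cues discussed above can be made concrete with a small example. The following Python sketch is purely illustrative and not taken from the CRC's models: it estimates the interaural time difference (ITD) from the lag of the cross-correlation peak between the two ear signals and the interaural level difference (ILD) from their RMS level ratio. The function name and the toy scene are hypothetical.

import numpy as np

def interaural_cues(left, right, fs):
    """Estimate ITD (seconds) and ILD (dB) from a stereo pair.

    Illustrative only: actual binaural models apply such an analysis
    per frequency band and in short time frames, not over whole signals.
    """
    # ITD: lag of the cross-correlation peak between the ear signals.
    corr = np.correlate(left, right, mode="full")
    lags = np.arange(-(len(right) - 1), len(left))
    itd = -lags[np.argmax(corr)] / fs  # positive: sound reaches the left ear first

    # ILD: broadband RMS level ratio between the ears, in dB.
    def rms(x):
        return np.sqrt(np.mean(np.square(x)))
    ild = 20.0 * np.log10(rms(left) / rms(right))
    return itd, ild

# Toy scene: a noise burst arriving 0.5 ms earlier and 6 dB louder at the left ear.
fs = 44100
rng = np.random.default_rng(0)
sig = rng.standard_normal(fs // 10)                # 100 ms of noise
d = int(0.0005 * fs)                               # 0.5 ms delay in samples
left = 2.0 * sig                                   # +6 dB at the left ear
right = np.concatenate([np.zeros(d), sig])[:len(sig)]
print(interaural_cues(left, right, fs))            # roughly (0.0005, 6.0)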
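The model-based prediction of speech perception mentioned above can likewise be illustrated with a deliberately simplified example. The sketch below is not one of the CRC's models; it implements a band-importance-weighted SNR metric in the spirit of the Speech Intelligibility Index, in which each band's SNR is clipped to +/-15 dB and mapped to a 0-1 audibility value. The band layout and weights are invented for illustration.

import numpy as np

def toy_intelligibility(snr_db, weights):
    """Toy band-importance-weighted intelligibility index (SII-style).

    Each band SNR is clipped to [-15, +15] dB, mapped linearly to [0, 1],
    and averaged with band-importance weights. Illustrative only.
    """
    snr = np.clip(np.asarray(snr_db, dtype=float), -15.0, 15.0)
    audibility = (snr + 15.0) / 30.0   # 0 = band unusable, 1 = fully usable
    w = np.asarray(weights, dtype=float)
    return float(np.sum(w * audibility) / np.sum(w))

# Hypothetical per-band SNRs (dB) for a listener in noise; the weights favour
# the mid frequencies that carry most speech information (values invented).
snr_db = [5, 10, 3, -2, -12]     # e.g. octave bands from 250 Hz to 4 kHz
weights = [1, 2, 3, 3, 2]
print(toy_intelligibility(snr_db, weights))   # about 0.51

In this toy picture, a hearing-device fitting could be evaluated by recomputing the index with the per-band gains the device provides.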
