
Workshop2014

For newer editions of this workshop click here.

Workshop and lecture series on “Cognitive neuroscience of auditory and cross-modal perception”

26 – 30 May 2014 (Lectures 26 – 28 May, Consultations 29 – 30 May)

[Workshop poster]

Venue

Hotel Ambassador
Hlavná 101
040 01 Košice
Slovakia
(Consultations: Lecture room P1, also known as SA1A1, Faculty of Science, Šafárik University, Jesenná 5, Košice)

Organizers:
doc. Norbert Kopčo, PhD. (norbert.kopco@upjs.sk) – scientific program
Beáta Tomoriová, PhD. (beata.tomoriova@upjs.sk), Ing. Ľuboš Hládek (lubos.hladek@upjs.sk) – technical
Perception and Cognition Lab, Institute of Computer Science, Faculty of Science, P. J. Šafárik University

Contact:
kogneuro@gmail.com

Objectives:
This workshop and lecture series will include introductory lectures and advanced research talks on a range of topics related to the neural processes of auditory, visual and cross-modal perception. The talks will illustrate the multidisciplinary character of cognitive neuroscience research, covering behavioral, neuroimaging, and modeling approaches, as well as applications of the research in auditory prosthetic devices. The workshop is aimed at early-stage and advanced students and young researchers, and it will provide ample opportunities for direct interactions between the lecturers and the attendees.

Themes:

  • spatial hearing
  • vision and crossmodal perception
  • neural modeling
  • methods in cognitive neuroscience: behavioral experiments, imaging (EEG, fMRI), modeling
  • applications: cochlear implants, hearing aids

Format of lectures on 26 – 28 May:

  • 14 one-hour lectures by experts on individual topics (2 lectures per expert)
  • student / attendee session with short 10-min presentations and/or posters by the students
  • small modeling / data analysis assignments prepared by the experts for the students

Format of consultations on 29 – 30 May:
This is an optional portion of the workshop. Some of the experts will prepare computer assignments on their presentation topics that the students can work on individually. On Thursday and Friday the experts will be consulting with individual members of the lab on their research projects. The experts will also be available to consult on the assignments and/or research with other attendees by appointment in the mornings on these days. A lecture hall and a computer lab at the university will be provided for these meetings.

Registration

The workshop is open to all interested students and scientists. Registration is free of charge but required (mostly for organizational reasons). To register, please send an email to kogneuro@gmail.com stating your name, your affiliation, and the dates on which you plan to attend. If you would like to give a presentation, please also send an abstract (up to 200 words) and an indication of whether you prefer a poster or an oral presentation, no later than May 21.

Travel, accommodation, visitor information:

Attendees are kindly requested to make their own travel and accommodation arrangements.

Program

Lectures

Monday, 26 May 2014, Expert Lectures, Posters
Session Chairs: AM Norbert Kopco, PM Gabriela Andrejkova

9:00 – 10:00 Simon Carlile (University of Sydney)
Six degrees of spatial separation – The portal for auditory perception
10:15 – 11:15 Christopher Stecker (Vanderbilt University)
Integration of spatial information across multiple cues
11:30 – 12:30 Frederick (Erick) Gallun (US Dept. of Veterans Affairs and Oregon Health & Science University)
Impacts of age and hearing loss on spatial release from masking
Lunch
14:00 – 15:00 Pierre Divenyi (Stanford University)
Cocktail-party effect deficit in aging: what formant transition processing tells us
15:15 – 16:15 Bernhard Laback (Austrian Academy of Sciences)
Binaural cues in electric hearing
16:15 – 17:00 Poster Session I
Zsuzsanna Kocsis, István Winkler, Orsolya Szalárdy, Alexandra Bendixen
Effects of attention and the presence of multiple cues of concurrent sound segregation on the object-related negativity (ORN)
Dávid Farkas, Susan Denham, Alexandra Bendixen, István Winkler
Individual Differences in Auditory Multi-Stability
Rene Sebena, Igor Dudinsky
Subjects’ expectations and imagery scanning
Guillaume Andéol, Ewan Macpherson, Andrew Sabin
Sound-localization in noise performance is determined by sensitivity to spectral shape
Ľuboš Hládek, Christophe Le Dantec, Aaron R Seitz, Norbert Kopčo
Audio-visual calibration and plasticity of auditory distance perception
Beáta Tomoriová, Ľuboš Hládek, Rudolf Andoga, Norbert Kopčo
Spatial Aspects of Contextual Plasticity in Sound Localization

Tuesday, 27 May 2014, Expert Lectures, Posters
Session Chairs: AM Beata Tomoriova, PM Milan Jovovic

9:00 – 10:00 Arash Yazdanbakhsh (Boston University)
Neural and computational models of vision: Fundamental problems of vision
10:15 – 11:15 Aaron Seitz (University of California, Riverside)
Cognitive Neuroscience of Perceptual Learning: Specificity and Transfer
11:30 – 12:30 Christopher Stecker (Vanderbilt University)
Neuroimaging of spatial hearing
Lunch
14:00 – 15:00 Pierre Divenyi (Stanford University)
Phonemic restoration: done by our brain or our tongue?
15:15 – 16:15 Frederick (Erick) Gallun (US Dept. of Veterans Affairs and Oregon Health & Science University)
Characterizing attentional resources using the auditory dual-task
16:15 – 17:00 Poster Session II
Annamária Kovács, Martin Coath, Susan Lynda Denham, István Winkler
Auditory onsets, landmarks, surprise and salience
Lukáš Varga, Zuzana Kabátová, Ivica Mašindová, Branislava Bercíková, Daniela Gašperíková, Iwar Klimeš, Milan Profant
Deafness etiology and functional outcomes after cochlear implantation in prelingually deaf children
D. Horváth, J. Uličný, B. Brutovský
Cognition of geometry of high-dimensional data – topological structured projection emerging through simulated experiments with self-organized non-Euclidean Manifold learning.
Eleni L. Vlahou, Aaron Seitz, Norbert Kopčo
Adaptation to Room Reverberation in Nonnative Phonetic Training
David Hartnagel, Alain Bichot, Patrick Sandor and Corinne Roumes
Audio-Visual spatial fusion varies with eyes’ position and allocentric visual reference frame
Peter Tóth, Angela Josupeit, Norbert Kopco, Volker Hohmann
Modeling of Speech Localization in a Multitalker Mixture Using “Glimpsing” Models of Binaural Processing

Wednesday, 28 May 2014, Expert Lectures, Student talks
Session Chairs: AM Jana Estocinova, PM Lubos Hladek

9:00 – 10:00 Simon Carlile (University of Sydney)
The plastic ear: Perceptual relearning and auditory spatial perception
10:15 – 11:15 Aaron Seitz (University of California, Riverside)
Cognitive Neuroscience of Perceptual Learning: Mechanisms Guiding Learning
11:30 – 12:30 Arash Yazdanbakhsh (Boston University)
Neural and computational models of vision: Neural models and their implementation in early vision
Lunch
14:00 – 15:00 Bernhard Laback (Austrian Academy of Sciences)
Spectral localization cues in acoustic and electric hearing
15:15 – 17:00 Attendee Talks
Jana Eštočinová, Wolf Zinke, Susanne Sommer, Stefan Pollmann
Persistent behavioral and neuronal effects of context-reward contingencies on visual search.
Milan Jovovic
Dynamical Brain: a signal processing tool for analyzing EEG/fMRI brain scans
Gabriela Andrejková and Norbert Kopčo
Effect of streaming on localization with a preceding distractor
Jaroslav Bouse
Binaural model of lateralization
Ľuboš Marcinek, Norbert Kopčo
Analysis of non-auditory effects on contextual plasticity in spatial hearing
Tom Barker
Modelling Auditory Streaming using Non-negative Modulation Pattern Tensor Factorisations
Peter Lokša, Norbert Kopčo
Visual Adaptation And Spatial Auditory Processing
Norbert Kopčo, Jana Eštočinová
Perceptual and neuronal patterns of auditory distance processing in humans

Assignments

Seitz – demo of UltimEye and sensory learning PC game (description)
Divenyi – MATLAB assignments on STRFs, speech synthesis (strf, strftools)
Laback – electric hearing – MATLAB assignment “Acoustic simulations of cochlear implant stimulation.”
Yazdanbakhsh – vision modeling 1, vision modeling 2
Gallun – simulating psychoacoustical data and then analyzing the result of the simulation (scripts)
Stecker – analysis of an fMRI data set using MATLAB tools
Carlile – Auditory Modeling Toolbox for MATLAB and analysis of HRTFs
Kopco – Neural coding and linear regression
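
To give a flavor of these assignments, below is a minimal sketch of the kind of exercise the last item describes, written in Python rather than MATLAB. The linear tuning model, the noise level, and all parameter values are invented for illustration and are not the actual assignment.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "neuron": its firing rate is a noisy linear function
# of a 3-dimensional stimulus (the weights are the neuron's tuning).
true_weights = np.array([1.5, -0.7, 0.3])
n_trials = 500
stimuli = rng.normal(size=(n_trials, 3))              # random stimuli
rates = stimuli @ true_weights + rng.normal(scale=0.5, size=n_trials)

# Linear regression (ordinary least squares) recovers the tuning
# weights from the stimulus-response pairs.
X = np.column_stack([stimuli, np.ones(n_trials)])     # add an intercept
w_hat, *_ = np.linalg.lstsq(X, rates, rcond=None)

print("estimated weights:", w_hat[:3])    # close to true_weights
print("estimated intercept:", w_hat[3])   # close to 0
```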

Consultations

Thursday, 29 May 2014:
9:30 Stecker, Divenyi, 10:30 Laback, Seitz
Friday, 30 May 2014:
(8:45 meeting with dean) 9:30 Yazdanbakhsh, Gallun, 10:30 Carlile, Kopco

Abstracts

Invited talks

Six degrees of spatial separation – The portal for auditory perception.

Simon Carlile

University of Sydney

Our perception of auditory space depends on the analysis of acoustic cues produced at each ear by the surrounding sound sources (for reviews, see Carlile, 1996; Carlile et al., 2005). Despite more than a century of research, the nature of the representation underlying this perception remains the subject of much debate (Grothe et al., 2010; Ashida and Carr, 2011). Strong evidence indicates that the early encoding of binaural time differences involves a two-channel, opponent code rather than a labelled-line code as suggested previously. In this lecture we will review the historical development of these ideas and, in that context, look at data from very recent experiments that take a different approach to probing this auditory representation using a variation of the two-point discrimination task. Subjects made a two-point separation judgement using concurrent speech sounds. Discrimination thresholds changed non-linearly as a function of the overall separation, with a minimum at 6° of separation. This ‘dipper’ function was seen for regions around the midline as well as for more lateral regions (30° and 45°). By contrast, the JNDs for the binaural cues to sound location (interaural time and level differences) were linear for matched base intervals.

These data suggest that the perceptual representation of auditory space involves a multi-channel mapping which emerges subsequent to the encoding of the binaural cues. It could be based on a logical representation of space that operates as a network of neural interconnections rather than as a topographical ‘map’. There would be significant advantages to representing space through the logical relationships of ‘object locations’ rather than at the level of cues or features, especially when considering the challenge of efficiently integrating auditory and visual information. Audio-visual binding at the level of object representation would obviate the need to convert audition’s head-centered coordinates to vision’s retinal coordinates on a feature-by-feature basis. Strongly corroborating this idea is our recent finding of a dipper function in a very similar two-point discrimination experiment in which one point was a sound source and the other a light source (Orchard-Mills et al., 2013).

Ashida G, Carr CE (2011) Sound localization: Jeffress and beyond. Cur Op Neurobiol 21:745-751.
Carlile S (1996) The physical and psychophysical basis of sound localization. In: Virtual auditory space: Generation and applications. (Carlile S, ed), p Ch 2. Austin: Landes.
Carlile S, Martin R, McAnnaly K (2005) Spectral Information in Sound Localisation. In: Auditory Spectral Processing (Irvine DRF, Malmierrca M, eds), pp 399-434: Elsevier.
Grothe B, Pecka M, McAlpine D (2010) Mechanisms of Sound Localization in Mammals. Physiol Rev 90:983-1012.
Orchard-Mills E, Leung J, Burr D, Morrone MC, Wufong E, Carlile S, Alais D (2013) A Mechanism for Detecting Coincidence of Auditory and Visual Spatial Signals. Multisensory Research 26:333-345.

The plastic ear: Perceptual relearning and auditory spatial perception.

Simon Carlile

University of Sydney

The auditory system of adult listeners has been shown to re-calibrate to spectral cues altered using in-ear moulds (Hofman et al., 1998). We have recently confirmed that the moulds initially degraded localization performance, that significant improvement followed chronic exposure (10-60 days), and that this improvement occurred for both audio-visual and audio-only regions of space (Carlile and Blackman, 2013). This raises the question of what teacher signal drives this remarkable functional plasticity in the adult nervous system. Furthermore, the individual differences in the extent and rate of accommodation suggest a number of factors driving this process. Prompted by work on sensory-motor learning and the role of the motor state in auditory localisation, we hypothesized that it would be possible to facilitate this process by providing multi-modal and sensory-motor feedback during accommodation. This most recent work demonstrates that a relatively short period of training involving sensory-motor feedback (5 – 10 days) significantly improved both the rate and extent of accommodation, and that the improvement generalized across different stimuli (Carlile et al., 2014). This has significant implications not only for the mechanisms by which this complex sensory information is encoded to provide a spatial code but also for adaptive training to altered auditory inputs.

Carlile S, Blackman T (2013) Relearning auditory spectral cues for locations inside and outside the visual field. J Assoc for Res in Otolaryngol 15:249-263.
Carlile S, Balachandar K, Kelly H (2014) Accommodating to new ears: The effects of sensory and sensory-motor feedback. J Acoust Soc Am 135:2002-2011.
Hofman PM, Riswick JGAV, Opstal AJV (1998) Relearning sound localization with new ears. Nature Neurosci 1:417 – 421.

Cocktail-party effect deficit in aging: what formant transition processing tells us

Pierre Divenyi

Stanford University

Experiments were conducted on a group of normal-hearing young listeners and a group of only mildly hearing-impaired elderly listeners to determine the effect of a loud “/iuiuiuiu…/”-analog distracter on the identification of a /iui/- or /uiu/-analog target pattern. The stimuli were synthesized vowel-analog signals with single formant peaks frequency-modulated (FM) at 5 Hz, i.e., at a rate in the syllabic range. The target and the distracter had different fundamental frequencies in the 100-to-200-Hz range, emulating the voices of different talkers. The target-to-distracter level ratio necessary for correct identification of the target pattern was determined as a function of the target/distracter fundamental frequency difference, FM swing, and distracter level. Results indicate that the two groups are similarly influenced by physical parametric changes, with one major exception: the elderly subjects need about 20 dB larger target-to-distracter ratios to perform at the same level as the young. Using a widely accepted model of cortical response patterns elicited by complex sounds, the spectro-temporal receptive field (STRF) model, our analyses show that a large portion of the elderly subjects’ deficiency in this task can be attributed to processing deficits at levels higher than the auditory periphery. Relevance of the results to elderly persons’ loss of speech understanding in crowd noise will be discussed. [Work supported by AFOSR and the VA Medical Research.]

Phonemic restoration: done by our brain or our tongue?

Pierre Divenyi

Stanford University

Spondees, both true disyllabic words and concatenated monosyllabic word pairs, had their middle section excised from the midpoint of the first to the midpoint of the second vowel – the excised segment thus included the final consonant(s) of the first and the initial consonant(s) of the second syllable. The excised segment was replaced by one of four signals: silence, speech-spectrum white noise with a flat envelope, speech-spectrum white noise with its envelope modulated using that of the excised speech, or a low-pass filtered sawtooth wave having the fundamental frequency of the excised speech. These stimuli were presented to normal-hearing young listeners instructed to type what they heard, i.e., to guess both monosyllabic half-words. Input and response spondees were compared by synthesizing the typed response, and also by synthesizing the stimulus spondee before cutting its middle, using an articulatory gesture-based synthesizer (the Haskins Laboratories’ TaDA). The stimulus and response spondees were aligned using dynamic time warping with the vowel midpoints as anchors. The resulting equal-duration input and response spondees were analyzed in two ways: by looking at their phonemic confusions and by comparing the normalized distances of the trajectories between the input and output articulatory gestures. While phoneme-based confusions of several phonetic features were high, stimulus-response gesture distances underlying the same features were generally about one order of magnitude lower. These results suggest that gesture trajectories of the excised speech segments are guessed by the listener more correctly than the actual phonemes, and that an articulatory representation of speech is present somewhere in the listener’s brain, permitting him/her to recover the acoustic signal corresponding to the gesture functions. [Work supported by NSF, AFOSR, and the VA Medical Research.]
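
For readers unfamiliar with the alignment step, the following is a minimal sketch of dynamic time warping (DTW) between two 1-D sequences, in plain Python/numpy. The actual study warped articulatory gesture trajectories and anchored the warp at the vowel midpoints; this toy version shows only the core DTW recursion.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic dynamic time warping distance between 1-D sequences."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three allowed warp steps
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Toy example: the same trajectory produced at two speaking rates
stimulus = np.sin(np.linspace(0, np.pi, 50))
response = np.sin(np.linspace(0, np.pi, 80))
print(dtw_distance(stimulus, response))  # small despite the length mismatch
```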

Impacts of age and hearing loss on spatial release from masking

Frederick (Erick) Gallun

US Dept. of Veterans Affairs and Oregon Health & Science University

Listeners in complex auditory environments can benefit from the ability to use a variety of spatial and spectrotemporal cues for sound source segregation. Probing these abilities is an essential part of gaining a more complete understanding of why listeners differ in their ability to navigate the auditory environment. Two fundamental ways in which the auditory systems of individual listeners can differ are aging and hearing loss. One difficulty with uncovering the independent impacts of age and hearing loss on spatial release, however, is the commonly observed phenomenon of age-related hearing loss. To reveal effects of aging on spatial hearing, it is essential to develop testing methods that reduce the influence of hearing loss on the outcomes. In addition, the statistical power needed for such testing generally requires larger numbers of participants than can easily be tested using traditional behavioral methods. This talk will describe the development and validation of a rapid method by which experimental participants or clinical patients can be categorized in terms of their ability to use spatial and spectrotemporal cues to separate competing speech streams. Results show that a brief (20-trial) fixed-stimulus procedure can provide data comparable to those obtained with a longer adaptive procedure, and that individualizing the target level to ensure audibility allows the influence of hearing thresholds to be reduced and the impacts of aging to be more easily examined.
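
As a rough illustration of why a short fixed-stimulus procedure can still yield a usable threshold estimate (and in the spirit of the simulation assignment listed above), here is a small sketch that simulates 20 trials from a hypothetical logistic psychometric function and recovers the threshold by maximum likelihood. The function shape, levels, and parameters are all invented, not taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

def p_correct(tmr, threshold, slope=1.0, guess=0.5):
    """Logistic psychometric function for a 2-alternative task."""
    return guess + (1 - guess) / (1 + np.exp(-slope * (tmr - threshold)))

# Simulate a 20-trial fixed-stimulus run: 4 trials at each of 5 TMRs
tmr_levels = np.repeat([-8, -4, 0, 4, 8], 4).astype(float)
true_threshold = -2.0
responses = rng.random(tmr_levels.size) < p_correct(tmr_levels, true_threshold)

# Maximum-likelihood threshold estimate by brute-force grid search
grid = np.linspace(-10, 10, 401)
loglik = [np.sum(np.log(np.where(responses,
                                 p_correct(tmr_levels, t),
                                 1 - p_correct(tmr_levels, t))))
          for t in grid]
print("estimated threshold:", grid[int(np.argmax(loglik))])
```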

Characterizing attentional resources using the auditory dual-task

Frederick (Erick) Gallun

US Dept. of Veterans Affairs and Oregon Health & Science University

The attention operating characteristic (AOC) displays joint performance in a dual-task paradigm. Sampling theory allows the AOC to be used to distinguish two tasks that share resources from two tasks that call upon independent resources. This talk will review data on intensity discrimination and identification showing that “easier” tasks can cause dual-task costs, while “harder” tasks can have no costs associated with the dual-task. Recent data will be discussed in which speech stimuli have been analyzed using the same dual-task approach. Finally, suggestions will be made for ways in which the current enthusiasm for the dual-task paradigm in clinical research can be used to improve our theoretical understanding of attention and memory as well as demonstrating the effects of various impairments and prostheses.

Binaural cues in electric hearing: Spectral localization cues in acoustic and electric hearing

Bernhard Laback

Austrian Academy of Sciences

Cochlear implants (CIs) are prosthetic auditory devices that allow restoring basic hearing ability in people with profound or complete hearing loss. CIs are increasingly implanted on both ears (bilaterally) aiming to provide listeners with the advantages of binaural hearing. Binaural hearing is essential for sound localization along the left/right dimension and important for understanding speech in noisy environments.

In the first part of this lecture I will describe current CI systems and their limitations in transmitting binaural information. Then, I will focus on the basic sensitivity of CI listeners to binaural information. Lastly, I will present new approaches for encoding binaural information with future bilateral CI systems.

The second part focuses on spectral localization cues which are pivotal for vertical-plane (front/back and up/down) localization. I will first describe the basic processes known from normal (acoustic) hearing, including new insights about the plasticity of the auditory system in learning new localization cues. Then, I will present some basic electric and acoustic hearing research work on the feasibility of encoding spectral localization cues with CIs.

The lectures will be complemented by some small cochlear implant processing assignments.

Cognitive Neuroscience of Perceptual Learning: Specificity and Transfer

Aaron Seitz

University of California, Riverside

Perceptual learning (PL), experience-induced gains in discriminating sensory features, has classically been thought to be highly specific to the trained retinal location of the stimuli. Together with evidence that specific learning effects can result in corresponding changes in primary visual cortices, researchers have theorized that specificity implies regionalization of learning in the brain. However, other research suggests that specificity in perceptual learning can arise from learning of the read-out in decision areas or through top-down processes. Here, I review the extant literature regarding specificity in perceptual learning, including evidence from psychophysics, single-cell recordings, EEG, fMRI, and computational models.

Cognitive Neuroscience of Perceptual Learning: Mechanisms Guiding Learning

Aaron Seitz

University of California, Riverside

How do we select what to learn? A central issue in neuroscience is how the brain selectively adapts to important environmental changes. While the brain needs to adapt to new environments, its architecture has to be protected from modification due to continual bombardment of undesirable information. Here I discuss rules by which the brain solves this so-called stability-plasticity dilemma in the context of perceptual learning.

Neuroimaging of spatial hearing, or quantifying human auditory cortical tuning to sound-localization cues with fMRI

Christopher Stecker

Vanderbilt University, Nashville TN USA

The auditory cortex plays a necessary role in sound localization by both human and animal listeners, yet the representation and processing of sound-localization cues in the auditory cortex is not well understood. In animal models, a majority of cortical neurons exhibit a preference for sounds with interaural time differences (ITD) and/or interaural level differences (ILD) favoring the contralateral ear (that is, corresponding to sound sources contralateral to the recorded hemisphere). On that basis, gross measures such as those obtained with functional MRI are expected to exhibit clear biases favoring contralateral binaural cues. Over the course of three separate fMRI studies with human listeners, we have consistently observed such biases in responses to ILD, but much weaker tuning to ITD. In this presentation, I discuss those studies, focusing in particular on questions related to the roles of stimulus history, behavioral task, and cortical populations. That discussion will follow an initial overview of the fMRI method, its physiological bases, and potential challenges for auditory research. Additionally, hands-on experience with data analyses for auditory fMRI will be offered to interested students.

Integration of spatial information across auditory cues

Christopher Stecker

Vanderbilt University, Nashville TN USA

The ability of human listeners to localize sound and understand the auditory environment depends on binaural information carried by various acoustic features of sound, for example interaural time and level differences (ITD and ILD, respectively). Those features are, in turn, variously affected by distortion arising from echoes, reverberation, and competing sound sources, such that the relative reliability or informativeness of specific cues depends significantly on the listening context. In reverberation, for example, ILD cues are typically reduced, whereas post-onset ITD cues may take on large and fluctuating values that bear little relation to the sound-source location. Current experiments in our lab have addressed the weighting of binaural information across cues and over the durations of brief sounds. The results suggest a context-dependent weighting mechanism that avoids overdependence on uninformative cues while maintaining sensitivity to cues that reliably identify sound-source locations or detect changes in the acoustic environment.

In this presentation I will discuss the particular challenges of listening in acoustically complex environments, in terms of the multiple cues available to listeners and their relative usefulness. I will describe examples of listeners’ weighted combination of spatial information across cue type, sound frequency, and over time, and follow up with a discussion of the potential impacts of cue-weighting strategies on spatial hearing in acoustically complex environments.

Neural and computational models of vision

Arash Yazdanbakhsh

Boston University

The lectures explore the psychological, biological, mathematical and computational foundations of visual perception, and are combined with simulation and essay assignments to provide a self-contained examination of core issues in early and middle visual processing. Mathematically specified neural and computational models elucidate the structure and dynamics of the mammalian visual system. Emphasis is placed on understanding the psychophysics and physiology of mammalian vision, both as a means of better understanding our own human intelligence, and as a foundation for tomorrow’s machine vision architectures and algorithms.

Neural and computational models of vision: Fundamental problems of vision

1) Visual grouping and segregation
2) Seeing and recognizing
3) Input interface: retina, its veins and blind spot
4) Perceiving surface: constancy, contrast, and discounting the illuminant
5) Boundaries and surfaces, or a different decomposition of a visual stimulus?
6) Detectors in primate visual system: unoriented and oriented detectors
7) The trade-off between noise and saturation in a neural signal
8) Reflectance and ratio; shunting and mass action

Neural and computational models of vision: Neural models and their implementation in early vision

1) Brightness: Constancy and contrast
2) Shift property and Weber law
3) Retinal physiology
4) Back to the trade-off between noise and saturation in a neural signal (see the worked equation after this list)
5) Distance–dependent neural processes
6) Recurrent competitive networks
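
As a concrete anchor for the noise-saturation topics above (item 7 of the first list and item 4 of this one), it may help to see the shunting ("mass action") membrane equation standard in this modeling tradition, e.g., in Grossberg-style networks; that the lectures use exactly this form is an assumption here:

\[
\frac{dx_i}{dt} = -A\,x_i + (B - x_i)\,I_i^{+} - (C + x_i)\,I_i^{-}
\]

Here \(x_i\) is the activity of cell \(i\), \(A\) is a passive decay rate, \(B\) and \(-C\) bound the activity from above and below, and \(I_i^{+}\) and \(I_i^{-}\) are excitatory and inhibitory inputs. With \(C = 0\) and inhibition pooled over the whole input field, the equilibrium activity is \(x_i = B I_i / (A + \sum_k I_k)\): activity stays bounded by \(B\) no matter how intense the input (no saturation), yet small inputs are preserved as ratios, i.e., reflectances (no loss in noise). This is the standard resolution of the noise-saturation trade-off referenced in both lists.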

Poster session I

Effects of attention and the presence of multiple cues of concurrent sound segregation on the object-related negativity (ORN)

Zsuzsanna Kocsis 1,2, István Winkler 2,3, Orsolya Szalárdy 1,2, Alexandra Bendixen 4

1 Budapest University of Technology and Economics, 2 Hungarian Academy of Sciences Research Centre for Natural Sciences, 3 University of Szeged, 4 University of Leipzig

This study was aimed at assessing the effects of 1) attention and 2) combinations of known cues of concurrent sound segregation on the object-related negativity (ORN) and the P400 event-related brain potential components, which index the segregation of concurrent sound events. Participants were presented with sequences in which 50% of the sounds were complex tones made up of the 5 lowest harmonics. The other half of the complex tones had one (the 2nd) or two (the 2nd and 4th) of their harmonics mistuned by +8%, delayed by 100 milliseconds, or delivered with different interaural time and intensity differences compared with the other harmonics (perceived location difference). In separate stimulus blocks, one, two, or all three manipulations of the harmonics were combined. Participants either watched a silent movie (passive listening) or were instructed to mark on each trial whether they heard one or two concurrent tones (active listening) while the electroencephalogram (EEG) was recorded. The manipulations elicited larger ORN amplitudes in the active than in the passive listening situation. As expected, P400 was only elicited in the active situation. Perceived location difference was found to be a weaker cue for concurrent sound segregation and for ORN than the other two manipulations. Sub-additive amplitude effects were observed for the various cue combinations. This suggests that ORN indexes the perception of two concurrent sounds, rather than the strength or reliability of the sensory evidence driving the perceptual decision.

Keywords: attention, concurrent sound segregation and object-related negativity

Auditory onsets, landmarks, surprise and salience

Annamária Kovács1, Martin Coath2, Susan Lynda Denham2, István Winkler1,3

1 Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, Budapest, 2 Cognition Institute, Plymouth University, UK, 3 Institute of Psychology, University of Szeged, Hungary

Enhancement of auditory transients is well documented in the auditory periphery and midbrain, and it is also known that transients are important in, for example, speech comprehension, object recognition and grouping.
In this work we introduce the novel approach of using an artificial neural network to implement a model of auditory transient extraction which is based on the asymmetry of the distribution of energy inside a frequency-dependent time window. We compare the output of this method with the original model and with other methods of identifying salient events in an auditory stimulus motivated by phonological (Landmarks) and information-theoretic (Bayesian Surprise) analyses of human speech.

Keywords: Auditory transients, spectro-temporal responses, auditory cortex, artificial neural networks, models, acoustic landmarks, Bayesian surprise

Individual Differences in Auditory Multi-Stability

Dávid Farkas1,2, Susan Denham3, Alexandra Bendixen4, István Winkler1,5

1 Hungarian Academy of Sciences Research Centre for Natural Sciences, 2 Budapest University of Technology and Economics, 3 University of Plymouth, 4 Carl von Ossietzky University of Oldenburg, 5 University of Szeged

When sensory input is ambiguous, individuals differ in how much they switch between the possible percepts (perceptual bi-/multi-stability; e.g. Aafjes et al., 1966; Kondo et al., 2012). A recent study (Denham et al., 2014) employing an auditory multi-stable streaming paradigm found that individuals retain the same idiosyncratic switching pattern even one year after the first test. Here we searched for correlates of these characteristic individual switching patterns in executive functions and various personality traits. Results showed that two executive functions, namely shifting and inhibition, can be linked to the individual switching patterns. We also found that neither inhibition as a personality trait nor creativity was correlated with the switching patterns. From the personality trait correlates, multiple linear regression revealed that the “Active Engagement with the World” factor of the meta-trait Ego-resiliency (Block, 2002; Farkas & Orosz, in prep.) is the personality trait most closely correlated with inter-individual variance in how listeners perceive streaming sound sequences.

Deafness etiology and functional outcomes after cochlear implantation in prelingually deaf children

Lukáš Varga1,2, Zuzana Kabátová1, Ivica Mašindová2, Branislava Bercíková1, Daniela Gašperíková2, 3, Iwar Klimeš2, 3, Milan Profant1

1 Department of Otorhinolaryngology – Head and Neck Surgery, Faculty of Medicine and University Hospital, Comenius University, Bratislava, Slovakia, 2 Institute of Experimental Endocrinology, Slovak Academy of Sciences, Bratislava, Slovakia, 3 Centre for Molecular Medicine, Slovak Academy of Sciences, Bratislava, Slovakia

This study aims to detect possible associations between hearing loss etiology and postoperative rehabilitation outcomes in prelingually deaf children, with a particular focus on hereditary deafness caused by connexin mutations. Eighty-one of 92 prelingually deaf implanted children, tested for DFNB1 mutations, were divided into 3 etiology groups and underwent audiological evaluation with tone audiometry, speech audiometry, monosyllabic words, and categories of auditory performance (CAP), conducted 1, 3, and 5 years after implantation. Statistically significant differences (p < 0.05) for tone audiometry were obtained, particularly after the first and third year post implantation, between the ‘connexin’ and ‘known’ etiology groups. In speech audiometry, the monosyllabic word test, and CAP, the connexin group of children scored significantly better than the two control groups only after 3 and 5 years. Although the rate of excellent performers was higher in the connexin group, poor results were achieved in all groups in similar proportions. Implanted children with GJB2 mutations tended to achieve better functional outcomes than the two control groups, although clear-cut significance was not always reached. Hearing loss etiology may be considered one of the important predictors of postoperative performance, although the complex influence of other factors should be included in cautious individual counseling. Supported by APVV-0148-10.

Keywords: GJB2, hearing loss, children, connexin, performance

Subjects’ expectations and imagery scanning

Rene Sebena, Igor Dudinsky

PJ Safarik University in Kosice, Faculty of Arts, Department of Psychology, Kosice, Slovak Republic

The aim of this study is to demonstrate whether subjects’ scanning times vary with different verbal distance cues. The sample consisted of 40 psychology students at the Faculty of Arts, UPJS in Kosice (mean age 22.3 years, 57.5% female). Subjects’ expectations were manipulated via the initial instructions: in the different groups, subjects were told (1) to expect a positive relationship, or (2) to expect a negative relationship. We found no significant effect of subjects’ expectations on scanning times, F = 0.33, p = 0.57. Scanning time varied as a function of distance in both groups, F(4,40) = 4.68, p < .01 and F(4,40) = 4.36, p < .01, in the positive-expectation and negative-expectation groups, respectively. The present study did not confirm an effect of expectations on subjects’ scanning times.

Cognition of geometry of high-dimensional data – topological structured projection emerging through simulated experiments with self-organized non-Euclidean Manifold learning.

D. Horváth, J. Uličný, B. Brutovský

Department of Biophysics, PJ Safarik University, Kosice

Classical metric and non-metric MDS variants are widely known manifold learning (ML) methods which make it possible to find low-dimensional (2D) representations (projections) of high-dimensional data inputs. However, their use is highly limited to cases when the data are inherently reducible to low dimensionality. In general, the drawbacks and limitations of these MDS variants become more apparent when the exploration (learning) is applied to highly structured and irreducible data structures. Using a series of artificial and real-world datasets, we demonstrate that the over-determination problem can be solved by means of a hybrid multi-component discrete-continuous optimization heuristic. Its remarkable feature is that projections onto 2D are constructed iteratively and simultaneously with a data categorization that compensates for the information loss due to the 2D projection. The search for optimality is performed in a stochastic and self-organized adaptive way. Our extensive computer simulations demonstrate that the optimization may provide decompositions which resemble the standard decomposition of an atlas into charts. Sufficiently slow optimization enables parametric adaptation of local category-dependent distances to specific data. An intuitive view of the 2D projections and their high-dimensional origins suggests that the method presented here has the potential to support comprehensive exploratory data analysis and specific human skills in data perception.
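
For orientation, here is a minimal example of the classical metric MDS baseline that the abstract takes as its starting point, using scikit-learn; the authors' hybrid discrete-continuous heuristic is not reproduced here, and the toy data are invented.

```python
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(2)

# Toy high-dimensional data: three well-separated clusters in 50-D
centers = rng.normal(scale=5.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(30, 50)) for c in centers])

# Metric MDS: find a 2-D embedding whose pairwise distances
# approximate the original high-dimensional distances (stress minimization)
embedding = MDS(n_components=2, random_state=0).fit_transform(X)
print(embedding.shape)  # (90, 2) -- one 2-D point per input sample
```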

Sound-localization in noise performance is determined by sensitivity to spectral shape

Guillaume Andéol1, Ewan Macpherson2, Andrew Sabin3

1 IRBA (Institut de recherche biomédicale des armées), 2 Western University, 3 Northwestern University

Background

Analysis of the spectral shape imposed on an incoming signal by a listener’s head-related transfer functions (HRTFs) is required to localize its source in the up/down and front/back dimensions. Noisy environments both impair sound localization in those dimensions and increase differences in performance across participants. Because noise smooths the spectral shape of the stimulus, the effects of noise on sound localization might be related either to the original prominence of the HRTFs’ spectral shape (spectral strength, an acoustical factor) or to the listener’s sensitivity to spectral shape (a perceptual factor). In the current study we compared these two hypotheses.

Methods

Listeners sat on an elevated chair inside an anechoic room with their head placed at the center of an 8-loudspeaker cubic array. The wideband noise target was presented in quiet or in wideband noise at six different signal-to-noise ratios (from -7.5 to +5 dB in 2.5 dB steps). Listeners had to identify which loudspeaker had emitted the signal. Sensitivity to spectral shape was quantified by spectral-modulation detection thresholds measured with a broadband (0.2–12.8 kHz) or high-frequency (4–16 kHz) carrier and for different spectral modulation frequencies (below 1 cycle/octave, between 1 and 2 cycles/octave, above 2 cycles/octave). Spectral strength was computed as the spectral distance between the magnitude spectrum of the listener’s HRTFs and a flat spectrum.
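
One plausible reading of the "spectral strength" metric described above (my own illustration, not the authors' code) is the RMS deviation of the log-magnitude HRTF spectrum from a flat reference:

```python
import numpy as np

def spectral_strength(hrtf_mag, ref_db=None):
    """Spectral distance between an HRTF magnitude spectrum and a flat
    spectrum, computed as the RMS deviation in dB (one plausible reading
    of the abstract's definition, not the authors' exact formula)."""
    mag_db = 20 * np.log10(np.abs(hrtf_mag))
    flat = np.mean(mag_db) if ref_db is None else ref_db
    return np.sqrt(np.mean((mag_db - flat) ** 2))

# Toy example: a flat spectrum has strength 0; a notched one does not
flat_spectrum = np.ones(256)
notched = flat_spectrum.copy()
notched[100:120] = 0.1   # a 20-dB spectral notch
print(spectral_strength(flat_spectrum))  # 0.0
print(spectral_strength(notched))
```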

Results

Data obtained from 19 normal-hearing listeners showed no correlation between HRTF spectral strength and localization performance. A significant correlation was found, however, between sensitivity to spectral shape for high-frequency carrier/low spectral-modulation frequency and localization performance.

Conclusion

These results suggest that the perceptual ability of the listener, rather than the acoustical properties of the HRTFs, determines sound localization performance in noise.

Adaptation to Room Reverberation in Nonnative Phonetic Training

Eleni L. Vlahou1, Aaron Seitz2, Norbert Kopčo1

1 PJ Safarik University, Kosice, 2 University of California, Riverside

Background

Speech communication often occurs in adverse listening conditions, such as noisy and reverberant environments. Room reverberation distorts the speech signal and hampers intelligibility, an effect particularly pronounced for nonnative listeners (e.g., Nábĕlek and Donahue, 1984, J. Acoust. Soc. Am. 75: 632-634). There is evidence that prior exposure to consistent reverberation is beneficial for native listeners, resulting in improved speech intelligibility (Brandewie & Zahorik, 2010, J. Acoust. Soc. Am. 128: 291-299), but less is known about the patterns of interference and adaptation to room reverberation for nonnative listeners during the acquisition of novel phonetic categories. In the present study we investigate these issues, addressing in particular the differential effects of phonetic training in multiple reverberant rooms versus a single anechoic environment on the perception of nonnative phonemes in anechoic and reverberant conditions.

Methods

Listeners were trained on a difficult dental-retroflex phonetic distinction. Stimuli were CV syllables recorded from a Hindi speaker and were presented in anechoic space or in simulated reverberant environments, crossed with supervised and unsupervised training. Supervised training consisted of a 2AFC task with trial-by-trial performance feedback. Unsupervised training employed a videogame which promoted stimulus-reward contingencies. Before and after training, participants were tested using the trained voice and trained rooms, as well as using an untrained voice and untrained rooms.

Results

When tested with the trained voice, participants showed significant improvements for trained and untrained stimuli and rooms. Exposure to the stimuli in three different rooms vs. exposure only in an anechoic room resulted in similar amounts of learning and generalization to untrained rooms. Supervised training resulted in larger improvements than unsupervised training. No generalization of learning to an untrained voice was observed for either type of room simulation.

Conclusions

The results show that phonetic categorization training of the dental-retroflex distinction in nonnative listeners is robust against variation in room characteristics, but also that it does not benefit from exposure to the stimuli in different reverberant environments. The lack of generalization of learning to an untrained voice suggests that listeners encoded talker-specific, non-phonetic details. While these results confirmed that acquisition of novel phonetic categories for nonnative listeners is robust against reverberation variations, it is likely that the extent to which phonetic learning and reverberation adaptation interact depends on the specific acoustic and phonetic features important for the trained discrimination.

Funding: Work supported by EU FP7-247543, VEGA-1/0492/12.

Audio-visual calibration and plasticity of auditory distance perception

Ľuboš Hládek1, Christophe Le Dantec2, Aaron R Seitz2, Norbert Kopčo1

1 Faculty of Science, P. J. Šafárik University, Košice, Slovakia, 2 Department of Psychology, University of California, Riverside, CA, USA

A previous study [Hladek et al. (2013), Ventriloquism effect and aftereffect in the distance dimension, ICA Montreal, POMA Volume 19, pp. 050042] reported complex patterns of the ventriloquism effect (VE) and aftereffect (VA) in the distance dimension; however, performance with an aligned adaptor was not measured. In the current study we compare auditory distance localization with visually misaligned versus visually aligned adaptors. In a two-day experiment, subjects localized a 300-ms broadband noise in a small reverberant room, presented from one of 8 loudspeakers placed in front of the listener. Trials with an auditory-only (A-only) component were interleaved with trials containing both audio and visual components (V-Farther, V-Closer, V-Aligned). The condition was held constant during a session. Most of the variance previously reported was removed by the inclusion of the V-Aligned condition; however, the VE in the V-Closer condition was stronger than in the V-Farther condition. Immediate VA was stronger than persistent VA, but was constant across conditions and distances. Carry-over effects of both VE and VA were observed. The current results suggest only a modest effect of distance and of the direction of the induced shift on both VE and VA; however, the different time scales of the observed effects suggest the involvement of different neural processes.

This publication is the result of the Project implementation: SOFOS – knowledge and skill development of the academic staff and students at the University of Pavol Jozef Safarik in Kosice with emphasis on interdisciplinary competencies and integration into international research centres, ITMS: 26110230088, supported by the Research & Development Operational Programme funded by the ESF.

We support research activities in Slovakia. This project is being co-financed by the European Union.

Audio-Visual spatial fusion varies with eyes’ position and allocentric visual reference frame.

David Hartnagel, Alain Bichot, Patrick Sandor and Corinne Roumes

Institut de Recherche Biomédicale des Armées, 91223 Brétigny-sur-Orge, France

Space perception implies egocentric and allocentric cues from multiple sensory modalities. In darkness, the position of the eyes in the head affects visual-auditory (VA) fusion in space; the reference frame of VA fusion space is neither head-centered nor eye-centered but is instead the result of an integration phenomenon (Hartnagel et al., 2007). Results in vision research have shown an influence of the visual allocentric reference frame on visual localization. Schmidt et al. (2003) have shown a local distortion effect of visual landmarks, and experiments on the Roelofs effect have shown shifts of localization relative to an asymmetric surrounding display (Dassonville et al., 2004). In the present experiment we investigated the effects of visual allocentric cues on VA fusion space when the egocentric reference frames (eyes and head) were aligned or dissociated. To ensure that the egocentric reference frames were aligned or dissociated, the subject’s head was stabilized with a bite-board and eye position was monitored with an eye-tracker. On a hemi-cylindrical screen hiding 21 loudspeakers in a 2D arrangement, a projector displayed a large permanent green rectangular background (135°H x 80°V); the participant (head and body aligned) was oriented sideways so that the display appeared shifted 15° to the right relative to straight ahead and the surrounding visual frame was asymmetrical (frame offset). Two types of allocentric visual cues were tested: the edges of the visual display and 2 broken lines (vertical/horizontal). A broadband noise burst and a 1° spot of light, both 500 ms in duration, were presented simultaneously with a random 2D spatial disparity. Participants had to judge whether they were unified. Results showed that AV fusion depends mainly on the relative position of the egocentric reference frames (eyes and head) and that the local allocentric reference frame has no significant effect. Comparisons with previous results confirm the importance of the surrounding visual display.

Keywords: Audio; Visual; Fusion; Reference frames; Space

References:

Dassonville, P., Bridgeman, B., Kaur Bala, J., Thiem, P. and Sampanes, A. (2004). The induced Roelofs effect: two visual systems or the shift of a single reference frame? Vision Research 44(6), 603–611.

Hartnagel, D., Bichot, A., and Roumes, C. (2007). Eye position affects audio-visual fusion in darkness. Perception 36(10), 1487–1496.

Schmidt, T., Werner, S., and Diedrichsen, J. (2003). Spatial distortions induced by multiple visual landmarks: How local distortions combine to produce complex distortion patterns. Perception & Psychophysics 65(6), 861.

Modeling of Speech Localization in a Multitalker Mixture Using “Glimpsing” Models of Binaural Processing

Peter Toth1, Angela Josupeit2, Norbert Kopco3, Volker Hohmann2

1 Charles University in Prague, Czech Republic, 2 Carl von Ossietzky Universität Oldenburg, Germany, 3 P.J. Safarik University, Kosice, Slovakia

Background

A recent study measured the human ability to localize a speech target masked by a mixture of four talkers in a room [Kopco et al., JASA 127, 2010, 1450-7]. The presence of maskers resulted in increases in localization errors that depended on the spatial distribution of maskers, the target-to-masker energy ratio (TMR), and the listener’s knowledge of the maskers’ locations. The current study investigated the performance of two binaural auditory “glimpsing” models in simulated experimental conditions. The models were tested under the assumption that optimal information about the TMR in individual spectro-temporal glimpses is available, quantifying the ability of the models to encode spatial properties of complex acoustic scenes.

Methods

The framework for the modeling consisted of: 1. auditory preprocessing, 2. extraction of binaural cues, 3. identifying the “glimpses”, i.e., the spectro-temporal bins dominated by energy from only one source, 4. selecting target-related glimpses based on Ideal Binary Masks, and 5. estimating the target position. Two binaural models, one based on short-term running interaural coherence [Faller and Merimaa, JASA 116, 2004, 3075-89] and one on instantaneous interaural phase differences [Dietz et al., Speech Communication 53, 2011, 592-605], were modified and implemented. The stimuli were simulated by convolving speech tokens from the experiment with binaural room impulse responses recorded in a reverberant space similar to the experimental room.
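
To make the selection-and-pooling logic of stages 3-5 concrete, here is a toy numerical sketch in Python. The per-bin ITD scene, the coherence proxy, and all thresholds are invented for illustration; neither cited model is reproduced.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy time-frequency scene: two sources, each dominating random bins.
# The per-bin interaural time difference (ITD, in ms) carries the cue.
n_bins = 1000
target_itd, masker_itd = 0.3, -0.5           # ground-truth lateral positions
target_dominates = rng.random(n_bins) < 0.4  # oracle ideal binary mask

itd = np.where(target_dominates, target_itd, masker_itd)
itd = itd + rng.normal(scale=0.4, size=n_bins)  # reverberant noise on the cue

# Coherence proxy: bins with badly corrupted cues are not "glimpses"
coherence = np.exp(-np.abs(rng.normal(scale=0.3, size=n_bins)))
glimpses = coherence > 0.8                   # stage 3: select reliable bins

# Stage 4: keep only target-dominated glimpses (ideal binary mask)
selected = glimpses & target_dominates

# Stage 5: pool the cue over the selected glimpses to estimate position
print("estimated target ITD:", np.median(itd[selected]))  # ~0.3 ms
```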

Results

The two models produced similar predictions, both slightly worse than human performance. However, many trends in the data were captured by the models. For example, the mean responses for lateral target locations were medially biased, the RMS errors were smallest for central target locations, and the overall performance varied with TMR. However, there were also qualitative differences. For example, the models predicted best performance near the masker locations, while humans were better at localizing targets far from the maskers.

Conclusion

The tested binaural models were able to capture several characteristics of human performance. Even though each model extracts binaural information in a different way, the model predictions were comparable, suggesting that the extracted features are equivalent and integrated in similar ways. The differences between the model predictions and human performance might be due to differences in interaural level difference processing, across-channel feature integration, or the assumed method of combination of target and masker glimpses.

Funding

Work supported by EU FP7-247543, VEGA-1/0492/12, the DFG SFB/TRR 31 “The Active Auditory System”, and the PhD program “Hearing”.

Spatial Aspects of Contextual Plasticity in Sound Localization

Beata Tomoriova1, Lubos Hladek1, Rudolf Andoga2, Norbert Kopco1

1 P.J. Safarik University of Kosice, Slovakia, 2 Technical University of Kosice, Slovakia

A previous study examining the effect of a preceding distractor on sound localization found that responses were biased even in control trials in which no distractor was presented before the target sound (Kopco et al., JASA, 121, 420-432, 2007). These shifts in no-distractor responses are referred to as “contextual plasticity”. In the current study we examined the spatial aspects of the contextual effect by varying the spatial arrangement of the context. The subject’s task in the experiment was to localize a 2-ms noise burst presented from one of seven target loudspeakers spaced symmetrically relative to the medial plane or the interaural axis. In experimental runs, in 75% of the trials (the distractor trials), the target was preceded by an identical distractor presented from the center of the loudspeaker range 25 ms before the target. The remaining 25% of trials (the no-distractor trials) presented the target alone. Baseline runs contained no distractor trials. Separate experimental runs examined how contextual plasticity was influenced by the distribution of the targets on the distractor trials. In these runs, the distractor-trial targets were presented either from only the three left-most speakers, only the three right-most speakers, or from any of the non-distractor speakers. Contextual biases away from the distractor were found for both medial and lateral distractor locations. The biases depended on the configuration of the distractor trials. The half-range configurations elicited biases in the corresponding part of the range, while no biases were observed for the other half. The full-range configurations elicited smaller biases. These shifts were observed independent of the orientation of the listener relative to the speaker array and of the half-range region examined. These results provide a basic characterization of the neural structure that undergoes the contextual adaptation, and they describe how the spatial specifics of the context affect contextual plasticity.

Talks

Persistent behavioral and neuronal effects of context-reward contingencies on visual search.

Jana Eštočinová

Center of Applied Informatics, Faculty of Science, P.J. Šafárik University in Košice, Košice, Slovakia

Repetition of spatial context leads to more efficient search for targets in distractor-filled displays (the contextual cueing effect). This learning occurs incidentally and implicitly, develops after only a few display repetitions, and persists over several days.

However, reward outcomes can also exert implicit effects that improve a wide range of perceptual and attentional processes, e.g., by prioritizing stimuli which yield positive outcomes when selected. Accordingly, whole target-distractor contexts yielding positive outcomes might enhance contextual-cueing learning.

The current behavioral/fMRI study investigated the interplay of reward and the implicit learning underlying contextual cueing. In a visual search task, half of the spatial configurations were repeated within the task, and the remaining half were newly presented. Crucially, during the training, half of the repeated and novel spatial configurations were highly rewarded and the other half were lowly rewarded. In an fMRI session several days after the training, subjects performed the same task without any reward feedback.

Results showed that high reward speeded visual search of repeated contexts both within the training and the fMRI session. Therefore, past reward guided visual search even on future occasions. This long-lasting benefit was accompanied by reduced activity within the dorsal attention network, and increased activity within the retrosplenial cortex, angular gyrus, and medio-frontal cortex. Altogether, reward-based learning actively applied reward-context memories in order to make optimal use of attentional resources for efficient future visual search.

ACKNOWLEDGMENTS:

[This work is the result of the Project implementation: University Science Park TECHNICOM for Innovation Applications Supported by Knowledge Technology, ITMS: 26220220182, supported by the Research & Development Operational Programme funded by the ERDF.]

We support research activities in Slovakia. This project is being co-financed by the European Union.

Modelling Auditory Streaming using Non-negative Modulation Pattern Tensor Factorisations

Tom Barker

Department of Signal Processing at Tampere University of Technology, Finland

We present a novel method for estimating the likely perceptual organisation of simple alternating tone patterns in human listeners. By training a tensor model representation using features which incorporate both low-frequency modulation rate and phase, a set of components is learned. Test patterns are modelled using these learned components, and the sum of component activations is used to predict either an ‘integrated’ or ‘segregated’ percept.

Dynamical Brain: a signal processing tool for analyzing EEG/fMRI brain scans

Milan Jovovic

P.J.Šafárik University

A dynamical data modeling methodology is developed for multidimensional signal scaling. A signal is decomposed into a coupled structure of binding synergies of spatio-temporal events. In a multi-scale approach, scale-space wave information propagation is utilized by computing stochastic resonances within nucleons of information. In this talk, the brain wave nucleons will be shown for an EEG scan recorded during sleep. A distance measure of the synchronicity of the brain wave patterns will also be explained.

Binaural model of lateralization

Jaroslav Bouse

Czech Technical University in Prague, Faculty of Electrical Engineering, Department of Radioelectronics

We present a phenomenological binaural model which allows the prediction of subjective lateralization. The model consists of a peripheral part taken from the literature (Dau et al., 1996, J. Acoust. Soc. Am., 99:3615-3622) and a binaural part designed by the authors of this study. The binaural part contains two calculation units in each hemisphere; these units mimic the medial and lateral superior olives. The actual lateralization estimates are then determined after a comparison stage emulating the count-comparison principle. Results of subjective experiments investigating the lateralization of pure tones and narrow-band noises were used to verify the model. The pure-tone data were taken from the literature (Yost, 1981, J. Acoust. Soc. Am., 70(2):337-409). The narrow-band noise data were measured in this study on seven normal-hearing listeners. The model shows a very good fit to the mean values of the subjective data.

Effect of streaming on localization with a preceding distractor

Gabriela Andrejková and Norbert Kopčo

Perception and Cognition Lab, P. J. Šafárik University in Košice

This study presents a new analysis of behavioral data from a follow-up experiment based on previously published data [Kopčo, Best, Shinn-Cunningham (2005). “Click versus Click-Click: Influence of a Preceding Stimulus on Sound Localization,” Association for Research in Otolaryngology abstract]. In the experiment, the sound localization ability of human listeners was examined for click stimuli in the horizontal plane. The targets were preceded either by a grouping (1-click) or a streaming (8-click) distractor, presented from a frontal or a lateral position relative to the subject. The experiment was performed in anechoic and reverberant environments. The presence of the 1-click distractor displaced target responses towards the distractor and increased response variance, similar to the results of [Kopčo, Best, Shinn-Cunningham: Sound localization with a preceding distractor, JASA, 121]. To test whether some of these effects are due to perceptual grouping of the stimuli, we examined the 8-click streaming distractor, which was expected to lead to reductions in biases and variance. This hypothesis was confirmed for the stimuli in the reverberant environment, but not in the anechoic one. In addition, we examined the contextual effect of the distractor trials on the baseline target-only trials. A bias away from the distractor was observed with the 1-click distractor. This effect became stronger with the 8-click distractor, suggesting that the contextual effect is not related to perceptual grouping mechanisms. These results illustrate that complex perceptual interactions can be expected even in simple sound localization tasks.

ACKNOWLEDGMENT: This contribution was supported by the Slovak Research and Development Agency under grant APVV-0452-12, Spatial Attention and Listening in Complex Environments.

Analysis of non-auditory effects on contextual plasticity in spatial hearing

Ľuboš Marcinek1, Norbert Kopčo2

1 Technical University of Košice, 2 P. J. Šafárik University in Košice

The perceptual systems in the human brain are adaptable. This plasticity is critical for our ability to survive in variable environments. This work explores the properties of one specific type of plasticity in spatial auditory perception, called contextual plasticity. The effect is observed as a bias in responses when target-only localization trials are interleaved with trials in which the target is preceded by a distractor. Specifically, the work explores how non-auditory factors such as vision, top-down factors, and motor representation affect plastic changes in spatial hearing. The impact of these factors was explored through computational analyses of previously collected behavioural data from two experiments: analyses of response biases, temporal profiles, correlation coefficients, and standard deviations. A complex mixture of effects was observed, showing that vision, motor representation, and top-down factors all influence plasticity in spatial auditory processing. Moreover, the effect was shown not to be influenced by the ordering of the stimuli at large stimulus onset asynchronies (SOAs).
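
A minimal sketch of the kind of bias and standard-deviation analysis listed above, on synthetic data: the contextual bias is estimated as the shift of mean responses on target-only baseline trials in the distractor context relative to a no-distractor control. The target location, sample sizes, and noise levels are assumptions, not the experiments' data.

    # Hedged sketch of a contextual-bias analysis (synthetic data, not the
    # study's): compare baseline target-only responses collected in runs
    # with interleaved distractor trials vs. a no-distractor control.
    import numpy as np

    rng = np.random.default_rng(1)
    target_azimuth = 45.0                          # deg; assumed target location

    # Synthetic response azimuths on baseline (target-only) trials.
    resp_context = target_azimuth + 4 + 6 * rng.standard_normal(50)
    resp_control = target_azimuth + 0 + 6 * rng.standard_normal(50)

    bias = resp_context.mean() - resp_control.mean()   # contextual bias (deg)
    sd_context = resp_context.std(ddof=1)              # response variability
    sd_control = resp_control.std(ddof=1)

    print(f'contextual bias = {bias:.1f} deg, '
          f'SD = {sd_context:.1f} vs {sd_control:.1f} deg')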

Work supported by: VEGA-1/0492/12

Perceptual and neuronal patterns of auditory distance processing in humans

Norbert Kopčo and Jana Eštočinová
P. J. Šafárik University in Košice, Slovakia
norbert.kopco@upjs.sk, jana.estocinova@upjs.sk

The estimation of auditory distance is typically dominated by the overall received stimulus intensity. However, auditory distance perception can also be guided by intensity-independent cues. Specifically, interaural level differences (ILDs) provide distance information for lateral stimuli and, in reverberant space, the direct-to-reverberant energy ratio (DRR) cue provides distance information for sources from all directions. In a behavioral experiment, Kopco et al. (2012) confirmed that listeners used these cues to estimate nearby-source distance in the absence of the intensity cue. Then, in an adaptation fMRI experiment, Kopco et al. identified a brain area in the left superior temporal gyrus and planum temporale that represents the neural basis of auditory distance perception.

The current study is part of an effort to separate the brain areas that extract the individual distance cues from those holding the spatial distance representation. To this end, we performed a series of behavioral experiments in which we manipulated the ILD and DRR cues. Specifically, we compared performance with some of the cues eliminated, presented congruently, or presented incongruently. Preliminary results suggest that cue weighting is strongly adaptive, with very low weight placed on DRR after the subject is exposed to stimuli with congruent ILD and intensity cues. Also, diotic DRR-based performance was found to be better than monaural DRR-based performance, even though the directional percept was more consistent with a real situation in the latter condition. These results help elucidate the psychophysical basis of auditory distance perception; they will be used to select the stimuli for fMRI experiments that examine the underlying neural mechanisms.
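
For readers unfamiliar with the two cues, the sketch below shows one common way to quantify them from signals; the ear signals, the impulse response, and the 2.5 ms direct/reverberant split point are illustrative assumptions, not the study's stimuli.

    # Hedged sketch of quantifying the two intensity-independent cues.
    # ILD compares broadband energy across the two ears; DRR compares the
    # direct-path energy of a room impulse response with the later
    # reverberant energy. All signals below are synthetic placeholders.
    import numpy as np

    fs = 44100
    rng = np.random.default_rng(2)

    def energy_db_ratio(a, b):
        return 10.0 * np.log10(np.sum(a**2) / np.sum(b**2))

    # ILD: energy ratio between left- and right-ear signals.
    left = rng.standard_normal(fs)          # placeholder ear signals
    right = 0.5 * rng.standard_normal(fs)
    ild_db = energy_db_ratio(left, right)

    # DRR: split a room impulse response shortly after the direct sound.
    rir = np.exp(-np.linspace(0, 8, fs)) * rng.standard_normal(fs)
    split = int(0.0025 * fs)                # 2.5 ms split point (assumed)
    drr_db = energy_db_ratio(rir[:split], rir[split:])

    print(f'ILD = {ild_db:.1f} dB, DRR = {drr_db:.1f} dB')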

ACKNOWLEDGMENTS:

[This work is the result of the Project implementation: University Science Park TECHNICOM for Innovation Applications Supported by Knowledge Technology, ITMS: 26220220182, supported by the Research & Development Operational Programme funded by the ERDF.]

[This contribution was supported by Slovak Research and Development Agency under grant project APVV-0452-12.]

We support research activities in Slovakia. This project is being co-financed by the European Union.

Visual Adaptation and Spatial Auditory Processing

Peter Lokša, Norbert Kopčo
P. J. Šafárik University in Košice, Slovakia

Sensory information from one modality (e.g., audition) can be affected by stimuli from other modalities (e.g., vision), a phenomenon known as cross-modal interaction. The best-known cross-modal interactions include the ventriloquism effect and the ventriloquism aftereffect, in which the perceived location of an auditory target is shifted when the stimulus is presented simultaneously with a spatially displaced visual stimulus. A previous study of the reference frame of the ventriloquism aftereffect showed that 1) a ventriloquism effect corresponding to 80% of the AV displacement can be induced locally, 2) a ventriloquism aftereffect corresponding to 50% of the AV displacement is observed for auditory-only stimuli, and 3) the reference frame of the aftereffect is a mixture of eye-centered and head-centered coordinate frames [Kopčo, Lin, Shinn-Cunningham, Groh (2009). Reference frame of the ventriloquism aftereffect. Journal of Neuroscience, 29(44):13809-13814]. Another recent study [Wozny, Shams (2011). Recalibration of auditory space following milliseconds of cross-modal discrepancy. Journal of Neuroscience, 31(12):4607-4612] observed an ultra-fast adaptation effect that has a very quick onset and fades away after a few seconds. Here, a dissertation project is presented that will use the data of Kopco et al. (2011) to analyze a new visually-induced auditory adaptation phenomenon and the reference frame of the ultra-fast adaptation.
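
A small worked example of how aftereffect magnitudes like the 50% figure above are computed: the shift in auditory-only responses after adaptation is divided by the audio-visual displacement used during adaptation. All numbers below are hypothetical.

    # Hedged arithmetic sketch (synthetic numbers): aftereffect size as a
    # percentage of the audio-visual displacement used during adaptation.
    av_displacement = 8.0   # deg; visual stimulus displaced from the sound
    pre_mean = 45.0         # mean auditory-only response before adaptation (deg)
    post_mean = 49.0        # mean auditory-only response after adaptation (deg)

    aftereffect = 100.0 * (post_mean - pre_mean) / av_displacement
    print(f'{aftereffect:.0f}% of the AV displacement')   # 50%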

Work supported by: VEGA-1/0492/12, APVV-0452-12

Funding

This workshop / lecture series is organized within the Project implementation: SOFOS – knowledge and skill development of the academic staff and students at the University of Pavol Jozef Safarik in Kosice with emphasis on interdisciplinary competencies and integration into international research centres, ITMS: 26110230088, supported by the Research & Development Operational Programme funded by the ESF.

Modern education for knowledge society / This project is being co-financed by the European Union
