Dr. Theodore has received a 3-year grant, titled “Determinants of perceptual learning for speech perception,” from the Division of Behavioral and Cognitive Sciences of the National Science Foundation. The activities will be completed in collaboration with Dr. Matt Winn at the University of Minnesota. The public abstract is shown below.
Abstract: One of the great mysteries in the cognitive sciences is how humans understand speech despite extreme variation in the actual acoustics of the sounds produced. Speakers rarely produce identical versions of the same speech sound, because everyone has a personal way of talking depending on their dialect, gender, and age. Other factors, such as fatigue or excitement, might change moment-to-moment. Although listeners are usually able to adapt to these differences, research has yet to discover exactly how this adaptation works. This project establishes a new tool for creating experimental stimuli that can generate the wide variation that occurs in natural speech and uses this tool to examine how listeners adapt to talker differences. The experiments test the ways that listeners incorporate new evidence from a talker’s voice, how they use existing knowledge about the speech sounds in their language, how they learn what to ignore, and how individuals may differ in their ability to adapt to a talker. The project has potentially broad societal benefits because it will provide foundational knowledge on how people communicate with each other and provide hands-on education for undergraduate and graduate trainees who will develop skills for experimental design, computer programming, data analysis, and science communication. The project will also create tools for other researchers, including an open-access stimulus corpus and brief lecture videos and assignments that can be used to teach students and others about the physics of sound, thus promoting scientific literacy among the public. All materials associated with this project will be made publicly available.
This interdisciplinary project unites psycholinguistic experimentation, signal processing, and computational modeling to test three key parts of a Bayesian belief-updating model of speech adaptation: (1) beliefs reflect listeners’ knowledge of the relationship between cue distributions and phonetic categories, (2) adaptation reflects integration of observed evidence with prior beliefs that can be driven by both unsupervised and supervised learning signals, and (3) belief-updating is context-specific (e.g., conditioned on talker). The project will establish a novel method for creating fricative continua, create an extensive stimulus corpus, and compare learning between the stimulus corpus and natural stimuli. The novel method of stimulus creation will be used to generate sounds that independently manipulate frequency scaling (reflecting vocal tract size) and gain properties (reflecting specific speech articulations) of fricative cues to discover which specific cues guide learning and how that information generalizes (or not) across talkers. The experiments will also examine how listeners integrate unsupervised and lexically supervised learning signals over time, providing insight into the influence of prior exposure on lexically guided learning and individual differences in learning. These are important innovations because they address two weaknesses in the existing perceptual learning literature, including the practice of superimposing two sounds to create ambiguous input that might be unnatural and the failure to measure incremental change in perception over time. Theoretical benefits include a strong test of the Bayesian belief-updating model of speech adaptation, identification of specific cues that drive perceptual learning, and unpacking how perceptual learning integrates unsupervised and lexically supervised signals.
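For readers unfamiliar with belief updating, the idea that adaptation integrates observed evidence with prior beliefs, separately for each talker, can be illustrated with a minimal sketch. This is not the project's model. It assumes a single acoustic cue, a Gaussian prior over the cue's category mean, and a known noise variance. The names `Belief`, `update`, and `adapt`, and all values in Hz, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    """Gaussian belief about the mean of one cue for one phonetic category."""
    mean: float      # believed category mean of the cue (hypothetical units: Hz)
    variance: float  # uncertainty about that mean


def update(belief: Belief, observation: float, noise_variance: float) -> Belief:
    """Combine a prior belief with one observed cue value.

    Standard normal-normal conjugate update with known noise variance:
    precisions (inverse variances) add, and the posterior mean is the
    precision-weighted average of the prior mean and the observation.
    """
    prior_precision = 1.0 / belief.variance
    obs_precision = 1.0 / noise_variance
    post_variance = 1.0 / (prior_precision + obs_precision)
    post_mean = post_variance * (
        prior_precision * belief.mean + obs_precision * observation
    )
    return Belief(post_mean, post_variance)


def adapt(prior: Belief, observations: list[float], noise_variance: float) -> list[Belief]:
    """Return the belief after each observation, tracking incremental change."""
    trajectory = []
    belief = prior
    for x in observations:
        belief = update(belief, x, noise_variance)
        trajectory.append(belief)
    return trajectory


# Hypothetical shared prior for a fricative cue, before hearing any talker.
shared_prior = Belief(mean=6000.0, variance=250_000.0)

# Context-specific updating: each talker gets a separate belief that starts
# from the shared prior and is updated only by that talker's speech.
exposure = {
    "talker_A": [5000.0, 5000.0],  # atypically low cue values
    "talker_B": [6100.0],          # close to the prior
}
beliefs = {
    talker: adapt(shared_prior, cues, noise_variance=250_000.0)[-1]
    for talker, cues in exposure.items()
}
```

In this sketch, the belief for `talker_A` moves toward the low cue values with each observation and becomes more certain, while the belief for `talker_B` is unaffected by what `talker_A` produced.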