Speech Perception Laboratory

Abstracts of Representative Conference Presentations

Theodore, R.M., and Miller, J.L. (2008). Characteristics of listener sensitivity to talker-specific phonetic detail. Poster presented at the 156th meeting of the Acoustical Society of America, Miami, Florida.

Listeners are sensitive to talker differences in phonetic properties of speech, including voice-onset-time (VOT) in word-initial stop consonants. Earlier findings from our laboratory [R. M. Theodore & J. L. Miller, J. Acoust. Soc. Am. 123, 3934 (2008)] indicate that learning how a talker produces one voiceless stop (e.g., /p/ in pain) transfers to another voiceless stop (e.g., /k/ in cane), providing support for feature-based processing of VOT at the level of individual talkers. Here we examined possible constraints on such processing by asking whether transfer would also occur when the learning and transfer words were not minimal pairs. In familiarization phases, listeners heard two talkers produce pain. Critically, word-initial VOTs were manipulated such that one talker produced pain with relatively short VOTs and the other talker produced pain with relatively long VOTs. In test phases, listeners were presented with a short-VOT and a long-VOT variant of coal produced by each talker, and were asked to select which variant was more representative of the talker. Results showed that the listeners selected the VOT variant of coal in line with their previous exposure to pain, indicating that feature-based processing of talker-specific VOT is robust.


Theodore, R.M., and Miller, J.L. (2008). Listeners' sensitivity to talker differences in voice-onset-time: Segments versus features. Talk presented at Acoustics '08 (a joint meeting of the Acoustical Society of America, the European Acoustics Association, and the Société Française d'Acoustique), Paris, France.

Recent findings indicate that listeners are sensitive to talker differences in phonetic properties of speech, including voice-onset-time (VOT) in word-initial voiceless stop consonants.  Here we extend earlier findings from our laboratory [J. S. Allen & J. L. Miller, J. Acoust. Soc. Am. 115, 3171-3183 (2004)] by examining the level of representation underlying this sensitivity.  In familiarization phases, listeners heard two talkers produce pain.  Critically, word-initial VOTs were manipulated such that one talker produced short VOTs and the other talker produced long VOTs.  In test phases, listeners were presented with a short-VOT and long-VOT variant of either pain or cane; in both cases, listeners were asked to select which of the two VOT variants was most representative of a given talker.  Results to date indicate that which variant of pain is selected at test is in line with listeners' exposure during training (replicating earlier findings), and that this effect holds even when listeners are tested on cane, which begins with a different voiceless stop than heard during training.  These results suggest that listeners are sensitive to talker differences in VOT at the level of a phonetic feature, rather than at the level of a particular phonetic segment.


Theodore, R.M., Miller, J.L., and DeSteno, D. (2007). The effect of speaking rate on voice-onset-time is talker-specific. Talk presented at the XVIth ICPhS, Saarbrücken, Germany.  [In J. Trouvain & W. J. Barry (Eds.), Proceedings of the XVIth International Congress of Phonetic Sciences, pp. 473-476.]

Talkers differ in phonetic properties of speech.  One such property is voice-onset-time (VOT), an important marker of the voicing contrast in English stop consonants.  Research has shown that VOT is affected by speaking rate: for any given talker, VOT increases as rate slows.  The current work examines whether this contextual influence varies across talkers.  Many tokens of /ti/ (Experiment 1) or /pi/ and /ki/ (Experiment 2) were elicited from talkers across a range of rates.  VOT and syllable duration were measured for each token.  The results showed that although VOT increased as rate slowed for all talkers, the extent of this increase varied significantly across talkers.  For a given talker, however, the extent of the increase was stable across a change in place of articulation.  These findings suggest that talker differences in phonetic properties of speech reflect talker-specific contextual influences. 


Theodore, R.M., Miller, J.L., and DeSteno, D. (2007). Talker-specific contextual influences on voice-onset-time. Poster presented at the 153rd meeting of the Acoustical Society of America, Salt Lake City, Utah.

Research has shown robust contextual influences on voice-onset-time (VOT) in speech production. The current work examines talker-specificity for two such cases: speaking rate (VOT increases as syllable duration increases) and place of articulation (VOT increases as place moves from anterior to posterior position). Tokens of /pi/ (labial) and /ki/ (velar) were elicited from talkers across a range of rates. VOT and syllable duration were measured for each token. For each talker, separate labial and velar linear functions relating VOT to syllable duration were calculated. Ongoing analyses indicate that: (1) For both the labial and velar functions there is significant variability across talkers’ slopes [see also Theodore et al., J. Acoust. Soc. Am. 120, 3293 (2006)], but there is no significant variability in the difference between labial and velar slopes for a given talker. Thus the effect of speaking rate is talker-specific, and stable across place of articulation. (2) For each talker, the velar intercept is located at a longer VOT than the labial intercept, with significant variability in the magnitude of displacement across talkers. Thus the effect of place is also talker-specific. These findings support the view that phonetic properties of speech include talker-specific contextual influences.


Theodore, R.M., Miller, J.L., and DeSteno, D. (2006). Effect of speaking rate on individual talker differences in voice-onset-time. Poster presented at the 152nd meeting of the Acoustical Society of America, Honolulu, Hawaii.

Recent findings indicate that individual talkers systematically differ in phonetically relevant properties of speech. One such property is voice-onset-time (VOT) in word-initial voiceless stop consonants: at a given rate of speech, some talkers have longer VOTs than others. It is also known that for any given talker, VOT increases as speaking rate slows. We examined whether the pattern of individual differences in VOT holds across variation in rate. For example, if a given talker has relatively short VOTs at one rate, does that talker also have relatively short VOTs at a different rate? Numerous tokens of /ti/ were elicited from ten talkers across a range of rates using a magnitude-production procedure. VOT and syllable duration (a metric of speaking rate) were measured for each token. As expected, VOT increased as syllable duration increased (i.e., rate slowed) for each talker. However, the slopes as well as the intercepts of the functions relating VOT to syllable duration differed significantly across talkers. As a consequence, a talker with relatively short VOTs at one rate could have relatively long VOTs at another rate. Thus the pattern of individual talker differences in VOT is rate dependent.
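The rate dependence described in this abstract can be illustrated with a minimal sketch. It assumes a simple per-talker least-squares line relating VOT to syllable duration; the talker labels, token values, and function names below are hypothetical and chosen only for illustration. They are not data or analysis code from the study, and the abstract does not state how the functions were fitted.

```python
from statistics import mean

def fit_line(durations, vots):
    """Ordinary least-squares fit of VOT = intercept + slope * duration."""
    d_bar, v_bar = mean(durations), mean(vots)
    sxx = sum((d - d_bar) ** 2 for d in durations)
    sxy = sum((d - d_bar) * (v - v_bar) for d, v in zip(durations, vots))
    slope = sxy / sxx
    return slope, v_bar - slope * d_bar

# Hypothetical (syllable duration in ms, VOT in ms) tokens, NOT study data.
tokens = {
    "talker_A": [(200, 50), (400, 70), (600, 90)],   # shallow slope, high intercept
    "talker_B": [(200, 40), (400, 80), (600, 120)],  # steep slope, low intercept
}

# One (slope, intercept) pair per talker.
fits = {talker: fit_line(*zip(*pairs)) for talker, pairs in tokens.items()}

def predicted_vot(talker, duration_ms):
    """VOT predicted by the talker's fitted line at a given syllable duration."""
    slope, intercept = fits[talker]
    return intercept + slope * duration_ms

for duration in (200, 600):
    a = predicted_vot("talker_A", duration)
    b = predicted_vot("talker_B", duration)
    shorter = "talker_A" if a < b else "talker_B"
    print(f"{duration} ms syllable: A={a:.1f} ms, B={b:.1f} ms, shorter VOT: {shorter}")
```

Because the two hypothetical lines differ in both slope and intercept, they cross, so the talker with the shorter VOT at a fast rate (short syllables) has the longer VOT at a slow rate (long syllables).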


Mondini, M., and Miller, J.L. (2004). Perceiving non-native speech: Word Segmentation. Poster presented at the 147th meeting of the Acoustical Society of America, New York, New York.

One important source of information listeners use to segment speech into discrete words is allophonic variation at word junctures.  Previous research has shown that non-native speakers impose their native-language phonetic norms on their second language; as a consequence, non-native speech may (in some cases) exhibit altered patterns of allophonic variation at word junctures.  We investigated the perceptual consequences of this for word segmentation by presenting native-English listeners with English word pairs produced either by six native-English speakers or six highly fluent, native-French speakers of English.  The target word pairs had contrastive word juncture involving voiceless stop consonants (e.g., why pink/ wipe ink; gray ties/ great eyes; we cash/ weak ash).  The task was to identify randomized instances of each individual target word pair (as well as control pairs) by selecting one of four possible choices (e.g., why pink, wipe ink, why ink, wipe pink).  Overall, listeners were more accurate in identifying target word pairs produced by the native-English speakers than by the non-native English speakers. These findings suggest that one contribution to the processing cost associated with listening to non-native speech may be the presence of altered allophonic information important for word segmentation.


Miller, J.L., Mondini, M., Grosjean, F., and Dommergues, J-Y. (2003). Dialect effects in speech perception: Standard (Parisian) French and Swiss French. Poster presented at the 44th meeting of The Psychonomic Society, Inc., Vancouver, Canada.

Languages differ in the relative importance of given acoustic-phonetic properties in specifying phonological contrasts. Earlier we reported a comparable effect for dialects: Native speakers of Swiss French, but not native speakers of standard French, used vowel duration when identifying a vowel contrast (Miller & Grosjean, 1997). In the present study we found that this effect is not limited to identification, but also involves which tokens listeners perceive to be the best exemplars of the two vowel categories. For native speakers of Swiss French, the best exemplars of the vowels differed substantially in duration, whereas for native speakers of standard French, they differed only minimally. This pattern closely reflects differences in how native speakers of the two dialects produce the vowels (Miller et al., 2000). These findings provide further evidence that listeners use acoustic-phonetic information in a dialect-specific manner when mapping the acoustic signal onto the phonological categories of their language.


Mondini, M., van Alphen, P.M., and Miller, J.L. (2002). Native-language influence on phonetic perception in Dutch-English bilinguals. Poster presented at the 144th meeting of the Acoustical Society of America, Cancun, Mexico.

We examined how native-language experience influences processing a second language, focusing on how native Dutch listeners who learned English as a second language perceive the English voiceless consonant /p/. Previous research [J.E. Flege and W. Eefting, Speech Commun., 6, 185-202 (1987)] shows that the voiced-voiceless boundary for an (English-based) voice-onset-time (VOT) series is located at a shorter VOT for such bilingual listeners than for native English listeners, consistent with the fact that voiceless stops are produced with shorter VOTs in Dutch than in English. We asked whether such bilinguals also differ from native English listeners in which stimuli throughout the series are perceived as reasonable exemplars of /p/. Native English listeners and native Dutch listeners were tested on a three-choice identification task with an (English-based) extended VOT series that ranged from /ba/ to /pa/ to an "unnatural" exaggerated /pa/, labeled */pa/. Both the /b/-/p/ and /p/-*/p/ boundaries were located at shorter VOTs for the native Dutch than the native English listeners, indicating that Dutch native-language experience influenced the entire range of VOTs perceived as reasonable exemplars of the /p/ category. Thus native-language experience has a comprehensive influence on the mapping from acoustic signal to phonetic category.


Miller, J.L. (2002). Internal structure of phonetic categories: Some characteristics and constraints. Talk presented at the 143rd meeting of the Acoustical Society of America, Pittsburgh, PA.

A widely held assumption in the speech perception literature for many years was that during the course of processing listeners derive an abstract phonetic representation and, in doing so, discard information about the fine-grained detail of the speech signal. However, more recent research has shown that the representations of speech are much richer than this emphasis on abstract categories would suggest, and that listeners retain in memory a substantial amount of fine-grained acoustic-phonetic information. One line of evidence for the richness of phonetic representations comes from research showing that phonetic categories are internally structured in a graded fashion, with some members of the category perceived as better exemplars (as more "prototypical") than others. In this talk I will describe findings from our research program that highlight some of the characteristics of these internally structured categories, and discuss how these characteristics place constraints on models of phonetic perception.


Brancazio, L., Miller, J.L. and Mondini, M. (2002). Audiovisual integration in the absence of a McGurk effect. Poster presented at the 143rd meeting of the Acoustical Society of America, Pittsburgh, PA.

The McGurk effect, a change in perceived place of articulation due to an incongruent visual stimulus (e.g., auditory /pi/ with visual /ti/ perceived as /ti/), demonstrates the contribution of vision to speech perception. Interestingly, in a given experiment the McGurk effect typically does not occur on every trial. We investigated whether non-McGurk trials result from a failure to perceptually integrate auditory and visual information by simultaneously manipulating visual place of articulation and visual speaking rate. Previous work [Green & Miller, Perc. Psychophys., 38, 269-276 (1985)] has shown that the boundary along an auditory /bi/-/pi/ voice-onset-time (VOT) continuum occurs at a longer VOT when the auditory stimulus is paired with a slow rather than a fast visual /pi/. We paired stimuli from an auditory /bi/-/pi/ continuum with fast and slow versions of a visual /ti/, and subjects identified each item as /b/, /p/, /d/, or /t/. We found a rate effect on McGurk trials, with the /d/-/t/ boundary occurring at a longer VOT when the visual stimulus was slow rather than fast. Importantly, we found a comparable rate effect for the /b/-/p/ boundary on non-McGurk trials. This indicates that audiovisual integration occurs even in the absence of a McGurk effect.


Miller, J.L., Mondini, M., Grosjean, F., and Dommergues, J-Y. (2000). Dialect differences in the temporal characteristics of vowels: A comparison of standard (Parisian) and Swiss French. Poster presented at the 140th meeting of the Acoustical Society of America, Newport Beach, CA.

Earlier we reported a dialect difference in the use of temporal information for vowel perception: Native speakers of Swiss French used temporal as well as spectral information when identifying /o/ versus /ɔ/, whereas native speakers of standard (Parisian) French used only spectral information [J.L. Miller & F. Grosjean, Language and Speech, 40, 277-288 (1997)]. We interpreted this dialect difference in terms of the more prominent role that vowel duration plays overall in the phonological system of Swiss French compared to standard French. To investigate further the basis of the dialect effect, we have been measuring the duration of /o/ and /ɔ/ in monosyllabic words for native speakers of the two dialects. Our findings to date indicate a robust dialect effect in production: The duration difference between /o/ and /ɔ/ is substantially larger and more consistent in Swiss French than in standard French. Thus the perceptual dialect effect for /o/ and /ɔ/ we reported earlier reflects both a specific difference in the temporal characteristics of this vowel pair and an overall difference in the role of vowel duration in the phonological systems of the two dialects.