The Handbook of Speech Perception. Multiple authors

representations to lexical representations and for ultimately accessing the lexical semantic network.

       Speech perception

      Using more sensitive measures than phonetic categorization, including reaction time (Pisoni & Tash, 1974) and judgments of category goodness (Miller, 1997; Iverson & Kuhl, 1996), speech‐perception studies have shown that, as stimuli on a continuum approach the phonetic category boundary, listeners take longer to identify them and judge them to be poorer members of the category. Variations in task requirements support the robustness of these gradient effects (e.g. Carney, Widin, & Viemeister, 1977). Thus, phonetic categories are graded and have an internal structure (see Miller, 1997). In this sense, categories are not truly binary representations that are either present or absent; rather, some exemplars of a category are better representations of the category than others.

      Such findings support a functional architecture in which the degree of activation of a representation is itself graded and, in turn, influences the degree of activation of potential competitors. As a stimulus approaches the phonetic category boundary, its activation decreases and there is a concomitant increase in the activation of, and hence the extent of competition with, the contrasting phonetic category representation. For example, assume a [da]–[ta] continuum ranging in 10 ms steps from 0–40 ms voice onset time (VOT) with a category boundary of 20 ms. As described earlier, there is competition between stimuli that share acoustic properties. Thus, presentation of a 40 ms stimulus (perceived as a [t]) would compete with the representation of the contrasting voiced phonetic category [d]. However, a stimulus with a VOT of 30 ms is a poorer exemplar of the voiceless phonetic category, and thus not only does it activate the phonetic representation of [t] more weakly, but there is an increase in the activation of the contrasting voiced phonetic category [d] (see Blumstein, Myers, & Rissman, 2005).
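
      The graded activation and competition described for the 0–40 ms continuum can be illustrated with a toy calculation. The sketch below is not a model proposed in the chapter or in the cited studies. It assumes, purely for illustration, that activation of the voiceless category [t] follows a logistic function of VOT centered on the 20 ms boundary, and that activation of the competing voiced category [d] is its complement. The function names, the slope value, and the assumption that the two activations sum to one are all hypothetical.

```python
import math

BOUNDARY_MS = 20.0   # category boundary from the example in the text
SLOPE_PER_MS = 0.15  # hypothetical steepness; not taken from any study


def voiceless_activation(vot_ms: float) -> float:
    """Illustrative activation of [t] as a logistic function of VOT."""
    return 1.0 / (1.0 + math.exp(-SLOPE_PER_MS * (vot_ms - BOUNDARY_MS)))


def activations(vot_ms: float) -> dict:
    """Return illustrative activations for [t] and its competitor [d].

    Assumes (for illustration only) that the two activations sum to one.
    """
    t = voiceless_activation(vot_ms)
    return {"t": t, "d": 1.0 - t}


if __name__ == "__main__":
    # Walk the continuum in 10 ms steps, as in the example in the text.
    for vot in range(0, 41, 10):
        a = activations(vot)
        print(f"VOT {vot:2d} ms  [d]={a['d']:.2f}  [t]={a['t']:.2f}")
```

      Under these assumptions, the 30 ms stimulus activates [t] more weakly than the 40 ms stimulus and activates the competitor [d] more strongly, while a stimulus at the 20 ms boundary activates both categories equally. Only this qualitative pattern is meant to mirror the text; the specific values depend entirely on the assumed slope.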

      Neural evidence also supports the gradient nature of phonetic categories. Both temporal and frontal areas show graded responses as a function of the goodness of the phonetic category input, with the least activation for the best exemplar of the phonetic category and increased activation as stimuli on a continuum approach the phonetic category boundary (Blumstein, Myers, & Rissman, 2005; Frye et al., 2007; Guenther et al., 2004). Importantly, other neural areas (middle frontal gyrus, supramarginal gyrus) fail to show such graded activation, displaying sensitivity only to between‐phonetic‐category and not to within‐phonetic‐category differences (Joanisse, Zevin, & McCandliss, 2007; Myers et al., 2009). That there is both graded and categorical perception of phonetic categories reflects two critical aspects of speech perception: the need for sensitivity to fine acoustic differences on the one hand, and sensitivity to category membership on the other. We will return to this point in the Conclusion of this chapter.

       Lexical access

      Using the visual world paradigm, it has been shown that access to lexical representations is affected by the fine acoustic structure of the auditory input. In particular, looks to a visual target are affected by within‐phonetic‐category acoustic differences (McMurray, Tanenhaus, & Aslin, 2002; see also McMurray, Tanenhaus, & Aslin, 2009). In the 2002 study of McMurray and colleagues, eye movements were measured as participants identified a named target word (using a mouse click) from an array of four pictures. The pictures depicted a target word whose name began with a labial stop consonant (for example, pear), the minimal‐pair competitor of the target (bear), and two phonologically unrelated words (for example, lamp and ship). The auditorily presented names varied along a [b]–[p] VOT continuum ranging from 0–40 ms in 5 ms steps. Results showed graded responses, with increasing looks to the minimal‐pair competitor (i.e. bear) as the acoustic‐phonetic input (a VOT variant of [p]) approached the phonetic boundary. These findings show that lexical access is indeed graded.

      Perhaps stronger evidence of the effects of within‐phonetic‐category differences on access to higher levels of processing comes from studies showing that within‐phonetic‐category effects not only influence access to lexical representations but also cascade to the lexical semantic network. Examining semantic priming in a lexical decision task, Andruski, Blumstein, and Burton (1994) presented prime words semantically related to a target stimulus. The initial voiceless stop consonant of the prime was either a good exemplar (spoken naturally and acoustically unmodified) or a poorer exemplar of the voiceless stop phonetic category (the VOT was reduced by one third or two thirds). Shortening the VOT of the stimuli rendered them closer to the voiced phonetic category boundary. Importantly, pilot work showed that stimuli presented alone were perceived correctly as beginning with voiceless stop consonants. However, the reduction of the VOT for the acoustically modified stimuli resulted in a reduction in the magnitude of semantic priming relative to the unmodified prime stimuli, particularly for the primes reduced by two thirds.

       The motor theory of speech perception

      We turn now to an unresolved question: What is the nature of feature representations? The crux of the problem turns on variability in the phonetic input. As indicated at the beginning of this chapter, there are many sources of variability that affect and influence the ultimate speech input that the listener receives. The question is whether, despite this variability, there are patterns (articulatory or acoustic) that provide a stable mapping from acoustic input to features and ultimately phonetic categories. At this point, no one has solved this invariance problem, that is, the transformation of a variable input into a constant feature or phonetic category representation. Even if one were to assume that lexical representations are episodic, containing fine‐structure acoustic differences that are used by the listener, as has been proposed by Goldinger (1998) and others, such a view still begs the question.
