Cognitive Neuroscience and Neuropsychology
NeuroReport 8, 1761–1765 (1997)

Left-hemisphere specialization for the processing of acoustic transients

Ingrid S. Johnsrude,CA Robert J. Zatorre, Brenda A. Milner and Alan C. Evans1

Neuropsychology/Cognitive Neuroscience Unit and 1McConnell Brain Imaging Centre, Montreal Neurological Institute, McGill University, 3801 University Street, Montreal, Canada H3A 2B4

CA Corresponding Author

© Rapid Science Publishers

PREVIOUS work suggests that speech sounds incorporating short-duration spectral changes (such as the formant transitions of stop consonants) rely on left hemisphere mechanisms for their adequate processing to a greater degree than do speech sounds incorporating spectral changes of a longer duration (such as vowel sounds). Ten normal subjects were scanned using positron emission tomography while discriminating pure tone stimuli incorporating frequency glides of either short or long duration. A comparison of these two conditions yielded significant activation foci in left orbitofrontal cortex, left fusiform gyrus, and right cerebellum. Because non-linguistic stimuli were used, these foci must reflect some basic low-level aspect of neural processing that may be relevant to speech but cannot be a consequence of accessing the speech system itself.

Key words: Auditory perception; Cerebellum; Frontal cortex; Fusiform gyrus; Hemispheric specialization; Language; Positron emission tomography; Speech acoustics

Introduction

Auditory language comprehension is a multistage process relying on many different cognitive functions as an acoustic signal is perceived, parsed into subphonemic features, and the features synthesized into a semantically coherent utterance. In approximately 96% of right-handers and 70% of left-handers, the left hemisphere is preferentially involved in the later (phonological and semantic) stages of this process,1 and it is possible that speech processing proceeds asymmetrically from its earliest stages.

Indeed, it has been hypothesized that the left hemispheric specialization for speech may be based, in part, on prepotent mechanisms within this hemisphere for the processing of auditory patterns of a fine temporal grain, such as are found in speech.2–4 Some phonetic features (such as voicing and place of articulation) are instantiated in spectral changes of an acoustic signal that happen with a time course of tens of milliseconds. For example, the perception of place of articulation contrasts (like /ba/ vs /da/) requires accurate tracking of formant trajectories at consonant-vowel transitions, which are typically no more than 40 ms long.

Evidence for a left hemispheric specialization for the processing of acoustic transients comes from several sources. First, patients with left hemisphere lesions and aphasia appear to be impaired at discriminating phonemes on the basis of voicing and place of articulation, but such patients do not appear to show impairment of vowel discrimination, which is made on the basis of spectral information present in the auditory signal over a longer period.5–9 Second, dichotic listening studies in normal populations demonstrate a greater right ear (left hemisphere) advantage for speech sounds that require perception of short duration spectral changes for their accurate identification than for speech sounds that do not.2,3,10,11 For example, Shankweiler and Studdert-Kennedy2 found no differences in accuracy of identification of steady-state vowels between stimuli presented to the left ear and those presented to the right.
In contrast, when perception of stop consonant–vowel syllables was examined (/pa, ta, ka, ba, da, and ga/), they found that subjects were more accurate at reporting stimuli presented to the right ear, to a highly significant degree. Finally, two recent positron emission tomography (PET) studies12,13 revealed increased activation in left hemisphere areas in response to rapidly changing auditory cues. In a study in which normal subjects listened passively to speech-like syllables incorporating formant transitions, Belin and his colleagues12 found bilaterally symmetrical activity in auditory areas when the formant transitions were long (200 ms), but a leftward asymmetry in activation in response to short (40 ms) formant transitions. Fiez et al.13 observed that increases in cerebral blood flow (CBF) in the left frontal operculum were larger when their subjects performed an auditory detection task upon stimuli that incorporated stop consonants than for steady-state vowels. Thus, speech signals that are stable or that change only slowly in time (over hundreds of milliseconds) appear to be processed bilaterally. In contrast, speech signals that change rapidly in time (over tens of milliseconds) appear to be preferentially processed by left hemisphere mechanisms.

The present PET activation study was designed to examine the idea that providing fine-grained temporal resolution to the auditory percept may be a general specialization of the left hemisphere and may apply to the processing of any sound that changes rapidly in time, not just sounds with linguistic relevance. If this is the case, then the processing of non-speech pure tones that incorporate frequency glides with the temporal characteristics of stop consonant formant transitions ought to involve left hemisphere regions preferentially, when compared with processing of pure tone stimuli incorporating frequency glides of a longer duration.
Speech signals are spectrally complex, however, and the left hemispheric specialization observed in the studies described above may not be related to the temporal structure of acoustic transients per se, but may instead depend primarily on their spectral characteristics. The stimuli used in this experiment are qualitatively different from speech stimuli in their spectral properties. We hoped that this would provide a stronger test of the hypothesis that the left hemisphere possesses specialized mechanisms for the processing of fine-grained temporal structure.

Materials and Methods

The experimental protocol was approved by the ethical review committee of the Montreal Neurological Institute and Hospital. Twelve right-handed neurologically normal participants (six men, six women; mean age 23 years) were recruited from the McGill community, and trained on 80 trials of each of the experimental tasks (see below). Any subject who did not score at least 75% correct on both tasks during the last 40 trials was not scanned. Two subjects (one male, one female) were thus excluded.

Two sets of sound stimuli were synthesized on an IBM-compatible computer. The ‘short glide’ set consisted of eight items: a 220 ms steady-state pure tone at one of four frequencies (1250, 1600, 1950, or 2300 Hz), preceded by a 30 ms, 350 Hz ascending or descending linear frequency glide. The chosen steady-state frequencies, the absolute frequency difference subtended by the glide, and the duration of the glide were all within the range observed in the second formant of English stop consonants. The set of ‘long glide’ stimuli was almost identical, except that the duration of the glide component was extended to 100 ms, and the duration of the steady-state component was decreased to 150 ms, so that the total duration of the stimuli was held constant at 250 ms. Similarly, the absolute frequency difference subtended by the glide was held constant at 350 Hz.
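The stimulus description above can be illustrated with a short synthesis sketch. This is not the authors' synthesis software: the sample rate, the function names (`frequency_at`, `glide_tone`, `stimulus_set`) and the absence of onset/offset amplitude ramps are assumptions made for illustration. The durations, frequencies and 350 Hz glide span come from the text.

```python
import math

SAMPLE_RATE = 20000          # Hz; assumed, not stated in the paper
TOTAL_MS = 250               # total stimulus duration (from the text)
SPAN_HZ = 350                # frequency difference covered by the glide (from the text)
STEADY_FREQS = (1250, 1600, 1950, 2300)  # steady-state frequencies in Hz


def frequency_at(t_ms, steady_hz, glide_ms, ascending):
    """Instantaneous frequency (Hz) at time t_ms after stimulus onset."""
    if t_ms >= glide_ms:
        return steady_hz
    # An ascending glide starts below the steady-state tone, a descending one above.
    start_hz = steady_hz - SPAN_HZ if ascending else steady_hz + SPAN_HZ
    return start_hz + (steady_hz - start_hz) * t_ms / glide_ms


def glide_tone(steady_hz, glide_ms, ascending):
    """Return a list of samples in [-1, 1]: linear glide, then steady-state tone."""
    n_samples = SAMPLE_RATE * TOTAL_MS // 1000
    samples, phase = [], 0.0
    for i in range(n_samples):
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous while frequency changes.
        freq = frequency_at(1000.0 * i / SAMPLE_RATE, steady_hz, glide_ms, ascending)
        phase += 2.0 * math.pi * freq / SAMPLE_RATE
    return samples


def stimulus_set(glide_ms):
    """Eight items: four steady-state frequencies x two glide directions."""
    return {(f, asc): glide_tone(f, glide_ms, asc)
            for f in STEADY_FREQS for asc in (True, False)}


short_set = stimulus_set(30)    # 30 ms glide + 220 ms steady state
long_set = stimulus_set(100)    # 100 ms glide + 150 ms steady state
```

Accumulating phase sample by sample, rather than evaluating sin(2πft) directly, avoids a discontinuity where the glide joins the steady-state portion. Only the glide duration differs between the two sets, which is the single manipulated variable in the design.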
Subjects underwent PET scans in two separate conditions, presented in a random order. In both conditions, subjects listened to the glide stimuli, presented in pairs, through insert earphones (EarTone ER3A) at 75 dB SPL (A). The ‘short glide’ stimuli were used in one condition, whilst the ‘long glide’ stimuli were used in the other. In both conditions, subjects were asked to make same–different judgements on the pairs, indicating their choice by a key press response with the right hand. Within each pair, the steady-state portions of the stimuli were identical in frequency. In half the trials, however, the glide component was different (i.e. ascending to the steady-state tone in one item, and descending to that tone in the other item). Pairs were presented with a 3 s inter-trial interval, and the interval between items of a pair was 1 s (Fig. 1). Subjects were scanned for 60 s; each task commenced approximately 40 s before scanning and continued for 45 s after the scan was completed (32 pairs presented).

FIG. 1. Schematic drawing of the scan time course and sample stimuli used in the experiment. Two trials are shown in each condition.

FIG. 2. Averaged PET t-statistic volume superimposed upon the averaged MR image. The range of t-values for the PET data is coded by the colour scale. (A) Sagittal section, taken 26 mm to the left of midline. (B) Horizontal section, taken 30 mm below the bicommissural plane. (C) Horizontal section, taken 18 mm below the bicommissural plane. (D) Horizontal section, taken 10 mm above the bicommissural plane. (E) Horizontal section, taken 13 mm below the bicommissural plane. Subtraction of the regional cerebral blood flow observed in the long glide condition from that observed in the short glide condition yielded significant focal changes in the left orbitofrontal region, area 11 (stereotaxic coordinates x = –31, y = 44, z = –19; t = 4.11; shown in A and C); in the left fusiform gyrus, area 19/37 (x = –32, y = –50, z = –12; t = 3.55; shown in A and E); and in the right cerebellum (x = –28, y = –61, z = –29; t = 3.54; shown in B). In addition, an activation focus was observed in the left middle occipital gyrus, area 18 (x = –24, y = –85, z = 9; shown in A and D), although this focus did not reach significance (t = 3.25).

PET data were acquired using a Scanditronix PC-2048 system, which produces 15 slices at an intrinsic resolution of 5.0 × 5.0 × 6.0 mm. Using the bolus H₂¹⁵O method without blood sampling, the relative distribution of CBF was measured in the two conditions. PET images were reconstructed using an 18 mm full-width half-maximum filter and normalized for global CBF value. A high resolution MR image (160 slices, 1 mm thick) was also obtained for each subject (Philips Gyroscan 1.5T), and each pair of MR and PET data sets was resampled into a standardized stereotaxic coordinate system.14,15 The MR and PET images were averaged across subjects for each activation state to obtain an average MRI and a mean CBF change image.

Significant focal CBF changes were identified in the averaged PET volume by converting it to a t-statistic volume (by dividing each voxel by the mean standard deviation in normalized CBF for all intracerebral voxels).16 The functional images were merged with the averaged MR image to allow localization of significant t-statistic peaks, which were identified by application of an intensity threshold to the t-statistic images. The presence of significant focal changes was tested by a method based on three-dimensional Gaussian random-field theory.16 Values of t > 3.5 were deemed statistically significant (p < 0.0002, one-tailed, uncorrected).
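The conversion of a mean CBF change image into a t-statistic volume, as described above, can be sketched as follows. This is an illustrative reading of the text rather than the authors' software. The function names (`t_statistic_volume`, `suprathreshold`) are hypothetical. Two further steps are assumptions based on the usual form of a pooled-variance t-statistic: pooling the standard deviation by averaging per-voxel variances, and scaling by the square root of the number of subjects. The threshold of 3.5 is taken from the text.

```python
import math
from statistics import mean, variance

T_THRESHOLD = 3.5  # significance threshold quoted in the text


def t_statistic_volume(diffs):
    """Convert per-subject CBF differences into a t-value per voxel.

    diffs[s][v] is the normalized CBF change (short glide minus long glide)
    for subject s at voxel v.
    """
    n_subjects = len(diffs)
    n_voxels = len(diffs[0])
    columns = [[subject[v] for subject in diffs] for v in range(n_voxels)]
    voxel_means = [mean(col) for col in columns]
    # Assumption: one standard deviation is pooled over all voxels
    # by averaging the per-voxel sample variances.
    pooled_sd = math.sqrt(mean(variance(col) for col in columns))
    # Assumption: the standard error of the mean change is pooled_sd / sqrt(n).
    standard_error = pooled_sd / math.sqrt(n_subjects)
    return [m / standard_error for m in voxel_means]


def suprathreshold(t_values, threshold=T_THRESHOLD):
    """Indices of voxels whose t-value exceeds the threshold."""
    return [i for i, t in enumerate(t_values) if t > threshold]
```

Pooling the variance over all intracerebral voxels gives a much more stable estimate than a voxel-by-voxel standard deviation when only ten subjects are available, which is presumably why a single standard deviation is used for the whole volume.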
Results

Subjects were significantly more accurate at making same–different judgements on long glide pairs than on short glide pairs (t = 4.72; p < 0.001), although both tasks were performed well (mean 95% correct for the long glide task and 86% correct for the short glide task). There was no significant difference in mean response latency between the two tasks (t = 1.84; p > 0.05; mean long glide latency 893 ms, mean short glide latency 973 ms).

When the pattern of CBF observed in the long glide condition was subtracted from that in the short glide condition, significant activations were observed in only three regions: the left orbitofrontal cortex, the left fusiform gyrus, and the right cerebellar hemisphere, pars lateralis (Fig. 2). A CBF decrease was observed in right mid-dorsolateral frontal cortex (area 9; x = 27, y = 3, z = 36; t = 3.50).

Discussion

All of the neocortical activation foci were in the left hemisphere. The right lateral cerebellar activation is consistent with these left neocortical foci: the right lateral cerebellum is richly interconnected with the left cerebral hemisphere via the red and pontine nuclei and the thalamus, and several investigators have posited a role for the right cerebellar hemisphere in at least some aspects of language function.17,18 Given that the task demands of the two conditions were very similar, differing only in the duration of the glide component of the sounds, our findings support the contention that there are specialized mechanisms for the processing of acoustic transients in the left cerebral hemisphere. Since we used pure tone stimuli of no linguistic value, we conclude that this specialization, while relevant to speech perception, must reflect a more general functional property of the human left hemisphere, and may apply to the processing of any short duration spectral change.
The fusiform gyrus activation was particularly interesting, given that studies of phonemic monitoring consistently show activation in this region.19,20 For example, Zatorre and his colleagues20 performed a PET study in which subjects listened to pairs of monosyllabic English words. When the subjects were asked to monitor the pairs for the occurrence of a particular stop consonant, and activation in this condition was compared with that obtained when they were simply listening to the words, significant CBF change was observed in area 19/37 (stereotaxic coordinates x = –34, y = –57, z = –9), about 8 mm posterior to the focus observed in the current study. Similarly, Démonet and his colleagues19 asked subjects to perform a stop consonant detection task that had been made perceptually difficult. When activation in this condition was compared with that obtained during performance of a phoneme detection task that was perceptually simpler, the only significant CBF change observed was in the left area 19 (stereotaxic coordinates x = –32, y = –74, z = –12), about 2.4 cm posterior to that observed in the present study. Thus, the posterior fusiform region in the left hemisphere may be involved in processing the acoustic manifestations of some phonetic features, as would be necessary for the correct identification of a target stop consonant.

Given that electrophysiological and neuromagnetic studies in cats, monkeys and humans have demonstrated the existence of neurones in the region of primary auditory cortex that are apparently able to preserve the temporal structure of transient auditory signals, and to encode the direction of a transient change,21–25 we expected to see some differences in activation in the region of auditory cortex. This was not the case.
Either the two types of stimuli used in this study are processed similarly at early stages, or, as seems more likely, the sensitivity of the PET subtractive technique in combination with the discriminative task we used is insufficient to distinguish between the relevant auditory systems. We are currently exploring alternative PET methods which may be more sensitive to blood flow changes in this region.

Conclusion

The present study examined the brain regions involved in processing a single property of speech sounds that typically yield a strong right-ear advantage in dichotic listening studies: namely, the temporal structure of stop consonant formant transitions. We did not attempt to reproduce the spectral properties of such formant transitions, and it is important to note that the speech signal is full of other information that changes more slowly over time. Data from this experiment, however, do provide support for the hypothesis that the left hemisphere possesses specialized mechanisms for the processing of acoustic information of a fine temporal grain.

References

1. Branch C, Milner B and Rasmussen T. J Neurosurg 21, 399–405 (1964).
2. Shankweiler D and Studdert-Kennedy M. Q J Exp Psychol 19, 59–63 (1967).
3. Studdert-Kennedy M and Shankweiler D. J Acoust Soc Am 48, 579–594 (1970).
4. Schwartz J and Tallal P. Science 207, 1380–1381 (1980).
5. Blumstein SE, Cooper WE, Zurif EB and Caramazza A. Neuropsychologia 15, 371–383 (1977).
6. Gow DW Jr and Caplan D. Brain Lang 52, 386–407 (1996).
7. Miceli G, Caltagirone C, Gainotti G and Payer-Rigo P. Brain Lang 6, 47–51 (1978).
8. Oscar-Berman M, Zurif EB and Blumstein SE. Brain Lang 2, 345–355 (1975).
9. Tyler L. Spoken Language Comprehension: An Experimental Approach to Disordered and Normal Processing. Cambridge: MIT Press, 1992.
10. Berlin CI and McNeil MR. Dichotic listening. In: Lass N, ed. Contemporary Issues in Experimental Phonetics. New York: Academic Press, 1978: 327–374.
11. Cutting JE. Percept Psychophys 16, 601–612 (1974).
12. Belin P, Zilbovicius M, Crozier S et al. Soc Neur Abst 22, 1698 (1996).
13. Fiez JA, Raichle ME, Miezin FM et al. J Cogn Neurosci 7, 357–375 (1995).
14. Talairach J and Tournoux P. Co-planar Stereotaxic Atlas of the Human Brain. Three-dimensional Proportional System: An Approach to Cerebral Imaging. Stuttgart: Thieme, 1988.
15. Collins DL, Neelin P, Peters TM and Evans AC. J Comput Assist Tomogr 18, 192–205 (1994).
16. Worsley KJ, Evans AC, Marrett S and Neelin P. J Cerebr Blood Flow Metab 12, 900–918 (1992).
17. Fiez JA. Neuron 16, 13–15 (1996).
18. Leiner HC, Leiner AL and Dow RS. Trends Neurosci 16, 444–447 (1993).
19. Démonet J-F, Price C, Wise RJS and Frackowiak RSJ. Brain 117, 671–682 (1994).
20. Zatorre RJ, Meyer E, Gjedde A and Evans AC. Cerebr Cortex 6, 21–30 (1996).
21. Glass I and Wollberg Z. Hearing Res 9, 27–33 (1983).
22. Kaukoranta E, Hari R and Lounasmaa O. Exp Brain Res 69, 19–23 (1987).
23. Phillips D. J Exp Psychol: Human Percept Perform 19, 203–216 (1993).
24. Steinschneider M, Schroeder C, Arezzo J and Vaughan H Jr. Electroencephalogr Clin Neurophysiol 92, 30–43 (1994).
25. Whitfield I and Evans E. J Neurophysiol 28, 655–681 (1965).

ACKNOWLEDGEMENTS: We thank the staff of the McConnell Brain Imaging Centre and Pierre Ahad for their technical assistance. Supported by grants from the Medical Research Council of Canada (SP-30, MT-2624 and MT-11541), and from the McDonnell-Pew Program in Cognitive Neuroscience.

Received 21 February 1997; accepted 6 March 1997

General Summary

The identification of a variety of speech sounds depends on the perception of changes in the sound spectrum that happen over tens of milliseconds, such as occur in the transition of the second formant in stop consonants.
In order to determine whether the left-hemispheric specialization for language may be based in part on a relative specialization of this hemisphere for the processing of acoustic transients, we performed a PET study in which 10 normal subjects were scanned while discriminating pure tone stimuli incorporating frequency glides of either short or long duration. A comparison of these two conditions yielded significant blood flow changes in left cortical areas and in right cerebellum. Because non-linguistic stimuli were used, we concluded that this asymmetrical activation (while relevant to speech perception) must reflect a more general specialization of the human left hemisphere that applies to the processing of any short duration spectral change.

Vol 8 No 7, 6 May 1997