Chapter 3: Transcribing Speech Sounds
1. The following city names all contain the letter ‘t’ within the word. In which of them is the letter ‘t’ pronounced as [tʰ]?
2. The following words all contain the segment /k/. In which of them is it pronounced as the allophone [kʰ]?
3. The following words all contain the segment /p/. In which of them is it pronounced as the allophone [pʰ]?
We know now that we can use the IPA to transcribe speech sounds, and that our transcription can be either broad or narrow. When we make a narrow transcription, we’re including as much detail as possible about how speakers produce sounds, which often means including diacritics. To give an accurate narrow transcription of Canadian English, we would have to include a property that is part of nearly every variety of English – aspiration on voiceless stops.
To illustrate what aspiration is, I’m going to ask you to say a silly sentence: The spy wanted to buy a blueberry pie.
Now say it again, and hold your hand in front of your mouth. The spy wanted to buy a blueberry pie.
Did you feel any differences between the words spy, buy and pie? For native speakers of English, the word pie is produced with a little puff of air as the [p] is released. That puff of air is called aspiration. English speakers systematically produce aspiration on voiceless stops at the beginning of a stressed syllable, but not on voiced stops. To understand why, we have to think about voicing and about manner of articulation.
Remember that voiced sounds are produced by vibrating the vocal folds, whereas voiceless sounds have the vocal folds held open so air can pass freely between them. Remember also that producing a stop involves closing off the vocal tract completely for a moment, then releasing the obstruction and allowing air to flow freely again.
Think about the voiced stop at the beginning of the word buy. The lips are closed – that’s the stop closure – and the vocal folds start vibrating for the voiced [b]. Then the lips open and the stop is released, and the vocal folds keep vibrating for the diphthong [aɪ].
But in the word pie, things work differently. The lips are closed for the bilabial stop. But because [p] is a voiceless stop, the vocal folds are not vibrating. We open the lips to release the stop, but 30 or 40 milliseconds pass before we start vibrating the vocal folds. That interval of 30-40 milliseconds between the release of the stop closure and the start of voicing is called the voice onset time, or VOT. In English, voiceless stops in certain positions have a VOT of 30-40 milliseconds, so we say that they’re aspirated. But voiced stops have a much shorter VOT, of about 0-10 milliseconds. In other words, the vocal folds start vibrating at almost exactly the same time as the stop closure is released, so voiced stops in English are unaspirated. The diacritic to indicate aspiration on a stop is a little superscript h, like so: [pʰ, tʰ, kʰ].
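If it helps to see the arithmetic, here is a minimal Python sketch of the VOT idea. The function names, the timestamps in the usage note, and the 20-millisecond cutoff are all assumptions made for illustration: the text only gives the typical ranges (about 30-40 ms for aspirated stops and about 0-10 ms for unaspirated ones), and the cutoff is simply a point between them, not a measured boundary.

```python
def voice_onset_time(release_ms: float, voicing_onset_ms: float) -> float:
    """Return VOT: time from stop release to the start of voicing, in ms."""
    return voicing_onset_ms - release_ms


def is_aspirated(vot_ms: float, cutoff_ms: float = 20.0) -> bool:
    """Classify a stop as aspirated if its VOT is at or above the cutoff.

    The 20 ms default is an illustrative assumption that falls between
    the 0-10 ms and 30-40 ms ranges described in the text.
    """
    return vot_ms >= cutoff_ms
```

For example, a stop released at 100 ms with voicing starting at 135 ms has a VOT of 35 ms and is classified as aspirated, like the [pʰ] in pie. With voicing starting at 105 ms, the VOT is 5 ms and the stop is classified as unaspirated, like the [b] in buy.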
But to make matters even more complicated, not all voiceless stops get aspirated in English, only voiceless stops at the beginning of a stressed syllable. In words like appear and attack, the voiceless stop isn’t the first sound in the word, but it comes at the beginning of a stressed syllable so it gets aspirated. [əpʰiɹ] [ətʰæk]
But in the words apple and nickel, the voiceless stop comes after a stressed syllable and before an unstressed syllable, so it doesn’t get aspirated. [æpəl] [nɪkəl]
We don’t aspirate voiceless stops at the ends of words, like in brick. [bɹɪk]
And we don’t aspirate voiceless stops following an [s], even if they’re at the beginning of a stressed syllable:
Aspiration of voiceless stops is something that native speakers do so regularly and so automatically that it’s very hard for us to perceive it because it’s just always there. To convince you, I’m going to record someone saying this sentence and show you the waveforms. This program is known as a waveform editor. And here’s Kendrick’s voice saying that sentence.
The spy wanted to buy a blueberry pie.
Here’s the waveform: this is a visual representation of the sound waves that Kendrick just produced. See that I can select certain parts of the sentence and play them back. spy, buy, pie
Look first at buy – you can see that there’s a very brief silence: that’s where Kendrick’s lips were closed for the bilabial stop. Then when he releases his lips the waveform gets nice and big for the sonorous vowel [aɪ].
Look over here at pie. You see the same silence where the lips are closed, and the same big waveform for the vowel [aɪ], but before the vowel, there’s this noisy burst of turbulence – that’s the aspiration.
And now look at spy. We see the turbulence at the beginning for the fricative [s], followed by the silence while the lips are closed and the nice sonorous vowel. But there’s no burst of noise following the release of the lips because the [p] in spy was not aspirated. In fact, if I select just the -py portion of spy, what does it sound like? To a native speaker of English, this part sounds like buy, because the [p] is unaspirated.
When you’re transcribing words with the voiceless stops [p t k], your challenge will be to figure out if the stops are aspirated or unaspirated, so you can indicate the aspiration in your narrow transcription. In most varieties of English, aspiration happens in these predictable environments.
- Voiceless stops are aspirated at the beginning of a word, and at the beginning of a stressed syllable.
- Voiceless stops are unaspirated at the beginning of an unstressed syllable. They’re also unaspirated in any other position, like at the end of a syllable or the end of a word.
- And even if a syllable is stressed, a voiceless stop is unaspirated if it follows [s].
- In English, voiced stops are never aspirated. They’re always unaspirated.
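The environments listed above can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `narrow_transcription` and the input format (a word already split into syllables, each given as a list of segments plus a stress flag) are hypothetical, and the sketch handles aspiration only, not any other detail of a narrow transcription.

```python
VOICELESS_STOPS = {"p", "t", "k"}


def narrow_transcription(syllables):
    """Add the aspiration diacritic to a syllabified broad transcription.

    `syllables` is a list of (segments, stressed) pairs, where `segments`
    is a list of IPA symbols and `stressed` is True or False.
    """
    output = []
    for syllable_index, (segments, stressed) in enumerate(syllables):
        for segment_index, segment in enumerate(segments):
            # A stop that follows [s] is never the first segment of its
            # syllable, so this check also covers the "after [s]" rule.
            syllable_initial = segment_index == 0
            word_initial = syllable_initial and syllable_index == 0
            if (
                segment in VOICELESS_STOPS
                and syllable_initial
                and (stressed or word_initial)
            ):
                output.append(segment + "ʰ")
            else:
                # Voiced stops and all other segments are left unchanged.
                output.append(segment)
    return "".join(output)
```

Calling `narrow_transcription([(["ə"], False), (["t", "æ", "k"], True)])` returns `ətʰæk` for attack, while `narrow_transcription([(["s", "p", "aɪ"], True)])` returns `spaɪ` for spy, because the [p] follows [s] and so is not syllable-initial.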
One thing that I want you to remember is that this pattern of aspiration is particular to the grammar of English, and stops behave differently in other languages. In French and Spanish, for example, voiceless stops are almost always unaspirated. And some languages, like Thai, actually have a three-way distinction between voiced, unaspirated voiceless, and aspirated voiceless stops.