Voice Science
The Science of Voice
A spectrogram is a graphic display of the frequencies that make up a sound signal. A complex signal such as the voice is the sum of sound waves of many different frequencies. The vibration of the vocal folds produces a series of harmonics. The lowest harmonic is called the fundamental frequency, and the higher harmonics fall at whole-number multiples of it. The fundamental is typically about 100 Hz in men ("Hz" is short for hertz; one hertz is one cycle per second).
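The idea that a voiced sound is a sum of harmonics can be sketched in a few lines of Python with NumPy. This is only an illustration, not a model of a real voice: the function names, the 8000 Hz sample rate, the choice of ten harmonics, and the 1/k amplitude roll-off are all assumptions made for the example. Only the 100 Hz fundamental comes from the text above.

```python
import numpy as np

def harmonic_tone(f0=100.0, n_harmonics=10, duration=0.5, sample_rate=8000):
    """Sum sine waves at f0, 2*f0, 3*f0, ... as a crude stand-in for a voiced sound."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    signal = np.zeros_like(t)
    for k in range(1, n_harmonics + 1):
        # The 1/k amplitude is an illustrative roll-off, not a measured value.
        signal += np.sin(2 * np.pi * k * f0 * t) / k
    return t, signal

def strongest_frequency(signal, sample_rate):
    """Return the frequency (in Hz) of the largest peak in the magnitude spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[np.argmax(spectrum)]

t, tone = harmonic_tone()
print(strongest_frequency(tone, 8000))  # the fundamental, about 100 Hz
```

Because the fundamental is given the largest amplitude here, it shows up as the strongest peak. In real speech the strongest harmonic is not always the fundamental.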
In addition to the sounds produced by the larynx, we can also produce sounds in other parts of the vocal tract.
These sounds are usually made by forcing air to flow through narrow openings. For example, when we make the "s" sound we force air between the tongue and the roof of the mouth (the palate). The turbulence created by this airflow produces the desired sound.
Vowels typically have most of their energy at low frequencies, while many consonants, such as "s," have most of theirs at high frequencies.
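The contrast between low-frequency vowels and high-frequency consonants can be illustrated with a rough sketch. Two assumptions are made here that are not in the text: the vowel is approximated by ten harmonics of 100 Hz, and the "s" is approximated by white noise passed through a first difference, which emphasizes high frequencies. The spectral centroid (the magnitude-weighted average frequency) is used as one simple, hypothetical way to summarize where the energy sits.

```python
import numpy as np

def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted average frequency of the signal, in Hz."""
    magnitudes = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return float(np.sum(freqs * magnitudes) / np.sum(magnitudes))

sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate  # half a second of samples

# Vowel-like stand-in: harmonics of 100 Hz with an assumed 1/k amplitude roll-off.
vowel_like = sum(np.sin(2 * np.pi * k * 100.0 * t) / k for k in range(1, 11))

# "s"-like stand-in: seeded white noise; the first difference boosts high frequencies.
rng = np.random.default_rng(0)
hiss_like = np.diff(rng.standard_normal(t.size))

print(spectral_centroid(vowel_like, sample_rate))  # a few hundred Hz
print(spectral_centroid(hiss_like, sample_rate))   # several thousand Hz
```

The exact numbers depend on the assumed signals, so only the ordering matters: the noise-like sound has a much higher centroid than the harmonic one.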
In a spectrogram, time is plotted along the horizontal axis and frequency is plotted along the vertical axis. The intensity of the sound at each time and frequency is represented by the amount of shading in the graph.
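One common way to compute such a display is the short-time Fourier transform: cut the signal into short overlapping frames, and take the frequency spectrum of each frame. The sketch below assumes that approach; it is not necessarily how any particular program calculates its spectrograms. The function name, the frame length of 256 samples, the hop of 128 samples, the Hann window, and the two-tone test signal are all choices made for this example.

```python
import numpy as np

def spectrogram(signal, sample_rate, frame_len=256, hop=128):
    """Return (times, freqs, level_db); level_db has one row per frequency, one column per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and taper each one with the window.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    # Convert to decibels; the small constant avoids taking log of zero.
    level_db = 20 * np.log10(magnitudes + 1e-12)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    times = (np.arange(n_frames) * hop + frame_len / 2) / sample_rate  # frame centers
    return times, freqs, level_db.T

# Test signal: a high tone followed by a low tone (a stand-in, not real speech).
sample_rate = 8000
t = np.arange(2000) / sample_rate
test_signal = np.concatenate([np.sin(2 * np.pi * 3000 * t),
                              np.sin(2 * np.pi * 250 * t)])

times, freqs, level_db = spectrogram(test_signal, sample_rate)
print(level_db.shape)                     # (129, 30)
print(freqs[np.argmax(level_db[:, 0])])   # strongest frequency in the first frame
print(freqs[np.argmax(level_db[:, -1])])  # strongest frequency in the last frame
```

The first frame peaks near 3000 Hz and the last near 250 Hz. Drawing level_db as shading, with times on the horizontal axis and freqs on the vertical axis, gives the kind of picture described above.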
The graph below shows a spectrogram of the word "see-saw." The first part of each syllable is the consonant "s," which consists mostly of high-frequency sounds. After each "s" come the "ee" and "aw" sounds, which are made up of lower-frequency sounds. The harmonics of these vowels show up as parallel horizontal bands.
Beneath the spectrogram is a plot of the speech waveform from which the spectrogram is calculated.
Below is another example of a spectrogram of a short sentence. Note that the harmonics of the vowels often slowly increase or decrease in frequency. These are the small changes in inflection that occur when we speak.
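A slow rise in the harmonics can be imitated and measured with a small sketch. Everything here is assumed for illustration: a synthetic tone whose fundamental glides from 100 Hz to 150 Hz over one second, five harmonics with 1/k amplitudes, and a simple tracker that picks the strongest peak below 300 Hz in each frame. Real pitch trackers are more elaborate.

```python
import numpy as np

def gliding_tone(f_start=100.0, f_end=150.0, n_harmonics=5,
                 duration=1.0, sample_rate=8000):
    """Harmonic tone whose fundamental moves linearly from f_start to f_end."""
    n = int(duration * sample_rate)
    f0 = np.linspace(f_start, f_end, n)              # fundamental at each sample
    phase = 2 * np.pi * np.cumsum(f0) / sample_rate  # integrate frequency to get phase
    return sum(np.sin(k * phase) / k for k in range(1, n_harmonics + 1))

def track_fundamental(signal, sample_rate, frame_len=1024, hop=512, f_max=300.0):
    """Per-frame frequency of the strongest spectral peak at or below f_max."""
    window = np.hanning(frame_len)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sample_rate)
    low = freqs <= f_max  # search only where the fundamental is expected
    track = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len] * window
        spectrum = np.abs(np.fft.rfft(frame))
        track.append(freqs[low][np.argmax(spectrum[low])])
    return np.array(track)

glide = gliding_tone()
track = track_fundamental(glide, 8000)
print(track[0], track[-1])  # starts near 100 Hz and ends near 150 Hz
```

The estimates are limited to the frequency steps of the frame (about 8 Hz here), so the track rises in small jumps rather than smoothly.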
(Spectrograms calculated using the program "SoundScope" by GW Instruments)