All the forms of sensory-neural hearing loss mentioned so far are presently untreatable medically. One exception is Ménière's disease, in which the inner ear becomes distended; drugs can sometimes improve this condition.
Another reported cause of sensory-neural deafness is drug-induced hearing loss, known as ototoxicity. Many approved drugs can cause ototoxicity through a direct effect on cochlear biology; they include anti-cancer chemotherapy drugs, some kinds of antibiotics, quinine, and the loop diuretics used to treat hypertension [49].
These drugs can cause temporary or permanent hearing loss, as well as vertigo and tinnitus. Recent studies have suggested that certain industrial chemicals may also potentiate the effects of noise exposure [50, 51]. In this case, damage to the cochlea is the direct result of chemical injury to the fragile hair cells within the organ of Corti.
4.4.3 Presbycusis
Another form of sensory-neural hearing loss is presbycusis. This loss in hearing sensitivity occurs in all societies. Figure 4.24 shows the shift in hearing threshold at different frequencies plotted against age [52]. As can be seen, presbycusis mainly affects the high frequencies, above 2000 or 3000 Hz, and it affects men more than women. The curves in Figure 4.24 show the average hearing loss (with average hearing at the age of 25 taken as zero hearing loss). The group of people tested was subject both to presbycusis (the so-called natural hearing loss with age) and to sociocusis (a term coined by Glorig [53] to cover all the effects of noise damage in everyday life, with the exception of excessive industrial noise).
Figure 4.24 Shift in hearing threshold at different frequencies against age for men and women [52].
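The shape of these curves can be approximated by standardized models. As a rough illustration (added here, not from the book), ISO 7029 expresses the median age-related threshold shift as a quadratic function of age; the sketch below uses that quadratic form with placeholder coefficients (the standard itself tabulates separate values for men and women), and it takes age 18, rather than the figure's age 25, as the zero point.

# A minimal sketch of an ISO 7029-style quadratic age dependence,
#     shift = alpha * (age - 18)^2  dB,   for ages 18 to 70,
# where alpha grows with frequency and is larger for men.
# The alpha values below are illustrative placeholders only.
ALPHA_MEN = {1000: 0.004, 2000: 0.007, 4000: 0.016, 8000: 0.022}

def threshold_shift(age_years, freq_hz, alpha=ALPHA_MEN):
    """Median age-related hearing threshold shift in dB re age 18."""
    return alpha[freq_hz] * max(0.0, age_years - 18) ** 2

for age in (40, 60):
    print(age, {f: round(threshold_shift(age, f), 1) for f in ALPHA_MEN})

With these placeholder coefficients the shift at 8000 Hz and age 60 comes out near 39 dB, reproducing the qualitative behavior of Figure 4.24: the loss grows rapidly with both age and frequency.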
Presbycusis is believed to be caused by central nervous system deterioration with aging, in addition to changes in the hearing mechanism in the inner ear. Hinchcliffe [54] believes that these changes explain the features normally found in presbycusis: loss of discrimination of normal or distorted speech, increased susceptibility to masking, particularly by low-frequency noise, and a general loss in sensitivity. Rosen [55] suggested that degenerative arterial disease is a major cause of presbycusis. Others have suggested that a diet low in saturated fat may, in addition to protecting a person against coronary heart disease, also protect him or her against sensory-neural hearing loss (presbycusis).
The sensory‐neural hearing loss caused by intense noise is discussed in Chapter 5.
4.5 Speech Production
Since speech and hearing must be compatible, it is not surprising to find that the speech frequency range corresponds to the most sensitive region of the ear's response (Section 4.3.2) and generally extends from 100 to 10 000 Hz. The general mechanism of speech generation involves the contraction of the chest muscles to force air out of the lungs and up through the vocal tract. This flow of air is modulated by various components of the vocal mechanism (Figure 4.25) to produce the sounds that make up our speech. The modulation first takes place at the larynx, across which are stretched the vocal cords. These are composed of two bands of membranes separated by a slit which can open and close to modulate the flow of air [56]. The modulation frequency depends upon the tension in the muscles attached to the vocal cords and on the size of the slit in the membranes (about 24 mm long for males and 15 mm for females). The sound emitted by the vocal cords is a buzz-like sound corresponding to a sawtooth waveform containing a large number of harmonically related components.
Figure 4.25 Sectional view of the head showing the important elements of the voice mechanism.
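The "buzz" can be made concrete with a small numerical sketch (an illustration added here, not taken from the book): for an ideal sawtooth, the nth harmonic has amplitude proportional to 1/n, so its level lies 20 log10(n) dB below the fundamental; the 120 Hz fundamental assumed below is a typical value for a male voice.

# Harmonic series of an idealized sawtooth glottal source.
# The nth harmonic amplitude is proportional to 1/n, i.e. its level
# falls by 20*log10(n) dB relative to the fundamental.
import math

f0 = 120.0  # assumed male fundamental frequency, Hz

for n in range(1, 9):
    freq = n * f0                  # nth harmonic frequency, Hz
    level = -20.0 * math.log10(n)  # level re the fundamental, dB
    print(f"harmonic {n}: {freq:6.0f} Hz  {level:6.1f} dB")

The harmonics roll off at 6 dB per octave, which is why the source is described as a buzz rich in harmonically related components.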
This sound/air flow is then further modified by its passage through the numerous cavities of the throat, nose, and mouth, many of which can be changed at will, for example, by altering the position of the tongue or the shape of the lips, to produce a large variety of voiced sounds. It is possible to produce some sounds without the use of the vocal cords; these are known as unvoiced or breath sounds. They are usually caused by turbulent air flow through the upper parts of the vocal tract, especially past the lips, teeth, and tongue. It is in this way that the unvoiced fricative consonants f and s are formed. In some cases, part of the vocal tract can be blocked by constriction and then suddenly released to give unvoiced plosive consonants such as p and k [57].
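This source-filter behavior can also be sketched in a few lines of code. The following is a minimal illustration (mine, not the book's model): the sawtooth "buzz" is passed through two second-order resonators that stand in for vocal-tract formants; the 700 Hz and 1200 Hz formant frequencies and their bandwidths are assumed values, roughly appropriate for the vowel /a/.

# Minimal source-filter sketch: sawtooth glottal source shaped by two
# resonators standing in for vocal-tract formants (assumed values).
import numpy as np
from scipy.signal import lfilter

fs = 16000                        # sample rate, Hz
f0 = 120.0                        # assumed glottal fundamental, Hz
t = np.arange(int(0.5 * fs)) / fs

# Sawtooth source: the harmonically rich vocal-cord "buzz".
source = 2.0 * (t * f0 % 1.0) - 1.0

def formant(x, freq_hz, bw_hz):
    """Second-order all-pole resonator approximating one formant."""
    r = np.exp(-np.pi * bw_hz / fs)
    theta = 2.0 * np.pi * freq_hz / fs
    return lfilter([1.0], [1.0, -2.0 * r * np.cos(theta), r * r], x)

# Two assumed formants, roughly those of the vowel /a/.
vowel = formant(formant(source, 700.0, 80.0), 1200.0, 90.0)
vowel /= np.max(np.abs(vowel))    # normalize before writing or playing

Moving the two resonator frequencies, the software analog of moving the tongue and lips, changes which vowel is heard; the unvoiced sounds described above would instead use a noise source in place of the sawtooth.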
Generally, vowels have fairly definite frequency spectra, whereas many unvoiced consonants such as s and f tend to exhibit very broadband characteristics. Furthermore, when several vowels and/or consonants are joined together, their individual spectra change somewhat. The durations of individual speech sounds also vary widely, over a range of roughly 20–300 ms.
In the general context of speech, vowels and consonants become woven together to produce not only linguistically organized words but also sounds which have a distinctive personal character. The vowels usually have greater energy than the consonants and give the voice its character. This is probably because vowels have definite frequency spectra with superimposed periodic short-duration peaks. However, it is the consonants which give speech its intelligibility. It is therefore essential, in the design of rooms for speech, to preserve both the vowel and the consonant sounds for all listeners.
Consonants are generally transient, short-duration sounds of relatively low energy. For speech, therefore, a room should have a short reverberation time to avoid blurring of consecutive consonants; we would expect speech intelligibility to decrease with increasing reverberation time. At the same time, to produce a speech signal level well above the reverberant sound level (i.e. a high signal-to-noise ratio), increased sound absorption is required in the room, which again calls for a lower reverberation time. Although this may suggest that an anechoic room would be best for speech intelligibility, some sound reflections are needed, both to boost the level of the direct sound and to give the listener a feeling of volume. An optimum reverberation time is therefore established; it is usually under one second for rooms with volumes under 8500 m³.
If the speech power emitted by a male speaker is averaged over a relatively long period (e.g. five seconds), the overall sound power level is found to be 75 dB. This corresponds to an average sound pressure level of 65 dB at 1 m from the lips of the speaker and directly in front of him or her. Converting the sound power level to sound power shows that the long-time-averaged sound power for men is about 30 μW; the average female voice emits approximately 18 μW. However, if we average over a very short time (e.g. 1/8 second), we find that the power emitted in some vowel sounds can reach 50 μW, while in some softly spoken consonants it is only 0.03 μW. Generally, the human voice has a dynamic range of approximately 30 dB throughout its frequency range [58]. At maximum vocal effort (loud shouting) the sound power from the male voice may reach 3000 μW.
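As a check on these figures (a worked example added here, using only the numbers quoted above), the sound power level is defined by LW = 10 log10(W/W0) with the reference power W0 = 10⁻¹² W, so

W = W0 × 10^(LW/10) = 10⁻¹² W × 10^(75/10) ≈ 3.2 × 10⁻⁵ W ≈ 30 μW,

and the ratio of the loudest vowel power to the softest consonant power is

10 log10(50 μW / 0.03 μW) ≈ 32 dB,

consistent with the dynamic range of approximately 30 dB quoted above.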
Table 4.3 gives the long-term rms sound pressure levels at 1 m from the average male mouth for normal vocal effort.