The “BRAIN” Model of intelligibility in business telephony

Despite the vast technological advances in telephony over the past century, there are still many challenges and obstacles that remain in terms of intelligibility

June 28, 2010

By Fiona Mclean-Banks, Polycom Business Development Manager at distributor Zycko

Despite the vast technological advances in telephony over the past century, there are still many challenges and obstacles that remain in terms of intelligibility. George Ashley Campbell, a pioneer in long-distance telegraphy and telephony, showed in 1910 that accuracy over the telephone stood at only 59%, as opposed to 96% over open air. Although things have come a long way since then, we still put up with inaccuracies in telephone conversations today.

The “BRAIN” model of critical elements in business telephony shows how mutual dependencies can be used to improve intelligibility, and how modern solutions can directly benefit communication.
Analogue loop lengths, building wiring, variable line equalisation characteristics, poor handset and speakerphone designs, mixed networks, noise in conference rooms, paper shuffling, pen tapping, fan noise and a host of other issues still pose a considerable challenge to telephony intelligibility.

With globalisation and the rise of the mobile workforce, the importance of intelligible telephony is magnified, yet so are the obstacles. Business people are short of time, and misinterpretation caused by poor voice quality is a frustration. The accuracy and quality of voice conversations are therefore ever more important. Users are often talking to people they haven’t met, and usually need to forge high-value relationships over the phone. Users also sometimes speak different native languages, so accented speech becomes a further burden on intelligibility.
Conferences often take place among groups, which increases the potential for inaccuracies due to background noise and raises the stakes of each conversation, because meetings are difficult to schedule. Lastly, meetings can be very long yet require sustained attention, and small inaccuracies have a big impact on attention span.

The components of speech intelligibility

The hearing and understanding of speech takes place in three stages: physical, cognitive and analytical. In the physical stage, speech travels from the talker’s mouth to the listener’s ears. In the cognitive stage, the listener resolves ambiguities through knowledge of grammar and accent rules, and the local context of the words. In the analytical stage, words that are still unresolved are scrutinised: the listener examines them in a broader context and tries to interpret the intended meaning. The second and third stages are prone to distraction, particularly when ambiguities arise between talkers with different native languages, who may not share the same contextual assumptions. Thus, the best defence against misunderstanding is to minimise obstacles in the physical stage.

The physical elements of intelligibility

The BRAIN model of physical speech communication comprises five critical parameters: Bandwidth, Reverberation, Amplitude, Interaction and Noise.
Together, they form the basis for intelligibility. These elements work together, and research has shown that, to a large degree, each of these parameters can compensate for deficiencies in another.
This means that problems can be solved by improving parameters that are more accessible in a particular room environment in order to achieve the desired result. In order to do this it is essential to understand what can be done when each of the five parameters is deficient:

1. Bandwidth

Bandwidth is the range of speech frequencies that is carried to the listener.
Telephones are limited to the band 300 Hz to 3.3 kHz, and carry only 20 percent of the frequencies present in human speech. Some modern business audio systems, on the other hand, carry frequencies as high as 22 kHz.
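
To put the figures above side by side, the following is a minimal sketch that compares the telephone band with a wideband system on both a linear and an octave scale. The 22 kHz upper limit comes from the text; the 50 Hz lower edge of the wideband system is an assumption made purely for illustration, and this is not necessarily how the 20 percent figure was derived.

```python
import math

def band_coverage(low_hz, high_hz, ref_low_hz, ref_high_hz):
    """Fraction of the reference band (in Hz) that the given band overlaps."""
    overlap = min(high_hz, ref_high_hz) - max(low_hz, ref_low_hz)
    return max(overlap, 0.0) / (ref_high_hz - ref_low_hz)

def band_octaves(low_hz, high_hz):
    """Width of a band in octaves (each octave is a doubling of frequency)."""
    return math.log2(high_hz / low_hz)

# Telephone band quoted in the article: 300 Hz to 3.3 kHz.
TELEPHONE = (300.0, 3300.0)
# ASSUMPTION: a wideband system spanning 50 Hz to 22 kHz. Only the
# 22 kHz upper limit appears in the article; 50 Hz is hypothetical.
WIDEBAND = (50.0, 22000.0)

linear_share = band_coverage(*TELEPHONE, *WIDEBAND)
print(f"Linear share of the wideband range: {linear_share:.1%}")
print(f"Telephone band width: {band_octaves(*TELEPHONE):.2f} octaves")
print(f"Wideband width:       {band_octaves(*WIDEBAND):.2f} octaves")
```

The linear comparison is deliberately crude: hearing is closer to logarithmic in frequency, which is why the octave widths are printed alongside it.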

A deficiency in bandwidth is usually caused by carrying the call over a narrowband Plain Old Telephone Service (POTS) or IP connection, and has until recently been regarded as an insoluble problem, constrained by infrastructure. Thanks to new technology this problem is disappearing: standards-compliant IP telephones and speakerphones are becoming available that take advantage of available data bandwidth and deliver much higher audio bandwidth and fidelity.

2. Reverberation

Reverberation measures the amount of room echo that occurs between the talker and the microphone. It makes speech more difficult to understand and is strongly affected by room characteristics (hard, reflective walls, floor and ceiling), room size, and the position of the speaker in relation to the microphone.

Reverberation can be addressed most easily by optimising the placement of the microphone in relation to the talker. In a larger room environment with multiple talkers this can be done by using a multiple-microphone system which intelligently selects the microphone with the best pickup for each talker, and by ensuring that principal talkers are positioned closest to the microphone. Direct solutions also include improving the room acoustics by mounting acoustic diffusers and absorbers on the walls, ceiling and corners.
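
As a rough illustration of microphone selection, the sketch below picks the microphone whose current audio frame has the highest mean energy. Using energy as a stand-in for "best pickup" is an assumption made for illustration, as are the function and microphone names; commercial systems are likely to use more elaborate criteria and to smooth their decisions over time.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(s * s for s in frame) / len(frame)

def select_microphone(frames_by_mic):
    """Return the name of the microphone with the most energetic frame.

    frames_by_mic maps a microphone name to a non-empty list of samples
    captured over the same short time window.
    """
    return max(frames_by_mic, key=lambda name: frame_energy(frames_by_mic[name]))

# Hypothetical frames: the talker is sitting nearest to mic_b.
frames = {
    "mic_a": [0.01, -0.02, 0.01, -0.01],
    "mic_b": [0.30, -0.25, 0.28, -0.31],
    "mic_c": [0.05, -0.04, 0.06, -0.05],
}
print(select_microphone(frames))
```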

Increasing the bandwidth can compensate significantly for reverberation, while increasing amplitude and reducing noise can also be effective to an extent.

3. Amplitude

Amplitude refers to how loud the talker sounds to the listener. Quiet talkers are more difficult to understand than loud ones and a listener who is distant from the loudspeaker will have a harder time hearing than when close to it. Telephone networks also have different gains, varying in loudness from connection to connection.

While moving the position of the talker or listener is an obvious solution to amplitude deficiency, it is not always practical. Systems capable of automatically adjusting gain when needed can improve intelligibility greatly in these situations. Increasing bandwidth, reducing reverberation and reducing ambient noise can also compensate significantly for low amplitude.
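
Automatic gain adjustment can be sketched as follows: measure the level of each short frame of audio and scale it towards a target level, with a cap on the gain. This is a minimal illustration only; the target level, gain cap and silence floor are hypothetical values, and a practical system would also smooth the gain from frame to frame to avoid audible pumping.

```python
import math

def rms(frame):
    """Root-mean-square level of a non-empty frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def apply_agc(frame, target_rms=0.1, max_gain=8.0, silence_floor=1e-6):
    """Scale a frame towards target_rms, never amplifying by more than max_gain.

    All three parameters are illustrative assumptions, not values from any
    particular product.
    """
    if not frame:
        return []
    level = rms(frame)
    if level < silence_floor:
        # Near-silence: leave it alone rather than amplify background noise.
        return list(frame)
    gain = min(target_rms / level, max_gain)
    return [s * gain for s in frame]

quiet_talker = [0.01, -0.01] * 80   # needs more gain than the cap allows
loud_talker = [0.5, -0.5] * 80      # is attenuated down to the target

print(f"quiet talker after AGC: {rms(apply_agc(quiet_talker)):.3f}")
print(f"loud talker after AGC:  {rms(apply_agc(loud_talker)):.3f}")
```

The gain cap is the notable design choice here: without it, a very quiet frame would be boosted until the room noise in it became as loud as speech.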

4. Interaction

Interaction is the ability of two or more participants to naturally interact with each other during a telephone conference. One speaker should be able to interrupt the other without disturbing the flow of conversation – preventing the “what?”; “you go ahead”; “no you go ahead” exchange caused by mediocre speakerphones. Interactive speech between distant groups can be difficult for several reasons, with the most obvious being the absence of a true full-duplex system that allows for transparent interactive speech. Better microphone placement, wider bandwidth and a stronger signal can also assist.

Often, poor interaction is caused by excessive end-to-end delay of the audio signal. This can stem from the communication channel itself: satellite connections have a longer delay than terrestrial ones.
IP telephony systems can also introduce delays of as much as 150 ms, and some conference bridges add further delay in their processing. Room reverberation can also increase delays between talker and listener.
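
Because these delays add up along the path, it can help to tally them as a simple budget. The sketch below computes the minimum one-way propagation delay of a geostationary satellite hop from the orbit altitude and the speed of light, then sums it with other contributions. The 150 ms IP figure is the upper value quoted above; the conference bridge figure is a hypothetical placeholder.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458
GEO_ALTITUDE_KM = 35_786  # altitude of a geostationary orbit

def geo_hop_delay_ms():
    """Minimum one-way delay, ground to satellite to ground.

    Assumes the satellite is directly overhead; real paths are longer.
    """
    return 2 * GEO_ALTITUDE_KM / SPEED_OF_LIGHT_KM_S * 1000

def end_to_end_delay_ms(components):
    """Sum the one-way delay contributions (in ms) along the audio path."""
    return sum(components.values())

budget = {
    "ip_telephony": 150.0,        # upper figure quoted in the article
    "conference_bridge": 40.0,    # HYPOTHETICAL value for illustration
    "satellite_hop": geo_hop_delay_ms(),
}

for name, delay in budget.items():
    print(f"{name:>18}: {delay:6.1f} ms")
print(f"{'total':>18}: {end_to_end_delay_ms(budget):6.1f} ms")
```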

5. Noise

Noise refers to the ambient noise picked up by the microphone along with speech. This is strongly affected by the room environment. Because common noise sources and speech share much of the same spectrum, reducing noise is imperative for intelligibility. Fixing noise at the source should be the first step: moving microphones away from air conditioning ducts, overhead projectors, coffee makers and similar equipment. Directional microphones should be pointed away from noise sources. Aside from these direct approaches, increased bandwidth can also significantly improve the situation. Conference solutions that eliminate line noise, such as clicking, buzzing and hissing, also assist in this regard.

Further to this, some systems today incorporate active noise reduction algorithms that analyse microphone audio in time and frequency domains and apply a form of filtering that can reduce fan noise by 6-9 dB, without affecting speech.
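
To give a feel for what a 6-9 dB reduction means, the following sketch converts decibels into linear ratios using the standard definitions (10·log10 for power, 20·log10 for amplitude). The function names are illustrative.

```python
def db_to_power_ratio(db):
    """Factor by which power changes for a given number of decibels."""
    return 10 ** (db / 10)

def db_to_amplitude_ratio(db):
    """Factor by which amplitude (sound pressure) changes for a given number of decibels."""
    return 10 ** (db / 20)

for reduction_db in (6, 9):
    power = db_to_power_ratio(reduction_db)
    amplitude = db_to_amplitude_ratio(reduction_db)
    print(f"{reduction_db} dB reduction: noise power divided by {power:.1f}, "
          f"amplitude divided by {amplitude:.1f}")
```

In other words, a 6 dB reduction cuts noise power to roughly a quarter, and 9 dB cuts it to roughly an eighth.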

Towards a flawless conferencing experience

A thorough understanding of each of the elements of the BRAIN model, and how they work together, combined with continual enhancements to conferencing technology, means that many of the problems experienced in telephony over the last century can now be remedied. As conferencing becomes an increasingly popular business practice, it is good to know that there is a solution appropriate for any given conferencing environment, at a reasonable cost.