Predicting Binaural Speech Intelligibility

Speech intelligibility is most accurately and representatively measured using subjective procedures involving panels of human test subjects. Unfortunately, subjective tests are cumbersome and expensive. For this reason, researchers, engineers and acoustics consultants often rely on objective procedures to predict speech intelligibility. Examples of such procedures are the Articulation Index (AI), the Speech Intelligibility Index (SII) and the Speech Transmission Index (STI).

The SII and the STI are considered to represent the state of the art in intelligibility prediction. Although the two models clearly differ, both in how they work and in how they are applied, they have many features in common. Both are based on the observation that the information carried in speech can be thought of as the sum of contributions by individual frequency bands. Both incorporate knowledge of speech perception to arrive at intelligibility predictions in the form of a 0-1 index that is easy to interpret.
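The band-summation idea can be illustrated with a minimal sketch. It is not the standardized SII or STI procedure: it assumes a simplified rule in which each band's signal-to-noise ratio is mapped linearly from a 30 dB range (-15 to +15 dB) onto 0-1 and then weighted by a band weight. The function name band_weighted_index and the equal weights in the usage example are hypothetical and chosen for illustration only; the actual standards add further steps, such as masking and level corrections, and use tabulated band weights.

```python
def band_weighted_index(snr_db, band_weights):
    """Illustrative 0-1 index from per-band signal-to-noise ratios in dB.

    Simplified sketch of the band-summation idea, not the standardized
    SII or STI calculation.
    """
    if len(snr_db) != len(band_weights):
        raise ValueError("need exactly one weight per frequency band")
    total_weight = sum(band_weights)
    if total_weight <= 0:
        raise ValueError("band weights must sum to a positive value")

    index = 0.0
    for snr, weight in zip(snr_db, band_weights):
        # Assumed mapping: -15 dB or lower -> 0, +15 dB or higher -> 1,
        # linear in between.
        contribution = min(max((snr + 15.0) / 30.0, 0.0), 1.0)
        # Normalizing the weights keeps the final index within 0-1.
        index += (weight / total_weight) * contribution
    return index


# Hypothetical example: four bands with equal (illustrative) weights.
example_snr_db = [-15.0, 0.0, 15.0, 30.0]
example_weights = [1.0, 1.0, 1.0, 1.0]
example_index = band_weighted_index(example_snr_db, example_weights)
print(example_index)  # 0.625
```

In this example the four bands contribute 0, 0.5, 1 and 1 respectively, so the equally weighted sum is 0.625. Bands with a signal-to-noise ratio above +15 dB add nothing further, which is why the index saturates at 1.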

The standardized versions of the SII and STI are monaural models, based on single-channel estimates. They were designed to predict intelligibility in diotic listening conditions (i.e., the same signal at the left and right ear), based on measurements with a single microphone. This means that any binaural intelligibility benefit is disregarded. The benefit of listening to speech with two ears instead of one in conditions with background babble is known as the cocktail party effect. A significant body of scientific research on this topic, spanning half a century, provides ample resources to draw on when devising binaural intelligibility models. Models that cover aspects of binaural hearing extend the scope of intelligibility prediction to other applications and yield more accurate predictions in conditions where the two ears receive different signals.

Within the HearCom project, we developed binaural extensions of the SII and the STI. Summaries of the models can be found via the links below.