Image-source method for room acoustics


Background

The image-source model (ISM) is a well-known technique that can be used in order to generate a synthetic room impulse response (RIR), i.e., a transfer function between a sound source and an acoustic sensor, in a given environment. Once such a RIR is available, a sample of audio data can be obtained by convolving the RIR with a given source signal. This provides a realistic sample of the sound signal that would effectively be recorded at the sensor in the considered environment.

This approach is of considerable importance as it provides a quick and easy way to generate a number of RIRs with varying environmental characteristics, such as different reverberation times, for instance. Consequently, the ISM technique has been used intensively in many application domains of room acoustics and signal processing.


Allen & Berkley's image-source model

Allen & Berkley were the first to propose a full implementation of the ISM technique in a landmark paper in 1979 [1]. Since then, this specific implementation has been used by many researchers.

Allen & Berkley's method however suffers from a number of drawbacks, which are related to the way the image-source model is implemented. One such issue is the fact that the RIR is first built as a histogram registering the impulses (vs. time) generated at the receiver by all the considered image sources. Because the histogram bins correspond to discretised values of time, the time delay of each image-source impulse has to be rounded towards the nearest bin centre value. While this might not be critical for some applications of room acoustics, this rounding process might have significant consequences for applications that rely on the time delay between waveforms received at multiple sensors (e.g., as in microphone array applications). A further drawback of the original ISM implementation is the fact that the resulting histogram of image-source impulses does not even remotely look like a real acoustic transfer function. In their work, Allen & Berkley attempt to solve this issue by high-pass filtering the histogram, which has the effect of basically transforming each Dirac impulse in the histogram into a sinc-like function.

An improvement to Allen & Berkley's original ISM implementation was proposed by Peterson in [2], where each image-source impulse is implemented directly as a (truncated) fractional-delay filter (i.e., a sinc function). This effectively allows the exact representation of non-integer time delays for each image source. Note that the same result is also obtained by computing the transfer function in the frequency domain and then taking the inverse Fourier transform back into the time domain.

Fig. 1 below shows a typical example of a RIR obtained using Allen & Berkley's method in conjunction with Peterson's improvement (i.e., using a frequency-domain implementation).


Figure 1: RIR example using Allen & Berkley's ISM implementation.

Notice how the RIR computed with Allen & Berkley's implementation displays a typically anomalous tail decay. This is in contrast with RIRs that would be typically recorded in a real acoustic environment, an example of which is shown in Fig. 2.


Figure 2: Example of real RIR, recorded in a room with T60≈0.6s.



Improved image-source method

Considering the above defects of Allen & Berkley's ISM implementation, we did some research work on this topic and came up with an improved ISM algorithm that addresses the problems mentioned above. The details of our implementation can be found in [3], but in essence, it uses a frequency-domain simulation of the image sources, and implements a phase inversion upon each sound reflection on the room boundaries. This effectively balances the contribution of each image source at the receiver, and results in improved RIRs that represent a much better approximation of typical real-room RIRs. An example of RIR computed using our method is shown in Fig. 3 below.


Figure 3: Example of RIR simulated with the method proposed in [3].

As can be seen in Fig. 3, and in contrast to Fig. 1, the impulse response computed according to [3] is very close to what can be expected when measuring a real transfer function in practice (Fig. 2).


Accuracy of the energy-decay prediction method

In addition to providing more accurate RIRs, the work proposed in [3] also describes a mathematical formula for the prediction of the simulated RIR's energy decay over time (energy decay curve, EDC). This is particularly advantageous since it allows the prior analysis of some of the RIR's characteristics without having to simulate the RIR itself using the ISM, which can be potentially time-consuming (especially for small room volumes, large reverberation times, or high values of sampling frequency). Using this proposed formula, the parameters of the considered environmental setup can be precisely pre-adjusted in order to achieve a desired reverberation time (RT) in the subsequently simulated RIRs [3,4]. This can be simply done as follows. First, define some absorption coefficients in the environmental simulation setup and compute the predicted EDC. If this EDC does not lead to the desired RT level, simply alter the value of absorption coefficients slightly, and re-compute the EDC again. By doing this a few times in a row, a RIR can be simulated that matches the desired characteristics more or less exactly.

Fig. 4 below illustrates the result from such a process. For a given RT value, the EDC prediction method in [3,4] is used to predict the value of absorption coefficients that will achieve exactly that level of reverberation. A RIR is then simulated in this environment (using our improved ISM technique [3]), and the "true" RT value is then measured from this RIR (using Schroeder's integration method), and plotted against the desired value of reverberation time. For each desired RT value, this experiment was repeated for a total of 100 RIRs, simulated in various environments with random source and sensor positions, random room volumes (ranging from 15 to 250 m3), and using both uniform and non-uniform wall absorption coefficients. Fig. 4 presents the median and interquartile range plots for the distribution of the resulting 100 measurements. It also provides a comparison with the method that is currently used by many researchers in order to achieve this result, usually using either Sabine or Eyring's RT formula. The same process as described above is used for these simulations: first, Sabine or Eyring's RT formula is used to determine the absorption coefficients achieving a desired RT, 100 RIRs are then simulated using the image-source model (Allen & Berkley's implementation [1] with Peterson's improvement [2]), and the "real" RT value is then measured in each resulting RIR.


Figure 4: Median and inter-quartile range plot of measured T20 vs. desired T20.

In Fig. 4, the best results are obviously achieved for the line that is closest to y = x (i.e., the measured RT is equal to the desired value). It can be seen that the method proposed in [3,4] achieves results that are pretty close to ideal, which validates the good accuracy of this method. On the other hand, RT prediction formulas such as Sabine or Eyring provide relatively poor results according to this test. These formulas, which were originally developed for real acoustic environments, are known to provide only coarse approximations in practice, let alone for ISM simulations. As shown in Fig. 4, not only is the predicted average RT quite erroneous, but the range of resulting RT values is also quite significant; when using Sabine or Eyring's formula to set a RT of T20 = 0.25s, for instance, the environment's true RT might effectively end up anywhere between about 0.35s and 0.45s, depending on factors such as room volume, absorption coefficient ratios, etc. Consequently, when using these traditional approaches, the real reverberation level resulting in the RIRs will in general be very different from what was originally defined.

Fig. 4 uses T20 as a measure of reverberation (T20 is defined as the time required for the RIR energy to decay from -5 to -25 dB). The reason why this parameter is used here rather than T60 is purely practical. In order to measure T20 in a RIR simulated with the ISM, the RIR only needs to be computed down to approximately -25 or -30 dB. In contrast, accurately measuring T60 (defined as the time required by the RIR energy to reach -60 dB) requires the RIR to be computed down to -65 or -70 dB. This in turn involves a huge number of image sources and hence leads to much longer simulation times (especially when computing 100 RIRs with large reverberation time values!). Furthermore, roundoff errors and other numerical noise during the computations also mean that it is in general near impossible to accurately compute an EDC down to -70 dB in practice. As a result, measuring T60 in a simulated RIR can only be achieved by extrapolation of the original slope of the EDC, which will however lead to inaccuracies due to the fact that the EDC is not usually exactly linear. Therefore, a figure similar to Fig. 4 for T60 would contain significant inaccuracies resulting from the measurement process, despite the method in [3,4] being able to provide accurate results.


About the Matlab code provided on these pages

The Matlab files that can be found on this web site are basically subdivided into two categories. The programs available from the "old" Matlab code page correspond to an implementation of Allen & Berkley's original image-source method [1], combined with Peterson's fractional-delay improvement [2] (implemented in the frequency domain). Despite the availability of our improved implementation (on the "new" code page), I am still providing the original Matlab code here for completeness, but I would only recommend using it if you are specifically looking for Allen & Berkley's original ISM implementation. Otherwise, there is no reason why this old Matlab code should be used instead of the updated version.

Obviously, the implementation provided on the "new" Matlab code page contains the updated and improved ISM implementation, which was made available following the acceptance of our JASA paper [3] (note: it took about one year as well as several e-mail enquiries from us for JASA to complete the first round of reviews for this paper! In the future, I will seriously think twice about submitting a paper to JASA again...). From a usage perspective, these updated Matlab files might look similar compared to the code available in the old implementation. However, most of the new Matlab files are updated and/or optimised versions of the old Matlab code. As a result, the new code version is unlikely to be compatible with the old one, and one version should therefore not be used in conjunction with the other. This does not mean, however, that the old code is not functional in any respect.

Compared to the earlier implementation, the Matlab code for the updated ISM implementation contains the following main improvements:
  • the new code has been further optimised for speed and the RIR and audio data computations now execute faster than in the earlier implementation
  • the new implementation provides more accurate room impulse responses (RIRs) that look more realistic compared to real acoustic measurements (as demonstrated above)
  • the new code makes use of the energy-decay approximation formula, proposed in [3], which simplifies the overall RIR simulation process
  • in order to provide more accurate results, the previous code version first required the simulation of several RIRs in order to obtain an estimate of the environment's reverberation time, which could be potentially very time-consuming; the new code does not require this since the energy-decay approximation formula [3] is used in order to obtain a relatively accurate prediction of the RIR's reverberation time
  • rather than a definition of the room's absorption coefficients, the new code allows the user to simply define the desired value of reverberation time T60 or T20 as an input to the simulation process.
This code is made available for other researchers to use (and hopefully benefit from it!), so feel free to download and use these Matlab files in any way you like. However, keep in mind that all pieces of code available on this web site are subject to the GNU general public license and do not come with any form of warranty.


References

[1] J. Allen and D. Berkley, Image method for efficiently simulating small-room acoustics, Journal of the Acoustical Society of America, vol. 65(4), pp. 943-950, April 1979.
[2] P. Peterson, Simulating the response of multiple microphones to a single acoustic source in a reverberant room, Journal of the Acoustical Society of America, vol. 80(5), pp. 15271529, November 1986.
[3] E. Lehmann and A. Johansson, Prediction of energy decay in room impulse responses simulated with an image-source model, Journal of the Acoustical Society of America, vol. 124(1), pp. 269-277, July 2008.
[4] E. Lehmann, A. Johansson, and S. Nordholm, Reverberation-time prediction method for room impulse responses simulated with the image-source model, Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA'07), pp. 159-162, New Paltz, NY, USA, October 2007.