
UNTREF, Sound Engineering, Acoustic Laboratory

November 2013, Argentina

RELATION BETWEEN OPTIMUM RELEASE TIME ON A DYNAMIC RANGE COMPRESSOR AND THE EFFECTIVE DURATION OF THE RUNNING AUTO-CORRELATION FUNCTION (τe)
RAMÓN FACUNDO¹

¹ Universidad Nacional de Tres de Febrero, Sound Engineering, Caseros, Argentina. facundo.ramon@gmail.com

Abstract

Release time on a dynamic range compressor is considered optimum when the recovery of the signal level is not perceived. When a compressor is used, the recovery of the signal level can be perceived as loudness variation or as amplitude modulation, depending on the chosen release time. Loudness perception can be related to signal parameters such as the spectral energy distribution or the effective value of the auto-correlation function. The perceptible index of amplitude modulation can also be related to the signal spectrum and the modulation frequency. In this paper, the relation between the effective duration of the signal's running auto-correlation function (r-ACF) and the detection of level increments at different speeds is investigated through subjective tests. Although the test results are inconsistent, there is weak evidence of a possible relation between both variables, which suggests that slow level variations are more distinguishable when the effective duration of the signal's r-ACF is low.

1. INTRODUCTION

Dynamic range compressors serve to automatically adapt the dynamic range of a signal to the limits given by the storage or transmission medium. For example, sound reinforcement systems use dynamic range compressors, such as limiters, to protect the loudspeakers against possible signal excesses. In music recording, mastering experts use compressors to fit the signal level within the boundaries of the recording medium. In music production, compressors are also used with an artistic purpose, for example to emphasize one instrument in a mix or to increase the perceived loudness of a final mix [1]. Furthermore, compressors are widely used in hearing aids to minimize distortion for high-level input signals [2].

At least four variables must be defined on a dynamic range compressor: threshold level, attack time, release time and ratio of attenuation. Basically, the dynamic processor constantly measures the input signal level.
If it exceeds the threshold level for a period longer than the attack time, the signal is attenuated in a proportion defined by the ratio of attenuation. Then, when the signal level decreases below the threshold level, the attenuation returns to 0 dB in a time defined by the release time variable. Therefore, automatic dynamic attenuation is based on the signal level and on predefined variables. The challenge for the sound engineer is to choose and combine the variables correctly to achieve the desired dynamic range or artistic effect without making the compression perceptible to the listener [3]. When used wrongly, dynamic compression can be perceived as distortion due to excessive attenuation, or as loudness variation or amplitude modulation caused by a bad choice of attack and release times [4].

Release time is the only purely subjective variable; its value is defined only by perception. Attack time is restricted by the tolerance of the medium to transients: a slow attack time can let through fast but powerful signal transients that could damage the transmission or recording medium, while a very fast attack time is perceived as unnatural and also decreases loudness, because the attenuation is controlled by the signal's transients and not by its integrated energy [4]. The required dynamic range defines the threshold level and the ratio of attenuation, although low thresholds and extreme ratios can also be perceived as distortion [4, 5]. Only release time depends entirely on perception; generally, it is adjusted to achieve an undetectable recovery of the signal level. Slow release times can generate holes in the loudness perception, while fast release times generate rapid level variations, perceived by listeners as amplitude modulation [4]. The optimum release time is the one that makes the recovery of the signal level imperceptible.

Just-noticeable changes in amplitude have been studied by Zwicker and Fastl [6]. It has been shown that the minimum detectable amplitude modulation index of a 1 kHz sinusoidal tone depends on the modulation frequency and, therefore, on the speed of level variation. It has also been shown that sensitivity to the modulation index differs between a pure tone and a wide-band noise. Therefore, the perception of amplitude variation also depends on signal characteristics such as bandwidth.

The autocorrelation function (ACF) provides signal descriptors that can be correlated with the human auditory system [7, 8]. The effective duration of the autocorrelation function (τe) represents the repetitiveness of a signal, or a measure of its own reverberation. It was found that the loudness perception of equal-energy sharp notch filtered noise is correlated with its τe value. This extends the critical band concept [9] and establishes new points of view for loudness estimation [10].

If two audio signals have equal electrical energy, a compressor acts equally on both; however, the perceived amplitude variation may not be equal. It has been reported that the level meters in regular dynamic compressors are inconsistent with well-known facts about loudness perception by the human auditory system [11]. The use of a filter before the level meter has been proposed, so that the measurement follows the concept of the Fletcher and Munson curves [12]. Nevertheless, implementing a new level meter is incompatible with existing compressors. Furthermore, a weighted level measurement can affect the main purpose of a dynamic range compressor: dynamic range control.

The objective of this investigation is to find out whether there is a relation between the detection of loudness increments at different speeds (different release times) and the effective duration of the signal's r-ACF. This can be helpful for setting release times when dynamic range compressors are used, and it can lead to the automation of release times as a function of the signal's ACF.

2. DYNAMIC RANGE COMPRESSORS

Typically, a dynamic range compressor is made of two stages: a level meter stage and a gain controller stage (see figure 1) [4, 13, 14]. The signal can be measured before the gain controller (feed-forward or look-ahead compressor) or after it (feed-back or regular compressor). The main difference between both topologies is the possibility of achieving an instantaneous attack time [13]. However, this does not affect the release time.
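As an illustration of the threshold and ratio variables introduced above, the following minimal sketch computes a static compression curve. It assumes the common hard-knee definition in which every `ratio` dB of input above the threshold produces 1 dB of output above it; the paper does not state this formula, and the function names and numeric values are hypothetical.

```python
def compressed_level_db(level_db, threshold_db, ratio):
    """Output level in dB for a hard-knee static curve (assumed definition)."""
    if level_db <= threshold_db:
        return level_db  # below threshold the signal is left untouched
    # above threshold, the excess over the threshold is divided by the ratio
    return threshold_db + (level_db - threshold_db) / ratio


def gain_reduction_db(level_db, threshold_db, ratio):
    """Attenuation in dB applied by the assumed static curve."""
    return level_db - compressed_level_db(level_db, threshold_db, ratio)


# Hypothetical settings: threshold at -20 dB, ratio 4:1, input at -8 dB.
out_db = compressed_level_db(-8.0, -20.0, 4.0)
reduction_db = gain_reduction_db(-8.0, -20.0, 4.0)
```

The attack and release times then only determine how quickly this static attenuation is applied and removed.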
[Figure 1 block labels: Signal Input, Gain Control, Signal Output, Level Meter.]

A basic and simplified analog level meter is shown in figure 2.

Figure 2: Simplified level meter circuit schematic.

The signal is half-wave rectified by the diode D1. The variable resistor R1 defines the charging time of capacitor C1, as shown in equation 1.

$$V(t) = V_f \left(1 - e^{-t/RC}\right)$$ (1)

where V is the capacitor voltage in volts, V_f is the applied voltage in volts, R is the resistance in ohms, C is the capacitance in farads, and t is the time in seconds. After being charged, capacitor C1 discharges through the variable resistor R2, which defines the capacitor's discharge time.

$$V(t) = V_f \, e^{-t/RC}$$ (2)

The level obtained at the output can be used to drive a voltage-controlled amplifier (VCA). In that case, the attack time is set with resistor R1 and the release time with resistor R2. The capacitor's voltage drives the attenuation (when it is fully charged, attenuation is maximum). Therefore, when the threshold is exceeded, the attenuation varies as shown in figure 3, and when the signal level decreases, the attenuation decreases as shown in figure 4.
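The charging and discharging curves of equations 1 and 2 can be sketched numerically as follows. This is only an illustration: the component values and the starting gain of 0.5 are hypothetical, and `release_gain` assumes that the remaining attenuation decays like the capacitor discharge of equation 2, so that the gain recovers towards unity.

```python
import numpy as np


def charge(t, v_f, r, c):
    """Capacitor voltage while charging through R (eq. 1)."""
    return v_f * (1.0 - np.exp(-t / (r * c)))


def discharge(t, v_f, r, c):
    """Capacitor voltage while discharging through R from v_f (eq. 2)."""
    return v_f * np.exp(-t / (r * c))


def release_gain(t, rc, start_gain=0.5):
    """Linear gain recovering from start_gain towards 1 with time constant rc.

    Assumption: the remaining attenuation (1 - gain) decays as in eq. 2.
    """
    t = np.asarray(t, dtype=float)
    if rc <= 0:
        return np.ones_like(t)  # no release stage: unity gain throughout
    return 1.0 - (1.0 - start_gain) * np.exp(-t / rc)


# Hypothetical component values: 100 kOhm and 10 uF give RC = 1 s.
R2, C1 = 100e3, 10e-6
t = np.linspace(0.0, 5.0, 501)   # 5 s in 10 ms steps
gain = release_gain(t, R2 * C1)  # starts at 0.5 and approaches 1
```

With these values, `charge(1.0, 1.0, R2, C1)` is close to 0.632, that is, after one time constant the capacitor has reached about 63.2% of the applied voltage.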
Figure 1: Generic dynamic range feedback compressor simplified schematic.

[Figure 3 plot, three panels: Input Signal (Amplitude), Gain (Gain [Times]) and Output Signal (Amplitude), all versus Time [s].]

Figure 3: Attenuation during attack time.

The level meter is constantly measuring signal level and deciding how much attenuation is needed.

[Figure 4 plot, three panels: Input Signal (Amplitude), Gain (Gain [Times]) and Output Signal (Amplitude), all versus Time [s].]

Figure 4: Attenuation during release time.

The product of resistance and capacitance is known as the time constant of the system (see eq. 3).

$$RC = \tau$$ (3)

This is the time the capacitor needs to reach 63.2% of its full charge (V_f). Because of its common use in dynamic range compressors, release time is defined as the time needed for the signal to recover 63.2% of its non-attenuated value [3]. It can normally be set from 0.1 seconds up to 25 seconds. Although the attenuation curve can also be linear [13], this paper focuses on compressors with exponential attenuation because it is considered the most common.

3. RUNNING AUTO-CORRELATION FUNCTION

The running auto-correlation function (r-ACF) is defined as follows:

$$\Phi_p(\tau) = \Phi_p(\tau; t, T) = \frac{1}{2T} \int_{t-T}^{t+T} p'(u)\, p'(u+\tau)\, du$$ (4)

where 2T is the integration interval and $p'(t) = p(t) * s(t)$. Here $p(t)$ is the signal's waveform and $s(t)$ is the impulse response of an A-weighting filter. The function compares the signal with a time-displaced copy of itself over finite intervals. From this comparison, information related to human perception can be extracted [7, 8]. The value of the function at $\tau = 0$ gives the energy at the origin of the delay. Note that this energy is weighted by the A-weighting filter; therefore, it is an indicator of loudness [7, 8]. The envelope of the r-ACF normally decays as the delay increases. The effective duration of the r-ACF (τe) is the time the envelope takes to decay by 10 dB from its initial value. This parameter gives information about the signal's self-correlation: a high value indicates periodicity or repeatability.

Relations between a signal's τe and loudness perception were found [7, 8, 15]. It was shown that loudness perception is related to the signal's repeatability [15]: repetitive signals are perceived as louder than non-repetitive ones. Also, operatic singing with vibrato was analyzed as a function of (τe)min and a correlation was found [16]: vibrato decreases the signal's (τe)min value. This means that amplitude and frequency variations can be related to the signal's (τe)min value. Also, because the r-ACF involves an A-weighting filter, the frequencies to which the human ear is most sensitive are prioritized.

4. TEST

Dynamic range level variation was simulated on signals with different effective durations of the r-ACF. Subjects were asked to answer, yes or no, whether the level increment was perceived.

4.1. Signals

Three signals with different τe values were generated. The value of τe was controlled by applying sharp notch filters with different bandwidths to white noise [7, 8]. First, 5 seconds of white noise were generated using Adobe Audition 1.5, with rise and fall times of 10 ms. Second, the noise was filtered with the Scientific Filters tool of Adobe Audition 1.5. A 48th-order band-pass filter with a center frequency of 500 Hz was used. The bandwidths were 28 Hz, 40 Hz and 150 Hz. Third, the signals were normalized to the same RMS value using the Statistical Analysis tool of Adobe Audition. The value of τe was obtained using an iterative method developed by Sato and Wu [17]. The integration time was set at 0.5 s, the time lag at 0.2 s and the running step at 0.1 s [18].

[Figure 5 plot: τe [ms] versus time [s] for Signal 1, Signal 2 and Signal 3.]

Figure 5: τe as a function of time. Signal 1 corresponds to 28 Hz bandwidth noise. Signal 2, 40 Hz of bandwidth. Signal 3, 150 Hz of bandwidth.

Table 1: Mean, median and minimum τe values (effective duration of the ACF).

            Bandwidth   Mean    Median   Minimum
Signal 1    28 Hz       77 ms   74 ms    47 ms
Signal 2    40 Hz       40 ms   38 ms    26 ms
Signal 3    150 Hz      21 ms   17 ms    13 ms
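To make the τe values above more concrete, the sketch below estimates the effective duration of a running ACF with NumPy. It is a simplified stand-in and not the iterative method of Sato and Wu [17]: it assumes that τe can be obtained by fitting a straight line to the peaks of the normalized ACF envelope (in dB) down to -5 dB and extrapolating to -10 dB, it omits the A-weighting filter, and it computes the ACF only from samples inside each frame. All function and parameter names are hypothetical; the default frame length, step and maximum lag follow the 0.5 s, 0.1 s and 0.2 s settings quoted in the text.

```python
import numpy as np


def normalized_acf(frame, max_lag):
    """Normalized autocorrelation of one frame for lags 0..max_lag (samples)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    nfft = 1 << (2 * n - 1).bit_length()  # zero-pad to avoid circular wrap-around
    spec = np.fft.rfft(frame, nfft)
    acf = np.fft.irfft(spec * np.conj(spec), nfft)[: max_lag + 1]
    return acf / acf[0]


def effective_duration(acf, fs, fit_floor_db=-5.0):
    """Estimate tau_e in seconds: delay at which the ACF envelope is 10 dB down.

    Assumption of this sketch: a line is fitted to the local peaks of
    10*log10|acf| that lie above fit_floor_db and extrapolated to -10 dB.
    """
    mag = np.abs(acf)
    # local maxima of |acf|; lag 0 is always included as the 0 dB start point
    is_peak = (mag[1:-1] >= mag[:-2]) & (mag[1:-1] > mag[2:])
    idx = np.concatenate(([0], np.where(is_peak)[0] + 1))
    level_db = 10.0 * np.log10(np.maximum(mag[idx], 1e-12))
    keep = level_db >= fit_floor_db
    keep[:2] = True  # always fit at least the first two peaks
    slope, intercept = np.polyfit(idx[keep] / fs, level_db[keep], 1)
    return (-10.0 - intercept) / slope


def running_tau_e(x, fs, integration_s=0.5, step_s=0.1, max_lag_s=0.2):
    """tau_e for successive frames of length integration_s, hopped by step_s."""
    n_frame = int(round(integration_s * fs))
    n_step = int(round(step_s * fs))
    max_lag = int(round(max_lag_s * fs))
    starts = range(0, len(x) - n_frame + 1, n_step)
    return np.array([
        effective_duration(normalized_acf(x[s:s + n_frame], max_lag), fs)
        for s in starts
    ])
```

Applied to each of the three filtered noises, `running_tau_e` would return one τe estimate per 0.1 s step, from which a mean, median and minimum could be taken; the numbers need not match Table 1 because the estimator differs from the one used in the paper.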

The energy at the origin of the delay, Φ(0), of each signal is shown in figure 6.

[Figure 6 plot: Φ(0) [dB] versus time [s] for Signal 1, Signal 2 and Signal 3.]

Figure 6: Φ(0; t, 0.5). Signal 1 corresponds to 28 Hz bandwidth noise. Signal 2, 40 Hz of bandwidth. Signal 3, 150 Hz of bandwidth.

Finally, using MATLAB, the audio signals were multiplied by the function defined by eq. 1 with different RC values. The release time (RC value) was set from 0 to 3 seconds in steps of 0.5 seconds. Therefore, 7 signals were generated from each of the 3 filtered noises; in total, 21 signals were generated. The gain always started at 0.5 times, or -6 dB.

[Figure 7 plot, three panels: Input Signal (Amplitude), Gain (Gain [Times]) and Output Signal (Amplitude), all versus Time [s].]

Figure 7: Example. Signal 2 multiplied by an envelope with 1 second of release. Gain at the beginning is -6 dB.

4.2. Calibration

The test was done with headphones [19]. A headphone amplifier was used to test multiple listeners simultaneously. For signal level calibration, a dummy head, a sound level meter, a loudspeaker, headphones and headphone amplifiers were used. The sound level meter was used to set 75 dB SPL (Z-weighted) at one meter from the loudspeaker while the non-attenuated stimulus was being reproduced. Then, the dummy head was placed at the position of the sound level meter, facing the loudspeaker, and the same stimulus was reproduced and recorded. Finally, with the headphones on the dummy head and driven by the headphone amplifier, the same stimulus was reproduced. The gain was set to obtain an equal input signal on the digital recorder. Therefore, the SPL generated by the headphones was considered to be 75 dB SPL.

4.3. Subjects and procedure

Twenty normal-hearing subjects between 20 and 28 years old participated in groups of eight or fewer. They were given printed instructions (see annex I) and also a spoken explanation (see figure 8). The subjects were asked to decide, yes or no, whether a level increment was detected during each stimulus. Beforehand, non-attenuated stimuli were presented to familiarize them with the sharp notch filtered noise. The stimuli were presented in random order, one after another, and were repeated as many times as the listeners needed to make a choice.

Figure 8: Group of eight participants before test.

No comparison between stimuli was made because it would not represent the real phenomenon. When compressed signals are listened to, the aim is to avoid the instantaneous perception of a level increment, with no reference other than the preceding sound.

5. RESULTS

The percentage of level increment detection for each signal as a function of the release time is presented in figure 9.

[Figure 9 plot: % of detection versus release time [s] for Signal 1, Signal 2 and Signal 3.]

Figure 9: Percentage of level increment detection as a function of release time for each signal.

Signal 1 (mean τe = 77 ms) reaches its maximum level increment detection percentage with a release time of 2.5 s. Detection can be considered random for release times of 0 seconds (no attenuation applied), 0.5, 1.0, 1.5 and 3.0 seconds. The level increment was detected at 2.0 and 2.5 seconds, where the detection percentage is >= 75%. Signal 2 (mean τe = 40 ms) has 5% detection when the release time is 0 seconds. When the release time increases, detection becomes random. No detection occurred at release times of 1.5, 2.5 and 3 seconds. Signal 3 (mean τe = 21 ms) shows 0% level increment detection for a release time of 0 seconds. The level increment was not detected at 0.5 seconds and was clearly detected at 1.5 seconds. The trend line of the percentages as a function of the release time is shown in figure 10 for each signal.

[Figure 10 plot: % of detection versus release time [s] for Signal 1, Signal 2 and Signal 3, each with its linear trend line.]

Figure 10: Trend lines of the percentage of detection as a function of release time for each signal.

Signal 3 shows a positive slope on its trend line, while Signals 1 and 2 do not have significant slopes. The trend line of Signal 1 stays between 50 and 70%, which indicates a split decision about the detection.

6. DISCUSSION

Listeners were not able to distinguish the level increment on signal 1. Detection values of 50% indicate that listeners were guessing. This is attributed to the nature of the stimulus. Noise has random amplitude and random frequency; if a sharp band-pass filter is applied, the frequency becomes well defined but the amplitude remains random. Slow amplitude variations are then confused with the signal's own random amplitude variations. When the bandwidth increases, detecting the absence of amplitude variation becomes easier because the signal itself is more constant in amplitude.

Signal 2 is considered to have enough bandwidth to be perceived as constant noise. However, when a release time is applied, level increments are not frequently detected. This can be explained by observing figure 6. Signal 2 has slope changes in its Φ(0) function. Times with a positive slope match release times with a high percentage of detection, and times with a negative slope match release times with low detection. This would mean that, when attenuation is applied, the energy variations can add up or cancel out. This effect could also explain the high percentage of detection of signal 1 for release times between 1.5 s and 2.5 s.

Signal 3 has enough bandwidth to be perceived as constant when it is not attenuated. Its Φ(0) function is almost constant except from 2 to 3 seconds. This increment matches the increase in detection at a release time of 1.5 s: after the gain reaches its 63.2% value, the signal's energy increases by 2 dB.

Nevertheless, the trend lines show a clear difference between signal 2 and signal 3. While signal 2 has a flat trend line, signal 3 shows a positive slope. This could mean that slow amplitude increments are perceived more easily for signals with a low τe value than for signals with a long τe value.

7. CONCLUSIONS

Sharp notch filtered noise can be thought of as a tone with random amplitude. Therefore, amplitude increment detection is more difficult with narrow-band noises than with wide-band noises. There is a clear difference between the trend lines of signal 2 and signal 3. While signal 2 is almost immune to level increments at different speeds, detection for signal 3 increases with longer release times. Both observations suggest the existence of a relation between the signal's τe value and amplitude increment detection at different speeds. However, the signals show dynamic variations in their Φ(0) that can affect the results. Therefore, a relation between the signal's τe value and the optimum release time was not found.
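The 50% and 75% detection levels mentioned in the results can be put in context with a short binomial calculation. The sketch below assumes that each of the twenty listeners gave one independent yes/no answer per stimulus and that a guessing listener answers "yes" with probability 0.5; both are assumptions of this illustration, not statements from the test protocol, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


n_listeners = 20                          # number of subjects in the test
k_yes = round(0.75 * n_listeners)         # 75% detection corresponds to 15 "yes" answers
p_chance = prob_at_least(k_yes, n_listeners)  # roughly 0.021 under pure guessing
```

Under these assumptions, a detection percentage of 75% or more would arise from pure guessing in roughly 2% of cases, whereas percentages close to 50% are what guessing is expected to produce.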

8. REFERENCES

[1] A. P. Kefauver. The Audio Recording Handbook. A-R Editions, Inc. Chap. 8, pp. 220-223. Middleton, USA. 2001.
[2] P. G. Stelmachowicz, D. E. Lewis, B. Hoover, and D. H. Keefe. Subjective effects of peak clipping and compression limiting in normal and hearing-impaired children and adults. Journal of the Acoustical Society of America, Vol. 105, pp. 412-422. January, 1999.
[3] D. M. Huber. Modern Recording Techniques. Elsevier Inc. Chap. 12, pp. 486-497. Burlington, USA. 2005.
[4] B. A. Blesser. Audio Dynamic Range Compression For Minimum Perceived Distortion. IEEE Transactions on Audio and Electroacoustics, Vol. AU-17, No. 1. March, 1969.
[5] N. B. H. Croghan, K. H. Arehart. Quality and loudness judgments for music subjected to compression limiting. Journal of the Acoustical Society of America, Vol. 132, No. 2, pp. 1177-1188. August, 2012.
[6] E. Zwicker, H. Fastl. Psychoacoustics, Facts and Models. Springer. Chap. 7, pp. 175-180. Germany. 1999.
[7] Y. Ando. Architectural Acoustics: Blending Sound Sources, Sound Fields, and Listeners. Springer. Chap. 3, pp. 7-23. New York, USA. 1998.
[8] Y. Ando, P. Cariani. Auditory and Visual Sensations. Springer. New York. 2009.
[9] E. Zwicker. Subdivision of the audible frequency range into critical bands (Frequenzgruppen). Journal of the Acoustical Society of America, Vol. 33, pp. 248. 1961.
[10] J. Hots, J. Rennies, J. L. Verhey. Loudness of sounds with a subcritical bandwidth: A challenge to current loudness models? Journal of the Acoustical Society of America, Vol. 134, No. 4. September, 2013.
[11] R. J. Cassidy. Dynamic Range Compression Of Audio Signals Consistent With Recent Time-Varying Loudness Models. IEEE International Conference, Vol. 4, pp. 213-216. May, 2004.
[12] ISO 226:2003, Acoustics -- Normal equal-loudness-level contours.
[13] R. C. Mathes, S. B. Wright. The compandor, an aid against static in radio telephony. Bell System Technical Journal, Vol. 13, pp. 315-332. 1934.
[14] F. Floru. Attack and release time constants in rms-based compressors and limiters. A.E.S. Convention #99. New York, USA. 1995.
[15] Y. Soeta, K. Yanai, S. Nakagawa, K. Kotani, K. Horii. Loudness in relation to iterated rippled noise. Journal of Sound and Vibration, Vol. 304, pp. 415-419. 2007.
[16] K. Kato, T. Hirawa, K. Kawai, T. Yano, Y. Ando. Investigation of the Relation Between (τe)min and Operatic Singing with Different Vibrato Styles. Journal Of Temporal Design In Architecture And The Environment, Vol. 6, pp. 35-48. December, 2006.
[17] S. Sato, S. Wu. Comparison of Different Calculation Methods of Effective Duration (τe) of the Running Autocorrelation Function of Music Signals. Acta Acustica United With Acustica, Vol. 97. 2011.
[18] K. Mouri, K. Akiyama, Y. Ando. Preliminary Study On Recommended Time Duration Of Source Signals To Be Analyzed, In Relation To Its Effective Duration Of The Auto-Correlation Function. Journal of Sound and Vibration, Vol. 241, pp. 87-95. 2001.
[19] [Author names unreadable in the source.] Loudspeakers and headphones: The effects of playback systems on listening test subjects. Acoustical Society of America - Proceedings of Meetings on Acoustics, Vol. 19. 2013.

9. ANNEX I

Subjective detection test. Variables: release time and effective duration of the signal's auto-correlation function (τe).

You will be presented with 5-second samples of narrow-band noise centered at 500 Hz, with different bandwidths. In some samples the amplitude is constant, and in others it increases exponentially, emulating the release behaviour of a dynamics processor. Your task is to decide, yes or no, whether you perceive an increase in loudness within the 5 seconds that the sample lasts.

#     1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21
Yes
No

Thank you very much for your time.