Beruflich Dokumente
Kultur Dokumente
discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/280756201
READS
4 AUTHORS, INCLUDING:
Julian Palacino
Rozenn Nicol
Orange Labs
7 PUBLICATIONS 0 CITATIONS
SEE PROFILE
SEE PROFILE
Marc Emerit
Orange Labs
17 PUBLICATIONS 54 CITATIONS
SEE PROFILE
889
The first-order Ambisonics microphone (e.g. Soundfield ) is a both compact and efficient set-up for spatial
audio recording with the benefit of a full 3D spatialization. Another advantage is that the signals delivered by
this microphone (i.e. B-Format) can be rendered over headphones by applying appropriate processing, while
ensuring that the 3D spatial information is preserved. With the growing use of personal devices, it should be
considered that most audio content is listened to over headphones. Thus first order Ambisonics recording
provides an attractive solution to pick-up 3D audio content compatible with headphone reproduction. Binaural
decoding refers to the processing to adapt B-Format for headphone rendering (i.e. binaural format). One
solution is based on binaural synthesis of virtual loudspeakers. One promising way to improve the decoding is
active processing which takes information from a pre-analysis of the sound scene, particularly in terms of spatial
information. This paper will compare various binaural decoders. Starting from a listening test which assesses
existing solutions and which shows that the perceived quality may strongly vary from one decoder to another,
the processing is analyzed step by step. The performances are measured by a set of objective criteria derived
from localization cues.
Introduction
2.1
Objective
2.2
Experimental set-up
890
2.3
Results
3.1
HOA encoding
where
(1)
(2)
is the harmonic order and are the associated
Legendre functions defined in by:
(3)
(4)
:
pressure to the corresponding spherical harmonic
(5)
by the coefficients
. Those coefficients have been
expressed here in the frequency domain. They can also be
given in time domain using the inverse Fourier Transform.
(7)
(9)
are the Ambisonics signals representing the whole acoustic
field by the matrix relation:
(10)
where
(11)
and the gain array
(12)
.
3.2
2.4
Conclusion
HOA decoding
coefficients
of the reconstructed soundfield are
expressed as:
891
where
and
3.4
(13)
(14)
(15)
(18)
because
(19)
3.3
4
Instrumental
binaural decoding
assessment
of
Pre-processing of HRTF
4.1 Criteria
The assessment is focused on the rendering of spatial
information. Therefore it is examined how the localization
cues are reproduced in the signals delivered to the listeners
ears. Sound localization uses mainly 2 kinds of cues: interaural cues (namely the Interaural Time Difference or ITD
and the Inter-aural Level Difference or ILD, as described
by the Lord Rayleighs duplex theory [24]) and monaural
cues [22]. ITD and ILD can be directly compared direction
by direction. ILD is calculated using the method proposed
by Larcher [16]:
(21)
(22)
(25)
where and are the pressures at the left and right ear
respectively. ITD is calculated using the method presented
in subsection 3.4. Monaural cues rely essentially on spectral
features. Therefore, the Inter-Subject Spectral Difference or
http://recherche.ircam.fr/equipes/salles/listen/
http://interface.cipic.ucdavis.edu/sound/hrtf.html
http://www.isr.umd.edu/Labs/NSL/
4
http://www.ais.riec.tohoku.ac.jp/lab/db-hrtf/
5
http://www.sp.m.is.nagoya-u.ac.jp/HRTF/database.html
2
3
892
Table 2: ISSD values for different orders and preprocessing methods (in dB2).
Ambisonics order
000
100
120
130
123
X
X
X
X
ISSD
(dB2)
39.04
39.38
30.82
39.02
30.42
45.24
33.26
33.70
32.98
33.39
45.24
32.73
24.97
32.45
24.66
9.69
3.68
10.94
4.81
11.7
9.69
3.15
2.21
4.28
2.97
0
0.53
8.73
0.53
8.73
X
X
Name
000
100
120
130
123 mean
D/d
/d
18.8
2.5
14.9
3.6
14.6
3.9
15.0
3.7
14.8
3.9
15.6
3.5
(28)
4.3
39.04
39.91
39.55
39.55
39.15
0
100
120
130
123
Interpolation
X
X
ISSD
Degradation
(dB2)
Frequency
smoothness
30
ISSD
(dB2)
Minimum phase
filter + ITD
ISSD
Degradation
(dB2)
d. 1th Order
Experimental protocol
Name
c. 4th order
(27)
4.2
b. 30th order
ISSD
(dB2)
a.Original
(26)
ISSD
Degradation
(dB2)
Pre-processing only
(Before Ambisonics
decoding)
Pre-processing
Name
Experimental Results
893
Conclusion
References
[2] http://www.mozartsurecrans.com/
September 2011)
(consulted
894