
RESTORATION OF NONLINEARLY DISTORTED OPTICAL SOUNDTRACKS USING REGULARIZED INVERSE CHARACTERISTICS

PhD thesis

Tamás B. Bakó

Supervisor: Dr. Tamás Dabóczi

BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS
DEPARTMENT OF MEASUREMENT AND INFORMATION SYSTEMS
3rd June 2004

I, the undersigned, Tamás Béla Bakó, declare that I prepared this doctoral dissertation myself and that I used only the sources given. Every part that I took from another source, either verbatim or with the same content but rephrased, is clearly marked with a reference to the source.
The reviews of the dissertation and the minutes of the defence will later be available at the Dean's Office of the Budapest University of Technology and Economics.

Budapest, 3rd June 2004

.....................

Summary in Hungarian (Magyar nyelvű összefoglaló)

The sound of old film recordings is often of poor quality: the reproduced sound is extremely noisy and distorted. Distorted sound tires the audience, who can then concentrate less on the film itself, so the enjoyment of the film decreases. This is why many old films are not worth showing to audiences on television or in cinemas. The distorted sound can, however, be improved by digital signal processing methods.

Since nothing else is available for sound restoration than the distorted and noisy film recording, and we have access neither to the original signal nor to the equipment with which the recording was made, the only possibility for improving the sound quality is post-compensation of the sound. This dissertation proposes new methods for the fast and efficient post-compensation of the nonlinearly distorted sound of old, optically recorded films.

The first part of the dissertation discusses nonlinear models and nonlinear compensation techniques, then explains nonlinear post-compensation in detail, together with the reason why it is a so-called ill-posed problem. The second part presents methods that can handle the ill-posedness of the problem (the sensitivity of the sound restoration to noise added to the distorted signal). The effectiveness of the method is supported by simulations and by the restoration of the sound of film excerpts.

To the muse
Dóra Szász

Acknowledgement
I am very grateful to László Fűszfás and Zoltán Seban for helpful discussions and for finding me the basic literature of film processing. I am also grateful to the Hungarian Radio for the technical support of my research work. The Hungarian National Film Archive, especially Éva Beke, is also acknowledged for giving me film materials to complete my research. Also many thanks to László Balogh, who carefully checked the mathematics in this dissertation and asked me for better explanations.
I would also like to thank the many people who have made the Department of Measurement and Instrumentation Technology such a stimulating environment, including those whose heroic efforts have kept the absurdly nonstandard network running most of the time.

Keywords
The following keywords may be useful for indexing purposes:
Audio restoration, nonlinear compensation, regularization methods, Tikhonov regularization, optical soundtrack, density characteristic.


Summary
This dissertation is concerned with the possibilities of restoring degraded film sound. The sound quality of old films is often not acceptable: the sound is so noisy and distorted that listeners have to make a strong effort to understand the conversations in the film. In this case the film cannot give artistic enjoyment to the listener. This is the reason that several old films cannot be presented in cinemas or on television.
The quality of these films can be improved by digital restoration techniques. Since we do not have access to the original signal, only to the distorted one, we cannot adjust recording parameters or recording techniques. The only possibility is to post-compensate the signal to produce a better estimate of the undistorted, noiseless signal. In this dissertation new methods are proposed for the fast and efficient restoration of nonlinear distortions in optically recorded film soundtracks.
First, nonlinear models and nonlinear restoration techniques are surveyed, and the ill-posedness of nonlinear post-compensation (its extreme sensitivity to noise) is explained. The effects and sources of linear and nonlinear distortions in optical soundtracks are also described. A new method is proposed to overcome the ill-posedness of the restoration problem and to obtain an optimal result. The effectiveness of the algorithm is demonstrated by simulations and by the restoration of real film-sound signals.


Contents

1 Introduction  1
1.1 Overview  1
1.2 Structure of thesis  3

2 Classification of nonlinearities and nonlinear models  5
2.1 Classification of nonlinearities  5
2.2 Representation of memoryless nonlinearities  5
2.2.1 Taylor series and piecewise linear representation  6
2.2.2 Polynomial interpolation  6
2.2.3 Analytical models  8
2.3 Representation of nonlinearities with memory  9
2.3.1 Volterra series  9
2.3.2 Parametric models  10
2.3.3 Threshold models  11
2.3.4 Cascade models  11

3 Techniques for nonlinear compensation  13
3.1 Possible methods for compensation  13
3.2 Pre-distortion  14
3.3 Post-distortion  15
3.3.1 Regularization of the solution  19
3.3.2 Regularization of the iteration  20
3.3.3 Choosing the value of the regularization parameter  22
3.3.4 Bayesian techniques  23
3.4 Audio related post-distortion techniques for reducing nonlinear distortions  24
3.4.1 Histogram equalization  25
3.4.2 Signal reconstruction with known nonlinearity  25
3.4.3 Restoration using nonlinear autoregressive models  26

4 The nonlinear characteristic of movie film  27
4.1 Image formation  27
4.2 Relationship between silver mass and transparency  28
4.3 Relationship between transparency and exposure  29

5 Imperfections in the optical sound-recording techniques  33
5.1 Introduction  33
5.2 Optical sound-recording techniques  34
5.2.1 Variable density method  35
5.2.2 Variable area method  37
5.3 Distortions at variable density method  37
5.4 Distortions at variable area method  40
5.5 Appearance of noise  42

6 Compensation of memoryless nonlinearities  43
6.1 Representation of nonlinearity  44
6.1.1 Representation using a piecewise linear model  44
6.1.2 Representation of the inverse nonlinearity  45
6.2 Identification of the nonlinear distortion  46
6.3 Effect of noise  48
6.4 Compensation of the signal by Tikhonov regularization  50
6.4.1 Comparison of the solution to the optimal least squares solution  53
6.4.2 Finding the appropriate value of the regularization parameter  60
6.4.3 Comparison of the novel method to Morozov's and Hansen's method  64
6.5 Results on synthetically distorted real audio signals  72
6.6 Results on real distorted audio signals  77
6.7 Compensation of the signal to make an unbiased estimate  82
6.7.1 Finding a proper compensation characteristic using an iterative method  84
6.7.2 Proof that the method is convergent under the given constraint  84
6.8 Simulation results  85

7 Conclusions and future possibilities  89
7.1 Conclusions  89
7.2 Suggestions for future research  91
7.2.1 Improved blind identification  91
7.2.2 Adaptivity  92
7.2.3 Elimination of nonlinearities with memory  92

A Brief history of film-sound  93
A.1 Sound-on-disc sound  93
A.2 Sound-on-film sound  94

B Optimal signal restoration in linear systems  99
B.1 Simple linear system  99
B.2 Piecewise linear model with two and more intervals  101

C MATLAB simulation of a realistic photosensitive layer  105

D MATLAB realization of computation of regularized nonlinear characteristics  109

E MATLAB realization of finding the optimal regularization  111

F MATLAB realization of calculation of compensation characteristic for unbiased signal reconstruction  113

List of Tables

5.1 Velocity of different film formats.  36
6.1 Comparison results of the Morozov, Hansen and the new method.  68
6.2 Comparison results of the exact inverse, Tikhonov and the unbiased characteristics.  88

List of Figures

2.1 Block diagram of an LNL system.  12
3.1 Block-scheme of pre-distortion.  14
3.2 Block-scheme of post-distortion.  16
3.3 Original, input signal (x in Fig. 3.2).  17
3.4 Distorted and noisy, observed signal (o in Fig. 3.2).  17
3.5 Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2).  18
4.1 Characteristic of exposure vs. developable crystals in a monosized silver-halide layer for different photon quanta sensitivity (r).  30
4.2 Exposure vs. 1-transmission characteristic of a typical emulsion.  31
4.3 Logarithmic exposure vs. density characteristic of a typical emulsion.  31
5.1 Schematic diagram of variable density method.  34
5.2 Sound-on-film, variable density.  35
5.3 Sound-on-film, variable area.  37
5.4 Schematic diagram of variable area method with electrodynamic mirror oscillograph.  38
5.5 Amplitude response of light intensity controlled variable density sound-recording. Solid line: standard (35 mm) film at 24 fps, dashed: substandard (16 mm) film at 16 fps.  39
5.6 Creation of nonlinear distortions due to light diffusion.  41
6.1 Model of the nonlinearity compensation.  50
6.2 One block from the piecewise linear compensation model.  51
6.3 The supplemented piecewise linear compensation model.  51
6.4 R(pn(n), N(x)) at Gaussian error function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  55
6.5 R(pn(n), N(x)) at Gaussian error function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  55
6.6 R(pn(n), N(x)) at exponential function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  56
6.7 R(pn(n), N(x)) at exponential function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  56
6.8 R(pn(n), N(x)) at square-root function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  57
6.9 R(pn(n), N(x)) at square-root function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  57
6.10 R(pn(n), N(x)) at x^0.2 function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  58
6.11 R(pn(n), N(x)) at x^0.2 function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).  58
6.12 Multisine signal, x, used for the simulations.  65
6.13 Gaussian error function used for the first simulation.  66
6.14 x^5 function used for the second simulation.  66
6.15 Noisy output signal of the first simulation (distortion is made by the Gaussian error function).  67
6.16 Noisy output signal of the second simulation (distortion is made by the x^5 function).  67
6.17 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the Gaussian error function.  68
6.18 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the Gaussian error function.  68
6.19 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the part of x^5.  69
6.20 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the part of x^5.  69
6.21 Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the Gaussian error function.  70
6.22 Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the Gaussian error function.  70
6.23 Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the x^5 nonlinear distortion.  71
6.24 Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the x^5 nonlinear distortion.  71
6.25 Original, not distorted audio signal.  73
6.26 Audio signal synthetically distorted by a γ-function.  73
6.27 Distorted, noisy signal part chosen for parameter determination of the nonlinear function.  74
6.28 Result of parameter search of the nonlinear function.  74
6.29 Estimate error of the iterative algorithm at different regularization values.  75
6.30 True error at different regularization parameters.  75
6.31 Reconstructed signal by the best characteristic estimate.  76
6.32 Reconstructed signal by overregularized characteristic.  76
6.33 Reconstructed signal by underregularized characteristic (note scale change).  77
6.34 Real, nonlinearly distorted and noise contaminated audio signal.  78
6.35 Signal part chosen for parameter determination of the nonlinear function.  78
6.36 Results of parameter estimation of the nonlinearity.  79
6.37 Result of the iterative algorithm.  79
6.38 Reconstructed signal by optimally regularized characteristic.  80
6.39 Reconstructed signal by underregularized characteristic.  81
6.40 Sinusoid excitation signal used for the simulations.  85
6.41 The nonlinear distortion.  86
6.42 Distorted signal.  86
6.43 Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse (right).  87
6.44 Unbiased reconstruction.  88
B.1 Estimation of x in the knowledge of o.  100
B.2 Original and inverse piecewise linear system.  101

Chapter 1

Introduction

1.1 Overview

The optical film-sound recording technology is more than 100 years old. Since then, millions of sound films have been made and stored in the national film archives; they have inestimable artistic value. The task of the archives is not just to preserve these films but also to prepare them for broadcasting and show them to a wide audience. However, most of these films cannot be broadcast because they suffer from several degradations.
There are several distinct types of film degradation. These can be broadly classified into two groups: localised degradations and global degradations. Localised degradations are discontinuities in the waveform which affect only certain samples. Global degradations affect all samples of the waveform. We can distinguish the following sub-classes of degradations [1]:
- clicks and cracklings,
- low-frequency noise transients,
- broad band noise,
- wow and flutter,
- non-linear defects.
Clicks and cracklings are short bursts of interference, random in time and amplitude. The cause of these impulsive disturbances is damage to the sound-carrier material (e.g. scratches or dirt spots on the surface).

Low-frequency noise transients are mainly larger-scale defects than clicks. They are caused by large discontinuities due to glued splices of film rolls or other severe damage to the optical sound recording. These changes in the film material cause special excitations in the light intensity during sound reproduction and hence strong transients in the reproduced sound. These large discontinuities can be heard as low-frequency pulses.
Broad band noise is common to all analogue measurement, storage and recording systems, and in the case of audio signals it is generally perceived as hiss by the listener. It can be composed of electrical circuit noise, irregularities in the storage medium and ambient noise from the recording environment.
Wow and flutter are pitch-variation defects which may be caused by eccentricities in the playback system, motor speed fluctuations or special distortions of the sound carrier (e.g. shrinkage of the film).
Non-linear defects form a very general class that covers a wide range of distortions. In the audio field, the principal causes are [2]:
- saturation in magnetic recording,
- tracing distortion (before compensation was introduced) and groove deformation in records,
- the inherent nonlinearity of optical soundtracks.
There are already many solutions and applications in the scientific literature and on the market that deal with the restoration of local degradations and wide-band noise. Several results have also been published on the elimination of pitch defects. However, relatively little emphasis has been placed on the elimination of non-linear defects; it is a topic of current research interest in DSP for audio [1].
In the last decade, methods for restoring damaged audio recordings have progressed from ad hoc methods, motivated primarily by ease of implementation, towards more sophisticated approaches based on mathematical modeling of the signal and degradation processes.
This thesis addresses the elimination of distortion of optical soundtracks, a previously not extensively investigated problem. Restoration of nonlinear distortions is a special kind of inverse filtering problem. This problem can be ill-posed, which means that during reconstruction of the nonlinearly distorted signal, small uncertainties in this signal can cause strong deviations in the restored one. In this case, our aim is to find a restoration method where both the signal distortion and the level of deviation (more simply, the level of the amplified noise) can be kept low. The aim of this dissertation is to clarify the reasons for nonlinear distortions in the case of optical soundtracks and to propose methods based on digital signal processing to reduce the distortion and avoid the appearance of artefacts in the restored sound.

1.2 Structure of thesis

Chapter 2 introduces the description and representation forms of memoryless nonlinearities and nonlinearities with memory. Chapter 3 examines the possible methods for eliminating the effects of nonlinear distortions and explains in detail the problems and possible solutions of nonlinear post-compensation techniques. The main problem during post-compensation is the amplification of the noise that is present in the original material. Without proper compensation, the noise amplification can be so strong that the resulting sound is worse than the distorted one. In this chapter the origin of the noise amplification is discussed and the methods that can overcome this problem are summarized.
Chapter 4 reviews the nonlinear characteristic of photosensitive materials and presents the analytical equations that describe the nonlinear behaviour. Chapter 5 discusses film-sound recording techniques and how the nonlinear distortions of the photosensitive materials appear in the sound.
Chapter 6 presents two novel methods for composing compensation characteristics for the post-compensation of distorted signals. One of them is based on Tikhonov regularization operators. The aim of this compensation technique is to minimize the estimated value of the energy of the noise and distortion terms together. The method is fast compared to other compensation methods, because it has no iterative steps during the compensation process. Simulations in this chapter also show that the accuracy of the method is as high as that of other compensation methods.
A common problem in the regularization of an ill-posed problem is that we have very little knowledge about the original signal, hence we don't know how much regularization is needed to achieve the optimal result. In this chapter a new method is shown that can automatically find a good estimate of the amount of regularization without user interaction. This is quite important in the film industry and at the film archives, where huge amounts of degraded films are waiting for restoration and there is no time to make several experiments on each film.
The aim of the second compensation method is to produce an unbiased estimate of the original, undistorted signal from the noisy, distorted one.


We also have little knowledge about the nonlinear distortion function, which is another
problem in signal compensation. In chapter 6 a possible method is shown for the identification of the nonlinear function in the knowledge of an analytical, parametrizable formula
about the distortion.
Finally, Chapter 7 presents conclusions and suggests possible directions for future research.

Chapter 2

Classification of nonlinearities and nonlinear models

2.1 Classification of nonlinearities

A system whose input–output relation is described by the operator H(·) is a linear system if, for any inputs x1(t) and x2(t) and for any constant c, the additivity property (eq. (2.1)) and the homogeneity property (eq. (2.2)) are satisfied:

$$H(x_1(t) + x_2(t)) = H(x_1(t)) + H(x_2(t)), \tag{2.1}$$

$$H(c \cdot x(t)) = c \cdot H(x(t)). \tag{2.2}$$

In the case of a nonlinear system the additivity and/or homogeneity properties are not satisfied. Nonlinear systems can be divided into two main categories:
- memoryless nonlinear systems,
- nonlinear systems with memory.
In a memoryless nonlinear system the output at time t depends only on the input at time t and does not depend on previous or subsequent input values. A nonlinear system has memory if the output at time t depends on the input at time t as well as on the inputs over a previous time interval.

2.2 Representation of memoryless nonlinearities

Memoryless nonlinear models are often adequate for representing nonlinearities in systems that have a very wide bandwidth with respect to the signal bandwidth. The main advantage of resorting to such models is their simplicity, ease of application and low computational burden [3]. Good examples of applications that can be represented with memoryless nonlinearities are microwave amplifiers [4], A/D and D/A converters [5], photosensitive materials [6, 7], tube amplifiers [8, 9], several types of transducers [10] and many other applications that cannot be enumerated here for lack of space.

2.2.1 Taylor series and piecewise linear representation

The most elementary model for dealing with nonlinear systems is the Taylor series. The Taylor series provides a polynomial representation of a memoryless nonlinear system. According to [11], James Gregory was the first to discover the Taylor series in 1668, more than forty years before Brook Taylor published it in 1717.
If a real function f(x) has continuous derivatives up to the (n+1)-th order, then this function can be expanded in the following fashion:

$$f(x) = f(a) + \frac{1}{1!}\left.\frac{df(x)}{dx}\right|_{x=a}(x-a) + \frac{1}{2!}\left.\frac{d^2 f(x)}{dx^2}\right|_{x=a}(x-a)^2 + \ldots + \frac{1}{n!}\left.\frac{d^n f(x)}{dx^n}\right|_{x=a}(x-a)^n + R_n, \tag{2.3}$$

where $R_n$, called the remainder after $n+1$ terms, is given by:

$$R_n = \int_a^x f^{(n+1)}(u)\,\frac{(x-u)^n}{n!}\,du = \frac{f^{(n+1)}(\xi)\,(x-a)^{n+1}}{(n+1)!}, \qquad a < \xi < x. \tag{2.4}$$

When this expansion converges over a certain range of x, that is, $\lim_{n\to\infty} R_n = 0$, then this expansion is called the Taylor series of f(x) expanded about a.


If the value of n in eq. (2.3) equals 1, we get a simple linear model, which has an acceptably small error in a given small domain. Linearity has been one of the fundamental principles upon which the theory of signal processing has been structured. Most real-world problems, however, are intrinsically nonlinear and can be modeled as linear ones only within a limited range of values. Piecewise linear models constitute a compromise between the inherent complexity of the nonlinear domain and the theoretical abundance of linear methods.

2.2.2 Polynomial interpolation

In 1903, Weierstrass published a theorem stating that memoryless nonlinear systems that are non-polynomial in nature can be approximately represented, with arbitrary accuracy, by polynomial models over a given range of inputs [12]. This is now known as the Weierstrass approximation theorem. In the 1950s, Davenport and Root showed how the direct method and the transform method can be used to determine the statistical properties of the output of memoryless nonlinear devices [11].
In the late 1960s, Blachman showed that a memoryless nonlinearity can be represented as a generalised Fourier decomposition into a sum of orthogonal polynomials ([13, 14]). The orthogonality of the polynomials for particular input signal properties allowed the polynomial coefficients to be calculated or measured using a cross-correlation method. Appropriate sets of orthogonal polynomials for a number of stationary input signals were discovered well before Blachman's application. In 1939 Szegő attempted to produce a complete bibliography of every paper published on the subject of orthogonal polynomials before that date [15].
The most commonly used orthogonal polynomials are Chebyshev and Hermite polynomials. Chebyshev polynomials, $T_n(x)$, $n = 0, 1, 2, \ldots$, are real functions which form a complete orthogonal set on the interval $-1 \le x \le 1$ with respect to the weighting function $\frac{1}{\sqrt{1-x^2}}$. It can be shown that

$$\int_{-1}^{1} \frac{1}{\sqrt{1-x^2}}\,T_m(x)\,T_n(x)\,dx = \begin{cases} 0 & \text{if } m \neq n \\ \pi & \text{if } m = n = 0 \\ \pi/2 & \text{if } m = n = 1, 2, 3, \ldots \end{cases} \tag{2.5}$$

Since sine wave signals have a $\frac{1}{\sqrt{1-x^2}}$-shaped amplitude distribution, this kind of nonlinearity interpretation is applicable to generate or eliminate certain harmonic distortions in sinusoid excitations [3].
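This harmonic property is easy to check numerically. The following MATLAB sketch (an illustration added here, not part of the original text) uses the identity $T_n(\cos\theta) = \cos(n\theta)$: passing a unit-amplitude sinusoid through the third-degree Chebyshev polynomial yields the pure third harmonic.

```matlab
% Illustrative sketch: the Chebyshev identity T_n(cos(theta)) = cos(n*theta)
% means that a static nonlinearity built from T_n maps a unit sinusoid
% to its pure n-th harmonic.
t  = (0:999)/1000;            % 1 s at 1 kHz sampling (arbitrary choice)
x  = cos(2*pi*5*t);           % 5 Hz unit-amplitude sinusoid
T3 = @(u) 4*u.^3 - 3*u;       % third-degree Chebyshev polynomial
y  = T3(x);                   % output of the memoryless nonlinearity
err = max(abs(y - cos(2*pi*15*t)));  % compare with the 15 Hz harmonic
fprintf('max deviation from pure 3rd harmonic: %g\n', err);
```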
Hermite polynomials, $H_n(x)$, $n = 0, 1, 2, \ldots$, form a complete orthogonal set on the interval $-\infty < x < \infty$ with respect to the weighting function $\exp(-x^2)$. It can be shown that

$$\int_{-\infty}^{\infty} e^{-x^2}\,H_m(x)\,H_n(x)\,dx = \begin{cases} 0 & \text{if } m \neq n \\ 2^n\, n!\,\sqrt{\pi} & \text{if } m = n \end{cases} \tag{2.6}$$

Since Gauss-like signals have an $\exp(-x^2)$-shaped amplitude distribution, this kind of nonlinearity interpretation is applicable to simulate or eliminate distortions in the case of a Gaussian distribution, which is a quite often used signal-modeling assumption.
The advantage of orthogonal polynomials over Taylor ones is that in the case of cascaded systems they do not produce cross-product terms. E.g., when the second and third order harmonic distortion of a system is eliminated by a cascaded polynomial compensation system, the result will not contain new, higher-order terms. Their disadvantage is that this behaviour holds only for a small range of signal types, having a given amplitude distribution.

2.2.3 Analytical models

Several nonlinear physical devices, such as traveling-wave tubes used in radio-frequency communication channels or photosensitive materials, can be described by analytical models, which are special (usually non-polynomial) mathematical functions. The advantage of these functions is that they usually have a physical basis and they can be parametrized, hence the correct identification of a given nonlinearity requires only the optimization of a few parameters.
An example is the case of narrow-frequency excitations such as radio-frequency communication signals, where the relationship between the input and output can be expressed as separate amplitude and phase distortions. If an input radio-frequency signal is expressed as

$$x(t) = r(t)\cos(\omega t + \varphi(t)), \tag{2.7}$$

then the output, y(t), of a traveling-wave tube can be described as

$$y(t) = A(r(t))\cos(\omega t + \varphi(t) + \Phi(r(t))), \tag{2.8}$$

where $A(r)$ and $\Phi(r)$ are the amplitude and phase nonlinear distortions and t denotes time. There are quite a few mathematical approximation formulae for these distortions ([16, 17, 18, 19]).
In the case of optical sound recording, possible analytical formulae could be very important for identification and restoration. Analytical formulae with three or more constants were proposed for photosensitive materials by several authors. They have reasonable agreement with experimental curves, but the theory behind these equations is quite inadequate. Several empirical formulae were proposed in the 1940s, but these formulae were not accurate enough [20]. A more accurate analytical formula for the density vs. log exposure characteristic of photosensitive emulsions was given by Solman and Farnel [21]. It has good agreement with real emulsions, although the photographic fog is not modeled.
A nowadays commonly used formula in optical sound recording is the $\gamma$-curve [22], which can accurately describe a large range of the characteristic. The equation of the $\gamma$-curve is

$$T(E) = 1 - (1 - T_{sat} - T_{fog})\,E^{\gamma} - T_{fog}, \tag{2.9}$$

where T denotes the light-transmission ability of the film after development and E stands for the light exposure on the film before development. $T_{sat}$ determines the lowest light-transmission ability of the film and $T_{fog}$ determines the highest transmission ability that can be achieved. $\gamma$ is a parameter that is different for different film types; its normal range is between about 0.2 and 5.
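As a quick illustration of eq. (2.9), the following MATLAB sketch plots the $\gamma$-curve for a few values of $\gamma$; the $T_{sat}$ and $T_{fog}$ values are arbitrary examples, not measured film data.

```matlab
% Sketch of the gamma curve of eq. (2.9); parameter values are
% arbitrary illustrations, not measured film data.
E    = linspace(0, 1, 500);   % normalized exposure
Tsat = 0.05;                  % lowest transmission (saturation)
Tfog = 0.02;                  % transmission loss due to fog
for gamma = [0.5 1 2 4]
    T = 1 - (1 - Tsat - Tfog) * E.^gamma - Tfog;
    plot(E, T); hold on;
end
xlabel('exposure E'); ylabel('transmission T(E)');
```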

2.3 Representation of nonlinearities with memory

The approaches of nonlinear modelling based on Taylor series and orthogonal series, and the direct and transform methods of nonlinear system analysis, are suitable only for memoryless nonlinearities. However, the development of more complex models to deal with nonlinear systems with memory dates back to the late 19th century.

2.3.1 Volterra series

In 1887 Volterra published a functional series expansion now known as the Volterra series [23]. This generalised form of the Taylor series expansion can be used to represent a nonlinear system with memory. In 1910 Fréchet published a more rigorous representation of the Volterra series, and contributions towards the generalisation of the Weierstrass approximation theorem for functionals, in which the polynomials are replaced by so-called polynomic functionals. Specifically, the generalised Weierstrass approximation theorem states that nonlinear systems with memory that are non-polynomial in nature can be approximately represented, with arbitrary accuracy, by polynomial-based nonlinear functional models over a given range of inputs.
The Volterra series is a very general means of describing a continuous-time output, y(t), in terms of an input, x(t). The Volterra series expansion for a causal, time-invariant system can be expressed as

$$y(t) = H_1[x(t)] + H_2[x(t)] + \ldots + H_n[x(t)], \tag{2.10}$$

in which the n-th degree Volterra operator, $H_n[\cdot]$, is defined by the convolution

$$H_n[x(t)] = \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} h_n(\tau_1, \ldots, \tau_n)\,x(t-\tau_1)\cdots x(t-\tau_n)\,d\tau_1 \cdots d\tau_n, \tag{2.11}$$

and the Volterra kernels, $h_n(\cdot)$, have unspecified form, but $h_n(\tau_1, \ldots, \tau_n) = 0$ for any $\tau_i < 0$, $i = 1, 2, \ldots, n$.
In discrete time, eq. (2.10) becomes [24]

$$H_n[x_t] = \sum_{j_1=0}^{\infty} \cdots \sum_{j_n=0}^{\infty} h_n(j_1, \ldots, j_n)\,x_{t-j_1} \cdots x_{t-j_n}. \tag{2.12}$$

This is a generalisation from linear systems theory: for a linear system, $y(t) = H_1[x(t)]$, the first-degree kernel $h_1(t)$ is the impulse response, which completely describes the system. For higher-degree systems, $h_n(t_1, \ldots, t_n)$ can be thought of as an n-dimensional impulse response.

Discrete Volterra models are widely used in the control literature, classification problems
and artificial neural networks. Present applications in audio include input/output modeling
of audio systems and nonlinear filtering to precompensate for known loudspeaker nonlinearities [25].
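As an illustration of the discrete form (2.12), the following MATLAB sketch evaluates a Volterra model truncated to degrees one and two, with short, arbitrary example kernels; it is a didactic sketch, not an identified model of any real system.

```matlab
% Minimal discrete Volterra model up to degree 2 (eq. (2.12)),
% with arbitrary short example kernels; a sketch, not thesis code.
h1 = [1 0.5];                      % first-degree kernel (impulse response)
h2 = [0.1 0.05; 0.05 0.02];        % second-degree kernel h2(j1, j2)
x  = randn(1, 100);                % test input
M  = length(h1);                   % memory length
y  = zeros(size(x));
for t = M:length(x)
    xv = x(t:-1:t-M+1);            % [x_t, x_{t-1}, ...]
    y(t) = h1 * xv' + xv * h2 * xv';   % H1[x] + H2[x]
end
```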

2.3.2 Parametric models

There are two basic situations in nonlinear system modeling:
- Input/output modeling, in which we have access to both the input and output of the system, and seek to describe the function mapping from present and past (for a causal system) values of the input to the output.
- Time series modeling, in which we have access only to the output of the system. In this case we want to describe the output in terms of an input/output model acting on a random, independent and identically distributed excitation process.
Volterra modeling is a typical example of input/output modeling. An alternative methodology for nonlinear modelling is to use nonlinear time series models. There is a plethora of such models, but there is no universally recognised method to categorise them [25]. For example, Tong [26], Tjøstheim [27], and Chen and Billings [28] take radically different approaches. They can all, however, be treated as generalisations or specialisations of the nonlinear ARMA (autoregressive moving average) model.
In an autoregressive moving average model, an observed output signal, o, can be represented as

$$o_t = \sum_{i=1}^{k} a_i o_{t-i} + \sum_{j=1}^{l} b_j e_{t-j} + e_t, \tag{2.13}$$

where $a_i$ and $b_j$ are weighting factors and $e_i$ is an excitation signal (it can be thought of as an additive noise whose current value is unknown). This equation can be generalized to give a nonlinear ARMA (NARMA) model. This takes the form

$$o_t = f(o_{t-1}, \ldots, o_{t-k}, e_{t-1}, \ldots, e_{t-l}) + e_t, \tag{2.14}$$

where f is now some arbitrary nonlinear function rather than a simple weighted sum. This function could be a polynomial model, which is very similar to a finite-length and finite-maximum-degree Volterra model. If the degree of the polynomial is two, this is the so-called bilinear nonlinear model [25]:


$$o_t = a_0 + \sum_{i=1}^{A} a_i o_{t-i} + \sum_{j=1}^{B} b_j e_{t-j} + \sum_{k=1}^{C}\sum_{l=1}^{D} c_{k,l}\, o_{t-k}\, e_{t-l}. \tag{2.15}$$

2.3.3 Threshold models

In a threshold model [26], different functions f(·) are used depending on the value of the output at some fixed lag d. This introduces nonlinearities even when the functions themselves are linear. It can be written as

$$f(\cdot) = \begin{cases} g_1(\cdot) & \text{if } r_0 \le x_{t-d} < r_1 \\ g_2(\cdot) & \text{if } r_1 \le x_{t-d} < r_2 \\ \quad\vdots \\ g_m(\cdot) & \text{if } r_{m-1} \le x_{t-d} < r_m \end{cases} \tag{2.16}$$

where the thresholds, $r_i$, satisfy

$$r_0 < r_1 < r_2 < \ldots < r_{m-1} < r_m, \tag{2.17}$$

and each $g_i$ can be defined as a linear or nonlinear model.
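A minimal MATLAB sketch of a two-regime threshold model in the spirit of eq. (2.16), with linear sub-models $g_1$, $g_2$ and arbitrary coefficients (an illustrative SETAR-type example, not taken from [26]):

```matlab
% Two-regime threshold AR model in the spirit of eq. (2.16):
% the active linear function depends on the lagged output at lag d.
% Coefficients and threshold are arbitrary; a sketch only.
N = 500; d = 1; r1 = 0;
o = zeros(1, N); e = 0.1*randn(1, N);
for t = 3:N
    if o(t-d) < r1
        o(t) = 0.9*o(t-1) + e(t);                % regime g1
    else
        o(t) = -0.4*o(t-1) + 0.3*o(t-2) + e(t);  % regime g2
    end
end
plot(o);
```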

2.3.4 Cascade models

Rather than using large, general nonlinear models, an alternative approach is to cascade smaller models together, connecting the output of one to the input of the next. This can correspond to the real physical structure of the system itself.
A common cascaded structure is the Linear–Nonlinear–Linear (LNL) or sandwich model illustrated in Fig. 2.1. This model consists of a linear element, h(τ), whose output, u(t), is transformed by a memoryless nonlinearity, N(·). The output of the nonlinearity is processed by a second linear system, g(τ). This system is also called a Wiener–Hammerstein system.
The LNL cascade has two special cases, the Hammerstein system (NL) and the Wiener system (LN). Both the Wiener and Hammerstein models can be linear in the parameters if the component models themselves are linear. Block-oriented models are a generalisation of cascade models that allow arbitrary connections, including feedback and feedforward, between subsystems. They are widely used in the control literature.
Cascaded systems can also be connected in parallel. Palm [29] showed that any finite-dimension, finite-order, finite-memory Volterra system can be represented exactly by a finite sum of

[Figure 2.1: Block diagram of an LNL system: x(t) → h(τ) → u(t) → v = N(u) → v(t) → g(τ) → y(t).]


LNL models. More recently, Korenberg [30] showed that this is true for Wiener cascade elements as well. This is a significant advancement, since the identification algorithms for Wiener models are much simpler than those for LNL cascades [31].
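To make the structure of Fig. 2.1 concrete, the following MATLAB sketch simulates an LNL cascade with arbitrary first-order linear stages and tanh as the memoryless nonlinearity; the element choices are illustrative assumptions only.

```matlab
% LNL ("sandwich") cascade of Fig. 2.1: linear filter h, memoryless
% nonlinearity N, linear filter g. All elements are arbitrary examples.
x = randn(1, 1000);                 % input signal
u = filter([1 0.5], 1, x);          % first linear stage h
v = tanh(u);                        % memoryless nonlinearity N(u)
y = filter(1, [1 -0.3], v);         % second linear stage g
```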

Chapter 3

Techniques for nonlinear compensation

3.1 Possible methods for compensation

When a signal passes through a system having a nonlinear transfer function, the output signal will be distorted. If the distortion is not acceptable, we have to reduce it somehow.
Methods for the compensation or elimination of nonlinear distortions can be divided into three main groups:
- If we can modify the structure of the system, we can re-design it in order to reduce the nonlinear distortion. This is a widely used method in industry. Examples of the reduction of nonlinear distortions of A/D converters can be seen in [32, 33, 5, 34, 35, 36, 37]; examples for current transformers can be seen in [38] and [39]; examples of reducing nonlinear distortions in movie cameras can be seen e.g. in [40]. Unfortunately this method is too widespread to be dealt with here in detail.
- If we can't modify the structure, but we have access to the input, we can pre-distort the original input signal to compensate the distortion.
- If we have access neither to the structure nor to the input, we can post-process the output signal to compensate the distortion.

[Figure 3.1: Block-scheme of pre-distortion: i → P(·) → N(·) → output.]

3.2 Pre-distortion

As discussed in Section 2.3.2, nonlinear modeling, and likewise nonlinear compensation, has two basic situations: input/output modeling, where we have access to the input and the output, and time series modeling, where we have access only to the output. In several applications we have access both to the input, x, and to the output, o, of the nonlinear system. In these cases pre-distortion techniques are preferred. The block scheme is depicted in Fig. 3.1. In the other case, when we have access only to the output of the system, post-distortion techniques can be used.
In the case of pre-distortion the excitation of the nonlinear system is produced by another nonlinear system, in order to eliminate the distortion of the input excitation signal, i, at the output of the two cascaded systems.
The limitation of this method is that the noise level before the original distortion has to be negligibly low, but this can usually be fulfilled. Hence there is no need to care about the extra effects of noise, and the pre-distortion stage can simply be the inverse of the original system.
Pre-distortion is a typical solution at the transmitter side of microwave communication channels, where the transmit amplifier has strong nonlinear distortion. Pre-distorter characteristics were proposed as early as 1972 by Kaye [41], who proposed an analog, memoryless pre-distorter to solve the problem of microwave tubes. A p-th order Volterra inversion for microwave transmit amplifiers was proposed by Biglieri [42]. Other memoryless compensation techniques were proposed by Karam [43] and Pupolin [44]. Neural network approaches can be seen in [45] and [19]. Good surveys of this research field can be found in the article of Lazzarin [46] and in the PhD thesis of Wohlbier [4].
Pre-distortion is used in other fields as well, e.g. predistortion of power amplifiers [47], laser diodes [48] or cathode ray tubes [49].
Audio-related articles typically deal with reducing the nonlinearities of loudspeakers or complete audio systems. Closed-loop system structures were proposed as early as 1977 by Black [50] and in 1983 by Adams [51], who introduced a kind of system re-design. The first pioneer in the pre-distortion field was A. J. M. Kaizer, who made the first loudspeaker models based on truncated Volterra series in 1987 [52]. Solutions for loudspeakers based on Volterra filters were proposed by Klippel [53, 54, 55, 56, 57, 58] and Schurer [38, 59]. Adaptive nonlinear compensators were proposed by Klippel [57] and Sternad [60]. Bellini proposed a solution based on inverting the analytical sound pressure level characteristic of the loudspeaker [61]. Other algorithms were proposed for eliminating acoustic echo by Stenger and Rabenstein [62, 63, 64, 65], based on scalable nonlinearity functions for cancelling nonlinear distortions in hands-free phone systems. The nonlinear function is described by a polynomial series, where the coefficients of the series are the parameters of the nonlinear function. The method can adapt to changes in the parameters of the distortion and can be extended to handle nonlinearities with memory.
In all cases the main problem is to identify the characteristic of the nonlinear system. In some studies the nonlinear characteristic is assumed to be given; others propose identification techniques.

3.3 Post-distortion

While system re-design and pre-distortion are relatively simple tasks, post-distortion is a more difficult one. The difficulty arises because most post-distortion processes are ill-posed. This is also the case for optical soundtracks.
A problem characterized by the equation f(x) = y is well-posed if the following conditions, introduced by Hadamard in the early 1900s, are satisfied [66]:
- the solution exists for each element y in the range of Y;
- the solution x is unique;
- small perturbations in y result in small perturbations in the solution x, without the need to impose additional constraints.
If any of the above conditions is violated, the problem is said to be ill-posed.
Ill-posed problems exist in countless different fields, such as measurement technology [67], spectroscopy [68], optical measurements [69], image restoration [70, 71], high voltage measurements [72, 73, 74], RC network identification [75] and many other fields. Several

[Figure 3.2: Block-scheme of post-distortion: x → N(·) → + (noise n) → o → P(·) → x̂.]


solutions were proposed for linear problems, based on filtering techniques, using regularization operators or singular value decomposition, etc. (A good overview can be found about
these methods e.g. in [76] or [77]). However, relatively small amount of works deal with the
ill-posed problems of nonlinear signal reconstruction. In the followings, these problems will
be examined in details.
In the case of nonlinear post-distortion usually the third ill-posed problem arises: small
perturbations in the measurement will result big deviations in the solution. The schematic
block-scheme of post-distortion can be seen in Fig. 3.2. In this case the noise-source is before
the inverse stage and in a lot of cases the noise level is not negligible. If the inverse system
amplifies the signal, the noise will also be amplified. The amplification could be so strong
that the amplified noise signal covers the original one.
A simulation example for noise amplification can be seen in Fig. (3.33.5). In this
simulation the original sinusoid signal was distorted by a Gaussian error function. The
signal-to-noise ratio was 50 dB. After restoration, the noise was amplified at the top part of
the sinusoid, where the nonlinear curve was nearly flat.
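The effect is easy to reproduce. The MATLAB sketch below mimics the simulation of Figs. 3.3–3.5 (the exact signal parameters of the original experiment are assumptions here): a sinusoid is distorted by a Gaussian error function, noise is added at roughly 50 dB SNR, and the exact inverse is applied.

```matlab
% Reproduction sketch of the noise-amplification example of Figs. 3.3-3.5.
% Signal parameters are assumed, not taken from the original experiment.
t  = linspace(0, 1, 1000);
x  = 2*sin(2*pi*t);                          % original signal
y  = erf(x);                                 % nonlinear distortion, o = erf(x)
n  = 10^(-50/20) * std(y) * randn(size(y));  % noise at about 50 dB SNR
o  = y + n;                                  % distorted and noisy observation
o  = min(max(o, -0.9999), 0.9999);           % clip into the domain of erfinv
xr = erfinv(o);                              % exact inverse amplifies the noise
plot(t, x, t, xr);                           % strong artefacts near the peaks
```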
Given an ill-posed problem, various schemes are available for defining an associated problem which is well-posed [66]. This approach is referred to as regularization of the ill-posed problem. In particular, an ill-posed problem may be regularized by
1. changing the definition of what is meant by an acceptable solution,
2. changing the space to which the acceptable solution belongs,
3. revising the problem statement,
4. introducing regularization operators, and
5. introducing probabilistic concepts to obtain a stochastic extension of the original deterministic problem.

[Figure 3.3: Original, input signal (x in Fig. 3.2).]

[Figure 3.4: Distorted and noisy, observed signal (o = erf(x) in Fig. 3.2).]


[Figure 3.5: Reconstructed signal by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2).]
Inversion problems have been extensively studied since 1960. In the early 1960s Tikhonov began to produce an important series of papers on ill-posed problems. He defined a class of regularisable ill-posed problems and introduced the concept of a regularising operator, which was used in the solution of these problems [78].
While for linear ill-posed problems a very comprehensive regularization theory is available, the development of regularization methods for nonlinear ill-posed problems and the corresponding theory is a quite young and very vital field of research with many open questions [79]. The rigorous analysis of Tikhonov regularization in the nonlinear context was initiated only in 1989, by Engl, Kunisch and Neubauer [80].
Since nonlinear equations generally do not have an analytical solution, these algorithms are mostly iterative ones [81]. In this case there are two points in the algorithms where regularization operators can be used:
- regularization may be required to make the solution well-posed,
- regularization may be required to avoid divergence of the iterative algorithm.
These techniques will be introduced in the next three sections.
Another class of algorithms for handling nonlinear ill-posed problems is based on probabilistic concepts, such as Bayesian algorithms and Markov-chain Monte-Carlo methods [25].

The aim of these techniques is to create a parametric model of the original, undistorted and noiseless signal, then to find the possible parameters of this model based on the noisy and distorted observation, and hence to recreate the original signal. These techniques will be introduced in Section 3.3.4.

3.3.1 Regularization of the solution

Let us consider the following nonlinear problem:

$$y = N(x). \tag{3.1}$$

Our goal is to best approximate eq. (3.1) in the situation when the exact data, y, are not precisely known and only perturbed data, o, with

$$\|y - o\| \le \delta \tag{3.2}$$

are available. Here, δ is called the noise level. This problem is usually ill-posed, because the third condition of Hadamard is not satisfied: small perturbations in o will produce big perturbations in the estimate of x (denoted in the following by x̂), just as in the example of Section 3.3.
A commonly used method for solving this problem is Tikhonov regularization. In Tikhonov regularization, eq. (3.1) is replaced by a minimization problem, where not only the prediction error, $\|N(\hat{x}) - o\|$, is minimized but also other terms connected with the estimated input signal. A practical realization of this minimization problem is

$$\|N(\hat{x}) - o\|^2 + \lambda^2 \|\hat{x} - x_c\|^2 \rightarrow \min, \tag{3.3}$$

where $\lambda > 0$ is the regularization parameter and $x_c$ is some center value, ideally chosen as the critical point of interest, but often just set to zero [82]. In this case, when we try to find the x̂ value which produces the minimum of eq. (3.3), deviations between our initial guess, $x_c$, and our estimate, x̂, are punished, hence big deviations caused by noise won't be allowed.
In eq. (3.3) it is not obligatory to use the norm of $\hat{x} - x_c$. Using other norms leads to the generalized Tikhonov regularization, which can be expressed as

$$\|N(\hat{x}) - o\|^2 + \lambda^2 \|R\{\hat{x}\} - R\{x_c\}\|^2 \rightarrow \min, \tag{3.4}$$

where R(·) is the generalized regularization operator [79, 83].
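For a memoryless nonlinearity N(·) and a constant center value $x_c$, the cost of eq. (3.3) separates sample by sample, so the minimization can be performed independently for each observed sample. The following MATLAB sketch does this by brute-force grid search; it is an illustration only (the thesis develops a more efficient construction in Chapter 6), and the nonlinearity and parameter values are arbitrary.

```matlab
% Sample-by-sample Tikhonov-regularized inversion of a memoryless
% nonlinearity (eq. (3.3)) by grid search. Illustration only.
N      = @(x) erf(2*x);          % example nonlinearity (arbitrary)
lambda = 0.05;                   % regularization parameter (arbitrary)
xc     = 0;                      % center value
xgrid  = linspace(-2, 2, 4001);  % candidate input values
o      = 0.98;                   % one noisy observed sample
cost   = (N(xgrid) - o).^2 + lambda^2 * (xgrid - xc).^2;
[~, k] = min(cost);              % pick the minimizer of the cost
xhat   = xgrid(k);               % regularized estimate of the input
```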

One possibility could be maximum entropy regularization,

$$\|N(\hat{x}) - o\|^2 + \lambda^2 \int \hat{x}(t) \log\!\left(\frac{\hat{x}(t)}{x_c(t)}\right) dt \rightarrow \min, \qquad \hat{x} > 0, \tag{3.5}$$

where $x_c(t)$ is some initial guess about x(t), as in eq. (3.3). In this case $x_c$ is often just 1. For further explanation and examples of nonlinear maximum entropy regularization see, for example, [84, 85, 86, 87].
Another commonly used possibility is bounded variation regularization,

$$\|N(\hat{x}) - o\|^2 + \lambda^2 \int \left|\frac{d\hat{x}(t)}{dt}\right| dt \rightarrow \min, \tag{3.6}$$

which enhances sharp features in x̂, as needed in, e.g., image reconstruction; see [88, 89, 90, 71, 91].
In the case of monotone nonlinear functions, where

$$N(x_2) - N(x_1) \ge 0 \quad \text{if } x_2 - x_1 \ge 0, \tag{3.7}$$

the least squares minimization can be avoided and one can use the simpler regularized equation

$$N(\hat{x}) + \lambda(\hat{x} - x_c) = o, \tag{3.8}$$

which is called Lavrentiev regularization or the method of singular perturbation [92]. This method preserves the original structure of the problem and can sometimes lead to easily implemented, localized approximation strategies [93].
Since eqs. (3.3)–(3.6) and (3.8) are nonlinear equations, their analytical solution is generally not possible. The common approach is to solve the problem by iterative methods, which are discussed in the next section.

3.3.2 Regularization of the iteration

The first candidate for solving eq. (3.1) in an iterative way is Newton's method [81], that is, the iterative solution of the output least squares problem

$$\|o - N(\hat{x})\|^2 \rightarrow \min, \tag{3.9}$$

where ‖·‖ corresponds to the L2 norm. (Of course, regularization methods can also be used on all the other equations discussed in the previous section, but for simplicity and easy understanding, the iterative methods will be shown on eq. (3.9).) In this case the minimum condition of eq. (3.9) simplifies to

$$\left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}} \bigl(o - N(\hat{x})\bigr) = 0. \tag{3.10}$$

From eq. (3.10), Newton's method can be described as

$$\hat{x}_{k+1} = \hat{x}_k + \left(\left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}_k}\right)^{-1} \bigl(o - N(\hat{x}_k)\bigr), \tag{3.11}$$

starting from an initial guess, $\hat{x}_0$. Even if the iteration is well defined and $\frac{dN(\xi)}{d\xi}$ is invertible for every x̂, the inverse is usually unbounded for ill-posed problems. Hence eq. (3.11) is inappropriate in this case, since each iteration means solving a linear ill-posed problem, and some regularization technique has to be used instead. Applying Tikhonov regularization yields the Levenberg–Marquardt method [94]

$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left.\frac{dN(\xi)}{d\xi}\right|^2_{\xi=\hat{x}_k} + \mu_k}\,\left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}_k} \bigl(o - N(\hat{x}_k)\bigr), \tag{3.12}$$

where $\mu_k$ is a sequence of positive numbers. Augmenting eq. (3.12) by the term

$$-\,\frac{\mu_k\,(\hat{x}_k - x_c)}{\left.\frac{dN(\xi)}{d\xi}\right|^2_{\xi=\hat{x}_k} + \mu_k} \tag{3.13}$$

for additional stabilization gives the iteratively regularized Gauss–Newton method [95]

$$\hat{x}_{k+1} = \hat{x}_k + \frac{1}{\left.\frac{dN(\xi)}{d\xi}\right|^2_{\xi=\hat{x}_k} + \mu_k}\left[\left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}_k} \bigl(o - N(\hat{x}_k)\bigr) - \mu_k\,(\hat{x}_k - x_c)\right]. \tag{3.14}$$
The other widely used iterative method is the steepest descent method [96],

$$\hat{x}_{k+1} = \hat{x}_k - \mu_k \left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}_k}, \tag{3.15}$$

where $\mu$ is an appropriately chosen positive value or sequence of positive values. If

$$\mu_k = N(\hat{x}_k) - o, \tag{3.16}$$

this leads to the so-called Landweber iteration [97],

$$\hat{x}_{k+1} = \hat{x}_k - \left.\frac{dN(\xi)}{d\xi}\right|_{\xi=\hat{x}_k} \bigl(N(\hat{x}_k) - o\bigr). \tag{3.17}$$

Another nonlinear iterative method that is based on the steepest descent algorithm is [98]

$$\hat{x}_{k+1} = \hat{x}_k + \mu \bigl(o - N(\hat{x}_k)\bigr). \tag{3.18}$$

For a more detailed explanation of techniques based on Newton's method see e.g. [99].
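A scalar MATLAB sketch of the Landweber step of eq. (3.17), with a relaxation factor μ added here for stable steps (the nonlinearity and all parameter values are arbitrary examples). Note how the vanishing derivative on the flat part of N(·) makes the iteration depend strongly on the initial guess — the ill-posedness discussed above.

```matlab
% Landweber-type iteration (eq. (3.17)) for a scalar memoryless
% nonlinearity, with an added relaxation factor mu. Sketch only.
N  = @(x) erf(2*x);                     % example nonlinearity
dN = @(x) (4/sqrt(pi)) * exp(-4*x.^2);  % its derivative
o  = 0.9;                               % observed (noisy) value
xh = 0.5; mu = 0.5;                     % initial guess and step size
for k = 1:500
    xh = xh - mu * dN(xh) * (N(xh) - o);  % gradient-descent step
end
% xh converges towards erfinv(0.9)/2; starting on a flat region of N
% instead (e.g. xh = 3) would make the iteration stall.
```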

3.3.3 Choosing the value of the regularization parameter

One important question in the application of regularization methods is the proper choice of the regularization parameter, λ. Let us look at the equation of the Tikhonov regularization problem again:

$$\|N(\hat{x}) - o\|^2 + \lambda^2 \|\hat{x} - x_c\|^2 \rightarrow \min. \tag{3.19}$$

If we choose λ near zero, the regularization will be too weak. The solution, x̂, tends to the original, ill-posed result, which is the solution of the output least squares problem, eq. (3.9). If λ approaches infinity, the result will be overregularized: the output norm becomes negligible compared to $\lambda^2\|\hat{x} - x_c\|^2$. In this case the solution will be well-posed; however, it tends to $x_c$. The result will be our initial guess, which can be a strongly distorted estimate (for example, simply zero). The optimal solution can be found at an optimum value of λ that lies somewhere between 0 and ∞.
Several methods have been proposed for finding this optimum in the case of linear problems. A commonly used method is Generalized Cross Validation (GCV).
The underlying principle of cross validation is that if an arbitrary observation is left out from o, then it can be well predicted using the solution calculated from the optimally regularized remaining observations. GCV is based on the same principle and, in addition, ensures that the regularization parameter found has some desirable invariance properties, such as being invariant to an orthogonal transformation (which includes permutations) of the data. For the linear problem, $A x = b$, this leads to choosing the regularization parameter as the minimizer of the following function:

$$G(\lambda) = \frac{\|A\hat{x}_\lambda - b\|^2}{\left[\mathrm{trace}\!\left(I - A(A^T A + \lambda^2 I)^{-1} A^T\right)\right]^2}. \tag{3.20}$$

For more explanation see e.g. [100] or [101].
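A MATLAB sketch of eq. (3.20) evaluated on a grid of regularization parameters for a random linear test problem (the problem sizes and noise level are arbitrary choices):

```matlab
% GCV function of eq. (3.20) for the linear problem A*x = b,
% evaluated on a grid of regularization parameters. Sketch with a
% random test problem, not thesis code.
rng(0);
A = randn(50, 20); xtrue = randn(20, 1);
b = A*xtrue + 0.01*randn(50, 1);
lambdas = logspace(-4, 1, 60);
G = zeros(size(lambdas));
for i = 1:numel(lambdas)
    Ai = (A'*A + lambdas(i)^2 * eye(20)) \ A';  % regularized pseudoinverse
    H  = A * Ai;                                % influence matrix
    r  = A*(Ai*b) - b;                          % residual A*xhat - b
    G(i) = (r'*r) / trace(eye(50) - H)^2;
end
[~, k] = min(G);
lambda_opt = lambdas(k);                        % GCV choice of lambda
```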


Glans [102] proposed a method based on minimizing the imaginary part of \hat{x} that is produced by the numerical errors of the computation method. The technique seems quite unreliable and has neither a heuristic nor a formal proof. Instead of this method, Daboczi [103, 104] proposed a systematic iterative method for finding \alpha in the case of impulse signals, based on a rough signal model. Chen [105] proposed a solution for the deconvolution of noisy images even if the point-spread function (the linear, two-dimensional filter function that distorted the original image) is not exactly known. Roy proposed a method based on the norm of the difference between the linearly distorted observation and its further distorted version with the same linear distortion [106]; however, this method also has neither a formal nor a heuristic proof. Solutions based on probabilistic approaches were proposed in [71] and [107].
Among the iterative techniques, Bertocco [108] published a method for the iterative deconvolution of step-response signals, estimating the noise spectrum from the flat part of the signal and the signal spectrum from the changing one. Parruck's method [109] is based on similar assumptions.

In the case of nonlinear iterative problems, Engl first gave an analysis of how the convergence rate depends on the regularization term in the case of iteratively solved maximum entropy and Tikhonov regularization [80, 85]. Haber examined these problems rigorously and collected the possible methods in [101] and [99]. These methods are based on simple continuation, or cooling [99]: they start with a relatively large value of \alpha, and then gradually reduce it. If the result is deemed unacceptable, \alpha is increased by a certain factor. A combination of Tikhonov regularization and a gradient method was proposed by Ramlau [110].
For nonlinear Tikhonov regularization Morozov proposed the so-called discrepancy rule [111], in which the regularization parameter is chosen as the solution of

\| N(\hat{x}(\alpha)) - o \| = C\delta, \qquad C \geq 1,    (3.21)

where \delta is the estimate of the norm of the noise [112].


Another heuristic method is the L-curve technique developed by Hansen [113]. This method does not have a formal proof; however, it is often used because of its simplicity [101]. The L-curve is made by plotting the log of the misfit, \log(\| N(\hat{x}) - o \|), as a function of \log(\| \hat{x} \|), obtained for different regularization parameters. This plot has a typical L-shape. Hansen claimed that the best model norm for a small misfit is obtained at the corner of the L-curve.

For Lavrentiev regularization, constraints were given for \alpha in [114], but there are no special methods to determine its exact value.

3.3.4 Bayesian techniques

Bayesian nonlinear restoration techniques are based on nonlinear time series. Many models are possible for nonlinear time series (see e.g. [28, 26]). In the audio field nonlinear autoregressive (NAR) models are widely used [115]. A commonly used representation of NAR models is the cascaded NAR model:

y_t = x_t + \sum_{i=1}^{b} \sum_{j=1}^{i} \gamma_{(i,j)}\, b_{(i,j)}\, y_{t-i} y_{t-j} + \sum_{i=1}^{b} \sum_{j=1}^{i} \sum_{k=1}^{j} \gamma_{(i,j,k)}\, b_{(i,j,k)}\, y_{t-i} y_{t-j} y_{t-k} + \text{higher degree terms},    (3.22)

where y_t is the t-th sample of the distorted signal, of which we can make a (noisy) observation, b_{(i,j)}, b_{(i,j,k)} are the weighting parameters of the NAR process, \gamma_{(i,j)}, \gamma_{(i,j,k)} are binary \{0, 1\} indicators, which decide whether a given weighting parameter is used, b is the maximum lag of the model, and x_t is the undistorted signal, modeled as an autoregressive process

x_t = e_t + \sum_{i=1}^{k} a_i x_{t-i}.    (3.23)

A major advantage of this model formulation is that the inverse of the nonlinear stage is
a straightforward nonlinear moving average (NMA) filter, which is guaranteed to be stable.
Hence it is simple to reconstruct the signal xt from yt for a given set of NAR parameters
[25].
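As an illustration, the inversion can be sketched in a few lines of MATLAB for a second-order cascaded NAR model that keeps only the quadratic terms of eq. (3.22); the lag b = 2 and the coefficient values are arbitrary assumptions chosen for this example:

    % NMA inverse filter of a quadratic cascaded NAR model (b = 2)
    b2 = [0.05 0; 0.03 0.01];  % b(i,j); only the j <= i entries are used
    y  = randn(1000, 1);       % stands in for the observed distorted signal
    x_hat = y;                 % identity part of the model
    for t = 3:length(y)
        for i = 1:2
            for j = 1:i
                % subtract the quadratic NAR term, built from past outputs only
                x_hat(t) = x_hat(t) - b2(i,j) * y(t-i) * y(t-j);
            end
        end
    end

Since the correction terms depend only on observed samples of y, the filter is non-recursive, which is why its stability is guaranteed.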
The signals and the parameters are modeled as random variables, usually with Gaussian or multivariate Gaussian distributions. The correct parameter values of the NAR model can be estimated by finding the values with the maximum probability. The parameter search can be carried out by Monte-Carlo methods [2, 116, 117], by simulated annealing [25], or by any other optimum-searching algorithm.

The advantage of this method is that it works even when there is no a priori information about the input signal or about the shape of the nonlinear distortion function. A disadvantage is that a priori information can hardly be incorporated into the process. Another problem is that the optimum-searching algorithm itself can get stuck in local minima and requires high computational power. A further serious problem is that the speed of the optimum-searching algorithm on a given task is unknown, therefore applications realized with this method cannot be used in real-time environments.

3.4 Audio related post-distortion techniques for reducing nonlinear distortions

Several solutions exist for the nonlinear pre-compensation of audio devices, such as pre-compensation of hi-fi sets, nonlinear echo cancellation in mobile sets, compensation of loudspeakers, etc.; however, relatively little work has been done in the field of nonlinear post-compensation. In the following, these works will be discussed.

3.4.1 Histogram equalization

Histogram equalisation is a simple technique to estimate a memoryless nonlinear transfer function through which a speech signal has been passed [118]. A smooth function is fitted to the histogram of sample values from an extract of the signal.
the histogram of sample values from an extract of the signal. This is compared to a reference
histogram shape, based on analysis of a range of speakers, and a 1:1 mapping is derived which
will make the smoothed histogram conform with the reference one. This mapping is then
applied to the distorted signal.
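As an illustration, the core of such a mapping can be sketched in MATLAB by pairing equal ranks of the distorted signal and a reference distribution; the standard normal reference and the tanh distortion below are assumptions for the example (the method of [118] uses smoothed speech histograms instead):

    % Equal-rank (histogram matching) sketch of a 1:1 restoring mapping
    ref = randn(1e5, 1);                 % samples with the reference histogram
    sig = tanh(1.5*randn(1e5, 1));       % memorylessly distorted signal
    [~, idx] = sort(sig);                % rank order of the distorted samples
    rs = sort(ref);
    restored = zeros(size(sig));
    restored(idx) = rs;                  % k-th smallest input -> k-th smallest reference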
Because it is assumed that the original signal closely conforms to a standard reference histogram, this method cannot readily be applied to complex music signals, where histograms differ greatly between recordings and vary significantly over the duration of a recording. Another problem, noted by the author, is that the algorithm is very sensitive to noise. The algorithm was originally proposed for use in speech communication channels, and has led to a patented device [119]. A related method has been used to restore recordings made using early analogue-to-digital converters with non-uniform quantisation step heights and some missed codes [120]. Since these are all small-scale, local defects, they can be reduced by smoothing the histogram, without the need for a reference.

3.4.2 Signal reconstruction with known nonlinearity

For situations in which the distortion is caused by a known memoryless nonlinearity, an iterative algorithm has been proposed by Polchlopek [98] to reconstruct the original signal when only a bandlimited version of the distorted signal is available. The reconstruction uses the iterative method described in eq. (3.18). The algorithm also seems applicable to certain nonlinearities with memory. The behaviour of the algorithm in the presence of noise was not analyzed.
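As an illustration, the iteration of eq. (3.18) can be sketched in MATLAB for a known memoryless nonlinearity; the choice N(x) = tanh(2x), the observed value and the relaxation constant are assumptions for the example:

    % Fixed-point iteration of eq. (3.18) for a known memoryless N
    N = @(x) tanh(2*x);
    o = N(0.4);            % distorted observation of the unknown value 0.4
    beta = 0.4;            % relaxation constant, small enough to converge
    x_hat = 0;             % initial guess
    for k = 1:100
        x_hat = x_hat + beta*(o - N(x_hat));   % eq. (3.18)
    end
    % x_hat is now close to 0.4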
Tsimbinos composed the inverse of the memoryless nonlinearity from orthogonal polynomials to compensate distortions in digital radio receivers [121]. The advantage of this method is that in the case of sinusoid excitations, the unwanted harmonics can be filtered out without the appearance of new harmonic components. However, the method works only in the case of pure sinusoid excitations, which is not the case in general audio problems.

3.4.3 Restoration using nonlinear autoregressive models

If there is absolutely no information available about the nonlinear distortion, we have to perform blind compensation. Audio signals can be well represented by autoregressive models; therefore a possible method in the case of audio signals is to use autoregressive models for identification and compensation, such as eq. (3.22) and (3.23) in Chapter 3.3.4. This method is used by Troughton for eliminating tape saturation [2, 116]. The method is also applicable to nonlinearities with memory.
The disadvantage of this method is that the correct model orders of the autoregressive models are not known. The correct parameters are also not known. These data can be found only by optimum-searching algorithms; however, these algorithms may not find the true parameters, as they may get stuck in local minima. In this case the resulting signal could be even more distorted than the original one.


Chapter 4

The nonlinear characteristic of movie film

4.1 Image formation

Optical recording of sound and motion pictures is made on photosensitive materials. These materials are carried on thin film rolls. Formerly the carrier was made of cellulose nitrate, later of cellulose acetate; nowadays it is made of polyester-based plastic. This carrier material is coated with a photosensitive layer. A normal photosensitive layer consists of a very large number of tiny crystals (grains) of silver-halide embedded in a layer of gelatin. The combination of grains and gelatin is often referred to as the photographic emulsion [122].

When a picture is taken, the optical image is projected onto the photosensitive layer for a fraction of a second. In ordinary practice this photographic effect is not revealed by any visible change in the appearance of the emulsion. The exposed emulsion, however, contains an invisible latent image of the light pattern that can readily be translated into a visible silver image by the action of a developing agent. This latent image is formed by the ionization of silver in the silver-halide crystals, which produces very small (a few atoms large) silver specks on the crystal and errors in the crystal structure. During development, if a crystal is adequately exposed, these mutations accelerate the chemical reactions between the mutated crystal and the developing agent, causing the fast decay of these crystals to metallic silver grains. This is the so-called print-out. The reaction of the unaffected (or inadequately affected) crystals is about two or more decades slower [123].
The development of a silver-halide crystal can be treated as a binary process. If the crystal contains enough mutations, it will completely transform to silver during the development; an inadequately exposed crystal will remain practically untouched. Since the crystals are completely isolated from each other by the gelatin carrier, the status of a crystal will be independent of the status of the neighbouring crystals.
With this process, the light amplitude distribution can be reconstructed as the amount of silver grains on the developed film. Since these silver grains are black, we get a black and white negative copy of the original optical image. However, the relationship between the amount of silver and the exposure is not linear.

4.2 Relationship between silver mass and transparency

In practice, the reduction of the transparency of the layer is of more interest than the quantity of silver. Transparency (T) is defined as the ratio of the flux transmitted (P_t) to that incident (P_o) on a uniformly exposed and processed area that is large compared to the area of a grain [20]:

T = \frac{P_t}{P_o}.    (4.1)

In their classical paper [124], Hurter and Driffield proposed a new measure, the opacity:

O = \frac{1}{T}.    (4.2)

They also proposed to represent the relationship between light exposure and the opacity of the developed film on logarithmically scaled graphs, because this is more descriptive of visual and photographic reproduction than absolute values or transparency vs. exposure. The logarithm of the opacity is termed density (D):

D = \log(O) = -\log(T).    (4.3)

The value of D depends on the emulsion and on the magnitude, duration and spectral behaviour of the exposing light. Usually the quantity of light received per unit area is of greatest interest. This is called exposure and is denoted by E. E can be expressed as

E = \int_0^T I(t)\, dt,    (4.4)

where I is the intensity of light and T is the duration of the exposure.


The ratio between silver mass and density is the photometric equivalent. Its reciprocal is the covering power, a measure of the efficiency with which the silver mass produces optical density. This number depends on the number of silver-halide grains per unit area and on the average area of a grain surface, but not on the exposure [125]. Hence the transmission and the amount of silver are proportional.

4.3 Relationship between transparency and exposure

As was stated in the introduction of this chapter, the relationship between the exposure and the amount of silver (and thus the transmission) is not linear.
If a single layer of silver-halide grains of equal size and sensitivity is exposed, the probability that a given grain will form a latent image depends entirely on the random arrival of photons and the chance of absorption of a photon by a grain. Assuming that a grain must absorb r quanta to become developable, the probability, p, that a grain will absorb exactly r quanta from an exposure such that the mean number of absorbed quanta per grain is q, is given by the Poisson equation:

p(q, r) = \exp(-q)\, \frac{q^r}{r!}.    (4.5)

Grains absorbing more than r quanta will also be developable. The probability that a grain absorbs r or more quanta will be

P(q, r) = 1 - \exp(-q) \sum_{i=0}^{r-1} \frac{q^i}{i!}.    (4.6)

The most sensitive crystals in a typical photosensitive layer require 10 or more quanta. Special emulsions used for recording X-rays or nuclear particles require 1 quantum per grain. A typical emulsion requires about 1000 photons per grain to make half of the crystals developable [126]. Characteristics calculated from these basic parameters for some r values can be seen in Fig. 4.1.
The linear parts of the characteristics of monosized and uniformly sensitive emulsions are very narrow. Usually this is not adequate for image recording. Therefore, instead of monosized photosensitive emulsions, usually lognormally distributed emulsions are used, where crystals of several different sizes, having different photon sensitivities, are present in the emulsion. In this case the distribution of the grain size can be described by the equation

p(x) = \frac{1}{(x-\theta)\,\sigma\sqrt{2\pi}}\, \exp\left( -\frac{(\ln(x-\theta))^2}{2\sigma^2} \right), \qquad x > \theta;\ \sigma > 0,    (4.7)

where \sigma is the shape parameter (variance) and \theta is the location parameter (mode).

Figure 4.1: Characteristic of exposure vs. developable crystals in a monosized silver-halide layer for different photon quanta sensitivities (r = 1, 16, 32, 64, 128). Horizontal axis: number of photons; vertical axis: relative amount of silver.
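Curves of this kind can be reproduced from eq. (4.6) with a short MATLAB fragment; the range of q is an assumption chosen for the plot:

    % Fraction of developable grains vs. mean absorbed quanta q, eq. (4.6)
    q = linspace(0, 160, 400);
    figure; hold on;
    for r = [1 16 32 64 128]
        S = zeros(size(q));            % running sum of Poisson terms
        term = exp(-q);                % i = 0 term: exp(-q) q^0 / 0!
        for i = 0:r-1
            S = S + term;
            term = term .* q / (i+1);  % next term exp(-q) q^(i+1) / (i+1)!
        end
        plot(q, 1 - S);                % P(q, r)
    end
    xlabel('number of photons'); ylabel('relative amount of silver');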
The photon sensitivity of grains of the same size (one size class) is also not uniform: a single size class has a sensitivity distribution that is also close to lognormal. In the case of commonly used photosensitive emulsions, the variance of the sensitivity of a size class is about the same as the variance of the grain size (70% to 170% of the variance of the sensitivity) [20].

The simulated exposure vs. (1 - transmission) characteristic of a typical emulsion can be seen in Fig. 4.2 (the MATLAB simulation file can be seen in Appendix C). The logarithmic exposure vs. density characteristic can be seen in Fig. 4.3. The applicable part of the characteristic, where the change of the output is appropriately high, is now much wider: about two decades. However, the whole characteristic is far from linear; there is no truly linear part, only a small part near the beginning can be approximated as linear.
The characteristic begins with a constant part, where the particles are still insensitive to the light intensity. The transmission here, however, is not one, but a bit smaller. This is caused by crystal imperfections created during the manufacture of the photosensitive layer, which cause a basic blackness in the image. This basic blackness is called photographic fog or veil.
The constant part is followed by a toe, and then by an interval which can be represented by the following equation on the linear graph [40]:

1 - T = (1 - T_{sat} - T_{fog})\,(E - E_0)^{\gamma} + T_{fog},    (4.8)

Figure 4.2: Exposure vs. (1 - transmission) characteristic of a typical emulsion.

Figure 4.3: Logarithmic exposure vs. density characteristic of a typical emulsion.


where T_{sat} is the transmission at saturation, T_{fog} is the basic transmission and \gamma is a constant that depends on the photosensitive material. This is the so-called gamma-curve. Usually photo- and film-negative materials have \gamma smaller than one; positive photosensitive materials have \gamma higher than one.

After the gamma-curve part, the emulsion becomes more and more saturated. At extremely high light intensities, over a given interval, the transmission begins to increase and the density to decrease (this part is not shown in Fig. 4.3). This part is called solarisation; the effect is caused by special secondary chemical effects. This light intensity cannot be reached in the sound-stripe of the film, therefore we do not have to deal with it.


Chapter 5

Imperfections in the optical sound-recording techniques

5.1 Introduction

In professional sound-film production, many methods were used for sound recording. Since the 1990s almost exclusively digital sound-recording technologies have been preferred, because they give high sound quality and can be easily copied. Before the digital age, however, only analogue methods existed. In the film industry these techniques were usually based on optical sound projection. Magnetic recording was also used in the film industry from the 1950s. Although magnetic recording had lower distortion than the optical methods, it was not as widespread, since copying this kind of film is much more difficult and the magnetic sound degrades much more quickly with every broadcast. Therefore before 1990 the optical sound-recording methods were typically used; before the 1950s, only optical sound-recording techniques were known in the film industry.
The advantage of optical sound-recording methods in film-making is that the sound can be easily copied together with the film without any additional technology. Another advantage is that during sound-recording and reproduction nothing has to touch the surface of the film, therefore the sound on the film is not degraded by reproduction. However, optical sound-recording techniques have disadvantages as well. One disadvantage is the quite high distortion level, which comes from the nonlinear behaviour of the photosensitive materials; the other is the quite high noise level. In the following sections the possible optical sound-recording techniques will be explained and their distortions will be discussed.

Figure 5.1: Schematic diagram of variable density method.

5.2 Optical sound-recording techniques

Optical sound-recording has two different methods. One of them was developed by Western Electric and Fox Movietone and is called the variable density method. The other was developed by RCA and is called the variable area method. The variable area method is still used for sound-recording; the variable density method was used only until the 1970s. However, from 1925 until 1950, variable density recording was as widespread as the variable area one. This recording method was used in the studios of most of the major motion picture producers, including Paramount, M-G-M, 20th Century-Fox, Universal, Columbia Pictures, Movietone and the Hearst Newsreel Companies [127].
In the beginning, the variable area method had higher distortion and was much more sensitive to copying than the variable density one; however, with the development of the technology these problems were eliminated. After the 1970s the variable density method was not used anymore, because it was too sensitive to changes in the exposure and developing environment compared to the variable area one, and by that time the sound distortion of variable area recordings had already been dramatically reduced.

In the following sections, both sound-recording methods will be explained in detail. After that, the causes of the sound distortions of these methods will be clarified.

Figure 5.2: Sound-on-film, variable density.

5.2.1 Variable density method

In variable density recording a thin light ray is projected onto the constantly moving film band and the intensity of the light ray is controlled by the sound signal (Fig. 5.1). Since the creation of the moving picture requires non-constant (intermittent) movement, the sound stripe and the picture are created in two different modules of the movie camera (sometimes even on two different film rolls). The sound record on the sound-film is displaced from the center of the corresponding picture by a distance of 21 frames [128, 129]. However, this does not cause any additional problem in copying of the film, nor any additional distortion, so we do not have to deal with the picture module of the camera in the following. Therefore Fig. 5.1 shows only the sound projection part.
When the film is developed after being exposed to the variable intensity light, the sound
track will be made up of lines of varying density extending across the sound track (Fig. 5.2).
The variation of density between successive dark and light bands determines the amplitude
of the recorded sound [130].
Variable density recording has two sub-methods. One method, which was used by Fox Movietone and Lee de Forest, utilizes a special light source that can produce variable intensity light. This device is called the Aeolight, a glow discharge lamp consisting of a cold cathode and a mixture of inert gases; the intensity of illumination varies with the applied signal voltage. In the Lignose-Breusing system, cathode-ray tubes were used for this purpose [131]. However, these light sources could not provide much light intensity, which caused a small sound level, hence a small signal-to-noise ratio.

Table 5.1: Velocity of different film formats.

    Format              |  Velocity [mm/s] at different frames per second
                        |   25     |   24    |   18*   |   16*
    70                  |   -      |   570   |   -     |   -
    35 (standard)       |   475    |   456   |   -     |   304
    16 (sub-standard)   |   190.5  |   182.8 |   137   |   122
    8S                  |   105.7  |   101.5 |   76.1  |   67.7

    * Only for silent films.

After 1930 the so-called Kerr cell came into widespread use as a light valve placed in the path of a constant light ray. The Kerr cell consists of two crossed polarizers with a glass cell of nitrobenzene between them. Nitrobenzene can rotate the plane of polarization under the control of a high voltage, hence the light intensity can be controlled this way. In this case the light source could be a simple incandescent lamp that can produce strong light.

The most sophisticated method for light intensity control was invented by Klangfilm Eurocord: the electrodynamic mirror oscillograph. Here the sound current controls the angle of a small mirror that lets more or less light through a slit. This device requires neither polarizers nor a high driving voltage.
During sound-recording, the controlled light shone through a narrow slit onto the moving film, which was kept running at a constant speed (Fig. 5.1). The film speed was different for the different film formats; speeds for the standard film formats can be seen in Table 5.1 (data taken from [132]).

The gap width of the slit in the case of 35 mm film is about 20 \mu m. The image of the slit is reduced and projected onto the film by a lens. The gap width of the image is about 10 \mu m; the width of the soundtrack itself is 2.94 mm, while the scanned area at sound reproduction is only 2.13 mm wide [129, 133].
Another method of making variable density recordings is the use of a special light valve, first used by Western Electric. The light valve of Western Electric varies the amount of light by the opening and closing of a slit. This slit is the space between the two taut sides of a loop of wire suspended in a magnetic field. As the sound current passes through the loop, the loop opens and closes, passing varying amounts of light through it. The image of the slit with varying width is then focused with lenses onto the moving film so as to form lines of varying density when the film is developed. This method was also called longitudinal sound-recording.

Figure 5.3: Sound-on-film, variable area.

5.2.2 Variable area method

The variable area method was developed by RCA. In this system the intensity of the light is kept constant, but the area of the sensitized film affected by the light is varied (Fig. 5.3). The system consists essentially of a light source, a light valve and a suitable optical system for concentrating the light into a very fine beam (Fig. 5.4). The light valve is usually an electrodynamic mirror oscillograph, similar to the one used in the variable density method.

The width of the beam is usually 5 \mu m. The maximum width of the optical soundtrack is 2.94 mm and the maximum usable area is 2.54 mm wide, but only 2.13 mm is scanned during sound reproduction, because 0.405 mm black stripes are used at the sides of the sound-track to avoid interference with the moving picture on the film [129, 133].

5.3 Distortions at variable density method

Figure 5.4: Schematic diagram of variable area method with electrodynamic mirror oscillograph.

The first source of distortion is the light-controlling device of the sound-recorder. In the intensity control method the image of a constant-width slit is projected onto the running film band and the sound information is carried by the light intensity. Without an input signal, a basic light intensity is projected onto the film and the driving signal adds to this value. In the case of a simple sinusoid excitation with angular frequency \omega, the intensity at a given time will be

I(t) = I_0 + I_1 \sin(\omega t) = I_0 (1 + r \sin(\omega t)),    (5.1)

where I_0 is the basic light intensity, I_1 is the driving intensity and r is the modulation factor.
If the width, s_0, of the slit were infinitesimal, the exposure on the film would be the same as the input signal. Since s_0 is finite, the exposure, E, will be

E = \frac{I_0 s_0}{c} \left( 1 + r\, \frac{\sin(\pi s_0/d)}{\pi s_0/d}\, \sin(\omega t) \right),    (5.2)

where c is the velocity of the film and d = \frac{2\pi c}{\omega} is the wavelength of the input frequency on the film.

Comparing the exposure to the input light intensity we can see that nonlinear distortions are not created by the recording device, although linear distortions do appear. The amplitude response in the case of 24 frame/s 35 mm standard film and 16 frame/s 16 mm substandard film can be seen in Fig. 5.5. In the case of standard film, at 8 kHz, the distortion is smaller than 0.45 dB. Since the sound-amplifier devices used at film recording had about 8 kHz bandwidth, this distortion is negligible. In this case the distortion can be treated simply as a memoryless nonlinear distortion.

The distortion at 8 kHz is higher in the case of substandard film, about 7.5 dB. However, this version was rarely used in professional film technique, and only for silent films.
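These attenuation figures follow directly from the sin(\pi s_0/d)/(\pi s_0/d) factor of eq. (5.2); a short MATLAB check with the 10 \mu m slit-image width and the film speeds of Table 5.1 reproduces them:

    % Aperture-effect attenuation of eq. (5.2) at 8 kHz
    s0 = 0.010;                          % slit image width on film [mm]
    f  = 8000;                           % audio frequency [Hz]
    for c = [456 122]                    % film speed [mm/s]: 35 mm, then 16 mm
        d    = c/f;                      % recorded wavelength on film [mm]
        beta = pi*s0/d;
        fprintf('c = %3d mm/s: %.2f dB\n', c, 20*log10(sin(beta)/beta));
    end
    % prints about -0.45 dB (standard) and about -7.4 dB (substandard)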
In the longitudinal method the light intensity is constant, but the width of the slit is controlled. In the case of a sinusoid excitation, for the slit width, s, we can write

s(t) = s_0 (1 + r \sin(\omega t)).    (5.3)

Figure 5.5: Amplitude response of light-intensity-controlled variable density sound-recording. Solid line: standard (35 mm) film at 24 fps; dashed: substandard (16 mm) film at 16 fps. Horizontal axis: f [Hz]; vertical axis: E/E_0.
The exposure will be different from eq. (5.2):

E = \frac{I_0 s_0}{c} \left( 1 + 2 \cos\!\left(\frac{\pi s_0}{d}\right) a_1 \sin(\omega t) + 2\, \frac{\sin(2\pi s_0/d)}{2\pi s_0/d}\, a_2 \cos(2\omega t) + \ldots \right).    (5.4)

The exposure is nonlinearly distorted. The distortion depends both on the modulation factor and on the frequency of the sinusoid. For standard film, at 1 kHz and r = 1 the distortion is 5%; at 5 kHz and r = 0.6 the distortion is higher than 30% [134]. This is the reason why this technique did not become widespread.
The second source of sound distortion is the density characteristic of the film. As was shown in Chapter 4, the density characteristic is highly nonlinear and can be approximated by the \gamma-curve. The first part of this characteristic (where the light intensity is still low) is nearly linear. In the early history of sound-film, this part was used by the variable density recording method, because the distortions were small and the Aeolight could not produce high light intensity anyway [40]. However, due to the small signal levels and the extremely high sensitivity to the recording and developing parameters, this technique was replaced by another one.
Since the main aim is to have distortionless sound on the positive film, and the sound on the negative is not so important, it is enough if the resulting characteristic of the film positive and negative together is a straight line. If the characteristic of the film negative is the inverse of the film positive, the result will be a pure, unbiased output. However, this is an unnecessarily strong constraint. For our aim it is enough if the product of the derivatives of the negative and positive film characteristics is 1 at every point in the range of interest:

\frac{dD_{neg}}{d\lg(E_{neg})} \cdot \frac{dD_{pos}}{d\lg(E_{pos})} = 1.    (5.5)

In this case the resulting characteristic will be a straight line, which may be shifted from zero. If it is shifted, a DC offset will appear on the optical receiver of the sound-reproduction machine, but this is not a big problem, since it will be filtered out by the sound amplifier devices.
If the straight parts of the density characteristics are used, we can calculate with the \gamma values. In our case the product of the negative and positive \gamma values has to be 1. This is the Goldberg lemma [134]. With this method the sound on the film negative will be distorted, but this distortion can be eliminated on the positive, if the recording is copied to a positive film with the correct \gamma.

The disadvantage of this method is that it is not able to eliminate noise amplification; in addition, every copying procedure adds more noise to the original sound, which comes from the granularity of the film. A digital correction method would be able to use the optimal compensation characteristic without any additional noise.
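The requirement on the product of the \gamma values can be seen from a short sketch of the argument, assuming ideal straight-line characteristics (k_{neg}, k_{pos} and E_{print} are bookkeeping constants introduced here for the sketch):

D_{neg} = \gamma_{neg} \lg(E_{neg}) + k_{neg}, \qquad \lg(E_{pos}) = \lg(E_{print}) - D_{neg},

since the positive is printed through the negative, whose density attenuates the printing light. Substituting into D_{pos} = \gamma_{pos} \lg(E_{pos}) + k_{pos} gives

D_{pos} = -\gamma_{neg}\,\gamma_{pos}\, \lg(E_{neg}) + \mathrm{const},

so the light transmitted by the positive during reproduction, proportional to 10^{-D_{pos}}, varies as E_{neg}^{\gamma_{neg}\gamma_{pos}}; it is a linear function of the original exposure exactly when \gamma_{neg}\gamma_{pos} = 1.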

5.4 Distortions at variable area method

In the variable area method, the width of the slit and the light intensity remain constant, and the width of the black part of the sound-stripe is driven by the input signal. When the input signal is zero, half of the sound stripe is exposed. The exposure of the exposed part is

E_0 = \frac{I s_0}{c},    (5.6)

while the exposure averaged over the whole sound stripe is

\bar{E}_0 = \frac{I s_0}{2c}.    (5.7)

If the input is excited by a sinusoid, the exposure of the whole sound stripe can be written as

E = \frac{I_0 s_0}{2c} \left( 1 + r\, \frac{\sin(\pi s_0/d)}{\pi s_0/d}\, \sin(\omega t) \right),    (5.8)

which is very similar to eq. (5.2). This means that the light valve causes no nonlinear distortion, only a small linear one, which can be neglected in the case of standard film.

Figure 5.6: Creation of nonlinear distortions due to light diffusion: light diffuses from the exposed (light) area into the dark area.

The second distortion source is the film material itself. When an image is projected onto the film, the developed image will not be the same as the original. The light disperses in the emulsion and is reflected from the back of the film, hence some parts of the film that were originally not exposed will also be exposed. In the case of recording a sinusoid signal, the image of the sinusoid will be deformed: the darker part of the sinusoid will be filled up and the lighter part will become much thinner (Fig. 5.6). This effect causes a strong nonlinear distortion in the recorded sound, similar to a rectification process. This effect is known as Donner distortion [40]. Since the amount of diffusion depends strongly on the shape of the sound signal, this kind of distortion has to be treated as a nonlinear distortion with memory.

The Donner distortion can be reduced by proper copying of the original optical soundtrack negative to the film positive. With optimal parameters the light diffusion effects on the film positive will (partly) compensate the light diffusion effects on the negative. The disadvantage of this method is that this kind of compensation is very sensitive to the copying parameters and is not able to completely eliminate the distortion. This is the reason why most old films have a harmonic distortion of 10% or more.

5.5 Appearance of noise

The noise sources of optical sound-recordings are common to the different recording techniques. There are three different noise sources, namely:

- the celluloid noise of the film carrier,
- the noise of scratches and other small degradations,
- the granular noise of the photosensitive layer.
Celluloid noise is caused by the optical imperfections of the celluloid-based film carrier. The carrier of the film is not perfectly transparent, and its transparency is slightly different at different points of the film roll, which causes a wideband noise during playback. Similar to the celluloid noise, small scratches and dust on the film also cause differences in transparency, hence noise during playback.
Granular noise is created by the finite-size silver grains in the photosensitive layer, and this is the dominant part of the noise on film. If there were no developed grains in the area scanned by the sound-reproducing device, the noise level would be zero. This would also be the case if the photosensitive layer were so dark that no light could pass through the film. Between these two extremes the noise level differs from zero. The estimated variation of the transparency can be computed as

\Delta T = \frac{d}{2} \sqrt{ \frac{T_k (1 - T_k)}{F} },    (5.9)

where d is the average size of the silver grains, F is the size of the scanning area and T_k is the transparency of the examined area [40].
Eq. (5.9) shows the advantage of the variable area method: since with variable area recording the transparency is almost 1 or almost 0 everywhere, the numerator under the square root is small, hence the noise level will also be small.

Chapter 6

Compensation of memoryless nonlinearities
In the previous chapters we have collected and described the problems and the available solutions of nonlinear restoration. We have analyzed the possible nonlinear modeling and nonlinear compensation techniques in general. We have also examined the problems that appear in professional film-sound technology, which are the following:

- The sound of old films has high nonlinear distortion. This is especially the case for films made with the variable density method. This nonlinear distortion comes mainly from the nonlinear behaviour of the photosensitive material.

- The sound has a quite high noise level, which comes mainly from the aging of the film material and from the granularity of the photosensitive material.

- There are thousands of old films waiting for rescue. The state of these film materials is getting worse day by day. This means that film restoration technicians do not have enough time to experiment extensively on each film or to run time-consuming iterative calculations on the sound.

Since variable density films have the highest distortion and they are the oldest ones, the main aim is to rescue these films first. The distortion of variable density film can be described as a memoryless distortion, which can be described and handled much more easily than distortions with memory. This information can be used to create a fast and optimal solution.

To find an optimal solution, in the following sections we examine the possibilities of creating a fast restoration algorithm that uses no or only a very small number of iterations in the case of memoryless nonlinear distortions. In section 6.1 we select a proper nonlinear description. In 6.2 we examine the possibilities of identifying the nonlinearity of the examined film. Next, in 6.3, we look at the effects of noise on the film; then in 6.4 we propose a fast non-iterative regularization method to eliminate the unwanted effects of noise and to keep the norm of the difference between the original signal and the estimated one low. Another method is shown in 6.7, where the aim is not to reduce the difference and the noise as much as possible, but to make the estimated signal unbiased.

6.1 Representation of nonlinearity

Let us consider a nonlinear function, N(\cdot), that is assumed to be continuous and differentiable on the closed interval [x_0, x_1]. Further assumptions are that the function is invertible and memoryless.

Our aim is to find a proper estimate, in some sense, of the original signal distorted by N(\cdot). In this case we may have to solve nonlinear equations to get the result. Since N(\cdot) is an arbitrary nonlinear function, the analytical solution with the exact N(\cdot) function is generally not possible, hence a proper representation form is required to simplify computations. (Note that this representation form is not necessarily the realization form of the nonlinearity in a given software or hardware; it is required only for the computations that find the solution of the nonlinear compensation method.)

6.1.1 Representation using a piecewise linear model

In the case of a continuous, infinitely differentiable, invertible and memoryless nonlinear function, a good approximation of the original, non-polynomial function can be achieved in a given small interval, [x_i, x_j], by the m-th degree Taylor polynomial:

N(x)\big|_{x,a \in [x_i,x_j]} = T(x) + R(x) = N(a) + \frac{dN(x)}{dx}\bigg|_{x=a}\frac{(x-a)}{1!} + \frac{d^2N(x)}{dx^2}\bigg|_{x=a}\frac{(x-a)^2}{2!} + \ldots + \frac{d^mN(x)}{dx^m}\bigg|_{x=a}\frac{(x-a)^m}{m!} + R(x).    (6.1)

The residual error, R(x), exists and can be estimated if N(\cdot) is differentiable m + 1 times in the interval (x_i, x_j). The supremum of the error can be computed as

\sup\{R(x)\} = |x_i - x_j|^{m+1}\, \max\left\{ \frac{d^{m+1}N(x)}{dx^{m+1}} \right\}.    (6.2)

This means that in the knowledge of the highest value of the (m+1)-th derivative in the range of interest, an input interval can be given in which the residual error will be equal to or smaller than our requirement. In this case we can use a set of models, where each model represents the original nonlinear function in a given interval. If the intervals are consecutive, the whole nonlinear function can be represented in the range of interest by these models. (This is the same representation form as that discussed in Chapter 2.3.3.)

If m = 1, eq. (6.1) reduces to

N(x)\big|_{x,a \in [x_i,x_j]} \approx N(a) + \frac{dN(x)}{dx}\bigg|_{x=a}\,(x-a),    (6.3)

which is already linear in the parameter x. The residual error will be
\sup\{R(x)\}\big|_{x \in [x_i,x_j]} = |x_i - x_j|^2\, \max\left\{ \frac{d^2N(x)}{dx^2} \right\}.    (6.4)

If we want to keep the residual error under a certain limit, then in the knowledge of the maximum of d^2N(x)/dx^2 the range of interest in x can be divided into several intervals with length L, where the error will be smaller than or equal to

L^2\, \max\left\{ \frac{d^2N(x)}{dx^2} \right\}.    (6.5)

So we obtain a piecewise linear representation that can describe the original nonlinear function, N(\cdot), with a required small error. The only requirements are that N(\cdot) has to be twice differentiable and that the second derivative has to be finite in the range of interest. In practice these requirements can usually be fulfilled.
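As an illustration, the choice of the interval length can be sketched in MATLAB; the nonlinearity N(x) = tanh(2x) on [-1, 1] and the error limit are assumptions for the example:

    % Interval length of the piecewise linear model from eq. (6.5)
    ddN  = @(x) -8*tanh(2*x).*sech(2*x).^2;     % d2N/dx2 for N(x) = tanh(2x)
    M    = max(abs(ddN(linspace(-1, 1, 1e4)))); % max |d2N/dx2| on the range
    eps0 = 1e-4;                                % required residual error limit
    L    = sqrt(eps0/M);                        % from L^2 * M <= eps0
    edges = -1:L:1;                             % consecutive model intervals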

6.1.2 Representation of the inverse nonlinearity

The inverse nonlinearity can also be represented by a piecewise linear model:

N^{-1}(y)\big|_{y \in [N(x_i), N(x_j)],\ y_0 = N(x_0)} \approx N^{-1}(y_0) + \frac{dN^{-1}(y)}{dy}\bigg|_{y=y_0} (y - y_0).    (6.6)

Since

\frac{dN^{-1}(N(x))}{dx} = \frac{dx}{dx} = 1    (6.7)

and

\frac{dN^{-1}(N(x))}{dx}\bigg|_{x=x_0} = \frac{dN^{-1}(y)}{dy}\bigg|_{y=N(x_0)} \cdot \frac{dN(x)}{dx}\bigg|_{x=x_0},    (6.8)

therefore

\frac{dN^{-1}(y)}{dy}\bigg|_{y=N(x_0)} = \frac{1}{\displaystyle \frac{dN(x)}{dx}\bigg|_{x=x_0}},    (6.9)

which means that the derivative of the inverse nonlinearity can be represented by the derivative of the original nonlinearity, which is an important fact. The corollary of these equations is the Goldberg lemma (eq. 5.5), which was used already in section 5.3.

6.2 Identification of the nonlinear distortion

In order to have a proper piecewise linear representation of the original and the inverse nonlinearity, we have to know the original nonlinear function. In the case of variable density optical soundtracks this function can be determined from the sound signal and from some a priori information [135]. The function can be written as

y(t) = G_1\, \gamma(G_2\, x + O_2) + O_1,    (6.10)

where \gamma(\cdot) is the \gamma-function of the film, and O_1, O_2, G_1, G_2 are the offset and gain parameters after and before the nonlinearity. G_2 represents the amplifier before the light valve of the film recorder, O_2 the basic light intensity when no driving signal is present on the light valve, O_1 the intensity of the scanning light together with the residual offsets of the reproducing device, while G_1 determines the amplification of the reproducing device.
For reconstructing the exact nonlinear function, the values of G_1, O_1 and O_2 are required. Note that G_2 is not important, because this parameter adjusts only the volume of the original sound. The reconstruction is based on the assumption that the recorded audio signal contains clearly periodic sound parts. This is the case in most musical parts, or in short segments of the human voice when a vowel is formed by the speaker.

If the recorded signal part, s(t), is periodic, it can be written as a sum of harmonically related sinusoids:
s(t) = \sum_{i=1}^{\infty} a_i \sin(2\pi i f_0 t + \varphi_i),    (6.11)

where f_0 stands for the fundamental frequency of the periodic signal, and a_i and \varphi_i are the amplitude and phase of the i-th sinusoid.
In eq. (6.11) we did not take into account the DC component in the Fourier-series. The
reason is that the DC component of the original signal cannot be separated from the DC
component added by the nonlinearity. For this reason, in eq. (6.11) we assumed that s(t)
had no DC component. This is a reasonable assumption in the case of audio signals. If the
input signal contains also DC component, this component can be treated as the part of the
input offset O1 . In this case the DC component will not be restored in the estimated signal,
however, we dont have to deal with the DC component, because DC component is inaudible
at audio signals.
If the signal, s(t), is led through the memoryless nonlinear system, a different periodic signal arises:

u(t) = G_1\, \gamma(G_2\, s(t) + O_2) + O_1 = \sum_{j=1}^{\infty} b_j \sin(2\pi j f_0 t + \psi_j) + b_0.    (6.12)
Eq. (6.11) and (6.12) together form a transformation, which assigns a u(t) signal to every value of the unknown parameters:

u(t) = T(v(f_0, t)),    (6.13)

where v(f_0, t) is the set of the unknown variables:

v(f_0, t) = \{ G_1, O_1, O_2, a_1, \ldots, a_\infty, \varphi_1, \ldots, \varphi_\infty \}.    (6.14)
The unknown variables can be uniquely obtained if the transformation T is invertible. A sufficient condition is that the number of the a and \varphi parameters is limited, the number of samples from u(t) is higher than the number of variables, s(t) has no DC component and \gamma(\cdot) is a monotonic nonlinear function [136].

In the case of movie soundtracks these conditions can be fulfilled. The uttered vowels in the movie contain sufficiently long periodic parts, which are ideal for the identification. The sound has no DC component, or if it is removed, this does not affect the sound quality. The vowels are bandlimited, hence the number of the a and \varphi parameters is limited, and the \gamma-function is strictly monotonic.
The signal of movie soundtracks is corrupted by wide-band noise. In this case the problem is ill-posed, because the observed samples can exceed the limits of the output domain of the nonlinear function. A solution for v(f_0, t) can be found by minimizing the value of the following cost function:

\mathrm{Cost} = \int_{t_1}^{t_2} \left( u(t) - T(\hat{v}(f_0, t)) \right)^2 dt.    (6.15)

Eq. (6.15) can be minimized by any optimization algorithm, e.g. the Monte-Carlo method. It is still a question how many sinusoids should be used to describe the original signal s(t). This can be estimated from the graph of the optimal cost versus the number of sinusoids. If the s(t) signal is undermodeled, the cost will be high and will quickly decrease for a higher number of sine signals. On the contrary, if the periodic signal is overmodeled, the use of a higher number of sinusoidal signals will not change the optimal cost drastically. Hence, by finding this turning point, the number of sinusoids can be chosen. Experiments show that the method is not too sensitive to the number of sinusoids, and the use of eight sinusoidal signals has been found to give good results [135].

Although the optimization of eq. (6.15) is quite computation intensive, we do not have to compute it on the whole sound material, only on some representative samples, which are not longer than a few hundred samples. Therefore, comparing the time required to calculate the solution of eq. (6.15) to the time required to restore the whole material, this amount of computation is acceptable. An advantage of this solution is that the identification and the restoration are two different modules. In this case, if we have the exact nonlinear curve of the film roll, we can skip this procedure, making the restoration even faster. This modularization is impossible in the solutions based on time series models and probabilistic approaches, as was shown in section 3.3.4.
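As an illustration, the evaluation of the cost of eq. (6.15) for one candidate parameter set can be sketched in MATLAB; the power-law stand-in for the film's \gamma-curve and every numeric value below are assumptions made for the example (a real identification wraps this evaluation in the optimum-searching loop, and G_2 is absorbed into the amplitudes):

    % Cost of eq. (6.15) for one candidate {G1, O1, O2, a, phi}
    fs = 48000; t = (0:511)'/fs;          % short periodic extract
    f0 = 200;                             % assumed fundamental frequency [Hz]
    gam = @(E) E.^0.6;                    % stand-in gamma-curve, E > 0

    a0 = [1 0.5 0.2]; p0 = [0 0.3 -0.5];  % "true" amplitudes and phases
    s = zeros(size(t));
    for i = 1:3, s = s + a0(i)*sin(2*pi*i*f0*t + p0(i)); end
    u = 0.8*gam(s + 2) + 0.1 + 0.01*randn(size(t));   % synthetic observation

    G1 = 0.8; O1 = 0.1; O2 = 2; a = a0; p = p0;       % candidate parameters
    sh = zeros(size(t));
    for i = 1:numel(a), sh = sh + a(i)*sin(2*pi*i*f0*t + p(i)); end
    cost = sum((u - (G1*gam(sh + O2) + O1)).^2);      % discrete eq. (6.15)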

6.3 Effect of noise

Let us consider a signal, x, that is distorted by a nonlinear, twice differentiable, memoryless and invertible function, N(\cdot), creating a new signal, y:

y = N(x).    (6.16)

If the data, y, and the nonlinear function are known, the inversion problem is to find an estimate, \hat{x}, that satisfies the data in some sense. If N(\cdot) is invertible, then in the case of eq. (6.16) the solution is simple:

\hat{x} = N^{-1}(y),    (6.17)

where N^{-1}(\cdot) is the analytical inverse of N(\cdot). However, usually the output signal, o, that we can observe is contaminated by noise, n. If we assume that the noise is additive and independent of the signal, then

o = y + n = N(x) + n.    (6.18)

If we still want to use the exact inverse, we can write

\hat{x} = N^{-1}(o) = N^{-1}(N(x) + n).    (6.19)
Using eq. (6.3) and (6.9) we can express the difference caused by n at a given x_0 as

\hat{x} \approx N^{-1}(N(x_0)) + \frac{dN^{-1}(y)}{dy}\bigg|_{y=N(x_0)} n = x_0 + \frac{dN^{-1}(y)}{dy}\bigg|_{y=N(x_0)} n = x_0 + \frac{1}{\displaystyle \frac{dN(x)}{dx}\bigg|_{x=x_0}}\, n.    (6.20)
Eq. (6.20) means that \hat{x} will be different from the expected value. The size of the difference between \hat{x} and x depends on the amplitude of n and on the derivative of the nonlinear function at the given x_0 point. If the derivative of the original nonlinear function is small, the amplification of the noise in \hat{x} can be extremely high, which can no longer be neglected. This means that the exact nonlinear inverse may not be proper in the case of noisy signals, and a better restoration method should be found.
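The effect can be illustrated numerically with a short MATLAB fragment; the distortion N(x) = tanh(2x) and the noise value are assumptions for the example:

    % Noise amplification of the exact inverse, eq. (6.20)
    N    = @(x) tanh(2*x);
    Ninv = @(o) atanh(o)/2;
    n = 1e-3;                          % small additive observation noise
    for x0 = [0 0.5 1 1.5]
        err = Ninv(N(x0) + n) - x0;    % reconstruction error at this point
        fprintf('x0 = %.1f  dN/dx = %.4f  error = %.2e\n', ...
                x0, 2*sech(2*x0)^2, err);
    end
    % near saturation (x0 = 1.5) the error is about fifty times the noise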
With the restoration we can have two different aims:

1. Our aim could be to restore the signal so that the expected value of the difference between the estimate and the original signal is minimal in the least squares sense:

\min E\{ (\hat{x} - x)^2 \};    (6.21)

2. or our aim could be that the expected value of the estimated signal at a given time point be the same as that of the original one:

E\{ \hat{x}(t) \} = E\{ x(t) \}.    (6.22)

A solution for the first problem will be shown in section 6.4, and a solution for the second problem in section 6.7.

Figure 6.1: Model of the nonlinearity compensation. (Blocks in cascade: the distortion N(x), a summing junction adding the noise, and the compensation block K(o).)

6.4 Compensation of the signal by Tikhonov regularization

The nonlinearity compensation model can be seen in Fig. 6.1. This is a cascaded model, where a compensation block, whose compensation characteristic is denoted by K(o), is used after the original distortion.

Since in our film-sound restoration problem the output signal, y, is already given and cannot be pre-compensated, a post-compensation technique has to be used. As was shown in Chapter 3.3 and Chapter 6.3, this problem can be ill-posed, therefore proper techniques have to be used to avoid too high noise amplification. To solve this problem and find an analytical solution, first we have to use a proper representation form of the nonlinearity, as discussed in Chapter 6.1. If we use eq. (6.3) to obtain a piecewise linear representation, then one block of the model will look like Fig. 6.2. If this model is computed at a given x_0 input value and y_0 = N(x_0) output value, and the disturbance of the output signal is n, then \Delta x denotes x - x_0 and \Delta y means o - y_0 - n, where o is the noisy observation.

Note that in this model the noise, n, does not appear explicitly: the noise affects only the work-point, hence the amplification of the compensation block. Also note that the interval length of each model block is L, which was computed such that the residual errors of the blocks are smaller than a chosen error limit. If L tends to zero, this error limit also tends to zero.

Let us find K(o) by Tikhonov regularization. To do this, we have to supplement Fig. 6.2 with a new block that computes \Delta\hat{y}, the estimate of the output signal, from \Delta\hat{x}. The supplemented system can be seen in Fig. 6.3.

In this model, the cost function that we have to minimize in the case of one block is

\| \Delta y - \Delta\hat{y} \| + \lambda \| \Delta\hat{x} \|,    (6.23)

Figure 6.2: One block from the piecewise linear compensation model. (Blocks: \Delta x is multiplied by dN(x)/dx at x = x_0, giving \Delta y, which is multiplied by dK(y)/dy at y = N(x_0)+n, giving \Delta\hat{x}.)

Figure 6.3: The supplemented piecewise linear compensation model. (As Fig. 6.2, followed by a block dN(x)/dx at x = x_0 that produces \Delta\hat{y} from \Delta\hat{x}.)


where


dN(x)
4
y=
4
x.
dx x=x0

(6.24)

We are looking for the \Delta\hat{x} value at a given \Delta y value where eq. (6.23) is minimal. At the minimum, the partial derivative of eq. (6.23) with respect to \Delta\hat{x} will be 0:

\frac{\partial}{\partial \Delta\hat{x}} \left( \| \Delta y - \Delta\hat{y} \| + \lambda \| \Delta\hat{x} \| \right) = 0.    (6.25)

If we use the Euclidean norm, eq. (6.25) transforms to

\frac{\partial}{\partial \Delta\hat{x}} \left( \left( \Delta y - \frac{dN(x)}{dx}\bigg|_{x=x_0} \Delta\hat{x} \right)^{2} + \lambda\, (\Delta\hat{x})^{2} \right) = 0,

-\Delta y\, \frac{dN(x)}{dx}\bigg|_{x=x_0} + \left( \frac{dN(x)}{dx}\bigg|_{x=x_0} \right)^{2} \Delta\hat{x} + \lambda\, \Delta\hat{x} = 0.    (6.26)
From this result we can write:

\frac{\Delta\hat{x}}{\Delta y} = \frac{ \displaystyle \frac{dN(x)}{dx}\bigg|_{x=x_0} }{ \displaystyle \left( \frac{dN(x)}{dx}\bigg|_{x=x_0} \right)^{2} + \lambda }.    (6.27)

If the interval length, L, of the model blocks tends to zero, then the left side of eq. (6.27) will tend to d\hat{x}/dy, which is dK(o)/do, the derivative of the compensation function. So we can say that at a given regularization parameter, \lambda, using the Euclidean norm, an analytical solution can be found for the derivative of the compensation characteristic [137, 138]:

\frac{dK(o)}{do}\bigg|_{x=x_0} = \frac{ \displaystyle \frac{dN(x)}{dx}\bigg|_{x=x_0} }{ \displaystyle \left( \frac{dN(x)}{dx}\bigg|_{x=x_0} \right)^{2} + \lambda }.    (6.28)

This means that the piecewise linear model of the inverse nonlinear function can be computed, in the knowledge of the derivative of the original nonlinear function, by numerical integration of the solution of eq. (6.28). However, for the numerical integration we have to know one more parameter: the integration constant. For the determination of this constant let us write N^{-1}(\cdot) in the following way:

N^{-1}(\cdot) = F(\cdot)\big|_{F(0)=0} + C,    (6.29)

where F(\cdot) is the numerical integral of eq. (6.28), for example calculated using the F(0) = 0 constraint. This constraint is necessary only to have a solution for the inverse nonlinear characteristic, which may still be shifted by a DC component. If the F(0) = 0 constraint cannot be fulfilled because of the shape of the nonlinearity, then another arbitrarily chosen constraint can be used instead.
For the computation of C, several methods can be used:

1. In the knowledge of the probability density functions of x and n, C can be computed by minimizing the weighted norm of the difference between the original signal and the compensated one [137, 138]:

\int p_x(x) \int p_n(n)\, \| x - F(N(x) + n) + C \|\, dn\, dx \to \min_C.    (6.30)

2. If we know the expected value of x (the DC component of x), we can adjust C so that the expected value of \hat{x} equals it. Usually the DC component is zero, and we have to construct the characteristic so that the expected value of \hat{x} is also zero.

3. In several problems the exact value of the DC component is not of interest. This is the case in most audio applications. Here the only restriction is that the DC component has to be near zero, otherwise some elements in the audio chain could cause additional sound distortion, or the life expectancy of some devices (e.g. loudspeakers) could be shortened. Hence the constant C can be arbitrarily chosen, and the DC component can be filtered out by, for example, a simple high-pass filter.

Assuming the third method and simply using F(\cdot)|_{F(0)=0}, a MATLAB realization of the compensator characteristic can be seen in Appendix D.
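In the same spirit, a minimal MATLAB sketch of building the regularized compensator from eq. (6.28) by numerical integration is given below; the stand-in distortion and the value of \lambda are assumptions for the example, and this is not the Appendix D realization itself:

    % Regularized inverse characteristic from eq. (6.28) and (6.29)
    N      = @(x) tanh(2*x);                 % stand-in distortion
    dN     = @(x) 2*sech(2*x).^2;            % its derivative
    lambda = 1e-3;                           % regularization parameter

    x  = linspace(-1.5, 1.5, 3001);          % input grid
    o  = N(x);                               % corresponding output values
    dK = dN(x) ./ (dN(x).^2 + lambda);       % eq. (6.28) along the curve
    K  = cumtrapz(o, dK);                    % integrate dK/do over the outputs
    K  = K - interp1(o, K, 0);               % enforce F(0) = 0, eq. (6.29)

    obs   = N(0.7) + 0.001;                  % a noisy observation
    obs   = min(max(obs, min(o)), max(o));   % clamp into the valid range
    x_hat = interp1(o, K, obs);              % regularized estimate of 0.7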

6.4.1 Comparison of the solution to the optimal least squares solution
Eq. (6.28) can also be treated as a rough approximation of the probabilistic approach to the compensation problem.

Assuming that the mean value of the input signal and the mean value of the noise are zero (E\{x\} = 0, E\{n\} = 0), and moreover that the input signal, x, and the noise, n, are uncorrelated, a probability-based solution for the compensation characteristic, K(o), in the case of the piecewise linear representation is derived in Appendix B. There the squared sum of the difference of x and \hat{x} is minimized. As derived in the Appendix, the correct coefficients of such a piecewise linear compensator system can be computed as

B_{i0} = -B_{i1} \sum_{j=1}^{k} p(y \in U_i \,|\, o \in U_j)\, A_{j0}, \qquad B_{i1} = \frac{ \displaystyle \sum_{j=1}^{k} p(y \in U_i \,|\, o \in U_j)\, A_{j1} }{ \displaystyle \sum_{j=1}^{k} p(y \in U_i \,|\, o \in U_j)\, A_{j1}^{2} + \frac{E\{n^2\}}{E\{x^2\}} },    (6.31)

where A_{j0} and A_{j1} are the coefficients of the piecewise linear representation of N(x) in the interval U_j:

N(x)\big|_{N(x) \in U_j} = A_{j0} + A_{j1}\, x,    (6.32)

and B_{j0} and B_{j1} are the coefficients of the linear representation of K(o):

K(o)\big|_{o \in U_j} = B_{j0} + B_{j1}\, o.    (6.33)

If the probability density functions of the noiseless output signal, y, and of the disturbance, n, are exactly known, then the probability that y is in the interval U_i while the observed signal, o, is in U_j (shortly p(y \in U_i \,|\, o \in U_j)) can be computed as

p(y \in U_i \,|\, o \in U_j) = \iint_{\xi \in U_i,\ (\xi+\eta) \in U_j} p_y(\xi)\, p_n(\eta)\, d\xi\, d\eta.    (6.34)

As we can see, eq. (6.31) is very similar to eq. (6.28) in its structure. If the length of every U_j tends to zero, A_{j1} tends to the derivative of N(x) and B_{j1} tends to the derivative of K(o) at the corresponding points. The lower row of eq. (6.31) becomes

\frac{dK(o)}{do}\bigg|_{o=y_0} = \frac{ \displaystyle \int p_n(n)\, \frac{dN(x)}{dx}\bigg|_{x=N^{-1}(y_0+n)} dn }{ \displaystyle \int p_n(n) \left( \frac{dN(x)}{dx}\bigg|_{x=N^{-1}(y_0+n)} \right)^{2} dn + \frac{E\{n^2\}}{E\{x^2\}} }.    (6.35)

Now we have to determine the difference between the derivative, dN(x)/dx, of the nonlinear function and the numerator of eq. (6.35), in order to express the difference between the optimal least squares solution and the Tikhonov-based one.

The numerator of eq. (6.35) is a convolution integral, which is hard to evaluate directly and compare to dN(x)/dx. If we describe the convolution integral with a separate distortion function,

\int p_n(n)\, \frac{dN(x)}{dx}\bigg|_{x=N^{-1}(y_0+n)} dn = R(p_n(n), N(x))\, \frac{dN(x)}{dx}\bigg|_{x=N^{-1}(y_0)},    (6.36)

then we can write eq. (6.35) as

\frac{dK(o)}{do}\bigg|_{o=y_0} = \frac{1}{R(p_n(n), N(x))} \cdot \frac{ \displaystyle \frac{dN(x)}{dx} }{ \displaystyle \left( \frac{dN(x)}{dx} \right)^{2} + \frac{E\{n^2\}/E\{x^2\}}{R^2(p_n(n), N(x))} },    (6.37)

where R(p_n(n), N(x)) is a function that can be calculated as the ratio of the numerator of eq. (6.35) and dN(x)/dx at every input x value. If this value is near to 1, it means that the difference between the least squares solution and the Tikhonov solution is small.
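As an illustration, R(p_n(n), N(x)) can be evaluated numerically with a short MATLAB fragment for the Gaussian error function with Gaussian noise (the setting of Fig. 6.5); the grids and the clamping limit are assumptions for the example:

    % Numerical evaluation of R(p_n(n), N(x)) from eq. (6.36)
    N    = @(x) erf(x);
    dN   = @(x) 2/sqrt(pi)*exp(-x.^2);
    Ninv = @(y) erfinv(y);
    sigma = 0.1;                               % noise deviation
    n  = linspace(-4*sigma, 4*sigma, 801);
    pn = exp(-n.^2/(2*sigma^2)) / (sigma*sqrt(2*pi));
    x0 = linspace(-2.5, 2.5, 101);
    R  = zeros(size(x0));
    for i = 1:numel(x0)
        y0 = N(x0(i));
        yn = min(max(y0 + n, -0.999), 0.999);  % keep y0+n inside the range of N
        R(i) = trapz(n, pn .* dN(Ninv(yn))) / dN(x0(i));
    end
    plot(x0, R);                               % compare with Fig. 6.5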
Let us approximate E\{n^2\}/E\{x^2\} by a constant, denoted by \mu. In this case eq. (6.37) becomes similar in structure to eq. (6.28), except for the 1/R(p_n(n), N(x)) factor (with \mu/R^2 playing the role of \lambda). Since R is the convolution of the derivative of a strictly monotone function with the probability density function of a relatively small-level noise, divided by a representative point of the derivative, the value of the R(p_n(n), N(x)) function will stay in a relatively small interval; however, it will not be constant and will not always be 1. This means that the previously described Tikhonov regularization method will be close to the least squares optimum and will reduce the error caused by noise amplification, although the result will be only suboptimal in the least squares sense. As Sarkar claims in [66] in the case of a similar linear regularization problem, this approximation causes relatively small error in most applications.
Let us see some examples of how R(p_n(n), N(x)) changes for different nonlinear functions and different noises. Four nonlinear functions were tested with uniformly distributed and Gaussian distributed noises: the Gaussian error function and an exponential function in the range x = [-3, 3], and the x^{0.5} and x^{0.2} functions in the range x = [0, 10]. In the case of the last two functions the output is assumed to be zero if x < 0. The interval length of the uniformly distributed noise was 0.1 and 0.01; the deviation of the Gaussian noise was 0.1 and 0.01.

Figure 6.4: R(p_n(n), N(x)) at the Gaussian error function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.5: R(p_n(n), N(x)) at the Gaussian error function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.6: R(p_n(n), N(x)) at the exponential function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.7: R(p_n(n), N(x)) for the exponential function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.8: R(p_n(n), N(x)) for the square-root function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.9: R(p_n(n), N(x)) for the square-root function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.10: R(p_n(n), N(x)) for the x^0.2 function and uniformly distributed noise (noise interval at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

Figure 6.11: R(p_n(n), N(x)) for the x^0.2 function and Gaussian noise (noise deviation at left 0.1, at right 0.01). Solid line: nonlinear function, dashed line: R(p_n(n), N(x)).

The results can be seen in figures 6.4-6.11. The deviation of R(p_n(n), N(x)) from 1 is very small in the case of the Gaussian error function, and the difference from 1 is not even noticeable in the case of the exponential function. Somewhat larger differences can be seen in the case of the x^0.5 and x^0.2 functions in the neighbourhood of zero, but they are also at an acceptable level. And of course, R is zero where the signal is assumed to be zero; here R(p_n(n), N(x)) is not important, since dK(o)/do will be zero regardless of the value of R(p_n(n), N(x)).

Let us denote dN(x)/dx by κ and R(p_n(n), N(x)) simply by R (the constant λ still stands for E{n²}/E{x²}). Now let us examine the difference of eq. (6.28) and eq. (6.37):

$$\frac{1}{R}\cdot\frac{\kappa}{\kappa^{2}+\dfrac{\lambda}{R^{2}}} - \frac{\kappa}{\kappa^{2}+\lambda} = \frac{(1-R)\kappa^{3} + \kappa\left(\lambda-\dfrac{\lambda}{R}\right)}{R\left(\kappa^{2}+\dfrac{\lambda}{R^{2}}\right)\left(\kappa^{2}+\lambda\right)} \tag{6.38}$$
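For clarity, the right side follows by bringing the two fractions over the common denominator, a step the text leaves implicit:

$$\frac{1}{R}\cdot\frac{\kappa}{\kappa^{2}+\lambda/R^{2}}-\frac{\kappa}{\kappa^{2}+\lambda} = \frac{\kappa(\kappa^{2}+\lambda)-R\kappa\left(\kappa^{2}+\lambda/R^{2}\right)}{R\left(\kappa^{2}+\lambda/R^{2}\right)(\kappa^{2}+\lambda)} = \frac{(1-R)\kappa^{3}+\kappa(\lambda-\lambda/R)}{R\left(\kappa^{2}+\lambda/R^{2}\right)(\kappa^{2}+\lambda)}.$$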

If κ → 0, then in the denominator of eq. (6.38) the λ/R² and λ terms will dominate instead of κ², so the denominator tends to λ²/R. In the numerator of eq. (6.38) the κ(λ − λ/R) term will dominate, hence the difference of the Tikhonov solution and the optimal least squares solution also tends to 0.
If κ² ≫ λ, then in the denominator of eq. (6.38) κ² will dominate both in (κ² + λ/R²) and in (κ² + λ). Hence the denominator will contain a κ⁴ member, while the highest power in the numerator is only κ³, so the difference will also tend to zero. The difference will be highest in the case when κ² ≈ λ/R².

Choosing a proper λ value, this difference can also be minimized. We have to take care of the numerator of eq. (6.38). Let us see when it will be zero:

$$(1-R)\kappa^{3} + \kappa\left(\lambda - \frac{\lambda}{R}\right) = 0 \quad\Longrightarrow\quad \lambda = R\kappa^{2}. \tag{6.39}$$

If R ≈ 1 (this is the case of the nearly linear parts of N(·)) and κ² ≈ λ/R² (this produces the highest difference), then the difference will be near zero in the case when

$$\lambda = R\kappa^{2} \approx \frac{E\{n^2\}}{E\{x^2\}}, \qquad R \approx 1. \tag{6.40}$$

This is a good starting point to choose a proper λ.


Comparing the solution of Tikhonov regularization to the least squares solution, we can say:

The solution of Tikhonov regularization is not the same as the least squares one; however, they are close to each other. The difference caused by the R(p_n(n), N(x)) term is usually not too big, because the value of R(p_n(n), N(x)) is close to 1 in the range of interest. The difference can be further reduced if λ is appropriately chosen. Although the solution of Tikhonov regularization and the least squares one are not the same, this is not a big problem from our viewpoint, because our aim was only to reduce the artifacts caused by noise during nonlinear compensation of optical audio recordings, and for this aim Tikhonov regularization is as proper as the least squares solution.

For the computation of the least squares solution, exact knowledge of the probability density functions of the signal and noise (p_n(n) and p_x(x)) is required. If they are not known, the solution cannot be computed, or if they are not properly known, the solution could be strongly distorted and unusable. Tikhonov regularization is a much more robust method, and good results can be achieved even without the knowledge of p_n(n) or p_x(x). For example, an appropriate regularization parameter λ can already be computed from the energy estimates of the noise and signal, E{n²} and E{x²}. (Certainly, with knowledge of p_n(n) and p_x(x) slightly better results can be achieved.)

6.4.2 Finding the appropriate value of the regularization parameter

When we have information about the probability density functions of the observed signal and the noise, p_o(o) and p_n(n), we can produce a better estimate of the regularization parameter than E{n²}/E{x²}. If the undistorted input signal x is assumed to be constant and the probability density function p_n(n) of the additive noise n is known, the optimal value of the regularization parameter, λ, and the optimal compensation characteristic, K(o, λ), can be computed by minimizing, over λ, the expected value of the difference between the original constant x_0 and its estimated value, x̂ = K(o, λ):

$$E\{e(x_0,\lambda)\} = \int p_n(\nu)\,\| x_0 - K(N(x_0)+\nu, \lambda)\|\, d\nu \tag{6.41}$$
If x is not constant, but the probability density function of x, denoted by p_x(x), is known, the optimal compensation characteristic can be computed by minimizing the properly weighted E{e(x_0, λ)} over every x point:

$$E\{\varepsilon(\lambda)\} = \int p_x(\psi) \int p_n(\nu)\, \|\psi - K(N(\psi)+\nu, \lambda)\|\, d\nu\, d\psi. \tag{6.42}$$

Usually, eq. (6.42) cannot be solved and λ and K(o) cannot be determined directly, since p_n(n) and p_x(x) are not known in most problems. However, we can estimate them. In practice p_n(n) can be estimated by two kinds of methods:

1. Collect information about p_n(n) from those signal parts where only noise is present. In the case of speech and other audio signals the signal contains several pauses, which can be found manually [137], or, if we would like to automatize the whole process, detected by voice activity detectors. These detectors are based on short-term energy measurements [139], zero-crossing measurements [140] or cepstral analysis of the signal [141]. (A minimal sketch of this approach follows the list.)

2. Collect information from the whole signal using filter banks, and select the noise information by statistical methods. These methods are based on spectral minima tracking [142, 143, 144, 145, 146], quantile-based noise spectrum estimation [147] and extended spectral subtraction [148].
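The following MATLAB fragment is a minimal sketch of the first approach; the pause indices p1 and p2 are assumptions (in practice they come from manual selection or a voice activity detector), and o is the observed signal:

p1 = 1; p2 = 4000;                           % assumed indices of a signal pause
pause_part = o(p1:p2);
pause_part = pause_part - mean(pause_part);  % remove any DC offset
[cnt, centers] = hist(pause_part, 64);       % 64-bin histogram of the noise
pn_est = cnt / sum(cnt);                     % normalized histogram as pdf estimate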
The probability density function p_x(x) of the input signal can be estimated iteratively, in a few steps. A similar method was proposed for the linear case by Daboczi [103] and further developed for the nonlinear case by the author [135, 149]. The algorithm is the following:

1. First we need an initial guess about p_x(x) and p_n(n). A good estimate can be computed for p_n(n) using one of the possibilities described above. For p_x(x)_0, if we have no other possibility, we can use the probability density function of the output signal: p_x(x)_0 = p_o(o).

2. Compute λ by minimizing eq. (6.42) using the estimates of p_x(x) and p_n(n).

3. Compute K(o, λ) and, with this, compute x̂ from o.

4. Using the histogram of x̂, a new estimate can be calculated for p_x(x).

5. If the number of iterations is already high enough, or the difference between the new λ_i and the previous λ_{i-1} parameter is small enough, we can stop the iteration; otherwise we go back to the second step.

A possible MATLAB realization can be seen in Appendix E.
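For orientation, here is a compact self-contained sketch of the same loop. It is not the code of Appendix E; the erf() test nonlinearity, the grid sizes and the pointwise Tikhonov construction of K(o, λ) are all assumptions made for the illustration:

N  = @(x) erf(x);                          % test distortion characteristic
xg = linspace(-3, 3, 301).';               % input grid
og = linspace(-1.2, 1.2, 301).';           % observation grid

sn = 0.01;                                 % estimated noise deviation
ng = linspace(-5*sn, 5*sn, 41).';          % noise grid
pn = exp(-ng.^2/(2*sn^2)); pn = pn/sum(pn);% estimated noise pdf weights

x = 0.9*sin(2*pi*(0:19999)/200);           % test input (unknown in practice)
o = N(x) + sn*randn(size(x));              % observed signal

lambdas = 0.25.^(0:19);                    % candidate values, cf. eq. (6.50)
px = hist(o, xg); px = px(:)/sum(px);      % step 1: px(x)0 = po(o)

for it = 1:3                               % three iterations usually suffice
    E = zeros(size(lambdas));
    for k = 1:numel(lambdas)
        % regularized inverse on the grid (pointwise Tikhonov form)
        [~, idx] = min((N(xg) - og.').^2 + lambdas(k)*xg.^2, [], 1);
        Klut = xg(idx(:));                 % lookup table for K(o, lambda)
        % discretized expected error of eq. (6.42)
        Ko = interp1(og, Klut, min(max(N(xg) + ng.', og(1)), og(end)));
        E(k) = px.' * (abs(xg - Ko) * pn);
    end
    [~, best] = min(E);                    % step 2: smallest estimated error
    lambda = lambdas(best);
    [~, idx] = min((N(xg) - og.').^2 + lambda*xg.^2, [], 1);
    Klut = xg(idx(:));
    xhat = interp1(og, Klut, min(max(o, og(1)), og(end)));   % step 3
    px = hist(xhat, xg); px = px(:)/sum(px);                 % step 4
end

Only a representative excerpt needs to go through this loop; the final lookup table is then applied to the whole recording in one pass.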


We don't have to use the whole audio material to compute p_x(x)_0; it is enough to use a representative part of the signal [135]. This means that the non-iterative behaviour of the analytical solution still remains, because the iteration to compute p_x(x) and λ has to be done only on a small portion of the audio material.
Let us examine the convergence properties of the algorithm described above. When n = 0, hence p_n(n) = δ(n), then

$$E\{\varepsilon(\lambda)\} = \int p_x(\psi)\,\|\psi - K(N(\psi),\lambda)\|\, d\psi \tag{6.43}$$

Now the best solution is when K(o, λ) = N^{-1}(o), hence when λ = 0. After one iteration we will get back p_x(x) almost regardless of the initial guess of p_x(x). At the solution, E{ε(λ)} will be zero, and it will be higher in any other case. (Of course, if p_x(x) is constant zero, or a Dirac delta at K(o) = o = 0, λ cannot be found, but we can find it in any other non-degenerate situation.)
When n is small, K(N(x) + n, λ) can be approximated by the first elements of the Taylor polynomial:

$$K(N(x)+n,\lambda)\big|_{x=x_0} \approx K(N(x),\lambda)\big|_{x=x_0} + \left.\frac{dK(o)}{do}\right|_{o=N(x_0)} n \tag{6.44}$$

Now eq. (6.41) (the inner integral of eq. (6.42)) can be written as

$$\int p_n(\nu)\,\left\| x_0 - K(N(x),\lambda)\big|_{x=x_0} - \left.\frac{dK(o)}{do}\right|_{o=N(x_0)}\nu \right\| d\nu \tag{6.45}$$

The inside of the norm can be represented as

$$x_0 - K(N(x),\lambda)\big|_{x=x_0} - \left.\frac{dK(o)}{do}\right|_{o=N(x_0)} n = \varepsilon_T(x,\lambda) + \varepsilon_N(x,n)$$
$$\varepsilon_T(x,\lambda) = x_0 - K(N(x_0),\lambda), \qquad \varepsilon_N(x,n) = -\left.\frac{dK(o)}{do}\right|_{o=N(x_0)} n, \tag{6.46}$$

where ε_N(x, n) corresponds to the error term caused by noise, and ε_T(x, λ) denotes the error caused by the distortion of the regularized inverse characteristic. If we use a regularization parameter λ near zero, the difference between the original inverse characteristic and the regularized one will be small, therefore the caused signal distortion will also be small; however, the noise amplification becomes high. Using higher and higher λ values, the noise amplification becomes smaller and smaller, but the derivative of the compensation characteristic also becomes smaller, hence the compensation characteristic becomes more and more flat, and the distortion caused by the characteristic grows.
When λ is small, the probability distribution of the estimate, p_x(x̂), will be widely distributed, because the estimate is contaminated by high-amplitude noise. When λ approaches infinity, p_x(x̂) will approach a Dirac delta, because the compensation characteristic will be ever flatter.

Let us examine the behaviour of these two terms assuming a limiter characteristic for N(·), which has small distortion for small-amplitude input signals, the distortion increasing as the amplitude increases. This kind of nonlinearity is also known as a mild nonlinearity. The unwanted distortions in the audio field can generally be represented by limiter characteristics; nonlinear characteristics with different behaviour usually do not appear in audio recordings, except in some signals manipulated with special sound-effects or noise gates.
If during the iteration the i-th value of λ is very high, p_x(x̂)_i will differ from zero only in a very small interval, at small x values, and in eq. (6.42) those parts of the error will be emphasized which are close to zero. In the case of a limiter characteristic this part is the almost linear part. Here the noise term ε_N is negligibly small and only the distortion term ε_T will dominate. The error can be decreased if λ is reduced and ε_T is thereby decreased.

Unfortunately, the convergence of the algorithm is not guaranteed in the case of a dead-zone-like nonlinearity, since there the highly distorted part will be emphasized when λ is high and p_x(x̂)_i is narrow. However, as we discussed, dead-zone nonlinearities do not appear in audio recordings, which are our practical problem.

When the i-th value of λ is extremely small, p_x(x̂)_i will be widely distributed and almost flat, due to the noise. In eq. (6.42) the noise term will dominate. The error can be reduced if λ is increased.
This means that (in the case of limiter-like nonlinearities) the algorithm will converge to the λ value where ε_N and ε_T are nearly the same (of course, the exact ratio will also depend on the exact shape of p_x(x̂)). This solution corresponds to Bertocco's method for linear problems in [108], where a noise term was computed from those signal parts where the original signal was flat, and a distortion term was computed from the signal changes; the optimal compensation was reached when the two terms were equal.
The solution is also similar to Hansen's L-curve method [113], where the logarithm of the signal energy of the estimate, log(‖x̂‖), is compared to the misfit, ‖N(x̂) − o‖. The energy of the estimate will be small when λ is high, and then ε_T is also high. The misfit will be small when λ is small, but then ε_N is high. Hence this method provides a solution similar to our method.

Simulations show that this algorithm is as robust as the algorithms depicted in chapter 3.3.3. It was shown with some heuristic steps that in the case of limiter characteristics the algorithm converges to values similar to those of the other algorithms discussed in 3.3.3; however, the algorithm still lacks a thorough convergence analysis.
The advantages of our method compared to Hansen's method:

1. Hansen's method requires mapping log(‖x̂‖) and ‖N(x̂) − o‖ over a very wide range to see the L-shape clearly. The novel algorithm requires only a few iteration steps; experiments show that usually three iterations are enough to get a proper λ value [135].

2. In Hansen's method the corner point has to be chosen from the curve manually or using some heuristic methods, and sometimes the L-shape of the Hansen diagram is not clear, which makes it harder to choose the value of λ. The novel algorithm chooses λ fully automatically.

6.4.3 Comparison of the novel method to Morozov's and Hansen's method

As was shown in section 3.3.3, two different methods are generally used for finding the correct regularization parameter in nonlinear problems: Morozov's method and Hansen's method. In Morozov's method the regularization parameter is chosen as the solution of

$$\|N(\hat{x},\lambda) - o\| = C\delta, \qquad C \ge 1 \tag{6.47}$$

where δ is the estimate of the norm of the noise; in Hansen's method the regularization parameter is chosen as the corner point of the log(‖N(x̂) − o‖) vs. log(‖x̂‖) diagram.

For comparison of these methods with the proposed new one, two simulations will be shown with a multisine input signal (Fig. 6.12) that was passed through two different nonlinear functions (Fig. 6.13 and 6.14) and contaminated by Gaussian noise. In the first case the nonlinear distortion was the Gaussian error function, erf(·), which is a general limiter function. In the second case it was a part of an x⁵ function:

$$y = 0.1\,(x + 25^{0.2})^{5} - 2.5 \tag{6.48}$$

Figure 6.12: Multisine signal, x, used for the simulations.


because this function is similar to the distortions that appear in film-sound.
For the simulation of the observation noise, Gaussian noise was added to the distorted signals to produce a 40 dB signal-to-noise ratio, which was calculated as

$$\mathrm{SNR} = 10\log\left(\frac{\sum_i y_i^2}{\sum_j n_j^2}\right). \tag{6.49}$$
The output signals can be seen in Fig. 6.15 and 6.16.

The initial p_x(x)_0 parameter of the proposed algorithm was chosen as p_o(o). For faster computation the methods were evaluated only at 20 fixed λ points; in the iterative algorithm λ_i was chosen as the one of these points where the estimated error was the smallest, and only three iterations were made. These 20 points were equidistant on a logarithmic scale:

$$\lambda_i = \frac{1}{4^{i}}, \qquad i = 0 \ldots 19 \tag{6.50}$$

The results can be seen in Fig. 6.17-6.24. The first four figures show the choice of the proper regularization parameter, λ, by the examined methods. The other four figures show the reconstruction in the time domain, x̂. The results can also be seen in Table 6.1. The error ε in this table was computed as

$$\varepsilon = \frac{1}{M}\sum_{i=1}^{M}\left(x[i] - x_{est}[i]\right)^{2}. \tag{6.51}$$

Figure 6.13: Gaussian error function used for the first simulation.

Figure 6.14: x⁵ function used for the second simulation.

Figure 6.15: Noisy output signal of the first simulation (distortion made by the Gaussian error function).

Figure 6.16: Noisy output signal of the second simulation (distortion made by the x⁵ function).

Figure 6.17: Error of the compensation of the nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the Gaussian error function.
Figure 6.18: Error of the compensation of the nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the Gaussian error function.

Table 6.1: Comparison results of the Morozov, Hansen and the new method.

                   Morozov,  Morozov,  Morozov,  Hansen    new method  true case
                   C=10      C=2       C=1
λ, N(·): erf(·)    1e-02     3.9e-03   2e-03     6.1e-05   6.1e-05     6.1e-05
ε, N(·): erf(·)    0.1004    0.0652    0.0481    0.0201    0.0201      0.0201
λ, N(·): x⁵        7e-01     2.5e-01   1.5e-01   1e-04     4e-03       2e-02
ε, N(·): x⁵        8.5e-02   3.2e-03   1.9e-02   1.0e-03   6.6e-04     4.95e-04

Figure 6.19: Error of the compensation of the nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the part of x⁵.

Figure 6.20: Error of the compensation of the nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the part of x⁵.

Figure 6.21: Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the Gaussian error function.

Figure 6.22: Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the Gaussian error function.

Figure 6.23: Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the x⁵ nonlinear distortion.

Figure 6.24: Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the x⁵ nonlinear distortion.

In the case of the Gaussian error function, Hansen's method and the proposed new method gave the same optimal regularization parameter. In the case of x⁵, the novel method gave better results than Hansen's. Morozov's method overestimated λ in all cases, even when the constant C in eq. (3.21) was chosen as 1.

6.5 Results on synthetically distorted real audio signals

To test the proposed restoration procedure, an original speech signal, x, recorded from the radio was synthetically distorted by a Γ-function:

$$y = \Gamma(x) = \begin{cases} G\,(x+O_1)^{\gamma} + O_2 & \text{if } x > -O_1 \\ O_2 & \text{otherwise} \end{cases} \tag{6.52}$$

$$G = 0.5, \quad O_1 = 0.4, \quad O_2 = 0, \quad \gamma = 5.8$$
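As a sketch, the characteristic of eq. (6.52) is a one-liner in MATLAB (the variable names are illustrative; gamma_ avoids shadowing the built-in gamma function, and max() realizes the two branches):

G = 0.5; O1 = 0.4; O2 = 0; gamma_ = 5.8;    % parameters of eq. (6.52)
Gamma = @(x) G*max(x + O1, 0).^gamma_ + O2; % O2 is returned for x <= -O1
y = Gamma(x);                               % x: the original speech signal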

The resulting signal was contaminated by Gaussian noise, n, to achieve a 20 dB signal-to-noise ratio, which was calculated as

$$\mathrm{SNR} = 10\log\left(\frac{\sum_i y_i^2}{\sum_j n_j^2}\right). \tag{6.53}$$

The original and the distorted, noise-contaminated signals can be seen in Fig. 6.25 and 6.26.
The O_1, O_2 and G parameters of the Γ-function are assumed to be unknown. To determine these parameters, a small signal part of the distorted, noisy signal was chosen, which can be seen in Fig. 6.27. The original signal was modeled by eight harmonically related sinusoids. The fundamental frequency was 84.6 Hz. The results of the search program can be seen in Fig. 6.28. The gain estimate, G (0.5068), and the offset estimate, O_2 (0.0007), are near the real parameters. The O_1 estimate (0.3454) is a bit farther from the real one, but this only causes a DC shift in the resulting audio signal, which can be corrected by a high-pass filter.

With the knowledge of Γ(·), using eq. (6.42) and the iterative algorithm described in section 6.4.2, we can calculate the estimate of the regularization parameter, λ. The results of the iterations can be seen in Fig. 6.29. The real error values can be seen in Fig. 6.30.

The estimate of the regularization parameter (9.76e-04) is quite close to the real one (3.9e-03). The compensated signal can be seen in Fig. 6.31. An overregularized (λ = 0.25) and an underregularized example (λ = 9e-13) can also be seen in Fig. 6.32 and Fig. 6.33.

Figure 6.25: Original, undistorted audio signal.

Figure 6.26: Audio signal synthetically distorted by the Γ-function.

Figure 6.27: Distorted, noisy signal part chosen for parameter determination of the nonlinear function.

Figure 6.28: Result of parameter search of the nonlinear function.


Figure 6.29: Estimated error of the iterative algorithm at different regularization values.

Figure 6.30: True error at different regularization parameters.

Figure 6.31: Reconstructed signal by the best characteristic estimate.

Figure 6.32: Reconstructed signal by an overregularized characteristic.

Figure 6.33: Reconstructed signal by an underregularized characteristic (note the scale change).

6.6 Results on real distorted audio signals

To test the algorithm on real audio signals, a distorted movie sound was provided by the Hungarian National Film Archive. Unfortunately, due to the poor capabilities of the archive, the audio signal was given on a VHS tape. The audio material suffered from strong thumps, clicks and hiss due to the aged original film material and the VHS tape recorder, and it also suffered from a very strong nonlinear distortion due to a wrong film-copy process. A part of this audio signal can be seen in Fig. 6.34. The nonlinearity parameter, γ, was reported to be 3.8; however, there was no information about the offset and gain parameters. To determine these, a small signal part was chosen from the file, which can be seen in Fig. 6.35. The original signal was modeled by eight harmonically related sinusoids. The fundamental frequency was 559.85 Hz. The parameter estimates were (see Fig. 6.36)

G = 0.2104,   O_2 = -0.1349,   O_1 = 0.8246.

From the resulting parameters the regularized inverse characteristic was calculated by the proposed algorithm. The results of the iterations can be seen in Fig. 6.37.

The optimal reconstructed signal can be seen in Fig. 6.38. An underregularized example can be seen in Fig. 6.39.

Figure 6.34: Real, nonlinearly distorted and noise-contaminated audio signal.

Figure 6.35: Signal part chosen for parameter determination of the nonlinear function.

Figure 6.36: Results of parameter estimation of the nonlinearity.

Figure 6.37: Result of the iterative algorithm.

Figure 6.38: Reconstructed signal by the optimally regularized characteristic.


Although VHS tape is not the best medium for digital reconstruction of nonlinearly distorted signals, due to its small bandwidth and small signal-to-noise ratio, the proposed algorithm performed well. The quality of the resulting signal became significantly better and the disturbing sound distortion became much smaller. The noise level also remained small, and the optimal regularization did not introduce any disturbing artefacts.

Figure 6.39: Reconstructed signal by an underregularized characteristic.

6.7 Compensation of the signal to make an unbiased estimate

The restored data becomes biased due to the restoration method introduced in section 6.4. This means that the expected value of the restored data at a given time point will not be equal to the expected value of the original data at that time point. Although this is not a definitely disturbing effect in the case of audio signals, there could be some applications which require an unbiased estimate of the original signal.

In this section we give a possible solution for getting an unbiased estimate in the case of nonlinearly distorted and noise-contaminated signals, if the probability density function of the noise and the nonlinear distortion function are known.
Let us take a look again at the model of nonlinearity compensation in Fig. 6.1 and try to calculate the expected value of x̂:

$$E\{\hat{x}\}\big|_{y=N(x_0)} = \int p_{\hat{x}}(\hat{x})\,\hat{x}\,d\hat{x} = \int p_o(o)K(o)\,do = \int p_n(o-y)K(o)\,do, \tag{6.54}$$

Due to the noise, the expected value of x̂ will not be K(E{y}) but K̃(E{y}), where K̃(·) is the correlation of the noise probability function and the compensation characteristic. So the estimated value of x could be distorted.
To avoid this, and to make an estimate x̂ from o whose expected value is unbiased, we have to construct a K(o) compensation characteristic for which

$$\int p_n(o-y)K(o)\,do = x = N^{-1}(y), \tag{6.55}$$

which means that the correlation of K(o) and p_n(n) has to be the inverse of N(·).
Since correlation transforms into multiplication with a complex conjugation in the frequency domain, this suggests finding the solution using Fourier transforms:

$$\mathcal{F}\{K(o)\} = \frac{\mathcal{F}\{N^{-1}(x)\}}{\mathcal{F}\{p_n(n)\}^{*}} \tag{6.56}$$

where F{·} means the Fourier transformation and a* means the complex conjugate of a. In practice this task is solved with finite-length sampled data blocks of N^{-1}(·) and p_n(n). In this case the realized correlation will be a circular one. Due to the circular correlation, the result can be strongly different from the expected one: huge oscillations can appear in the resulting characteristic. This is due to the fact that we take a discrete Fourier transformation of a part of a strictly monotonic function, which means an unwanted convolution with a windowing function; this causes leakage between the frequency bins and hence the appearance of new frequency components in the spectrum. After dividing this spectrum by the spectrum of the noise probability function these components can be amplified, and when we take the inverse Fourier transformation, these unwanted components can cause oscillations.
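A minimal numerical sketch of the direct solution of eq. (6.56) is the following (the erf() nonlinearity, the block length and the uniform noise pdf are assumptions); running it typically shows exactly the oscillation problem described above:

r  = 1024;                                 % block length
y  = linspace(-0.999, 0.999, r).';         % sampled observation range
Ni = erfinv(y);                            % sampled N^-1 for N(x) = erf(x)
pn = zeros(r, 1); pn(1:51) = 1/51;         % uniform noise pdf, zero-padded
K  = real(ifft(fft(Ni) ./ conj(fft(pn)))); % eq. (6.56) with the DFT
plot(y, K)                                 % strong oscillations are visible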
This effect of the circular correlation can be reduced by techniques such as windowing or zero padding. Another technique could be mirroring the shape of the nonlinear characteristic, complementing the original shape with it, and then using the resulting characteristic for the determination of the proper F{K(o)}. The last technique leads to the Nahman-Gans method [150] or to the cosine transformation. However, the effect of the circular correlation can only be reduced by windowing, and the circular correlation can be eliminated with the Nahman-Gans method only if the derivatives at the matching point of the mirrored and the original shape are the same. Generally, these methods cannot handle the problem properly.
Eq. (6.56) gives only one solution for sampled data blocks; however, not just one solution exists in this case. This claim can be proven if we write eq. (6.55) with finite-length, sampled data blocks:

$$E\{\hat{x}\} = P\,K, \qquad \hat{x} = [\hat{x}_i;\ \ldots;\ \hat{x}_{i+N}],$$
$$P = \begin{bmatrix} p_n(y_{i-M/2} - N(x_i)) & \cdots & p_n(y_{i+M/2} - N(x_i)) \\ \vdots & \ddots & \vdots \\ p_n(y_{i-M/2+N} - N(x_{i+N})) & \cdots & p_n(y_{i+M/2+N} - N(x_{i+N})) \end{bmatrix},$$
$$K = [K(y_{i-M/2});\ \ldots;\ K(y_{i+M/2})] \tag{6.57}$$

Let us assume that x̂ is calculated at N points from P and K, where K is known at M points. If the non-negligible part of p_n(n) that has to be used in the calculations has an interval length of I, and y_i is sampled with step length L, then K has to be known at least at M = N + I/L points, otherwise the beginning or ending rows will not contain the whole range of interest of p_n(·). Hence M always has to be higher than N. This means that eq. (6.57) is underdetermined: the solution for K is a subspace, where an infinite number of solutions exists for K. One solution can be found by Fourier transformation, but it will contain oscillations due to the circular correlation. However, we need a solution where K(·) is smooth, otherwise the compensation characteristic would increase the variance of x̂ too much.

6.7.1 Finding a proper compensation characteristic using an iterative method
Eq. (6.55) can be solved iteratively by the following method [151]:

$$K_0 = N^{-1}, \qquad K_{i+1} = K_i + \mu\left(N^{-1} - \mathrm{corr}\{K_i, p_n\}\right), \qquad \mu < \frac{1}{\max|\mathcal{F}\{p_n\}|}, \tag{6.58}$$

where corr{a, b} means the correlation of the vectors a and b.


For the numerical realization one has to note that if p_n(n) is mapped at q points in the range [n_1, n_2] and the K_i vector has length r in the range [y_1, y_2], then the length of K_{i+1} will be r − q in the range [y_1 + (n_2 − n_1)/2, y_2 − (n_2 − n_1)/2], since the correlation vector will be correct only in this range.

A possible MATLAB realization of this kind of compensation characteristic for the case of the Gaussian error function and uniformly distributed noise can be seen in Appendix F.
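A compact self-contained sketch of the same iteration is given here for orientation; it is not the code of Appendix F, and the erf() nonlinearity, the grids, the iteration count and the step size (set just below the bound of eq. (6.58)) are assumptions:

dy = 1e-3;
y  = (-0.999:dy:0.999).';              % observation grid, r = 1999 points
Ni = erfinv(y);                        % N^-1 for N(x) = erf(x)

nw = 0.05;                             % uniform noise of interval length 0.05
ng = (-nw/2:dy:nw/2).';                % noise grid, q = 51 points (odd)
pn = ones(size(ng))/numel(ng);         % discrete noise pdf weights, sum = 1

mu = 0.9/max(abs(fft(pn, numel(y)))); % step size below 1/max|F{pn}|
K  = Ni;                               % K0 = N^-1
for it = 1:3                           % a few iterations, as in section 6.8
    c  = conv(K, flipud(pn), 'valid'); % corr{K, pn} on the valid range
    m  = (numel(pn) - 1)/2;            % samples lost on each side
    K  = K(1+m:end-m) + mu*(Ni(1+m:end-m) - c);
    Ni = Ni(1+m:end-m);                % keep the vectors aligned
    y  = y(1+m:end-m);
end
% restore a signal o afterwards by table lookup:
% xhat = interp1(y, K, o, 'linear', 'extrap');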

6.7.2 Proof that the method is convergent under the given constraint
The Fourier transformation of the (i+1)-th member of eq. (6.58) can be written as

$$\mathbf{K}_{i+1} = \mathbf{K}_i + \mu\left(\mathbf{N}_1 - \mathbf{K}_i \circ \mathbf{P}\right), \tag{6.59}$$

where K, N_1 and P are the Fourier transforms of the K_i(·), N^{-1}(·) and p_n(n) functions respectively, and ∘ means the elementwise product of vectors.
K_{i+1} can be written without recursion:

$$\mathbf{K}_0 = \mathbf{N}_1$$
$$\mathbf{K}_1 = (1-\mu\mathbf{P})\circ\mathbf{K}_0 + \mu\mathbf{N}_1$$
$$\mathbf{K}_2 = (1-\mu\mathbf{P})^{\circ 2}\circ\mathbf{K}_0 + \mu(1-\mu\mathbf{P})\circ\mathbf{N}_1 + \mu\mathbf{N}_1$$
$$\mathbf{K}_3 = (1-\mu\mathbf{P})^{\circ 3}\circ\mathbf{K}_0 + \mu(1-\mu\mathbf{P})^{\circ 2}\circ\mathbf{N}_1 + \mu(1-\mu\mathbf{P})\circ\mathbf{N}_1 + \mu\mathbf{N}_1$$
$$\vdots$$
$$\mathbf{K}_{i+1} = (1-\mu\mathbf{P})^{\circ(i+1)}\circ\mathbf{K}_0 + \mu\left(\mathbf{N}_1 + (1-\mu\mathbf{P})\circ\mathbf{N}_1 + \cdots + (1-\mu\mathbf{P})^{\circ i}\circ\mathbf{N}_1\right). \tag{6.60}$$

As we can see, K_{i+1} can be written as a geometric sum plus a member on the (i+1)-th power:

$$\mathbf{K}_{i+1} = (1-\mu\mathbf{P})^{\circ(i+1)}\circ\mathbf{K}_0 + \mu\,\frac{1-(1-\mu\mathbf{P})^{\circ(i+1)}}{1-(1-\mu\mathbf{P})}\circ\mathbf{N}_1. \tag{6.61}$$
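The collapse of the sum in eq. (6.60) into eq. (6.61) is just the finite geometric series, written out here for clarity (note that 1 − (1 − μP) = μP, so the μ factor cancels):

$$\mu\sum_{m=0}^{i}(1-\mu\mathbf{P})^{\circ m}\circ\mathbf{N}_1 = \mu\,\frac{1-(1-\mu\mathbf{P})^{\circ(i+1)}}{1-(1-\mu\mathbf{P})}\circ\mathbf{N}_1 = \frac{1-(1-\mu\mathbf{P})^{\circ(i+1)}}{\mathbf{P}}\circ\mathbf{N}_1.$$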

Figure 6.40: Sinusoid excitation signal used for the simulations.


If ‖1 − μP‖ < 1, the iteration will be convergent: the term (1 − μP)^(i+1) ∘ K_0 will vanish and the sum will converge to N_1/P, which is the inverse of the correlation. Since the method works in the time domain, circular correlation will not appear and the resulting characteristic will be smooth.

6.8 Simulation results

To show the behaviour of the method for unbiased estimates, a simulation will be shown with a simple sinusoid excitation signal, x, which can be seen in Fig. 6.40. The amplitude of the signal is 0.9. The signal is led through a nonlinearity that has the following characteristic (Fig. 6.41):

$$y = (2.45\,x + 3)^{3}. \tag{6.62}$$

To simulate the effect of noise, uniformly distributed white noise was added to the distorted signal to achieve a 50 dB signal-to-noise ratio (the signal-to-noise ratio is defined as the ratio of the energy of the distorted signal and that of the noise). The noisy and distorted signal can be seen in Fig. 6.42.

Three different estimates of the original, undistorted signal were calculated: one with the exact inverse of the nonlinear distortion (left side of Fig. 6.43) and one with the Tikhonov-regularized characteristic (right side of Fig. 6.43).

Figure 6.41: The nonlinear distortion.

Figure 6.42: Distorted signal.

Figure 6.43: Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse (right).
Here, the regularization parameter was chosen as λ = E{n²}/E{x²} ≈ E{n²}/E{o²} = 2.03·10⁻⁵. The third estimate was calculated with the method proposed in section 6.7, which provides an unbiased estimate (number of iterations = 2, μ = 1). The result can be seen in Fig. 6.44.

To compare the three estimates, first the normalized difference was determined, to see the difference of the estimates from the original signal. The normalized difference was computed as

$$\mathrm{diff} = \frac{1}{N}\sum_{i=1}^{N}\left(x(i) - \hat{x}(i)\right)^{2}. \tag{6.63}$$

To examine the residual distortion of the estimated signals, a sinusoid was fitted to the
estimates in least squares sense and the amplitude of the sinusoid was determined. The
values of the normalized differences and the amplitudes can be seen in Table 6.2.
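A minimal sketch of such a least squares sinusoid fit is the following (the frequency f0 is assumed known from the excitation; xhat stands for one of the three estimates):

t   = (0:numel(xhat)-1).';                 % sample indices
f0  = 1/1200;                              % assumed known cycles per sample
Phi = [sin(2*pi*f0*t), cos(2*pi*f0*t)];    % sine/cosine basis
ab  = Phi \ xhat(:);                       % least squares coefficients
amp = hypot(ab(1), ab(2));                 % amplitude of the fitted sinusoid
d   = mean((x(:) - xhat(:)).^2);           % normalized difference, eq. (6.63)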
The estimate created by the exact inverse has quite a high difference from the original signal and looks very noisy (Fig. 6.43). In addition, the estimate of the amplitude of the sinusoid is strongly distorted: it is 1.01 instead of 0.9.

The estimate made by the Tikhonov-regularized inverse has the smallest difference from the original signal and the least noise (right side of Fig. 6.43); however, the signal remained slightly distorted: the amplitude estimate is 0.869 instead of 0.9.

The amplitude estimate of the unbiased estimate is 0.898, which is very close to the original one. However, the normalized least squares difference from the original signal is about double that of the Tikhonov-regularized estimate.

As expected, the unbiased estimate does not perform best from the point of view of the squared difference.

Figure 6.44: Unbiased reconstruction.

Table 6.2: Comparison results of the exact inverse, Tikhonov and the unbiased characteristics.

characteristic type             normalized difference   amplitude of fitted sinusoid
exact inverse                   0.03294                 1.0108
Tikhonov-regularized inverse    0.00124                 0.8692
unbiased inverse                0.00246                 0.8978

Chapter 7

Conclusions and future possibilities

7.1 Conclusions

There are thousands of valuable film rolls in the national film archives that cannot be presented to audiences because of their degraded quality. These films are noisy and have high nonlinear distortion, and their quality becomes worse day by day. The quality can be preserved by a copying process; however, that is not enough: a restoration process is required to achieve an acceptable sound and image quality for broadcasting.
Several techniques already exist for the compensation of nonlinear distortions; however, most of them are based on pre-compensation or ignore the effects of noise. Since in the case of a film roll only the distorted signal is given, which is very noisy, the solution has to be based on a post-processing technique and has to be robust against noise. Due to the complexity and difficulties of post-processing, such as the iterative behaviour of several techniques and the sensitivity to noise, only a few applications have been created for nonlinear post-processing. These applications were discussed in section 3.4.

The disadvantages of these techniques are their unsatisfactory robustness and the required computational power. There is still a need for a fast restoration method, which may also be used for real-time restoration. Another requirement is the reduction of human interaction in the restoration process. Due to the enormous amount of film data, if the restoration requires too many manual adjustments, it cannot be done within a reasonable time and price.
The aim of this thesis was to find fast algorithms for the restoration of the sound of films, and to find possibilities to reduce human interaction as much as possible.

Two new methods were presented for nonlinear restoration. The first method is based on Tikhonov regularization and produces an output signal that has a small difference from the original, undistorted signal in least squares sense. Usually regularization techniques are iterative ones, but it was shown in 6.4 that in the case of memoryless nonlinear distortions, such as the density characteristic of films, a regularized compensation characteristic can be calculated from the original nonlinear distortion without any iteration. The resulting nonlinear characteristic was compared to the least squares solution, and they were close to each other. The resulting characteristic can be stored, for example, in a look-up table, and the restoration itself can also be done in one step. Hence this part of the algorithm works fast and is suitable for real-time applications or extremely fast background data-processing.

The determination of the regularized characteristic itself may require some manual and some iterative steps. For example, selecting sound parts for the determination of the noise may require some manual work, and the determination of the regularization parameter and the parameters of the nonlinearity requires iterative steps. However, the iteration steps have to be done only on a small part of the signal, so they are also quite fast. Not much human interaction is required compared to other sound-restoration tasks, such as the removal of short-time disturbances or the proper modification of the sound level at certain film parts.
The second restoration method, described in 6.7, produces an unbiased estimate of the original signal. Here the calculation of the compensation characteristic is an iterative method; a constraint for its convergence and the proof of the convergence were also given in 6.7. The resulting characteristic, just as with the previous method, can be stored in a look-up table, and the restoration itself is a non-iterative process.

For the calculation of these compensation characteristics, knowledge of the noise probability distribution and of the distortion characteristic is required. The first method requires the probability distribution of the original signal as well.
Methods for the determination of the noise probability density function (pdf) were collected in 6.4.2, and in the same section a method was shown that is able to find a good estimate of the regularization parameter of the first type of compensation characteristic without knowledge of the pdf of the original signal. A heuristic explanation was given for the convergence of the algorithm.

Although it is an iterative algorithm, simulations show that its convergence is extremely fast: usually three iteration steps are already enough to get a proper estimate of the regularization parameter. The iteration steps are also fast, because in practice we do not have to use the whole signal for the estimation, only a representative part of it, which can be as short as one or two seconds.


Unfortunately, we usually do not have enough information about the original nonlinear distortion, since all we have is a few hundred meters of developed film, with no information about the recording and development parameters. However, using the analytical formula of the film density characteristic and the a priori information that the recorded film contains periodic signal parts, a blind identification method can be developed; this is explained in 6.2.

The identification method, the computation of the regularization parameter and the restoration by the regularized characteristic were tested on synthetic and real audio signals. The proposed method performed well and proved to be robust even in the case of a bandlimited VHS recording.

7.2 Suggestions for future research

7.2.1 Improved blind identification

A main problem of the identification is that the original signal is modeled by harmonically related sinusoids; therefore the method requires periodic parts of the distorted signal for the analysis. However, audio signals contain only quasi-periodic signals, which can be treated as periodic only over a very small interval, e.g. a few tens of samples. This is a very small amount of data, and if it is contaminated by strong noise, the identification can be quite inaccurate.

Longer signal parts could be used if the original signal could be modeled otherwise. In recent years promising results were achieved by modeling sound signals using damped sinusoids [152] and damped and delayed sinusoids [153].

With a more sophisticated signal modeling method the number of model parameters increases significantly. In the harmonic model we had to use only a fundamental frequency parameter plus amplitude and phase parameters for the sinusoids. In the case of damped and delayed sinusoids we also have to use decay and delay parameters for all sinusoids, which means almost twice as many parameters as originally. The high number of parameters can easily cause the optimizer algorithm to get stuck in a local minimum. This is also an important problem of the blind identification that requires further investigation.

7.2.2 Adaptivity

Another important problem is the adaptivity of the method. In this thesis the nonlinear distortion is assumed to be time invariant. Since one film roll has the same sensitivity along the whole film band, the film-sound was recorded with the same amount of basic light intensity, and the concentration of the developer fluid can be treated as constant for one film roll, time invariance of the nonlinearity is a good assumption in the case of film rolls. However, for other applications an automatic and regular determination of the parameters of the nonlinear characteristic could be very useful.

Although the nonlinear distortion of the film can be treated as constant at least over a long part of the film roll, the probability distribution of the noise and of the useful signal can change during the film. In this case different amounts of regularization at different parts of the film could lead to better sound quality, which means the continuous adjustment of the regularization parameter of the nonlinear compensation, like the filter adjustment in noise reduction algorithms [154, 155].

7.2.3 Elimination of nonlinearities with memory

The methods developed in this thesis address only memoryless nonlinearities, and they are applicable only to the compensation of distortions of variable-density type films. However, several old and degraded films were made with the variable-area method. The distortion of these films has to be described as a distortion with memory, which makes their restoration difficult. The correct restoration of, and the effects of noise on, this kind of film still require further investigation. A possible approach could be the further development of the iterative restoration method described in [98], taking the effect of noise into account as well.

Appendix A

Brief history of film-sound

Sound-recording and film-recording techniques have existed for more than 120 years. The first moving pictures were made separately by several inventors. Eadweard Muybridge made moving pictures of his horse in 1877 [156]. Wordsworth Donisthorpe described and patented a camera in November 1876, in which photographic plates could be exposed in rapid succession [157]; it was further developed by Thomas Alva Edison. Edison also invented the phonograph in 1877, which could record and play back sound [158]. In 1878 Donisthorpe proposed a device in Nature combining his Kinesigraph equipment with Thomas Edison's phonograph as a means of recording and reproducing dramatic performances [159]. However, it took more than 10 years to make the first steps to bind sight and sound together.
Film-sound recording and playback had two different methods [127]. The older one was the needle-sound (sound-on-disc) method, where the sound was recorded separately by phonograph or gramophone. This method had the best sound quality until the invention of electrical recording in 1926; after that it was not used anymore. Needle-sound was followed by optical soundtracks, which became a film-sound standard and are still used today (sound-on-film). Magnetic sound-recording (another kind of sound-on-film) appeared only after 1950, when Dolby's technique made it possible to produce high-quality film-sound with this method. This method is also preferred today.

A.1 Sound-on-disc sound

The history of film-sound begins in 1889 with the name of William Kennedy Laurie Dickson, Edison's colleague [127]. He tried to make a new movie projector in which he synchronized Edison's motion picture camera and phonograph together [160]. The first working versions of this machine were sold in 1895. These were the so-called peep-shows, because only one person could enjoy the movie at a time, due to the lack of sound power. The theatre version of this projector was made only in 1913, with a special mechanical sound amplifier [127].
Leon Gaumont, in France, began as early as 1901 to work on combining the phonograph and the motion picture. He worked on the project during several widely separated intervals (a series of shows of the Film Parlant at the Gaumont Palace in Paris in 1913 and demonstrations in the U.S. were the biggest accomplishments).

An attempt by Carl Laemmle of Paramount in 1907 to exploit a combination of phonograph and motion picture resulted in a German development called Syncroscope. It was handicapped by the short time which the record would play and, after some apparently successful demonstrations, was dropped for want of a supply of pictures with sound to maintain programs in the theaters where it was tried.
Efforts to provide sound for movies were attempted by Georges Pomerede, who used
flexible shafts or other mechanical connections to combine phonograph and motion pictures
in 1907, while E. H. Amet in 1912-1918 used electrical methods for the sound. Wm. H.
Bristol began his work on synchronous sound in 1917. There were few further efforts in the
U.S. to provide sound for pictures by means of mechanical recording until 1926.

A.2 Sound-on-film sound

The first attempt to make optical sound recordings was made in 1878 by Alexander Blake; however, he could not solve the playback. That was first achieved by Ernst Ruhmer, a physicist, in 1901. For recording, he used an arc lamp driven by a carbon microphone through a transformer. The light of the lamp was recorded onto a photosensitive film, which captured the changing light intensity of the lamp as lighter and darker spots. The playback was performed with an arc lamp and a selenium sensor [161].
Film and optical sound were joined together by Eugene A. Lauste, who worked at Edison's Orange, N.J. lab between 1887 and 1892 under the direction of W. K. L. Dickson. While working for Edison, Lauste read an 1881 Scientific American article [162] about Bell's Photophone [163] and sought to use this method to record sound on 35mm motion picture film. He applied for a patent in England on Aug. 11, 1906, granted in 1910, for a new and improved method for simultaneously recording and reproducing movements and sounds [127]. His first device used a mechanical grate, then mirrors, and by 1910 he had developed a light gate of a vibrating silicon wire between two magnets. Lauste made many sound films between 1910 and 1914, but was halted by the war.
In 1917 Theodore W. Case developed the Thalofide photocell, which used thallium oxysulfide. By 1922 he had developed the Aeo-light as a source of modulated light. E. I. Sponable worked with Case after 1916, and from 1922 to 1925 he shared equipment with Lee de Forest. Case and Sponable in 1924 developed a sound recording mechanism for a modified Bell and Howell camera using the Aeo-light tube. After breaking off from de Forest in 1925, Case began to develop a projector sound head, offset by 20 frames at a speed of 90 ft. per min., using a narrow slit with a helical filament. General Electric and Western Electric were developing their own sound systems; instead of the Aeo-light they were using mechanical solutions for light modulation, so they did not wish to buy into the Case-Sponable system. William Fox licensed the system on July 23, 1926, and organized the Fox-Case Corp. with Courtland Smith as president to develop what became known as the Movietone News service. Sponable left the Case lab to join Fox in designing the recording studios in New York and Hollywood, and in 1927 he designed a screen that allowed sound to pass through it. The Fox-Case Corp. licensed amplifiers and speakers from Western Electric in 1926 and from ERPI (Electrical Research Products Inc., formed as a Western Electric subsidiary, organized in January 1927). The sound quality of these devices ended the life of the sound-on-disc method.
Besides this, in 1918, J. T. Tykociner developed a sound-on-film system at the University of Illinois that used a mercury arc light and a Kunz photocell (a cathode of potassium on silver).
In 1921, Charles A. Hoxie developed a sound film recorder called the Pallophotophone (meaning shaking light sound) at General Electric, a company that had a well-established photographic and motion picture laboratory under C. E. Bathcholtze for company use and publicity. He recorded speeches by President Coolidge, his Secretary of War and others, which were broadcast on WGY in Schenectady in 1922. He also developed the Pallotrope, a photoelectric microphone to be used as the sound pickup. His film soundtracks were of the variable-area type.
GE gave demos of the Hoxie system in 1926 and 1927 with loudspeakers and amplifiers from Bell Labs. The GE system was called the Kinegraphone and was used to exhibit a road show version of the Paramount film Wings in 1927, using multiple-unit cone-and-baffle type loudspeakers in a bank on each side of the screen. The soundhead was placed on top of the projector, because sound projectors had not yet been installed in theaters. The film speed was 90 ft. per min (24 fps) and the optical soundtrack was recorded on the edge of the film, the image size having been reduced from 1 inch down to 7/8 inch to make room for the variable-area soundtrack. In 1927 the film project was transferred from the Engineering Laboratory to the Radio Dept. for commercial manufacturing. GE would work closely with Westinghouse and RCA in the manufacturing of sound film equipment.
In 1926, one of the first Vitaphone shorts was made with Bryan Foy in the Manhattan Opera House in New York. Vitaphone used a 12-inch or a 16-inch disc on a turntable at 33-1/3 rpm for 9-10 minutes, playing from the inside to the outside, on one side only, with a lateral-cut groove. Victor made the Vitaphone records with much less abrasive filler, causing the discs to wear out after only 24 plays. Vitaphone discs had a needle force of 80-170 grams and a frequency response up to 4300 Hz. The Fox sound-on-film reached 8000 Hz, but had more wow and flutter and more noise, caused by the light cell reading the film emulsion grain. The RCA-GE Photophone system used the variable-area method, which had less noise than the Fox variable-density method. Wente's light valve in the Western Electric variable-density sound-on-film method was capable of transmitting up to 8500 Hz.
On Dec. 20, 1926, Western Electric and AT&T created Electrical Research Products Inc. (ERPI) to license non-telephone technology, including Vitaphone as well as microphones, amplifiers and loudspeakers. On Dec. 31, Fox signed an agreement with ERPI to combine its Movietone sound-on-film method with Western Electric's amplification methods for theater use. This variable-density system would compete for the next decade with the RCA variable-area system, which was adopted by RKO after 1928.
In 1930, after the invention of electrical recording had made sound pictures possible from 1926 on, the motion picture soundtrack was standardized as a single-track (monaural) sound-on-film (optical) track on the edge of a 35mm film strip. Some variable-width tracks had a solid black left edge, some had a solid right edge, and some were variable on both edges. In most theaters, a Western Electric 35mm film projector had an optical pickup head that read the variable-width or variable-density image with a selenium photoelectric cell, producing an electrical signal that went to a monaural tube amplifier driving a large horn speaker behind the screen at the front of the theater.
In 1934, RCA introduced a 16mm sound motion picture camera for the amateur market
that recorded an optical soundtrack on the edge of the film.
Nov. 13, 1940 is the premiere date of Walt Disney's Fantasia in New York's Broadway Theater, with a multichannel soundtrack produced by Leopold Stokowski, who recorded an optical track for each section of the orchestra, resulting in 9 separate soundtracks. These were mixed by Stokowski into 4 master optical tracks that were played in synchronization on special equipment made by RCA for a multiple-loudspeaker theater installation called Fantasound; behind the screen were three horns, and placed around the other walls of the theater were 65 smaller speakers. The separation and directionality of sounds was impressive. However, the system was not practical because of the $85,000 cost to equip each theater, opposition by unions, and a demand by the government that RCA stop manufacturing the necessary sound components because of defense priorities. After the second full installation of equipment at the Carthay Circle Theater in Los Angeles, it was not installed in any other theaters.


Appendix B

Optimal signal restoration in linear systems

B.1 Simple linear system

Consider a linear system having infinite bandwidth:

$$y = A_1 x + A_0. \tag{B.1}$$

We have only a noisy observation of the output of this system:

$$o = y + n, \tag{B.2}$$

where n is the output noise, assumed to be independent of y. The expected values of x and n are zero, and the variances of x and n are assumed to be finite and different from 0.
We would like to have an estimate, x̂, of the original input signal by using a post-distortion process. The model of the estimation process can be seen in Fig. B.1. x̂ can be expressed as

$$\hat{x} = B_1 o + B_0 = B_1 y + B_1 n + B_0 = B_1 A_0 + B_1 A_1 x + B_1 n + B_0. \tag{B.3}$$

We would like to make an optimal reconstruction, x̂, of x in least squares sense by finding the proper values of B_0 and B_1.

Figure B.1: Estimation of x in the knowledge of o.


The expected value of the squared error, ε, between x and x̂ can be written as

$$\begin{aligned} E\{\varepsilon\} &= E\left\{(\hat{x}-x)^{2}\right\} = E\left\{(B_1 A_0 + B_1 A_1 x + B_1 n + B_0 - x)^{2}\right\} \\ &= B_1^2 A_0^2 + 2B_1^2 A_0 A_1 E\{x\} + 2B_1^2 A_0 E\{n\} + 2B_0 B_1 A_0 - 2B_1 A_0 E\{x\} \\ &\quad + B_1^2 A_1^2 E\{x^2\} + 2B_1^2 A_1 E\{xn\} + 2B_0 B_1 A_1 E\{x\} - 2B_1 A_1 E\{x^2\} \\ &\quad + B_1^2 E\{n^2\} + 2B_0 B_1 E\{n\} - 2B_1 E\{nx\} + B_0^2 - 2B_0 E\{x\} + E\{x^2\} \end{aligned} \tag{B.4}$$

We are looking for those B_0 and B_1 values where E{ε} is minimal. The minimum of eq. (B.4) is at the point where ∂E{ε}/∂B_0 = 0 and ∂E{ε}/∂B_1 = 0. For B_0 we can write:

$$2B_1 A_0 + 2B_1 A_1 E\{x\} + 2B_1 E\{n\} + 2B_0 - 2E\{x\} = 0 \tag{B.5}$$

Since E{x} = E{n} = E{xn} = 0, therefore

$$B_0 = -B_1 A_0, \tag{B.6}$$

regardless of x or n.
For B_1 we can write:

$$\begin{aligned} &2B_1 A_0^2 + 4B_1 A_0 A_1 E\{x\} + 4B_1 A_0 E\{n\} + 2B_0 A_0 - 2A_0 E\{x\} \\ &\quad + 2B_1 A_1^2 E\{x^2\} + 4B_1 A_1 E\{xn\} + 2B_0 A_1 E\{x\} - 2A_1 E\{x^2\} \\ &\quad + 2B_1 E\{n^2\} + 2B_0 E\{n\} - 2E\{xn\} = 0. \end{aligned} \tag{B.7}$$

Simplifying eq. (B.7) and expressing it for B_1, we get

$$B_1 = \frac{1}{A_1 + \dfrac{E\{n^2\}}{A_1 E\{x^2\}}} \tag{B.8}$$

Figure B.2: Original and inverse piecewise linear system.

B.2 Piecewise linear model with two and more intervals

Consider the following piecewise linear system with breakpoint θ:

    y = { A_{10} + A_{11} x,   -∞ < x < θ
        { A_{20} + A_{21} x,   θ ≤ x < ∞        (B.9)

We have only a noisy observation of the output of this system:

    o = y + n,        (B.10)

where n is the output noise, assumed to be independent of y. The expected values of x and n are zero.
We would like to have an estimate, x̂, of the original input signal by using a post-distortion process. The estimator can be represented as in Fig. B.2:

    x̂ = { B_{10} + B_{11} o,   -∞ < o < ξ
        { B_{20} + B_{21} o,   ξ ≤ o < ∞        (B.11)

where ξ = A_{10} + A_{11} θ = A_{20} + A_{21} θ is the common value of the two segments at the breakpoint.
Using Eq. (B.11) as the post-distorter, we can distinguish four different cases:
1. when y < ξ and y + n < ξ, then x̂ = B_{10} + B_{11}(A_{10} + A_{11} x + n);
2. when y ≥ ξ and y + n < ξ, then x̂ = B_{10} + B_{11}(A_{20} + A_{21} x + n);
3. when y < ξ and y + n ≥ ξ, then x̂ = B_{20} + B_{21}(A_{10} + A_{11} x + n);
4. when y ≥ ξ and y + n ≥ ξ, then x̂ = B_{20} + B_{21}(A_{20} + A_{21} x + n).
If the probabilities of these four cases are denoted by p_1, p_2, p_3 and p_4 respectively, we can write:

    p_1 = p(y < ξ | o < ξ),
    p_2 = p(y ≥ ξ | o < ξ),
    p_3 = p(y < ξ | o ≥ ξ),
    p_4 = p(y ≥ ξ | o ≥ ξ),
    p_1 + p_2 = 1 if o < ξ,
    p_3 + p_4 = 1 if o ≥ ξ.        (B.12)

First, examine the case when o < ξ. The expected value of the squared difference is

    E{ε} = E{( p_1 (B_{10} + B_{11}(A_{10} + A_{11} x + n)) + p_2 (B_{10} + B_{11}(A_{20} + A_{21} x + n)) - x )^2}.        (B.13)

After expanding the square and dropping the terms where E{x} = 0, E{n} = 0 and E{xn} = 0, Eq. (B.13) reduces to:

    E{ε} = B_{10}^2 + 2 p_1 B_{10} B_{11} A_{10} + 2 p_2 B_{10} B_{11} A_{20} + p_1^2 B_{11}^2 A_{10}^2 + 2 p_1 p_2 B_{11}^2 A_{10} A_{20}
         + p_2^2 B_{11}^2 A_{20}^2 + B_{11}^2 E{n^2} + p_1^2 B_{11}^2 A_{11}^2 E{x^2} + 2 p_1 p_2 B_{11}^2 A_{11} A_{21} E{x^2}
         - 2 p_1 B_{11} A_{11} E{x^2} + p_2^2 B_{11}^2 A_{21}^2 E{x^2} - 2 p_2 B_{11} A_{21} E{x^2} + E{x^2}.        (B.14)

We are looking for those B_{10} and B_{11} values where E{ε} is minimal. The minimum of Eq. (B.14) is at the point where ∂E{ε}/∂B_{10} = 0 and ∂E{ε}/∂B_{11} = 0. For B_{10}, we can write:

    B_{10} = -p_1 B_{11} A_{10} - p_2 B_{11} A_{20}.        (B.15)

For B_{11}, we can write:

    p_1 B_{10} A_{10} + p_2 B_{10} A_{20} + B_{11} E{n^2} + p_1^2 B_{11} A_{10}^2 + 2 p_1 p_2 B_{11} A_{10} A_{20}
    + p_2^2 B_{11} A_{20}^2 + p_1^2 B_{11} A_{11}^2 E{x^2} + 2 p_1 p_2 B_{11} A_{11} A_{21} E{x^2}
    - p_1 A_{11} E{x^2} + p_2^2 B_{11} A_{21}^2 E{x^2} - p_2 A_{21} E{x^2} = 0.        (B.16)

Substituting Eq. (B.15) into Eq. (B.16) and solving for B_{11}, we get

    B_{11} = 1 / ( (p_1 A_{11} + p_2 A_{21}) + E{n^2} / ((p_1 A_{11} + p_2 A_{21}) E{x^2}) ).        (B.17)

Similarly, for B_{20} and B_{21} we get

    B_{20} = -p_3 B_{21} A_{10} - p_4 B_{21} A_{20},
    B_{21} = 1 / ( (p_3 A_{11} + p_4 A_{21}) + E{n^2} / ((p_3 A_{11} + p_4 A_{21}) E{x^2}) ).        (B.18)

For a piecewise model with k intervals, [U_1, U_2, ..., U_i, ..., U_k], the following formulas can be derived:

    B_{i0} = -B_{i1} Σ_{j=1}^{k} p(y ∈ U_j | o ∈ U_i) A_{j0},

    B_{i1} = 1 / ( Σ_{j=1}^{k} p(y ∈ U_j | o ∈ U_i) A_{j1} + E{n^2} / ( E{x^2} Σ_{j=1}^{k} p(y ∈ U_j | o ∈ U_i) A_{j1} ) ).        (B.19)

If the probability density functions of y and n are exactly known, p(y ∈ U_j | o ∈ U_i) can be computed as

    p(y ∈ U_j | o ∈ U_i) = (1 / p(o ∈ U_i)) ∫_{β ∈ U_j} ∫_{(β+η) ∈ U_i} p_y(β) p_n(η) dη dβ.        (B.20)
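
For illustration, the conditional probabilities in Eq. (B.19) can also be estimated empirically instead of evaluating Eq. (B.20). The following MATLAB sketch computes the compensator coefficients of a continuous, saturating two-interval characteristic by Monte Carlo; the segment parameters, the breakpoint and the variances are assumptions chosen for the example.

%Monte-Carlo evaluation of eq. (B.19) for a two-interval system;
%all numerical values are assumptions for this illustration
A = [0 2; 1.5 0.5];                  %rows: [Aj0 Aj1]; continuous at theta
theta = 1; xi = A(1,1)+A(1,2)*theta; %input and output breakpoints
vx = 1; vn = 0.04;                   %E{x^2} and E{n^2}
x = sqrt(vx)*randn(1e6,1);
n = sqrt(vn)*randn(1e6,1);
y = (x<theta).*(A(1,1)+A(1,2)*x) + (x>=theta).*(A(2,1)+A(2,2)*x);
o = y + n;                           %noisy observation
U = {o<xi, o>=xi};                   %the two output intervals
for i = 1:2,
p = [mean(x(U{i})<theta), mean(x(U{i})>=theta)]; %p(y in Uj | o in Ui)
s = p(1)*A(1,2) + p(2)*A(2,2);       %weighted sum of the slopes
Bi1(i) = 1/(s + vn/(s*vx));          %eq. (B.19)
Bi0(i) = -Bi1(i)*(p(1)*A(1,1) + p(2)*A(2,1));
end;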


Appendix C
MATLAB simulation of a realistic photosensitive layer
%Monte-Carlo simulation of the transmission vs. exposure characteristic
%of a photosensitive emulsion with lognormally distributed grain sizes
clear all
pack
format long
%interval of interest
x=(0.001:0.001:3.7); %projective area of the silver halide particles in um^2
%typical distribution parameters;
%these parameters were fitted to the graph in
%T. H. James, The Theory of the Photographic Process,
%Macmillan, 1966, page 39, Fig 2.1
%(histogram of the size-frequency curve)
S = 2.5;
M = 0.07;
c1 = 100 * 1/log(S)*sqrt(2)*sqrt(pi);
%making a typical lognormal distribution:
%number of particles vs. projection area of the particles
lognorm_dist = c1 * exp(-(log(x)-log(M)).^2/2/log(S));
%lognorm_dist = lognorm_dist(500:end);
%small particles are insensitive to visible light
lognorm_dist2 = round(lognorm_dist);
%number of silver halide particles in the simulation
N = sum(lognorm_dist2);
%grains(i): number of impacted photons in the simulation
%sensitivity(i): number of required photons to make the i-th grain developable
grains = zeros(N,1);
sensitivity = zeros(N,1);
evaluation_vect = zeros(N,1);
opacity = zeros(N,1);
index1 = 1;index2 = 1;
for i = 1:length(lognorm_dist2),
index1 = index2;
index2 = index2 + lognorm_dist2(i);
sens_avg = round(200/x(i)); %average sensitivity: smaller grains need more photons
sensitivity(index1:index2-1) = exp(randn(index2-index1,1)+log(sens_avg));
opacity(index1:index2-1) = x(i)*ones(index2-index1,1);
end;
%expose the layer step by step; char_curve(i) is the total opacity
%(projective area) of the developable grains after the i-th step
for i = 1:1000000,
index = round((N-1)*rand(500000,1))+1; %random photon impacts in this step
grains(index) = grains(index)+1;
evaluation_vect = grains-sensitivity;
indexlist = find(evaluation_vect>0); %grains that became developable
disp(length(indexlist));
disp(N);
char_curve(i) = sum(opacity(indexlist));
if(mod(i,50)==0)
save char.mat char_curve
end;
if(mod(i,10)==0)
figure(1);plot(char_curve);pause(0.01);
end;
end;
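
The simulated curve can be turned into a density vs. log-exposure (Hurter-Driffield-type) characteristic with a few additional lines. The following conversion is only a crude sketch meant to be run in the same workspace; it assumes that the photons arrive at a constant rate (so the iteration count is proportional to the exposure) and it neglects the overlap of the grains.

%crude post-processing sketch under the assumptions stated above
exposure = 1:length(char_curve);
area_total = sum(opacity);      %total projective area of all grains
T = 1 - char_curve/area_total;  %approximate transmission of the layer
D = -log10(max(T,eps));         %optical density
figure(2);semilogx(exposure,D);
xlabel('relative exposure');ylabel('density');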


Appendix D
MATLAB realization of computation of regularized nonlinear characteristics
function K=makeinvc2(lambda,x2);
% K=makeinvc2(lambda,x2);
% This function produces a regularized inverse, K[i] of y=myfcn(x)
% in the range [0,x2] with dx=0.00025 resolution.
% lambda is the Tikhonov regularization parameter.
% We assume that myfcn^{-1}(0) = 0.
dx = 0.00025;  %resolution
d = 0.000001;  %auxiliary step for numerical differentiation
x_begin=0;
y_begin=0;
x_max=abs(x2);
N=ceil(x_max/dx)+1;
x=x_begin;
K=zeros(1,N+1);
K(1)=y_begin;


for i=2:(N+1),
%slope of myfcn at the current point (central difference)
dy=(myfcn(x+d)-myfcn(x-d))/d/2;
%slope re-evaluated one predicted step ahead, then averaged
dy2=(myfcn(x+dy*dx+d)-myfcn(x+dy*dx-d))/d/2;
dy=0.5*dy+0.5*dy2;
%Tikhonov-regularized slope of the inverse: dy/(dy^2+lambda)
dy=dy/(dy*dy+lambda);
%integrate to get the next point of the inverse characteristic
K(i)=K(i-1)+dy*dx;
x=K(i);
end;
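
A hypothetical usage example follows; the choice of myfcn and the value of lambda are assumptions, not values from the thesis. With a saturating characteristic stored in myfcn.m, the regularized inverse can be produced and plotted as:

%hypothetical example of calling makeinvc2;
%myfcn and lambda are assumed values
%
%  function y=myfcn(x)          %saved as myfcn.m
%  y=erf(2*x);
%
lambda = 0.01;                  %Tikhonov regularization parameter
K = makeinvc2(lambda,0.999);    %regularized inverse on [0, 0.999]
o = (0:length(K)-1)*0.00025;    %abscissa belonging to the table
figure(1);plot(o,K);
xlabel('distorted value o');ylabel('compensated value');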


Appendix E
MATLAB realization of finding the optimal regularization
% n : signal part from the observation that contains only noise
% y : noisy, nonlinearly distorted observation
% x_est : estimate about the original, undistorted signal
% myfcn(x) : original nonlinear function
% myinvfcn(y,K,Kx) : the regularized inverse function;
%                    K and Kx are auxiliary variables

d = 0.0001;
p_nx = (-0.1:d:0.1);
p_n = hist(n,p_nx);
% histogram of the noise (estimation about the noise p.d.f.)
p_x_estx = (-5:d:5);
p_x_est = hist(y,p_x_estx);
p_x_est = p_x_est/sum(p_x_est);
% histogram of the observation (first estimation about
% the input signal p.d.f.)
for k=1:3, %three iterations are enough
for j=1:20,
% Twenty different regularized characteristics are already
% stored on disk (invchar0.mat ... invchar19.mat);
% we just select the best one.
s = int2str(j-1);
s = ['invchar',s,'.mat'];
feval('load',s);
e(j)=0;
for i=1:length(p_x_est),
e(j)=e(j)+p_x_est(i)*sum(p_n .* ...
(myinvfcn(myfcn(p_x_estx(i))+p_nx,K,Kx)-p_x_estx(i)).^2);
end;
%equivalently, with a helper function:
%e(j)=myerror(j-1,p_x_est,p_x_estx,p_n,p_nx)
end;
elog(k,:)=e;
[emin,pos]=min(e)
s = int2str(pos-1);
s = ['invchar',s,'.mat'];
feval('load',s);
x_est=myinvfcn(y,K,Kx);
p_x_est = hist(x_est,p_x_estx);
p_x_est = p_x_est/sum(p_x_est);
end;
% pos contains the number of the best regularized characteristic.
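
The twenty stored characteristics loaded above could be generated beforehand with makeinvc2 from Appendix D; the following sketch shows one possible way (the lambda grid is an assumption):

%hypothetical generation of the stored characteristics;
%the lambda grid is an assumption
lambdas = logspace(-6,-1,20);    %candidate regularization parameters
for j = 1:20,
K = makeinvc2(lambdas(j),2);     %inverse table on [0, 2]
Kx = (0:length(K)-1)*0.00025;    %abscissa belonging to the table
s = ['invchar',int2str(j-1),'.mat'];
save(s,'K','Kx');
end;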


Appendix F
MATLAB realization of calculation of compensation characteristic for unbiased signal reconstruction
%computation of the compensation characteristic
%for unbiased signal reconstruction,
%in the case of the erf() nonlinearity
%and uniformly distributed noise
%selecting the range of interest
do = 0.00001;
range_o = 0.99999;
range_n = 0.1; %0.05;
o = (-range_o:do:range_o);
%creating the probability density function
%of the uniformly distributed noise
n = ones(1,round(2*range_n/do)-1)/(2*range_n/do);
d_h = round(range_n/do)-1; %half-width of the noise p.d.f. in samples
%first iteration of the compensation characteristic
K_0 = erfinv(o);
%noise-smoothed characteristic: expected value of K_0(o+n)
kappa = conv(n,K_0);
kappa2 = kappa(2*d_h+1:end-2*d_h);
%K_0 restricted to the same (interior) range
Nm1 = K_0(d_h+1:end-d_h);
alpha = 1; %relaxation parameter of the iteration
index = 1:length(Nm1);
K_1 = K_0(index+d_h+1) + alpha*(Nm1(index)-kappa2(index));
%with K_0 replaced by K_1, the iteration can be continued
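
The script above computes only the first corrective step, K_1. A hypothetical continuation of the iteration could look as follows; the number of steps and the interior-only update are assumptions:

%hypothetical continuation of the iteration;
%the step count and the interior-only update are assumptions
K = K_0;
for m = 1:20,
kap = conv(n,K);
kap = kap(2*d_h+1:end-2*d_h);   %same-range part of the convolution
idx = (d_h+1):(length(K)-d_h);  %interior indices of K
K(idx) = K(idx) + alpha*(K_0(idx)-kap);
end;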


Bibliography
[1] S. J. Godsill and P. J. W. Rayner, Digital Audio Restoration - A Statistical Model-Based Approach. Springer-Verlag, 1998, ch. 1.3, pp. 6–7.
[2] P. T. Troughton and S. J. Godsill, "Restoration of Nonlinearly Distorted Audio Using Markov Chain Monte Carlo Methods," Journal of the Audio Engineering Society (Abstracts), vol. 6, p. 569, June 1998, presented at the 104th Convention of the Audio Engineering Society, Amsterdam, May 1998. Paper available from the AES.
[3] J. Tsimbinos, "Identification and Compensation of Nonlinear Distortion," Ph.D. dissertation, Institute for Telecommunications Research, School of Electronic Engineering, University of South Australia, February 1995.
[4] J. G. Wohlbier, "Nonlinear Distortion and Suppression in Traveling Wave Tubes: Insights and Methods," Ph.D. dissertation, University of Wisconsin-Madison, 2003.
[5] P. Kiss, U. Moon, J. Steensgaard, J. Stonick, and G. Temes, "High-speed ΣΔ ADC with error correction," Electronics Letters, vol. 37, no. 2, pp. 76–77, January 2001.
[6] J. Geigel and F. K. Musgrave, "A Model for Simulating the Photographic Development Process on Digital Images," Association for Computing Machinery, Inc., Tech. Rep., 1997.
[7] A. M. Díaz, A. F. Barros, and F. M. Candocia, "Image Registration in Range Using a Constrained Piecewise Linear Model: Analysis and New Results," Proceedings of the 2003 International Conference on Imaging Science, Systems and Technology (CISST'03), Las Vegas, Nevada, vol. 1, pp. 152–158, June 23-26 2003.
[8] J. Schimmel, "Non-linear Dynamics Processing," Presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5775.

[9] M. van der Veen and P. Touzelet, "New Vacuum Tube and Output Transformer Models Applied to the Quad II Valve Amplifier," Presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5748.
[10] S. Möller, M. Gromowski, and U. Zölzer, "A Measurement Technique for Highly Nonlinear Transfer Functions," Proceedings of the 5th International Conference on Digital Audio Effects (DAFx-02), Hamburg, Germany, pp. DAFX-203–DAFX-206, September 26-28 2002.
[11] C. B. Boyer, A History of Mathematics, second edition. John Wiley, 1991, ch. 18.
[12] K. Weierstrass, Mathematische Werke. Mayer und Müller, Berlin, 1903, vol. 3, pp. 1–37, (Review in Jahrbuch Database JFM).
[13] N. M. Blachman, "The Signal-signal, Noise-noise, and Signal-noise Output of a Nonlinearity," IEEE Transactions on Information Theory, vol. IT-14, no. 1, pp. 21–27, January 1968.
[14] ———, "The Uncorrelated Output Components of a Nonlinearity," IEEE Transactions on Information Theory, vol. IT-14, no. 2, pp. 250–255, January 1968.
[15] G. Szegő, Orthogonal Polynomials. American Mathematical Society Colloquium Publications, 1939.
[16] A. A. M. Saleh, "Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers," IEEE Transactions on Communications, vol. COM-29, pp. 1715–1720, November 1981.
[17] A. Ghorbani and M. Sheikhan, "The Effect of Solid State Power Amplifiers (SSPAs) Nonlinearities on MPSK and M-QAM Signal Transmission," Sixth International Conference on Digital Processing of Signals in Communication, pp. 193–197, 1991.
[18] C. Rapp, "Effects of HPA-Nonlinearity on a 4-DPSK/OFDM-Signal for a Digital Sound Broadcasting System," Proceedings of the Second European Conference on Satellite Communications, Liège, Belgium, pp. 179–184, October 22-24 1991.
[19] M. Ibnkahla, J. Sombrin, F. Castanié, and N. J. Bershad, "Neural Networks for Modeling Nonlinear Memoryless Communication Channels," IEEE Transactions on Communications, vol. 45, no. 7, pp. 768–771, July 1997.

[20] C. E. K. Mees, The Theory of the Photographic Process. Macmillan, 1966, ch. 4, pp. 72–86.
[21] R. J. Cox, Photographic Sensitivity - Proceedings of the Symposium on Photographic Sensitivity held at Gonville and Caius College and Little Hall, Cambridge, September 1972. Academic Press, 1973, ch. 25, pp. 375–389.
[22] H. Farid, "Blind Inverse Gamma Correction," IEEE Transactions on Image Processing, vol. 10, no. 2, 2001.
[23] M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems. John Wiley, 1980.
[24] S. Y. Fakhouri, "Identification of Volterra kernels of nonlinear systems," Proceedings of the IEE, Part D, vol. 127, no. 6, pp. 296–304, November 1980.
[25] P. T. Troughton, "Simulation Methods for Linear and Nonlinear Time Series Models with Application to Distorted Audio Signals," Ph.D. dissertation, University of Cambridge, 1999.
[26] H. Tong, Non-Linear Time Series: A Dynamical System Approach. Clarendon Press, Oxford, 1993.
[27] D. Tjøstheim, "Non-linear time series: A selective review," Scandinavian Journal of Statistics, vol. 21, pp. 87–130, 1994.
[28] S. Chen and S. A. Billings, "Representations of non-linear systems: The NARMAX model," International Journal of Control, vol. 49, no. 3, pp. 1013–1032, 1989.
[29] G. Palm, "On representation and approximation of nonlinear systems," Biological Cybernetics, vol. 31, pp. 119–124, 1978.
[30] M. J. Korenberg, "Parallel cascade identification and kernel estimation for nonlinear systems," Annals of Biomedical Engineering, vol. 19, pp. 429–455, 1991.
[31] D. T. Westwick, "Methods for the Identification of Multiple-Input Nonlinear Systems," Ph.D. dissertation, McGill University, Montreal, 1995.
[32] T. L. Sculley and M. A. Brooke, "Nonlinearity Correction Techniques for High Speed, High Resolution A/D Conversion," IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 42, no. 3, pp. 154–163, March 1995.

[33] N. T. Thao, "Systematic Approach to the Digital Compensation for Deterministic Nonideal Characteristics in ΣΔ Modulation," IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 45, no. 9, pp. 1315–1321, September 1998.
[34] P. Rombouts, J. Raman, and L. Weyten, "An Approach to Tackle Quantization Noise Folding in Double-Sampling ΣΔ Modulation A/D Converters," IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, vol. 50, no. 4, pp. 157–163, April 2003.
[35] P. Rombouts, J. D. Maeyer, and L. Weyten, "A 250-kHz 94-dB Double-Sampling ΣΔ Modulation A/D Converter With a Modified Noise Transfer Function," IEEE Journal of Solid State Circuits, vol. 38, no. 10, pp. 1657–1662, October 2003.
[36] O. Mitrea, C. Popa, A. M. Manolescu, and M. Glesner, "A curvature-corrected CMOS bandgap reference," Advances in Radio Science, vol. 1, pp. 181–184, 2003.
[37] M. Ortmanns, Y. Manoli, and F. Gerfers, "A New Technique for Automatic Error Correction in ΣΔ Modulators," IEEE ISCAS, 2004, paper 1022.
[38] H. Schurer, C. Slump, and O. Herrmann, "Comparison of Three Methods for Linearization of Electrodynamic Transducers," Proceedings of the ProRISC/IEEE BeNeLux Workshop on Circuits, Systems and Signal Processing, Mierlo, The Netherlands, pp. 285–290, November 27-28 1996.
[39] L. Cristaldi, A. Ferrero, M. Lazzaroni, and R. Ottoboni, "A Linearization Method for Commercial Hall-Effect Current Transducers," IEEE Transactions on Instrumentation and Measurement, vol. 50, no. 5, pp. 1149–1153, October 2001.
[40] F. Kemenes, Hangfényképezés (Optical sound-recording). Mérnöki Továbbképző Intézet, 1954, from the institute's 1953–54 lecture series, no. 2852 (in Hungarian).
[41] A. R. Kaye, D. A. George, and M. J. Eric, "Analysis and Compensation of Bandpass Nonlinearities for Communications," IEEE Transactions on Communications, vol. 20, no. 5, pp. 965–972, October 1972.
[42] E. Biglieri, S. Barberis, and M. Catena, "Analysis and Compensation of Nonlinearities in Digital Transmission Systems," IEEE Journal on Selected Areas in Communications, vol. 6, no. 1, pp. 42–51, January 1988.

[43] G. Karam and H. Sari, "Analysis of Predistortion, Equalization, and ISI Cancellation Techniques in Digital Radio Systems with Nonlinear Transmit Amplifiers," IEEE Transactions on Communications, vol. 37, no. 12, pp. 1245–1253, December 1989.
[44] S. Pupolin and L. J. Greenstein, "Performance Analysis of Digital Radio Links With Nonlinear Transmit Amplifiers," IEEE Journal of Selected Areas in Communication, vol. SAC-5, pp. 534–546, April 1987.
[45] P.-R. Chang and B.-C. Wang, "Adaptive Decision Feedback Equalization for Digital Satellite Channels Using Multilayer Neural Networks," IEEE Journal on Selected Areas in Communication, vol. 13, no. 2, pp. 316–324, February 1995.
[46] G. Lazzarin, S. Pupolin, and A. Sarti, "Nonlinearity Compensation in Digital Radio Systems," IEEE Transactions on Communications, vol. 42, no. 2/3/4, pp. 988–998, February/March/April 1994.
[47] R. Raich, H. Qian, and G. T. Zhou, "Digital baseband predistortion of nonlinear power amplifiers using orthogonal polynomials," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), vol. 6, pp. VI-689–VI-692, April 2003.
[48] S. Nadjarah, X. N. Fernando, and R. Sedaghat, "Adaptive digital predistortion of laser diode nonlinearity for wireless applications," Canadian Conference on Electrical and Computer Engineering, IEEE CCECE 2003, vol. 1, pp. 159–162, May 2003.
[49] S. K. Wilson and P. Delay, "A Method to Improve Cathode Ray Oscilloscope Accuracy," IEEE Transactions on Instrumentation and Measurement, vol. 43, no. 3, pp. 483–486, June 1994.
[50] H. Black, "Inventing the Negative Feedback Amplifier," IEEE Spectrum, pp. 55–60, 1977.
[51] G. J. Adams, "Adaptive Control of Loudspeaker Frequency Response at Low Frequencies," Journal of the Audio Engineering Society, May 1983.
[52] A. J. M. Kaizer, "Modelling of the Nonlinear Response of an Electrodynamic Loudspeaker by a Volterra Series Expansion," Journal of the Audio Engineering Society, vol. 35, no. 6, June 1987.

[53] W. Klippel, "The Mirror Filter - A New Basis for Linear Equalization and Nonlinear Distortion Reduction of Woofer Systems," Journal of the Audio Engineering Society, vol. 40, no. 9, p. 675, March 1992.
[54] ———, "Filter Structures to Compensate for Nonlinear Distortions of Horn Loudspeakers," Journal of the Audio Engineering Society, October 1995, preprint number 4102.
[55] ———, "Modeling the Nonlinearities in Horn Loudspeakers," Journal of the Audio Engineering Society, vol. 44, no. 6, pp. 470–480, June 1996.
[56] ———, "Compensation for Nonlinear Distortion of Horn Loudspeakers by Digital Signal Processing," Journal of the Audio Engineering Society, vol. 44, no. 11, pp. 964–972, November 1996.
[57] ———, "Adaptive Adjustment of Nonlinear Filters Used for Loudspeaker Linearization," Journal of the Audio Engineering Society, vol. 46, no. 11, p. 939, May 1998, preprint number 4646.
[58] ———, "Diagnosis and Remedy of Nonlinearities in Electrodynamical Transducers," Journal of the Audio Engineering Society, September 2000, preprint number 5261.
[59] H. Schurer, A. G. Nijmeijer, M. A. Boer, C. H. Slump, and O. E. Herrmann, "Identification and Compensation of the Electrodynamic Transducer Nonlinearities," Proceedings of the International Conference on Acoustic, Speech and Signal Processing, ICASSP'97, vol. 3, pp. 2381–2385, 1997. [Online]. Available: citeseer.nj.nec.com/schurer97identification.html
[60] M. Sternad, M. Johansson, and J. Rutstrom, "Inversion of Loudspeaker Dynamics by Polynomial LQ Feedforward Control," Proceedings of the IFAC Symposium on Robust Control Design, Prague, Czech Republic, vol. 13, 2000. [Online]. Available: citeseer.nj.nec.com/sternad00inversion.html
[61] A. Bellini, G. Cibelli, E. Ugolotti, A. Farina, and C. Morandi, "Non-linear Digital Audio Processor for Dedicated Loudspeaker Systems," IEEE Transactions on Consumer Electronics, vol. 44, no. 3, pp. 1024–1031, August 1998.
[62] A. Stenger, L. Trautmann, and R. Rabenstein, "Nonlinear Acoustic Echo Cancellation With 2nd Order Adaptive Volterra Filters," IEEE International Conference on Acoustics, Speech and Signal Processing, March 1999, Phoenix, USA.

[63] A. Stenger and R. Rabenstein, "Adaptive Volterra Filters for Nonlinear Acoustic Echo Cancellation," Nonlinear Signal and Image Processing (NSIP), June 20-23 1999.
[64] A. Stenger, W. Kellermann, and R. Rabenstein, "Adaptation of Acoustic Echo Cancellers Incorporating a Memoryless Nonlinearity," IEEE Workshop on Acoustic Echo and Noise Control (IWAENC'99), Pocono Manor PA, USA, 1999.
[65] A. Stenger and W. Kellermann, "Adaptation of a Memoryless Preprocessor for Nonlinear Acoustic Echo Cancelling," Signal Processing, Elsevier, vol. 80, pp. 1747–1760, September 2000.
[66] T. K. Sarkar, D. D. Weiner, and V. K. Jain, "Some Mathematical Considerations in Dealing with the Inverse Problem," IEEE Transactions on Antennas and Propagation, vol. AP-29, no. 2, pp. 373–379, March 1981.
[67] S. Vladimir, "A dekonvolúció és méréstechnikai alkalmazási lehetőségei (Deconvolution and its possible applications in measurement technology)," III. Országos Elektronikus Műszer- és Méréstechnikai Konferencia, March 13-16 1972 (in Hungarian).
[68] P. E. Siska, "Iterative unfolding of intensity data with application to molecular beam scattering," The Journal of Chemical Physics, vol. 59, no. 11, pp. 6052–6060, December 1973.
[69] D. Henderson, A. G. Roddie, J. G. Edwards, and H. M. Jones, "A deconvolution technique using least-squares model-fitting and its application to optical pulse measurement," US Department of Commerce, National Technical Information Service, National Physical Laboratory technical report DES-87, 1988.
[70] J. Biemond, R. L. Lagendijk, and R. M. Mersereau, "Iterative Methods for Image Deblurring," Proceedings of the IEEE, vol. 78, no. 5, pp. 856–883, May 1990.
[71] R. Molina, J. Nunez, and J. Mateos, "Image Restoration in Astronomy - A Bayesian Perspective," IEEE Signal Processing Magazine, vol. 18, no. 2, pp. 11–29, March 2001.
[72] S. E. Kiersztyn, "Numerical Correction of HV Impulse Deformed by the Measuring System," IEEE Transactions on Power Apparatus and Systems, vol. PAS-99, no. 5, pp. 1984–1991, Sept/Oct 1980.
[73] I. Kollar, P. Osvath, and W. S. Zaengl, "Numerical Correction and Deconvolution of Noisy HV Impulses by Means of Kalman Filtering," IEEE International Symposium on Electrical Insulation, Boston, Mass, pp. 59–63, June 5-8 1988.

[74] N. H. Younan, A. B. Kopp, D. B. Miller, and C. D. Taylor, "On Correcting HV Impulse Measurements by Means of Adaptive Filtering and Deconvolution," IEEE Transactions on Power Delivery, vol. 6, no. 2, pp. 501–506, April 1991.
[75] V. Szekely, "Identification of RC Networks by Deconvolution: Chances and Limits," IEEE Transactions on Circuits and Systems-I: Theory and Applications, vol. 45, no. 3, pp. 244–258, March 1998.
[76] E. Hensel, Inverse Theory and Applications for Engineers. Prentice Hall, 1991.
[77] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems. Kluwer, Dordrecht, 1996.
[78] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-posed Problems. Wiley, 1977.
[79] M. Ulbrich, "A Generalized Tikhonov Regularization for Nonlinear Inverse Ill-Posed Problems," Technische Universität München, Tech. Rep. TUM-M9810, July 1998. [Online]. Available: http://www-lit.mathematik.tu-muenchen.de/reports/
[80] H. W. Engl, K. Kunisch, and A. Neubauer, "Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems," Inverse Problems, vol. 5, pp. 523–540, August 1989. [Online]. Available: stacks.iop.org/0266-5611/5/523
[81] H. W. Engl and P. Kügler, "Nonlinear Inverse Problems: Theoretical Aspects and Some Industrial Applications," to be published in Elsevier, 2003.
[82] M. Gulliksson, "Regularizing nonlinear least squares with applications to parameter estimation," ECMI'98, June 22-27 1998.
[83] K. Kunisch and W. Ring, "Regularization of nonlinear illposed problems with closed operators," Numer. Funct. Anal. Optim., vol. 14, pp. 389–404, 1993. [Online]. Available: citeseer.nj.nec.com/kunisch92regularization.html
[84] U. Amato and W. Hughes, "Maximum entropy regularization of Fredholm integral equations of the first kind," Inverse Problems, vol. 7, pp. 793–808, December 1991. [Online]. Available: stacks.iop.org/0266-5611/7/793
[85] H. W. Engl, "Convergence rates for maximum entropy regularization," SIAM Journal on Numerical Analysis, vol. 30, no. 5, pp. 1509–1536, October 1993.

[86] G. Landl and R. S. Anderssen, "Non-negative differentially constrained entropy-like regularization," Inverse Problems, vol. 12, pp. 35–53, February 1996. [Online]. Available: stacks.iop.org/0266-5611/12/35
[87] A. Mohammad-Djafari, J.-F. Giovannelli, G. Demoment, and J. Idier, "Regularization, maximum entropy and probabilistic methods in mass spectrometry data processing problems," Int. Journal of Mass Spectrometry, vol. 215, no. 1-3, pp. 175–193, Apr. 2002.
[88] R. Acar and C. Vogel, "Analysis of bounded variation penalty method for ill-posed problems," Inverse Problems, vol. 10, no. 6, pp. 1217–1229, 1994. [Online]. Available: citeseer.nj.nec.com/167879.html
[89] L. I. Rudin and S. Osher, "Total variation based image restoration with free local constraints," IEEE International Conference on Image Processing, vol. 1, pp. 31–35, November 13-16 1994.
[90] T. Daboczi and T. B. Bako, "Inverse Filtering of Optical Images," IEEE Transactions on Instrumentation and Measurement, vol. 50, no. 4, pp. 991–994, August 2001.
[91] O. Scherzer, "Explicit versus implicit relative error regularization on the space of functions of bounded variation," Contemporary Mathematics (AMS), vol. 313, pp. 171–198, 2002.
[92] U. Tautenhahn, "On the method of Lavrentiev regularization for nonlinear ill-posed problems," Inverse Problems, vol. 18, pp. 191–207, February 2002. [Online]. Available: stacks.iop.org/0266-5611/18/191
[93] P. K. Lamm, "Future-sequential regularization methods for ill-posed Volterra equations," Journal of Mathematical Analysis and Applications, vol. 195, pp. 469–494, 1995. [Online]. Available: http://www.mth.msu.edu/~lamm/Preprints/JMAA/index.html
[94] M. Lampton, "Damping-undamping strategies for the Levenberg-Marquardt nonlinear least-squares method," Computers in Physics, vol. 11, no. 1, pp. 110–115, Jan/Feb 1997.
[95] Q.-N. Jin, "The analysis of a discrete scheme of the iteratively regularized Gauss-Newton method," Inverse Problems, vol. 16, pp. 1457–1476, October 2000.

[96] P. Deift and X. Zhou, "A steepest descent method for oscillatory Riemann-Hilbert problems," Bulletin of the American Mathematical Society, vol. 26, no. 1, pp. 119–124, Jan 1992.
[97] A. Neubauer, "On Landweber iteration for nonlinear ill-posed problems in Hilbert scales," Numerische Mathematik, vol. 85, pp. 309–328, 2000.
[98] D. Preis and H. Polchlopek, "Restoration of Nonlinearly Distorted Magnetic Recordings," Journal of the Audio Engineering Society, vol. 32, no. 1/2, pp. 26–30, January/February 1984.
[99] E. Haber, U. Ascher, and D. Oldenburg, "On optimization techniques for solving nonlinear inverse problems," Inverse Problems, vol. 16, no. 5, pp. 1263–1280, October 2000. [Online]. Available: citeseer.nj.nec.com/haber00optimization.html
[100] D. M. Goodman, "Deconvolution/Identification Techniques for 1-D Transient Signals," Lawrence Livermore National Laboratory, Laser Engineering Division, Tech. Rep., October 1990.
[101] E. Haber, "Numerical Strategies for the Solution of Inverse Problems," Ph.D. dissertation, University of British Columbia, 1997.
[102] W. L. Gans, "The Measurement and Deconvolution of Time Jitter in Equivalent-Time Waveform Samplers," IEEE Transactions on Instrumentation and Measurement, vol. IM-32, no. 1, pp. 126–133, March 1983.
[103] T. Daboczi, "Deconvolution of transient signals," Ph.D. dissertation, Department of Measurement and Instrumentation Engineering, Technical University of Budapest, August 1994.
[104] T. Daboczi and I. Kollar, "Multiparameter optimization of inverse filtering algorithms," IEEE Transactions on Instrumentation and Measurement, vol. 45, no. 2, pp. 417–421, Apr 1996.
[105] W. Chen, M. Chen, and J. Zhou, "Adaptively Regularized Constrained Total Least-Squares Image Restoration," IEEE Transactions on Image Processing, vol. 9, no. 4, pp. 588–594, April 2000.

[106] S. Roy and M. Souders, "Non-iterative waveform deconvolution using analytic reconstruction filters with time-domain weighting," Instrumentation and Measurement Technology Conference (IMTC), vol. 3, pp. 1429–1434, May 2000.
[107] Élcio H. Shiguemori, H. F. de Campos Velho, J. D. S. da Silva, and F. M. Ramos, "A Parametric Study of a New Regularization Operator: the Non-extensive Entropy," 4th International Conference on Inverse Problems in Engineering, Rio de Janeiro, Brazil, 2002.
[108] M. Bertocco, C. Narduzzi, C. Offelli, and D. Petri, "An improved method for iterative identification of bandlimited linear systems," Instrumentation and Measurement Technology Conference (IMTC), vol. 1, pp. 368–372, May 1991.
[109] B. Parruck and S. M. Riad, "An Optimization Criterion for Iterative Deconvolution," IEEE Transactions on Instrumentation and Measurement, vol. IM-32, no. 1, pp. 137–140, March 1983.
[110] R. Ramlau, "TIGRA - an iterative algorithm for regularizing nonlinear ill-posed problems," Inverse Problems, vol. 19, pp. 433–465, March 2003.
[111] V. A. Morozov, Method for Solving Incorrectly Posed Problems. Springer, New York, 1984.
[112] U. Tautenhahn and Q.-N. Jin, "Tikhonov regularization and a posteriori rules for solving nonlinear ill posed problems," Inverse Problems, vol. 19, pp. 1–21, February 2003.
[113] P. C. Hansen, "Numerical tools for analysis and solution of Fredholm integral equations of the first kind," Inverse Problems, vol. 8, pp. 849–872, December 1992.
[114] J. Janno, "Lavrent'ev regularization of ill-posed problems containing nonlinear near-to-monotone operators with application to autoconvolution equation," Inverse Problems, vol. 16, pp. 333–348, April 2000. [Online]. Available: stacks.iop.org/0266-5611/16/333
[115] S. J. Godsill and P. J. W. Rayner, Digital Audio Restoration - A Statistical Model-Based Approach. Springer-Verlag, 1998.
[116] P. T. Troughton and S. J. Godsill, "MCMC methods for restoration of nonlinearly distorted autoregressive signals," Signal Processing, vol. 81, no. 1, pp. 83–97, 2001.

[117] K. Mosegaard and M. Sambridge, "Monte Carlo analysis of inverse problems," Inverse Problems, vol. 18, pp. R29–R54, June 2002.
[118] S. A. White, "Restoration of Nonlinearly Distorted Audio by Histogram Equalization," Journal of the Audio Engineering Society, vol. 30, no. 11, pp. 828–832, November 1982.
[119] ———, Non-linear Signal Processor. US Patent 4315319, 1982.
[120] M. Tsukamoto, K. Matsunaga, O. Morioka, T. Saito, T. Igarashi, H. Yazawa, and Y. Takahashi, "Correction of Nonlinearity Errors Contained in the Digital Audio Signals," Presented at the 104th Convention of the Audio Engineering Society, 1998, preprint 4698.
[121] J. Tsimbinos, "New Design Techniques for Radio Frequency Input Stages of Communications Receivers," Proc. IREECON'91 Conference, Sydney, Australia, pp. 542–545, September 16-20 1991.
[122] B. H. Carroll, Introduction to Photographic Theory. Wiley, 1980, ch. 1.
[123] ———, Introduction to Photographic Theory. Wiley, 1980, ch. 7.
[124] F. Hurter and V. Driffield, "Photochemical Investigations and a New Method of Determination of the Sensitiveness of Photographic Plates," The Journal of the Society of Chemical Industry, 31 May 1890.
[125] B. H. Carroll, Introduction to Photographic Theory. Wiley, 1980, ch. 12.
[126] R. J. Cox, Photographic Sensitivity - Proceedings of the Symposium on Photographic Sensitivity held at Gonville and Caius College and Little Hall, Cambridge, September 1972. Academic Press, 1973, ch. 1, pp. 1–25.
[127] R. Fielding, A Technological History of Motion Pictures and Television. University of California Press, 1984.
[128] J. Webers, Handbuch der Film- und Videotechnik. München: Franzis, 1993.
[129] USA standard PH22.40-1967, "Dimensions of Photographic Sound Record on 35 mm Motion-Picture Prints," United States of America Standards Institute, April 1967.

[130] Anonymous, Handbook for Projectionists, 2nd ed. New York City: RCA Photophone Inc., 411 Fifth Avenue, 1930. [Online]. Available: http://www.widescreenmuseum.com/sound/rca01-cover.htm
[131] F. Lohr, A filmszalag útja (The way of film). Magyar Filmintézet és Filmarchívum, 1941 (in Hungarian).
[132] J. Webers, Handbuch der Film- und Videotechnik. München: Franzis, 1993, ch. 6.3, pp. 145–146.
[133] DIN 15503, "Film 35 mm Lichttonwiedergabe Spurlagen und Spaltbild," August 1968.
[134] H. Lichte and A. Narath, Physik und Technik des Tonfilms, 3rd ed. S. Hirzel, Leipzig, 1945, ch. II, pp. 72–89.
[135] T. B. Bako, B. Bank, and T. Daboczi, "Restoration of Nonlinearly Distorted Audio with the Application to Old Motion Pictures," Proceedings of the AES 20th International Conference on Archiving, Restoration and New Methods of Recording, pp. 191–198, October 5-7 2001, no. 88-65002.
[136] P. T. Troughton, "Bayesian Restoration of Quantised Audio Signals Using a Sinusoidal Model With Autoregressive Residuals," Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, pp. 159–162, October 17-20 1999.
[137] T. B. Bako and T. Daboczi, "Reconstruction of Nonlinearly Distorted Signals with Regularized Inverse Characteristics," Instrumentation and Measurement Technology Conference, 2001. IMTC 2001. Proceedings of the 18th IEEE, vol. 3, pp. 1565–1569, May 21-23 2001, no. 01CH37188.
[138] ———, "Reconstruction of Nonlinearly Distorted Signals With Regularized Inverse Characteristics," IEEE Transactions on Instrumentation and Measurement, vol. 51, no. 5, pp. 1019–1022, 2002.
[139] M. Marzinzik and B. Kollmeier, "Speech Pause Detection for Noise Spectrum Estimation by Tracking Power Envelope Dynamics," IEEE Transactions on Speech and Audio Processing, vol. 10, no. 2, pp. 109–118, February 2002.

[140] H. H. Lee and C. K. Un, "A Study of On-Off Characteristics of Conversational Speech," IEEE Transactions on Communications, vol. COM-34, no. 6, pp. 630–637, June 1986.
[141] J. Haigh and J. Mason, "A voice activity detector based on cepstral analysis," Proceedings of the European Conference on Speech Technology and Communication, EUROSPEECH'93, vol. 2, pp. 1103–1106, 1993. [Online]. Available: citeseer.nj.nec.com/haigh93voice.html
[142] H.-G. Hirsch, "Estimation of noise spectrum and its application to SNR-estimation and speech enhancement," International Computer Science Institute, Berkeley, CA, Tech. Rep. TR-93-012, 1993. [Online]. Available: citeseer.nj.nec.com/hirsch93estimation.html
[143] G. Doblinger, "Computationally Efficient Speech Enhancement by Spectral Minima Tracking in Subbands," Proceedings of the European Conference on Speech Technology and Communication, EUROSPEECH'95, (Madrid, Spain), pp. 1513–1516, September 1995. [Online]. Available: citeseer.nj.nec.com/doblinger95computationally.html
[144] R. Martin, "Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504–512, July 2001.
[145] I. Cohen and B. Berdugo, "Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement," IEEE Signal Processing Letters, vol. 9, no. 1, pp. 12–15, January 2002.
[146] I. Cohen, "Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging," IEEE Transactions on Speech and Audio Processing, 2003.
[147] V. Stahl, A. Fischer, and R. Bippus, "Quantile Based Noise Estimation for Spectral Subtraction and Wiener Filtering," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP'00, vol. 3, pp. 1875–1878, 2000. [Online]. Available: citeseer.nj.nec.com/stahl00quantile.html
[148] P. Sovka, P. Pollak, and J. Kybic, "Extended Spectral Subtraction," European Signal Processing Conference (EUSIPCO'96), (Trieste, Italy), September 1996. [Online]. Available: citeseer.nj.nec.com/sovka96extended.html

[149] T. B. Bako, T. Daboczi, and B. A. Bell, "Automatic Compensation of Nonlinear Distortions," Instrumentation and Measurement Technology Conference, 2002. IMTC/2002. Proceedings of the 19th IEEE, vol. 2, pp. 1321–1357, May 21-23 2002, no. 00CH37276.
[150] N. S. Nahman and M. E. Guillaume, "Deconvolution of Time Domain Waveforms in the Presence of Noise," National Bureau of Standards, NBS, Boulder, CO, USA, Tech. Note 1047, 1981.
[151] T. B. Bako and T. Daboczi, "Unbiased Reconstruction of Nonlinear Distortions," Instrumentation and Measurement Technology Conference, 2002. IMTC/2002. Proceedings of the 19th IEEE, vol. 2, pp. 1099–1102, May 21-23 2002, no. 00CH37276.
[152] Y. Grenier and B. David, "Extraction of weak background transients from audio signals," Presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5774.
[153] R. Boyer and K. Abed-Meraim, "Efficient Parametric Modeling for Audio Transients," Proceedings of the 5th International Conference on Digital Audio Effects (DAFx-02), Hamburg, Germany, pp. DAFX-97–DAFX-100, September 26-28 2002.
[154] S. Canazza, G. de Poli, G. A. Mian, and A. Scarpa, "Objective comparison of audio restoration methods based on Short Time Spectral Attenuation," Proceedings of Science and Technology for the Safeguard of Cultural Heritage in the Mediterranean Basin, Alcala de Henares, Spain, pp. 173–174, July 9-14 2001.
[155] ———, "Comparison of different audio restoration methods based on frequency and time domains with applications on electronic music repertoire," Proceedings of the International Computer Music Conference, Göteborg, Sweden, pp. 104–109, September 16-21 2002.
[156] E. Muybridge, Animals in Motion. Dover Publications, 1957.
[157] S. Herbert and M. Heard, Industry, Liberty, and a Vision... Wordsworth Donisthorpe's Kinesigraph. The Projection Box, London, 1998.
[158] Anonymous, "The Talking Phonograph," Scientific American, 22 December 1877. [Online]. Available: http://history.acusd.edu/gen/recording/tinfoil77.html

[159] W. Donisthorpe, "Talking Photographs," Nature, 24 January 1878. [Online]. Available: http://histv2.free.fr/19/donisthorpe.htm
[160] W. K. L. Dickson, "A Brief History of the Kinetograph, the Kinetoscope and the Kinetophonograph," SMPE Journal (Society of Motion Picture Engineers), vol. 21, December 1933.
[161] E. Ruhmer, "The Photographophone," Scientific American, 20 July 1901. [Online]. Available: http://www.fsfl.home.se/backspegel/ruhmer.html
[162] Anonymous, "Bell's Photophone," Scientific American, vol. 44, no. 1, pp. 1–2, January 1881. [Online]. Available: http://histv2.free.fr/bell/bellnotice.htm
[163] A. G. Bell, "On the Production and Reproduction of Sound by Light," American Journal of Sciences, vol. XX, no. 118, pp. 305–324, October 1880. [Online]. Available: http://histv2.free.fr/bell/bell1.htm
