
CHARACTERISTICS

PhD thesis

Tamas B. Bako

BUDAPEST UNIVERSITY OF TECHNOLOGY AND ECONOMICS

DEPARTMENT OF MEASUREMENT AND INFORMATION SYSTEMS

3rd June 2004.

I, the undersigned, Tamás Béla Bakó, declare that I have prepared this doctoral dissertation myself and that I have used only the sources given. Every part that I have taken over from another source, either verbatim or with the same content but rephrased, is clearly marked with an indication of the source.

The reviews of the dissertation and the minutes of the defence will later be available at the Dean's Office of the Budapest University of Technology and Economics.

Budapest, 3rd June 2004.

.....................

Summary in Hungarian

The sound of old film recordings is often of poor quality: the reproduced sound is extremely noisy and distorted. Distorted sound tires the audience, who can then concentrate less on the film itself, so the enjoyability of the film decreases. This is why many old films are not worth showing to audiences on television or in cinemas. The distorted sound can, however, be improved with digital signal processing methods.

Since nothing is available for sound restoration except the distorted, noisy film recording, and we have access neither to the original signal nor to the equipment with which the recording was made, our only possibility for improving the sound quality is post-compensation of the sound. This dissertation proposes new methods for the efficient and fast post-compensation of the nonlinearly distorted sound of old, optically recorded films.

The first part of the dissertation discusses nonlinear models and nonlinear compensation techniques, then explains post-compensation of nonlinearities in detail, and why this is a so-called ill-conditioned problem. The second part of the dissertation presents methods that can handle the ill-conditionedness of the problem (the sensitivity of the sound restoration to noise added to the distorted signal). The effectiveness of the method is supported by simulations and by restoration of the sound of film excerpts.

To the muse
Dóra Szász

Acknowledgement

I am very grateful to László Füszfás and Zoltán Seban for helpful discussions and for finding me the basic literature of film processing. I am also grateful to the Hungarian Radio for the technical support of my research work. The Hungarian National Film Archive, especially Éva Beke, is also acknowledged for giving me film materials to finish my research. Also many thanks to László Balogh, who carefully checked the mathematics in this dissertation and asked me for better explanations.

I would also like to thank the many people who have made the Department of Measurement and Instrumentation Technology such a stimulating environment, including those whose heroic efforts have kept the absurdly nonstandard network running most of the time.

Keywords

The following keywords may be useful for indexing purposes:

Audio restoration, nonlinear compensation, regularization methods, Tikhonov regularization, optical soundtrack, density characteristic.


Summary

This dissertation is concerned with the possibilities of restoration of degraded film-sound. The sound quality of old films is often not acceptable: the sound is so noisy and distorted that the listener has to make a strong effort to understand the conversations in the film. In this case the film cannot give artistic enjoyment to the listener. This is the reason that several old films cannot be presented in cinemas or on television.

The quality of these films can be improved by digital restoration techniques. Since we do not have access to the original signal, only to the distorted one, we cannot adjust recording parameters or recording techniques. The only possibility is to post-compensate the signal to produce a better estimate of the undistorted, noiseless signal. In this dissertation, new methods are proposed for fast and efficient restoration of nonlinear distortions in optically recorded film soundtracks.

First, the nonlinear models and nonlinear restoration techniques are surveyed and the ill-posedness of nonlinear post-compensation (the extreme sensitivity to noise) is explained. The effects and sources of linear and nonlinear distortions in optical soundtracks are also described. A new method is proposed to overcome the ill-posedness of the restoration problem and to obtain an optimal result. The effectiveness of the algorithm is demonstrated by simulations and by restoration of real film-sound signals.


Contents

1 Introduction
  1.1 Overview
  1.2 Structure of thesis
2 Classification of nonlinearities and nonlinear models
  2.1 Classification of nonlinearities
  2.2 Memoryless nonlinear models
    2.2.1 Taylor series
    2.2.2 Polynomial interpolation
    2.2.3 Analytical models
  2.3 Nonlinear models with memory
    2.3.1 Volterra series
    2.3.2 Parametric models
    2.3.3 Threshold models
    2.3.4 Cascade models
3 Techniques for nonlinear compensation
  3.1 Introduction
  3.2 Pre-distortion
  3.3 Post-distortion
    3.3.4 Bayesian techniques
    3.4.1 Histogram equalization
  4.1 Image formation
  5.1 Introduction
  5.5 Appearance of noise
  6.1 Representation of nonlinearity
  6.3 Effect of noise
  6.8 Simulation results
  7.1 Conclusions
    7.2.2 Adaptivity
B.2 Piecewise linear model with two and more intervals
C MATLAB simulation of a realistic photosensitive layer

List of Tables

6.2 Comparison results of the exact inverse, Tikhonov and the unbiased characteristics.

List of Figures

3.1 Block-scheme of pre-distortion.
3.2 Block-scheme of post-distortion.
4.1 (…) layer for different photon quanta sensitivity (r).
5.5 Solid line: standard (35 mm) film at 24 fps, dashed: substandard (16 mm) film at 16 fps.
6.4 R(pn(n), N(x)) for the Gaussian error function and uniformly distributed noise (noise interval 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.5 R(pn(n), N(x)) for the Gaussian error function and Gaussian noise (noise deviation 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.6 R(pn(n), N(x)) for the exponential function and uniformly distributed noise (noise interval 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.7 R(pn(n), N(x)) for the exponential function and Gaussian noise (noise deviation 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.8 R(pn(n), N(x)) for the square-root function and uniformly distributed noise (noise interval 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.9 R(pn(n), N(x)) for the square-root function and Gaussian noise (noise deviation 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.10 R(pn(n), N(x)) for the x^0.2 function and uniformly distributed noise (noise interval 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.11 R(pn(n), N(x)) for the x^0.2 function and Gaussian noise (noise deviation 0.1 at left, 0.01 at right). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).
6.15 Noisy output signal of the first simulation (distortion made by the Gaussian error function).
6.16 Noisy output signal of the second simulation (distortion made by the x^5 function).
6.17 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of the regularization parameter. The nonlinear distortion is the Gaussian error function.
6.18 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of the regularization parameter. The nonlinear distortion is the Gaussian error function.
6.19 Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of the regularization parameter. The nonlinear distortion is the part of x^5.
6.20 Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of the regularization parameter. The nonlinear distortion is the part of x^5.
6.21 Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the Gaussian error function.
6.22 Reconstruction of x by the novel method (left) and the optimal result in least-squares sense (right) for the Gaussian error function.
6.23 Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the x^5 nonlinear distortion.
6.24 Reconstruction of x by the novel method (left) and the optimal result in least-squares sense (right) for the x^5 nonlinear distortion.
6.27 Distorted, noisy signal part chosen for parameter determination of the nonlinear function.
6.35 Signal part chosen for parameter determination of the nonlinear function.
6.43 Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse (right).
B.2 Original and inverse piecewise linear system.

Chapter 1

Introduction

1.1 Overview

The optical film-sound recording technology is more than 100 years old. Since then, millions of sound films have been made and stored in the national film archives; they have inestimable artistic value. The task of the archives is not just to preserve these films but also to prepare them for broadcasting and show them to a wide audience. However, most of these films cannot be broadcast because they suffer from several degradations.

There are several distinct types of film degradation. These can be broadly classified into two groups: localised degradations and global degradations. Localised degradations are discontinuities in the waveform which affect only certain samples. Global degradations affect all samples of the waveform. We can distinguish the following sub-classes of degradations [1]:

- clicks and cracklings,
- low-frequency noise transients,
- broad-band noise,
- wow and flutter,
- non-linear defects.

Clicks and cracklings are short bursts of interference, random in time and amplitude. These impulsive disturbances are caused by imperfections of the sound-carrier material (e.g. scratches or dirt spots on the surface).


Low-frequency noise transients are mainly larger-scale defects than clicks. They are caused by large discontinuities due to glued parts of film rolls or other severe damage to the optical sound recording. These changes in the film material cause special excitations in the light intensity during sound reproduction and hence strong transients in the reproduced sound. These large discontinuities can be heard as low-frequency pulses.

Broad-band noise is common to all analogue measurement, storage and recording systems; in the case of audio signals it is generally perceived as hiss by the listener. It can be composed of electrical circuit noise, irregularities in the storage medium and ambient noise from the recording environment.

Wow and flutter are pitch variation defects which may be caused by eccentricities in the

playback system, motor speed fluctuations or by special distortions of the sound carrier (e.g.

shrinkage of film).

Non-linear defects form a very general class that covers a wide range of distortions. In the audio field, the principal causes are [2]:

- saturation in magnetic recording,
- tracing distortion (before compensation was introduced) and groove deformation in records,
- the inherent nonlinearity of optical soundtracks.

There are already many solutions and applications in the scientific literature and on the market that deal with the restoration of local degradations and wide-band noise. Several results have also been published on the elimination of pitch defects. However, relatively little emphasis has been placed on the elimination of non-linear defects; this is a topic of current research interest in DSP for audio [1].

In the last decade, methods for restoring damaged audio recordings have progressed from ad hoc methods, motivated primarily by ease of implementation, towards more sophisticated approaches based on mathematical modeling of the signal and degradation processes.

This thesis addresses the elimination of distortion of optical soundtracks, a problem that has not been investigated extensively before. Restoration of nonlinear distortions is a special kind of inverse filtering problem. This problem can be ill-posed, which means that during reconstruction of the nonlinearly distorted signal, small uncertainties in this signal can cause strong deviations in the restored one. In this case, our aim is to find a restoration method where both the signal distortion and the level of deviation (more simply, the level of the amplified noise) can be kept low. The aim of this dissertation is to clarify the causes of nonlinear distortion in the case of optical soundtracks and to propose methods based on digital signal processing to reduce the distortion and avoid the appearance of artefacts in the restored sound.

1.2 Structure of thesis

Chapter 2 classifies nonlinearities and surveys nonlinear models, covering both memoryless nonlinearities and nonlinearities with memory. Chapter 3 examines the possible methods for eliminating the effects of nonlinear distortions and explains in detail the problems and possible solutions of nonlinear post-compensation techniques. The main problem during post-compensation is the amplification of the noise that is present in the original material. Without proper compensation, the noise amplification can be so strong that the resulting sound is worse than the distorted one. In this chapter the origin of the noise amplification is discussed, and the possible methods that could be applicable to overcome this problem are summarized.

Chapter 4 reviews the nonlinear characteristic of photosensitive materials and presents the analytical equations which describe the nonlinear behaviour. Chapter 5 discusses film-sound recording techniques and how the nonlinear distortions of the photosensitive materials appear in the sound.

Chapter 6 presents two novel methods for composing compensation characteristics for post-compensation of distorted signals. One of them is based on Tikhonov regularization operators. The aim of this compensation technique is to minimize the estimated value of the energy of the noise and distortion terms together. The method is fast compared to other compensation methods because it does not require iterative steps during the compensation process. Simulations in this chapter also show that the accuracy of the method is as high as that of other compensation methods.

A common problem in the regularization of an ill-posed problem is that we have very little knowledge about the original signal, hence we don't know how much regularization is needed to achieve the optimal result. In this chapter a new method is shown that can automatically find a good estimate of the amount of regularization without user interaction. This is quite important in the film industry and at the film archives, where huge amounts of degraded films are waiting for restoration and there is no time to make several experiments on each film.

The aim of the second compensation method is to produce an unbiased estimate of the original, undistorted signal.

We also have little knowledge about the nonlinear distortion function, which is another problem in signal compensation. In Chapter 6 a possible method is shown for the identification of the nonlinear function when an analytical, parametrizable formula for the distortion is known.

Finally, Chapter 7 presents conclusions and suggests possible directions for future research.

Chapter 2

Classification of nonlinearities and nonlinear models

2.1 Classification of nonlinearities

A system in which the relation between the input and the output is described by the function H(·) is a linear system if, for any inputs x1(t) and x2(t), and for any constant c, the additive property (eq. (2.1)) and the homogeneity property (eq. (2.2)) are satisfied:

H(x1(t) + x2(t)) = H(x1(t)) + H(x2(t)),   (2.1)

H(c x1(t)) = c H(x1(t)).   (2.2)

In the case of a nonlinear system the additive and/or homogeneity properties are not satisfied.

Nonlinear systems can be divided into two main categories:

- memoryless nonlinear systems,
- nonlinear systems with memory.

In a memoryless nonlinear system the output at time t depends only on the input at time t and does not depend on previous or future input values. A nonlinear system has memory if the output at time t depends on the input at time t as well as on the inputs over a previous time interval.
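The two categories can be made concrete with a small numerical sketch (an illustration added here, not taken from the thesis; the cubic soft-clipper and the one-pole recursion are hypothetical examples):

```python
import numpy as np

def memoryless_cube(x):
    # Memoryless: the output at each instant depends only on the input
    # at that same instant (a cubic soft-clipper, illustrative only).
    return x - 0.3 * x**3

def with_memory(x, a=0.5):
    # With memory: through the recursion on y[t-1], the output at time t
    # depends on the whole past of the input (a one-pole IIR system).
    y = np.zeros_like(x)
    for t in range(len(x)):
        y[t] = x[t] + a * (y[t - 1] if t > 0 else 0.0)
    return y

impulse = np.array([1.0, 0.0, 0.0])
# memoryless_cube(impulse) is zero after t = 0, while with_memory(impulse)
# keeps a decaying tail beyond the instant of excitation.
```

Scaling the input of `memoryless_cube` by 2 does not scale its output by 2, which is exactly the failure of the homogeneity property (2.2) that makes the system nonlinear.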

2.2 Memoryless nonlinear models

Memoryless nonlinear models are often adequate for representing nonlinearities in systems that have a very wide bandwidth with respect to the signal bandwidth. The main advantage of resorting to such models is their simplicity, ease of application and low computational burden [3]. Good examples of applications that can be represented with memoryless nonlinearities are microwave amplifiers [4], A/D and D/A converters [5], photosensitive materials [6, 7], tube amplifiers [8, 9], several types of transducers [10] and many other applications that we cannot enumerate for lack of space.

2.2.1 Taylor series

The most elementary model for dealing with nonlinear systems is the Taylor series. The

Taylor series provides a polynomial representation of a memoryless nonlinear system. According to [11], James Gregory was the first to discover the Taylor series in 1668, more than

forty years before Brook Taylor published it in 1717.

If a real function f(x) has continuous derivatives up to (n+1)th order, then this function can be expanded in the following fashion:

f(x) = f(a) + (1/1!) f'(a)(x − a) + (1/2!) f''(a)(x − a)^2 + . . . + (1/n!) f^(n)(a)(x − a)^n + R_n,   (2.3)

where the remainder term is

R_n = ∫_a^x f^(n+1)(u) (x − u)^n / n! du = f^(n+1)(ξ) (x − a)^(n+1) / (n + 1)!,   a < ξ < x.   (2.4)

When this expansion converges over a certain range of x, that is, lim_{n→∞} R_n = 0, the expansion is called the Taylor series of f(x) about the point a.

If the value of n in eq. (2.3) equals 1, we get a simple linear model, which has an appropriately small error in a given small domain. Linearity has been one of the fundamental principles upon which the theory of signal processing has been structured. Most real-world problems, however, are intrinsically nonlinear and can be modeled as linear ones only within a limited range of values. Piecewise linear models constitute a compromise between the inherent complexity of the nonlinear domain and the theoretical abundance of linear methods.
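The limited validity range of a truncated expansion (2.3) is easy to demonstrate numerically; the choice of f(x) = exp(x) and of the expansion order below is an illustration added here, not tied to the thesis:

```python
import math

def taylor_exp(x, a=0.0, n=3):
    # Partial sum of eq. (2.3) for f = exp about a: every derivative of
    # exp equals exp, so f^(k)(a) = exp(a) for all k.
    return sum(math.exp(a) * (x - a)**k / math.factorial(k) for k in range(n + 1))

# Close to a = 0 the degree-3 polynomial is accurate; at x = 2 the
# remainder R_3 of eq. (2.4) is already of order 1.
err_near = abs(taylor_exp(0.1) - math.exp(0.1))
err_far = abs(taylor_exp(2.0) - math.exp(2.0))
```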

2.2.2 Polynomial interpolation

In 1903, Weierstrass published a theorem that states that memoryless nonlinear systems that are non-polynomial in nature can be approximately represented, with arbitrary accuracy, by polynomial models over a given range of inputs [12]. This is now known as the Weierstrass approximation theorem. In the 1950s, Davenport and Root showed how the direct method and the transform method can be used to determine the statistical properties of the output of memoryless nonlinear devices [11].

In the late 1960s, Blachman showed that a memoryless nonlinearity can be represented as a generalised Fourier decomposition into a sum of orthogonal polynomials ([13, 14]). The orthogonality of the polynomials for particular input-signal properties allowed the polynomial coefficients to be calculated or measured using a cross-correlation method. Appropriate sets of orthogonal polynomials for a number of stationary input signals were discovered well before Blachman's application. In 1939, Szegő attempted to produce a complete bibliography of every paper published on the subject of orthogonal polynomials before that date [15].

The most commonly used orthogonal polynomials are Chebyshev and Hermite polynomials. Chebyshev polynomials, T_n(x), n = 0, 1, 2, . . ., are real functions which form a complete orthogonal set on the interval [−1, 1] with respect to the weight function 1/√(1 − x²):

∫_{−1}^{1} T_m(x) T_n(x) / √(1 − x²) dx = { 0 if m ≠ n;  π if m = n = 0;  π/2 if m = n = 1, 2, 3, . . . }   (2.5)

This weight function matches the amplitude distribution of a sinusoid, which makes Chebyshev polynomials appropriate for sinusoidal excitations [3].

Hermite polynomials, H_n(x), n = 0, 1, 2, . . ., form a complete orthogonal set on the interval (−∞, ∞) with respect to the weight function exp(−x²):

∫_{−∞}^{∞} H_m(x) H_n(x) exp(−x²) dx = { 0 if m ≠ n;  2^n n! √π if m = n }   (2.6)

Since Gauss-like signals have an exp(−x²)-type amplitude distribution, this kind of nonlinearity representation is applicable to simulate or eliminate distortions in the case of Gaussian distributions, which is a quite often used signal-modeling assumption.

The advantage of orthogonal polynomials over Taylor polynomials is that, in the case of cascaded systems, they do not produce cross-product terms. For example, when the second- and third-order harmonic distortion of a system is eliminated by a cascaded polynomial compensation system, the result will not contain new, higher-order terms. Their disadvantage is that this behaviour holds only for a small range of signal types having a given amplitude distribution.
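The practical meaning of the orthogonality can be checked numerically: since T_n(cos θ) = cos(nθ), the Chebyshev coefficients of a memoryless characteristic are the harmonic amplitudes it produces for a full-scale sinusoidal excitation. A minimal sketch using NumPy's polynomial utilities (the cubic distortion is an illustrative choice added here, not one from the thesis):

```python
import numpy as np

# Express the power-series nonlinearity y = x^3 in the Chebyshev basis.
# poly2cheb takes power-series coefficients in ascending order.
cheb_coeffs = np.polynomial.chebyshev.poly2cheb([0.0, 0.0, 0.0, 1.0])

# x^3 = (3/4) T_1(x) + (1/4) T_3(x): driven by x = cos(theta), a cubic
# produces only the fundamental (amplitude 3/4) and the 3rd harmonic
# (amplitude 1/4) -- no cross-product terms appear.
```

This is why a Chebyshev-based compensator for harmonic distortion introduces no new, higher-order terms, as stated above, provided the excitation really is sinusoidal.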


2.2.3 Analytical models

Several nonlinear physical systems, such as traveling-wave tubes used in radio-frequency communication channels or photosensitive materials, can be described by analytical models, which are special (usually non-polynomial) mathematical functions. The advantage of these functions is that they usually have a physical basis and can be parametrized; hence the correct identification of a given nonlinearity requires only the optimization of a few parameters.

An example is the case of narrow-band excitations such as radio-frequency communication signals, where the relationship between the input and output can be expressed as separate amplitude and phase distortions. If an input radio-frequency signal is expressed as

x(t) = r(t) cos(ωt + φ(t)),   (2.7)

then the output is

y(t) = A(r(t)) cos(ωt + φ(t) + Φ(r(t))),   (2.8)

where A(r) and Φ(r) are the amplitude and phase nonlinear distortions and t denotes time. There are quite a few mathematical approximation formulae for these distortions ([16, 17, 18, 19]).

In the case of optical sound recording, possible analytical formulae are very important for identification and restoration. Analytical formulae with three or more constants were proposed for photosensitive materials by several authors. They show reasonable agreement with experimental curves, but the theory behind these equations is quite inadequate. Several empirical formulae were proposed in the 1940s, but these were not accurate enough [20]. A more accurate analytical formula for the density vs. log-exposure characteristic of photosensitive emulsions was given by Solman and Farnel [21]. It agrees well with real emulsions, although the photographic fog is not modeled.

A formula commonly used nowadays in optical sound recording is the γ-curve [22], which can accurately describe a large range of the characteristic. The equation of the curve is

T(E) = 1 − (1 − T_sat − T_fog) E^γ − T_fog,   (2.9)

where T denotes the light-transmission ability of the film after development and E stands for the light exposure on the film before development. T_sat determines the lowest light-transmission ability of the film and T_fog the highest transmission ability that can be achieved. γ is a parameter that differs from film type to film type; its normal range is between about 0.2 and 5.
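Equation (2.9) is straightforward to evaluate; the sketch below uses made-up parameter values (γ = 0.6, T_sat = 0.05, T_fog = 0.02) purely for illustration, not measured film data:

```python
def transmission(E, gamma=0.6, T_sat=0.05, T_fog=0.02):
    # Eq. (2.9): light transmission of the developed film as a function of
    # the normalized exposure E in [0, 1]. Parameter values are illustrative.
    return 1.0 - (1.0 - T_sat - T_fog) * E**gamma - T_fog

# Unexposed film (E = 0) is nearly clear apart from the fog term;
# full exposure (E = 1) drives the transmission down to T_sat.
t_clear = transmission(0.0)   # equals 1 - T_fog
t_dark = transmission(1.0)    # equals T_sat
```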


2.3 Nonlinear models with memory

The approaches to nonlinear modelling based on Taylor series and orthogonal series, and the direct and transform methods of nonlinear system analysis, are suitable only for memoryless nonlinearities. However, the development of more complex models to deal with nonlinear systems with memory dates back to the late 19th century.

2.3.1 Volterra series

In 1887, Volterra published a functional series expansion now known as the Volterra series [23]. This generalised form of the Taylor series expansion can be used to represent a nonlinear system with memory. In 1910, Fréchet published a more rigorous representation of the Volterra series, together with contributions towards the generalisation of the Weierstrass approximation theorem for functionals, in which the polynomials are replaced by so-called polynomic functionals. Specifically, the generalisation of the Weierstrass approximation theorem states that nonlinear systems with memory that are non-polynomial in nature can be approximately represented, with arbitrary accuracy, by polynomial-based nonlinear functional models over a given range of inputs.

The Volterra series is a very general means of describing a continuous-time output, y(t), in terms of an input, x(t). The Volterra series expansion for a causal, time-invariant system can be expressed as

y(t) = H1[x(t)] + H2[x(t)] + . . . + Hn[x(t)],   (2.10)

where

Hn[x(t)] = ∫_{−∞}^{∞} · · · ∫_{−∞}^{∞} hn(τ1, . . . , τn) x(t − τ1) · · · x(t − τn) dτ1 · · · dτn,   (2.11)

and the Volterra kernels, hn(·), have unspecified form, but hn(τ1, . . . , τn) = 0 for any τi < 0, i = 1, 2, . . . , n. In discrete time the operators take the form

Hn[xt] = Σ_{j1=0}^{∞} · · · Σ_{jn=0}^{∞} hn(j1, . . . , jn) x_{t−j1} · · · x_{t−jn}.   (2.12)

This is a generalisation from linear systems theory: for a linear system, y(t) = H1 [x(t)], the

first degree kernel h1 (t) is the impulse response, which completely describes the system. For

higher-degree systems, hn (t1 , . . . , tn ) can be thought of as an n-dimensional impulse response.


Discrete Volterra models are widely used in the control literature, in classification problems and in artificial neural networks. Present applications in audio include input/output modeling of audio systems and nonlinear filtering to precompensate for known loudspeaker nonlinearities [25].
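A direct discrete implementation of a truncated series, eqs. (2.10)–(2.12), keeping only the first- and second-order terms, can be sketched as follows (the kernels are toy values chosen for illustration, not an identified audio system):

```python
import numpy as np

def volterra2(x, h1, h2):
    # Truncated discrete Volterra model: first-order kernel h1 (an ordinary
    # impulse response) plus second-order kernel h2, a 2-D "impulse response".
    M = len(h1)
    y = np.zeros_like(x)
    for t in range(len(x)):
        # Vector of the M most recent input samples (zero before t = 0).
        past = np.array([x[t - j] if t - j >= 0 else 0.0 for j in range(M)])
        y[t] = h1 @ past + past @ h2 @ past
    return y

h1 = np.array([1.0, 0.5])                  # linear part
h2 = np.array([[0.1, 0.0], [0.0, 0.0]])    # quadratic term 0.1 * x[t]**2
y = volterra2(np.array([1.0, 0.0, 0.0]), h1, h2)
```

With h2 set to zero the model reduces to an ordinary FIR filter, mirroring the statement above that h1(t) alone completely describes a linear system.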

2.3.2 Parametric models

Two modeling settings can be distinguished:

- Input/output modeling, in which we have access to both the input and the output of the system, and seek to describe the function mapping from present and past (for a causal system) values of the input to the output.

- Time-series modeling, in which we have access only to the output of the system. In this case we want to describe the output in terms of an input/output model acting on a random, independent and identically distributed excitation process.

Volterra modeling is a typical example of input/output modeling. An alternative methodology for nonlinear modelling is nonlinear time-series modeling. There is a plethora of such models, but there is no universally recognised method to categorise them [25]. For example, Tong [26], Tjøstheim [27], and Chen and Billings [28] take radically different approaches. They can all, however, be treated as generalisations or specialisations of the nonlinear ARMA (autoregressive moving average) model.

In an autoregressive moving average model, an observed output signal, o, can be represented as

ot = Σ_{i=1}^{k} ai o_{t−i} + Σ_{j=1}^{l} bj e_{t−j} + et,   (2.13)

where ai and bj are weighting factors and et is an excitation signal (it can be thought of as an additive noise whose current value is unknown). This equation can be generalized to give a nonlinear ARMA (NARMA) model, which takes the form

ot = f(o_{t−1}, . . . , o_{t−k}, e_{t−1}, . . . , e_{t−l}) + et,   (2.14)

where f is now some arbitrary nonlinear function rather than a simple weighted sum. This function could be a polynomial model, which is very similar to a finite-length, finite-maximum-degree Volterra model. If the degree of the polynomial is two, this is the bilinear model:

ot = a0 + Σ_{i=1}^{A} ai o_{t−i} + Σ_{j=1}^{B} bj e_{t−j} + Σ_{k=1}^{C} Σ_{l=1}^{D} ck dl o_{t−k} e_{t−l}.   (2.15)

2.3.3 Threshold models

In a threshold model [26], different functions f(·) are used depending on the value of the output at some fixed lag d. This introduces nonlinearities even when the functions themselves are linear. It can be written as

f(·) = { g1(·) if r0 ≤ o_{t−d} < r1;  g2(·) if r1 ≤ o_{t−d} < r2;  . . . ;  gm(·) if r_{m−1} ≤ o_{t−d} < rm }   (2.16)
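A two-regime instance of eq. (2.16) can be simulated in a few lines (the coefficients, threshold and lag are illustrative values added here): each regime applies a purely linear recursion, yet the switching makes the overall system nonlinear.

```python
import numpy as np

def threshold_model(e, d=1, r=0.0):
    # Two-regime threshold model: the recursion coefficient used at time t
    # is chosen by the interval in which the lagged output o[t-d] falls.
    o = np.zeros_like(e)
    for t in range(len(e)):
        prev = o[t - d] if t - d >= 0 else 0.0
        a = 0.8 if prev < r else -0.4   # g1 below the threshold r, g2 above
        o[t] = a * (o[t - 1] if t > 0 else 0.0) + e[t]
    return o

e = np.array([1.0, 1.0, 0.0, 0.0])
# Although g1 and g2 are linear, homogeneity fails:
# threshold_model(-e) is not equal to -threshold_model(e).
```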

2.3.4 Cascade models

Rather than using large, general nonlinear models, an alternative approach is to cascade smaller models together, connecting the output of one to the input of the next. This can correspond to the real physical structure of the system itself.

A common cascaded structure is the Linear-Nonlinear-Linear (LNL) or sandwich model illustrated in Fig. 2.1. This model consists of a linear element, h(τ), whose output, u(t), is transformed by a memoryless nonlinearity, N(·). The output of the nonlinearity is processed by a second linear system, g(τ). This system is also called a Wiener-Hammerstein system.

The LNL cascade has two special cases, the Hammerstein system (NL) and the Wiener

system (LN). Both the Wiener and Hammerstein models can be linear in the parameters if

the component models themselves are linear. Block-oriented models are a generalisation of

cascade models to allow arbitrary connections, including feedback and feedforward, between

subsystems. They are widely used in the control literature.

Cascaded systems can also be connected in parallel. Palm [29] showed that any finite-dimension, finite-order, finite-memory Volterra system can be represented exactly by a finite sum of

Figure 2.1: The LNL (sandwich) model: x(t) passes through the linear system h(τ), the memoryless nonlinearity v = N(u), and the second linear system g(τ) to give y(t).

LNL models. More recently, Korenberg [30] showed that this is true for Wiener cascade elements as well. This is a significant advance, since the identification algorithms for Wiener models are much simpler than those for LNL cascades [31].
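The LNL structure is easy to state in code. The sketch below uses illustrative FIR coefficients and lets tanh stand in for the memoryless nonlinearity N(·):

```python
import numpy as np

def lnl(x, h, g):
    """Wiener-Hammerstein (LNL) cascade: linear h, static nonlinearity, linear g."""
    u = np.convolve(x, h)[: len(x)]      # u(t): output of the first linear block h
    v = np.tanh(u)                       # v(t) = N(u), a memoryless nonlinearity
    return np.convolve(v, g)[: len(x)]   # y(t): output of the second linear block g

x = np.sin(2 * np.pi * 5 * np.arange(0.0, 1.0, 1e-3))
y = lnl(x, h=[1.0, 0.5], g=[1.0, -0.2])
```

Setting h or g to a unit impulse turns the same function into a Hammerstein (NL) or Wiener (LN) model, respectively.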


Chapter 3

Techniques for nonlinear compensation

3.1  Introduction

When a signal passes through a system having a nonlinear transfer function, the output signal will be distorted. If the distortion is not acceptable, we have to reduce it somehow.

Methods for the compensation or elimination of nonlinear distortions can be divided into three main categories:

- If we can modify the structure of the system, we can re-design it in order to reduce the nonlinear distortion. This is a widely used method in industry. Examples for the reduction of nonlinear distortions of A/D converters can be found in [32, 33, 5, 34, 35, 36, 37]; examples for current transformers in [38] and [39]; examples for reducing nonlinear distortions in movie cameras e.g. in [40]. This approach is too broad to treat in detail here.

- If we can't modify the structure, but we have access to the input, we can pre-distort the original input signal to compensate the distortion.

- If we have access neither to the structure nor to the input, we can post-process the output signal to compensate the distortion.

Figure 3.1: Block scheme of pre-distortion: the pre-distorter P(·) is cascaded before the nonlinear system N(·).

3.2  Pre-distortion

As discussed in Section 2.3.2, nonlinear modeling, and likewise nonlinear compensation, has two basic situations: input/output modeling, where we have access to the input and the output, and time-series modeling, where we have access only to the output. In several applications we have access both to the input, x, and the output, o, of the nonlinear system. In these cases pre-distortion techniques are preferred; the block scheme is depicted in Fig. 3.1. In the other case, when we have access only to the output of the system, post-distortion techniques can be used.

In the case of pre-distortion, the excitation of the nonlinear system is produced by another nonlinear system, so that the distortion of the input excitation signal, i, is eliminated at the output of the two cascaded systems.

The limitation of this method is that the noise level before the original distortion has to be negligibly low, but this can usually be fulfilled. Hence there is no need to care about the extra effects of noise, and the pre-distortion stage can simply be the inverse of the original system.
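For a memoryless system this inverse is explicit. In the sketch below (the saturating tanh nonlinearity and the test signal are illustrative choices, not from the thesis), the pre-distorter P is the exact inverse of N, so the cascade N(P(i)) returns the excitation unchanged:

```python
import numpy as np

def N(u):
    # the distorting system: a saturating memoryless nonlinearity
    return np.tanh(u)

def P(i):
    # pre-distorter: exact inverse of N, valid for |i| < 1
    return np.arctanh(i)

i = 0.9 * np.sin(2 * np.pi * np.arange(0.0, 1.0, 1e-3))  # excitation, |i| < 1
o = N(P(i))                    # cascade output; noise before N assumed negligible
err = np.max(np.abs(o - i))    # ~ 0 up to floating-point rounding
```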

Pre-distortion is a typical solution at the transmitter side of microwave communication channels, where the transmit amplifier has strong nonlinear distortion. Pre-distorter characteristics were proposed as early as 1972 by Kaye [41], who proposed an analog, memoryless pre-distorter to solve the problem of microwave tubes. A p-th order Volterra inversion for microwave transmit amplifiers was proposed by Biglieri [42]. Other memoryless compensation techniques were proposed by Karam [43] and Pupolin [44]. Neural network approaches can be seen in [45] and [19]. Good surveys of this research field can be found in the article of Lazzarin [46] and in the PhD thesis of Wohlbier [4].

Pre-distortion is used in other fields as well, e.g. pre-distortion of power amplifiers [47], laser diodes [48] or cathode ray tubes [49].

Audio-related articles typically deal with reducing the nonlinearities of loud-speakers or complete audio systems. Closed-loop system structures were proposed as early as 1977 by Black [50] and in 1983 by Adams [51], who introduced a kind of system re-design. The first pioneer in the pre-distortion field was A. J. M. Kaizer, who made the first loud-speaker models based on truncated Volterra series in 1987 [52]. Solutions for loud-speakers based on Volterra filters were proposed by Klippel [53, 54, 55, 56, 57, 58] and Schurer [38, 59]. Adaptive nonlinear compensators were proposed by Klippel [57] and Sternad [60]. Bellini proposed a solution based on inverting the analytical sound pressure level characteristic of the loud-speaker [61]. Other algorithms were proposed for eliminating acoustic echo by Stenger and Rabenstein [62, 63, 64, 65], based on scalable nonlinearity functions for cancelling nonlinear distortions in hands-free phone systems. The nonlinear function is described by a polynomial series, whose coefficients are the parameters of the nonlinear function. The method can adapt to changes in the parameters of the distortion and can be extended to handle nonlinearities with memory.

In all cases the main problem is to identify the characteristic of the nonlinear system. In some studies the nonlinear characteristic is assumed to be given; others propose identification techniques.

3.3  Post-distortion

While system re-design and pre-distortion are relatively simple tasks, post-distortion is more difficult. The difficulty arises because most post-distortion processes are ill-posed. This is also the case for optical soundtracks.

A problem characterized by the equation f(x) = y is well-posed if the following conditions, introduced by Hadamard in the early 1900s, are satisfied [66]:

- the solution exists for each element y in the range of Y;
- the solution x is unique;
- small perturbations in y result in small perturbations in the solution x, without the need to impose additional constraints.

If any of the above conditions is violated, the problem is said to be ill-posed.

Ill-posed problems exist in countless different fields, such as measurement technology [67], spectroscopy [68], optical measurements [69], image restoration [70, 71], high-voltage measurements [72, 73, 74], RC network identification [75] and many others. Several

Figure 3.2: Block scheme of post-distortion: the output of the nonlinear system N(·) is corrupted by additive noise before being processed by the post-distortion stage P(·).

solutions were proposed for linear problems, based on filtering techniques, regularization operators, singular value decomposition, etc. (a good overview of these methods can be found e.g. in [76] or [77]). However, relatively few works deal with the ill-posed problems of nonlinear signal reconstruction. In the following, these problems will be examined in detail.

In the case of nonlinear post-distortion, usually the third ill-posed problem arises: small perturbations in the measurement result in big deviations in the solution. The schematic block scheme of post-distortion can be seen in Fig. 3.2. In this case the noise source is before the inverse stage, and in many cases the noise level is not negligible. If the inverse system amplifies the signal, the noise will also be amplified. The amplification can be so strong that the amplified noise covers the original signal.

A simulation example of noise amplification can be seen in Figs. 3.3–3.5. In this simulation the original sinusoid signal was distorted by a Gaussian error function. The signal-to-noise ratio was 50 dB. After restoration, the noise was amplified at the top part of the sinusoid, where the nonlinear curve was nearly flat.
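The effect is easy to reproduce numerically. The sketch below substitutes a tanh saturation for the Gaussian error function of the figures (so only numpy is needed); the mechanism is the same: near the flat top of the curve the derivative of the inverse is large, so the exact inverse amplifies the noise there:

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.arange(0.0, 1.0, 1e-3)
x = 0.99 * np.sin(2 * np.pi * t)                    # original sinusoid
o = np.tanh(3 * x) + rng.normal(0, 3e-3, t.size)    # distortion plus weak noise
o = np.clip(o, -1 + 1e-12, 1 - 1e-12)               # keep data in the invertible range
x_hat = np.arctanh(o) / 3                           # exact inverse of the distortion

# compare the reconstruction error near the flat tops with that elsewhere
top = np.abs(x) > 0.95
gain = np.std((x_hat - x)[top]) / np.std((x_hat - x)[~top])
# gain is substantially larger than 1: the noise is amplified at the tops
```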

Given an ill-posed problem, various schemes are available for defining an associated problem which is well-posed [66]. This approach is referred to as regularization of the ill-posed problem. In particular, an ill-posed problem may be regularized by

1. changing the definition of what is meant by an acceptable solution,
2. changing the space to which the acceptable solution belongs,
3. revising the problem statement,
4. introducing regularization operators, and
5. introducing probabilistic concepts to obtain a stochastic extension of the original deterministic problem.

Figure 3.3: The original sinusoid signal.

Figure 3.4: The distorted, noisy observation, o = erf(x).

Figure 3.5: Signal reconstructed by the exact inverse of the nonlinear distortion (x̂ in Fig. 3.2).

Inversion problems have been studied extensively since 1960. In the early 1960s Tikhonov began to produce an important series of papers on ill-posed problems. He defined a class of regularisable ill-posed problems and introduced the concept of a regularising operator which was used in the solution of these problems [78].

While for linear ill-posed problems a very comprehensive regularization theory is available, the development of regularization methods for nonlinear ill-posed problems and the corresponding theory is quite young and a vital field of research with many open questions [79]. The rigorous analysis of Tikhonov regularization in the nonlinear context was initiated only in 1989 by Engl, Kunisch and Neubauer [80].

Since nonlinear equations generally do not have an analytical solution, these algorithms are mostly iterative [81]. In this case there are two points in the algorithms where regularization operators can be used:

- regularization may be required to make the solution well-posed,
- regularization may be required to avoid divergence of the iterative algorithm.

These techniques will be introduced in the next three sections.

Another class of algorithms for handling nonlinear ill-posed problems is based on probabilistic concepts, such as Bayesian algorithms and Markov-chain Monte-Carlo methods [25]. The aim of these techniques is to create a parametric model of the original, undistorted and noiseless signal, then to find the probable parameters of this model based on the noisy and distorted observation, and hence recreate the original signal. These techniques will be introduced in Section 3.3.4.

3.3.1  Regularization

Consider the nonlinear problem

y = N(x). \qquad (3.1)

Our goal is to best approximate eq. (3.1) in the situation when the exact data, y, are not precisely known and only perturbed data, o, with

\| y - o \| \le \delta \qquad (3.2)

are available. Here, δ is called the noise level. This problem is usually ill-posed, because the third rule of Hadamard is not satisfied: small perturbations in o will produce big perturbations in the estimate of x (denoted in the following by x̂), just as in the example of Section 3.3.

A commonly used method for solving this problem is Tikhonov regularization. In Tikhonov regularization, eq. (3.1) is replaced by a minimization problem, where not only the prediction error, \|N(x̂) - o\|, is minimized, but other terms as well, which are connected to the estimated input signal. A practical realization of this minimization problem is

\| N(\hat{x}) - o \|^2 + \alpha \| \hat{x} - x_c \|^2 \to \min, \qquad (3.3)

where α > 0 is the regularization parameter and x_c is some center value, ideally chosen as the critical point of interest, but often just set to zero [82]. In this case, when we try to find the x̂ value which minimizes eq. (3.3), deviations between our initial guess, x_c, and our estimate, x̂, are punished; hence big deviations caused by noise won't be allowed.
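For a scalar memoryless nonlinearity, eq. (3.3) can be minimized by brute force over a grid, which makes the stabilizing effect visible. The tanh system, the observation and the α value below are illustrative choices, not values from the thesis:

```python
import numpy as np

def N(u):
    return np.tanh(3 * u)             # an illustrative saturating system

o = N(0.98) + 0.05                    # noisy observation near the flat region
grid = np.linspace(-1.2, 1.2, 20001)  # candidate estimates x_hat

def tikhonov(alpha, x_c=0.0):
    # minimize |N(x_hat) - o|^2 + alpha * |x_hat - x_c|^2 over the grid
    cost = (N(grid) - o) ** 2 + alpha * (grid - x_c) ** 2
    return grid[np.argmin(cost)]

x_ls  = tikhonov(0.0)    # alpha = 0: the noise drives the estimate far out
x_reg = tikhonov(0.05)   # alpha > 0: the estimate is pulled back toward x_c
```

With the noisy observation pushed above the saturation level, the unregularized estimate runs to the edge of the search range, while the regularized one stays at a plausible value.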

In eq. (3.3) it is not obligatory to use the norm of x̂ - x_c. Using other norms leads to

\| N(\hat{x}) - o \|^2 + \alpha \| R\{\hat{x}\} - R\{x_c\} \|^2 \to \min, \qquad (3.4)

where R(·) is the generalized regularization operator [79, 83].

A commonly used alternative is maximum entropy regularization,

\| N(\hat{x}) - o \|^2 + \alpha \int \hat{x}(t) \log \frac{\hat{x}(t)}{x_c(t)} \, dt \to \min, \qquad \hat{x} > 0, \qquad (3.5)

where x_c(t) is some initial guess about x(t), as in eq. (3.3). In this case x_c is often just 1. For further explanation and examples of nonlinear maximum entropy regularization, see for example [84, 85, 86, 87].

Another commonly used possibility is bounded variation regularization,

\| N(\hat{x}) - o \|^2 + \alpha \int \left| \frac{d\hat{x}(t)}{dt} \right| dt \to \min, \qquad (3.6)

which enhances sharp features in x̂, as needed e.g. in image reconstruction; see [88, 89, 90, 71, 91].

In the case of monotone nonlinear functions, where

N(x_2) - N(x_1) \ge 0 \quad \text{if} \quad x_2 - x_1 \ge 0, \qquad (3.7)

the least-squares minimization can be avoided and one can use the simpler regularized equation

N(\hat{x}) + \alpha (\hat{x} - x_c) = o, \qquad (3.8)

known as Lavrentiev regularization. This method preserves the original structure of the problem and can sometimes lead to easily-implemented localized approximation strategies [93].

Since eqs. (3.3)–(3.6) and (3.8) are nonlinear equations, their analytical solution is generally not possible. The common approach is to solve the problem by iterative methods, which are discussed in the next section.

3.3.2  Iterative methods

The first candidate for solving eq. (3.1) in an iterative way could be Newton's method [81], i.e. the iterative solution of the output least-squares problem

\| o - N(\hat{x}) \| \to \min, \qquad (3.9)

where \| \cdot \| corresponds to the L2 norm. (Of course, regularization methods can also be applied to all the other equations discussed in the previous section, but for simplicity and easy understanding, the iterative methods will be shown on eq. (3.9).) In this case eq. (3.9) simplifies to

\frac{dN(\xi)}{d\xi} \bigg|_{\xi = \hat{x}} \, (o - N(\hat{x})) = 0, \qquad (3.10)

\hat{x}_{k+1} = \hat{x}_k + \left[ \frac{dN(\xi)}{d\xi} \bigg|_{\xi = \hat{x}_k} \right]^{-1} (o - N(\hat{x}_k)), \qquad (3.11)

starting from an initial guess, x̂_0. Even if the iteration is well defined and dN(ξ)/dξ is invertible for every x̂, the inverse is usually unbounded for ill-posed problems. Hence eq. (3.11) is inappropriate in this case, since each iteration means solving a linear ill-posed problem, and some regularization technique has to be used instead. Applying Tikhonov regularization yields the Levenberg-Marquardt method [94]:

\hat{x}_{k+1} = \hat{x}_k + \left[ \left( \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} \right)^2 + \alpha_k \right]^{-1} \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)). \qquad (3.12)

Adding the term

- \left[ \left( \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} \right)^2 + \alpha_k \right]^{-1} \alpha_k (\hat{x}_k - x_c) \qquad (3.13)

for additional stabilization gives the iteratively regularized Gauss-Newton method [95]:

\hat{x}_{k+1} = \hat{x}_k + \left[ \left( \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} \right)^2 + \alpha_k \right]^{-1} \left[ \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} (o - N(\hat{x}_k)) - \alpha_k (\hat{x}_k - x_c) \right]. \qquad (3.14)

The other widely used iterative method is the steepest descent method [96],

\hat{x}_{k+1} = \hat{x}_k - \mu \, \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} \Delta_k, \qquad (3.15)

\Delta_k = N(\hat{x}_k) - o; \qquad (3.16)

with unit step size (μ = 1) this leads to the so-called Landweber iteration [97],

\hat{x}_{k+1} = \hat{x}_k - \frac{dN(\xi)}{d\xi}\bigg|_{\xi=\hat{x}_k} \left( N(\hat{x}_k) - o \right). \qquad (3.17)
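For a scalar memoryless N, the Landweber iteration is one line per step. The tanh system and starting point below are illustrative:

```python
import numpy as np

def N(u):
    return np.tanh(u)                 # illustrative monotone nonlinearity

def dN(u):
    return 1.0 / np.cosh(u) ** 2      # its derivative

o = N(0.8)                            # noiseless observation of the true input 0.8
x = 0.0                               # initial guess x_0
for _ in range(200):
    # Landweber step: x_{k+1} = x_k - N'(x_k) * (N(x_k) - o)
    x -= dN(x) * (N(x) - o)
```

The iteration converges linearly toward the true input; with noisy data it is usually stopped early, which itself acts as a regularization.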

Another nonlinear iterative method based on the steepest descent algorithm is [98]

\hat{x}_{k+1} = \hat{x}_k + \lambda \left( o - N(\hat{x}_k) \right), \qquad (3.18)

where λ is a relaxation constant.

For a more detailed explanation of techniques based on Newton's method see e.g. [99].


3.3.3  Choice of the regularization parameter

One important question in the application of regularization methods is the proper choice of the regularization parameter, α. Let us look at the equation of the Tikhonov regularization problem again:

\| N(\hat{x}) - o \|^2 + \alpha \| \hat{x} - x_c \|^2 \to \min. \qquad (3.19)

If we choose α near zero, the regularization will be too weak. The solution x̂ tends to the original, ill-posed result, i.e. the solution of the output least-squares problem, eq. (3.9). If α approaches infinity, the result will be over-regularized: the output norm becomes negligible compared to α‖x̂ - x_c‖². In this case the solution will be well-posed; however, it tends to x_c. The result will be our initial guess, which could be a strongly distorted estimate (for example, simply zero). The optimal solution is found at an optimum value of α that lies somewhere between 0 and ∞.

Several methods have been proposed for finding an optimal α in the case of linear problems. The underlying principle of cross validation is that if an arbitrary observation is left out of o, then its input can be well predicted using the solution calculated from the optimally regularized remaining observations. Generalized cross validation (GCV) is based on the same principle and, in addition, ensures that the regularization parameter found has some desirable invariance properties, such as being invariant to an orthogonal transformation (which includes permutations) of the data. For the linear problem, A x = b, this leads to choosing the regularization parameter as the minimizer of the function

G(\alpha) = \frac{\| A\hat{x} - b \|^2}{\mathrm{trace}\left( I - A (A^T A + \alpha^2 I)^{-1} A^T \right)}. \qquad (3.20)
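On a small linear test problem, G(α) can be evaluated directly and minimized over a grid of candidate parameters; the dimensions and noise level below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 8
A = rng.normal(size=(n, p))
b = A @ rng.normal(size=p) + rng.normal(0.0, 0.5, n)

def gcv(alpha):
    # regularized solution x = (A^T A + alpha^2 I)^{-1} A^T b ...
    M = np.linalg.solve(A.T @ A + alpha**2 * np.eye(p), A.T)
    x = M @ b
    H = A @ M                         # ... and its influence matrix, mapping b to A x
    return np.sum((A @ x - b) ** 2) / np.trace(np.eye(n) - H)

alphas = np.logspace(-3, 2, 60)
best = alphas[int(np.argmin([gcv(a) for a in alphas]))]
```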

Glans [102] proposed a method based on minimizing the imaginary part of x̂ that is produced by the numerical errors of the computation method. This technique seems quite unreliable and has neither a heuristic nor a formal proof. Instead, Daboczi [103, 104] proposed a systematic iterative method for finding α in the case of impulse signals, based on a rough signal model. Chen [105] proposed a solution for the deconvolution of noisy images even if the point-spread function (the linear, two-dimensional filter function that distorted the original image) is not exactly known. Roy proposed a method based on the difference norm calculated from the linearly distorted observation and its further-distorted version with the same linear distortion [106]; however, this method also has no formal or heuristic proof. Solutions based on probabilistic approaches were proposed in [71] and [107].

Among the iterative techniques, Bertocco [108] published a method for the iterative deconvolution of step-response signals, estimating the noise spectrum from the flat part of the signal and the signal spectrum from the changing part. Parruck's method [109] is based on similar assumptions.

In the case of nonlinear iterative problems, Engl first analyzed how the convergence rate depends on the regularization term for iteratively solved maximum entropy and Tikhonov regularization [80, 85]. Haber examined these problems rigorously and collected the possible methods in [101] and [99]. These methods are based on simple continuation, or cooling [99]: they start with a relatively large value of α, then gradually reduce it. If the result is deemed unacceptable, α is increased by a certain factor. A combination of Tikhonov regularization and the gradient method was proposed by Ramlau [110].

For nonlinear Tikhonov regularization, Morozov proposed the so-called discrepancy rule [111], in which the regularization parameter is chosen as the solution of

\| N(\hat{x}(\alpha)) - o \| = C \delta, \qquad C \ge 1, \qquad (3.21)

where δ is the noise level of eq. (3.2).

Another heuristic method is the L-curve technique developed by Hansen [113]. This method does not have a formal proof; however, it is often used because of its simplicity [101]. The L-curve is made by plotting the log of the misfit, ‖N(x̂) - o‖, as a function of log(‖x̂‖), obtained for different regularization parameters. This plot has a typical L shape. Hansen claimed that the best model norm for a small misfit is obtained at the corner of the L-curve.
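The L-curve is cheap to trace once solutions are available for a range of parameters. The sketch below does this for a small linear Tikhonov problem with illustrative dimensions; in the nonlinear case each point would come from solving eq. (3.3):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(40, 6))
b = A @ rng.normal(size=6) + rng.normal(0.0, 0.3, 40)

points = []
for alpha in np.logspace(-4, 2, 40):
    x = np.linalg.solve(A.T @ A + alpha * np.eye(6), A.T @ b)
    log_misfit = np.log(np.linalg.norm(A @ x - b))
    log_norm = np.log(np.linalg.norm(x))
    points.append((log_misfit, log_norm))
# plotting log_norm against log_misfit traces the characteristic L shape;
# Hansen's rule picks the alpha at the corner of the curve
```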

For Lavrentiev regularization, constraints on α were given in [114], but there are no special methods to determine its exact value.

3.3.4  Bayesian techniques

Bayesian nonlinear restoration techniques are based on nonlinear time series models. Many models are possible for nonlinear time series (see e.g. [28, 26]). In the audio field, nonlinear autoregressive (NAR) models are widely used [115]. A commonly used representation of a NAR process is

y_t = x_t + \sum_{i=1}^{b} \sum_{j=1}^{i} \gamma_{(i,j)} b_{(i,j)} y_{t-i} y_{t-j} + \sum_{i=1}^{b} \sum_{j=1}^{i} \sum_{k=1}^{j} \gamma_{(i,j,k)} b_{(i,j,k)} y_{t-i} y_{t-j} y_{t-k}, \qquad (3.22)

where y_t is the t-th sample of the distorted signal, of which we can make a (noisy) observation; b_{(i,j)}, b_{(i,j,k)} are the weighting parameters of the NAR process; γ_{(i,j)}, γ_{(i,j,k)} are {0, 1} binary indicators, which decide whether a weighting parameter is used; b is the maximum lag of the model; and x_t is the undistorted signal, modeled as an autoregressive process

x_t = e_t + \sum_{i=1}^{k} a_i x_{t-i}. \qquad (3.23)

A major advantage of this model formulation is that the inverse of the nonlinear stage is a straightforward nonlinear moving average (NMA) filter, which is guaranteed to be stable. Hence it is simple to reconstruct the signal x_t from y_t for a given set of NAR parameters [25].
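The stability of the inverse follows because the NMA filter uses only past samples of the observed y. A minimal sketch with a single, arbitrarily chosen second-order coefficient shows the exact round trip:

```python
import numpy as np

b11 = 0.3                             # single NAR coefficient b_(1,1), illustrative

def distort(x):
    # NAR process of eq. (3.22) truncated to one term: y_t = x_t + b11 * y_{t-1}^2
    y = np.zeros_like(x)
    y[0] = x[0]
    for t in range(1, len(x)):
        y[t] = x[t] + b11 * y[t - 1] ** 2
    return y

def restore(y):
    # the NMA inverse: x_t = y_t - b11 * y_{t-1}^2, always stable (no feedback)
    x = y.copy()
    x[1:] -= b11 * y[:-1] ** 2
    return x

x = 0.5 * np.sin(2 * np.pi * 3 * np.arange(0.0, 1.0, 0.01))
```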

The signals and the parameters are modeled as random variables, usually with Gaussian or multivariate Gaussian distributions. The correct parameter values of the NAR model can be estimated by finding the values with the maximum probability. The parameter search can be carried out by Monte-Carlo methods [2, 116, 117], by simulated annealing [25] or by any other optimum-searching algorithm.

The advantage of this method is that it works even when there is no a priori information about either the input signal or the shape of the nonlinear distortion function. A disadvantage is that a priori information can hardly be incorporated into the process. Another problem is that the optimum-searching algorithm itself can get stuck in local minima and requires high computational power. A further serious problem is that the speed of the optimum-searching algorithm on a given task is unknown; therefore applications realized with this method cannot be used in real-time environments.

3.4

Several solutions have been made for the nonlinear pre-compensation of audio devices, such as pre-compensation of hi-fi sets, nonlinear echo cancellation in mobile sets, compensation of loudspeakers, etc.; however, relatively little work has been done in the field of nonlinear post-compensation. In the following, these works will be discussed.

3.4.1  Histogram equalization

Histogram equalization has been proposed to estimate the memoryless nonlinear function through which a speech signal has been passed [118]. A smooth function is fitted to

the histogram of sample values from an extract of the signal. This is compared to a reference histogram shape, based on the analysis of a range of speakers, and a 1:1 mapping is derived which makes the smoothed histogram conform to the reference one. This mapping is then applied to the distorted signal.

Because it is assumed that the original signal closely conforms to a standard reference histogram, this method cannot readily be applied to complex music signals, where histograms differ greatly between recordings and vary significantly over the duration of a recording. Another problem, noted by the author, is that the algorithm is very sensitive to noise. The algorithm was originally proposed for use in speech communication channels, and has led to a patented device [119]. A related method has been used to restore recordings made using early analogue-to-digital converters with non-uniform quantisation step heights and some missed codes [120]. Since these are all small-scale, local defects, they can be reduced by smoothing the histogram, without the need for a reference.
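The core of the method, deriving a monotonic 1:1 mapping from the sample histogram to a reference, can be sketched with empirical quantiles (the published method fits smooth functions rather than using raw histograms, and the signals below are synthetic stand-ins):

```python
import numpy as np

def histogram_match(distorted, reference):
    # empirical CDF value (rank) of each distorted sample ...
    d_sorted = np.sort(distorted)
    ranks = np.searchsorted(d_sorted, distorted, side="right") / len(distorted)
    # ... mapped onto the corresponding quantile of the reference distribution;
    # the mapping is monotonic, i.e. a 1:1 amplitude transformation
    return np.quantile(reference, ranks)

rng = np.random.default_rng(0)
reference = rng.normal(size=5000)            # stand-in for reference speech statistics
distorted = np.tanh(rng.normal(size=5000))   # "speech" passed through a saturating curve
restored = histogram_match(distorted, reference)
```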

3.4.2

An iterative algorithm has been proposed by Polchlopek [98] to reconstruct the original signal when only a bandlimited version of the distorted signal is available. The reconstruction uses the iterative method described in eq. (3.18). The algorithm seems to be applicable also to certain nonlinearities with memory. The behaviour of the algorithm in the presence of noise was not analyzed.

Tsimbinos composed the inverse of the memoryless nonlinearity from orthogonal polynomials to compensate distortions in digital radio receivers [121]. The advantage of this method is that in the case of sinusoid excitations the unwanted harmonics can be filtered out without the appearance of new harmonic components. However, the method works only for pure sinusoid excitations, which is not the case in general audio problems.


3.4.3

If neither the input nor the nonlinearity is known, one can attempt a blind compensation. Audio signals can be well represented by autoregressive models; therefore a possible method in the case of audio signals is to use autoregressive models for identification and compensation, such as eqs. (3.22) and (3.23) in Section 3.3.4. This method is used by Troughton for eliminating tape saturation [2, 116]. The method is applicable also to nonlinearities with memory.

The disadvantage of this method is that the correct model orders of the autoregressive models are not known. The correct parameters are also unknown. These can be found only by optimum-searching algorithms; however, these algorithms may not find the true parameters and may get stuck in local minima. In that case the resulting signal could be even more distorted than the original one.


Chapter 4

The nonlinear characteristic of movie film

4.1  Image formation

Optical recording of sound and motion pictures is made on photosensitive materials carried on thin film rolls. Formerly the carrier was made of cellulose nitrate, later of cellulose acetate; nowadays it is made of polyester-based plastic. The carrier material is coated with a photosensitive layer. A normal photosensitive layer consists of a very large number of tiny crystals (grains) of silver-halide embedded in a layer of gelatin. The combination of grains and gelatin is often referred to as the photographic emulsion [122].

When a picture is taken, the optical image is projected onto the photosensitive layer for a fraction of a second. In ordinary practice this photographic effect is not revealed by any visible change in the appearance of the emulsion. The exposed emulsion, however, contains an invisible latent image of the light pattern that can readily be translated into a visible silver image by the action of a developing agent. This latent image is formed by the ionization of silver in the silver-halide crystals, which produces very small (few-atom) silver specks on the crystal and errors in the crystal structure. During development, if a crystal was adequately exposed, these mutations accelerate the chemical reactions between the mutated crystal and the developing agent, causing the fast decay of these crystals to metallic silver grains. This is the so-called print-out. The reaction of the unaffected (or inadequately affected) crystals is about two or more decades slower [123].

The developing of a silver-halide crystal can be treated as a binary process. If the crystal contains enough mutations, it will completely transform to silver during development. An inadequately exposed crystal will remain practically untouched. Since the crystals are completely isolated from each other by the gelatin carrier, the status of a crystal is independent of the status of the neighbouring crystals.

With this process, the light amplitude distribution can be reconstructed as the amount of silver grains on the developed film. Since these silver grains are black, we get a black-and-white negative copy of the original optical image. However, the relationship between the amount of silver and the exposure is not linear.

4.2

In practice, the reduction of the transparency of the layer is of more interest than the quantity of silver. Transparency (T) is defined as the ratio of the flux transmitted (P_t) to that incident (P_o) on a uniformly exposed and processed area that is large compared to the area of a grain [20]:

T = \frac{P_t}{P_o}. \qquad (4.1)

In their classical paper [124], Hurter and Driffield proposed a new measure, the opacity:

O = \frac{1}{T}. \qquad (4.2)

They also proposed representing the relationship between light exposure and the opacity of the developed film on logarithmically scaled graphs, because this is more descriptive for visual and photographic reproduction than absolute values or transparency vs. exposure. The logarithm of the opacity is termed the density (D):

D = \log(O) = -\log(T). \qquad (4.3)

The value of D depends on the emulsion and on the magnitude, duration and spectral behaviour of the exposing light. Usually the quantity of light received per unit area is of greatest interest. This is called the exposure and is denoted by E. It can be expressed as

E = \int_0^{T} I(t) \, dt, \qquad (4.4)

where I(t) is the light intensity and T is the duration of the exposure.

The ratio between the silver mass and the density is the photometric equivalent. Its reciprocal is the covering power, a measure of the efficiency with which the silver mass produces optical density. This number depends on the number of silver-halide grains per unit area and on the average area of a grain's surface, but not on the exposure [125]. Hence the density and the amount of silver are proportional.

4.3

As mentioned in the introduction of this chapter, the relationship between the exposure and the amount of silver (and so the transmission) is not linear.

If a single layer of silver-halide grains of equal size and sensitivity is exposed, the probability that a given grain will form a latent image depends entirely on the random arrival of photons and the chance of absorption of a photon by the grain. Assuming that a grain must absorb r quanta to become developable, the probability, p, that a grain will absorb exactly r quanta from an exposure such that the mean number of absorbed quanta per grain is q, is given by the Poisson equation:

p(q, r) = \exp(-q) \frac{q^r}{r!}. \qquad (4.5)

Grains absorbing more than r quanta will also be developable. The probability that a grain absorbs r or more quanta is

P(q, r) = 1 - \exp(-q) \sum_{i=0}^{r-1} \frac{q^i}{i!}. \qquad (4.6)

The most sensitive crystals in a typical photosensitive layer require at least 10 quanta. Special emulsions used for recording X-rays or nuclear particles require 1 quantum per grain. A typical emulsion requires about 1000 photons per grain to make half of the crystals developable [126]. Characteristics calculated with these basic parameters for some r values can be seen in Fig. 4.1.
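Eq. (4.6) is straightforward to evaluate; the sketch below reproduces the kind of curve shown in Fig. 4.1 for one sensitivity value:

```python
import math

def developable_fraction(q, r):
    """P(q, r) of eq. (4.6): probability that a grain absorbs at least r quanta
    when the mean number of absorbed quanta per grain is q."""
    return 1.0 - math.exp(-q) * sum(q ** i / math.factorial(i) for i in range(r))

# for r = 16 the curve rises through ~0.5 when the mean q is close to r
curve = [developable_fraction(q, 16) for q in range(0, 60, 4)]
```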

The linear parts of the characteristics of monosized and uniformly sensitive emulsions are very narrow. Usually this is not suitable for image recording. Therefore, instead of monosized photosensitive emulsions, usually lognormally distributed emulsions are used, where crystals of several different sizes, having different photon sensitivities, are present in the emulsion. In this case the distribution of the grain size can be described by the equation

p(x) = \frac{1}{(x - \theta)\,\sigma \sqrt{2\pi}} \exp\left( -\frac{\left( \ln(x - \theta) \right)^2}{2 \sigma^2} \right), \qquad x > \theta; \ \sigma > 0, \qquad (4.7)

where σ is the shape parameter (variance) and θ is the location parameter (modus).

Figure 4.1: Calculated characteristics (developable fraction vs. number of photons) of a monosized, uniformly sensitive layer for different photon quanta sensitivities (r = 1, 16, 32, 64, 128).

The photon sensitivity of grains of the same size (one size class) is also not uniform; a single size class has a sensitivity distribution that is also close to lognormal. In the case of commonly used photosensitive emulsions, the variance of the sensitivity of a size class is about the same as the variance of the grain size (70% to 170% of the variance of the sensitivity) [20].

The simulated exposure vs. (1-transmission) characteristic of a typical emulsion can be seen in Fig. 4.2 (the MATLAB simulation file can be found in Appendix C). The logarithmic exposure vs. density characteristic can be seen in Fig. 4.3. The applicable part of the characteristic, where the change of the output is appropriately large, is now much wider: about two decades. However, the whole characteristic is far from linear. There is no truly linear part; only a small region near the beginning can be approximated as linear.
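The MATLAB file itself is in Appendix C and is not reproduced here; the following Python stand-in captures the idea under simplifying assumptions (lognormal per-grain sensitivity thresholds around 1000 photons, with the Poisson absorption statistics collapsed to a hard threshold):

```python
import numpy as np

rng = np.random.default_rng(0)
# lognormally distributed per-grain thresholds (quanta needed), cf. eq. (4.7)
threshold = np.exp(rng.normal(np.log(1000.0), 1.0, 20000))

def one_minus_T(exposure):
    # fraction of grains whose threshold is reached at this exposure;
    # this tracks 1 - transmission if blackening follows the developed grains
    return float(np.mean(exposure >= threshold))

E = np.logspace(1.0, 5.0, 9)          # exposures spanning four decades
curve = [one_minus_T(e) for e in E]
# the usable, roughly linear-in-log-exposure region spans about two decades
```

The spread of the thresholds is what widens the usable region compared to the monosized case of Fig. 4.1.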

The characteristic begins with a constant part, where the particles are still insensitive to the light intensity. The transmission here, however, is not one but a bit smaller. This is caused by crystal imperfections created during the manufacture of the photoemulsive layer, which cause a basic blackness in the image. This basic blackness is called photographic fog or veil.

The constant part is followed by a toe, then by an interval which can be represented on the linear graph by the following equation [40]:

1 - T = (1 - T_{sat} - T_{fog}) \, (E - E_0)^{\gamma} + T_{fog}, \qquad (4.8)

Figure 4.2: Simulated exposure vs. (1-transmission) characteristic of a typical emulsion.

Figure 4.3: Simulated logarithmic exposure vs. density characteristic of a typical emulsion.


where T_sat is the transmission at saturation, T_fog is the basic transmission and γ is a constant that depends on the photosensitive material. This is the so-called gamma curve. Usually photographic and film-negative materials have γ smaller than one, while positive photosensitive materials have γ higher than one.

After the gamma-curve part of the characteristic the emulsion becomes more and more saturated. At extremely high light intensities, in a given interval, the transmission begins to increase and the density to decrease (this part is not shown in Fig. 4.3). This part is called solarisation. The effect is caused by special secondary chemical effects. This light intensity cannot be reached in the sound-stripe of the film, therefore we do not have to deal with it.


Chapter 5

Imperfections in the optical

sound-recording techniques

5.1

Introduction

For professional sound-films, many methods were used for sound recording. After the 1990s almost exclusively digital sound-recording technologies are preferred, because they have high sound-quality and they can be easily copied. However, before the digital age, only analogue methods existed. In the film industry, these techniques were usually based on optical sound projection. Magnetic recording was also used in the film industry from the 1950s. Although the magnetic recording technique had lower distortion than the optical methods, it was not as widespread, since copying this kind of film is much more difficult and magnetic sound degrades much more quickly with every playback. Therefore, before 1990, the optical sound-recording methods were typically used. Before the 1950s, only the optical sound-recording techniques were known in the film industry.

The advantage of optical sound-recording methods in film-making is that they can be easily copied together with the film without using any additional technology. Another advantage is that during sound-recording and reproduction nothing has to touch the surface of the film, therefore the sound on the film is not degraded by reproduction. However, optical sound-recording techniques have disadvantages as well. One disadvantage is the quite high distortion level, which comes from the nonlinear behaviour of the photosensitive materials; the other one is the quite high noise level. In the following sections the possible optical sound-recording techniques will be explained and the distortions of these techniques will be discussed.


5.2

Sound-recording methods

Optical sound-recording has two different methods. One of them was developed by Western Electric and Fox Movietone and is called the variable density method. The other method was developed by RCA and is called the variable area method. The variable area method is still being used for sound-recording. The variable density method was used only until the 70s. However, from 1925 until 1950, variable density recording was as widespread as the variable area based one. This recording method was used in the studios of most of the major motion picture producers, including Paramount, M-G-M, 20th Century-Fox, Universal, Columbia Pictures, Movietone and the Hearst Newsreel Companies [127].

In the beginning, the variable area method had higher distortion and was much more sensitive to copying than the variable density one; however, with the development of the technology, these problems were eliminated. After the 70s the variable density method was not used anymore, because it was too sensitive to changes in the exposure and developing environment compared to the variable area one, and by that time the sound-distortions of variable area recordings had already been dramatically reduced.

In the following sections, both sound-recording methods will be explained in detail. After that, the reasons for the sound distortions of these methods will be clarified.


5.2.1

The variable density method

In variable density recording a thin light ray is projected onto the constantly moving film band and the intensity of the light ray is controlled by the sound signal (Fig. 5.1). Since the creation of the moving pictures requires non-constant (intermittent) movement, the sound stripe and the picture are created in two different modules of the movie camera (sometimes they are even made on two different film rolls). The sound record on the sound-film is displaced from the center of the corresponding picture by a distance of 21 frames [128, 129]. However, this does not cause any additional problem in copying the film or any additional distortion, so we do not have to deal with the picture module of the camera in the following. Therefore Fig. 5.1 shows only the sound projection part.

When the film is developed after being exposed to the variable intensity light, the sound track will be made up of lines of varying density extending across the sound track (Fig. 5.2). The variation of density between successive dark and light bands determines the amplitude of the recorded sound [130].

Variable density recording has two sub-methods. One method, which was used by Fox Movietone and Lee de Forest, utilizes a special light source that can produce variable intensity light. This device is called the Aeolight, a glow discharge lamp consisting of a cold cathode and a mixture of inert gases. The intensity of illumination varies with the applied signal voltage. In the Lignose-Breusing systems cathode-ray tubes were used for this purpose [131]. However, these light sources could not provide much light intensity, which caused a low sound level, hence a low signal-to-noise ratio.


Table 5.1: Film velocity [mm/s] at different numbers of images per second.

Format             |   25  |   24  |  18  |  16
-------------------|-------|-------|------|------
70                 |       |  570  |      |
35 (standard)      |  475  |  456  |      |  304
16 (sub-standard)  | 190.5 | 182.8 |  137 |  122
8S                 | 105.7 | 101.5 | 76.1 | 67.7

After 1930 the so-called Kerr-cell became widely used as a light valve placed in the path of a constant light ray. The Kerr-cell works with two crossed polarizers with a cell of nitrobenzene between them. Nitrobenzene can rotate the plane of polarization under a controlling high voltage, hence the light intensity can be controlled this way. In this case the light source could be a simple incandescent lamp that can produce strong light.

The most sophisticated method for light intensity control was invented by Klangfilm Eurocord: the electrodynamic mirror oscillograph. Here the sound-current controls the angle of a small mirror that lets more or less light through a slit. This device requires neither polarizers nor a high driving voltage.

During sound-recording, the controlled light shined through a narrow slit onto the moving film, which was kept running at a constant speed (Fig. 5.1). The film speed was different for the different film formats. Speeds for the standard film formats can be seen in Table 5.1 (data is taken from [132]).

The gap width of the slit in the case of 35 mm film is about 20 µm. The image of the slit is reduced and projected onto the film by a lens. The gap width of the image is about 10 µm, the width of the soundtrack itself is 2.94 mm, and the scanned area at sound reproduction is only 2.13 mm wide [129, 133].

Another method of making variable density film recordings is the use of a special light valve, which was first used by Western Electric. The light valve of Western Electric varies the amount of light by the opening and closing of a slit. This slit is the space between two taut sides of a loop of wire suspended in a magnetic field. As the sound current passes through the loop, the loop opens and closes, passing varying amounts of light through it. The image of the slit with varying width is then focused with lenses on the moving film so as to form lines of varying density when the film is developed. This method was also called longitudinal sound-recording.

5.2.2

The variable area method

The variable area method was developed by RCA. In this system the intensity of the light is kept at a constant value, but the area of the sensitized film that is affected by the light is varied (Fig. 5.3). The system consists essentially of a source of light, a light valve and a suitable optical system for concentrating the light into a very fine beam (Fig. 5.4). The light valve is usually an electrodynamic mirror oscillograph, similar to the one used in the variable density method.

The width of the beam is usually 5 µm. The maximum width of the optical soundtrack is 2.94 mm and the maximum usable width is 2.54 mm, but only 2.13 mm is scanned during sound-reproduction, because at the sides of the sound-track 0.405 mm black stripes are used to avoid interference with the moving picture on the film [129, 133].

5.3

Distortions of variable density recordings

Figure 5.4: Schematic diagram of the variable area method with electrodynamic mirror oscillograph.

The first source of distortions is the light controlling device of the sound-recorder. In the intensity control method the image of a constant-width slit is projected onto the running film-band and the sound-information is carried by the light intensity. Without input signal, a basic light intensity is projected onto the film and the driving signal adds to this value. In the case of a simple sinusoid excitation with angular frequency ω, the intensity at a given time will be

I(t) = I0 + I1 sin(ωt) = I0 (1 + r sin(ωt)),        (5.1)

where I0 is the basic light intensity, I1 is the driving intensity and r is the modulation factor. If the width, s0, of the slit were infinitesimal, the exposure on the film would be the same as the input signal. Since s0 is finite, the exposure, E, will be

E = (I0 s0 / c) [ 1 + r (sin(d)/d) sin(ωt) ],        (5.2)

where c is the velocity of the film and d = ωs0/(2c).

Comparing the exposure to the input light intensity we can see that nonlinear distortions are not created by the recording device, although linear distortions will appear. The amplitude response in the case of 24 frame/sec 35 mm standard film and 16 frame/sec 16 mm substandard film can be seen in Fig. 5.5. In the case of standard film, at 8 kHz, the attenuation is smaller than 0.45 dB. Since the sound-amplifier devices used at film recording had about 8 kHz bandwidth, this distortion is negligible. In this case the distortion can be treated simply as a memoryless nonlinear distortion.

The distortion at 8 kHz is higher in the case of substandard film, about 7.5 dB. However, this version was rarely used in professional film technique and only for silent films.
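This aperture (slit-averaging) effect can be checked with a few lines of code. The sketch below assumes the roughly 10 µm slit image width quoted in section 5.2.1 and the film speeds of Table 5.1; it reproduces the attenuation figures stated above (about 0.45 dB for standard film and about 7.5 dB for substandard film at 8 kHz):

```python
import math

def aperture_attenuation_db(f_hz, slit_width_m, film_speed_m_s):
    """Attenuation of the recorded exposure caused by the finite slit width
    (eq. 5.2): the sinusoid is multiplied by sin(d)/d, d = pi*f*s0/c."""
    d = math.pi * f_hz * slit_width_m / film_speed_m_s
    return -20.0 * math.log10(math.sin(d) / d)

s0 = 10e-6                                                # slit image width, ~10 um
att_standard = aperture_attenuation_db(8000, s0, 0.456)   # 35 mm film at 24 fps
att_substd   = aperture_attenuation_db(8000, s0, 0.122)   # 16 mm film at 16 fps
```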

In the longitudinal method, the light intensity is constant, but the width of the slit is controlled. In the case of sinusoid excitation, for the slit width, s, we can write

s(t) = s0 (1 + r sin(ωt)).        (5.3)


Figure 5.5: Amplitude response of light intensity controlled variable density sound-recording.

Solid line: standard (35 mm) film at 24 fps, dashed: substandard (16 mm) film at 16 fps.

The exposure will be different from eq. (5.2):

E = (I0 s0 / c) [ 1 + 2 ( cos(d) a1 sin(ωt) + sin(2d) a2 cos(2ωt) + . . . ) ].        (5.4)

The exposure will be nonlinearly distorted. The distortion depends both on the modulation factor and on the frequency of the sinusoid. For standard film at 1 kHz and r = 1 the distortion is 5%; at 5 kHz and r = 0.6 the distortion is higher than 30% [134]. This is the reason why this technique did not become widespread.
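The mechanism behind eq. (5.4) can also be illustrated with a brute-force toy simulation: each point of the film accumulates exposure while the spatially fixed slit, whose width follows the signal, still covers it. The slit width and film speed below are the standard-film values used earlier; the absolute distortion numbers of this crude model are only indicative and should not be compared directly with the measured values cited from [134]:

```python
import numpy as np

def slit_width_thd(f0, r, c=0.456, s0=10e-6, n_p=64, dt=1e-8):
    """Harmonic distortion of the exposure pattern written by a constant
    intensity source behind a slit of width s(t) = s0*(1 + r*sin(w*t))."""
    w = 2 * np.pi * f0
    period = c / f0                          # spatial period of the pattern
    p = np.arange(n_p) / n_p * period        # film positions, one full period
    E = np.empty(n_p)
    for i, pos in enumerate(p):
        tau = pos / c                        # time when the point passes the slit
        t = np.arange(tau - 1e-4, tau + 1e-4, dt)
        s = s0 * (1 + r * np.sin(w * t))
        E[i] = dt * np.count_nonzero(np.abs(pos - c * t) <= s / 2)
    spec = np.abs(np.fft.rfft(E - E.mean()))
    return np.sqrt(np.sum(spec[2:] ** 2)) / spec[1]

thd_1k = slit_width_thd(1000, 1.0)
thd_5k = slit_width_thd(5000, 0.6)
```

Even this toy model shows the key behaviour: the distortion grows strongly with the signal frequency.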

The second source of sound-distortions is the density characteristic of the film. As was shown in chapter 4, the density characteristic is highly nonlinear and can be approximated by the γ-curve. The first part of this characteristic (where the light intensity is still low) is nearly linear. In the beginnings of the history of sound-film, this part was used by the variable density recording method, because the distortions were small and the Aeolight could not produce high light intensity anyway [40]. However, due to the small signal levels and the extremely high sensitivity to the recording and developing parameters, this technique was replaced by another technique.

Since the main aim is to have distortionless sound on the positive film and the sound on the negative is not so important, it is enough if the resulting characteristic of the film positive and negative together is a straight line. If the characteristic of the film negative is the inverse of that of the film positive, the result will be a pure, unbiased output. However, this is already too strong a constraint. For our aim it is enough if the product of the derivatives of the negative and positive film characteristics gives 1 at every point in the range of interest:

(dDneg / d lg(Eneg)) · (dDpos / d lg(Epos)) = 1.        (5.5)

(5.5)

In this case the resulted characteristic will be a straight line, which may be shifted from zero.

If it is shifted, a DC offset will appear on the optical receiver of the sound-reproduction

machine, but it is not a big problem, since it will be filtered out by the sound amplifier

devices.

If the straight parts of the density characteristics are used, we can calculate with the

values. In our case the multiplicative of the negative and positive values have to be 1.

This is the Goldberg-lemma [134]. At this method, the sound on the film negative will be

distorted, but this distortion can be eliminated on the positive, if the recording is copied to

a positive film with correct .
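In the idealized straight (power-law) region of the characteristics this cancellation is easy to verify numerically: there the transparency behaves as T ∝ E^(−γ), so printing through a negative of γneg onto a positive stock of γpos gives an overall transparency proportional to E^(γneg·γpos), which is linear in the original exposure exactly when the product of the two γ values is 1. A small sketch (the γ values and the copying exposure are arbitrary examples):

```python
import numpy as np

gamma_neg, gamma_pos = 0.5, 2.0      # example values, product = 1
E = np.linspace(0.2, 1.0, 40)        # original exposures in the straight region

T_neg = E ** (-gamma_neg)            # negative: darker where the exposure was higher
E_copy = 0.8 * T_neg                 # the positive is exposed through the negative
T_pos = E_copy ** (-gamma_pos)       # positive transparency

# T_pos is proportional to E: the cascade is distortion-free.
ratio = T_pos / E
```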

The disadvantage of this method is that it is not able to eliminate the noise amplification and, in addition, every copy procedure adds more noise to the original sound, which comes from the granularity of the film. A digital correction method would be able to use the optimal compensation characteristic without any additional noise.

5.4

Distortions of variable area recordings

In the variable area method, the width of the slit and the light intensity remain constant and the width of the black part of the sound-stripe is driven by the input signal. When the input signal is zero, half of the sound stripe is exposed. The exposure of the exposed part is

E0 = I s0 / c,        (5.6)

so the mean exposure of the whole sound stripe is

E0 = I s0 / (2c).        (5.7)

If the input is excited by a sinusoid, the exposure of the whole sound stripe can be written as

E = (I0 s0 / (2c)) [ 1 + r (sin(d)/d) sin(ωt) ],        (5.8)

which is very similar to eq. (5.2). This means that the light valve causes no nonlinear distortion, only a small linear one that can be neglected in the case of standard film.


Figure 5.6: Light diffusion at the boundary of the light and dark areas of the sound track.

The second distortion source is the film material itself. When an image is projected onto the film, the developed image will not be the same as the original. The light disperses in the emulsion and is reflected from the back of the film, hence some parts of the film that originally were not exposed will also be exposed. In the case of recording a sinusoid signal, the image of the sinusoid will be deformed: the darker part of the sinusoid will be filled up and the lighter part will become much thinner (Fig. 5.6). This effect causes a strong nonlinear distortion in the recorded sound, similar to a rectification process. This effect is known as Donner-distortion [40]. Since the amount of diffusion depends strongly on the shape of the sound-signal, this kind of distortion can be treated as a nonlinear distortion with memory.

The Donner-distortion can be reduced by proper copying of the original optical soundtrack negative to the film positive. With optimal parameters the light diffusion effects on the film positive will (partly) compensate the light diffusion effects on the negative. The disadvantage of this method is that this kind of compensation is very sensitive to the copy parameters and is not able to completely eliminate the distortion. This is the reason why most old films have a harmonic distortion of 10% or more.


5.5

Appearance of noise

The noise sources of optical sound-recordings are common to the different recording techniques. There are three different noise sources, namely:

- celluloid-noise of the film-carrier,
- noise of scratches and other small degradations,
- granular noise of the photosensitive layer.

Celluloid-noise is caused by the optical imperfections of the celluloid-based film carrier. The carrier of the film is not absolutely transparent and the transparency is slightly different at different points of the film roll, which causes a wideband noise during playback.

Similarly to the celluloid-noise, small scratches and dust on the film also cause differences in transparency, hence noise during playback.

Granular noise is created by the finite-size silver grains in the photosensitive layer, and this is the dominant part of the noise on film. If there were no developed grains in the area that is scanned by the sound-reproducing device, the noise level would be zero. This would also be the case if the photosensitive layer were so dark that no light could go through the film. Between the two extremes the noise level differs from zero. The estimated value of the variation of the transparency can be computed as

value of the difference of transparency can be computed as

r

d Tk (1 Tk )

4T =

,

2

F

(5.9)

where d is the average size of silver grains, F is the size of scanning area and Tk is the

transparency of the examined area [40].

Eq. (5.9) shows the advantage of the variable area method. Since with the variable area method the transparency is almost 1 or almost 0 everywhere, the numerator under the square-root will be small, hence the noise level will also be small.
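Substituting representative numbers into eq. (5.9) illustrates this advantage; the grain size and scanning area below are assumed, order-of-magnitude values only:

```python
import math

def delta_T(d, F, Tk):
    """Estimated transparency fluctuation of eq. (5.9)."""
    return math.sqrt(d * d * Tk * (1.0 - Tk) / F)

d = 1e-6                 # assumed average grain size: 1 um
F = 10e-6 * 2.13e-3      # assumed scanned area: 10 um slit x 2.13 mm track

noise_density = delta_T(d, F, 0.5)    # variable density: mid transparency
noise_area    = delta_T(d, F, 0.02)   # variable area: nearly clear (or opaque)
```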


Chapter 6

Compensation of memoryless

nonlinearities

In the previous chapters we have collected and described the problems and available solutions of nonlinear restoration. We have analyzed the possible methods of nonlinear modeling techniques and nonlinear compensation techniques in general. We have also examined the problems that appear in the professional film-sound technology, which are the following:

- The sound of old films has high nonlinear distortion. This is especially the case for films that were made with the variable density method. This nonlinear distortion comes mainly from the nonlinear behaviour of the photosensitive material.

- The sound has a quite high noise level, which comes mainly from the aging of the film material and the granularity of the photosensitive material.

- There are thousands of old films waiting for rescue. The state of these film materials is getting worse and worse day by day. This means that film restoration technicians do not have enough time to make several experiments on each film and to run time-consuming iterative calculations on the sound.

Since variable density films have the highest distortion and they are the oldest ones, the main aim is to rescue these films first. The distortion of variable density film can be described as a memoryless distortion, which can be much more easily described and handled than distortions with memory. This information can be used to create a fast and optimal solution.


To find an optimal solution, in the following sections we will examine the possibilities of creating a fast restoration algorithm using no or a very small number of iterations in the case of memoryless nonlinear distortions. In section 6.1 we will select a proper nonlinear representation. In 6.2 we will examine the possibilities of identifying the nonlinearity of the examined film. Next, in 6.3, we will look at the effects of noise on the film; then in 6.4 we will propose a fast non-iterative regularization method to eliminate the unwanted effects of noise and to keep the norm of the difference between the original signal and the estimated one low. Another method will be shown in 6.7, where the aim is not to reduce this difference and the noise as much as possible, but to make our estimated signal unbiased.

6.1

Representation of nonlinearity

Consider a nonlinear function, N(·), that is assumed to be continuous and differentiable on the closed interval [x0, x1]. Further assumptions are that the function is invertible and memoryless.

Our aim is to find a proper estimate, in some sense, of the original signal distorted by N(·). In this case we may have to solve nonlinear equations to get the result. Since N(·) is an arbitrary nonlinear function, the analytical solution with the exact N(·) function is generally not possible, hence a proper representation form is required to simplify the computations. (Note that this representation form is not necessarily the realization form of the nonlinearity in a given software or hardware; it is required only for the computations to find the solution of the nonlinear compensation method.)

6.1.1

Piecewise linear representation

In the case of a continuous, infinitely differentiable, invertible and memoryless nonlinear function, in a given small interval, [xi, xj], a good approximation of the original, non-polynomial function can be achieved by the m-th degree Taylor polynomial:

N(x)|x,a∈[xi,xj] = T(x) + R(x) =
= N(a) + (dN(x)/dx)|x=a (x − a)/1! + (d²N(x)/dx²)|x=a (x − a)²/2! + . . .
. . . + (d^m N(x)/dx^m)|x=a (x − a)^m/m! + R(x).        (6.1)

The residual error, R(x), exists and can be estimated if N(·) is differentiable m + 1 times in the interval (xi, xj). The supremum of the error can be computed as

sup{R(x)} = |xi − xj|^(m+1) max |d^(m+1)N(x)/dx^(m+1)|.        (6.2)

This means that in the knowledge of the highest value of the m + 1-th derivative in the range of interest, an input interval can be given in which the residual error will be equal to or smaller than our requirement. In this case, we can use a set of models, where each model represents the original nonlinear function in a given interval. If the intervals are consecutive, the whole nonlinear function can be represented in the range of interest by these models. (This is the same representation form as the one discussed in Chapter 2.3.3.)

If m = 1, eq. (6.1) reduces to

N(x)|x,a∈[xi,xj] ≈ N(a) + (dN(x)/dx)|x=a (x − a),        (6.3)

sup{R(x)}|x∈[xi,xj] = |xi − xj|² max |d²N(x)/dx²|.        (6.4)

If we want to keep the supremum of the residual error under a certain limit, ε, then in the knowledge of the maximum of d²N(x)/dx², the range of interest in x can be divided into several intervals of length L, for which

L² max |d²N(x)/dx²| ≤ ε.        (6.5)

So we can obtain a piecewise linear representation that can describe the original nonlinear function, N(·), with a required small error. The only requirements are that N(·) has to be two times differentiable and the second derivative has to be finite in the range of interest. In practice these requirements can usually be fulfilled.
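The interval-splitting rule can be illustrated with a short numerical sketch (the example function and error limit are arbitrary): with max |d²N/dx²| known, the interval length L is chosen so that L² max |d²N/dx²| stays below the limit, and the resulting piecewise linear representation then respects it:

```python
import numpy as np

N = lambda x: x ** 3                  # example nonlinearity on [0, 1]
max_d2 = 6.0                          # max |N''(x)| = |6x| on [0, 1]
eps = 1e-2                            # required residual error limit

L = np.sqrt(eps / max_d2)             # interval length from eq. (6.5)
knots = np.arange(0.0, 1.0 + L, L)    # consecutive interval endpoints

x = np.linspace(0.0, 1.0, 10_001)
pl = np.interp(x, knots, N(knots))    # piecewise linear representation
worst = np.abs(N(x) - pl).max()       # stays below eps, as required
```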

6.1.2

Representation of the inverse nonlinearity

Similarly to eq. (6.3), the linearized inverse can be written as

N⁻¹(y)|y∈[N(xi),N(xj)], y0=N(x0) ≈ N⁻¹(y0) + (dN⁻¹(y)/dy)|y=y0 (y − y0).        (6.6)

Since

dN⁻¹(N(x))/dx = dx/dx = 1        (6.7)

and, by the chain rule,

dN⁻¹(N(x))/dx |x=x0 = (dN⁻¹(y)/dy)|y=N(x0) · (dN(x)/dx)|x=x0,        (6.8)

therefore

(dN⁻¹(y)/dy)|y=N(x0) · (dN(x)/dx)|x=x0 = 1,

(dN⁻¹(y)/dy)|y=N(x0) = 1 / (dN(x)/dx)|x=x0,        (6.9)

which means that the derivative of the inverse nonlinearity can be expressed by the derivative of the original nonlinearity, which is an important fact. The corollary of these equations is the Goldberg-lemma (eq. 5.5), which was already used in section 5.3.
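Eq. (6.9) can be sanity-checked numerically with any invertible example function; here N(x) = e^x, whose inverse is the natural logarithm:

```python
import math

x0, h = 0.7, 1e-6
N, N_inv = math.exp, math.log

dN = (N(x0 + h) - N(x0 - h)) / (2 * h)                # N'(x0), central difference
y0 = N(x0)
dN_inv = (N_inv(y0 + h) - N_inv(y0 - h)) / (2 * h)    # (N^-1)'(y0)

product = dN * dN_inv                                 # should be 1 by eq. (6.9)
```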

6.2

Identification of the nonlinearity

In order to have a proper piecewise linear representation of the original and inverse nonlinearity, we have to know the original nonlinear function. In the case of variable density optical soundtracks this function can be determined from the sound signal and from some a priori information [135]. The function can be written as

y(t) = G1 γ(G2 x + O2) + O1,        (6.10)

where γ(·) is the γ-function of the film, and O1, O2, G1 and G2 are the offset and gain parameters before and after the nonlinearity. G2 describes the amplifier before the light valve, O2 the basic light intensity when no driving signal is present on the light valve, O1 the intensity of the scanning light and the residual offsets of the reproducing device, while G1 determines the amplification of the reproducing device.

For reconstructing the exact nonlinear function, the values of G1, O1 and O2 are required. Note that G2 is not important, because this parameter adjusts only the volume of the original sound. The reconstruction is based on the assumption that the recorded audio signal contains clearly periodic sound-parts. This is the case in most musical parts, or in short parts of human voice when a vowel is formed by the speaker.

If the recorded signal part, s(t), is periodic, it can be written as a sum of harmonically related sinusoids:

s(t) = Σi ai sin(2π i f0 t + φi),        i = 1 . . . ∞,        (6.11)

where f0 stands for the fundamental frequency of the periodic signal and ai and φi are the amplitude and phase of the i-th sinusoid.

In eq. (6.11) we did not take into account the DC component of the Fourier-series. The reason is that the DC component of the original signal cannot be separated from the DC component added by the nonlinearity. For this reason, in eq. (6.11) we assumed that s(t) has no DC component. This is a reasonable assumption in the case of audio signals. If the input signal also contains a DC component, it can be treated as part of the input offset O2. In this case the DC component will not be restored in the estimated signal; however, we do not have to deal with it, because a DC component is inaudible in audio signals.

If the signal, s(t), is led through the memoryless nonlinear system, a different periodic signal arises:

u(t) = G1 γ(G2 s(t) + O2) + O1 = Σj bj sin(2π j f0 t + ψj) + b0,        j = 1 . . . ∞.        (6.12)

Eq. (6.11) and (6.12) form a common transformation, which assigns a u(t) signal to every value of the unknown parameters:

u(t) = T(v(f0, t)),        (6.13)

v(f0, t) = {G1, O1, O2, a1, . . . , a∞, φ1, . . . , φ∞}.        (6.14)

A sufficient condition for these parameters to be identifiable is that the number of a and φ parameters is limited, the number of samples from u(t) is higher than the number of variables, s(t) has no DC component and γ(·) is a monotonic nonlinear function [136].

In the case of movie soundtracks these conditions can be fulfilled. The uttered vowels in the movie contain long enough periodic parts, which are ideal for the identification. The sound has no DC component, or if it is removed, this does not affect the sound-quality. The vowels are bandlimited, hence the number of a and φ parameters is limited, and the γ-function is a strictly monotonic function.

The signal of movie soundtracks is corrupted by wide-band noise. In this case, the problem is ill-posed, because the observed samples can exceed the limits of the output domain of the nonlinear function. A solution for v(f0, t) can be found by minimizing the value of the following form:

Cost = ∫ from t1 to t2 of ( u(t) − T(v̂(f0, t)) )² dt.        (6.15)

Eq. (6.15) can be minimized by any optimization algorithm, e.g. the Monte-Carlo method. It is still a question how many sinusoids should be used to describe the original signal s(t). This can be estimated from the graph of the optimal cost versus the number of sinusoids. If the s(t) signal is undermodeled, the cost will be high and will quickly decrease for a higher number of sine signals. On the contrary, if the periodic signal is overmodeled, the use of a higher number of sinusoidal signals will not change the optimal cost drastically. Hence, by finding this turning point, the number of sinusoids can be chosen. Experiments show that the method is not too sensitive to the number of sinusoids, and the use of eight sinusoidal signals has been found to give good results [135].

Although the optimization of eq. (6.15) is quite computation intensive, we do not have to compute it on the whole sound material, only on some representative samples, which are not longer than a few hundred samples. Therefore, comparing the time required to calculate the solution of eq. (6.15) to the time required to restore the whole material, this amount of computation is acceptable. An advantage of this solution is that the identification and the restoration are two different modules. In this case, if we already have the exact nonlinear curve of the film-roll, we can skip this procedure, making the restoration even faster. This modularization is impossible in the solutions based on time series models and probabilistic approaches, as was claimed in section 3.3.4.
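As a heavily simplified toy version of this identification procedure (a single sinusoid with known fundamental frequency, a pure power-law γ-function, and a plain grid search in place of a Monte-Carlo optimizer; all parameter values are invented for the example):

```python
import numpy as np

t = np.linspace(0.0, 2e-3, 200)          # two periods of a 1 kHz 'vowel'
f0, a_true, g_true = 1000.0, 0.8, 0.7    # invented true parameters
s = a_true * np.sin(2 * np.pi * f0 * t)  # periodic original signal
u = (1.0 + 0.4 * s) ** g_true            # observed, gamma-distorted signal

best = (None, np.inf)
for g in np.arange(0.2, 1.5, 0.01):      # grid search over the gamma exponent
    for a in np.arange(0.2, 1.2, 0.01):  # ... and the sinusoid amplitude
        model = (1.0 + 0.4 * a * np.sin(2 * np.pi * f0 * t)) ** g
        cost = np.sum((u - model) ** 2)  # discrete version of eq. (6.15)
        if cost < best[1]:
            best = ((g, a), cost)

g_hat, a_hat = best[0]                   # recovered distortion and amplitude
```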

6.3

Effect of noise

Let us consider a signal, x, that is distorted by a nonlinear, two times differentiable, memoryless and invertible function, N(·), creating a new signal, y:

y = N(x).        (6.16)

If the data, y, and the nonlinear function are known, the inversion problem is to find an estimate, x̂, that satisfies the data in some sense.

If N(·) is invertible then in the case of eq. (6.16) the solution is simple:

x̂ = N⁻¹(y),        (6.17)

where N⁻¹(·) is the analytical inverse of N(·). However, usually the output signal, o, that we can observe is contaminated by noise, n. If we assume that the noise is additive and independent of the signal then

o = y + n = N(x) + n,        (6.18)

x̂ = N⁻¹(o) = N⁻¹(N(x) + n).

Using eq. (6.3) and (6.9) we can express the difference caused by n at a given x0 as

x̂ ≈ N⁻¹(N(x0)) + (dN⁻¹(y)/dy)|y=N(x0) n = x0 + (dN⁻¹(y)/dy)|y=N(x0) n        (6.19)

= x0 + n / (dN(x)/dx)|x=x0.        (6.20)

Eq. (6.20) means that x̂ will be different from the expected value. The size of the difference between x̂ and x depends on the amplitude of n and on the derivative of the nonlinear function at the given x0 point. If the derivative of the original nonlinear function is small, the amplification of the noise in x̂ could be extremely high, which can no longer be neglected. This means that the exact nonlinear inverse may not be proper in the case of noisy signals, and a better restoration method should be found.
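A quick numerical illustration, with the example function N(x) = x³ chosen only because its derivative is small near zero: the same small observation noise produces a far larger error in x̂ where dN/dx is small:

```python
import numpy as np

N = lambda x: x ** 3
N_inv = lambda y: np.cbrt(y)
n = 1e-5                                   # small additive observation noise

x_small_slope, x_large_slope = 0.05, 1.0   # N'(x) = 3x^2: 0.0075 vs. 3
err_small = abs(N_inv(N(x_small_slope) + n) - x_small_slope)
err_large = abs(N_inv(N(x_large_slope) + n) - x_large_slope)
```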

With the restoration we can have two different aims:

1. Our aim could be to restore the signal so that the expected value of the difference of the estimate and the original signal is minimal in the least squares sense:

min E{(x̂ − x)²},        (6.21)

2. our aim could be that the expected value of the estimated signal at a given time point is the same as that of the original one:

E{x̂(t)} = E{x(t)}.        (6.22)

A solution for the first problem will be shown in section 6.4, and a solution for the second problem in section 6.7.


Figure 6.1: The nonlinearity compensation model (compensation block K(o) cascaded after the distortion N(x)).

6.4

Compensation by Tikhonov regularization

The nonlinearity compensation model can be seen in Fig. 6.1. This is a cascaded model, where a compensation block, whose compensation characteristic is denoted by K(o), is used after the original distortion.

Since in our film-sound restoration problem the output signal, y, is already given and cannot be pre-compensated, a post-compensation technique has to be used. As was shown in Chapter 3.3 and Chapter 6.3, this problem can be ill-posed, therefore proper techniques have to be used to avoid too high noise amplification. To solve this problem and find an analytical solution, first we have to use a proper representation form of the nonlinearity, as discussed in Chapter 6.1. If we use eq. (6.3) to obtain a piecewise linear representation, then one block of the model will look as in Fig. 6.2. If this model is computed at a given x0 input value and y0 = N(x0) output value, and the disturbance of the output signal is n, then Δx denotes x − x0 and Δy means o − y0 − n, where o is the noisy observation. Note that in this model the noise, n, does not appear explicitly; the noise affects only the work-point, hence the amplification of the compensation block. Also note that the interval length of each model block is L, which was chosen such that the residual error of the blocks is smaller than a chosen error limit. If L tends to zero, this error limit also tends to zero.

Let us find K(o) by Tikhonov regularization. To do this, we have to supplement Fig. 6.2 by a new block that computes Δŷ, the estimate of the output signal, from Δx̂. The supplemented system can be seen in Fig. 6.3.

In this model, the function (cost function) that we have to minimize in the case of one block is

‖Δy − Δŷ‖ + λ‖Δx̂‖,        (6.23)


Figure 6.2: One block from the piecewise linear compensation model.

Figure 6.3: The compensation model of Fig. 6.2 supplemented with the output estimator block.

where

Δŷ = dN(x)/dx|x=x0 · Δx̂.    (6.24)

We look for the Δx̂ value at a given Δy value where eq. (6.23) is minimal. At the minimum, the derivative with respect to Δx̂ will be 0:

∂(‖Δy - Δŷ‖² + λ‖Δx̂‖²)/∂Δx̂ = 0    (6.25)

∂((Δy - dN(x)/dx|x=x0 · Δx̂)² + λ(Δx̂)²)/∂Δx̂ = 0
-(Δy - dN(x)/dx|x=x0 · Δx̂) · dN(x)/dx|x=x0 + λΔx̂ = 0.    (6.26)

Δx̂ = Δy · dN(x)/dx|x=x0 / ((dN(x)/dx|x=x0)² + λ)    (6.27)

If the interval length, L, of the model blocks tends to zero, then the Δx̂/Δy ratio of eq. (6.27) will tend to dx̂/dy, which is dK(o)/do. This means that at a given regularization parameter, using the Euclidean norm, an analytical solution can be found for the derivative of the compensation characteristic [137, 138], that is

dK(o)/do|x=x0 = dN(x)/dx|x=x0 / ((dN(x)/dx|x=x0)² + λ).    (6.28)
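As a quick numerical illustration (my own toy values, not from the thesis), the regularized gain dN/dx / ((dN/dx)² + λ) stays bounded where the derivative is small, while the plain inverse 1/(dN/dx) blows up and amplifies the noise:

```python
# Per-block gains of the piecewise linear compensation model.
# d is dN(x)/dx at the work-point, lam is the regularization parameter.

def regularized_gain(d, lam):
    """Regularized small-signal gain of the compensation block."""
    return d / (d * d + lam)

def plain_gain(d):
    """Gain of the unregularized inverse characteristic."""
    return 1.0 / d

lam = 1e-2  # arbitrary example value
for d in (1.0, 0.1, 0.01):
    print(d, plain_gain(d), regularized_gain(d, lam))
```

For d = 0.01 the plain inverse amplifies by 100, while the regularized gain stays below 1; the price is a bias where the derivative is small.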

This means that the piecewise linear model of the inverse nonlinear function can be computed, in the knowledge of the derivative of the original nonlinear function, by numerical integration using the solution of eq. (6.28). However, for the numerical integration we have to know one more parameter: the integration constant. For the determination of this constant let us write N⁻¹(·) in the following way:

N⁻¹(·) = F(·)|F(0)=0 + C,    (6.29)

where F(·) is the numerical integral of eq. (6.28), for example calculated using the F(0) = 0 constraint. This constraint is necessary only to have a solution for the inverse nonlinear characteristic, which may still be shifted by a DC component. If the F(0) = 0 constraint cannot be fulfilled because of the shape of the nonlinearity, then instead of F(0) = 0, another arbitrarily chosen constraint can be used.

For the computation of C, several methods can be used:

1. In the knowledge of the probability density functions of x and n, C can be computed by minimizing the weighted norm of the difference of the original signal and the compensated one [137, 138]:

C = arg min_C ∫ px(ξ) ‖ξ - F(N(ξ))|F(0)=0 - C‖ dξ.    (6.30)

2. If we know the expected value of x (the DC component of x), we can adjust C so that the expected value of x̂ will be equal to it. Usually the DC component is zero, and we have to construct a characteristic where the expected value of x̂ is also zero.

3. In several problems the exact value of the DC component is not of interest. This is the case in most audio applications. Here the only restriction is that the DC component has to be near zero, otherwise some elements in the audio chain could introduce additional sound distortion, or the life expectancy of some devices (e.g. loudspeakers) could be shortened. Hence the C constant can be arbitrarily chosen and the DC component can be filtered out by, for example, a simple high-pass filter.

Assuming the third method and simply using F(·)|F(0)=0, a MATLAB realization of the compensator characteristic can be seen in Appendix D.
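The MATLAB code of Appendix D is not reproduced here; the following Python sketch (my own, with an arbitrary tanh limiter standing in for N and an arbitrary λ) shows the same construction: sample the distortion on a grid, evaluate the regularized derivative of eq. (6.28) at each output sample, and integrate it numerically with the F(0) = 0 constraint.

```python
import numpy as np

# Hypothetical limiter standing in for the distortion N(x); lam is the
# regularization parameter (e.g. an estimate of E{n^2}/E{x^2}).
N = np.tanh
lam = 1e-3

x = np.linspace(-3.0, 3.0, 2001)   # input grid
y = N(x)                           # output samples o = N(x)
dN = np.gradient(y, x)             # dN(x)/dx on the grid

# Regularized derivative of the compensation characteristic along o.
dK = dN / (dN**2 + lam)

# Trapezoidal integration of dK over o, then shift so that K(0) = 0.
K = np.concatenate(([0.0], np.cumsum(0.5 * (dK[1:] + dK[:-1]) * np.diff(y))))
K -= np.interp(0.0, y, K)

# K is a lookup table: a noisy observation o is compensated by
# interpolating into it, e.g. np.interp(0.5, y, K) ~ atanh(0.5).
```

With λ = 1e-3 the table stays close to atanh except near saturation, where the regularization flattens it instead of amplifying the noise.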

6.4.1

Eq. (6.28) can also be treated as a rough approximation of the probabilistic approach to the compensation problem.

Assuming that the mean value of the input signal and the mean value of the noise are zero (E{x} = 0, E{n} = 0), and moreover that the input signal, x, and the noise, n, are uncorrelated, a probability-based solution for the compensation characteristic, K(o), in the case of the piecewise linear representation can be seen in Appendix B. There the squared sum of the difference of x and x̂ is minimized. As derived in the Appendix, the correct coefficients of such a piecewise linear compensator system can be computed as

Bi0 = -Bi1 · Σ(j=1..k) p(y∈Ui | o∈Uj) Aj0

Bi1 = Σ(j=1..k) p(y∈Ui | o∈Uj) Aj1 / ( Σ(j=1..k) p(y∈Ui | o∈Uj) Aj1² + E{n²}/E{x²} )    (6.31)

where Aj0 and Aj1 are the coefficients of the piecewise linear representation of N(x) in the interval Uj:

N(x)|N(x)∈Uj = Aj0 + Aj1 x    (6.32)

and Bj0 and Bj1 are the coefficients of the linear representation of K(o):

K(o)|o∈Uj = Bj0 + Bj1 o.    (6.33)

If the probability density functions of the noiseless output signal, y, and the disturbance, n, are exactly known, then the probability that y will be in the interval Ui while the observation signal, o, will be in Uj (shortly p(y∈Ui | o∈Uj)) can be computed as

p(y∈Ui | o∈Uj) = ∫∫ over ξ∈Ui, (ξ+η)∈Uj of py(ξ) pn(η) dξ dη.    (6.34)
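The double integral of eq. (6.34) can be approximated on a grid; the sketch below (my own, with an arbitrary uniform py and triangular pn) accumulates py(ξ)pn(η) over the cells where ξ ∈ Ui and ξ + η ∈ Uj:

```python
import numpy as np

edges = np.linspace(0.0, 1.0, 5)    # interval edges of U_1..U_4

# Midpoint grids for the two integration variables.
xi = np.linspace(0.0, 1.0, 400, endpoint=False) + 1.0 / 800    # xi (y values)
eta = np.linspace(-0.1, 0.1, 80, endpoint=False) + 0.00125     # eta (noise)
py = np.full_like(xi, 1.0)                  # p_y: uniform on [0, 1)
pn = (0.1 - np.abs(eta)) / 0.01             # p_n: triangular, unit area

dxi, deta = xi[1] - xi[0], eta[1] - eta[0]
k = len(edges) - 1
P = np.zeros((k, k))                        # P[i, j]: probability of eq. (6.34)
for a, py_a in zip(xi, py):
    i = int(np.searchsorted(edges, a, side="right")) - 1
    for b, pn_b in zip(eta, pn):
        o = a + b
        if edges[0] <= o < edges[-1]:       # mass leaving [0, 1) is discarded
            j = int(np.searchsorted(edges, o, side="right")) - 1
            P[i, j] += py_a * pn_b * dxi * deta
```

Most of the mass stays on the diagonal (y and o in the same interval); the off-diagonal entries grow with the noise spread.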

As we can see, eq. (6.31) is very similar to eq. (6.28) in its structure. If the length of every Uj tends to zero, Aj1 tends to the derivative of N(x), and Bj1 tends to

dK(o)/do|o=y0 = ∫ pn(η) · dN(x)/dx|x=N⁻¹(y0+η) dη / ( ∫ pn(η) · (dN(x)/dx|x=N⁻¹(y0+η))² dη + E{n²}/E{x²} ).    (6.35)

Now we have to determine somehow what the difference is between the derivative, dN(x)/dx, of the nonlinear function and the numerator of eq. (6.35), in order to express the difference between the optimal least squares solution and the Tikhonov-based one. The numerator of eq. (6.35) is a convolution integral, which is hard to evaluate directly and compare to dN(x)/dx.

Let us therefore introduce the function R(pn(n), N(x)) through

∫ pn(η) · dN(x)/dx|x=N⁻¹(y0+η) dη = R(pn(n), N(x)) · dN(x)/dx|x=N⁻¹(y0),    (6.36)

with which, approximating the integral of the squared derivative in the denominator of eq. (6.35) by (dN(x)/dx)², the least squares solution can be written as

dK(o)/do|o=y0 = R(pn(n), N(x)) · dN(x)/dx / ((dN(x)/dx)² + E{n²}/E{x²}),    (6.37)

where R(pn(n), N(x)) is a function that can be calculated as the fraction of the numerator of eq. (6.35) and dN(x)/dx. If R(pn(n), N(x)) is close to 1, the difference between the least squares solution and the Tikhonov solution is small.

Let us approximate λ by E{n²}/E{x²} · 1/R(pn(n), N(x)). Since R(pn(n), N(x)) is calculated from the derivative of a strictly monotone function and the probability density function of a relatively small-level noise, divided by a representation point of the derivative, the value of this R(pn(n), N(x)) function will stay in a relatively small interval; however, it will not be constant and will not always be 1. This means that the previously described Tikhonov regularization method will be close to the least squares optimum and will reduce the error caused by noise amplification, although the result will be only suboptimal in the least squares sense. As Sarkar claims in [66] in the case of a similar linear regularization problem, this approximation makes a relatively small error in most applications.

Let us see some examples of how R(pn(n), N(x)) changes in the case of different nonlinear functions and different noises. Four different nonlinear functions were tested, with uniformly distributed and with Gaussian distributed noise. These functions were the Gaussian error function and the exponential function in the range of x = [-3, 3], and the x^0.5 and x^0.2 functions in the range of x = [0, 10]. In the case of the last two functions the output is assumed to be zero if x < 0. The interval length of the uniformly distributed noise was 0.1 and 0.01. The deviation of the Gaussian noise was 0.1 and 0.01.

Figure 6.4: R(pn(n), N(x)) at Gaussian error function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.5: R(pn(n), N(x)) at Gaussian error function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.6: R(pn(n), N(x)) at exponential function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.7: R(pn(n), N(x)) at exponential function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.8: R(pn(n), N(x)) at square-root function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.9: R(pn(n), N(x)) at square-root function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.10: R(pn(n), N(x)) at x^0.2 function and uniformly distributed noise (noise interval at left 0.1, noise interval at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

Figure 6.11: R(pn(n), N(x)) at x^0.2 function and Gaussian noise (noise deviation at left 0.1, noise deviation at right 0.01). Solid line: nonlinear function, dashed line: R(pn(n), N(x)).

The results can be seen in figures 6.4-6.11. The deviation of R(pn(n), N(x)) from 1 is very small in the case of the Gaussian error function, and the difference from 1 is not even noticeable in the case of the exponential function. Slightly higher differences can be seen in the case of the x^0.5 and x^0.2 functions in the neighbourhood of zero, but they are also at an acceptable level. And of course, R(pn(n), N(x)) is zero where the signal is assumed to be zero, but there it is not important.

Let us denote dN(x)/dx by δ and E{n²}/E{x²} by λ0, and write R for R(pn(n), N(x)). The difference between the derivative of the Tikhonov-regularized compensation characteristic and that of the least squares one is then

δ/(δ² + λ) - Rδ/(δ² + λ0) = ((1 - R)δ³ + δ(λ0 - Rλ)) / ((δ² + λ)(δ² + λ0)).    (6.38)

If λ tends to infinity, the denominator will contain a λ² member, while λ appears in the numerator only in first power, hence the difference between the Tikhonov solution and the optimal least squares solution tends to 0. If δ tends to infinity, the denominator grows with δ⁴; however, the highest power degree in the numerator is only δ³, so the difference will also tend to zero. The difference will be the highest in the case when δ² ≈ λ.

The difference can also be minimized by the choice of λ. We have to take care of the numerator of eq. (6.38). Let us see when it will be zero:

(1 - R)δ³ + δ(λ0 - Rλ) = 0
λ = ((1 - R)δ² + λ0)/R.    (6.39)

Since δ varies along the characteristic while λ is a constant, this cannot hold everywhere; but if it is fulfilled around δ² ≈ λ (where the difference would be the highest), then the difference will be near to zero, and

λ = ((1 - R)δ² + E{n²}/E{x²})/R → E{n²}/E{x²}  as  R → 1.    (6.40)

Comparing the solution of Tikhonov regularization to the least squares solution we can say:

The solution of Tikhonov regularization is not the same as the least squares one; however, they are close to each other. The difference caused by the R(pn(n), N(x)) term is usually not too big, because the value of R(pn(n), N(x)) is close to 1 in the range of interest. The difference can be further reduced if λ is appropriately chosen. Although the solution of Tikhonov regularization and the least squares one are not the same, this is not a big problem from our viewpoint, because our aim was only to reduce the artifacts caused by noise during the nonlinear compensation of optical audio recordings, and for this aim Tikhonov regularization is as proper as the least squares solution.

For the computation of the least squares solution the exact knowledge of the probability density functions of the signal and the noise (pn(n) and px(x)) is required. If they are not known, the solution cannot be computed, and if they are not properly known, the solution could be strongly distorted and unusable. Tikhonov regularization is a much more robust method, and good results can be achieved also without the knowledge of pn(n) or px(x). For example, an appropriate regularization parameter, λ, can be computed already from the energy estimates of the noise and the signal, E{n²} and E{x²}. (Certainly, in the knowledge of pn(n) and px(x) a bit better results can be achieved.)

6.4.2

When we have information about the probability density functions of the observed signal and the noise, po(o) and pn(n), we can produce a better estimate of the regularization parameter than E{n²}/E{x²}. If the original signal is a constant, x0, and the probability density function, pn(n), of the additive noise n is known, the optimal value of the regularization parameter, λ, and the optimal compensation characteristic, K(o, λ), can be computed by minimizing the expected value of the difference of the original x0 constant and its estimate, x̂ = K(o, λ), with respect to λ:

E{e(x0, λ)} = ∫ pn(η) ‖x0 - K(N(x0) + η, λ)‖ dη.    (6.41)

If x is not constant, but the probability density function of x, denoted by px(x), is known, the optimal compensation characteristic can be computed as the minimizer of the proper expected value:

E{e(λ)} = ∫ px(ξ) ∫ pn(η) ‖ξ - K(N(ξ) + η, λ)‖ dη dξ.    (6.42)

Usually, eq. (6.42) cannot be solved and λ and K(o) cannot be determined directly, since pn(n) and px(x) are not known in most problems. However, we can estimate them. In practice pn(n) can be estimated by two kinds of methods:

1. Collect information about pn(n) from those signal parts where only noise is present. In the case of speech and other audio signals the signal contains several pauses, which can be found manually [137]; or, if we would like to automatize the whole process, they can be detected by voice activity detectors. These detectors are based on short-term energy measurements [139], zero-crossing measurements [140] or cepstral analysis of the signal [141].

2. Collect information from the whole signal using filter banks, and select the noise information by statistical methods. These methods are based on spectral minima tracking [142, 143, 144, 145, 146], quantile based noise spectrum estimation [147] and extended spectral subtraction [148].
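A minimal sketch of the first approach (my own simplification: a fixed energy quantile stands in for a real voice activity detector), estimating pn(n) as a normalized histogram of the quietest frames:

```python
import numpy as np

def estimate_noise_pdf(signal, frame_len=256, quantile=0.1, bins=51):
    """Histogram-based estimate of p_n(n) from the lowest-energy frames."""
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    energy = np.mean(frames**2, axis=1)
    # keep the quietest frames -- a crude stand-in for a pause detector
    quiet = frames[energy <= np.quantile(energy, quantile)]
    hist, edges = np.histogram(quiet.ravel(), bins=bins, density=True)
    return hist, edges

# Demo: speech-like bursts (sine) separated by pauses, plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(100_000)
burst = np.where((t // 10_000) % 2 == 0, np.sin(0.05 * t), 0.0)
obs = burst + 0.05 * rng.standard_normal(t.size)
hist, edges = estimate_noise_pdf(obs)
```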

The probability density function, px(x), of the input signal can be estimated iteratively, in a few steps. A similar method was proposed for the linear case by Daboczi [103] and further developed for the nonlinear case by the author [135, 149]. The algorithm is the following:

1. First we need an initial guess about px(x) and pn(n). A good estimate can be computed for pn(n) by using one of the possibilities described above. For the initial px(x)0, if we have no other possibility, we can use the probability density function of the output signal: px(x)0 = po(o).

2. Compute λ by minimizing eq. (6.42) using the estimates of px(x) and pn(n).

3. Compute K(o, λ) and, with this, compute x̂ from o.

4. Using the histogram of x̂, a new estimate can be calculated for px(x).

5. If the number of iterations is already high enough, or the difference between the new λi and the previous λi-1 parameter is small enough, we can stop the iteration; otherwise we go back to the second step.
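A condensed Python sketch of the whole loop (my own toy setup, not the thesis code: a known tanh limiter stands in for N, the regularized inverse of eq. (6.28) is tabulated per λ, and eq. (6.42) is estimated by Monte Carlo from the current signal estimate):

```python
import numpy as np

N = np.tanh                                  # assumed known distortion
xg = np.linspace(-4.0, 4.0, 4001)
yg = N(xg)
dN = np.gradient(yg, xg)

def K_table(lam):
    """Regularized inverse characteristic for a given lambda."""
    dK = dN / (dN**2 + lam)
    K = np.concatenate(([0.0], np.cumsum(0.5 * (dK[1:] + dK[:-1]) * np.diff(yg))))
    return K - np.interp(0.0, yg, K)

rng = np.random.default_rng(1)
x = 0.8 * rng.standard_normal(20_000)        # "unknown" input (for the demo)
sigma_n = 0.05
o = N(x) + sigma_n * rng.standard_normal(x.size)

lams = [1.0 / 4**i for i in range(20)]       # candidate grid
xs = o.copy()                                # step 1: p_x(x)_0 ~ p_o(o)
for _ in range(3):                           # three iterations usually suffice
    errs = []
    for lam in lams:
        K = K_table(lam)
        # step 2: Monte-Carlo estimate of the expected error -- re-distort
        # the current estimate, add fresh noise, compensate, compare.
        n = sigma_n * rng.standard_normal(xs.size)
        errs.append(np.mean((xs - np.interp(N(xs) + n, yg, K)) ** 2))
    lam_best = lams[int(np.argmin(errs))]
    xs = np.interp(o, yg, K_table(lam_best)) # steps 3-4: new estimate of x
```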

We don't have to use the whole audio material to compute px(x)0; it is enough to use a representative part of the signal [135]. This means that the non-iterative behaviour of the analytical solution still remains, because the iteration to compute px(x) and λ has to be done only on a small portion of the audio material.

Let us examine the convergence properties of the algorithm described above. When n = 0, hence pn(n) = δ(n), then

E{e(λ)} = ∫ px(ξ) ‖ξ - K(N(ξ), λ)‖ dξ.    (6.43)

Now the best solution is when K(o, λ) = N⁻¹(o), hence when λ = 0. After one iteration we will get back px(x) almost regardless of the initial guess of px(x). At the solution, E{e(λ)} will be zero, and it will be higher in any other case. (Of course, if px(x) is constant zero, or a Dirac impulse at zero, λ cannot be found, but we can find it in any other non-degenerate situation.)

When n is small, K(N(x) + n, λ) can be approximated by the first elements of the Taylor polynomial:

K(N(x) + n, λ)|x=x0 = K(N(x), λ)|x=x0 + dK(o)/do|o=N(x0) · n.    (6.44)

Now eq. (6.41) (the inner integral of eq. (6.42)) can be written as

∫ pn(η) ‖x0 - K(N(x), λ)|x=x0 - dK(o)/do|o=N(x0) · η‖ dη,    (6.45)

where

x0 - K(N(x), λ)|x=x0 - dK(o)/do|o=N(x0) · n = εT(x, λ) + εN(x, n)
εT(x, λ) = x0 - K(N(x0), λ)
εN(x, n) = -dK(o)/do|o=N(x0) · n,    (6.46)

where εN(x, n) corresponds to the error term caused by the noise and εT(x, λ) denotes the error term caused by the regularized characteristic. Choosing the regularization parameter near to zero, the difference between the original inverse characteristic and the regularized one will be small, therefore the caused signal distortion will also be small; however, the noise amplification becomes high. Using higher and higher λ values, the noise amplification becomes smaller and smaller, but so does the derivative of the compensation characteristic, hence the compensation characteristic becomes more and more flat and the distortion caused by the characteristic grows.

When λ is small, the probability density function of the estimate, px(x̂), will be widely distributed, because the estimate is contaminated by high-amplitude noise. When λ approaches infinity, px(x̂) will approach a Dirac impulse, because the compensation characteristic becomes ever more flat.
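The tradeoff can be illustrated at a single work-point (my own toy numbers: N = tanh, x0 = 1, none of this is from the thesis): as λ grows, the noise gain dK/do shrinks while the distortion error grows.

```python
import math

# Scalar illustration of the two error terms at a work-point.
x0 = 1.0
d = 1.0 - math.tanh(x0) ** 2        # dN/dx at x0 for N = tanh

for lam in (1e-6, 1e-3, 1e-1):
    gain = d / (d * d + lam)        # noise amplification dK/do
    # linearized stand-in for the distortion error eps_T ~ x0 * (1 - K'N')
    eps_T = x0 * (1.0 - gain * d)
    print(lam, round(gain, 3), round(eps_T, 3))
```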

Let us examine the behaviour of these two terms assuming a limiter characteristic for N(·), which has small distortion in the case of small-amplitude input signals, and whose distortion increases as the amplitude increases. This kind of nonlinearity is also known as mild nonlinearity. The unwanted distortions in the audio field can generally be represented by limiter characteristics. Nonlinear characteristics with different behaviour usually do not appear in audio recordings, except in some signals manipulated with special sound effects or noise gates.

If during the iteration the i-th value of λ is very high, px(x)i will differ from zero only in a very small interval, at small x values, and in eq. (6.42) that part of the error will be emphasized which is close to zero. In the case of a limiter characteristic, this part is the almost linear part. Here the noise term, εN, is negligibly small and only the distortion term, εT, dominates. The error can be decreased if λ is reduced and εT is decreased.

Unfortunately, the convergence of the algorithm is not guaranteed in the case of a dead-zone-like nonlinearity, since there the highly distorted part will be emphasized when λ is high and px(x)i is narrow. However, as we discussed, dead-zone nonlinearities do not appear in audio recordings, which are our practical problem.

When the i-th value of λ is extremely small, px(x)i will be widely distributed and almost flat, due to the noise. In eq. (6.42) the noise term will dominate. The error can be reduced if λ is increased.

This means that (in the case of limiter-like nonlinearities) the algorithm will converge to that λ value where εN and εT are nearly the same (of course, the exact ratio will depend on px(x̂)). This solution corresponds to Bertocco's method for linear problems in [108], where a noise term was computed from those signal parts where the original signal was flat, a distortion term was computed from the signal changes, and the optimal compensation was reached when the two terms were equal.

The solution is also similar to Hansen's L-curve method [113], where the logarithm of the energy of the estimate, log(‖x̂‖), was compared to the misfit, ‖N(x̂) - o‖. The energy of the estimate will be small when λ is high, and then εT is also high. The misfit will be small when λ is small, but then εN is high. Hence this method will provide a similar solution to our method.

Simulations show that this algorithm is as robust as the algorithms depicted in chapter 3.3.3. It was shown with some heuristic steps that in the case of limiter characteristics the algorithm converges to similar values as the other algorithms discussed in 3.3.3; however, the algorithm still lacks a thorough convergence analysis.

The advantages of our method compared to Hansen's method:

Hansen's method requires mapping log(‖x̂‖) and ‖N(x̂) - o‖ over a very wide range to see the L-shape clearly. This novel algorithm requires only a few iteration steps. Experiments show that usually three iterations are enough to get a proper λ value [135].

In Hansen's method the edge point can be chosen from the curve manually or using some heuristic methods. This novel algorithm can choose λ absolutely automatically. Sometimes the L-shape of the Hansen-diagram is not clear, which makes it harder to choose the value of λ.

6.4.3 Comparison with Morozov's and Hansen's method

As it was shown in section 3.3.3, two different methods are generally used for finding the correct regularization parameter in nonlinear problems: Morozov's method and Hansen's method. In Morozov's method the regularization parameter is chosen as the solution of

‖N(x̂, λ) - o‖ = Cε,  C ≥ 1,    (6.47)

where ε is the estimate of the norm of the noise; in Hansen's method the regularization parameter is chosen as the corner point of the log(‖N(x̂) - o‖) vs. log(‖x̂‖) diagram.
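Morozov's rule can be sketched as follows (my own toy setup, not from the thesis: a tanh distortion, the regularized inverse of eq. (6.28) tabulated per λ, and the noise norm ε assumed known); starting from strong regularization, the first λ whose residual drops to Cε is accepted:

```python
import numpy as np

N = np.tanh
xg = np.linspace(-4.0, 4.0, 4001)
yg = N(xg)
dN = np.gradient(yg, xg)

def compensate(o, lam):
    """Apply the regularized inverse characteristic for a given lambda."""
    dK = dN / (dN**2 + lam)
    K = np.concatenate(([0.0], np.cumsum(0.5 * (dK[1:] + dK[:-1]) * np.diff(yg))))
    return np.interp(o, yg, K - np.interp(0.0, yg, K))

rng = np.random.default_rng(2)
x = 0.5 * rng.standard_normal(5_000)
sigma = 0.05
o = N(x) + sigma * rng.standard_normal(x.size)

eps = sigma * np.sqrt(x.size)               # norm of the noise (known here)
C = 1.0
lam_morozov = None
for lam in [1.0 / 4**i for i in range(20)]: # strong -> weak regularization
    resid = np.linalg.norm(N(compensate(o, lam)) - o)
    if resid <= C * eps:                    # discrepancy level reached
        lam_morozov = lam
        break
```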

For comparison of these methods to the proposed new one, two simulations will be shown with a multisine input signal (Fig. 6.12) that was passed through two different nonlinear functions (Fig. 6.13 and 6.14) and contaminated by Gaussian noise. In the first case the nonlinear distortion was the Gaussian error function, erf(·), which is a general limiter function. In the second case it was a part of x^5:

y = 0.1(x + 25^0.2)^5 - 2.5,    (6.48)

because this function is similar to the distortions that appear in film-sound.

[Figure 6.12: The multisine input signal, x.]

For the simulation of the observation noise, Gaussian noise was added to the distorted signals to produce a 40 dB signal-to-noise ratio, which was calculated as

SNR = 10 log( Σi yi² / Σj nj² ).    (6.49)

The initial px(x)0 parameter of the proposed algorithm was chosen as po(o). For faster computation, a fixed set of λ values was chosen and the methods were evaluated only at 20 points. In the iterative algorithm λi was chosen from these points as the one where the estimated error was the smallest, and only three iterations were made. These 20 points were equidistant on the logarithmic scale:

λi = 1/4^i,  i = 0 … 19.    (6.50)

The results can be seen in Fig. 6.17-6.24. The first four figures show the choice of the proper regularization parameter, λ, by the examined methods. The other four figures show the reconstruction in the time domain, x̂. The results can also be seen in Table 6.1. The error, ε, in this table was computed as

ε = (1/M) Σ(i=1..M) (x[i] - xest[i])².    (6.51)
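The bookkeeping of eqs. (6.49)-(6.51) is straightforward; a small Python sketch (my own demo signal, not the thesis data):

```python
import numpy as np

def snr_db(y, n):
    """Eq. (6.49): signal-to-noise ratio of the distorted signal y vs. noise n."""
    return 10.0 * np.log10(np.sum(np.asarray(y)**2) / np.sum(np.asarray(n)**2))

def mean_sq_error(x, x_est):
    """Eq. (6.51): mean squared reconstruction error."""
    x, x_est = np.asarray(x), np.asarray(x_est)
    return float(np.mean((x - x_est)**2))

lams = [1.0 / 4**i for i in range(20)]   # eq. (6.50): 20 log-equidistant points

# Demo: scale Gaussian noise so that the SNR is exactly 40 dB.
rng = np.random.default_rng(0)
y = np.sin(0.01 * np.arange(10_000))
n = rng.standard_normal(y.size)
n *= np.sqrt(np.sum(y**2) / np.sum(n**2)) / 10.0**(40.0 / 20.0)
print(round(snr_db(y, n), 6))  # prints 40.0
```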

Figure 6.13: Gaussian error function used for the first simulation.

[Figure 6.14: The x^5-based function used for the second simulation.]

Figure 6.15: Noisy output signal of the first simulation (distortion is made by the Gaussian error function).

Figure 6.16: Noisy output signal of the second simulation (distortion is made by the x^5 function).

Figure 6.17: Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the Gaussian error function.

Figure 6.18: Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the Gaussian error function.

Table 6.1: Comparison results of Morozov's, Hansen's and the new method.

                 Morozov   Morozov   Morozov   Hansen    new method  true case
                 C=10      C=2       C=1
λ, N(): erf()    1e-02     3.9e-03   2e-03     6.1e-05   6.1e-05     6.1e-05
ε, N(): erf()    0.1004    0.0652    0.0481    0.0201    0.0201      0.0201
λ, N(): x^5      7e-01     2.5e-01   1.5e-01   1e-04     4e-03       2e-02
ε, N(): x^5      8.5e-02   3.2e-03   1.9e-02   1.0e-03   6.6e-04     4.95e-04

Figure 6.19: Error of the compensation of nonlinearity by Morozov's method (left) and Hansen's method (right) as a function of λ. The nonlinear distortion is the part of x^5.

Figure 6.20: Error of the compensation of nonlinearity by the novel method (left) and the true result (right) as a function of λ. The nonlinear distortion is the part of x^5.

Figure 6.21: Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the Gaussian error function.

Figure 6.22: Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the Gaussian error function.

Figure 6.23: Reconstruction of x by Morozov's method (left) and Hansen's method (right) for the x^5 nonlinear distortion.

Figure 6.24: Reconstruction of x by the novel method (left) and the optimal result in least squares sense (right) for the x^5 nonlinear distortion.

In the case of the Gaussian error function, Hansen's method and the proposed new method gave the same optimal regularization parameter. In the case of x^5, the novel method gave better results than Hansen's. Morozov's method overestimated λ in all cases, even when the C constant in eq. (3.21) was chosen as 1.

6.5

To test the proposed restoration procedure, an original speech signal, x, recorded from the radio was synthetically distorted by a Γ-function:

y = Γ(x) = G(x + O1)^γ + O2  if x > -O1,
    O2  otherwise,    (6.52)

G = 0.5,  O1 = 0.4,  O2 = 0,  γ = 5.8.

The distorted signal was contaminated by Gaussian noise to set the signal-to-noise ratio, which was calculated as

SNR = 10 log( Σi yi² / Σj nj² ).    (6.53)
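A sketch of a Γ-type limiter of this form (my reading of eq. (6.52): G(x + O1)^γ + O2 above -O1 and O2 below it; parameter values from the text):

```python
def gamma_distortion(x, G=0.5, O1=0.4, O2=0.0, gamma=5.8):
    """Gamma-type limiter: G*(x + O1)**gamma + O2 for x > -O1, else O2."""
    if x > -O1:
        return G * (x + O1) ** gamma + O2
    return O2

print(gamma_distortion(0.6))   # (0.6 + 0.4)**5.8 = 1, so this prints 0.5
print(gamma_distortion(-0.5))  # below the knee: prints 0.0
```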

The original and the distorted, noise-contaminated signals can be seen in Fig. 6.25 and 6.26. The O1, O2 and G parameters of the Γ-function are assumed to be unknown. To determine these parameters, a small signal part was chosen from the distorted, noisy signal, which can be seen in Fig. 6.27. The original signal was modeled by eight harmonically related sinusoids. The fundamental frequency was 84.6 Hz. The results of the search program can be seen in Fig. 6.28. The gain estimate, G (0.5068), and the offset estimate, O2 (0.0007), are near to the real parameters. The O1 parameter (0.3454) is a bit farther from the real one, but this only causes a DC shift in the resulting audio signal that can be corrected by a high-pass filter.

In the knowledge of Γ(·), using eq. (6.42) and the iterative algorithm described in section 6.4.2, we can calculate the estimate of the regularization parameter, λ. The results of the iterations can be seen in Fig. 6.29. The real error values can be seen in Fig. 6.30.

The estimate of the regularization parameter (9.76e-04) is quite close to the real one (3.9e-03). The compensated signal can be seen in Fig. 6.31. An overcompensated (λ = 0.25) and an underregularized example (λ = 9e-13) can also be seen in Fig. 6.32 and Fig. 6.33.

[Figure 6.25: The original speech signal, x.]

[Figure 6.26: The distorted, noise-contaminated signal, o.]

Figure 6.27: Distorted, noisy signal part chosen for parameter determination of the nonlinear function.

Figure 6.29: Estimate error of the iterative algorithm at different regularization values.

[Figure 6.30: The real error values at different regularization values.]

[Figure 6.31: The compensated signal, x̂.]

[Figure 6.32: Overcompensated example (λ = 0.25).]

[Figure 6.33: Underregularized example.]

6.6

To test the algorithm on real audio signals, a distorted movie sound was provided by the Hungarian National Film Archive. Unfortunately, due to the limited capabilities of the archive, the audio signal was given on a VHS tape. The audio material suffered from strong thumps, clicks and hiss due to the aged original film material and the VHS tape recorder, and it also suffered from a very strong nonlinear distortion due to a wrong film-copy process. A part of this audio signal can be seen in Fig. 6.34. The nonlinearity parameter, γ, was told to be 3.8; however, there was no information about the offset and gain parameters. To determine these, a small signal part was chosen from the file, which can be seen in Fig. 6.35. The original signal was modeled by eight harmonically related sinusoids. The fundamental frequency was 559.85 Hz. The parameter estimates were

G = 0.2104,  O2 = -0.1349,  O1 = 0.8246.

From the resulting parameters the regularized inverse characteristic was calculated by the proposed algorithm. The results of the iterations can be seen in Fig. 6.37.

The optimal reconstructed signal can be seen in Fig. 6.38. An underregularized example can be seen in Fig. 6.39.

Figure 6.34: Real, nonlinearly distorted and noise-contaminated audio signal.

Figure 6.35: Signal part chosen for parameter determination of the nonlinear function.

[Figure 6.37: Estimate error of the iterative algorithm at different regularization values.]

[Figure 6.38: The optimal reconstructed signal.]

Although VHS tape is not the best medium for digital reconstruction of nonlinearly distorted signals, due to its small bandwidth and small signal-to-noise ratio, the proposed algorithm performed well. The quality of the resulting signal became significantly better and the disturbing sound distortion became much smaller. The noise level also remained small, and the optimal regularization did not introduce any disturbing artefacts.

[Figure 6.39: Underregularized example.]

6.7 Unbiased estimate

The restored data becomes biased due to the restoration method introduced in section 6.4. This means that the expected value of the restored data at a given time point will not be equal to the expected value of the original data at that time point. Although this is not a definitely disturbing effect in the case of audio signals, there could be some applications which require an unbiased estimate of the original signal.

In this section we give a possible solution to get an unbiased estimate in the case of nonlinearly distorted and noise-contaminated signals, if the probability density function of the noise and the nonlinear distortion function are known.

Let us take a look again at the model of nonlinearity compensation in Fig. 6.1 and try to calculate the expected value of x̂:

E{x̂}|y=N(x0) = ∫ px̂(ξ) ξ dξ = ∫ po(o) K(o) do = ∫ pn(o - y) K(o) do.    (6.54)

Due to the noise, the expected value of x̂ will not be K(E{y}) but K̃(E{y}), where K̃(·) is the correlation of the noise probability density function and the compensation characteristic. So the estimated value of x could be distorted.

To avoid this, we need to make an estimate x̂ from o whose expected value equals the original signal:

$\int p_n(o-y)\,K(o)\,do = x = N^{-1}(y),$   (6.55)

which means that the correlation of K(o) and p_n(n) has to be the inverse of N(·).

Since correlation transforms into multiplication with a complex conjugate in the frequency domain, this suggests finding the solution using Fourier transforms:

$\mathcal{F}\{K(o)\} = \frac{\mathcal{F}\{N^{-1}(x)\}}{\mathcal{F}^{*}\{p_n(n)\}}.$   (6.56)

In practice, this task can be solved with finite-length sampled data blocks of N⁻¹(·) and p_n(n). In this case the realized correlation will be a circular one. Due to the circular correlation, the result can differ strongly from the expected one: huge oscillations can appear in the resulting characteristic. This is due to the fact that we made a discrete


convolution with a windowing function, which causes leakage between the frequency bins and hence the appearance of new frequency components in the spectrum. After dividing this spectrum by the spectrum of the noise probability density function, these components can be amplified, and when we perform the inverse Fourier transformation, these unwanted components can cause oscillations.
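A small sketch makes the mechanism concrete (Python/NumPy, with hypothetical sample values, not data from the thesis): solving eq. (6.56) by dividing finite-length spectra produces a K that reproduces N⁻¹ exactly under circular correlation, which is precisely why the ordinary (non-circular) result can oscillate.

```python
import numpy as np

Nfft = 128
grid = np.linspace(-1.0, 1.0, Nfft, endpoint=False)
n_inv = grid ** 3                                  # hypothetical inverse characteristic N^{-1}

# sampled noise pdf: a narrow Gaussian, normalised and centred at lag 0
p_n = np.exp(-0.5 * (np.arange(Nfft) - Nfft // 2) ** 2)
p_n /= p_n.sum()
p_n = np.roll(p_n, -Nfft // 2)

# eq. (6.56): divide the spectra (conjugate in the denominator, since
# correlation becomes multiplication with a complex conjugate)
K = np.real(np.fft.ifft(np.fft.fft(n_inv) / np.conj(np.fft.fft(p_n))))

# the solution is exact, but only in the *circular*-correlation sense
circ_corr = np.real(np.fft.ifft(np.fft.fft(K) * np.conj(np.fft.fft(p_n))))
```

Here `circ_corr` matches `n_inv` to machine precision, while the linear correlation of `K` and `p_n` would not; the wrap-around terms are what appear as oscillations in the characteristic.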

This effect of the circular correlation can be reduced by techniques such as windowing or zero padding. Another technique is to mirror the shape of the nonlinear characteristic, complement the original shape with the mirrored one, and use the resulting characteristic for the determination of the proper F{K(o)}. The latter technique leads to the Nahman-Gens method [150] or to the cosine transformation. However, the effect of the circular correlation can only be reduced (not eliminated) by windowing, and it can be eliminated with the Nahman-Gens method only if the derivatives of the mirrored and the original shape are equal at the matching point. Generally, these methods cannot handle the problem properly.

Eq. (6.56) gives only one solution for sampled data blocks; however, it is not the only one. This claim can be proven if we write eq. (6.55) with finite-length, sampled data blocks:

$E\{\hat{\mathbf{x}}\} = \mathbf{P}\,\mathbf{K},$ where

$\hat{\mathbf{x}} = [\hat{x}_i;\ \ldots;\ \hat{x}_{i+N}], \qquad \mathbf{P} = \begin{bmatrix} p_n(y_{i-M/2} - N(x_i)) & \cdots & p_n(y_{i+M/2} - N(x_i)) \\ \vdots & \ddots & \vdots \end{bmatrix}, \qquad \mathbf{K} = [K(y_{i-M/2});\ \ldots;\ K(y_{i+M/2})].$   (6.57)

$E\{\hat{\mathbf{x}}\}$ is calculated at N points from P and K, where K is known at M points.

If the non-negligible part of p_n(n) that has to be used in the calculations has an interval length of I, and y_i is sampled with step length L, then K has to be known at least at M = N + I/L points, otherwise the first or last rows will not contain the whole range of interest of p_n(·). Hence M always has to be greater than N. This means that eq. (6.57) is underdetermined: the solutions for K form a subspace, where an infinite number of solutions exists. One solution can be found by Fourier transformation, but it will contain oscillations due to the circular correlation. However, we need a solution where K(·) is smooth, otherwise the compensation characteristic would increase the variance of x̂ too much.


6.7.1

$K_0 = N^{-1}, \qquad K_{i+1} = K_i + \mu\,\bigl(N^{-1} - \mathrm{corr}\{K_i, p_n\}\bigr), \qquad \mu < \frac{1}{\max|\mathcal{F}\{p_n\}|},$   (6.58)

where μ is the step size of the iteration.

For the numerical realization one has to note that if p_n(n) is mapped at q points in the range [n₁, n₂] and the vector K_i has length r in the range [y₁, y₂], then K_{i+1} will have length r − q in the range [y₁ + (n₂ − n₁)/2, y₂ − (n₂ − n₁)/2], since the correlation vector is shorter than K_i.

A possible MATLAB realization of this kind of compensation characteristic, for the case of a Gaussian error-function nonlinearity and uniformly distributed noise, can be seen in Appendix F.

6.7.2

$K_{i+1} = K_i + (N_1 - K_i \odot P),$   (6.59)

where K, N₁ and P are the Fourier transforms of the K_i(·), N⁻¹(·) and p_n(n) functions, respectively, and ⊙ denotes the elementwise product of vectors.

K_{i+1} can be written without recursion:

$K_0 = N_1$
$K_1 = (1 - P) \odot K_0 + N_1$
$K_2 = (1 - P)^2 \odot K_0 + (1 - P) \odot N_1 + N_1$
$K_3 = (1 - P)^3 \odot K_0 + (1 - P)^2 \odot N_1 + (1 - P) \odot N_1 + N_1$
⋮
$K_{i+1} = (1 - P)^{i+1} \odot K_0 + \bigl(N_1 + (1 - P) \odot N_1 + \cdots + (1 - P)^i \odot N_1\bigr),$   (6.60)

with the powers taken elementwise. As we can see, K_{i+1} can be written as a geometric sum plus a term on the (i+1)-th power:

$K_{i+1} = (1 - P)^{i+1} \odot K_0 + \frac{1 - (1 - P)^{i+1}}{1 - (1 - P)} \odot N_1.$   (6.61)


If ‖1 − P‖ < 1, the iteration will be convergent: the term (1 − P)^{i+1} ⊙ K₀ will vanish and the sum will converge to N₁/P (elementwise). Since the iteration itself is performed in the time domain, circular correlation will not appear and the resulting characteristic will be smooth.
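Because the recursion acts on each frequency bin independently, the convergence claim of eqs. (6.60)-(6.61) can be checked with a handful of hypothetical per-bin spectrum values:

```python
import numpy as np

P  = np.array([1.0, 0.8, 0.5, 0.3])   # spectrum of the noise pdf; every |1 - P| < 1
N1 = np.array([1.0, 2.0, 4.0, 8.0])   # spectrum of the inverse nonlinearity

K = N1.copy()                          # K_0 = N1
for _ in range(200):
    K = K + (N1 - K * P)               # eq. (6.59), elementwise

# the geometric sum of eq. (6.61) converges to the elementwise quotient N1/P
```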

6.8

Simulation results

To show the behaviour of the method for unbiased estimates, a simulation is presented with a simple sinusoid excitation signal x, shown in Fig. 6.40. The amplitude of the signal is 0.9. The signal is led through a nonlinearity with the following characteristic (Fig. 6.41):

$y = (2.45\,x + 3)^3.$   (6.62)

To simulate the effect of noise, uniformly distributed white noise was added to the distorted signal to achieve a 50 dB signal-to-noise ratio (defined as the ratio of the energy of the distorted signal to that of the noise). The noisy, distorted signal can be seen in Fig. 6.42.

Three different estimates of the original, undistorted signal were calculated: one with the exact inverse of the nonlinear distortion (left side of Fig. 6.43), another with the Tikhonov-regularized characteristic (right side of Fig. 6.43). Here, the regularization parameter was


[Figures: the nonlinear characteristic (Y vs. X) and the noisy, distorted signal (O vs. time [samples]).]


Figure 6.43: Reconstruction of x by the exact inverse (left) and Tikhonov-regularized inverse

(right).

chosen as $\lambda = \frac{E\{n^2\}}{E\{x^2\}} \approx \frac{E\{n^2\}}{E\{o^2\}} = 2.03\cdot 10^{-5}$. The third estimate was calculated with the unbiased method of section 6.7. The result can be seen in Fig. 6.44.

To compare the three estimates, first the normalized difference from the original signal was determined. It was computed as

$\mathrm{diff} = \frac{1}{N}\sum_{i=1}^{N}\bigl(x(i) - \hat{x}(i)\bigr)^2.$   (6.63)

To examine the residual distortion of the estimated signals, a sinusoid was fitted to each estimate in the least squares sense and its amplitude was determined. The values of the normalized differences and the amplitudes can be seen in Table 6.2.

The estimate created by the exact inverse differs considerably from the original signal and looks very noisy (Fig. 6.43). In addition, the amplitude estimate of the sinusoid is strongly distorted: it is 1.01 instead of 0.9.

The estimate made by the Tikhonov-regularized inverse has the smallest difference from the original signal and the least noise (right side of Fig. 6.43); however, the signal remains slightly distorted: the amplitude estimate is 0.869 instead of 0.9.

The amplitude estimate of the unbiased estimate is 0.898, which is very close to the original one. However, its normalized least squares difference from the original signal is about double that of the Tikhonov-regularized estimate.

As expected, the unbiased estimate does not perform best from the point of view of the squared difference.


[Figure: estimated signal x̂ vs. time [samples].]

Table 6.2: Comparison of the exact inverse, Tikhonov-regularized and unbiased characteristics.

characteristic type          | normalized difference | amplitude
exact inverse                | 0.03294               | 1.0108
Tikhonov-regularized inverse | 0.00124               | 0.8692
unbiased inverse             | 0.00246               | 0.8978


Chapter 7

Conclusions and future possibilities

7.1

Conclusions

There are thousands of valuable film rolls in the national film archives that cannot be presented to audiences because of their degraded quality. These films are noisy and have high nonlinear distortion, and their quality becomes worse day by day. The quality can be preserved by a copying process; however, this alone is not enough. A restoration process is required to achieve an acceptable sound and image quality for broadcasting.

Several techniques already exist for the compensation of nonlinear distortions; however, most of them are based on pre-compensation or ignore the effects of noise. Since in the case of a film roll only the distorted, very noisy signal is given, the solution has to be based on a post-processing technique and has to be robust against noise. Due to the complexity and difficulties of post-processing, such as the iterative behaviour of several techniques and the sensitivity to noise, only a few applications have been created for nonlinear post-processing. These applications were discussed in section 3.4.

The disadvantages of these techniques are their unsatisfactory robustness and the required computational power. There is still a need for a fast restoration method that may also be used for real-time restoration. Another requirement is the reduction of human interaction in the restoration process. Due to the enormous amount of film data, if the restoration requires too many manual adjustments, it cannot be done within a reasonable time and at a reasonable price.

The aim of this thesis was to find fast algorithms for the restoration of film sound and to explore the possibilities of reducing human interaction as much as possible.

Two new methods were shown for nonlinear restoration. The first method is based on


Tikhonov regularization and produces an output signal that differs little, in the least squares sense, from the original, undistorted signal. Regularization techniques are usually iterative, but it was shown in 6.4 that in the case of memoryless nonlinear distortions, such as the density characteristic of films, a regularized compensation characteristic can be calculated from the original nonlinear distortion without any iteration. The resulting nonlinear characteristic was compared to the least squares solution and the two were close to each other. The resulting characteristic can be stored, for example, in a look-up table, and the restoration itself can also be performed in one step. Hence this part of the algorithm is fast and suitable for real-time applications or extremely fast background data processing.

The determination of the regularized characteristic itself may require some manual and some iterative steps. For example, selecting sound parts for the determination of the noise may require manual work, while the determination of the regularization parameter and of the parameters of the nonlinearity requires iterative steps. However, the iterations only have to be run on a small part of the signal, so they are also quite fast. Little human interaction is required compared to other sound-restoration tasks, such as the removal of short-time disturbances or the proper adjustment of the sound level at certain film parts.

The second restoration method, described in 6.7, produces an unbiased estimate of the original signal. The calculation of the compensation characteristic is an iterative process; a constraint for its convergence and the proof of the convergence are also given in 6.7. The resulting characteristic, just as in the previous method, can be stored in a look-up table, and the restoration itself is a non-iterative process.

For the calculation of these compensation characteristics, knowledge of the noise probability distribution and of the distortion characteristic is required. The first method requires the probability distribution of the original signal as well.

Methods for the determination of the noise probability density function (pdf) were collected in 6.4.2. In the same section, a method was shown that is able to find a good estimate of the regularization parameter of the first type of compensation characteristic without knowledge of the pdf of the original signal. A heuristic explanation of the convergence of the algorithm was given.

Although it is an iterative algorithm, simulations show that its convergence is extremely fast: usually three iteration steps are already enough to get a proper estimate of the regularization parameter. The iteration steps themselves are also fast, because in practice we do not have to use the whole signal for the estimation, only a representative part of it.


Unfortunately, we usually do not have enough information about the original nonlinear distortion, since all we have is a few hundred meters of developed film, with no information about the recording and development parameters. However, using the analytical formula of the film density characteristic and the a priori information that the recorded film contains periodic signal parts, a blind identification method can be developed; it is explained in 6.2.

The identification method, the computation of the regularization parameter and the restoration by the regularized characteristic were tested on synthetic and real audio signals. The proposed method performed well and proved to be robust even in the case of a bandlimited VHS recording.

7.2

Future possibilities

7.2.1

Improved blind identification

A main problem with the identification is that the original signal is modeled by harmonically related sinusoids, therefore the method requires periodic parts of the distorted signal for the analysis. However, audio signals contain only quasi-periodic components, which can be treated as periodic only over a very small interval, e.g. a few tens of samples. This is a very small amount of data, and if it is contaminated by strong noise, the identification can be quite inaccurate.

Longer signal parts could be used if the original signal were modeled differently. In recent years, promising results have been achieved by modeling sound signals with damped sinusoids [152] and damped and delayed sinusoids [153].

With a more sophisticated signal model, however, the number of model parameters increases significantly. In the harmonic model we only had to use a fundamental frequency parameter plus amplitude and phase parameters for the sinusoids. In the case of damped and delayed sinusoids we also have to use decay and delay parameters for every sinusoid, which means almost twice as many parameters as originally. The high number of parameters can easily cause the optimizer algorithm to get stuck in a local minimum. This is an important problem of blind identification that requires further investigation.


7.2.2

Adaptivity

Another important problem is the adaptivity of the method. In this thesis, the nonlinear distortion is assumed to be time invariant. Since one film roll has the same sensitivity along the whole film band, the film sound was recorded with the same basic light intensity, and the concentration of the developer fluid can be treated as constant for one film roll, time invariance of the nonlinearity is a good assumption for film rolls. However, for other applications an automatic, regular determination of the parameters of the nonlinear characteristic could be very useful.

Although the nonlinear distortion of the film can be treated as constant at least over a long part of the film roll, the probability distribution of the noise and of the useful signal can change during the film. In this case, a different amount of regularization at different parts of the film could lead to better sound quality, which means continuous adjustment of the regularization parameter of the nonlinear compensation, like the filter adjustment in noise reduction algorithms [154, 155].

7.2.3

The methods developed in this thesis address only memoryless nonlinearities and are applicable only to the compensation of distortions on variable-density films. However, several old and degraded films were made with the variable-area method. The distortion of these films has to be described as a distortion with memory, which makes their restoration difficult. The correct restoration of such films, and the effects of noise on them, still require further investigation. A possible approach could be the further development of the iterative restoration method described in [98], taking the effect of noise into account as well.


Appendix A

Brief history of film-sound

Sound-recording and film-recording techniques have existed for more than 120 years. The first moving pictures were made separately by several inventors. Eadweard Muybridge made moving pictures of his horse in 1877 [156]. Wordsworth Donisthorpe described and patented a camera in November 1876 in which photographic plates could be exposed in rapid succession [157]; it was further developed by Thomas Alva Edison. Edison also invented the phonograph in 1877, which could record and play back sound [158]. In 1878 Donisthorpe proposed a device in Nature combining his Kinesigraph equipment with Thomas Edison's phonograph as a means of recording and reproducing dramatic performances [159]. However, it took more than 10 years to make the first steps towards binding sight and sound together.

Film-sound recording and playback had two different methods [127]. The older one was needle-sound (sound-on-disc), where the sound was recorded separately by phonograph or gramophone. This method had the best sound quality until the invention of electrical recording in 1926; after that it was not used anymore. Needle-sound was followed by optical sound-tracks, which became a film-sound standard and are still used today (sound-on-film). Magnetic sound-recording (another kind of sound-on-film) appeared only after 1950, when Dolby's technique made it possible to produce high-quality film sound by this method. This method is also preferred today.

A.1

Sound-on-disc sound

The history of film-sound begins in 1889 with William Kennedy Laurie Dickson, Edison's colleague [127]. He tried to make a new movie projector in which he synchronized


Edison's motion picture camera and phonograph [160]. The first working versions of this machine were sold in 1895. These were the so-called peep-shows: due to the lack of sound power, only one person could enjoy the movie at a time. The theatre version of this projector was made only in 1913, using a special mechanical sound amplifier [127].

Leon Gaumont in France began as early as 1901 to work on combining the phonograph and motion picture. He worked on the project during several widely separated intervals (a series of shows of the Film Parlant at the Gaumont Palace in Paris in 1913 and demonstrations in the U.S. were the biggest accomplishments).

An attempt by Carl Laemmle of Paramount in 1907 to exploit a combination of phonograph and motion picture resulted in a German development called Syncroscope. It was handicapped by the short playing time of the record and, after some apparently successful demonstrations, was dropped for want of a supply of pictures with sound to maintain programs in the theaters where it was tried.

Efforts to provide sound for movies were also made by Georges Pomerede, who used flexible shafts or other mechanical connections to combine phonograph and motion pictures in 1907, while E. H. Amet in 1912-1918 used electrical methods for the sound. Wm. H. Bristol began his work on synchronous sound in 1917. There were few further efforts in the U.S. to provide sound for pictures by means of mechanical recording until 1926.

A.2

Sound-on-film sound

The first attempt to make optical sound recordings was made in 1878 by Alexander Blake; however, he could not solve the playback. That was first done by the physicist Ernst Ruhmer in 1901. For recording, he used an arc lamp driven by a carbon microphone through a transformer. The light of the lamp was recorded onto a photosensitive film, which captured the changing light intensity of the lamp as lighter and darker spots. The playback was also performed with an arc lamp and a selenium sensor [161].

Film and optical sound were joined together by Eugene A. Lauste, who worked at Edison's Orange, N.J. lab between 1887-1892 under the direction of W. K. L. Dickson. While working for Edison, Lauste read an 1881 Scientific American article [162] about Bell's Photophone [163] and sought to use this method to record sound on 35mm motion picture film. He applied for a patent in England on Aug. 11, 1906, granted in 1910, for a new and improved method for simultaneously recording and reproducing movements and sounds [127]. His first device used a mechanical grate, then mirrors, and by 1910 he had developed a light gate of a vibrating silicon wire between two magnets. Lauste made many sound films between 1910 and 1914, but was halted by the war.

In 1917 Theodore W. Case developed the Thalofide photocell, which used thallium oxysulfide. By 1922 he had developed the Aeo-light as a source of modulated light. E. I. Sponable worked with Case after 1916, and from 1922 to 1925 Case shared equipment with Lee de Forest. Case and Sponable developed in 1924 a sound recording mechanism for a modified Bell and Howell camera using the Aeo-light tube. After breaking off from de Forest in 1925, Case began to develop a projector sound head, offset 20 frames at a speed of 90 ft. per min., using a narrow slit with a helical filament. General Electric and Western Electric were developing their own sound systems; instead of the Aeo-light they were using mechanical solutions for light modulation, so they did not wish to buy into the Case-Sponable system. William Fox licensed the system on July 23, 1926, and organized the Fox-Case Corp. with Courtland Smith as president to develop what became known as the Movietone News service. Sponable left the Case lab to join Fox in designing the recording studios in New York and Hollywood, and in 1927 he designed a screen that allowed sound to pass through it. The Fox-Case Corp. licensed amplifiers and speakers from Western Electric in 1926 and from ERPI (Electrical Research Products Inc., a Western Electric subsidiary organized in January 1927). The sound quality of these devices ended the life of the sound-on-disc method.

Besides this, in 1918, J. T. Tykociner developed a sound-on-film system at the University of Illinois that used a mercury arc light and a Kunz photocell (a cathode of potassium on silver).

In 1921, Charles A. Hoxie developed a sound film recorder called the Pallophotophone (meaning shaking light sound) at General Electric, a company that had a well-established photographic and motion picture laboratory under C. E. Bathcholtze for company use and publicity. He recorded speeches by President Coolidge, his Secretary of War and others that were broadcast on WGY in Schenectady in 1922. He also developed the Pallotrope, a photoelectric microphone used as the sound pickup. His film soundtracks were of the variable-area type.

GE gave demos in 1926 and 1927 of the Hoxie system with loudspeakers and amplifiers from Bell Labs. The GE system was called the Kinegraphone and was used to exhibit a road show version of the Paramount film Wings in 1927, using multiple-unit cone-and-baffle type loudspeakers in a bank on each side of the screen. The soundhead was placed on top of the projector because sound projectors had not yet been installed in theaters. Film speed was 90 ft. per min (24 fps) and the optical soundtrack was recorded on the edge of the film, the image size having been reduced from 1 inch to 7/8 inch to make room for the variable-area soundtrack. In 1927 the film project was transferred from the Engineering Laboratory to the Radio Dept. for commercial manufacturing. GE would work closely with Westinghouse and RCA in the manufacturing of sound film equipment.

In 1926, one of the first Vitaphone shorts was made with Bryan Foy in the Manhattan Opera House in New York. Vitaphone used a 12-inch or 16-inch disc on a turntable at 33-1/3 rpm for 9-10 minutes, playing from inside to outside, on one side only, with a lateral-cut groove. Victor made the Vitaphone records with much less abrasive filler, causing the discs to wear out after only 24 plays. Vitaphone discs had a needle force of 80-170 grams and a frequency response up to 4300 Hz. The Fox sound-on-film reached 8000 Hz but had more wow and flutter and more noise, caused by the light cell reading the film emulsion grain. The RCA-GE Photophone system used the variable-area method, which had less noise than the Fox variable-density method. Wente's light valve in the Western Electric variable-density sound-on-film method was capable of transmitting up to 8500 Hz.

On Dec. 20, 1926, Western Electric and AT&T created Electrical Research Products Inc. (ERPI) to license non-telephone technology, including Vitaphone, microphones, amplifiers and loudspeakers. On Dec. 31, Fox signed an agreement with ERPI to combine its Movietone sound-on-film method with Western Electric's amplification methods for theater use. This variable-density system would compete for the next decade with the RCA variable-area system, which was adopted by RKO after 1928.

In 1930, after electrical recording had made sound pictures possible from 1926 on, the motion picture soundtrack was standardized as a single-track (monaural) sound-on-film (optical) track on the edge of a 35mm film strip. Some variable-width tracks had a solid black left edge, some had a solid right edge, and some were variable on both edges. In most theaters, a Western Electric 35mm film projector had an optical pickup head that read the variable-width or variable-density image with a selenium photoelectric cell, producing an electrical signal that went to a monaural tube amplifier driving a large horn speaker behind the screen at the front of the theater.

In 1934, RCA introduced a 16mm sound motion picture camera for the amateur market

that recorded an optical soundtrack on the edge of the film.

Nov. 13, 1940 is the premiere date of Walt Disney's Fantasia in New York's Broadway Theater, with a multichannel soundtrack produced by Leopold Stokowski, who recorded an optical track for each section of the orchestra, resulting in 9 separate soundtracks. These were mixed by Stokowski into 4 master optical tracks that were played in synchronization on special equipment made by RCA for a multiple-loudspeaker theater installation called Fantasound; behind the screen were three horns, and around the other walls of the theater were placed 65 smaller speakers. The separation and directionality of the sounds was impressive. However, the system was not practical because of the $85,000 cost to equip each theater, opposition by unions, and a demand by the government that RCA stop manufacturing the necessary sound components because of defense priorities. After the second full installation of equipment at the Carthay Circle Theater in Los Angeles, it was not installed in any other theaters.


Appendix B

Optimal signal restoration in linear systems

B.1

Consider a linear system

$y = A_1 x + A_0,$   (B.1)

whose output is observed with additive noise:

$o = y + n,$   (B.2)

where n is the output noise, assumed to be independent of y. The expected values of x and n are zero, and the variances of x and n are assumed to be finite and different from 0.

We would like to obtain an estimate x̂ of the original input signal using a post-distortion process. The model of the estimation process can be seen in Fig. B.1. x̂ can be expressed as

$\hat{x} = B_1 o + B_0 = B_1 y + B_1 n + B_0 = B_1 A_0 + B_1 A_1 x + B_1 n + B_0.$   (B.3)

The task is to find the proper values of B₀ and B₁. The expected value of the squared error, ε,


[Figure B.1: block diagram of the estimation process: x → y = A₁x + A₀, o = y + n, x̂ = B₁o + B₀.]

between x̂ and x, can be written as

$E\{\varepsilon\} = E\{(\hat{x} - x)^2\} = E\{(B_1 A_0 + B_1 A_1 x + B_1 n + B_0 - x)^2\}$
$= (B_1 A_0 + B_0)^2 + B_1^2 A_1^2 E\{x^2\} - 2 B_1 A_1 E\{x^2\} + B_1^2 E\{n^2\} + E\{x^2\},$   (B.4)

where the cross terms containing E{x}, E{n} and E{xn} vanish because these expected values are zero.

We are looking for the B₀ and B₁ values where E{ε} is minimal. The minimum of eq. (B.4) is at the point where

$\frac{\partial E\{\varepsilon\}}{\partial B_0} = 0 \quad\text{and}\quad \frac{\partial E\{\varepsilon\}}{\partial B_1} = 0.$   (B.5)

The first condition yields

$B_0 = -B_1 A_0,$   (B.6)

regardless of x or n.

For B₁, we can write:

$2 B_1 A_0^2 + 2 B_0 A_0 + 2 B_1 A_1^2 E\{x^2\} + 2 B_1 E\{n^2\} - 2 A_1 E\{x^2\} = 0.$   (B.7)

Substituting eq. (B.6) and solving for B₁:

$B_1 = \frac{1}{A_1 + \frac{E\{n^2\}}{A_1 E\{x^2\}}}.$   (B.8)
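A quick numerical check of eq. (B.8), with arbitrary example values; the quadratic below is eq. (B.4) after the zero-mean simplifications, with B₀ = −B₁A₀ already substituted:

```python
def expected_sq_error(B1, A1, Ex2, En2):
    # E{(x_hat - x)^2} = B1^2*(A1^2*E{x^2} + E{n^2}) - 2*B1*A1*E{x^2} + E{x^2}
    return B1 ** 2 * (A1 ** 2 * Ex2 + En2) - 2 * B1 * A1 * Ex2 + Ex2

A1, Ex2, En2 = 2.0, 1.5, 0.3                 # hypothetical slope and second moments
B1_opt = 1.0 / (A1 + En2 / (A1 * Ex2))       # eq. (B.8)
```

Perturbing B1_opt in either direction only increases the expected squared error, confirming it is the minimiser.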

[Figure: the estimator characteristic, x̂ as a function of o.]

B.2

Consider now a piecewise linear system with an input breakpoint τ:

$y = \begin{cases} A_{10} + A_{11}x, & -\infty < x < \tau \\ A_{20} + A_{21}x, & \tau \le x < \infty \end{cases}$   (B.9)

$o = y + n,$   (B.10)

where n is the output noise, assumed to be independent of y. The expected values of x and n are zero.

We would like to obtain an estimate x̂ of the original input signal using a post-distortion process. The estimator can be represented (fig. B.2) as

$\hat{x} = \begin{cases} B_{10} + B_{11}o, & -\infty < o < \theta \\ B_{20} + B_{21}o, & \theta \le o < \infty \end{cases}$   (B.11)

where $\theta = A_{10} + A_{11}\tau = A_{20} + A_{21}\tau$ is the output breakpoint.

Using eq. (B.11) as the post-distorter, we can distinguish four different cases:

1. when y < θ and y + n < θ, then x̂ = B₁₀ + B₁₁(A₁₀ + A₁₁x + n);
2. when y ≥ θ and y + n < θ, then x̂ = B₁₀ + B₁₁(A₂₀ + A₂₁x + n);
3. when y < θ and y + n ≥ θ, then x̂ = B₂₀ + B₂₁(A₁₀ + A₁₁x + n);
4. when y ≥ θ and y + n ≥ θ, then x̂ = B₂₀ + B₂₁(A₂₀ + A₂₁x + n).


If the probabilities of these four cases are denoted by p₁, p₂, p₃ and p₄ respectively, we can write:

$p_1 = p(o < \theta \,|\, y < \theta), \quad p_2 = p(o < \theta \,|\, y \ge \theta),$
$p_3 = p(o \ge \theta \,|\, y < \theta), \quad p_4 = p(o \ge \theta \,|\, y \ge \theta),$
$p_1 + p_2 = 1 \ \text{if}\ o < \theta, \qquad p_3 + p_4 = 1 \ \text{if}\ o \ge \theta.$   (B.12)

First, examine the case when o < θ. The expected value of the squared difference is

$E\{\varepsilon\} = E\{(p_1(B_{10} + B_{11}(A_{10} + A_{11}x + n)) + p_2(B_{10} + B_{11}(A_{20} + A_{21}x + n)) - x)^2\}.$   (B.13)

E{n} = 0 and E{xn} = 0 eq. (B.13) reduces to:

2

2

2

E{} = B10

+ 2p1 B10 B11 A10 + 2p2 B10 B11 A20 + p21 B11

A210 + 2p1 p2 B11

A10 A20

2

2

2

2

+p22 B11

A220 + B11

E{n2 } + p21 B11

A211 E{x2 } + 2p1 p2 B11

A11 A21 E{x2 }

2

2p1 B11 A11 E{x2 } + p22 B11

A221 E{x2 } 2p2 B11 A21 E{x2 } + E{x2 }

(B.14)

We are looking for the B₁₀ and B₁₁ values where E{ε} is minimal. The minimum of eq. (B.14) is at the point where

$\frac{\partial E\{\varepsilon\}}{\partial B_{10}} = 0 \quad\text{and}\quad \frac{\partial E\{\varepsilon\}}{\partial B_{11}} = 0.$   (B.15)

The first condition yields $B_{10} = -p_1 B_{11}A_{10} - p_2 B_{11}A_{20}$; the second gives

$p_1 B_{10}A_{10} + p_2 B_{10}A_{20} + B_{11}E\{n^2\} + p_1^2 B_{11}A_{10}^2 + 2p_1 p_2 B_{11}A_{10}A_{20} + p_2^2 B_{11}A_{20}^2 + p_1^2 B_{11}A_{11}^2 E\{x^2\} + 2p_1 p_2 B_{11}A_{11}A_{21}E\{x^2\} + p_2^2 B_{11}A_{21}^2 E\{x^2\} - p_1 A_{11}E\{x^2\} - p_2 A_{21}E\{x^2\} = 0.$   (B.16)

Substituting the expression for B₁₀ into eq. (B.16) and solving for B₁₁, we get

$B_{11} = \frac{1}{(p_1 A_{11} + p_2 A_{21}) + \frac{E\{n^2\}}{(p_1 A_{11} + p_2 A_{21})E\{x^2\}}}.$   (B.17)

Similarly, for the case o ≥ θ:

$B_{20} = -p_3 B_{21}A_{10} - p_4 B_{21}A_{20}, \qquad B_{21} = \frac{1}{(p_3 A_{11} + p_4 A_{21}) + \frac{E\{n^2\}}{(p_3 A_{11} + p_4 A_{21})E\{x^2\}}}.$   (B.18)

For a piecewise model with k intervals, [U₁, U₂, …, Uᵢ, …, U_k], the following formulas can be derived:

$B_{i0} = -B_{i1} \sum_{j=1}^{k} p(y \in U_i \,|\, o \in U_j)\, A_{j0},$

$B_{i1} = \frac{1}{\sum_{j=1}^{k} p(y \in U_i | o \in U_j) A_{j1} + \frac{E\{n^2\}}{E\{x^2\}\,\sum_{j=1}^{k} p(y \in U_i | o \in U_j) A_{j1}}}.$   (B.19)

The conditional probabilities can be computed as

$p(y \in U_i \,|\, o \in U_j) = \int_{U_i} \int_{U_j} p_y(\eta)\, p_n(\xi - \eta)\, d\xi\, d\eta,$   (B.20)

where η runs over the distorted-signal values and ξ over the observed values.
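As a consistency check (hypothetical numbers): when the segments coincide, the two-segment gain of eq. (B.17) must reduce to the single-segment gain of eq. (B.8), because the case probabilities sum to one:

```python
A11 = A21 = 2.0                         # identical segment slopes
Ex2, En2 = 1.5, 0.3                     # hypothetical second moments
p1, p2 = 0.35, 0.65                     # p1 + p2 = 1

g = p1 * A11 + p2 * A21                 # averaged slope; equals A11 here
B11 = 1.0 / (g + En2 / (g * Ex2))       # eq. (B.17)
B1 = 1.0 / (A11 + En2 / (A11 * Ex2))    # eq. (B.8)
```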


Appendix C

MATLAB simulation of a realistic photosensitive layer

%Monte-Carlo simulation of transmission vs. exposure characteristic of
%a photosensitive emulsion with lognormally distributed grain sizes
clear all
pack
format long

%interval of interest
x=(0.001:0.001:3.7); %projection area of silver halide particles in um^2

%typical distribution parameters
%these parameters were approximated to the graph in
%T. H. James, The Theory of the Photographic Process,
%MacMillan, 1966, page 39, Fig 2.1
%Histogram of size frequency curve
S = 2.5;
M = 0.07;
c1 = 100 * 1/log(S)*sqrt(2)*sqrt(pi);

%making a typical lognormal distribution:
%number of particles vs. projection area of particles
lognorm_dist = c1 * exp(-(log(x)-log(M)).^2/2/log(S));
%lognorm_dist = lognorm_dist(500:end);
%small particles are insensitive to visible light
lognorm_dist2 = round(lognorm_dist);

%number of silver halide particles in the simulation
N = sum(lognorm_dist2);

%grains(i): number of impacted photons in the simulation
%sensitivity(i): number of photons required to make the i-th grain developable
grains = zeros(N,1);
sensitivity = zeros(N,1);
evaluation_vect = zeros(N,1);
opacity = zeros(N,1);

index1 = 1; index2 = 1;
for i = 1:length(lognorm_dist2),
    index1 = index2;
    index2 = index2 + lognorm_dist2(i);
    sens_avg = round(200/x(i));
    sensitivity(index1:index2-1) = exp(randn(index2-index1,1)+log(sens_avg));
    opacity(index1:index2-1) = x(i)*ones(index2-index1,1);
end;

for i = 1:1000000,
    %expose 500000 randomly chosen grains to one photon each
    index = round((N-1)*rand(500000,1))+1;
    grains(index) = grains(index)+1;
    evaluation_vect = grains-sensitivity;
    indexlist = find(evaluation_vect>0);
    disp(length(indexlist));
    disp(N);
    %total opacity of the developable grains after i exposure steps
    char(i) = sum(opacity(indexlist));
    if(mod(i,50)==0)
        save char.mat char
    end;
    if(mod(i,10)==0)
        figure(1); plot(char); pause(0.01);
    end;
end;


Appendix D

MATLAB realization of computation of regularized nonlinear characteristics

function K=makeinvc2(lambda,x2);
% K=makeinvc2(lambda,x2);
% This function produces a regularized inverse, K[i] of y=myfcn(x)
% in the range [0,x2] with dx=0.00025 resolution.
% lambda is the Tikhonov regularization parameter.
% We assume that myfcn^{-1}(0) = 0.
dx = 0.00025;  %resolution
d  = 0.000001; %step of the numerical differentiation
x_begin=0;
y_begin=0;
x_max=abs(x2);
N=ceil(x_max/dx)+1;
x=x_begin;
K=zeros(1,N+1);
K(1)=y_begin;
for i=2:(N+1),
  dy=(myfcn(x+d)-myfcn(x-d))/d/2;
  dy2=(myfcn(x+dy*dx+d)-myfcn(x+dy*dx-d))/d/2;
  dy=0.5*dy+0.5*dy2;
  dy=dy/(dy*dy+lambda); %Tikhonov-regularized reciprocal of the slope
  K(i)=K(i-1)+dy*dx;
  x=K(i);
end;

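For readers without MATLAB, the same recursion can be sketched in Python. Here myfcn
is chosen as tanh purely for illustration (the thesis leaves the distorting
characteristic to the application), and with a near-zero lambda the result is checked
against the exact inverse atanh:

```python
import math

def myfcn(x):
    # illustrative distortion characteristic (assumption, not from the thesis):
    # tanh is a typical soft-saturation curve
    return math.tanh(x)

def makeinvc2(lam, x2, dx=0.00025, d=1e-6):
    """Regularized inverse of y = myfcn(x) on [0, x2], mirroring Appendix D."""
    n = math.ceil(abs(x2)/dx) + 1
    K = [0.0]*(n + 1)
    x = 0.0
    for i in range(1, n + 1):
        dy = (myfcn(x + d) - myfcn(x - d)) / (2*d)            # slope at x
        dy2 = (myfcn(x + dy*dx + d) - myfcn(x + dy*dx - d)) / (2*d)  # predictor
        dy = 0.5*dy + 0.5*dy2                                  # averaged slope
        dy = dy / (dy*dy + lam)        # Tikhonov-regularized reciprocal slope
        K[i] = K[i-1] + dy*dx
        x = K[i]
    return K

K = makeinvc2(1e-8, 0.5)
# with almost no regularization, K should track atanh on the grid
print(abs(K[-1] - math.atanh(0.5)) < 1e-3)
```

The regularized reciprocal dy/(dy^2 + lambda) stays bounded by 1/(2*sqrt(lambda)),
which is what keeps the inverse from blowing up where the characteristic saturates.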

Appendix E

MATLAB realization of finding the optimal regularization

% o                : the distorted, noisy observation
% n                : signal part from the observation that contains only noise
% x_est            : estimate about the original, undistorted signal
% myinvfcn(y,K,Kx) : the regularized inverse function (see Appendix D)
d = 0.0001;
p_nx = (-0.1:d:0.1);
p_n = hist(n,p_nx);
p_n = p_n/sum(p_n);
p_x_estx = (-5:d:5);
p_x_est = hist(o,p_x_estx);
p_x_est = p_x_est/sum(p_x_est);
% histogram of observation (first estimation about
% input signal p.d.f.)
for k=1:3, %three iterations are enough
  for j=1:20,
    % the regularized inverse characteristics (K, Kx)
    % are already stored, we just select the best one.
    s = int2str(j-1);
    s = ['invchar',s,'.mat'];
    feval('load',s);
    e(j)=0;
    for i=1:length(p_x_est),
      e(j)=e(j)+p_x_est(i)*sum(p_n .* ...
        (myinvfcn(myfcn(p_x_estx(i))+p_nx,K,Kx)-p_x_estx(i)).^2);
    end;
    % the same quantity can also be computed by the helper function:
    % e(j)=myerror(j-1,p_x_est,p_x_estx,p_n,p_nx);
  end;
  elog(k,:)=e;
  [emin,pos]=min(e)
  s = int2str(pos-1);
  s = ['invchar',s,'.mat'];
  feval('load',s);
  x_est=myinvfcn(o,K,Kx);
  p_x_est = hist(x_est,p_x_estx);
  p_x_est = p_x_est/sum(p_x_est);
end;
% pos contains the number of the best regularized characteristic.

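The selection loop can be condensed into a self-contained Python sketch (again with
tanh as an illustrative distortion, a simplified inverse recursion without the
predictor step, and arbitrary candidate lambda values, grids and ranges): one
regularized inverse is built per candidate, the expected reconstruction error is
estimated from the signal and noise distributions, and the lambda with the smallest
error is kept.

```python
import math

def myfcn(x):
    # illustrative distortion (assumption): soft saturation
    return math.tanh(x)

def make_inverse(lam, x2=1.0, dx=0.001, d=1e-6):
    # regularized inverse on the grid y = 0, dx, 2*dx, ... (cf. Appendix D)
    n = math.ceil(x2/dx)
    K, x = [0.0]*(n + 1), 0.0
    for i in range(1, n + 1):
        dy = (myfcn(x + d) - myfcn(x - d)) / (2*d)
        dy = dy / (dy*dy + lam)
        K[i] = K[i-1] + dy*dx
        x = K[i]
    return K

def apply_inverse(y, K, dx=0.001):
    # piecewise-linear table lookup, odd-symmetric, clipped at the table end
    s, a = (1.0, y) if y >= 0 else (-1.0, -y)
    i = min(int(a/dx), len(K) - 2)
    t = a/dx - i
    return s * ((1 - t)*K[i] + t*K[i+1])

# expected reconstruction error for each candidate lambda, averaged over
# a signal grid and a uniform noise grid (both arbitrary example choices)
xs = [i/100 for i in range(-80, 81)]
ns = [i/1000 for i in range(-50, 51)]
def expected_error(lam):
    K = make_inverse(lam)
    return sum(sum((apply_inverse(myfcn(x) + n, K) - x)**2 for n in ns)
               for x in xs) / (len(xs)*len(ns))

lams = [10.0**(-p) for p in range(0, 8)]
errs = [expected_error(l) for l in lams]
best = lams[errs.index(min(errs))]
print(best)
```

Heavy regularization (lambda = 1) produces a strongly biased inverse and a large
expected error, so the selection always moves toward the lightly regularized end for
this mild noise level.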

Appendix F

MATLAB realization of calculation of compensation characteristic for unbiased signal reconstruction

%computation of compensation characteristic
%for unbiased signal reconstruction
%in the case of the erf() function
%and uniformly distributed noise
%selecting the range of interest
do      = 0.00001;
range_o = 0.99999;
range_n = 0.1; %0.05;
o       = (-range_o:do:range_o);
%p.d.f. of uniformly distributed noise
n       = ones(round(2*range_n/do)-1,1)/(2*range_n/do);
d_h     = round(range_n/do)-1;
%first iteration of the compensation characteristic
K_0     = erfinv(o);
kappa   = conv(n,K_0);
kappa2  = kappa(2*d_h+1:end-2*d_h);
Nm1     = K_0(d_h+1:end-d_h);
alpha   = 1;
index   = (1:length(Nm1)); %inner range where kappa2 and Nm1 are defined
K_1     = K_0(index+d_h+1) + alpha*(Nm1(index)-kappa2(index));

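The single correction step computed above, K_1 = K_0 + alpha*(K_0 - n*K_0) on the inner
range, can be checked numerically. In the sketch below (not thesis code) the ideal
inverse erfinv is replaced by a cubic stand-in, for which one step with uniform noise
removes the noise-induced bias exactly, which makes the effect easy to verify; the
grid and noise range are arbitrary illustrative values:

```python
import numpy as np

do = 0.0001
range_o, range_n = 1.0, 0.05
o = np.arange(-range_o, range_o + do/2, do)
m = int(round(range_n/do))                # half-width of the noise p.d.f. in samples
n = np.ones(2*m + 1) / (2*m + 1)          # uniform noise p.d.f. (sums to 1)

g = o**3                                  # stand-in ideal inverse (the thesis uses erfinv)
K0 = g                                    # first iteration of the characteristic

kappa2 = np.convolve(K0, n, mode="valid") # noise-averaged K0, defined on o[m:-m]
Nm1 = K0[m:len(o)-m]
alpha = 1.0
K1 = Nm1 + alpha*(Nm1 - kappa2)           # one unbiasing step, as in the listing

# after the step, the noise-averaged compensated characteristic matches g
check = np.convolve(K1, n, mode="valid")  # defined on o[2m:-2m]
bias0 = np.max(np.abs(kappa2 - Nm1))      # bias before the step
bias1 = np.max(np.abs(check - g[2*m:len(o)-2*m]))  # bias after the step
print(bias0, bias1)
```

For the cubic the smoothing adds exactly a linear term (3*E{n^2}*o), and the correction
subtracts a linear term that the smoothing leaves unchanged, so the residual bias drops
to floating-point noise; for erfinv the step only reduces the bias and the iteration is
repeated.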

Bibliography

[1] S. J. Godsill and P. J. W. Rayner, Digital Audio Restoration – A Statistical Model-Based Approach. Springer-Verlag, 1998, ch. 1.3, pp. 6–7.
[2] P. T. Troughton and S. J. Godsill, "Restoration of Nonlinearly Distorted Audio Using Markov Chain Monte Carlo Methods," Journal of the Audio Engineering Society (Abstracts), vol. 6, p. 569, June 1998, presented at the 104th Convention of the Audio Engineering Society, Amsterdam, May 1998. Paper available from the AES.
[3] J. Tsimbinos, "Identification and Compensation of Nonlinear Distortion," Ph.D. dissertation, Institute for Telecommunications Research, School of Electronic Engineering, University of South Australia, February 1995.
[4] J. G. Wohlbier, "Nonlinear Distortion and Suppression in Traveling Wave Tubes: Insights and Methods," Ph.D. dissertation, University of Wisconsin-Madison, 2003.
[5] P. Kiss, U. Moon, J. Steensgaard, J. Stonick, and G. Temes, "High-speed ΣΔ ADC with error correction," Electronics Letters, vol. 37, no. 2, pp. 76–77, January 2001.
[6] J. Geigel and F. K. Musgrave, "A Model for Simulating the Photographic Development Process on Digital Images," Association for Computing Machinery, Inc., Tech. Rep., 1997.
[7] A. M. Díaz, A. F. Barros, and F. M. Candocia, "Image Registration in Range Using a Constrained Piecewise Linear Model: Analysis and New Results," Proceedings of the 2003 International Conference on Imaging Science, Systems and Technology (CISST'03), Las Vegas, Nevada, vol. 1, pp. 152–158, June 23-26 2003.
[8] J. Schimmel, "Non-linear Dynamics Processing," presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5775.
[9] M. van der Veen and P. Touzelet, "New Vacuum Tube and Output Transformer Models Applied to the Quad II Valve Amplifier," presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5748.
[10] S. Möller, M. Gromowski, and U. Zölzer, "A Measurement Technique for Highly Nonlinear Transfer Functions," Proceedings of the 5th International Conference on Digital Audio Effects (DAFx-02), Hamburg, Germany, pp. DAFX-203–DAFX-206, September 26-28 2002.
[11] C. B. Boyer, A History of Mathematics, second edition. John Wiley, 1991, ch. 18.
[12] K. Weierstrass, Mathematische Werke. Mayer and Müller, Berlin, 1903, vol. 3.
[13] N. M. Blachman, "The Signal-signal, Noise-noise, and Signal-noise Output of a Nonlinearity," IEEE Transactions on Information Theory, vol. IT-14, no. 1, pp. 21–27, January 1968.
[14] N. M. Blachman, "The Uncorrelated Output Components of a Nonlinearity," IEEE Transactions on Information Theory, vol. IT-14, no. 2, pp. 250–255, January 1968.
[15] G. Szegő, Orthogonal Polynomials, American Mathematical Society Colloquium Publications, 1939.
[16] A. A. M. Saleh, "Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers," IEEE Transactions on Communications, vol. COM-29, pp. 1715–1720, November 1981.
[17] A. Ghorbani and M. Sheikhan, "The Effect of Solid State Power Amplifiers (SSPAs) Nonlinearities on MPSK and M-QAM Signal Transmission," Sixth International Conference on Digital Processing of Signals in Communication, pp. 193–197, 1991.
[18] C. Rapp, "Effects of HPA-Nonlinearity on a 4-DPSK/OFDM-Signal for a Digital Sound Broadcasting System," Proceedings of the Second European Conference on Satellite Communications, Liège, Belgium, pp. 179–184, October 22-24 1991.
[19] M. Ibnkahla, J. Sombrin, F. Castanié, and N. J. Bershad, "Neural Networks for Modeling Nonlinear Memoryless Communication Channels," IEEE Transactions on Communications, vol. 45, no. 7, pp. 768–771, July 1997.

[21] R. J. Cox, Photographic Sensitivity – Proceedings of the Symposium on Photographic Sensitivity held at Gonville and Caius College and Little Hall, Cambridge, September 1972. Academic Press, 1973, ch. 25, pp. 375–389.
[22] H. Farid, "Blind Inverse Gamma Correction," IEEE Transactions on Image Processing, vol. 10, no. 2, 2001.
[23] M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems. John Wiley, 1980.
[24] S. Y. Fakhouri, "Identification of Volterra kernels of nonlinear systems," Proceedings of the IEE, Part D, vol. 127, no. 6, pp. 296–304, November 1980.
[25] P. T. Troughton, "Simulation Methods for Linear and Nonlinear Time Series Models with Application to Distorted Audio Signals," Ph.D. dissertation, University of Cambridge, 1999.
[26] H. Tong, Non-Linear Time Series: A Dynamical System Approach. Clarendon Press, Oxford, 1993.
[27] D. Tjøstheim, "Non-linear time series: A selective review," Scandinavian Journal of Statistics, vol. 21, pp. 87–130, 1994.
[28] S. Chen and S. A. Billings, "Representations of non-linear systems: The NARMAX model," International Journal of Control, vol. 49, no. 3, pp. 1013–1032, 1989.
[29] G. Palm, "On representation and approximation of nonlinear systems," Biological Cybernetics, vol. 31, pp. 119–124, 1978.
[30] M. J. Korenberg, "Parallel cascade identification and kernel estimation for nonlinear systems," Annals of Biomedical Engineering, vol. 19, pp. 429–455, 1991.
[31] D. T. Westwick, "Methods for the Identification of Multiple-Input Nonlinear Systems," Ph.D. dissertation, McGill University, Montreal, 1995.
[32] T. L. Sculley and M. A. Brooke, "Nonlinearity Correction Techniques for High Speed, High Resolution A/D Conversion," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 42, no. 3, pp. 154–163, March 1995.

[33] "Nonideal Characteristics in ΣΔ Modulation," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 45, no. 9, pp. 1315–1321, September 1998.
[34] P. Rombouts, J. Raman, and L. Weyten, "An Approach to Tackle Quantization Noise Folding in Double-Sampling ΣΔ Modulation A/D Converters," IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, vol. 50, no. 4, pp. 157–163, April 2003.
[35] P. Rombouts, J. D. Maeyer, and L. Weyten, "A 250-kHz 94-dB Double-Sampling ΣΔ Modulation A/D Converter With a Modified Noise Transfer Function," IEEE Journal of Solid-State Circuits, vol. 38, no. 10, pp. 1657–1662, October 2003.
[36] O. Mitrea, C. Popa, A. M. Manolescu, and M. Glesner, "A curvature-corrected CMOS bandgap reference," Advances in Radio Science, vol. 1, pp. 181–184, 2003.
[37] M. Ortmanns, Y. Manoli, and F. Gerfers, "A New Technique for Automatic Error Correction in ΣΔ Modulators," IEEE ISCAS, 2004, paper 1022.
[38] H. Schurer, C. Slump, and O. Herrmann, "Comparison of Three Methods for Linearization of Electrodynamic Loudspeakers," Workshop on Circuits, Systems and Signal Processing, Mierlo, The Netherlands, pp. 285–290, November 27-28 1996.
[39] L. Cristaldi, A. Ferrero, M. Lazzaroni, and R. Ottoboni, "A Linearization Method for Commercial Hall-Effect Current Transducers," IEEE Transactions on Instrumentation and Measurement, vol. 50, no. 5, pp. 1149–1153, October 2001.
[40] F. Kemenes, Hangfényképezés (Optical sound-recording). Mérnöki Továbbképző Intézet (in Hungarian).
[41] A. R. Kaye, D. A. George, and M. J. Eric, "Analysis and Compensation of Bandpass Nonlinearities for Communications," IEEE Transactions on Communications, vol. 20, no. 5, pp. 965–972, October 1972.
[42] E. Biglieri, S. Barberis, and M. Catena, "Analysis and Compensation of Nonlinearities in Digital Transmission Systems," IEEE Journal on Selected Areas in Communications, vol. 6, no. 1, pp. 42–51, January 1988.

[43] G. Karam and H. Sari, "Analysis of Predistortion, Equalization, and ISI Cancellation Techniques in Digital Radio Systems with Nonlinear Transmit Amplifiers," IEEE Transactions on Communications, vol. 37, no. 12, pp. 1245–1253, December 1989.
[44] S. Pupolin and L. J. Greenstein, "Performance Analysis of Digital Radio Links With Nonlinear Transmit Amplifiers," IEEE Journal on Selected Areas in Communications, vol. SAC-5, pp. 534–546, April 1987.
[45] P.-R. Chang and B.-C. Wang, "Adaptive Decision Feedback Equalization for Digital Satellite Channels Using Multilayer Neural Networks," IEEE Journal on Selected Areas in Communications, vol. 13, no. 2, pp. 316–324, February 1995.
[46] G. Lazzarin, S. Pupolin, and A. Sarti, "Nonlinearity Compensation in Digital Radio Systems," IEEE Transactions on Communications, vol. 42, no. 2/3/4, pp. 988–998, February/March/April 1994.
[47] R. Raich, H. Qian, and G. T. Zhou, "Digital baseband predistortion of nonlinear power amplifiers using orthogonal polynomials," Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), vol. 6, pp. VI-689–VI-692, April 2003.
[48] S. Nadjarah, X. N. Fernando, and R. Sedaghat, "Adaptive digital predistortion of laser diode nonlinearity for wireless applications," Canadian Conference on Electrical and Computer Engineering, IEEE CCECE 2003, vol. 1, pp. 159–162, May 2003.
[49] S. K. Wilson and P. Delay, "A Method to Improve Cathode Ray Oscilloscope Accuracy," IEEE Transactions on Instrumentation and Measurement, vol. 43, no. 3, pp. 483–486, June 1994.
[50] H. Black, "Inventing the Negative Feedback Amplifier," IEEE Spectrum, pp. 55–60, 1977.
[51] G. J. Adams, "Adaptive Control of Loudspeaker Frequency Response at Low Frequencies," Journal of the Audio Engineering Society, May 1983.
[52] A. J. M. Kaizer, "Modelling of the Nonlinear Response of an Electrodynamic Loudspeaker by a Volterra Series Expansion," Journal of the Audio Engineering Society, vol. 35, no. 6, June 1987.

[53] W. Klippel, "The Mirror Filter – A New Basis for Linear Equalization and Nonlinear Distortion Reduction of Woofer Systems," Journal of the Audio Engineering Society, vol. 40, no. 9, p. 675, March 1992.
[54] W. Klippel, "Filter Structures to Compensate for Nonlinear Distortions of Horn Loudspeakers," Journal of the Audio Engineering Society, October 1995, preprint number 4102.
[55] W. Klippel, "Modeling the Nonlinearities in Horn Loudspeakers," Journal of the Audio Engineering Society, vol. 44, no. 6, pp. 470–480, June 1996.
[56] W. Klippel, "Compensation for Nonlinear Distortion of Horn Loudspeakers by Digital Signal Processing," Journal of the Audio Engineering Society, vol. 44, no. 11, pp. 964–972, November 1996.
[57] W. Klippel, "Adaptive Adjustment of Nonlinear Filters Used for Loudspeaker Linearization," Journal of the Audio Engineering Society, vol. 46, no. 11, p. 939, May 1998, preprint number 4646.
[58] W. Klippel, "Diagnosis and Remedy of Nonlinearities in Electrodynamical Transducers," Journal of the Audio Engineering Society, September 2000, preprint number 5261.
[59] H. Schurer, A. G. Nijmeijer, M. A. Boer, C. H. Slump, and O. E. Herrmann, "Identification and Compensation of the Electrodynamic Transducer Nonlinearities," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP'97, vol. 3, pp. 2381–2385, 1997. [Online]. Available: citeseer.nj.nec.com/schurer97identification.html
[60] M. Sternad, M. Johansson, and J. Rutstrom, "Inversion of Loudspeaker Dynamics by Polynomial LQ Feedforward Control," Proceedings of the IFAC Symposium on Robust Control Design, Prague, Czech Republic, vol. 13, 2000. [Online]. Available: citeseer.nj.nec.com/sternad00inversion.html
[61] A. Bellini, G. Cibelli, E. Ugolotti, A. Farina, and C. Morandi, "Non-linear Digital Audio Processor for Dedicated Loudspeaker Systems," IEEE Transactions on Consumer Electronics, vol. 44, no. 3, pp. 1024–1031, August 1998.
[62] A. Stenger, L. Trautmann, and R. Rabenstein, "Nonlinear Acoustic Echo Cancellation With 2nd Order Adaptive Volterra Filters," IEEE International Conference on Acoustics, Speech and Signal Processing, March 1999, Phoenix, USA.

[63] A. Stenger and R. Rabenstein, "Adaptive Volterra Filters for Nonlinear Acoustic Echo Cancellation," Nonlinear Signal and Image Processing (NSIP), June 20-23 1999.
[64] A. Stenger, W. Kellermann, and R. Rabenstein, "Adaptation of Acoustic Echo Cancellers Incorporating a Memoryless Nonlinearity," IEEE Workshop on Acoustic Echo and Noise Control (IWAENC'99), Pocono Manor, PA, USA, 1999.
[65] A. Stenger and W. Kellermann, "Adaptation of a Memoryless Preprocessor for Nonlinear Acoustic Echo Cancelling," Signal Processing, Elsevier, vol. 80, pp. 1747–1760, September 2000.
[66] T. K. Sarkar, D. D. Weiner, and V. K. Jain, "Some Mathematical Considerations in Dealing with the Inverse Problem," IEEE Transactions on Antennas and Propagation, vol. AP-29, no. 2, pp. 373–379, March 1981.
[67] S. Vladimir, "A dekonvolúció és méréstechnikai alkalmazási lehetőségei (Deconvolution and its possible applications in measurement technology)," III. Országos Elektronikus Műszer- és Méréstechnikai Konferencia, March 13-16 1972, in Hungarian.
[68] P. E. Siska, "Iterative unfolding of intensity data with application to molecular beam scattering," The Journal of Chemical Physics, vol. 59, no. 11, pp. 6052–6060, December 1973.
[69] D. Henderson, A. G. Roddie, J. G. Edwards, and H. M. Jones, "A deconvolution technique using least-squares model-fitting and its application to optical pulse measurement," US Department of Commerce, National Technical Information Service, National Physical Laboratory technical report DES-87, 1988.
[70] J. Biemond, R. L. Lagendijk, and R. M. Mersereau, "Iterative Methods for Image Deblurring," Proceedings of the IEEE, vol. 78, no. 5, pp. 856–883, May 1990.
[71] R. Molina, J. Nunez, and J. Mateos, "Image Restoration in Astronomy – A Bayesian Perspective," IEEE Signal Processing Magazine, vol. 18, no. 2, pp. 11–29, March 2001.
[72] S. E. Kiersztyn, "Numerical Correction of HV Impulse Deformed by the Measuring System," IEEE Transactions on Power Apparatus and Systems, vol. PAS-99, no. 5, pp. 1984–1991, Sept/Oct 1980.
[73] I. Kollar, P. Osvath, and W. S. Zaengl, "Numerical Correction and Deconvolution of Noisy HV Impulses by Means of Kalman Filtering," IEEE International Symposium on Electrical Insulation, Boston, Mass., pp. 59–63, June 5-8 1988.

Measurements by Means of Adaptive Filtering and Deconvolution," IEEE Transactions on Power Delivery, vol. 6, no. 2, pp. 501–506, April 1991.
[75] V. Székely, "Identification of RC Networks by Deconvolution: Chances and Limits," IEEE Transactions on Circuits and Systems I: Theory and Applications, vol. 45, no. 3, pp. 244–258, March 1998.
[76] E. Hensel, Inverse Theory and Applications for Engineers. Prentice Hall, 1991.
[77] H. W. Engl, M. Hanke, and A. Neubauer, Regularization of Inverse Problems. Kluwer, Dordrecht, 1996.
[78] A. N. Tikhonov and V. Y. Arsenin, Solutions of Ill-posed Problems. Wiley, 1977.
[79] M. Ulbrich, "A Generalized Tikhonov Regularization for Nonlinear Inverse Ill-Posed Problems," Technische Universität München, Tech. Rep. TUM-M9810, July 1998. [Online]. Available: http://www-lit.mathematik.tu-muenchen.de/reports/
[80] H. W. Engl, K. Kunisch, and A. Neubauer, "Convergence rates for Tikhonov regularisation of nonlinear ill-posed problems," Inverse Problems, vol. 5, pp. 523–540, August 1989. [Online]. Available: stacks.iop.org/0266-5611/5/523
[81] H. W. Engl and P. Kügler, "Nonlinear Inverse Problems: Theoretical Aspects and Some Industrial Applications," to be published by Elsevier, 2003.
[82] M. Gulliksson, "Regularizing nonlinear least squares with applications to parameter estimation," ECMI'98, June 22-27 1998.
[83] K. Kunisch and W. Ring, "Regularization of nonlinear illposed problems with closed operators," Numer. Funct. Anal. Optim., vol. 14, pp. 389–404, 1993. [Online]. Available: citeseer.nj.nec.com/kunisch92regularization.html
[84] U. Amato and W. Hughes, "Maximum entropy regularization of Fredholm integral equations of the first kind," Inverse Problems, vol. 7, pp. 793–808, December 1991. [Online]. Available: stacks.iop.org/0266-5611/7/793
[85] H. W. Engl, "Convergence rates for maximum entropy regularization," SIAM Journal on Numerical Analysis, vol. 30, no. 5, pp. 1509–1536, October 1993.

regularization," Inverse Problems, vol. 12, pp. 35–53, February 1996. [Online]. Available: stacks.iop.org/0266-5611/12/35
[87] A. Mohammad-Djafari, J.-F. Giovannelli, G. Demoment, and J. Idier, "Regularization, maximum entropy and probabilistic methods in mass spectrometry data processing problems," Int. Journal of Mass Spectrometry, vol. 215, no. 1-3, pp. 175–193, Apr. 2002.
[88] R. Acar and C. Vogel, "Analysis of bounded variation penalty method for ill-posed problems," Inverse Problems, vol. 10, no. 6, pp. 1217–1229, 1994. [Online]. Available: citeseer.nj.nec.com/167879.html
[89] L. I. Rudin and S. Osher, "Total variation based image restoration with free local constraints," IEEE International Conference on Image Processing, vol. 1, pp. 31–35, November 13-16 1994.
[90] T. Daboczi and T. B. Bako, "Inverse Filtering of Optical Images," IEEE Transactions on Instrumentation and Measurement, vol. 50, no. 4, pp. 991–994, August 2001.
[91] O. Scherzer, "Explicit versus implicit relative error regularization on the space of functions of bounded variation," Contemporary Mathematics (AMS), vol. 313, pp. 171–198, 2002.
[92] U. Tautenhahn, "On the method of Lavrentiev regularization for nonlinear ill-posed problems," Inverse Problems, vol. 18, pp. 191–207, February 2002. [Online]. Available: stacks.iop.org/0266-5611/18/191
[93] P. K. Lamm, "Future-sequential regularization methods for ill-posed Volterra equations," Journal of Mathematical Analysis and Applications, vol. 195, pp. 469–494, 1995. [Online]. Available: http://www.mth.msu.edu/~lamm/Preprints/JMAA/index.html
[94] M. Lampton, "Damping-undamping strategies for the Levenberg-Marquardt nonlinear least-squares method," Computers in Physics, vol. 11, no. 1, pp. 110–115, Jan/Feb 1997.
[95] Q.-N. Jin, "The analysis of a discrete scheme of the iteratively regularized Gauss-Newton method," Inverse Problems, vol. 16, pp. 1457–1476, October 2000.

[96] P. Deift and X. Zhou, "A steepest descent method for oscillatory Riemann-Hilbert problems," Bulletin of the American Mathematical Society, vol. 26, no. 1, pp. 119–124, Jan 1992.
[97] A. Neubauer, "On Landweber iteration for nonlinear ill-posed problems in Hilbert scales," Numerische Mathematik, vol. 85, pp. 309–328, 2000.
[98] D. Preis and H. Polchlopek, "Restoration of Nonlinearly Distorted Magnetic Recordings," Journal of the Audio Engineering Society, vol. 32, no. 1/2, pp. 26–30, January/February 1984.
[99] E. Haber, U. Ascher, and D. Oldenburg, "On optimization techniques for solving nonlinear inverse problems," Inverse Problems, vol. 16, no. 5, pp. 1263–1280, October 2000. [Online]. Available: citeseer.nj.nec.com/haber00optimization.html
[100] D. M. Goodman, "Deconvolution/Identification Techniques for 1-D Transient Signals," Lawrence Livermore National Laboratory, Laser Engineering Division, Tech. Rep., October 1990.
[101] E. Haber, "Numerical Strategies for the Solution of Inverse Problems," Ph.D. dissertation, University of British Columbia, 1997.
[102] W. L. Gans, "The Measurement and Deconvolution of Time Jitter in Equivalent-Time Waveform Samplers," IEEE Transactions on Instrumentation and Measurement, vol. IM-32, no. 1, pp. 126–133, March 1983.
[103] T. Daboczi, "Deconvolution of transient signals," Ph.D. dissertation, Department of Measurement and Instrumentation Engineering, Technical University of Budapest, August 1994.
[104] T. Daboczi and I. Kollar, "Multiparameter optimization of inverse filtering algorithms," IEEE Transactions on Instrumentation and Measurement, vol. 45, no. 2, pp. 417–421, Apr 1996.
[105] W. Chen, M. Chen, and J. Zhou, "Adaptively Regularized Constrained Total Least-Squares Image Restoration," IEEE Transactions on Image Processing, vol. 9, no. 4, pp. 588–594, April 2000.

[106] S. Roy and M. Souders, "Non-iterative waveform deconvolution using analytic reconstruction filters with time-domain weighting," Instrumentation and Measurement Technology Conference (IMTC), vol. 3, pp. 1429–1434, May 2000.
[107] Elcio H. Shiguemori, H. F. de Campos Velho, J. D. S. da Silva, and F. M. Ramos, "A Parametric Study of a New Regularization Operator: the Non-extensive Entropy," 4th International Conference on Inverse Problems in Engineering, Rio de Janeiro, Brazil, 2002.
[108] M. Bertocco, C. Narduzzi, C. Offelli, and D. Petri, "An improved method for iterative identification of bandlimited linear systems," Instrumentation and Measurement Technology Conference (IMTC), vol. 1, pp. 368–372, May 1991.
[109] B. Parruck and S. M. Riad, "An Optimization Criterion for Iterative Deconvolution," IEEE Transactions on Instrumentation and Measurement, vol. IM-32, no. 1, pp. 137–140, March 1983.
[110] R. Ramlau, "TIGRA – an iterative algorithm for regularizing nonlinear ill-posed problems," Inverse Problems, vol. 19, pp. 433–465, March 2003.
[111] V. A. Morozov, Methods for Solving Incorrectly Posed Problems. Springer, New York, 1984.
[112] U. Tautenhahn and Q.-N. Jin, "Tikhonov regularization and a posteriori rules for solving nonlinear ill posed problems," Inverse Problems, vol. 19, pp. 1–21, February 2003.
[113] P. C. Hansen, "Numerical tools for analysis and solution of Fredholm integral equations of the first kind," Inverse Problems, vol. 8, pp. 849–872, December 1992.
[114] J. Janno, "Lavrentev regularization of ill-posed problems containing nonlinear near-to-monotone operators with application to autoconvolution equation," Inverse Problems, vol. 16, pp. 333–348, April 2000. [Online]. Available: stacks.iop.org/0266-5611/16/333
[115] S. J. Godsill and P. J. W. Rayner, Digital Audio Restoration – A Statistical Model-Based Approach. Springer-Verlag, 1998.
[116] P. T. Troughton and S. J. Godsill, "MCMC methods for restoration of nonlinearly distorted autoregressive signals," Signal Processing, vol. 81, no. 1, pp. 83–97, 2001.

[117] K. Mosegaard and M. Sambridge, "Monte Carlo analysis of inverse problems," Inverse Problems, vol. 18, pp. R29–R54, June 2002.
[118] S. A. White, "Restoration of Nonlinearly Distorted Audio by Histogram Equalization," Journal of the Audio Engineering Society, vol. 30, no. 11, pp. 828–832, November 1982.
[119] S. A. White, Non-linear Signal Processor. US Patent 4315319, 1982.
[120] M. Tsukamoto, K. Matsunaga, O. Morioka, T. Saito, T. Igarashi, H. Yazawa, and Y. Takahashi, "Correction of Nonlinearity Errors Contained in the Digital Audio Signals," presented at the 104th Convention of the Audio Engineering Society, 1998, preprint 4698.
[121] J. Tsimbinos, "New Design Techniques for Radio Frequency Input Stages of Communications Receivers," Proc. IREECON'91 Conference, Sydney, Australia, pp. 542–545, September 16-20 1991.
[122] B. H. Carroll, Introduction to Photographic Theory. Wiley, 1980, ch. 1.
[123] B. H. Carroll, Introduction to Photographic Theory. Wiley, 1980, ch. 7.
[124] F. Hurter and V. Driffield, "Photochemical Investigations and a New Method of Determination of the Sensitiveness of Photographic Plates," The Journal of the Society of Chemical Industry, 31 May 1890.
[125] B. H. Carroll, Introduction to Photographic Theory. Wiley, 1980, ch. 12.
[126] R. J. Cox, Photographic Sensitivity – Proceedings of the Symposium on Photographic Sensitivity held at Gonville and Caius College and Little Hall, Cambridge, September 1972. Academic Press, 1973, ch. 1, pp. 1–25.
[127] R. Fielding, A Technological History of Motion Pictures and Television. University of California Press.
[128] J. Webers, Handbuch der Film- und Videotechnik. München: Franzis, 1993.
[129] USA standard PH22.40-1967, "Dimensions of Photographic Sound Record on 35 mm Motion-Picture Prints," United States of America Standards Institute, April 1967.

[130] Anonymous. New York City: RCA Photophone Inc., 1930. The book can be downloaded.
[131] F. Lohr, A filmszalag útja (The way of film).
[132] J. Webers, Handbuch der Film- und Videotechnik. München: Franzis, 1993, ch. 6.3, pp. 145–146.
[133] DIN 15503, "Film 35 mm – Lichttonwiedergabe – Spurlagen und Spaltbild" (35 mm film – optical sound reproduction – track positions and slit image), August 1968.
[134] H. Lichte and A. Narath, Physik und Technik des Tonfilms, 3rd ed. S. Hirzel, Leipzig, 1945, ch. II, pp. 72–89.
[135] T. B. Bako, B. Bank, and T. Daboczi, "Restoration of Nonlinearly Distorted Audio with the Application to Old Motion Pictures," Proceedings of the AES 20th International Conference on Archiving, Restoration and New Methods of Recording, pp. 191–198, October 5-7 2001, no. 88-65002.
[136] P. T. Troughton, "Bayesian Restoration of Quantised Audio Signals Using a Sinusoidal Model With Autoregressive Residuals," Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, pp. 159–162, October 17-20 1999.
[137] T. B. Bako and T. Daboczi, "Reconstruction of Nonlinearly Distorted Signals with Regularized Inverse Characteristics," Instrumentation and Measurement Technology Conference, IMTC 2001, Proceedings of the 18th IEEE, vol. 3, pp. 1565–1569, May 21-23 2001, no. 01CH37188.
[138] T. B. Bako and T. Daboczi, "Reconstruction of Nonlinearly Distorted Signals With Regularized Inverse Characteristics," IEEE Transactions on Instrumentation and Measurement, vol. 51, no. 5, pp. 1019–1022, 2002.
[139] M. Marzinzik and B. Kollmeier, "Speech Pause Detection for Noise Spectrum Estimation by Tracking Power Envelope Dynamics," IEEE Transactions on Speech and Audio Processing, vol. 10, no. 2, pp. 109–118, February 2002.

IEEE Transactions on Communications, vol. COM-34, no. 6, pp. 630–637, June 1986.
[141] J. Haigh and J. Mason, "A voice activity detector based on cepstral analysis," Proceedings of the European Conference on Speech Technology and Communication, EUROSPEECH'93, vol. 2, pp. 1103–1106, 1993. [Online]. Available: citeseer.nj.nec.com/haigh93voice.html
[142] H.-G. Hirsch, "Estimation of Noise Spectrum and its Application to SNR Estimation and Speech Enhancement," 1993. [Online]. Available: citeseer.nj.nec.com/hirsch93estimation.html
[143] G. Doblinger, "Computationally Efficient Speech Enhancement by Spectral Minima Tracking in Subbands," Proceedings of the European Conference on Speech Technology and Communication, EUROSPEECH'95, Madrid, Spain, pp. 1513–1516, September 1995. [Online]. Available: citeseer.nj.nec.com/doblinger95computationally.html
[144] R. Martin, "Noise Power Spectral Density Estimation Based on Optimal Smoothing and Minimum Statistics," IEEE Transactions on Speech and Audio Processing, vol. 9, no. 5, pp. 504–512, July 2001.
[145] I. Cohen and B. Berdugo, "Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement," IEEE Signal Processing Letters, vol. 9, no. 1, pp. 12–15, January 2002.
[146] I. Cohen, "Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging," IEEE Transactions on Speech and Audio Processing, 2003.
[147] V. Stahl, A. Fischer, and R. Bippus, "Quantile Based Noise Estimation for Spectral Subtraction and Wiener Filtering," Proceedings of the International Conference on Acoustics, Speech and Signal Processing, ICASSP'00, vol. 3, pp. 1875–1878, 2000. [Online]. Available: citeseer.nj.nec.com/stahl00quantile.html
[148] P. Sovka, P. Pollak, and J. Kybic, "Extended Spectral Subtraction," European Signal Processing Conference (EUSIPCO'96), Trieste, Italy, September 1996. [Online]. Available: citeseer.nj.nec.com/sovka96extended.html

[149] T. B. Bako, T. Daboczi, and B. A. Bell, Automatic Compensation of Nonlinear Distortions, Instrumentation and Measurement Technology Conference, 2002. IMTC/2002.

Proceedings of the 19th IEEE, vol. 2, pp. 1321 1357, May 21-23 2002, no. 00CH37276.

[150] N. S. Nahman and M. E. Guillaume, Deconvolution of Time Domain Waveforms in

the Presence of Noise, National Bureau of Standards, NBS, Boulder, CO. USA, Tech.

Note 1047, 1981.

[151] T. B. Bako and T. Daboczi, Unbiased Reconstruction of Nonlinear Distortions, Instrumentation and Measurement Technology Conference, 2002. IMTC/2002. Proceedings of the 19th IEEE, vol. 2, pp. 1099 1102, May 21-23 2002, no. 00CH37276.

[152] Y. Grenier and B. David, Extraction of weak background transients from audio signals, Presented at the 114th Convention of the Audio Engineering Society, Amsterdam, Netherlands, March 22-25 2003, preprint 5774.

[153] R. Boyer and K. Abed-Meraim, Efficient Parametric Modeling for Audio Transients,

Proceedings of the 5th International Conference on Digital Audio Effects (DAFx-02),

Hamburg, Germany, pp. DAFX97 DAFX100, September 26-28 2002.

[154] S. Canazza, G. de Poli, G. A. Mian, and A. Scarpa, Objective comparison of audio

restoration methods based on Short Time Spectral Attenuation, Proceedings of Science and Technology for the safeguard of Cultural Heritage in the mediterranean basin,

Alcala de Henares, Spain, pp. 173174, July 9-14 2001.

[155] ———, "Comparison of different audio restoration methods based on frequency and time domains with applications on electronic music repertoire," Proceedings of the International Computer Music Conference, Goteborg, Sweden, pp. 104–109, September 16-21 2002.

[156] E. Muybridge, Animals in Motion. Dover Publications, 1957.

[157] S. Herbert and M. Heard, Industry, Liberty, and a Vision... Wordsworth Donisthorpe's Kinesigraph. The Projection Box, London, 1998.

[158] Anonymous, "The Talking Phonograph," Scientific American, 22 December 1877. [Online]. Available: http://history.acusd.edu/gen/recording/tinfoil77.html


[159] … [Online]. Available: http://histv2.free.fr/19/donisthorpe.htm

[160] W. K. L. Dickson, "A Brief History of the Kinetograph, the Kinetoscope and the Kinetophonograph," SMPE Journal (Society of Motion Picture Engineers), vol. 21, December 1933.

[161] E. Ruhmer, "The Photographophone," Scientific American, 20 July 1901. [Online]. Available: http://www.fsfl.home.se/backspegel/ruhmer.html

[162] Anonymous, "Bell's Photophone," Scientific American, vol. 44, no. 1, pp. 1–2, January 1881. [Online]. Available: http://histv2.free.fr/bell/bellnotice.htm

[163] A. G. Bell, "On the Production and Reproduction of Sound by Light," American Journal of Science, vol. XX, no. 118, pp. 305–324, October 1880. [Online]. Available: http://histv2.free.fr/bell/bell1.htm

