
CHAPTER 1

1.1 Introduction
Noise is a random fluctuation in an electrical signal, a characteristic of all electronic
circuits. Noise generated by electronic devices varies greatly, as it can be produced by
several different effects. In communication systems, the noise is an error or undesired
random disturbance of a useful information signal. Denoising is the extraction of a signal
from a mixture of signal and noise. This is the first step in many applications.
In this project, the DWT is used for de-noising a one-dimensional signal. Linear methods
of de-noising (such as filtering) have the drawback of either removing sharp features
(sudden changes) or not completely removing the noise. Wavelet de-noising is a non-linear
method that separates the signal from noise by comparing coefficient amplitudes rather
than spectra.

1.2 Aim of the project


The aim of the project is to de-noise a real time signal and to design a suitable
architecture for high speed implementation.

1.3 Methodology
The test signal is initially analyzed in MATLAB using a suitable mother wavelet. Then it
is decomposed into the required number of levels and de-noised using a suitable threshold
rule.
De-noising using DWT is realized using the concept of Parallel Distributed Arithmetic in
VHDL for high computational speed.
The results of MATLAB and VHDL are compared.

1.4 Significance of the Work
The significance of this work is that Wavelet Transforms are used to denoise the signal
instead of general techniques (like filtering). Recently the Wavelet Transform has gained
a lot of popularity in the field of signal processing. This is due to its capability of
providing both time and frequency information simultaneously, hence giving a time-
frequency representation of the signal. The traditional Fourier Transform can only
provide spectral information about a signal. Moreover, the Fourier method only works for
stationary signals. In many real world applications, the signals are non-stationary. One
solution for processing non-stationary signals is the Wavelet Transform. Currently there
is tremendous focus on the application of Wavelet Transform for real time signal
processing like De-noising and Compression.

1.5 Organization of the Report

Chapter 2 consists of a literature review related to the topic of the work, i.e. wavelet
transforms and the implementation of Distributed Arithmetic.
Chapter 3 describes the detailed procedure adopted to de-noise the signal.
Chapter 4 presents the results and waveforms.

CHAPTER 2

2.1 WAVELET TRANSFORMS

2.1.1 Introduction

Mathematical transformations are applied to signals to obtain further information from
the signal that is not readily available in the raw signal. Usually a time-domain signal is
taken as the raw signal, and a signal that has been transformed by any of the available
mathematical transformations is regarded as a processed signal.

There are a number of transformations that can be applied, among which the Fourier
transforms are probably by far the most popular. This section examines how the wavelet
transform overcomes some of the drawbacks of the Fourier transform.

2.1.2 Time Domain Analysis

Most signals in practice are time-domain signals in their raw format. That is, whatever
the signal is measuring is a function of time. In other words, when the signal is
plotted, one of the axes is time (the independent variable) and the other (the dependent
variable) is usually the amplitude. When time-domain signals are plotted, a time-amplitude
representation of the signal will be obtained. This representation is not always the best
representation of the signal for most signal processing related applications. In many
cases, the most distinguished information is hidden in the frequency content of the signal.
The frequency spectrum of a signal is basically the frequency components (spectral
components) of that signal. The frequency spectrum of a signal shows what frequencies
exist in the signal.

For example, the electric power used in daily life in the US is at 60 Hz. This
means that if the electric current is plotted, it will be a sine wave passing through the
same point 60 times in one second. In the figures below, the first is a sine wave at 3 Hz,
the second at 10 Hz, and the third at 50 Hz.

Fig 2.1 sine waves with different frequencies

To find the frequency content of a signal, we use the FOURIER TRANSFORM (FT).

2.1.3 Frequency Domain Analysis

2.1.3.1 The Fourier Transform

If the FT of a signal in the time domain is taken, the frequency-amplitude
representation of that signal is obtained. In other words, the signal is plotted with one axis
being the frequency and the other being the amplitude. This plot tells how much of each
frequency exists in the signal.

Fig 2.2 shows the FT of the 50 Hz signal.

Fig 2.2 FT of the 50 Hz sine wave

Often times, the information that cannot be readily seen in the time-domain can be
seen in the frequency domain.

The best example is that of an ECG signal (electrocardiogram, a graphical
recording of the heart's electrical activity). The typical shape of a healthy ECG signal is well
known to cardiologists. Any significant deviation from that shape is usually considered to
be a symptom of a pathological condition.

This pathological condition, however, may not always be quite obvious in the
original time-domain signal. Cardiologists usually use the time-domain ECG signals
which are recorded on strip-charts to analyze ECG signals. Recently, the new
computerized ECG recorders/analyzers also utilize the frequency information to decide
whether a pathological condition exists. A pathological condition can sometimes be
diagnosed more easily when the frequency content of the signal is analyzed.

This, of course, is only one simple example why frequency content might be
useful. Today Fourier transforms are used in many different areas including all branches
of engineering.

2.1.4. Drawbacks of Fourier Transform

Although the FT is probably the most popular transform in use (especially in
electrical engineering), it is not the only one. There are many other transforms that are
used quite often by engineers and mathematicians. The Hilbert transform, short-time Fourier
transform, Wigner distributions, the Radon transform, and the wavelet transform
constitute only a small portion of a huge list of transforms that are available at the
engineer's and mathematician's disposal. Every transformation technique has its own area of
application, with advantages and disadvantages, and the wavelet transform (WT) is no
exception.

The FT is a reversible transform; that is, it allows one to go back and forth between the
raw and processed (transformed) signals. However, only one of them is available at any
given time. That is, no frequency information is available in the time-domain signal, and
no time information is available in the Fourier-transformed signal. But sometimes it is
necessary to have both the time and the frequency information at the same time,
depending on the particular application and the nature of the signal in hand. To better
understand the drawbacks of the Fourier transform, it is necessary to understand the concept
of stationary and non-stationary signals.

2.1.5 Stationary and Non-Stationary Signals

The FT gives the frequency information of the signal, which means that it tells us how
much of each frequency exists in the signal, but it does not tell us when in time these
frequency components exist. This information is not required when the signal is
stationary. Signals whose frequency content does not change in time are called stationary
signals. In other words, the frequency content of stationary signals does not change in time.
In this case, one does not need to know at what times the frequency components exist,
since all frequency components exist at all times.

For example, the following signal

x(t) = cos(2*pi*10*t) + cos(2*pi*25*t) + cos(2*pi*50*t) + cos(2*pi*100*t)

is a stationary signal, because it has frequencies of 10, 25, 50, and 100 Hz at any given
time instant. This signal is plotted below:

Figure 2.3 Stationary Signal

And the following is its FT:

Figure 2.4 FT of Stationary signal

It contains the four spectral components corresponding to the frequencies 10, 25, 50 and
100 Hz.
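
The following is a minimal MATLAB sketch of how this signal and its spectrum can be reproduced; the 1 kHz sampling rate, the 1 s duration and the plotting details are assumptions made only for illustration, not values taken from the report.

% Generate the stationary four-tone signal and plot its magnitude spectrum.
fs = 1000;
t  = 0:1/fs:1-1/fs;
x  = cos(2*pi*10*t) + cos(2*pi*25*t) + cos(2*pi*50*t) + cos(2*pi*100*t);

X  = abs(fft(x))/length(x);                        % magnitude spectrum
f  = (0:length(x)-1)*fs/length(x);

subplot(2,1,1); plot(t, x);                   xlabel('Time (s)');       title('Stationary signal');
subplot(2,1,2); plot(f(1:end/2), X(1:end/2)); xlabel('Frequency (Hz)'); title('Magnitude of the FT');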

Contrary to the signal in Figure 2.3, the following signal is not stationary. Figure
2.5 plots a signal whose frequency constantly changes in time. This is a non-stationary
signal.

Figure 2.5 Non Stationary signal

And the following is its FT:

Figure 2.6 FT of Non Stationary signal

The small ripples in this spectrum are due to the sudden transitions from one frequency
component to another, and they have no significance here.

Now, comparing Figures 2.4 and 2.6, the similarity between the two spectra is
apparent. Both of them show four spectral components at exactly the same frequencies,
i.e., at 10, 25, 50, and 100 Hz. Other than the ripples and the difference in amplitude
(which can always be normalized), the two spectra are almost identical, although the
corresponding time-domain signals are not even close to each other. Both of the signals
involve the same frequency components, but the first one has these frequencies at all
times, while the second one has them in different intervals. So, the FT gives the
spectral content of the signal, but it gives no information regarding where in time those
spectral components appear. Therefore, the FT is not a suitable technique for non-stationary
signals.

2.1.6. Need for Time Frequency Representation

The FT can be used for non-stationary signals if we are only interested in what spectral
components exist in the signal and not in where they occur. However, if this
information is needed, i.e., if we want to know what spectral components occur at what
time (interval), then the Fourier transform is not the right transform to use.

For practical purposes it is difficult to make the separation, since there are a lot of
practical stationary signals, as well as non-stationary ones. Almost all biological signals,
for example, are non-stationary. Some of the most famous ones are ECG (electrical
activity of the heart, electrocardiograph), EEG (electrical activity of the brain,
electroencephalograph), and EMG (electrical activity of the muscles, electromyogram).

When the time localization of the spectral components is needed, a transform
giving the TIME-FREQUENCY REPRESENTATION of the signal is needed. The next
transform developed to serve this purpose is the Short Term Fourier Transform.

2.1.7. Short Term Fourier Transform

There is only a minor difference between the STFT and the FT. In the STFT, the signal is
divided into segments small enough that each segment (portion) of the signal can be
assumed to be stationary. For this purpose, a window function "w" is chosen. The width
of this window must be equal to the segment of the signal where the stationarity assumption is valid.

This window function is first located at the very beginning of the signal, that is, the
window function is located at t=0. Suppose that the width of the window is "T" seconds. At this
time instant (t=0), the window function will overlap with the first T/2 seconds of the signal. The
window function and the signal are then multiplied. By doing this, only the first T/2
seconds of the signal is being chosen, with the appropriate weighting of the window (if
the window is a rectangle with amplitude "1", then the product will be equal to the
signal). This product is then treated as just another signal, whose FT is to be taken.
In other words, the FT of this product is taken, just as one would take the FT of any signal.

The result of this transformation is the FT of the first T/2 seconds of the signal. If
this portion of the signal is stationary, as it is assumed, then there will be no problem and
the obtained result will be a true frequency representation of the first T/2 seconds of the
signal.

The next step is to shift this window (by some t1 seconds) to a new location,
multiply it with the signal, and take the FT of the product. This procedure is followed
until the end of the signal is reached, by shifting the window in "t1"-second intervals.

The following definition of the STFT summarizes all the above explanations in one
line

STFTx(t', f) = ∫ [x(t) · w*(t - t')] · e^(-j2πft) dt        Equ 2.1

In the above equation x(t) is the signal itself, w(t) is the window function, and * is
the complex conjugate. From the equation, we can observe that the STFT of the signal is
nothing but the FT of the signal multiplied by a window function.

For every t' and f a new STFT coefficient is computed.

Fig 2.7 STFT coefficients computed for every t’ and f

The Gaussian-like functions in color are the windowing functions. The red one
shows the window located at t=t1', the blue shows t=t2', and the green one shows the
window located at t=t3'. These will correspond to three different FTs at three different
times. Therefore, we will obtain a true time-frequency representation (TFR) of the signal.

Since the STFT is a function of both time and frequency (unlike the FT, which is a function of
frequency only), the transform is two-dimensional (three-dimensional, if the amplitude is
counted as well).

Consider a non-stationary signal, such as the following one:

Figure 2.8 Non stationary signal

In this signal, there are four frequency components at different times. The interval 0
to 250 ms is a simple sinusoid of 300 Hz, and the other 250 ms intervals are sinusoids of
200 Hz, 100 Hz, and 50 Hz, respectively. Apparently, this is a non-stationary signal.
Below is its STFT:

Fig 2.9 STFT of Non stationary signal

This is a two-dimensional plot (three-dimensional, if the amplitude is counted as well). The "x" and
"y" axes are time and frequency, respectively. The graph is symmetric with respect to the
midline of the frequency axis; since the FT of a real signal is always symmetric and the STFT is
nothing but a windowed version of the FT, the STFT is also symmetric in frequency. The
symmetric part is said to be associated with negative frequencies.

There are four peaks corresponding to four different frequency components. Also
note that, unlike FT, these four peaks are located at different time intervals along the time
axis.

From the above figure, not only is it possible to know what frequency components are
present in the signal, but we also know where they are located in time.

The implicit problem of the STFT is not obvious in the above example. The
problem with the STFT is the Heisenberg Uncertainty Principle. This principle, originally
applied to the momentum and location of moving particles, can be applied to the time-
frequency information of a signal. Simply, this principle states that one cannot know the
exact time-frequency representation of a signal, i.e., one cannot know what spectral
components exist at what instants of time. What one can know are the time intervals in
which a certain band of frequencies exists, which is a resolution problem.

2.1.8. The Resolution Problem

The problem with the STFT has something to do with the width of the window
function that is used. To be technically correct, this width of the window function is
known as the support of the window. If the window function is narrow, then it is known
as compactly supported.

In the FT there is no resolution problem in the frequency domain, i.e., it is known
exactly what frequencies exist; similarly, there is no time resolution problem in the time
domain, since we know the value of the signal at every instant of time. Conversely, the
time resolution in the FT, and the frequency resolution in the time domain, are zero, since
we have no information about them. What gives the perfect frequency resolution in the
FT is the fact that the window used in the FT is its kernel, the exp{jwt} function, which
lasts at all times from minus infinity to plus infinity. In the STFT, our window is of
finite length, so it covers only a portion of the signal, which causes the frequency
resolution to become poorer, i.e. we no longer know the exact frequency components that exist
in the signal, but only a band of frequencies that exist:

In the FT, the kernel function allows us to obtain perfect frequency resolution,
because the kernel itself is a window of infinite length. In the STFT the window is of finite
length, and we no longer have perfect frequency resolution.

If a window of infinite length is used, we get the FT, which gives perfect
frequency resolution but no time information. Furthermore, in order to satisfy the
stationarity assumption, we have to have a short enough window, in which the signal is stationary.
The narrower we make the window, the better the time resolution and the better the
assumption of stationarity, but the poorer the frequency resolution:

Narrow window ===>good time resolution, poor frequency resolution.

Wide window ===>good frequency resolution, poor time resolution.

In order to see these effects, consider four windows of different lengths used to compute
the STFT. The window function we use is simply a Gaussian function of the form:

w(t) = exp(-a*(t^2)/2);

where a determines the length of the window and t is the time. The following figure
shows four window functions of varying regions of support, determined by the value of a.
Note the length of each window. The example given above was computed with the
second value, a=0.001. The STFT of the same signal given above is now computed with the
other windows.

Figure 2.10 Different windows
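
The windowing procedure described above can also be sketched directly in MATLAB. In the following, the piecewise test signal, the hop size and the value of a are assumptions chosen only to illustrate the procedure; they are not the exact settings used to generate the figures.

% Slide a Gaussian window w(t) = exp(-a*t^2/2) along the signal and take the
% FT of each windowed segment.
fs = 1000;   t = 0:1/fs:1-1/fs;
x  = [cos(2*pi*300*t(t<0.25)) cos(2*pi*200*t(t>=0.25 & t<0.5)) ...
      cos(2*pi*100*t(t>=0.5 & t<0.75)) cos(2*pi*50*t(t>=0.75))];
a   = 0.001;                                 % window-width parameter (smaller a = wider window)
hop = 0.01;                                  % window shift t1 in seconds (assumed)
centers = 0:hop:1-1/fs;
S = zeros(length(x), numel(centers));
for m = 1:numel(centers)
    w = exp(-a*((t - centers(m))*fs).^2/2);  % Gaussian window centred at t'
    S(:,m) = abs(fft(x .* w)).';             % FT of the windowed segment
end
imagesc(centers, (0:length(x)-1)*fs/length(x), S); axis xy;
xlabel('Time (s)'); ylabel('Frequency (Hz)'); title('|STFT| with a Gaussian window');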

Figure 2.11 STFT found using narrow window

The figure shows the STFT computed with the narrowest window. The four peaks are well
separated from each other in time. In the frequency domain, however, every peak covers a range of
frequencies instead of a single frequency value. Making the window wider (i.e. using the
third window; the second one was already shown in the first example), the STFT is as
shown:

Figure 2.12 STFT found using wider window

The peaks are not as well separated from each other in time as in the previous case;
however, in the frequency domain the resolution is much better. Further increasing the
width of the window, the STFT is plotted as shown:

Figure 2.13 Showing Resolution problem

These examples illustrate the implicit resolution problem of the STFT. Anyone
who would like to use the STFT is faced with this problem of resolution. Narrow windows
give good time resolution but poor frequency resolution. Wide windows give good
frequency resolution but poor time resolution; furthermore, wide windows may violate
the condition of stationarity. The problem is a result of choosing a window function once
and for all, and using that window for the entire analysis. If the frequency components are
well separated from each other in the original signal, then we may sacrifice some
frequency resolution and go for good time resolution, since the spectral components are
already well separated from each other. However, if this is not the case, then a good
window function can be difficult to find.

Although the time and frequency resolution problems are the result of a physical
phenomenon (the Heisenberg uncertainty principle) and exist regardless of the transform
used, it is possible to analyze any signal by using an alternative approach called
multiresolution analysis (MRA). MRA, as implied by its name, analyzes the signal at
different frequencies with different resolutions. Every spectral component is not resolved
equally, as was the case in the STFT.

MRA is designed to give good time resolution and poor frequency resolution at
high frequencies, and good frequency resolution and poor time resolution at low
frequencies. This approach makes sense especially when the signal at hand has high
frequency components for short durations and low frequency components for long
durations. Fortunately, the signals encountered in practical applications are often
of this type: for example, a signal with a relatively low frequency component throughout
the entire signal and relatively high frequency components for a short duration somewhere
around the middle.

2.1.9. THE CONTINUOUS WAVELET TRANSFORM

The continuous wavelet transform was developed as an alternative approach to
the short time Fourier transform to overcome the resolution problem. The wavelet
analysis is done in a similar way to the STFT analysis, in the sense that the signal is
multiplied with a function, similar to the window function in the STFT, and the transform
is computed separately for different segments of the time-domain signal. However, there
are two main differences between the STFT and the CWT:

1. The Fourier transforms of the windowed signals are not taken, and therefore a single
peak will be seen corresponding to a sinusoid, i.e., negative frequencies are not
computed.

2. The width of the window is changed as the transform is computed for every single
spectral component, which is probably the most significant characteristic of the wavelet
transform.

The continuous wavelet transform is defined as follows

CWTx(tau, s) = (1/sqrt(|s|)) · ∫ x(t) · psi*((t - tau)/s) dt        Equ 2.2
As seen in the above equation, the transformed signal is a function of two variables, tau
and s, the translation and scale parameters, respectively. psi(t) is the transforming
function, and it is called the mother wavelet. The term mother wavelet gets its name due
to two important properties of the wavelet analysis, as explained below:

The term wavelet means a small wave. The smallness refers to the condition that this
(window) function is of finite length (compactly supported). The wave refers to the
condition that this function is oscillatory. The term mother implies that the functions with
different region of support that are used in the transformation process are derived from
one main function, or the mother wavelet. In other words, the mother wavelet is a
prototype for generating the other window functions.

The term translation is used in the same sense as it was used in the STFT; it is related to
the location of the window, as the window is shifted through the signal. This term,
obviously, corresponds to time information in the transform domain. However, there is
no frequency parameter, as we had before for the STFT. Instead, we have scale parameter
which is defined as 1/frequency. The term frequency is reserved for the STFT.

2.1.9.1The Scale

The parameter scale in the wavelet analysis is similar to the scale used in maps. As in the
case of maps, high scales correspond to a non-detailed global view (of the signal), and
low scales correspond to a detailed view. Similarly, in terms of frequency, low
frequencies (high scales) correspond to a global information of a signal (that usually
spans the entire signal), whereas high frequencies (low scales) correspond to a detailed
information of a hidden pattern in the signal (that usually lasts a relatively short time).

Cosine signals corresponding to various scales are given as examples in the following
figure.

Figure 2.14 Signal corresponding to various scales

Fortunately in practical applications, low scales (high frequencies) do not last for the
entire duration of the signal, unlike those shown in the figure, but they usually appear
from time to time as short bursts, or spikes. High scales (low frequencies) usually last for
the entire duration of the signal.

Scaling, as a mathematical operation, either dilates or compresses a signal. Larger scales
correspond to dilated (or stretched out) signals and smaller scales correspond to
compressed signals. All of the signals given in the figure are derived from the same
cosine signal, i.e., they are dilated or compressed versions of the same function. In the
above figure, s=0.05 is the smallest scale, and s=1 is the largest scale.

In terms of mathematical functions, if f(t) is a given function, f(st) corresponds to a
contracted (compressed) version of f(t) if s > 1, and to an expanded (dilated) version of
f(t) if s < 1.

However, in the definition of the wavelet transform, the scaling term is used in the
denominator, and therefore the opposite of the above statements holds, i.e., scales s > 1
dilate the signal whereas scales s < 1 compress the signal. This interpretation of
scale will be used throughout this text.

2.1.9.2. COMPUTATION OF THE CWT

Continuous Wavelet Transform can be computed in five steps. The continuous wavelet
transform is the sum over all time of the signal multiplied by scaled, shifted versions of
the wavelet. This process produces wavelet coefficients that are a function of scale and
position.

1. Take a wavelet and compare it to a section at the start of the original signal.

2. Calculate a number, C, that represents how closely correlated the wavelet is with this
section of the signal. The higher C is, the greater the similarity. More precisely, if the
signal energy and the wavelet energy are equal to one, C may be interpreted as a
correlation coefficient. The results will depend on the shape of the wavelet chosen.

3. Shift the wavelet to the right and repeat steps 1 and 2 until the whole signal is
covered.

4. Scale (stretch) the wavelet and repeat steps 1 through 3.

5. Repeat steps 1 through 4 for all scales.


When the process is done, the coefficients produced at different scales by different
sections of the signal are obtained. The coefficients constitute the results of a regression
of the original signal performed on the wavelets. A plot can be made on which the x-
axis represents position along the signal (time), the y-axis represents scale, and the color
at each x-y point represents the magnitude of the wavelet coefficient C.

Figure 2.15 Represents wavelet coefficients at each x-y point
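
The five steps can also be written out directly. In the MATLAB sketch below, the Mexican-hat mother wavelet, the scale set and the test signal are assumptions used purely for illustration; in practice a toolbox routine would be used instead.

% Correlate shifted, scaled copies of a wavelet with the signal.
fs = 1000;   t = 0:1/fs:1-1/fs;
x  = cos(2*pi*20*t) + 0.5*cos(2*pi*80*t);
psi = @(u) (1 - u.^2) .* exp(-u.^2/2);       % Mexican-hat mother wavelet
scales = 0.002:0.002:0.1;                    % set of scales s (assumed)
C = zeros(numel(scales), numel(t));
for i = 1:numel(scales)
    s = scales(i);
    for k = 1:numel(t)                       % steps 1 and 3: shift along the signal
        w      = psi((t - t(k))/s) / sqrt(s);% scaled, shifted wavelet
        C(i,k) = sum(x .* w) / fs;           % step 2: correlation value C
    end
end                                          % steps 4 and 5: repeat for all scales
imagesc(t, scales, abs(C)); axis xy;
xlabel('Translation tau (s)'); ylabel('Scale s'); title('|CWT| coefficients');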

2.1.10. Need of Discrete Wavelet Transform

The discretized continuous wavelet transform enables the computation of the continuous
wavelet transform by computers; it is not a true discrete transform. As a matter of fact,
the wavelet series is simply a sampled version of the CWT, and the information it
provides is highly redundant as far as the reconstruction of the signal is concerned. This
redundancy, on the other hand, requires a significant amount of computation time and
resources. The discrete wavelet transform (DWT), on the other hand, provides sufficient
information both for analysis and synthesis of the original signal, with a significant
reduction in the computation time.

The DWT is considerably easier to implement when compared to the CWT. The DWT
analyzes the signal at different frequency bands with different resolutions by
decomposing the signal into a coarse approximation and detail information. DWT
employs two sets of functions, called scaling functions and wavelet functions, which are
associated with low pass and high pass filters, respectively. The decomposition of the
signal into different frequency bands is simply obtained by successive high pass and low
pass filtering of the time domain signal.

2.1.11. One-Stage Filtering: Approximations and Details

For many signals, the low-frequency content is the most important part. It is what gives
the signal its identity. The high-frequency content, on the other hand, imparts flavor or
nuance. Consider the human voice. If the high-frequency components are removed, the
voice sounds different, but one can still tell what is being said. However, if the low-
frequency components are removed, one hears gibberish.

In wavelet analysis, one often speaks of approximations and details. The approximations
are the high-scale, low-frequency components of the signal. The details are the low-scale,
high-frequency components.

The filtering process, at its most basic level, looks like this.

Figure 2.16 Filtering process at its basic level

The original signal, S, passes through two complementary filters and emerges as two
signals. If this operation is actually performed on a real digital signal, we wind up with
twice as much data as we started with. Suppose, for instance, that the original signal S
consists of 1000 samples of data. Then the resulting signals will each have 1000 samples,
for a total of 2000. There exists a more subtle way to perform the decomposition using
wavelets. By looking carefully at the computation, only one point out of two in each of
the two sequences needs to be kept to obtain the complete information. This is the
notion of down sampling. We produce two sequences called cA and cD.

Figure 2.17 Shows the process of obtaining DWT coefficients

The process on the right, which includes down sampling, produces DWT coefficients.
Below it is shown how to perform a one-stage discrete wavelet transform of a sinusoid
signal with high-frequency noise added to it.
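
A minimal MATLAB sketch of this one-stage decomposition is given first. The noisy test signal and the db4 wavelet are assumptions; note that the toolbox dwt function additionally applies a boundary extension, so samples near the edges may differ slightly from a plain convolution.

% One-stage decomposition of a noisy sinusoid: filter with a complementary
% low-pass/high-pass pair, then keep every second sample (down sampling).
n  = 0:999;
S  = sin(2*pi*0.01*n) + 0.2*randn(size(n));      % sinusoid plus high-frequency noise
[Lo_D, Hi_D] = wfilters('db4');                  % decomposition filter pair

a  = conv(S, Lo_D);   cA = a(2:2:end);           % approximation coefficients
d  = conv(S, Hi_D);   cD = d(2:2:end);           % detail coefficients

% The toolbox dwt function (Section 3.3.1.1) performs the same filtering and
% down sampling in one call, with its own boundary handling.
[cA2, cD2] = dwt(S, 'db4');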

The schematic figure is as shown

Figure 2.18 Schematic representation of one-stage decomposition

2.1.12. Multiple-Level Decomposition

The decomposition process can be iterated, with successive approximations being
decomposed in turn, so that one signal is broken down into many lower resolution
components. This is called the wavelet decomposition tree.

Figure 2.19 Wavelet decomposition tree

Looking at a signal's wavelet decomposition tree can yield valuable information.

Figure 2.20 Signal’s Wavelet decomposition tree.

Since the analysis process is iterative, in theory it can be continued indefinitely. In
reality, the decomposition can proceed only until the individual details consist of a single
sample or pixel. In practice, we will select a suitable number of levels based on the nature
of the signal, or on a suitable criterion such as entropy.

2.1.13. Wavelet Reconstruction

The mathematical manipulation that effects synthesis is called the inverse discrete
wavelet transform (IDWT). To synthesize a signal using Wavelet Toolbox software, it is
reconstructed from the wavelet coefficients.

Fig 2.21 Reconstruction from the wavelet coefficients.

Whereas wavelet analysis involves filtering and down sampling, the wavelet reconstruction
process consists of up sampling and filtering. Up sampling is the process of lengthening a
signal component by inserting zeros between samples.

Fig 2.22 Single component and Upsampled signal component

The toolbox includes commands, like idwt and waverec, that perform single-level or
multilevel reconstruction, respectively, on the components of one-dimensional signals.
These commands have their two-dimensional analogs, idwt2 and waverec2.

2.1.14. Reconstruction Filters

The filtering part of the reconstruction process is important, because it is the choice of
filters that is crucial in achieving perfect reconstruction of the original signal.

The down sampling of the signal components performed during the decomposition phase
introduces a distortion called aliasing. It turns out that by carefully choosing filters for the
decomposition and reconstruction phases that are closely related (but not identical), the
effects of aliasing can be cancelled out.

The low- and high-pass decomposition filters (L and H), together with their associated
reconstruction filters (L' and H'), form a system of what is called quadrature mirror
filters:

Fig 2.23 (a) Decomposition and (b) Reconstruction

2.1.14.1Reconstructing Approximations and Details

It is possible to reconstruct our original signal from the coefficients of the approximations
and details.

Fig 2.24 Reconstructing approximations and details

It is also possible to reconstruct the approximations and details themselves from their
coefficient vectors. As an example, consider how the first-level approximation A1 can be
reconstructed from the coefficient vector cA1.

Coefficient vector cA1 is passed through the same process used to reconstruct the
original signal. However, instead of combining it with the level-one detail cD1, we feed
in a vector of zeros in place of the detail coefficients vector:

Fig 2.25 Reconstructing the signal from approximations

The process yields a reconstructed approximation A1, which has the same length as the
original signal S and which is a real approximation of it.

Similarly, the first-level detail D1 can be reconstructed, using the analogous process:

Fig 2.25 Reconstructing the signal from details

The reconstructed details and approximations are true constituents of the original signal.
In fact, when we combine them we find that A1 + D1 = S.
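
This relationship can be checked numerically with the toolbox functions described in Chapter 3; the test signal and the db4 wavelet below are assumptions used only for illustration.

% Numerical check of A1 + D1 = S.
S = sin(2*pi*0.01*(0:499)) + 0.2*randn(1,500);
[cA1, cD1] = dwt(S, 'db4');
A1 = idwt(cA1, [],  'db4', length(S));   % zeros fed in place of the details
D1 = idwt([],  cD1, 'db4', length(S));   % zeros fed in place of the approximation
max(abs(S - (A1 + D1)))                  % of the order of machine precision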

The coefficient vectors cA1 and cD1 -- because they were produced by down
sampling and are only half the length of the original signal -- cannot directly be combined
to reproduce the signal. It is necessary to reconstruct the approximations and details
before combining them.

Extending this technique to the components of a multilevel analysis, we find that similar
relationships hold for all the reconstructed signal constituents. That is, there are several
ways to reassemble the original signal, e.g. S = A1 + D1 = A2 + D2 + D1 = A3 + D3 + D2 + D1:

Fig 2.26 Reconstructed signal components

2.2 Distributed Arithmetic (DA)


2.2.1 Distributed Arithmetic at a Glance
The arithmetic sum of products that defines the response of linear, time-
invariant networks can be expressed as:
y(n) = A1·x1(n) + A2·x2(n) + ... + AK·xK(n)        Equ 2.3

Where

y(n) is the response of the network at time n,

xk(n) is the kth input variable at time n,

Ak is the weighting factor of the kth input variable; it is constant for all n,
and so it remains time-invariant.

In filtering applications the constants, Ak , are the filter coefficients and the variables,
xk , are the prior samples of a single data source (for example, an analog to digital
converter). In frequency transforming - whether the discrete Fourier or the fast Fourier
transform - the constants are the sine/cosine basis functions and the variables are a block
of samples from a single data source. Examples of multiple data sources may be found in
image processing.
The multiply-intensive nature of equ. 2.3 can be appreciated by observing that a
single output response requires the accumulation of K product terms. In DA the task of
summing product terms is replaced by table look-up procedures that are easily
implemented in the Xilinx configurable logic block (CLB) look-up table architecture.
We start by defining the number format of the variable to be 2’s complement, fractional -
a standard practice for fixed-point microprocessors in order to bound number growth
under multiplication. The constant factors, Ak, need not be so restricted, nor are they
required to match the data word length, as is the case for the microprocessor. The
constants may have a mixed integer and fractional format; they need not be defined at
this time. The variable, xk, may be written in the fractional format as shown in equ. 2.4

xk = -xk0 + xk1·2^(-1) + xk2·2^(-2) + ... + xk(B-1)·2^(-(B-1))        Equ 2.4
where xkb is a binary variable and can assume only values of 0 and 1. A sign bit of value
-1 is indicated by xk0. The time index, n, has been dropped since it is not needed to
continue the derivation. The final result is obtained by first substituting equ.2.4 into
equ.2.3.

y(n) = A1·[-x10 + x11·2^(-1) + ... + x1(B-1)·2^(-(B-1))] + ... + AK·[-xK0 + xK1·2^(-1) + ... + xK(B-1)·2^(-(B-1))]        Equ 2.5
and then explicitly expressing all the product terms under the summation symbols:

y(n) = -[x10·A1 + x20·A2 + ... + xK0·AK]
       + [x11·A1 + x21·A2 + ... + xK1·AK]·2^(-1)
       + [x12·A1 + x22·A2 + ... + xK2·AK]·2^(-2)
       + ...
       + [x1(B-1)·A1 + x2(B-1)·A2 + ... + xK(B-1)·AK]·2^(-(B-1))        Equ 2.6
Each term within the brackets denotes a binary AND operation involving a bit of the
input variable and all the bits of the constant. The plus signs denote arithmetic sum
operations. The exponential factors denote the scaled contributions of the bracketed pairs
to the total sum. A look-up table can be constructed that is addressed by the same scaled bit
of all the input variables and accesses the sum of the terms within each pair of brackets.
Such a table is shown in fig. 2.27 and will henceforth be referred to as a Distributed
Arithmetic look-up table or DALUT. The same DALUT can be time-shared in a serially
organized computation or can be replicated B times for a parallel computation scheme.

Fig 2.27 The Distributed Arithmetic Look-up Table (DALUT)
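
A small software model makes the DALUT contents concrete. In the MATLAB sketch below, the four coefficients are placeholder values rather than actual filter coefficients; the table simply stores every possible sum of them, which are the bracketed terms of equ. 2.6.

% Build a DALUT for K coefficients: 2^K entries, one per input bit pattern.
A = [0.125 -0.25 0.5 0.75];          % K = 4 weighting factors Ak (assumed)
K = numel(A);
dalut = zeros(2^K, 1);
for addr = 0:2^K-1
    bits = bitget(addr, 1:K);        % address bit k selects whether Ak is summed
    dalut(addr+1) = sum(A(bits == 1));
end
% dalut(addr+1) now holds x1*A1 + x2*A2 + ... + xK*AK for the bit pattern addr.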

The arithmetic operations have now been reduced to addition, subtraction, and binary
scaling. With scaling by negative powers of 2, the actual implementation entails the
shifting of binary coded data words toward the least significant bit and the use of sign
extension bits to maintain the sign at its normal bit position. The hardware
implementation of a binary full adder (as is done in the CLBs) entails two operands, the
addend and the augend, to produce sum and carry output bits. The multiple bit-parallel
additions of the DALUT outputs expressed in equ. 2.6 can only be performed with a
single parallel adder if this adder is time-shared. Alternatively, if simultaneous addition
of all DALUT outputs is required, an array of parallel adders is required. These opposite
goals represent the classic speed-cost tradeoff.

2.2.2 The Speed Tradeoff


Any new device that can be software configured to perform DSP functions must contend
with the well entrenched standard DSP chips, i.e. the programmable fixed point
microprocessors that feature concurrently operating hardware multipliers, address
generators, and on-chip memories. The first challenge is speed. If the FPGA doesn't offer
higher speed, why bother? For a single filter channel the bother is worth it - particularly as
the filter order increases. And the FPGA advantage grows for multiple filter channels.
Alas, a simple advantage may not be persuasive in all cases - an overwhelming speed
advantage may be needed for FPGA acceptance. Reaching data sample rates of 50
megasamples/sec, however, comes at a high cost in gate resources. The first two examples will
show the end points of the serial/parallel tradeoff continuum.
2.2.3 The Ultimate in Speed
Conceivably, with a fully parallel design the sample speed could match the system clock
rate. This is the case where all the add operations of the bracketed values (the DALUT
outputs) of equ. 2.6 are performed in parallel. Implementation guidance can be gained by
rephrasing equ. 2.6; to facilitate this process, the contents of each bracket pair are
abbreviated by the data bit position, i.e. [sum b] = x1b·A1 + x2b·A2 + ... + xKb·AK.

For B=16, equ 2.6 becomes:

y(n) = -[sum 0] + [sum 1]·2^(-1) + [sum 2]·2^(-2) + ... + [sum 15]·2^(-15)        Equ 2.7

The decomposition of Equ 2.7 into an array of two input adders is given below:
Equ 2.8

Equations 2.7 and 2.8 are computationally equivalent, but equ. 2.8 can be mapped in a
straightforward way onto a binary tree-like array of summing nodes, with scaling effected
by signal routing, as shown in fig. 2.28. Each of the 15 nodes represents a parallel adder,
and while the computation can yield responses that include both the double precision
(B+C bits) of the implicit multiplication and the attendant processing gain, these adders
can be truncated to produce single precision (B bits) responses.

Fig. 2.28 Example of Fully Parallel DA Model (K=16, B=16)

All B bits of all K data sources must be present to address the B DALUTs. A BxK array
of flip-flops is required. Each of the B identical DALUTs contains 2^K words with C bits
per word, where C is the "cumulative" coefficient accuracy. The data flow from the flip-
flop array can be all combinatorial; the critical delay path for B=16 is not inordinately
long - signal routing through 5 CLB stages and a carry chain embracing 2C adder stages.
A system clock in the 10 MHz range may work. Certainly, with inter-node pipelining a
system clock of 50 MHz appears feasible. The latency would, in many cases, be
acceptable; however, it would be problematic in feedback networks (e.g., IIR filters).

2.2.4 The Ultimate in Gate Efficiency


The ultimate in gate efficiency would be a single DALUT, a single parallel adder, and, of
course, fewer flip-flops for the input data source. Again with our B=16 examples, a
rephrasing of equ.2.6 yields the desired result:
y(n) = -[sum 0] + 2^(-1)·([sum 1] + 2^(-1)·([sum 2] + ... + 2^(-1)·([sum 14] + 2^(-1)·[sum 15])...))        Equ 2.9

Starting from the least significant end, i.e. addressing the DALUT with the least
significant bit of all K input variables, the DALUT contents, [sum 15], are stored, scaled
by 2^(-1) and then added to the DALUT contents, [sum 14], when the address changes to the
next-to-the-least-significant bits. The process repeats until the most significant bit
addresses the DALUT, [sum 0]. If this is a sign bit, a subtraction occurs. Now a vision of
the hardware emerges. A serial shift register, B bits long, for each of the K variables
addresses the DALUT, least significant bit first. At each shift the output is applied to a
parallel adder whose output is stored in an accumulator register. The accumulator output,
scaled by 2^(-1), is the second input to the adder. Henceforth, the adder, register and scaler
shall be referred to as a scaling accumulator. The functional blocks are shown in fig.
2.29. All can be readily mapped into the Xilinx 4000 CLBs. There is a performance price
to be paid for this gate efficiency - the computation takes at least B clocks.

Fig 2.29 Serially Organized DA Processor
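
The behaviour of this serially organized processor can be modelled in a few lines of MATLAB. The coefficients, the 8-bit word length and the input samples below are assumptions; the sketch only demonstrates that the LSB-first scaling-accumulator recursion reproduces the direct sum of products.

% Bit-serial DA model: B-bit 2's-complement inputs, DALUT addressed LSB first,
% scaling accumulator halves its contents before each addition, sign bit subtracts.
A = [0.125 -0.25 0.5 0.75];   K = numel(A);        % weighting factors Ak (assumed)
dalut = zeros(2^K, 1);                             % all 2^K sums of the coefficients
for a = 0:2^K-1, dalut(a+1) = sum(A(bitget(a, 1:K) == 1)); end

B  = 8;                                            % input word length (assumed)
xq = [-0.5 0.25 0.75 -0.125];                      % inputs xk as 2's-complement fractions
xb = zeros(K, B);                                  % xb(k,1) is the sign bit xk0
for k = 1:K
    word    = mod(round(xq(k)*2^(B-1)), 2^B);      % 2's-complement integer code
    xb(k,:) = bitget(word, B:-1:1);                % sign bit first, LSB last
end

acc = 0;
for col = B:-1:1                                   % address the DALUT LSB first
    address = sum(xb(:,col)' .* 2.^(0:K-1));       % one bit from every input variable
    if col == 1                                    % sign-bit cycle: subtract
        acc = acc/2 - dalut(address+1);
    else                                           % otherwise scale and accumulate
        acc = acc/2 + dalut(address+1);
    end
end
y_serial = acc;
y_direct = sum(A .* xq);                           % y_serial equals y_direct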
2.2.5 Between the Extremes
While there are a number of speed-gate count tradeoffs that range from one bit per clock
(the ultimate in gate efficiency) to B bits per clock (the ultimate in speed), the question of
their effectiveness under architectural constraints remains. We can start this study with
the case of 2-bit-at-a-time processing; the computation lasts B/2 clocks and the DALUT
now grows to include two contiguous bits, i.e. [sum b + {sum(b+1)}·2^(-1)]. Again consider
the case of B = 16 and rephrase equ. 2.7:

Equ 2.10

The terms within the rectangular brackets are stored in a common DALUT which can
also serve [sum 0] and [sum 15]. Note that the computation takes B/2 + 1, or 9, clock
periods. The functional blocks of the data path are shown in fig. 2.30(a). The odd-valued
scale factors outside the rectangular brackets do introduce some complexity to the circuit,
but it can be managed.

Fig. 2.30(a) Two-bit-at-a-time Distributed Arithmetic Data Path (B=16, K=16)
The scaling simplifies with magnitude-only input data. Furthermore, the two bit
processing would last for 8 clock periods. Thus:
Equ 2.11

There is another way of rephrasing or partitioning equ. 2.7 that maintains the B clock
computation time:

Equ 2.12

Here two identical DALUTs, two scaling accumulators, and a post-accumulator adder
(fig.2.30(b)) are required. While the adder in the scaling accumulator may be single
precision, the second adder stage may be double precision to meet performance
requirements.

Fig 2.30(b) Two-bit-at-a-time Distributed Arithmetic Data Path (B=16, K=16)


There are other two-bit-at-a-time possibilities. Each possibility implies a different circuit
arrangement. Consider a third rephrasing of equ. 2.7.

Equ 2.13

Here the inner brackets denote a DALUT output while the larger, outer brackets denote
the scaling of the scaling accumulator. Two parallel odd-even bit data paths are indicated
(fig. 2.30(c)) with two identical DALUTs. The DALUT addressed by the even bits has its
output scaled by 2^(-1) and then applied to the parallel adder. The adder sum is then
applied to the scaling accumulator, which yields the desired response, y(n). Here a single
precision pre-accumulator adder replaces the double precision post-accumulator adder.

Fig.2.30(c) Two-bit-at-a-time Distributed Arithmetic Data Path (B=16, K=16)
Each of these approaches implies a different gate implementation. Certainly one of the
most important concerns is DALUT size, which is constrained by the look-up table
capacity of the CLB. The first approach, defined by equ. 2.10, describes a DALUT of 2^(2K)
words that feeds a single scaling accumulator, while the second, defined by equ. 2.12,
describes 2 DALUTs - each of 2^K words - that feed separate scaling accumulators. An
additional parallel adder is required to sum (with the 2^(-B/2) scaling indicated) the two
output halves. The difference in memory sizes between 2^(2K) and 2x2^K is very significant,
particularly when we confront reality, namely the CLB memory sizes of 32x1 or
2x(16x1) bits.

2.2.6 Parallel Realization


In its most obvious and direct form, distributed arithmetic computations are bit-serial in
nature, i.e., each bit of the input samples must be indexed in turn before a new output
sample becomes available. When the input samples are represented with B bits of
precision, B clock cycles are required to complete an inner-product calculation. A parallel
realization of distributed arithmetic corresponds to allowing multiple bits to be processed
in one clock cycle by duplicating the LUT and adder tree. In a 2-bit at a time parallel
implementation, the odd bits are fed to one LUT and adder tree, while the even bits are
simultaneously fed to an identical tree. The odd-bit partials are left shifted to properly weight
the result and added to the even partials before accumulating the aggregate. In the
extreme case, all input bits can be computed in parallel and then combined in a shifting
adder tree.
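
The following MATLAB sketch models such a 2-bit-at-a-time computation. The bit-pairing and shifting convention, the coefficients and the 8-bit inputs are assumptions made for illustration; the point is that the number of accumulation cycles is roughly halved while the result still matches the direct sum of products.

% 2-bit-at-a-time DA model: two identical LUTs are addressed by two adjacent
% bit columns per cycle, the partial sums are combined with a one-bit relative
% shift, and the aggregate is accumulated.
A = [0.125 -0.25 0.5 0.75];   K = numel(A);   B = 8;
dalut = zeros(2^K, 1);
for a = 0:2^K-1, dalut(a+1) = sum(A(bitget(a, 1:K) == 1)); end

xq = [-0.5 0.25 0.75 -0.125];                      % input samples (assumed)
xb = zeros(K, B);
for k = 1:K
    word    = mod(round(xq(k)*2^(B-1)), 2^B);      % 2's-complement fraction
    xb(k,:) = bitget(word, B:-1:1);                % column 1 holds the sign bits
end
addr = @(col) sum(xb(:,col)' .* 2.^(0:K-1));       % address formed by one bit column

acc = 0;
for col = B-1:-2:3                                 % two LSB-side columns per cycle
    pair = dalut(addr(col)+1) + dalut(addr(col+1)+1)/2;   % shift-weighted partials
    acc  = acc/4 + pair;                           % scaling accumulator, 2 bits per clock
end
y = (acc/2 + dalut(addr(2)+1))/2 - dalut(addr(1)+1);      % last magnitude bit, then sign bit
y_direct = sum(A .* xq);                           % y equals y_direct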

Fig 2.31 Mallat's quadrature mirror filter tree used to compute the coefficients of the (a)
forward and (b) inverse wavelet transforms.

CHAPTER 3
3.1 Introduction
This chapter describes the detailed procedure adopted to denoise the signal. It also
explains in details the MATLAB functions involved in the process.

3.2 Testing in MATLAB


The steps involved in de-noising the signal using MATLAB are listed below; a short MATLAB sketch combining these steps follows the list.

• Load a signal
• Perform a single-level wavelet decomposition of a signal
• Construct approximations and details from the coefficients
• Display the approximation and detail
• Perform a multilevel wavelet decomposition of a signal
• Extract approximation and detail coefficients
• Apply thresholding to detail coefficients
• Reconstruct the level 3 approximation
• Display the results of a multilevel decomposition
• Reconstruct the original signal from the level 3 decomposition
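
The following sketch combines these steps using the toolbox functions documented in the rest of this chapter. The synthetic test signal, the db4 wavelet, the level-3 decomposition and the default threshold from ddencmp are assumptions made for illustration, not necessarily the settings used in this project.

t = linspace(0, 1, 1024);
s = sin(2*pi*5*t) + 0.3*randn(size(t));            % clean sinusoid plus noise

[cA1, cD1] = dwt(s, 'db4');                        % single-level decomposition
A1 = idwt(cA1, [],  'db4', length(s));             % approximation reconstructed alone
D1 = idwt([],  cD1, 'db4', length(s));             % detail reconstructed alone

[C, L] = wavedec(s, 3, 'db4');                     % multilevel (level 3) decomposition
cA3 = appcoef(C, L, 'db4', 3);                     % level-3 approximation coefficients
cD3 = detcoef(C, L, 3);                            % level-3 detail coefficients (detcoef is
                                                   % the companion toolbox function)
[thr, sorh, keepapp] = ddencmp('den', 'wv', s);    % default de-noising threshold rule
s_den = wdencmp('gbl', C, L, 'db4', 3, thr, sorh, keepapp);

subplot(2,1,1); plot(s);     title('Noisy signal');
subplot(2,1,2); plot(s_den); title('De-noised signal (level 3, db4)');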

3.3 Functions involved in denoising a signal


3.3.1 Analysis-Decomposition Functions
3.3.1.1 dwt

Purpose

Single-level discrete 1-D wavelet transform

Syntax

• [cA,cD] = dwt(X,'wname')
• [cA,cD] = dwt(X,'wname','mode',MODE)
• [cA,cD] = dwt(X,Lo_D,Hi_D)
• [cA,cD] = dwt(X,Lo_D,Hi_D,'mode',MODE)

Description

The dwt command performs a single-level one-dimensional wavelet decomposition with
respect to either a particular wavelet or particular wavelet decomposition filters (Lo_D
and Hi_D) that we specify.

[cA,cD] = dwt(X,'wname') computes the approximation coefficients vector cA and detail
coefficients vector cD, obtained by a wavelet decomposition of the vector X. The string
'wname' contains the wavelet name.

[cA,cD] = dwt(X,Lo_D,Hi_D) computes the wavelet decomposition as above, given
these filters as input:

• Lo_D is the decomposition low-pass filter.


• Hi_D is the decomposition high-pass filter.

Lo_D and Hi_D must be the same length.

Let lx = the length of X and lf = the length of the filters Lo_D and Hi_D; then
length(cA) = length(cD) = la, where la = ceil(lx/2) if the DWT extension mode is set to
periodization. For the other extension modes, la = floor((lx+lf-1)/2).

[cA,cD] = dwt(...,'mode',MODE) computes the wavelet decomposition with the
extension mode MODE that you specify. MODE is a string containing the desired
extension mode.

Example:

• [cA,cD] = dwt(x,'db1','mode','sym');

3.3.1.2 wavedec

Purpose

Multilevel 1-D wavelet decomposition

Syntax

• [C,L] = wavedec(X,N,'wname')
• [C,L] = wavedec(X,N,Lo_D,Hi_D)

Description

wavedec performs a multilevel one-dimensional wavelet analysis using either a specific
wavelet ('wname') or specific wavelet decomposition filters. [C,L] =
wavedec(X,N,'wname') returns the wavelet decomposition of the signal X at level N,
using 'wname'. N must be a strictly positive integer. The output decomposition structure
contains the wavelet decomposition vector C and the bookkeeping vector L. The structure
is organized as in this level-3 decomposition example.

Fig 3.1 Decomposition structure

[C,L] = wavedec(X,N,Lo_D,Hi_D) returns the decomposition structure as above, given
the low- and high-pass decomposition filters.

3.3.2 Synthesis-Reconstruction Functions


3.3.2.1 idwt
Purpose

Single-level inverse discrete 1-D wavelet transform

Syntax

• X = idwt(cA,cD,'wname')
• X = idwt(cA,cD,Lo_R,Hi_R)
• X = idwt(cA,cD,'wname',L)

• X = idwt(cA,cD,Lo_R,Hi_R,L)
• X = idwt(...,'mode',MODE)

Description

The idwt command performs a single-level one-dimensional wavelet reconstruction with
respect to either a particular wavelet or particular wavelet reconstruction filters (Lo_R
and Hi_R) that we specify.

X = idwt(cA,cD,'wname') returns the single-level reconstructed approximation
coefficients vector X based on approximation and detail coefficients vectors cA and cD,
and using the wavelet 'wname'.

X = idwt(cA,cD,Lo_R,Hi_R) reconstructs as above using filters that you specify.

• Lo_R is the reconstruction low-pass filter.


• Hi_R is the reconstruction high-pass filter.

Lo_R and Hi_R must be the same length.

Let la be the length of cA (which also equals the length of cD) and lf the length of the filters
Lo_R and Hi_R; then length(X) = LX, where LX = 2*la if the DWT extension mode is set
to periodization. For the other extension modes, LX = 2*la-lf+2.

X = idwt(cA,cD,'wname',L) or X = idwt(cA,cD,Lo_R,Hi_R,L) returns the length-L
central portion of the result obtained using idwt(cA,cD,'wname'). L must be less than LX.

X = idwt(...,'mode',MODE) computes the wavelet reconstruction using the specified
extension mode MODE.

X = idwt(cA,[],...) returns the single-level reconstructed approximation coefficients
vector X based on approximation coefficients vector cA.

X = idwt([],cD,...) returns the single-level reconstructed detail coefficients vector X based
on detail coefficients vector cD.

idwt is the inverse function of dwt in the sense that the abstract statement
idwt(dwt(X,'wname'),'wname') would give back X.
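
This inverse relationship can be verified numerically; the signal length and wavelet below are arbitrary choices made only for illustration.

x = randn(1, 200);                       % arbitrary test signal
[cA, cD] = dwt(x, 'db2');
xr = idwt(cA, cD, 'db2', length(x));
max(abs(x - xr))                         % of the order of machine precision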

3.3.2.2 waverec

Purpose

Multilevel 1-D wavelet reconstruction

Syntax

• X = waverec(C,L,'wname')
• X = waverec(C,L,Lo_R,Hi_R)

Description

waverec performs a multilevel one-dimensional wavelet reconstruction using either a
specific wavelet or specific reconstruction filters (Lo_R and Hi_R). waverec is the
inverse function of wavedec in the sense that the abstract statement
waverec(wavedec(X,N,'wname'),'wname') returns X.

X = waverec(C,L,'wname') reconstructs the signal X based on the multilevel wavelet
decomposition structure [C,L] and wavelet 'wname'. X = waverec(C,L,Lo_R,Hi_R)
reconstructs the signal X as above, using the reconstruction filters you specify. Lo_R is
the reconstruction low-pass filter and Hi_R is the reconstruction high-pass filter.

X = waverec(C,L,'wname') is equivalent to X = appcoef(C,L,'wname',0).

3.3.2.3 wrcoef

Purpose

Reconstruct single branch from 1-D wavelet coefficients

Syntax

• X = wrcoef('type',C,L,'wname',N)
• X = wrcoef('type',C,L,Lo_R,Hi_R,N)
• X = wrcoef('type',C,L,'wname')
• X = wrcoef('type',C,L,Lo_R,Hi_R)

Description

wrcoef reconstructs the coefficients of a one-dimensional signal, given a wavelet
decomposition structure (C and L) and either a specified wavelet or specified
reconstruction filters (Lo_R and Hi_R).

X = wrcoef('type',C,L,'wname',N) computes the vector of reconstructed coefficients,
based on the wavelet decomposition structure [C,L], at level N. 'wname' is a string
containing the wavelet name.

Argument 'type' determines whether approximation ('type' = 'a') or detail ('type' = 'd')
coefficients are reconstructed. When 'type' = 'a', N is allowed to be 0; otherwise, a strictly
positive number N is required. Level N must be an integer such that N <= length(L)-2.

X = wrcoef('type',C,L,Lo_R,Hi_R,N) computes coefficients as above, given the
reconstruction filters you specify.

X = wrcoef('type',C,L,'wname') and X = wrcoef('type',C,L,Lo_R,Hi_R) reconstruct
coefficients of the maximum level N = length(L)-2.

3.3.2.4 appcoef

Purpose

1-D approximation coefficients

Syntax

• A = appcoef(C,L,'wname',N)
• A = appcoef(C,L,'wname')
• A = appcoef(C,L,Lo_R,Hi_R)
• A = appcoef(C,L,Lo_R,Hi_R,N)

Description

appcoef is a one-dimensional wavelet analysis function.

appcoef computes the approximation coefficients of a one-dimensional signal.

A = appcoef(C,L,'wname',N) computes the approximation coefficients at level N using
the wavelet decomposition structure [C,L]. 'wname' is a string containing the wavelet
name. Level N must be an integer such that 0 <= N <= length(L)-2.

A = appcoef(C,L,'wname') extracts the approximation coefficients at the last level,
length(L)-2.

For A = appcoef(C,L,Lo_R,Hi_R) or A = appcoef(C,L,Lo_R,Hi_R,N), Lo_R is the
reconstruction low-pass filter and Hi_R is the reconstruction high-pass filter.

3.3.3 De-noising and Compression
3.3.3.1.ddencmp

Purpose

Default values for de-noising or compression

Syntax

• [THR,SORH,KEEPAPP,CRIT] = ddencmp(IN1,IN2,X)
• [THR,SORH,KEEPAPP] = ddencmp(IN1,'wv',X)
• [THR,SORH,KEEPAPP,CRIT] = ddencmp(IN1,'wp',X)

Description

ddencmp is a de-noising and compression-oriented function.

ddencmp gives default values for all the general procedures related to de-noising and
compression of one- or two-dimensional signals, using wavelets or wavelet packets.

[THR,SORH,KEEPAPP,CRIT] = ddencmp(IN1,IN2,X) returns default values for
de-noising or compression, using wavelets or wavelet packets, of an input vector or matrix
X, which can be a one- or two-dimensional signal. THR is the threshold, SORH is for
soft or hard thresholding, KEEPAPP allows you to keep approximation coefficients, and
CRIT is the entropy name.

IN1 is 'den' for de-noising or 'cmp' for compression.

IN2 is 'wv' for wavelet or 'wp' for wavelet packet.

For wavelets (three output arguments):

[THR,SORH,KEEPAPP] = ddencmp(IN1,'wv',X) returns default values for de-noising (if
IN1 = 'den') or compression (if IN1 = 'cmp') of X. These values can be used for
wdencmp.

For wavelet packets (four output arguments):

[THR,SORH,KEEPAPP,CRIT] = ddencmp(IN1,'wp',X) returns default values for
de-noising (if IN1 = 'den') or compression (if IN1 = 'cmp') of X. These values can be used
for wpdencmp.

3.3.3.2 wdencmp

Purpose

De-noising or compression

Syntax

[XC,CXC,LXC,PERF0,PERFL2]=wdencmp('gbl',X,'wname',N,THR,SORH,KEEPAPP)
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',X,'wname',N,THR,SORH)
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',C,L,'wname',N,THR,SORH)

Description

wdencmp is a one- or two-dimensional de-noising and compression-oriented function.


wdencmp performs a de-noising or compression process of a signal or an image, using
wavelets.

[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('gbl',X,'wname',N,THR,SORH,
KEEPAPP) returns a de-noised or compressed version XC of input signal X (one- or two-
dimensional) obtained by wavelet coefficients thresholding using global positive
threshold THR.

Additional output arguments [CXC,LXC] are the wavelet decomposition structure of XC.
PERF0 and PERFL2 are L2-norm recovery and compression score in percentage.

PERFL2 = 100 * (vector-norm of CXC / vector-norm of C)^2 if [C,L] denotes the wavelet
decomposition structure of X.

If X is a one-dimensional signal and 'wname' an orthogonal wavelet, PERFL2 is reduced
to 100 * (vector-norm of XC / vector-norm of X)^2.

Wavelet decomposition is performed at level N and 'wname' is a string containing the
wavelet name. SORH ('s' or 'h') is for soft or hard thresholding. If KEEPAPP = 1,
approximation coefficients cannot be thresholded; otherwise, it is possible.

wdencmp('gbl',C,L,'wname',N,THR,SORH,KEEPAPP) has the same output arguments,
using the same options as above, but obtained directly from the input wavelet
decomposition structure [C,L] of the signal to be de-noised or compressed, at level N and
using the 'wname' wavelet.

For the one-dimensional case and the 'lvd' option,
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',X,'wname',N,THR,SORH) or
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',C,L,'wname',N,THR,SORH) have the
same output arguments, using the same options as above, but allowing level-dependent
thresholds contained in vector THR (THR must be of length N). In addition, the
approximation is kept. Note that, with respect to wden (automatic de-noising), wdencmp
allows more flexibility and you can implement your own de-noising strategy.

For the two-dimensional case and the 'lvd' option,
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',X,'wname',N,THR,SORH) or
[XC,CXC,LXC,PERF0,PERFL2] = wdencmp('lvd',C,L,'wname',N,THR,SORH).

THR must be a 3-by-N matrix containing the level-dependent thresholds in the three
orientations: horizontal, diagonal, and vertical.

The compression features of a given wavelet basis are primarily linked to the relative
sparseness of the wavelet domain representation of the signal. The notion behind
compression is based on the concept that the regular signal component can be accurately
approximated using a small number of approximation coefficients (at a suitably selected
level) and some of the detail coefficients.

Like de-noising, the compression procedure contains three steps:

1. Decomposition.
2. Detail coefficient thresholding. For each level from 1 to N, a threshold is selected
and hard thresholding is applied to the detail coefficients.
3. Reconstruction.
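
A compression sketch using the default values supplied by ddencmp is shown below; the test signal, the db4 wavelet and the level are assumptions made for illustration.

t = linspace(0, 1, 1024);
x = sin(2*pi*5*t) + 0.2*sin(2*pi*40*t);
[thr, sorh, keepapp] = ddencmp('cmp', 'wv', x);                 % step 2: threshold selection
[xc, cxc, lxc, perf0, perfl2] = ...
    wdencmp('gbl', x, 'db4', 3, thr, sorh, keepapp);            % steps 1-3 in one call
perf0                                    % percentage of coefficients set to zero
perfl2                                   % L2-norm recovery score in percent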

3.4 Parallel DA Implementation

The discrete wavelet transform equations can be efficiently computed using the pyramid
filter bank tree shown in Figure 3.2. This section describes a parallel distributed
arithmetic implementation of the filter banks by deriving a parallel distributed arithmetic
structure for a single FIR filter. The next step is to describe the implementation of the
decimator and interpolator, the basic building blocks of the forward and inverse discrete
wavelet transforms, respectively.

Fig. 3.2 Mallat's quadrature mirror filter tree used to compute the coefficients of the (a) forward and (b) inverse wavelet transforms.

3.5 Parallel DA FIR Filter Structure


All filters in the pyramid tree structure shown in Figure 3.2 are constructed as FIR filters because of their inherent stability. Most discrete wavelet transform implementations reported in the literature employ the direct FIR structure, in which each filter tap consists of a delay element, an adder, and a multiplier. A major drawback of this implementation is that filter throughput is inversely proportional to the number of filter taps: as the filter length increases, the throughput decreases proportionately. In contrast, the throughput of an FIR filter constructed using distributed arithmetic is maintained regardless of the filter length. This feature is particularly attractive for flexible implementations of different wavelet types, since each type has a different set of filter coefficients. A distributed arithmetic implementation of the Daubechies 8-tap wavelet FIR filter consists of an LUT, a cascade of shift registers, and a scaling accumulator, as shown in Figure 3.3. The LUT stores all possible sums of the Daubechies 8-tap wavelet coefficients. Each input sample is serialized and presented to the bit-serial shift register cascade, one bit at a time. The cascade stores the input sample history in bit-serial format and is used in forming the required inner-product computation. The bit outputs of the shift register cascade are used as address inputs to the LUT. Partial results from the LUT are summed by the scaling accumulator to form the final result at the filter output port.

Fig. 3.3 A DA implementation of the Daubechies FIR filter
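
The following MATLAB sketch mimics the behaviour of this structure for unsigned inputs. The coefficients h are random placeholders for the actual Daubechies taps, and the 8-bit word length is an assumption.

h = rand(1, 8);                         % placeholder for the 8 Daubechies filter taps
N = 8;                                  % assumed input word length in bits (unsigned)
lut = zeros(1, 2^numel(h));             % LUT holding all possible sums of coefficients
for a = 0:2^numel(h) - 1
    bits = bitget(a, 1:numel(h));       % address bit k selects coefficient h(k)
    lut(a + 1) = sum(h(bits == 1));
end
x = randi([0 2^N - 1], 1, numel(h));    % current input sample history
acc = 0;
for b = 0:N - 1                         % one LUT access per input bit (bit-serial)
    addr = sum(bitget(x, b + 1) .* 2.^(0:numel(h) - 1));   % bits of weight 2^b form the address
    acc = acc + lut(addr + 1) * 2^b;    % scaling accumulator weights each partial by 2^b
end
% acc now equals the direct inner product sum(h .* x), computed with one small LUT

Splitting addr into two 4-bit halves and adding the outputs of two smaller LUTs gives the partitioned variant of Figure 3.4.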

Since the LUT size in a distributed arithmetic implementation grows exponentially with the number of coefficients, the LUT access time can become a bottleneck for the speed of the whole system when the LUT becomes large. Hence, the 8-bit LUT shown in Figure 3.3 is decomposed into two 4-bit LUTs, and their outputs are added using a two-input accumulator. The 4-bit LUT partitioning is optimal in terms of logic resource utilization on FPGAs whose logic elements are built from 4-input LUTs. The modified partitioned-LUT architecture is shown in Figure 3.4. The total storage is reduced, since the accumulator occupies fewer logic resources than the larger 8-bit LUT. Furthermore, partitioning the larger LUT into two smaller LUTs accessed in parallel reduces the access time.

Fig 3.4 A partitioned-LUT DA implementation of the Daubechies FIR filter

A parallel implementation of the inherently serial distributed arithmetic (SDA) FIR filter shown in Figure 3.4 corresponds to partitioning the input sample into M sub-samples and processing these sub-samples in parallel. Such a parallel implementation requires M times as many memory look-up tables and therefore comes at the cost of increased logic requirements. The following describes the implementation of the PDA FIR filter at two different degrees of parallelism: a 2-bit PDA FIR filter and a fully parallel 8-bit PDA FIR filter. A 2-bit parallel distributed arithmetic (PDA) FIR filter implementation is shown in Figure 3.5. It corresponds to feeding the odd bits of the input sample to an SDA LUT adder tree while simultaneously feeding the even bits to an identical tree. Compared to the serial DA filter shown in Figure 3.4, each shift register is replaced with two similar shift registers of half the bit size. The odd-bit partials are left-shifted to weight the result properly and added to the even-bit partials before the aggregate is accumulated. Finally, since two bits are taken at a time, the scaling accumulator is changed from a 1-bit to a 2-bit shift (scaling by 1/4).

Fig 3.5 A 2-bit PDA Daubechies FIR filter
As for the fully parallel 8-bit PDA FIR filter implementation, the 8-bit input sample is partitioned into eight 1-bit sub-samples so as to achieve maximum speed. Figure 3.6 shows the fully parallel PDA FIR filter, where all 8 input bits are processed in parallel and the partial results are summed by a binary-tree-like adder network. The lower input to each adder is scaled down by a factor of 2. No scaling accumulator is needed in this case, since the output from the adder tree is the entire sum of products.

Fig 3.6 (a) A single-bit and (b) an 8-bit PDA Daubechies FIR filter
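
Behaviourally, the fully parallel filter evaluates all bit positions of the serial sketch above in the same cycle and combines them with an adder tree. Reusing h, lut, x, and N from that sketch:

partials = zeros(1, N);
for b = 0:N - 1                          % in hardware these N lookups occur in parallel
    addr = sum(bitget(x, b + 1) .* 2.^(0:numel(h) - 1));
    partials(b + 1) = lut(addr + 1);
end
y = sum(partials .* 2.^(0:N - 1));       % adder tree: each lower input scaled by a power of 2
% y equals acc from the serial sketch, with no scaling accumulator needed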

3.6 Decimator Implementation


The basic building block of the parallel DA forward discrete wavelet transform filter bank is the decimator, which consists of a parallel DA anti-aliasing FIR filter followed by a down-sampling operator.
Down-sampling an input sequence x[n] by 2 generates an output sequence y[n] according to the relation y[n] = x[2n]. All input samples with indices equal to an integer multiple of 2 are retained at the output, and all other samples are discarded. Therefore, the sequence y[n] has a sampling rate equal to half the sampling rate of x[n]. Implementation of the decimator is shown in Figure 3.7. The input data port of the PDA FIR filter is connected to the external input sample source, and its clock input is tied to the clock input of a 1-bit counter. Furthermore, the output data port of the PDA FIR filter is connected to the input port of a parallel-load register. The register accepts or blocks the data appearing on its input port depending on the state of the 1-bit counter. Assuming an unsigned 8-bit input sample, the decimator operates such that when the counter is in the 1 state, the PDA FIR output is stored in the parallel-load register, and when the counter returns to the 0 state, the PDA FIR output is discarded. An input sample X enters the decimator at a rate of 1 sample per clock, and an output filtered sample Y leaves the decimator at a rate of 1 sample per 2 clocks. The sample rate is thus halved by the decimator.

Fig 3.7 Decimator Implementation
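
A behavioural MATLAB sketch of the decimator, with a simple placeholder filter standing in for the actual Daubechies analysis filter:

h = ones(1, 8) / 8;          % placeholder anti-aliasing FIR filter (not the real Daubechies taps)
x = randn(1, 64);            % placeholder input sequence
v = filter(h, 1, x);         % PDA FIR filtering stage
y = v(1:2:end);              % down-sampling by 2: keep y[n] = v[2n], discard the rest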

3.7 Interpolator implementation

The basic building block of the parallel DA inverse discrete wavelet transform filter bank is the interpolator, which consists of a parallel DA anti-imaging FIR filter preceded by an up-sampling operator. In up-sampling by a factor of 2, a zero-valued sample is inserted between every two consecutive samples of the input sequence x[n] to produce an output sequence y[n] such that y[n] = x[n/2] for even indices of n, and 0 otherwise. The sampling rate of the output sequence y[n] is twice that of the original sequence x[n]. Implementation of the interpolator is shown in Figure 3.8. The input data port of the PDA FIR filter is connected to the output port of a parallel-load register. Furthermore, the input port of the register is connected to the external input sample source, and its CLK input is tied to the CLK input of a 1-bit counter. The operation of the register depends on the signal received on its active-high CLR (clear) input from the 1-bit counter. Assuming the input signal source sends out successive samples separated by 2 clock periods, the interpolating filter operates such that when the counter is in the 0 state, the register passes the input sample X to the PDA FIR filter, and when the counter turns to the 1 state, the register is cleared, thus passing a zero to the PDA FIR filter. That is, a zero is inserted between every two successive input samples. The filter receives an input sample X at the rate of 1 sample per 2 clocks and produces a filtered sample Y at the rate of 1 sample per clock. The sample rate is thus doubled by the interpolator.

Fig 3.8 Interpolator Implementation
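
A matching behavioural sketch of the interpolator (g is a placeholder for the actual synthesis filter, and y is the decimator output from the previous sketch):

g = ones(1, 8) / 4;          % placeholder anti-imaging FIR filter
u = zeros(1, 2 * numel(y));
u(1:2:end) = y;              % up-sample by 2: insert a zero between consecutive samples
z = filter(g, 1, u);         % filtered output leaves at twice the input sample rate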

CHAPTER 4

This chapter consists of the results and waveforms.

4.1 Waveforms obtained in MATLAB.

Fig 4.1 Approximation and Details

Fig 4.2 Approximation (A3) and Details (D1, D2, D3)

Fig 4.3 Original and level 3 approximation

Fig 4.4 Detail level 1, 2, 3

Fig 4.5 Original and De-noised signals

4.2 VHDL Coding

4.2.1 Code for Single-level DWT

library IEEE;
use IEEE.std_logic_1164.all;
use IEEE.std_logic_arith.all;
use IEEE.std_logic_signed.all;
entity dwt_single_level is
port (
-- din: in std_logic_vector(15 downto 0);
clockin:in std_logic;
samplecount:in integer;
dout: out std_logic_vector(15 downto 0)
);
end dwt_single_level;

architecture dwt_single_level of dwt_single_level is

component fileread is
port(clockin:in std_logic;
dout:out std_logic_vector(15 downto 0));
end component;

component decimator_hpf_dec is
port (
clock: in STD_LOGIC;
datain: in std_logic_vector(15 downto 0);
samplecnt:in integer;
levelno:in integer;
-- dataoutcnt:out integer;
-- clockout:out std_logic;
dataout: out std_logic_vector(15 downto 0)
);
end component;

component decimator_lpf_dec is
port (
clock: in STD_LOGIC;
datain: in std_logic_vector(15 downto 0);
samplecnt:in integer;
levelno:in integer;

dataoutcnt:out integer;
clockout:out std_logic;
firout: out std_logic_vector(15 downto 0);
dataout: out std_logic_vector(15 downto 0)
);
end component;

component interpolator_lpf_rec is
port (
clock: in STD_LOGIC;
datain:in std_logic_vector(15 downto 0);
samplecnt,levelno:in integer;
-- countout:out integer;

doutput: out std_logic_vector(15 downto 0)


);
end component;

component interpolator_hpf_rec is
port (
clock: in STD_LOGIC;
datain:in std_logic_vector(15 downto 0);
samplecnt,levelno:in integer;
countout:out integer;
doutput: out std_logic_vector(15 downto 0)
);
end component;

signal din,fir1: std_logic_vector(15 downto 0);


signal ca_cnt,cd_cnt: integer;
signal ca_clock: std_logic;
signal ca_dec,cd_dec,ca_rec,cd_rec,cd1_dec:std_logic_vector(15 downto 0);
begin
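-- Analysis stage: c0 reads 16-bit samples, and the low-pass (c1) and high-pass (c2)
-- decimators produce the level-1 approximation (ca_dec) and detail (cd_dec) coefficients.
-- Synthesis stage: c3 and c4 interpolate and filter both branches, and their sum
-- forms the reconstructed single-level output.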
c0:fileread port map(clockin,din);
c1:decimator_lpf_dec port map(clockin,din,samplecount,1,ca_cnt,ca_clock,fir1,ca_dec);
c2:decimator_hpf_dec port map(clockin,din,samplecount,1,cd_dec);

c3:interpolator_lpf_rec port map(clockin,ca_dec,ca_cnt,2,ca_rec);


c4:interpolator_hpf_rec port map(clockin,cd_dec,ca_cnt,2,cd_cnt,cd_rec);
dout<=ca_rec + cd_rec;
end dwt_single_level;

4.3 Results

Fig 4.6 Samples taken from MATLAB

4.4 Comparison of the results

Fig 4.9 MATLAB values for denoised signal

4.5 Conclusion

Real-world signals are often corrupted by noise, which may severely limit their usefulness. For this reason, signal denoising is a topic that continually draws great interest. Wavelets are an alternative tool for signal decomposition using orthogonal functions. Unlike basic Fourier analysis, wavelets do not completely lose time information, a feature that makes the technique suitable for applications where the temporal location of the signal's frequency content is important. One of the fields where wavelets have been successfully applied is data analysis; in particular, it has been demonstrated that wavelets produce excellent results in signal denoising. This work presents a procedure to denoise a signal using the discrete wavelet transform. A real-time electrical signal contaminated with noise is used as the test bed for the method. The simulation results of the suggested design are presented. Future work includes using multiwavelets to denoise the signal.

References
[1] Texas Instruments, www.ti.com
[2] M. Smith, Application-Specific Integrated Circuits. USA: Addison Wesley Longman,
1997.
[3] R. Seals and G. Whapshott, Programmable Logic: PLDs and FPGAs. UK:
Macmillan, 1997.
[4] P. Kollig, B. Al-Hashimi, and K. Abbot, “FPGA implementation of high performance
FIR filters,” in Proc. International Symposium on Circuits and Systems, 1997.
[5] M. Shand, “Flexible image acquisition using reconfigurable hardware,” in Proc. of
the IEEE Workshop on Field Programmable Custom Computing Machines, Napa,
CA, Apr. 1995.
[6] J. Villasenor, B. Schoner, and C. Jones, “Video communication using rapidly
reconfigurable hardware,” IEEE Transactions on Circuits and Systems for Video
Technology, vol. 5, no. 12, pp. 565-567, Dec. 1995.
[7] L. Mintzer, “The role of distributed arithmetic in FPGAs,” Xilinx Corporation.
[8] K. Parhi, VLSI digital signal processing systems. US: John Wiley & Sons, 1999
[9] G. Strang and T. Nguyen, Wavelets and filter banks. MA: Wellesley-Cambridge
Press, 1996.
[10] M. Antonini, M. Barlaud, P. Mathieu, and I. Daubechies, “Image coding using
wavelet transform,” IEEE Trans. Image Processing, vol. 1, no.2, pp. 205-220, April
1992.
[11] T. Ebrahimi and F. Pereira, The MPEG-4 Book. Prentice Hall, July 2002
[12] D. Taubman and M. Marcellin, JPEG2000: Image Compression Fundamentals,
Standards, and Practice. Kluwer Academic Publishers, November 2001.
[13] Xilinx Corporation. “Xilinx breaks one million-gate barrier with delivery of new
Virtex series,” October 1998
[14] G. Knowles, “VLSI architecture for the discrete wavelet transform,” Electron
Letters, vol. 26, no.15, pp. 1184-1185, July 1990.

[15] A. Grzeszczak, M. Kandal, S. Panchanathan, and T. Yeap, “VLSI implementation
of discrete wavelet transform,” IEEE Trans. VLSI Systems, vol. 4, no. 4, pp. 421-433,
Dec. 1996.
[16] K. Parhi and T. Nishitani, “VLSI architectures for discrete wavelet transforms,” IEEE
Trans. VLSI Systems, pp. 191-202, June 1993.
[17] C. Chakrabarti, M. Vishwanath, and R. Owens, “Architectures for wavelet transforms:
a survey,” Journal of VLSI Signal Processing, vol. 14, no. 2, pp. 171-192, Nov. 1996.
[18] S. Mallat, “A theory for multiresolution signal decomposition: the wavelet
representation,” IEEE Trans. Pattern Anal. and Machine Intell., vol. 11, no. 7, pp. 674-
693, July 1989.
[19] I. Daubechies, “Orthonormal bases of compactly supported wavelets,” Comm.
Pure Appl. Math., vol. 41, pp. 906-966, 1988.
[20] S. White, “Applications of distributed arithmetic to digital signal processing: a
tutorial,” IEEE ASSP Magazine, pp. 4-19, July 1989.
[21] A. Oppenheim and R. Schafer, Discrete-Time Signal Processing. New Jersey: Prentice
Hall, 1999.
[22] Xess Corporation. www.xess.com.
[23] WaveLib www-sim.int-evry.fr/~bourges/WaveLib.html
[24] EPIC [http://www.cis.upenn.edu/~eero/epic.html]
[25] Imager Wavelet Library
[http://www.cs.ubc.ca/nest/imager/contributions/bobl/wvlt/download/]
[26] Mathematica wavelet programs [http://timna.Mines.EDU/wavelets/]
[27] p-wavelets [ftp://pandemonium.physics.missouri.edu/pub/wavelets/]
[28] WaveLab [http://playfair.Stanford.EDU/~wavelab/]
[29] Uvi_Wave Software [http://www.tsc.uvigo.es/~wavelets/uvi_wave.html]
[30] WAVBOX [ftp://simplicity.stanford.edu/pub/taswell/]
[31] WaveThresh [http://www.stats.bris.ac.uk/pub/software/wavethresh/WaveThresh.html]
[32] WPLIB [ftp://pascal.math.yale.edu/pub/wavelets/software/wplib/]
[33] W-Transform Matlab Toolbox [ftp://info.mcs.anl.gov/pub/W-transform/]
[34] XWPL [ftp://pascal.math.yale.edu/pub/wavelets/software/xwp]

