Final Thesis

SYMBOLS AND ABBREVIATIONS
A/D     Analog to Digital
AAC     Advanced Audio Coding
AWGN    Additive White Gaussian Noise
BEP     Bit Error Probability
BER     Bit Error Rate
BPS     Bits Per Second
CD      Compact Disc
CSI     Channel State Information
D/A     Digital to Analog
DC      Direct Current
DCT     Discrete Cosine Transform
DFT     Discrete Fourier Transform
DS      Direct Sequence
DSP     Digital Signal Processing
DVD     Digital Versatile Disc
DWT     Discrete Wavelet Transform
FFT     Fast Fourier Transform
FH      Frequency Hopping
FIR     Finite Impulse Response
GTC     Gain of Transform Coding
HAS     Human Auditory System
HVS     Human Visual System
ID      Identity
IID     Independent Identically Distributed
ISS     Improved Spread Spectrum
ISO     International Organization for Standardization
IWT     Integer Wavelet Transform
JND     Just Noticeable Distortion
LSB     Least Significant Bit
MER     Minimum-Error Replacement
MPEG    Moving Picture Experts Group
mp3     MPEG 1 Compression, Layer 3
MSE     Mean-Squared Error
NMR     Noise to Mask Ratio (in decibels)
PDA     Personal Digital Assistant
PDF     Probability Density Function
PN      Pseudo Noise
PRN     Pseudo Random Noise
PSNR    Peak Signal to Noise Ratio
PSC     Power-Density Spectrum Condition
QIM     Quantization Index Modulation
SDMI    Secure Digital Music Initiative
SMR     Signal to Mask Ratio (in decibels)
SNR     Signal to Noise Ratio (in decibels)
SPL     Sound Pressure Level
SS      Spread Spectrum
SYNC    Synchronization
TCP     Transmission Control Protocol
UDP     User Datagram Protocol
VHS     Video Home System
WMSE    Weighted Mean-Squared Error
WEP     Word Error Probability
WER     Word Error Rate
CHAPTER 1
INTRODUCTION
The development of steganography and cryptography techniques has provided a secure
communication environment in this challenging world. Before that, secure data transmission was
a very tedious job. Some of the techniques employed in the early days were writing with
invisible ink, making paintings with slight modifications, combining two images to create a
new image, shaving the head of the messenger and sketching the message on the scalp, and so
on [15].
Normally an application is developed by a person or a small group of people and used by
other users. Hackers are people who tend to modify the original application, or to use it for
profit without the permission of its original owner. It is evident that hackers outnumber those
who create applications. Hence, protecting an application should be the primary concern.
Protection techniques have to be efficient, robust and unique to restrict hackers. The
development of technology has widened the scope of steganography and at the same time
decreased its efficiency, because the medium has become relatively unprotected. This led to the
development of a new but related technology called watermarking. Applications of digital
watermarking include ownership protection, authentication, security, monitoring, and
industrial and medical applications [8] [21].
1.1 BACKGROUND
Globalization and the internet are the main reasons for the growth of research and the sharing
of information. However, they have also become the greatest tools for malicious users to attack
and pirate digital media. In its initial phase, watermarking was applied to images and is
termed digital image watermarking. Digital image watermarking has become popular; however,
malicious users have started to extract the watermark, creating challenges for developers.
Developers have therefore turned to audio as another embedding medium, and such watermarking
is termed audio watermarking. It is very difficult to secure digital information, especially
audio, and audio watermarking has become a challenge for developers because of its impact on
protecting the copyright of music [12]. Maintaining the copyright of digital media is a
compulsory requirement. Digital watermarking is a technique by which copyright information is
embedded into the host signal in such a way that the embedded information is imperceptible and
robust against intentional and unintentional attacks [14].
1.2 STEGANOGRAPHY AND WATERMARKING
1.2.1 STEGANOGRAPHY
Steganography is closely related to the ancient technique of cryptography. Cryptography
protects the contents of the message [15]. Steganography, on the other hand, is a technique for
sending information by writing it invisibly on a cover object. The word is taken from Greek and
means covered writing (stego = covered and graphy = writing) [3]. Here only the authorized
party is aware of the existence of the hidden message. An ideal steganographic technique
conceals a large amount of information while ensuring that the modified object is not visually
or audibly distinguishable from the original object.
A steganographic technique needs a cover object and a message that is to be transmitted
through the medium. It also requires a stego (owner) key to recover the embedded message. Only
users having the stego (owner) key can access the secret message. Another important
requirement for an efficient steganographic technique is that the cover object is modified in
such a way that its quality is not degraded after embedding the message.
1.2.2 WATERMARKING
Watermarking is a technique through which secure information is carried without degrading the
quality of the original signal. The technique consists of two major blocks:
(i) Embedding block
(ii) Extraction block
The system has an embedded owner key, as in the case of steganography. The key is used to
increase security: it prevents unauthorized users from manipulating or extracting data from the
carrier. The embedded object is known as the watermark, the medium in which the watermark is
embedded is termed the carrier or cover object, and the modified object is termed the embedded
signal or watermarked data [15].
The embedding block, shown in Figure 1.1, takes a watermark, a cover object and a watermarking
key as inputs and creates the embedded signal or watermarked data [5]. The inputs to the
extraction block are the embedded object, the key and sometimes the watermark, as shown in
Figure 1.2 [5].
A watermarking technique that does not use the watermark during the extraction process is
termed blind watermarking. Blind watermarking is preferable to techniques that need the
watermark for extraction, because the watermarked signal and the owner key are sufficient to
recover the embedded secret information [10].
Figure 1.1 Digital Watermarking Embedding
Figure 1.2 Digital Watermarking Extraction
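The two blocks of Figures 1.1 and 1.2 can be illustrated with a minimal sketch. It assumes a simple least-significant-bit (LSB) scheme on integer audio samples, chosen here only for illustration and not the method developed in this thesis. The function names embed and extract, and the use of the key to seed a pseudo-random choice of sample positions, are hypothetical. Extraction is blind: it needs only the watermarked samples and the key.

```python
import random

def embed(cover, bits, key):
    """Embedding block: inputs are the cover samples, the watermark bits and the key."""
    if len(bits) > len(cover):
        raise ValueError("watermark is longer than the cover signal")
    # Assumption: the key seeds a pseudo-random choice of sample positions.
    positions = random.Random(key).sample(range(len(cover)), len(bits))
    marked = list(cover)
    for pos, bit in zip(positions, bits):
        marked[pos] = (marked[pos] & ~1) | bit  # overwrite the least significant bit
    return marked

def extract(marked, n_bits, key):
    """Extraction block (blind): inputs are the watermarked samples and the key only."""
    positions = random.Random(key).sample(range(len(marked)), n_bits)
    return [marked[pos] & 1 for pos in positions]

# Illustrative data: 64 random 16-bit samples and an 8-bit watermark.
source = random.Random(0)
cover = [source.randrange(-32768, 32768) for _ in range(64)]
watermark = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(cover, watermark, key=1234)
recovered = extract(marked, len(watermark), key=1234)
print(recovered == watermark)
```

Each sample changes by at most one quantization step, which keeps the distortion small, but an LSB watermark of this kind is fragile and would not survive the attacks discussed later.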

1.2.3 DIFFERENCES BETWEEN STEGANOGRAPHY AND WATERMARKING
Although steganography and watermarking are both used for covert communication, steganography
typically relates only to covert point-to-point communication between two parties [1].
Steganographic methods are not always robust against malicious attacks or modification of data
that might occur during transmission, storage or format conversion [5]. Watermarking is one
type of steganographic technique whose primary objective is to provide the security of the
object rather than the invisibility of the hidden object. The major difference between the two
techniques is the superior robustness of watermarking schemes [15]. To summarize, an ideal
steganographic system can embed a large amount of information with no visible degradation to
the cover object, whereas an ideal watermarking system embeds an amount of information that
cannot be altered or removed without making the cover object entirely unusable. A watermarking
system involves a tradeoff between capacity and security [16].
1.2.4 IMAGE AND AUDIO WATERMARKING
Watermarking has evolved considerably from its origin [21]. With the evolution of technology,
the medium of transmission has changed. Watermarking is preferred in digital media such as
image and audio. When the cover object discussed in Section 1.2.2 is an image, the process is
termed image watermarking. Audio watermarking is more challenging than image watermarking
because the human auditory system (HAS) has a wider dynamic range than the human visual
system (HVS) [12].
The evaluation of image quality is very important in today's video broadcasting, transmission
control and e-commerce, because quality is a key determinant of customer satisfaction and a key
indicator of transmission conditions. Quality is also very useful in evaluating the
effectiveness or performance of image processing algorithms or systems. Based on their
dependence on a reference image, image quality metrics can be divided into three categories:
Full-Reference, Reduced-Reference and No-Reference quality metrics. Full-Reference quality
metrics evaluate image quality by comparing the distorted image with the original image [5],
[7]. Widely used metrics in this category are PSNR, w-PSNR, Watson JND and SSIM.
Full-Reference quality metrics provide more accurate quality evaluation results than Reduced-
or No-Reference quality metrics. However, they become less practical when the original image is
not available. Reduced-Reference quality metrics evaluate the quality of a distorted image
using partial information about the original image. In the literature, such partial
information can be some features extracted from the original image [8] [10]. Reduced-Reference
quality metrics do not require the presence of the original image for quality evaluation.
However, the partial information about the original image needs to be transmitted to the
receiver side, either through an ancillary channel or by embedding it into the transmitted
image, and the bandwidth sacrificed for transmitting this additional information needs to be
considered. No-Reference quality metrics estimate image quality without accessing the original
image [11][16]. In practical applications, different quality metrics measure image degradation
from different angles. For example, PSNR measures image quality mathematically in terms of
MSE, whereas JND and SSIM measure image quality with more emphasis on the perceptual
experience. Multiple metrics are often used in practice to give full coverage of the image
quality evaluation. In terms of applicability in signal transmission, Reduced- or No-Reference
metrics are preferable to Full-Reference metrics. To this end, a Reduced- or No-Reference
quality evaluation scheme that can evaluate image quality in terms of different existing
metrics would be really useful.
Watermarking is one of the most promising approaches for developing Reduced- or No-Reference
quality metrics. In this case, a semi-fragile watermark is usually embedded in the original (or
cover) image. Both the embedded watermark and the cover image undergo the same distortion, so
the image quality can be estimated by evaluating the watermark degradation. Watermarking-based
quality estimation schemes can be categorized into image-feature-dependent schemes and
image-feature-independent schemes. In image-feature-dependent schemes, some features extracted
from the original image are normally embedded into the original image [9], [17], [18]. At the
receiver side, the embedded features are reconstructed and used as the reference watermark.
The features extracted from the distorted image are used as the distorted watermark. The
watermark degradation is assessed by comparing the distorted watermark with the reference
watermark. Using the reconstructed image features, instead of the original image features, as
the reference watermark may introduce additional inaccuracy into the quality estimation.
Moreover, for these schemes it is hard to determine which kinds of image features are suitable
for providing accurate quality estimation. Image-feature-independent schemes simplify the
situation [19][23]. In these schemes, the watermark is independent of the image features and
needs to be known at the receiver side. The watermark degradation is evaluated using the
distorted watermark and the original watermark. One challenging task in developing an
image-feature-independent watermarking-based quality metric is to enable the embedded
watermark to accurately reflect the quality changes of the cover image under distortion. This
requires the watermark to be adaptively embedded according to the characteristics of the cover
image, so that the embedded watermark degrades in a similar way to the cover image under
distortion. This is a critical part of the whole scheme and directly affects the accuracy of
quality estimation.
In [19], an image-feature-independent watermarking-based quality estimation scheme was
proposed which attempts to achieve the quality estimation accuracy of Full-Reference objective
metrics. In that scheme, the watermark degradation is measured using the True Detection Rate
(TDR). The image quality is then estimated by mapping the calculated TDR to a quality value
using an empirical mapping function, which is generated experimentally and is used as a priori
knowledge at the receiver side. An iterative process is used to find the optimal watermark
embedding strength by experimentally testing the image degradation characteristics, so that
the quality estimation error can be minimized. This iterative process gives the quality
estimation high accuracy. However, it also introduces relatively high computational
complexity, which makes it less suitable for certain applications. Moreover, in that scheme the
human perception characteristics are not taken into consideration during the watermark
embedding process, and the quality of the watermarked image is about 40 dB in PSNR on average.
Thus, the goal of our research is to keep the accuracy of quality estimation achieved in [19]
while improving the computational efficiency and reducing the image quality degradation caused
by the watermark embedding process. Here, the term accuracy refers to the correlation between
the estimated quality and the quality calculated using existing objective Full-Reference
quality metrics, such as PSNR. The closer the estimated quality is to the calculated quality,
the more accurate the quality estimation, and vice versa.
The watermark is a signature, embedded within the data of the original signal,
which in addition to being inaudible to the human ear, should also be statistically undetectable
and resistant to any malicious attempts to remove it. In addition, the watermark should be able
to resolve multiple ownership claims (known as the deadlock problem), by using the original
signal (i.e., the unmarked signal) in the signature detection process.
In order to meet the above demands, perceptual masking is used [1,2], both in the frequency
domain (using a psycho-acoustic model) and in the time domain. The added signature is signal
dependent, and is thus inaudible as well as robust enough to survive attempts to destroy it.
The audio signal is divided into segments. For each segment a local key is calculated and
summed with a general key (independent of the segment) to initiate a pseudo-random noise
sequence for the segment. The noise is colored by a filter whose coefficients are calculated
according to the psycho-acoustic model. After a temporal mask is applied (in order to reduce
the pre-echo effect), the colored noise becomes the watermark.
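The per-segment generation steps can be sketched as follows. This is only an illustration under stated assumptions: the rule used for the local key (a sum of absolute sample values), the fixed filter coefficients COEFFS and all function names are hypothetical placeholders, since the text does not specify them. In the scheme described above the coefficients would be derived from the psycho-acoustic model, and the temporal mask is omitted here.

```python
import random

def segment_seed(segment, general_key):
    """Sum a local key (derived from the segment) with the general key."""
    local_key = sum(abs(s) for s in segment) % (2 ** 31)  # assumed local-key rule
    return local_key + general_key

def pn_sequence(seed, length):
    """Pseudo-random noise sequence of +1/-1 values initiated by the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]

def color_noise(noise, coeffs):
    """Colour the noise with an FIR filter (direct convolution, zero initial state)."""
    return [
        sum(c * noise[n - k] for k, c in enumerate(coeffs) if n - k >= 0)
        for n in range(len(noise))
    ]

def segment_watermark(segment, general_key, coeffs):
    """Watermark for one segment: a keyed PN sequence shaped by the filter."""
    noise = pn_sequence(segment_seed(segment, general_key), len(segment))
    return color_noise(noise, coeffs)

# Placeholder coefficients; not derived from any psycho-acoustic model.
COEFFS = [0.5, 0.3, 0.2]
segment = [120, -340, 560, -80, 15, 230, -410, 90]  # integer PCM samples
wm = segment_watermark(segment, general_key=2024, coeffs=COEFFS)
marked = [s + w for s, w in zip(segment, wm)]
```

Because the local key depends on the unmarked segment, a detector built this way needs the original signal to regenerate the noise sequence, which matches the use of the original signal in the signature detection process mentioned above.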
1.3 APPLICATIONS OF WATERMARKING

1.3.1 Ownership protection and proof of ownership: In ownership protection applications, the
embedded watermark contains a unique proof of ownership. The embedded information is robust
and secure against attacks and can be presented in case of a dispute over ownership. There can
be situations where some other person modifies the embedded watermark and claims the content
as his own. In such cases the actual owner can use the watermark to prove ownership [5] [18]
[19].
1.3.2 Authentication and tampering detection: In this application, additional secondary
information is embedded in the host signal and can be used to check whether the host signal
has been tampered with. This is important because it is necessary to know about any tampering
of the media signal. Tampering is sometimes the result of forging the watermark, which has to
be avoided [5] [18] [19].
1.3.3 Fingerprinting: In fingerprinting applications, additional data embedded by a watermark
are used to trace the originator or the recipients of a particular copy of a multimedia file.
The usage of an audio file can be recorded by a fingerprinting system. When a file is accessed
by a user, a watermark, called a fingerprint in this case, is embedded into the file, thus
creating a mark on the audio. The usage history can be traced by extracting all the watermarks
that were embedded into the file [7].
1.3.4 Broadcast monitoring: Watermarking is used to embed identification codes for active
broadcast monitoring. No separate broadcast channel is required, as the data is embedded in the
host signal itself, which is one of the main advantages of the technique [19].
1.3.5 Copy control and access control: A watermark detector is usually integrated in a
recording or playback system, as in the DVD copy control algorithm [8] or as considered during
the development of the Secure Digital Music Initiative (SDMI) [7]. The copy control and access
control policy detects the watermark and enforces the corresponding operation of particular
hardware or software in the recording set [18].
1.3.6 Information carrier: Blind watermarking can be used in this sort of application. These
applications can transfer a lot of information, and the robustness of the algorithm is traded
against the size of the content [15].
1.3.7 Medical applications: Watermarking can be used to write the unique name of the patient
on X-ray or MRI scan reports. This application is important because it is highly advisable to
have the patient's name on reports, and it reduces the misplacement of reports, which are very
important during treatment [19].
1.3.8 Airline traffic monitoring: Watermarking is used in air traffic monitoring. The pilot
communicates with a ground monitoring system through voice at a particular frequency. However,
this communication can easily be intercepted and attacked, and is one of the causes of
miscommunication. To avoid such problems, the flight number is embedded into the voice
communication between the ground operator and the pilot. As flight numbers are unique, the
tracking of flights becomes more secure and easier [31].
CHAPTER 2
LITERATURE SURVEY
[1] Sha Wang et al., Adaptive Watermarking and Tree Structure Based Image Quality
Estimation, IEEE Transactions on Multimedia, Volume 16, Number 2, February 2014.
In this paper the authors propose a quality estimation method based on a novel semi-fragile
and adaptive watermarking scheme. The proposed scheme uses the embedded watermark to estimate
the degradation of the cover image under different distortions. The watermarking process is
implemented in the DWT domain of the cover image. The correlated DWT coefficients across the
DWT subbands are grouped into Set Partitioning in Hierarchical Trees (SPIHT) trees. These SPIHT
trees are further decomposed into a set of bitplanes. The watermark is embedded into the
selected bitplanes of the selected DWT coefficients of the selected tree without causing
significant fidelity loss to the cover image. The accuracy of the quality estimation is made to
approach that of Full-Reference metrics by referring to an "Ideal Mapping Curve" computed a
priori. The experimental results show that the proposed scheme can estimate image quality in
terms of PSNR, wPSNR, JND and SSIM with high accuracy under JPEG compression, JPEG2000
compression, Gaussian low-pass filtering and Gaussian noise distortion.
[2] Ms. Komal V. Goenka et al., Overview of Audio Watermarking Techniques, IJETAE, Volume 2,
Issue 2, February 2012.
In this paper the authors describe the need for audio watermarking along with its important
properties. The paper also reviews work done by various researchers on digital audio
watermarking and presents some of the important techniques used. The spread spectrum scheme
requires psychoacoustic adaptation for inaudible noise embedding, and this adaptation is rather
time-consuming. Of course, most audio watermarking schemes need psychoacoustic modeling for
inaudibility. Another disadvantage of spread-spectrum schemes is the difficulty of
synchronization.
[3] Shweta Sharma et al., Survey on Different Level of Audio Watermarking Techniques,
International Journal of Computer Applications (IJCA), Volume 49, No. 10, July 2012.
In this paper various techniques for digital audio watermarking are presented. Audio
watermarking is a useful technique for audio systems. It can work in different domains, such as
frequency and time. By using different watermarking schemes at different levels of the audio,
the audio can be secured against many types of attacks. The paper presents some techniques that
can be used to secure an audio system from attacks and surveys various transformation
techniques for embedding or extracting the watermark.
[4] Md. Iqbal Hasan Sarker et al., FFT-Based Audio Watermarking Method with a Gray Image for
Copyright Protection, International Journal of Advanced Science and Technology, Volume 47,
October 2012.
In this paper few algorithms have been proposed for audio watermarking. A new method of
embedding gray image data into the audio signal, together with an additive audio watermarking
algorithm in the Fast Fourier Transform (FFT) domain, is proposed. Experimental results
demonstrate that the watermark is inaudible and that the algorithm is robust to common
operations of digital audio signal processing, such as noise addition, re-sampling and
re-quantization. To evaluate the performance of the proposed audio watermarking method,
subjective and objective quality tests including Similarity (SIM) and Signal to Noise Ratio
(SNR) are conducted.
[5] B.K. Singh et al., Digital Audio Watermarking: An Overview, International Journal of
Electronics and Computer Science Engineering (IJECSE), Volume 4, Number 4, 2013.
This paper gives an overview of digital watermarking. Digital audio watermarking is a method
to embed or hide a watermark (information signal) in a digital signal, i.e. image, audio, text
or video data. The watermark is difficult to remove from the audio signal. If the signal is
copied, the watermark is also carried in the copy. A signal may carry several different
watermarks at the same time. Watermarking is used for protecting multimedia data against
unauthorized copying and piracy, and for ownership, inventions, authentication, etc. The paper
presents watermarking methods and applications.
[6] Dhananjay Yadav et al., Reversible Data Hiding Techniques, International Journal of
Electronics and Computer Science Engineering (IJECSE), Volume 1, Number 2, 2013.
In this paper the authors present a review of reversible watermarking techniques and show
different methods that are used to obtain reversible data hiding with higher embedding capacity
and invisible objects. Reversible data hiding is a technique used to hide data inside an image.
The data is hidden in such a way that the exact or original data is not visible. The hidden
data can be retrieved as and when required. Several methods are used in reversible data hiding,
such as watermarking, lossless embedding and encryption.
[7] Ali Al-Haj et al., DWT-Based Audio Watermarking, The International Arab Journal of
Information Technology, Vol. 8, No. 3, July 2011.
In this paper the authors describe an imperceptible and robust audio watermarking algorithm
based on the discrete wavelet transform. The performance of the algorithm has been evaluated
extensively, and simulation results are presented to demonstrate the imperceptibility and
robustness of the proposed algorithm. The spectrum of the host audio signal is decomposed to
locate the most appropriate regions in which to embed the watermark bits imperceptibly and
robustly.
CHAPTER 3
OVERVIEW OF AUDIO WATERMARKING TECHNIQUES
This chapter describes the features of the human auditory system that are important when
dealing with audio watermarking. It also considers the requirements of an efficient
watermarking strategy and different audio watermarking techniques in both the time and
frequency domains.
3.1 FEATURES OF HUMAN AUDITORY SYSTEM (HAS)
Audio watermarking is more challenging than image watermarking because of the wider dynamic
range of the HAS in comparison with the human visual system (HVS) [12]. The human ear can
perceive a power range greater than 10^9:1 and a range of frequencies greater than 10^3:1 [18].
In addition, the human ear can hear low ambient Gaussian noise on the order of 70 dB [18].
However, the HAS also has some useful features; for example, louder sounds mask quieter ones.
This feature can be used to embed additional information such as a watermark. Further, the HAS
is insensitive to a constant relative phase shift in a stationary audio signal, and some
spectral distortions are interpreted as natural, perceptually non-annoying ones [12]. The two
properties of the HAS most commonly used in watermarking algorithms are frequency
(simultaneous) masking and temporal masking [13]:
3.1.1 FREQUENCY MASKING
Frequency (simultaneous) masking is a frequency domain phenomenon in which a low-level signal
(the maskee) can be made inaudible (masked) by a simultaneously occurring stronger signal (the
masker), if the masker and maskee are close enough to each other in frequency [13]. A masking
threshold can be found, which is the level below which the audio signal is not audible. Thus,
the frequency domain is a good place to look for regions in which embedded data remains
imperceptible.
3.1.2 TEMPORAL MASKING
In addition to frequency masking, two phenomena of the HAS in the time domain also play an
important role in human auditory perception. These are pre-masking and post-masking in time
[13]. However, considering the wider scope for analysis offered by frequency masking compared
with temporal masking, the former is chosen for this thesis. Temporal masking is used in
applications where robustness is not the primary concern.
3.2 REQUIREMENTS OF THE EFFICIENT WATERMARK TECHNIQUE
According to the IFPI (International Federation of the Phonographic Industry) [29], digital
audio watermarking algorithms should meet certain requirements. The most significant
requirements are as follows:
3.2.1 Perceptibility: One of the important features of a watermarking technique is that the
watermarked signal should not lose the quality of the original signal. The signal to noise
ratio (SNR) of the watermarked signal with respect to the original signal should be kept above
20 dB [19]. In addition, the modification introduced by the technique should not be perceivable
by the human ear.
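The 20 dB requirement can be checked numerically. The sketch below assumes the usual definition SNR = 10 log10(sum of x(n)^2 / sum of (y(n) - x(n))^2), with x the original and y the watermarked signal; the text does not spell the formula out, so this definition and the function name snr_db are assumptions.

```python
import math

def snr_db(original, watermarked):
    """SNR in dB, treating the embedding distortion as the noise."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((y - x) ** 2 for x, y in zip(original, watermarked))
    if noise_power == 0:
        return math.inf  # identical signals: no distortion at all
    return 10.0 * math.log10(signal_power / noise_power)

# Illustrative samples: every sample is shifted by one unit by the embedding.
original = [100.0, -100.0, 100.0, -100.0]
watermarked = [101.0, -99.0, 101.0, -99.0]
print(snr_db(original, watermarked) > 20.0)
```

An SNR above the threshold is a necessary check only; it does not by itself guarantee that the watermark is inaudible.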
3.2.2 Reliability: Reliability covers features such as the robustness of the watermark against
malicious attacks and signal processing operations. The watermark should be designed to provide
high robustness against attacks. In addition, the watermark detection rate should be high under
any type of attack when ownership has to be proved. Some of the attacks summarized by the
Secure Digital Music Initiative (SDMI), an online forum for digital music copyright protection,
are digital-to-analog and analog-to-digital conversion, noise addition, band-pass filtering,
time-scale modification, echo addition and sample rate conversion [10].
3.2.3 Capacity: An efficient watermarking technique should be able to carry more information
without degrading the quality of the audio signal. It is also important that the watermark is
distributed over the whole host signal, because it is possible that only a part of the signal
is available at extraction. Hence, capacity is also a primary concern in real-time situations
[19].
3.2.4 Speed: The speed of embedding is one of the criteria for an efficient watermarking
technique. It is important in real-time applications where the embedding is done on continuous
signals, such as the speech of an official or the conversation between an airplane pilot and
ground control staff. Possible applications where speed is a constraint are audio streaming and
airline traffic monitoring. Both the embedding and the extraction process need to be as fast
and efficient as possible [19].
3.2.5 Asymmetry: If the watermark is the same for the entire set of cover objects, then
extracting it from one file compromises the watermark of all the files. Thus, asymmetry is also
a notable concern. It is recommended to use unique watermarks for different files to make the
technique more useful [19].
3.3 PROBLEMS AND ATTACKS ON AUDIO SIGNALS

As discussed in Section 3.2, the important requirements of an efficient watermarking technique
are robustness and inaudibility. There is a tradeoff between these two requirements; however,
by testing the algorithm against signal processing attacks the gap can be minimized. Every
application has its specific requirements and offers the choice of higher robustness at the
cost of signal quality, or vice versa. In the absence of transformations and attacks, every
watermarking technique performs efficiently. Some of the most common types of processing an
audio signal undergoes when transmitted through a medium are as follows [11]:
3.3.1 Dynamics: Amplitude modification and attenuation constitute the dynamics attacks.
Limiting, expansion and compression are more complicated, non-linear modifications.
Re-quantization is one attack of this type [20].
3.3.2 Filtering: Filtering is a common practice used to amplify or attenuate some part of the
signal. Basic low-pass and high-pass filters can be used to carry out this type of attack.
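A low-pass filtering attack can be simulated with a very small sketch. It assumes a causal moving-average filter, which is only one simple choice of low-pass filter and not a filter prescribed by the text; the function name moving_average and the default of four taps are illustrative.

```python
def moving_average(signal, taps=4):
    """Causal moving-average low-pass filter; the window is shorter at the start."""
    out = []
    for n in range(len(signal)):
        window = signal[max(0, n - taps + 1): n + 1]
        out.append(sum(window) / len(window))
    return out

slow = [2.0] * 8        # constant (zero-frequency) signal: passes unchanged
fast = [1.0, -1.0] * 4  # fastest possible alternation: removed by the filter
print(moving_average(slow))
print(moving_average(fast))
```

A watermark that lives mainly in the high-frequency content of the host signal would be attenuated in the same way as the alternating test signal here, which is why robustness against filtering has to be tested.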
3.3.3 Ambience: In some situations the audio signal gets delayed, or people record the signal
from a source and claim that the track is theirs. Such situations can be simulated in a room,
which is of great importance for checking the performance of a watermarked audio signal.
3.3.4 Conversion and lossy compression: Audio is generated at a particular sampling frequency
and bit rate; however, the created audio track will undergo many different types of compression
and conversion. Some of the most common are audio compression techniques based on
psychoacoustic effects (MPEG and Advanced Audio Coding (AAC)). In addition, it is common for
the bit rate of the original audio signal to be changed, for example from 128 kbps to 64 kbps
or 48 kbps. There are programs that perform these conversions and compression operations;
however, for testing purposes we have used MATLAB to implement them. Attacks such as
re-sampling and mp3 compression are typical examples.
3.3.5 Noise: Noise is commonly present in a signal after transmission. Hence, a watermarking
algorithm should be robust against noise attacks. It is recommended to test the algorithm by
adding additive white Gaussian noise (AWGN) to the host signal to check its robustness.
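A minimal sketch of such a noise attack is given below. It assumes that the noise level is specified as a target SNR in dB relative to the average power of the signal; the function name add_awgn, the fixed seed and the 440 Hz test tone are illustrative choices and are not taken from the tests reported in this thesis.

```python
import math
import random

def add_awgn(signal, target_snr_db, seed=0):
    """Add white Gaussian noise scaled to give approximately the target SNR."""
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10.0 ** (target_snr_db / 10.0))
    sigma = math.sqrt(noise_power)  # standard deviation of the noise samples
    return [x + rng.gauss(0.0, sigma) for x in signal]

# Illustrative host: a 440 Hz tone sampled at 8 kHz.
host = [math.sin(2.0 * math.pi * 440.0 * n / 8000.0) for n in range(20000)]
attacked = add_awgn(host, target_snr_db=20.0)
```

The SNR actually obtained fluctuates slightly around the target because the noise is random; repeating the attack over a range of target values shows the noise level at which watermark extraction starts to fail.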
3.3.6 Time stretch and pitch shift: These attacks change the length of the signal without
changing its pitch, or vice versa. They are de-synchronization attacks, which are quite common
in data transmission. Jittering is one such attack.
3.4 PERFORMANCE EVALUATION OF WATERMARKING METHODS
Several functions are used to assess a watermarking algorithm by running tests on the
resulting watermarked image.
3.4.1 Imperceptibility: The imperceptibility of the watermark is tested through comparing the
watermarked image with the original one. Several tests are usually used in this regard.
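3.4.2 MSE: Mean Squared Error (MSE) is one of the earliest tests used to check whether two pictures are similar. For an original image I and a watermarked image I_w, both of size M × N, it is given by:
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left[ I(i,j) - I_w(i,j) \right]^2 \tag{3.1}$$
3.4.3 PSNR: Peak Signal to Noise Ratio (PSNR) is a better test since it takes the signal strength into consideration (not only the error). With MAX_I denoting the maximum possible pixel value (255 for 8-bit images), it is obtained as:
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{\mathrm{MAX}_I^2}{\mathrm{MSE}} \right) \tag{3.2}$$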
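The two measures can be computed as in the sketch below. It assumes images are given as equally sized lists of rows and that the peak value is 255 (8-bit pixels); the function names `mse` and `psnr` are chosen here for illustration.

```python
import math


def mse(original, watermarked):
    """Mean squared error between two equally sized images (lists of rows)."""
    rows, cols = len(original), len(original[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            diff = original[r][c] - watermarked[r][c]
            total += diff * diff
    return total / (rows * cols)


def psnr(original, watermarked, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(original, watermarked)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak * peak / err)
```

A higher PSNR means the watermarked image is closer to the original, so the watermark is less perceptible.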
3.4.4 SSIM: The main problem with the previous two criteria is that they do not match what similarity means to the human visual system (HVS). Structural Similarity (SSIM) between two images x and y is defined as:
$$\mathrm{SSIM}(x,y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{3.3}$$
where μ, σ², and σ_xy are the mean, variance, and covariance of the images, and c1, c2 are the
stabilizing constants.
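As a rough illustration, SSIM can be evaluated over a whole image at once, as sketched below. This is a simplification: practical SSIM implementations average the index over local windows. The constants `k1 = 0.01` and `k2 = 0.03` and the peak value 255 are conventional choices assumed here, not values given in the thesis, and the function name `ssim_global` is hypothetical.

```python
def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two flattened images of equal length."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants c1 and c2 (assumed conventional values).
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

The index equals 1 for identical images and decreases as the structure of the two images diverges.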
3.4.5 Robustness: The robustness of a watermarking method can be evaluated by performing attacks on the watermarked image and measuring the similarity of the extracted message to the original one.
3.4.6 Compression Attack: JPEG is by far the most widely used image compression method. In MATLAB, the image must first be created from a matrix before it can be compressed at different quality factors.
3.4.7 Cropping: A cropping attack simply cuts off parts of the image. If the algorithm is non-blind, it is better to restore those parts from the original image so that the message can be recovered more reliably.
3.4.8 Noise: Common noise attacks include Gaussian, Poisson, salt-and-pepper, and speckle noise. In addition, the image recovered at extraction loses some components, and this loss also appears as noise.
3.5 VARIOUS AUDIO WATERMARKING TECHNIQUES

Audio watermarking techniques can be classified into two groups based on the domain of operation: time domain techniques and transformation based methods. Time domain techniques embed the watermark without any transformation, working directly on the original samples of the audio signal. One example is the least significant bit (LSB) method, in which the watermark is embedded into the least significant bits of the host signal. In contrast, transformation based methods perform watermarking in the transform domain, using transforms such as the discrete cosine transform and the discrete wavelet transform. In these approaches the embedding is done on the samples of the host signal after they are transformed. Transformation based techniques also provide additional information about the signal [26].
In general, time domain techniques provide the least robustness, since simple low pass filtering can remove the watermark [20]. Hence time domain techniques are not advisable for applications such as copyright protection and airline traffic monitoring; however, they can be used in applications such as proving ownership and medical applications.
Watermarking techniques can also be distinguished as visible or non-blind watermarking and blind watermarking, as described in Section 1.2.2. In the following, we present typical watermarking strategies: LSB coding, the spread spectrum technique, the patchwork technique, and quantization index modulation (QIM). A detailed description of the transformation methods is given later in this chapter.
3.5.1 LSB CODING
This is one of the most common techniques employed in signal processing applications. It substitutes the LSBs of the carrier signal with the bit pattern of the watermark [21]. The robustness depends on the number of bits that are replaced in the host signal. The technique is commonly used in image watermarking because each pixel is represented as an integer, which makes the bits easy to replace. Audio samples, by contrast, are real-valued, and converting them to integers degrades the quality of the signal to a great extent. The operation of 2-bit LSB coding is shown in Figure 3.1.
Figure 3.1 LSB Embedding
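The 2-bit LSB substitution can be sketched as follows. The example assumes the audio samples have already been converted to integers (for instance 16-bit PCM values); the function names `embed_lsb` and `extract_lsb` and the parameter `n_lsb` are hypothetical.

```python
def embed_lsb(samples, bits, n_lsb=2):
    """Replace the n_lsb lowest bits of successive integer samples with watermark bits."""
    if len(bits) > len(samples) * n_lsb:
        raise ValueError("watermark does not fit into the host signal")
    mask = (1 << n_lsb) - 1
    out = list(samples)
    for i in range(0, len(bits), n_lsb):
        chunk = bits[i:i + n_lsb]
        chunk = chunk + [0] * (n_lsb - len(chunk))  # pad the final chunk with zeros
        value = 0
        for b in chunk:
            value = (value << 1) | b
        idx = i // n_lsb
        # Clear the low bits of the sample, then write the watermark bits.
        out[idx] = (out[idx] & ~mask) | value
    return out


def extract_lsb(samples, n_bits, n_lsb=2):
    """Read n_bits watermark bits back from the low bits of the samples."""
    mask = (1 << n_lsb) - 1
    bits = []
    for s in samples:
        value = s & mask
        for shift in range(n_lsb - 1, -1, -1):
            bits.append((value >> shift) & 1)
        if len(bits) >= n_bits:
            break
    return bits[:n_bits]
```

With `n_lsb=2` each sample changes by at most 3 quantization levels, which illustrates the trade-off: more replaced bits carry more payload but distort the host more.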
3.5.2 SPREAD SPECTRUM TECHNIQUE
These techniques are derived from the concepts used in spread spectrum communication [21]. The basic approach is that a narrow band signal is transmitted over a much larger bandwidth, which makes it undetectable because its energy is spread out and overlapped. In a similar way, the watermark is spread over multiple frequency bins so that the energy in any one bin is very small and certainly undetectable [22].
In the spread spectrum technique, the original signal is first transformed to another domain using a domain transformation technique [21]. The embedding step can use any approach, for example quantization. Zhou et al. proposed an algorithm that embeds the watermark in the 0th and 4th DCT coefficients, which are obtained by applying the DCT to the original signal [23]. Both the embedding and the extraction procedures can be interpreted using Figure 3.2. The original signal is transformed into the frequency domain using the DCT, and the watermark is then embedded into the sample values in that domain. The reverse procedure is followed to obtain the watermarked signal. This process of generating the embedded signal is shown as the embedding procedure in Figure 3.2.
The embedded signal will undergo some attacks, so noise is added to the signal. To extract the watermark, the attacked signal is fed through the extraction procedure, which follows the same steps as the embedding procedure, as shown in Figure 3.2. The extraction process takes the attacked signal, applies the DCT, and frames the resulting components; the frames are then used to obtain the watermark. Care is taken to replicate the procedure used in the embedding process.
Figure 3.2 Block Diagram of Spread Spectrum Technique
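The spreading idea can be illustrated with the sketch below. It is not the algorithm of Zhou et al.; it assumes a generic additive scheme in which each watermark bit is spread over one frame of transform coefficients with a key-seeded ±1 pseudo-noise sequence and recovered by correlation. The names `pn_sequence`, `ss_embed`, `ss_detect` and the strength `alpha` are hypothetical.

```python
import random


def pn_sequence(key, length):
    """Pseudo-noise sequence of +1/-1 values derived from a secret key."""
    rng = random.Random(key)
    return [rng.choice((-1, 1)) for _ in range(length)]


def ss_embed(coeffs, bits, key, alpha=0.5):
    """Spread each bit over one frame of coefficients: c + alpha * (+/-1) * pn."""
    frame = len(coeffs) // len(bits)
    pn = pn_sequence(key, frame)
    out = list(coeffs)
    for k, bit in enumerate(bits):
        sign = 1 if bit else -1
        for j in range(frame):
            out[k * frame + j] += alpha * sign * pn[j]
    return out


def ss_detect(coeffs, n_bits, key):
    """Blind detection: correlate each frame with the same PN sequence."""
    frame = len(coeffs) // n_bits
    pn = pn_sequence(key, frame)
    bits = []
    for k in range(n_bits):
        corr = sum(coeffs[k * frame + j] * pn[j] for j in range(frame))
        bits.append(1 if corr > 0 else 0)
    return bits
```

Because the host coefficients are largely uncorrelated with the PN sequence, their contribution to the correlation tends to average out over a long frame, while the watermark term adds up coherently. Without the key the sequence cannot be regenerated, which is what makes the watermark hard to detect.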

3.5.3 PATCHWORK TECHNIQUE
The data to be watermarked is separated into two distinct subsets. One feature of the data is chosen and modified in opposite directions in the two subsets [21]. For example, let the original signal be divided into two parts A and B; part A is then increased by an amount δ and part B is decreased by the same amount δ. The way the samples are separated is the secret key, termed the watermarking key. Detection of the watermark relies on the statistical properties of the audio signal. Let NA and NB denote the sizes of the individual parts A and B, and let δ be the amount of change made to the host signal. Suppose that a[i] and b[i] represent the sample values at the ith position in blocks A and B. The difference of the sample values can be written as [21]:
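$$S = \frac{1}{N_A} \sum_{i=1}^{N_A} a[i] - \frac{1}{N_B} \sum_{i=1}^{N_B} b[i] \tag{3.4}$$
The expectation of the difference is used to extract the watermark, which is expressed as
follows:
$$E[S] = \begin{cases} 2\delta, & \text{if the watermark (change } \delta \text{) is present} \\ 0, & \text{otherwise} \end{cases} \tag{3.5}$$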
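A minimal sketch of the patchwork idea is given below. It assumes the two subsets are equally sized, chosen by a key-seeded random draw, and that the modified feature is the sample value itself; the names `split_indices`, `patchwork_embed`, `patchwork_statistic` and the parameters `patch_size` and `delta` are hypothetical.

```python
import random


def split_indices(n_samples, key, patch_size):
    """Use the watermarking key to pick two disjoint index sets A and B."""
    rng = random.Random(key)
    chosen = rng.sample(range(n_samples), 2 * patch_size)
    return chosen[:patch_size], chosen[patch_size:]


def patchwork_embed(samples, key, patch_size, delta):
    """Raise the samples in A by delta and lower the samples in B by delta."""
    a_idx, b_idx = split_indices(len(samples), key, patch_size)
    out = list(samples)
    for i in a_idx:
        out[i] += delta
    for i in b_idx:
        out[i] -= delta
    return out


def patchwork_statistic(samples, key, patch_size):
    """Difference between the mean of A and the mean of B."""
    a_idx, b_idx = split_indices(len(samples), key, patch_size)
    mean_a = sum(samples[i] for i in a_idx) / patch_size
    mean_b = sum(samples[i] for i in b_idx) / patch_size
    return mean_a - mean_b
```

For an unmarked signal the statistic is expected to be close to zero, while embedding shifts it by exactly twice `delta`, so a detector can compare it against a threshold between the two values.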
3.5.4 QUANTIZATION INDEX MODULATION
Quantization index modulation (QIM) is a technique that embeds the watermark by quantizing samples. The basic principle of QIM is to find the maximum value of the samples and to divide the range from 0 to that maximum into intervals of step size Δ. Each interval is assigned a value of 0 or 1 according to a pseudo random sequence. Every sample falls into one interval, so it inherits a polarity from the location of that interval. The watermark is embedded by moving the sample to the median of an interval, chosen according to whether the polarity matches the watermark bit. If the bit and the polarity are the same, the sample is moved to the median of its own interval, shown as the right black point in Figure 3.3 [14]. If the watermark bit and the polarity differ, the sample is moved to the median of the nearest neighboring interval, shown as the left dark point in Figure 3.3 [14]. The quantized sample can be expressed as shown in the equation below.
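$$Q(x) = \Delta \left\lfloor \frac{x}{\Delta} \right\rfloor + \frac{\Delta}{2} \tag{3.6}$$
Where: x is the original sample value of the audio signal, Δ is the step size, and Q(x) is the quantized value; hence
the quantization error is at most Δ/2 in magnitude.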
Figure 3.3 Modification of samples using QIM
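The interval-and-polarity rule can be sketched as follows. As a simplification, the polarity of an interval is assumed to alternate (interval index modulo 2) instead of following a pseudo random sequence, and the function names `qim_embed` and `qim_extract` are hypothetical.

```python
import math


def qim_embed(x, bit, step):
    """Move sample x to the median of an interval whose polarity equals `bit`."""
    k = math.floor(x / step)  # index of the interval containing x
    if k % 2 != bit:
        # Polarity mismatch: go to the nearest neighbouring interval instead.
        k += 1 if (x - k * step) >= step / 2 else -1
    return (k + 0.5) * step  # median (midpoint) of the chosen interval


def qim_extract(y, step):
    """Recover the bit as the polarity of the interval that contains y."""
    return math.floor(y / step) % 2
```

Because every embedded sample sits at the middle of its interval, the bit survives any perturbation smaller than half a step. A larger step therefore gives more robustness at the cost of more distortion, which is at most one step per sample in this sketch.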
3.6 TRANSFORMATION TECHNIQUES
Here we give the background on the discrete cosine transform (DCT) and the discrete wavelet transform (DWT), as well as on different DWT types such as orthogonal, bi-orthogonal and frame based filters.
3.6.1 DISCRETE COSINE TRANSFORM
The discrete cosine transform is a technique for converting a signal into elementary frequency components [25]. The DCT can be applied to both one-dimensional and two-dimensional signals, such as audio and images, respectively. It is a spectral transformation that shares the properties of the Discrete Fourier Transform [25]. The DCT uses only cosine functions of various wave numbers as basis functions and operates on real-valued signals and spectral coefficients. The DCT of a 1-dimensional (1-d) sequence, and the reconstruction of the original signal from its DCT coefficients, termed the inverse discrete cosine transform (IDCT), can be computed using the equations below [25]. In the following, fdct(x) is the original sequence while Cdct(u) denotes the DCT coefficients of the sequence.
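$$C_{dct}(u) = \alpha(u) \sum_{x=0}^{N-1} f_{dct}(x) \cos\left[ \frac{\pi (2x+1) u}{2N} \right], \quad u = 0, 1, \ldots, N-1 \tag{3.7}$$
$$f_{dct}(x) = \sum_{u=0}^{N-1} \alpha(u)\, C_{dct}(u) \cos\left[ \frac{\pi (2x+1) u}{2N} \right], \quad x = 0, 1, \ldots, N-1 \tag{3.8}$$
where N is the length of the sequence, α(0) = √(1/N), and α(u) = √(2/N) for u ≠ 0.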
From the equation for Cdct(u) it can be inferred that for u = 0 the component is proportional to the average of the signal; it is termed the DC coefficient in the literature [28]. All the other transformation coefficients are called AC coefficients. Important applications of the DCT include image compression and signal compression, and the most useful applications of the two-dimensional (2-d) DCT are image compression and encryption [25]. The 1-d DCT equations discussed above can be used to find the 2-d DCT by applying them to every row, and then to every column, as individual 1-d signals. Thus, the DCT coefficients CDCT2(u,v) of an M × N two-dimensional signal and their reconstruction fDCT2(x,y) can be calculated by the equations below.
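$$C_{DCT2}(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f_{DCT2}(x,y) \cos\left[ \frac{\pi (2x+1) u}{2M} \right] \cos\left[ \frac{\pi (2y+1) v}{2N} \right] \tag{3.9}$$
$$f_{DCT2}(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} \alpha(u)\,\alpha(v)\, C_{DCT2}(u,v) \cos\left[ \frac{\pi (2x+1) u}{2M} \right] \cos\left[ \frac{\pi (2y+1) v}{2N} \right] \tag{3.10}$$
$$\alpha(u) = \begin{cases} \sqrt{1/M}, & u = 0 \\ \sqrt{2/M}, & u \neq 0 \end{cases} \qquad \alpha(v) = \begin{cases} \sqrt{1/N}, & v = 0 \\ \sqrt{2/N}, & v \neq 0 \end{cases} \tag{3.11}$$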
Some of the properties of the DCT are de-correlation, energy compaction, separability, symmetry and orthogonality [12]. The DCT removes the inter-pixel redundancy of most natural images, and coding efficiency is maintained while encoding the uncorrelated transformation coefficients [28]. The DCT also packs the energy of the signal into the low frequency region, which makes it possible to reduce the size of the signal without degrading its quality.
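The 1-d transform pair can be written directly from its definition, as in the sketch below. It assumes the orthonormal DCT-II scaling (scale factor √(1/N) for u = 0 and √(2/N) otherwise) and uses a plain O(N²) loop for clarity instead of a fast algorithm; the function names `dct` and `idct` are chosen here for illustration.

```python
import math


def _alpha(u, n):
    """Orthonormal scale factor of the DCT-II."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)


def dct(f):
    """1-d DCT of a real sequence, computed from the defining sum."""
    n = len(f)
    return [
        _alpha(u, n) * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                           for x in range(n))
        for u in range(n)
    ]


def idct(c):
    """Inverse 1-d DCT: rebuild the sequence from its coefficients."""
    n = len(c)
    return [
        sum(_alpha(u, n) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
            for u in range(n))
        for x in range(n)
    ]
```

For a constant input all AC coefficients vanish and only the DC coefficient is non-zero, which is a simple instance of the energy compaction property described above.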
3.6.2 DISCRETE WAVELET TRANSFORM (DWT)
The majority of signals in practice are represented in the time domain. A time-amplitude representation is obtained by plotting the time domain signal. However, analysis in the time domain cannot give complete information about the signal, since it does not reveal the different frequencies present in the signal [26].
The frequency domain provides the details of the frequency components in the signal, which are important in applications such as electrocardiography (ECG), the graphical recording of the heart's electrical activity, or electroencephalography (EEG), the analysis of the electrical activity of the human brain [26]. The frequency spectrum of a signal is basically the set of frequency components (spectral components) of that signal [26]. The main drawback of the frequency domain is that it does not show when in time these frequencies occur.
Both the time domain and the frequency domain therefore have considerable drawbacks, which are rectified by the wavelet transform. The wavelet transform provides a time-frequency representation of the signal. Other time-frequency representations include the short time Fourier transform and Wigner distributions. There are different types of wavelet transform, such as the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT). The CWT is highly redundant as far as reconstruction of the signal is concerned, whereas the DWT provides sufficient information for both analysis and synthesis of the signal and is easier to implement than the CWT [26].
A complete wavelet processing structure contains an analysis block and a synthesis block. The analysis, or decomposition, block decomposes the signal into wavelet coefficients. The reconstruction process is the inverse of the decomposition process: the synthesis block takes the decomposed signal and synthesizes the (near) original signal. A view of the wavelet process is shown in Figure 3.4, where the original signal is decomposed in the analysis block and reconstructed using the synthesis block. Filters used in the analysis and
synthesis block
Figure 3.4 Basic Block View of Wavelet Functionality

A 1-level discrete wavelet transform decomposition separates the high pass and low pass components of the signal. The time-domain signal x[n] is passed through a high pass filter g0[n] and the output is down-sampled, which yields the detail coefficients (D). Likewise, passing x[n] through a low pass filter h0[n] and down-sampling generates the approximation coefficients (A). The working principle is shown in Figure 3.5.
Figure 3.5 Single Level DWT Analysis and Synthesis Blocks
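A single analysis and synthesis stage can be sketched with the Haar wavelet, the simplest filter pair. The choice of Haar is an assumption made for illustration (the thesis does not fix a wavelet here), and the function names `haar_analysis` and `haar_synthesis` are hypothetical. For Haar, filtering followed by down-sampling by two reduces to sums and differences of neighbouring sample pairs.

```python
import math

_SQRT2 = math.sqrt(2.0)


def haar_analysis(x):
    """One DWT level: return (approximation A, detail D) coefficients."""
    if len(x) % 2:
        raise ValueError("signal length must be even")
    half = len(x) // 2
    # Low pass filter + down-sampling: scaled pairwise sums.
    approx = [(x[2 * i] + x[2 * i + 1]) / _SQRT2 for i in range(half)]
    # High pass filter + down-sampling: scaled pairwise differences.
    detail = [(x[2 * i] - x[2 * i + 1]) / _SQRT2 for i in range(half)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Inverse of haar_analysis: up-sample, filter and add the two branches."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / _SQRT2)
        out.append((a - d) / _SQRT2)
    return out
```

Each output band has half as many coefficients as the input, and feeding both bands into the synthesis stage returns the original samples.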

For multi-level operation, the 1-level DWT procedure is repeated, taking as input to the one-level analysis block either the low frequency components, the high frequency components, or both, as in wavelet packets [34]. Each time, the portion of the signal corresponding to some frequencies is removed from the signal. The components most commonly chosen for further decomposition are the low frequency coefficients. The 3-level DWT decomposition is shown in Figure 3.6. A1 and D1 are the first level decomposition coefficients of signal x[n]. At the second level A1 is further decomposed into A2 and D2, and A2 is in turn decomposed into A3 and D3, as explained earlier.
For the reconstruction of the decomposed signal, A3 and D3 are used to find the low pass coefficients at level 2, as explained for the single level reconstruction process. The resulting level-2 low pass signal is combined with D2 to obtain the low pass coefficients at level 1. The level-1 low frequency components are combined with D1 to find the reconstructed original signal.
Figure 3.6 A 3-Level DWT decomposition of signal x[n]

From Figure 3.6, the reconstruction process can be interpreted as the inverse of the decomposition process. The approximation coefficients are up-sampled and passed through a low pass filter h1[n]; similarly, the detail coefficients are up-sampled and passed through a high pass filter g1[n]. The outputs of these two filters are added to obtain the reconstructed
signal of x[n].
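The multi-level scheme can be sketched by repeating a one-level stage on the approximation band. As before, the Haar wavelet is assumed purely for illustration, and the names `dwt_multilevel` and `idwt_multilevel` are hypothetical.

```python
import math

_SQRT2 = math.sqrt(2.0)


def _analysis(x):
    """One Haar level: pairwise sums (approximation) and differences (detail)."""
    half = len(x) // 2
    approx = [(x[2 * i] + x[2 * i + 1]) / _SQRT2 for i in range(half)]
    detail = [(x[2 * i] - x[2 * i + 1]) / _SQRT2 for i in range(half)]
    return approx, detail


def _synthesis(approx, detail):
    """Inverse of one Haar level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend(((a + d) / _SQRT2, (a - d) / _SQRT2))
    return out


def dwt_multilevel(x, levels=3):
    """Return (A_levels, [D1, D2, ..., D_levels]) by re-decomposing the approximation."""
    if len(x) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    approx = list(x)
    details = []
    for _ in range(levels):
        approx, detail = _analysis(approx)
        details.append(detail)
    return approx, details


def idwt_multilevel(approx, details):
    """Rebuild the signal: combine A3 with D3, the result with D2, then with D1."""
    for detail in reversed(details):
        approx = _synthesis(approx, detail)
    return approx
```

For a 16-sample input and three levels, D1, D2 and D3 hold 8, 4 and 2 coefficients and A3 holds 2, so the total number of coefficients equals the number of input samples.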
3.7 CONCLUSION
In this chapter, we presented the features of the human auditory system and the requirements of efficient watermarking techniques. Problems with, and possible attacks on, the audio signal were also described. Different audio watermarking techniques from the literature, namely LSB coding, the spread spectrum technique, the patchwork technique, and quantization index modulation, were presented. Detailed information about transformation techniques such as the discrete cosine transform and the discrete wavelet transform (DWT) was also provided, together with the different types of DWT.
CHAPTER 4
PROPOSED METHODOLOGY
FREQUENCY MASKING METHOD

In the proposed work, an input image (the original signal) is embedded into an audio signal. Audio watermarking is more challenging than image watermarking because of the wider dynamic range of the HAS in comparison with the human visual system (HVS) [1]. The human ear can perceive a power range greater than 10^9:1 and a frequency range of 10^3:1 [4]. In addition, the human ear can hear low ambient Gaussian noise on the order of 70 dB [4]. However, the HAS also has useful features, such as louder sounds masking the corresponding quieter sounds. This feature can be used to embed additional information such as a watermark. Further, the HAS is insensitive to a constant relative phase shift in a stationary audio signal, and some spectral distortions are interpreted as natural, perceptually non-annoying ones [2]. The two properties of the HAS predominantly used in watermarking algorithms are frequency (simultaneous) masking and temporal masking [3].
Frequency Masking: Frequency (simultaneous) masking is a frequency domain phenomenon in which a low level signal (the maskee) is made inaudible (masked) by a simultaneously occurring stronger signal (the masker), provided the masker and maskee are close enough to each other in frequency [5]. A masking threshold can be found; it is the level below which the audio signal is not audible. Thus, the frequency domain is a good place to look for regions in which a watermark remains imperceptible.
Temporal Masking: In addition to frequency masking, two phenomena of the HAS in the time domain also play an important role in human auditory perception: pre-masking and post-masking in time [5]. Temporal masking is used in applications where robustness is not the primary consideration.
In the proposed blind frequency masking algorithm, the entire unwatermarked host signal is not needed at the detector. Instead, a password (Co), usually a data reducing function, is used by the watermark detector to nullify the "noise" effect introduced by adding the host signal in the embedder. In a blind watermark detector, the unwatermarked host signal is unknown and cannot be removed before watermark extraction. Under these conditions, the analogy with Figure 6 can be made, where the added watermark is corrupted by the combined impact of the cover work and the noise signal. The received watermarked signal cwn is now viewed as a corrupted version of the added pattern wa, and the entire watermark detector is viewed as the channel decoder.
Fig 4.1: Proposed Frequency Masking Watermarking system with blind detection
4.1 PROPOSED EMBEDDING PROCESS
Figure 4.2 Watermark Embedding Block Diagram [blocks: Original Signal, FFT, Frequency Masking (psychoacoustic model), Temporal Masking, Calculate local key, Owner's Key, Pseudo Random Noise Generator, Noise Filtering, Output]
WATERMARK EMBEDDING ALGORITHM

(1) First take the image, i.e. the information signal.
(2) Convert it into the frequency domain by taking its transform.
(3) This results in the transformed image.
(4) Similarly, find the transformed audio signal, i.e. the carrier signal.
(5) Form the embedded signal by adding the transformed information and carrier signals, i.e. the
watermarked sub band.
(6) Then take the inverse transform, i.e. convert it into the time domain.
(7) Obtain the watermarked signal.
Figure 4.3 Watermark Detection Algorithm

4.2 PROPOSED DETECTION PROCESS
Figure 4.4 Watermark Detection Block Diagram [blocks: Tested Signal, Original Signal, Minimize Distortions, Calculate Signature, Owner's Key, Correlation & Thresholding, Result]
WATERMARK EXTRACTION ALGORITHM

(1) First take the watermarked signal.
(2) Convert it into the frequency domain by taking its transform.
(3) This results in the transformed watermarked signal.
(4) Take the transformed carrier signal.
(5) Extract the transformed image by subtracting the transformed carrier signal from the transformed watermarked signal.
(6) Then take the inverse transform, i.e. convert it into the time domain.
(7) Obtain the original image, i.e. the information signal.
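The embedding and extraction steps listed in Sections 4.1 and 4.2 can be sketched end to end as below. This is only an illustration under several assumptions that are not stated in the thesis: the transform is taken to be an orthonormal 1-d DCT, the image is flattened to a pixel vector of the same length as the audio frame, and the image coefficients are scaled by a hypothetical factor `strength` before being added. The psychoacoustic masking and key handling of the proposed method are not modelled. As in step (5) of the extraction list, the carrier is used when extracting.

```python
import math


def _alpha(u, n):
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)


def dct(f):
    n = len(f)
    return [_alpha(u, n) * sum(f[x] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                               for x in range(n)) for u in range(n)]


def idct(c):
    n = len(c)
    return [sum(_alpha(u, n) * c[u] * math.cos(math.pi * (2 * x + 1) * u / (2 * n))
                for u in range(n)) for x in range(n)]


def embed(audio, image_pixels, strength=0.01):
    """Embedding steps (1)-(7): transform both signals, add, inverse transform."""
    if len(audio) != len(image_pixels):
        raise ValueError("audio frame and flattened image must have equal length")
    image_coeffs = dct(image_pixels)      # steps 1-3: transformed image
    carrier_coeffs = dct(audio)           # step 4: transformed carrier
    mixed = [c + strength * w for c, w in zip(carrier_coeffs, image_coeffs)]  # step 5
    return idct(mixed)                    # steps 6-7: watermarked signal


def extract(watermarked, audio, strength=0.01):
    """Extraction steps (1)-(7): transform, subtract the carrier, inverse transform."""
    marked_coeffs = dct(watermarked)      # steps 1-3
    carrier_coeffs = dct(audio)           # step 4
    image_coeffs = [(m - c) / strength
                    for m, c in zip(marked_coeffs, carrier_coeffs)]  # step 5
    return idct(image_coeffs)             # steps 6-7: recovered image pixels
```

A smaller `strength` keeps the watermarked audio closer to the carrier but leaves the recovered image more exposed to noise added between embedding and extraction.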
Figure 4.5 Watermark Detection Algorithm
CHAPTER 7
CONCLUSION
In this work, a new watermarking scheme based on frequency masking is presented. The method takes the time domain signal and processes it in the frequency domain, while the time domain features of the carrier remain the same, so the hidden data cannot be identified. The algorithm is based on the psychoacoustic auditory model and spread spectrum theory. It generates a watermark signal using spread spectrum theory and embeds it into the signal by measuring the masking threshold with a modified psychoacoustic auditory model. Since the watermark is shaped to lie below the masking threshold, the difference between the original and the watermarked copy is imperceptible. Recovery of the watermark is performed without knowledge of the original signal. The watermarks are embedded into non-overlapping, randomly selected DCT coefficients of the audio signal and are very hard to detect even with blind detection. Audio watermarking is relatively new and has wide scope for research. In future, a new algorithm will be proposed that takes into account features of the Human Auditory System and signal processing theory. The proposed algorithm is based on the DCT domain and considers the more active components of the signal. The comparison shows that the noise PSNR is improved by 6%, or 0.0606 dB.
CHAPTER 8
FUTURE SCOPE
In future work, the proposed scheme will be further developed to estimate the quality of an image distorted by multiple distortions. Experiments on image quality estimation in terms of subjective quality scores will also be conducted. Since the proposed scheme has good computational efficiency, it is feasible to develop it further for video quality evaluation. In this work an image is taken as the data and audio as the host or carrier; in other work a color image could be used instead of a grey scale image, and audio could also be used as the data or information. Further, digital video watermarking could be implemented if larger data has to be hidden.
