
Studies on Data Hiding Techniques for Enforcement of Data Security

A
Mini Project Report
SUBMITTED

By EKTA VIRMANI MCS10008 Under the supervision of

Asst. Professor Ms. DEEPTI GOYAL Deptt. of Computer Science & Engg.

Department of Computer Science & Engineering Advanced Institute of Technology & Management,
70 KM Stone, Delhi-Mathura Road, Aurangabad, Dist. Palwal, Haryana (INDIA)

Print to PDF without this message by purchasing novaPDF (http://www.novapdf.com/)

MAHARSHI DAYANAND UNIVERSITY, ROHTAK


CANDIDATE'S DECLARATION

I hereby certify that the work which is being presented in the Mini project report entitled Studies on Data Hiding Techniques for Enforcement of Data Security by Ekta Virmani in partial fulfillment of the requirements for the award of the degree of M.Tech. (CSE), submitted in the Department of Computer Science and Engineering at ADVANCED INSTITUTE OF TECHNOLOGY & MANAGEMENT, MAHARSHI DAYANAND UNIVERSITY, ROHTAK, is an authentic record of my own work carried out during the period from July 2011 to Dec 2011 under the supervision of Asst. Prof. Ms. Deepti Goyal. The matter presented in this Mini project report has not been submitted by me to any other University / Institute for the award of an M.Tech degree.

Signature of the Student

This is to certify that the above statement made by the candidate is correct to the best of my/our knowledge.

Signature of the SUPERVISOR (S) The M.Tech Viva Voce Examination of (NAME OF CANDIDATE) has been held on____________ and accepted

Signature of Supervisor(s) Signature of External Examiner

Signature of H.O.D.


ACKNOWLEDGEMENT
I would like to place on record my deep sense of gratitude to Asst. Prof. Ms. DEEPTI GOYAL department of Computer Science and Engineering Maharshi Dayanand University, Rohtak (MDU), India for her generous guidance, help and useful suggestions. I express my sincere gratitude to Prof. R.N. RAJOTIA Dept. of Computer Science and Engineering, A.I.T.M Palwal India, for his stimulating guidance, continuous encouragement and supervision throughout the course of present work. I also wish to extend my thanks to other colleagues for attending my seminars and for their insightful comments and constructive suggestions to improve the quality of this research work. I am extremely thankful to Prof. G.P.Dubey, Principal, A.I.T.M Palwal, for providing me infrastructural facilities to work in, without which this work would not have been possible.

EKTA VIRMANI


ABSTRACT

Digital communication has become an essential part of today's infrastructure. Many applications are Internet-based, and in some cases the communication must be kept secret. Consequently, the security of information has become a fundamental issue. Two techniques are available to achieve this goal: cryptography and steganography. Using cryptography, the data is transformed into an unintelligible form and the encrypted data is then transmitted. In steganography, the data is embedded in an image file and the image file is transmitted. This paper proposes a system that combines these two methods to enhance the security of the data. The proposed system encrypts the data with a cryptographic algorithm and then embeds the encrypted text in an image file. The embedding process uses a stego key, and detecting or reading the embedded information is possible only with this key. The stego key (user-specified or default) is used not only to drive the random selection of the bytes in which message bits are hidden but also to encrypt the message file. The encryption method XORs the message bytes with random numbers generated by a pseudorandom number generator whose seed is derived from the stego key. We also calculate the message digest of the image and embed it in the image file to check the integrity of the message contents.


TABLE OF CONTENTS
1. Introduction to Data Security
1.1 Definition and Purpose
1.2 Introduction to Data Hiding
1.2.1 Strong User Authentication
1.2.2 Backup Solutions
1.3 Different Techniques for Data Hiding
1.3.1 Steganography
1.3.2 Digital Watermarking
1.4 Important properties of Data Hiding
2. Introduction to steganography
2.1 Introduction
2.2 Purpose
2.3 Scope
2.4 Definitions and Acronyms
2.5 Overview
3. Overall Description
3.1 Encryption and Steganography
3.2 Uses of Steganography
3.3 Steganography and Security
3.4 Steganography Tools
3.5 Possible attacks in Steganography
4. Steganographic Techniques
4.1 Steganography using Linear Feedback Shift Registers
4.2 Least significant Bit Insertion Steganography
4.3 Masking and Filtering
4.4 Discrete Cosine Transform Steganography
5. SRS of Project
6. Future scope of the project
7. References
8. Appendix A: Research Papers


1 Introduction to Data Security


1.1 Definition and Purpose : In simple terms, data security is the practice of keeping data protected from corruption and unauthorized access. The focus behind data security is to ensure privacy while protecting personal or corporate data. Data is the raw form of information stored as columns and rows in our databases, network servers and personal computers. This may be a wide range of information, from personal files and intellectual property to market analytics and details intended to be top secret. Data could be anything of interest that can be read or otherwise interpreted in human form. However, some of this information isn't intended to leave the system. Unauthorized access to this data could lead to numerous problems for a large corporation or even a personal home user. Having your bank account details stolen is just as damaging as a system administrator being robbed of the client information in their database. There has been a huge emphasis on data security of late, largely because of the Internet. There are a number of options for locking down your data, from software solutions to hardware mechanisms. Computer users are certainly more security-conscious these days, but is your data really secure? If you're not following the essential guidelines, your sensitive information may be at risk.

1.2 Introduction to Data Hiding

Data Hiding has become a critical security feature for thriving networks and active home users alike. This security mechanism uses mathematical schemes and algorithms to scramble data into unreadable text. It can only be decoded or decrypted by the party that possesses the associated key. Full-disk encryption (FDE) offers some of the best protection available. This technology enables you to encrypt every piece of data on a disk or hard disk drive. Full-disk encryption is even more powerful when hardware solutions are used in conjunction with software components. This combination is often referred to as end-based or end-point full-disk encryption.

1.2.1 Strong User Authentication


Authentication is another part of data security that we encounter in everyday computer usage. Just think about when you log into your email or blog account. That single sign-on process is a form of authentication that allows you to log into applications, files, folders and even an entire computer system. Once logged in, you have various privileges until you log out. Some systems will cancel a session if your machine has been idle for a certain amount of time, requiring that you authenticate once again to re-enter. The single sign-on scheme is also implemented in strong user authentication systems. However, these require individuals to log in using multiple factors of authentication. This may include a password, a one-time password, a smart card or even a fingerprint.

1.2.2 Backup Solutions

Data security wouldn't be complete without a solution to back up your critical information. Though it may appear secure while confined in a machine, there is always a chance that your data can be compromised. You could suddenly be hit with a malware infection where a virus destroys all of your files. Someone could enter your computer and steal data by slipping through a security hole in the operating system. Perhaps it was an inside job that caused your business to lose those sensitive reports. If all else fails, a reliable backup solution will allow you to restore your data instead of starting completely from scratch.

1.3 Different Techniques for Data Hiding

1.3.1 Steganography : The main purpose is to hide or cover the occurrence of communication with other data, in such a way that third parties (unauthorized persons) cannot detect or even notice the presence of the communication. Steganographic communications are usually point-to-point. Whereas cryptographic techniques attempt to conceal the content of a message, steganography conceals the existence of the secret message.

1.3.2 Digital watermarking : The objective is to embed a signature within a digital cover signal to signify origin or ownership. Watermarking, as opposed to steganography, has the additional requirement of robustness against possible attacks. Watermarks do not always need to be hidden (some systems use visible digital watermarks), and watermarking techniques are usually one-to-many.

1.4 The most important properties of data hiding are :

1. Robustness : the presence of the embedded information can be reliably detected after an image has been modified, but not destroyed beyond recognition. Robustness means resistance to "blind", non-targeted modification, or to common image operations.
2. Undetectability : typically required for secure covert communication. The embedded information is undetectable if the data with the embedded message are consistent with a model of the source from which the data are drawn. If they are not, mathematical analysis may reveal statistical discrepancies that expose the fact that hidden communication is happening.
3. Invisibility (perceptual transparency) : an average human subject must be unable to distinguish between data that contain hidden information and data that do not. This property is associated with the Signal-to-Noise Ratio (SNR).
4. Security : the embedded information cannot be removed beyond reliable detection by targeted attacks based on full knowledge of the embedding algorithm and knowledge of at least one carrier with a hidden message. The system is already insecure if an attacker is able to prove the existence of the secret message.
5. Capacity : the maximum amount of data that can be hidden and successfully extracted.
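The link between invisibility and SNR can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from this report: it treats the cover data as the signal and the change introduced by embedding as the noise. The function name `snr_db` and the sample byte values are illustrative assumptions.

```python
import math

def snr_db(cover, stego):
    """Signal-to-noise ratio in dB, treating (cover - stego) as noise."""
    if len(cover) != len(stego):
        raise ValueError("cover and stego must have the same length")
    signal_power = sum(c * c for c in cover)
    noise_power = sum((c - s) ** 2 for c, s in zip(cover, stego))
    if noise_power == 0:
        return math.inf  # identical data: embedding introduced no noise
    return 10 * math.log10(signal_power / noise_power)

# Hypothetical sample values: four bytes differ only in their lowest bit.
cover = [149, 13, 201, 150, 15, 202, 159, 16, 203]
stego = [149, 12, 201, 151, 14, 203, 159, 16, 203]
print(round(snr_db(cover, stego), 1))
```

A higher value suggests that the embedding distorted the cover less. A high SNR alone does not establish undetectability, which is a statistical property.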


2 INTRODUCTION TO STEGANOGRAPHY

2.1. INTRODUCTION

The term steganography comes from the Greek words for covered writing: steganos, which stands for covered or secret, and graphy, which stands for writing or drawing. If, as a child, you ever wrote an invisible message in lemon juice and had your friend hold it next to a light bulb in order to watch the message magically appear, you've used steganography. When using steganography on a computer, you actually hide a message within another file. That resulting file is called a "stego file."

The trick to computer steganography is to choose a file capable of hiding a message. A picture, audio, or video file is ideal for several reasons:

1. These types of files are often already compressed by an algorithm. For example, .jpeg, .mp3, and .mp4 are compressed formats (.wav files, by contrast, usually hold uncompressed audio).
2. These files tend to be large, making it easier to find spots capable of hiding some text.
3. These files make excellent distractors. That is, few people expect a text message to be hidden within a picture or an audio clip.

If the steganographic utility does its job well, a user shouldn't notice a difference in the quality of the image or sound, even though some of the bits have been changed in order to make room for the hidden message.

Steganography is the art and science of communicating in a way which hides the existence of the communication. In contrast to cryptography, where the "enemy" is allowed to detect, intercept and modify messages without being able to violate certain security premises guaranteed by a cryptosystem, the goal of steganography is to hide messages inside other "harmless" messages in a way that does not allow any "enemy" to even detect that there is a second secret message present. In the literature (especially the military literature), steganography is also referred to as transmission security, or TRANSEC for short.

A good steganography system should fulfill the same requirements posed by Kerckhoffs' principle in cryptography. This means that the security of the system has to be based on the assumption that the "enemy" has full knowledge of the design and implementation details of the steganographic system. The only missing information for the "enemy" is a short, easily exchangeable random number sequence, the secret key. Without the secret key, the "enemy" should not have the slightest chance of even becoming suspicious that hidden communication might be taking place on an observed communication channel.


Steganography is closely related to the problem of "hidden channels" in secure operating system design, a term which refers to all communication paths that cannot easily be restricted by access control mechanisms (e.g. two processes that communicate by modulating and measuring the CPU load). Steganography is also closely related to spread spectrum radio transmission, a technique that makes it possible to receive radio signals that are over 100 times weaker than the atmospheric background noise. Most communication channels like telephone lines and radio broadcasts transmit signals which are always accompanied by some kind of noise. This noise can be replaced by a secret signal that has been transformed into a form that is indistinguishable from noise without knowledge of a secret key, and in this way the secret signal can be transmitted undetectably.

However, really good steganography is much more difficult, and the use of most currently available steganographic tools might be quite easily detected by sufficiently careful analysis of the transmitted data. The noise on analog systems has a large number of properties that are very characteristic of the channel and the equipment used in the communication system. A good steganographic system has to observe the channel, build a model of the type of noise which is present, and then adapt the parameters of its own encoding algorithms so that the noise replacement fits the model parameters of the noise on the channel as well as possible. Whether the steganographic system is really secure depends on whether the "enemy" has a more sophisticated model of the noise on the channel than the one used in the steganographic system.

Common communication systems have a huge number of characteristics and only a small fraction of what looks like noise can actually be replaced by the statistically very clean noise of a cryptographic cipher text. Noise in communication systems is often created by modulation, quantization and signal cross-over and is heavily influenced by these mechanisms and in addition by all kinds of filters, echo cancelation units, data format converters, etc. Many steganographic systems have to work in noisy environments and consequently require synchronization and forward error correction mechanisms that also have to be undetectable as long as the secret key is unknown.

Cryptography, the science of writing in secret codes, addresses all of the elements necessary for secure communication over an insecure channel, namely privacy, confidentiality, key exchange, authentication, and nonrepudiation. But cryptography does not always provide safe communication. Steganography is the science of hiding information. Whereas the goal of cryptography is to make data unreadable by a third party, the goal of steganography is to hide the data from a third party. In this article, I will discuss what steganography is, what purposes it serves, and will provide an example using available software.


There are a large number of steganographic methods that most of us are familiar with (especially if you watch a lot of spy movies!), ranging from invisible ink and microdots to secreting a hidden message in the second letter of each word of a large body of text and spread spectrum radio communication. With computers and networks, there are many other ways of hiding information, such as:

1. Covert channels (e.g., Loki and some distributed denial-of-service tools use the Internet Control Message Protocol, or ICMP, as the communications channel between the "bad guy" and a compromised system)
2. Hidden text within Web pages
3. Hiding files in "plain sight" (e.g., what better place to "hide" a file than with an important sounding name in the c:\winnt\system32 directory?)
4. Null ciphers (e.g., using the first letter of each word to form a hidden message in an otherwise innocuous text)
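The null cipher mentioned above is simple enough to sketch in a few lines. This is an illustrative example only: the function name `null_cipher_decode` and the cover sentence are made up for the demonstration and do not come from any particular tool.

```python
def null_cipher_decode(text):
    """Recover a hidden message from the first letter of each word."""
    return "".join(word[0] for word in text.split())

# Hypothetical cover sentence whose word initials spell the message.
cover_text = "Send every nice dog home each leap period"
print(null_cipher_decode(cover_text))  # Sendhelp
```

The decoder needs no key at all, so this scheme relies entirely on nobody suspecting the cover text.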

Steganography today, however, is significantly more sophisticated than the examples above suggest, allowing a user to hide large amounts of information within image and audio files. These forms of steganography often are used in conjunction with cryptography so that the information is doubly protected; first it is encrypted and then hidden so that an adversary has to first find the information (an often difficult task in and of itself) and then decrypt it.

There are a number of uses for steganography besides the mere novelty. One of the most widely used applications is for so-called digital watermarking. A watermark, historically, is the replication of an image, logo, or text on paper stock so that the source of the document can be at least partially authenticated. A digital watermark can accomplish the same function; a graphic artist, for example, might post sample images on her Web site complete with an embedded signature so that she can later prove her ownership in case others attempt to portray her work as their own.

STEGANOGRAPHIC METHODS

The following formula provides a very generic description of the pieces of the steganographic process:

cover_medium + hidden_data + stego_key = stego_medium

In this context, the cover_medium is the file in which we will hide the hidden_data, which may also be encrypted using the stego_key. The resultant file is the stego_medium (which will, of course, be the same type of file as the cover_medium). The cover_medium (and, thus, the stego_medium) are typically image or audio files. In this article, I will focus on image files and will, therefore, refer to the cover_image and stego_image.

An image file is merely a binary file containing a binary representation of the color or light intensity of each picture element (pixel) comprising the image. Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of up to 256 colors forming a palette for this image, each color denoted by an 8-bit value. A 24-bit color scheme, as the term suggests, uses 24 bits per pixel and provides a much better set of colors.


In this case, each pixel is represented by three bytes, each byte representing the intensity of one of the three primary colors red, green, and blue (RGB), respectively. The Hypertext Markup Language (HTML) format for indicating colors in a Web page often uses a 24-bit format employing six hexadecimal digits, each pair representing the amount of red, green, and blue, respectively. The color orange, for example, would be displayed with red set to 100% (decimal 255, hex FF), green set to 50% (decimal 127, hex 7F), and no blue (0), so we would use "#FF7F00" in the HTML code.
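The orange example can be checked with a short sketch that packs three 8-bit intensities into the six-digit hexadecimal form, in red, green, blue order. The helper name `html_color` is an assumption made for illustration.

```python
def html_color(red, green, blue):
    """Format three 0-255 intensities as a six-digit HTML colour string."""
    for value in (red, green, blue):
        if not 0 <= value <= 255:
            raise ValueError("each intensity must be in the range 0..255")
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)

# Red at 100% (255), green at 50% (127), no blue.
orange = html_color(255, 127, 0)
print(orange)  # #FF7F00
```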

The size of an image file, then, is directly related to the number of pixels and the granularity of the color definition. A typical 640x480 pixel image using a palette of 256 colors would require a file about 307 KB in size (640 × 480 bytes), whereas a 1024x768 pixel high-resolution 24-bit color image would result in a 2.36 MB file (1024 × 768 × 3 bytes).
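These sizes follow directly from width, height, and bits per pixel. The sketch below reproduces the arithmetic for the raw pixel data only. It ignores file headers and the palette table, which is a simplifying assumption, and the function name `raw_image_bytes` is illustrative.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Size of the uncompressed pixel data in bytes (headers ignored)."""
    return width * height * bits_per_pixel // 8

small = raw_image_bytes(640, 480, 8)     # 8-bit palette image
large = raw_image_bytes(1024, 768, 24)   # 24-bit colour image
print(small)  # 307200 bytes, about 307 KB
print(large)  # 2359296 bytes, about 2.36 MB
```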

To avoid sending files of this enormous size, a number of compression schemes have been developed over time, notably Bitmap (BMP), Graphics Interchange Format (GIF), and Joint Photographic Experts Group (JPEG) file types. Not all are equally suited to steganography, however. GIF and 8-bit BMP files employ what is known as lossless compression, a scheme that allows the software to exactly reconstruct the original image. JPEG, on the other hand, uses lossy compression, which means that the expanded image is very nearly the same as the original but not an exact duplicate. While both methods allow computers to save storage space, lossless compression is much better suited to applications where the integrity of the original information must be maintained, such as steganography. While JPEG can be used for stego applications, it is more common to embed data in GIF or BMP files.

The simplest approach to hiding data within an image file is called least significant bit (LSB) insertion. In this method, we can take the binary representation of the hidden_data and overwrite the LSB of each byte within the cover_image. If we are using 24-bit color, the amount of change will be minimal and indiscernible to the human eye. As an example, suppose that we have three adjacent pixels (nine bytes) with the following RGB encoding:

10010101 00001101 11001001
10010110 00001111 11001010
10011111 00010000 11001011

Now suppose we want to "hide" the following 9 bits of data (the hidden data is usually compressed prior to being hidden): 101101101. If we overlay these 9 bits over the LSB of the 9 bytes above, we get the following (the LSBs of the second, fourth, fifth, and sixth bytes have changed):

10010101 00001100 11001001
10010111 00001110 11001011
10011111 00010000 11001011

Note that we have successfully hidden 9 bits but at a cost of only changing 4, or roughly 50%, of the LSBs.
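The worked example above can be reproduced with a minimal sketch of LSB insertion. The function name `embed_lsb` is an illustrative assumption, and the sketch embeds into consecutive bytes with no key, which is the simplest possible variant.

```python
def embed_lsb(cover_bytes, bits):
    """Overwrite the least significant bit of each cover byte with one message bit."""
    if len(bits) > len(cover_bytes):
        raise ValueError("message is longer than the cover can hold")
    stego = list(cover_bytes)
    for i, bit in enumerate(bits):
        # Clear the lowest bit, then set it to the message bit.
        stego[i] = (stego[i] & 0b11111110) | int(bit)
    return stego

# The nine cover bytes and the nine hidden bits from the example.
cover = [0b10010101, 0b00001101, 0b11001001,
         0b10010110, 0b00001111, 0b11001010,
         0b10011111, 0b00010000, 0b11001011]
stego = embed_lsb(cover, "101101101")
changed = sum(c != s for c, s in zip(cover, stego))

print([format(b, "08b") for b in stego])
print(changed)  # 4
```

On average about half of the LSBs already hold the required value, which is why only four of the nine bytes change here.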

This description is meant only as a high-level overview. Similar methods can be applied to 8-bit color but the changes, as the reader might imagine, are more dramatic. Gray-scale images, too, are very useful for steganographic purposes. One potential problem with any of these methods is that they can be found by an adversary who is looking. In addition, there are other methods besides LSB insertion with which to insert hidden information.

Without going into any detail, it is worth mentioning steganalysis, the art of detecting and breaking steganography. One form of this analysis is to examine the color palette of a graphical image. In most images, there will be a unique binary encoding of each individual color. If the image contains hidden data, however, many colors in the palette will have duplicate binary encodings since, for all practical purposes, we can't count the LSB. If the analysis of the color palette of a given file yields many duplicates, we might safely conclude that the file has hidden information.
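The palette check described above can be sketched as follows. This is a rough heuristic written for illustration, not the procedure of any specific steganalysis tool: it masks out the LSB of each channel and counts palette entries that then collide. The function name and the sample palette are assumptions.

```python
from collections import Counter

def near_duplicate_colors(palette):
    """Count palette entries that collide once each channel's LSB is ignored."""
    masked = Counter((r & 0xFE, g & 0xFE, b & 0xFE) for r, g, b in palette)
    return sum(count for count in masked.values() if count > 1)

# Hypothetical palette: the first two entries differ only in their LSBs.
palette = [(255, 127, 0), (254, 126, 1), (10, 20, 30), (200, 200, 200)]
print(near_duplicate_colors(palette))  # 2
```

How many such collisions count as suspicious depends on the image, so any threshold would have to be chosen empirically.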

But what files would you analyze? Suppose I decide to post a hidden message by hiding it in an image file that I post at an auction site on the Internet. The item I am auctioning is real, so a lot of people may access the site and download the file; only a few people know that the image has special information that only they can read. And we haven't even discussed hidden data inside audio files! Indeed, the quantity of potential cover files makes steganalysis a Herculean task.

2.2 PURPOSE

The main purpose of steganography is to hide the occurrence of communication. While most methods in use today are invisible to the observer's senses, mathematical analysis may reveal statistical discrepancies in the stego medium. These discrepancies expose the fact that hidden communication is happening. The purpose of this document is to describe the steps taken in the development of a steganography tool using the LFSR technique. This document describes the development phases as well as the working of the tool. This project deals with the concept of steganography, i.e. a method of hiding a secret message inside other data. The camouflaging document is restricted to bitmap images only. The basic purpose of this particular project is to create a secret message with the advantage of not only encrypting the message but also camouflaging it to such an extent that any person other than the ones involved would be unaware of any message at all.

2.3 SCOPE This comprehensive tool finds extensive use in transferring crucial data safely over various channels. The two-level security offered by the tool increases its reliability and thus enhances its usability.

2.4 DEFINITIONS AND ACRONYMS

LFSR - Linear Feedback Shift Register

JPEG - Joint Photographic Experts Group

MPEG - Moving Picture Experts Group

GIF - Graphics Interchange Format

BMP - a Microsoft Windows and OS/2 bitmap file

2.5 OVERVIEW Steganography is the act of concealing the existence of a message. This differs from conventional cryptography because it is concerned with hiding the presence of a message rather than its contents. As a familiar example, consider an acrostic, in which the first letter of each word of a text can be interpreted to reveal a hidden message. In a similar fashion, digital steganographic systems often use the least significant bit of each byte in some binary file to encode a message. To illustrate this concept, let us consider a simple example. Suppose we would like to save the binary data

0100

inside some binary file, and that this binary file contains four bytes:


10010101 11001110 10100111 00110110

To encode the data, the least significant bit of each byte is changed to reflect the corresponding bit of the message. In this example, a new binary file will be created with the data

10010100 11001111 10100110 00110110

To any party that is looking for it, the original message can be reconstructed by reading the least significant bit of each byte. The hope is that no third party will think to look in some inconspicuous binary data for a secret message. Frequently, the carrier medium is an image. It is hoped that by only changing the least significant bits of the data, the image will not appear to be altered and no one will attempt to decode the secret message.
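Reading the message back is the mirror image of embedding. A minimal sketch, assuming the receiver knows how many bits were hidden (the function name `extract_lsb` is illustrative):

```python
def extract_lsb(stego_bytes, n_bits):
    """Read the least significant bit of the first n_bits bytes."""
    return "".join(str(b & 1) for b in stego_bytes[:n_bits])

# The four modified bytes from the example above.
stego = [0b10010100, 0b11001111, 0b10100110, 0b00110110]
print(extract_lsb(stego, 4))  # 0100
```

Because no key is involved, anyone who thinks to look can run the same extraction, which is the weakness the paragraph above points out.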

Steganography serves to hide secret messages in other messages, such that the secret's very existence is concealed. Generally the sender writes an innocuous message and then conceals a secret message on the same piece of paper. Historical tricks include invisible inks, tiny pin punctures on selected characters, minute differences between handwritten characters, pencil marks on typewritten characters, grilles which cover most of the message except for a few characters, and so on.

More recently, people have been hiding secret messages in graphic images by replacing the least significant bit of each byte of the image with the bits of the message. The graphical image won't change appreciably (most graphics standards specify more gradations of color than the human eye can notice), and the message can be stripped out at the receiving end. You can store a 64-kilobyte message in a 1024 × 1024 grey-scale picture this way. Several public-domain programs do this sort of thing.
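The capacity claim can be checked with a quick calculation. The sketch below assumes one byte per grey-scale pixel and one hidden bit per byte. Both are assumptions about the scheme rather than details given in the text, and the function name is illustrative.

```python
def lsb_capacity_bytes(width, height, bytes_per_pixel=1, bits_per_byte=1):
    """Number of message bytes that fit when hiding bits_per_byte bits in each image byte."""
    return width * height * bytes_per_pixel * bits_per_byte // 8

capacity = lsb_capacity_bytes(1024, 1024)
print(capacity, capacity // 1024)  # 131072 bytes, i.e. 128 KB
```

Under these assumptions a 64-kilobyte message fits with room to spare, for example when only every other byte is used.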


3 OVERALL DESCRIPTION

3.1 ENCRYPTION & STEGANOGRAPHY

Steganography is often compared with another technique for securing data, namely cryptography. In cryptography we encrypt the secret message before sending it, but the existence of the encrypted message can easily be noticed. In steganography, by contrast, we hide the secret message in a cover file in such a manner that its existence cannot be detected. The attacker has to analyze all the files being transferred over the network in order to detect the presence of hidden data, which is an impossible task. Steganography, if implemented using efficient techniques, can be the most secure way of transferring data. The purpose of steganography is not only to keep others from knowing the hidden information; it is also to keep others from thinking that the information even exists. Classical stego concerns itself with ways of embedding a secret message (which might be a copyright mark, a covert communication, or a serial number) in a cover message (such as a video film, an audio recording, or computer code). If a stego method causes someone to suspect the carrier medium, then the method has failed. Encryption and steganography achieve separate goals. Encryption encodes data such that an unintended recipient cannot determine its intended meaning, which helps in achieving the confidentiality of the document. Stego does not alter data to make it unusable to an unintended recipient. Instead, the steganographer attempts to prevent an unintended recipient from suspecting that the data is there. Those who seek the ultimate in private communication can combine encryption and steganography. Encrypted data is more difficult to differentiate from naturally occurring phenomena than plaintext is in the carrier medium. The embedding is typically parameterized by a key: without knowledge of this key (or a related one) it is difficult for a third party to detect or remove the embedded material. Once the cover object has material embedded in it, it is called a stego object. Thus, for example, we might embed a mark in a covertext giving a stegotext, or embed a text in a cover image giving a stego-image, and so on. So encryption and steganography combined can provide the most secure way of communication.
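The combination of encryption and key-parameterized embedding can be sketched as follows. The abstract of this report describes XORing the message with numbers from a pseudorandom generator seeded from the stego key and using the key to select the bytes that carry message bits. The code below is a toy illustration of that structure, not the project's implementation. All function names are assumptions, SHA-256 is used only as one convenient way to turn a key string into a seed, and Python's `random` module is not cryptographically secure, so this must not be used to protect real data.

```python
import hashlib
import random

def _rng_from_key(stego_key, purpose):
    """Derive a seeded generator from the key (assumed derivation: SHA-256)."""
    digest = hashlib.sha256((purpose + ":" + stego_key).encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest, "big"))

def xor_with_keystream(data, stego_key):
    """XOR each byte with a key-derived pseudorandom byte (same call decrypts)."""
    rng = _rng_from_key(stego_key, "encrypt")
    return bytes(b ^ rng.randrange(256) for b in data)

def embed(cover, message, stego_key):
    """Encrypt the message, then hide its bits in key-selected cover bytes."""
    cipher = xor_with_keystream(message, stego_key)
    bits = [(byte >> shift) & 1 for byte in cipher for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    rng = _rng_from_key(stego_key, "positions")
    positions = rng.sample(range(len(cover)), len(bits))
    stego = bytearray(cover)
    for pos, bit in zip(positions, bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)

def extract(stego, n_bytes, stego_key):
    """Re-derive the positions from the key, read the LSBs, then decrypt."""
    rng = _rng_from_key(stego_key, "positions")
    positions = rng.sample(range(len(stego)), n_bytes * 8)
    bits = [stego[pos] & 1 for pos in positions]
    cipher = bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )
    return xor_with_keystream(cipher, stego_key)

# Hypothetical cover data standing in for the pixel bytes of a bitmap.
cover = bytes(range(256)) * 2
stego = embed(cover, b"hi", "secret")
print(extract(stego, 2, "secret"))  # b'hi'
```

Extraction needs the same key and the message length. A practical tool would store the length in the cover as well, which is left out here to keep the sketch short.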

3.2 USES OF STEGANOGRAPHY


Internet users frequently need to store, send or retrieve private information. The most common way to do this is to transform the plain data into an unintelligible form called encrypted data. Only those who know how to decrypt it can understand the encrypted data. This method of protecting data is known as encryption. As mentioned earlier, a major drawback of encryption is that the existence of the data is not hidden. Data that has been encrypted, although unreadable, still exists as data. Given enough time and information, someone might eventually decrypt the data. A solution to this is steganography. If a person simply wants to communicate without being subjected to his or her employer's monitoring systems, then digital steganography is a good solution: the most private communication is the one that never existed.

With the proliferation of multimedia and concerns about privacy on the Internet, such research has become even more pressing. Information is collected by numerous organizations, and the nature of digital media allows for the exact duplication of material with no notification that the material has been copied. The more information is placed within the public's reach on the Internet, the more the owners of such information need to protect themselves from unwanted surveillance, theft, and false representation and reproduction.

Systems to analyze techniques for uncovering hidden information and recovering seemingly destroyed information are thus of great importance to many groups, including law enforcement authorities in computer forensics and digital traffic analysis.

Businesses may have similar concerns regarding trade secrets or new product information. Businesses have increasingly taken advantage of another form of steganography, called watermarking. Watermarking is used primarily for identification and entails embedding a unique piece of information within a medium without noticeably altering the medium. For example, if a person creates a digital image, he can embed in the image a watermark that identifies him as the image's creator. He would achieve this by manipulating the image data using steganography, such that the result contains data representing his name without the image being noticeably altered. Others who obtain his digital image can't visibly determine that any extra information is hidden within it. If someone attempts to use his image without permission, he can prove it is his by extracting the watermark. Watermarking is commonly used to protect copyrighted digital media, such as Web page art and audio files.


FIG. 1 Forms of Steganography. Steganography divides into protection against detection (data hiding) and protection against removal (document marking); document marking in turn divides into watermarking and fingerprinting.

Digital data hiding techniques for images are explored, analyzed, attacked and countered. Understanding the limitations of these methods provides for the construction of robust methods that can better survive manipulations and attacks.

Stego and watermarking

Like many security tools, steganography can be used for a variety of reasons, some good, some not so good. Legitimate purposes can include things like watermarking images for reasons such as copyright protection. Digital watermarks (also known as fingerprinting, significant especially in copyrighting material) are similar to steganography in that they are overlaid in files, appear to be part of the original file, and are thus not easily detectable by the average person. Steganography can also be used as a substitute for a one-way hash value (where a variable-length input is used to create a fixed-length output string that verifies that no changes have been made to the original input). Further, steganography can be used to tag notes to online images (like post-it notes attached to paper files). Finally, steganography can be used to maintain the confidentiality of valuable information, to protect the data from possible sabotage, theft, or unauthorized viewing.

Unfortunately, steganography can also be used for illegitimate reasons. For instance, if someone was trying to steal data, they could conceal it in another file or files and send it out in an innocent-looking email or file transfer. Furthermore, a person with a hobby of saving pornography, or worse, to their hard drive may choose to hide the evidence through the use of steganography.

3.3 STEGANOGRAPHY AND SECURITY

Steganography is an effective means of hiding data, thereby protecting the data from unauthorized or unwanted viewing. But stego is simply one of many ways to protect the confidentiality of data. It is probably best used in conjunction with another data-hiding method. When used in combination, these methods can all be a part of a layered security approach. Some good complementary methods include:

Encryption - Encryption is the process of passing data or plaintext through a series of mathematical operations that generate an alternate form of the original data known as ciphertext. The encrypted data can only be read by parties who have been given the necessary key to decrypt the ciphertext back into its original plaintext form. Encryption doesn't hide data, but it does make it hard to read!

Hidden directories (Windows) - Windows offers this feature, which allows users to hide files. Using this feature is as easy as changing the properties of a directory to "hidden", and hoping that no one displays all types of files in their explorer.

Hiding directories (Unix) - Directories can be hidden inside existing directories that hold a lot of files, such as the /dev directory on a Unix implementation, or by creating a directory whose name starts with three dots (...) rather than the normal single or double dot.

Covert channels - Some tools can be used to transmit valuable data in seemingly normal network traffic. One such tool is Loki, which hides data in ICMP traffic (like ping).

3.4 STEGANOGRAPHY TOOLS

There are a vast number of tools available for steganography. An important distinction among the tools available today is between tools that do steganography and tools that do steganalysis, which is the practice of detecting steganography and destroying the hidden message. Steganalysis focuses on detection and destruction, as opposed to discovering and decrypting the message, because the latter can be difficult to do unless the encryption keys are known.

FIG. 2 PROCESS FLOW OF TRANSMITTED DATA. Sender: private data -> encrypt -> encrypted data -> split -> data chunks -> apply steganography (images) -> send inconspicuous files. Receiver: extract -> data chunks -> combine -> encrypted data -> decrypt -> private data.
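The flow in Fig. 2 can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names are hypothetical, the covers are plain byte strings standing in for image pixel data, and the SHA-256 based XOR keystream is only a toy stand-in for whatever cipher a real tool would use. The receiver is also assumed to know the length of each chunk.

```python
import hashlib

def keystream_xor(data, key):
    """Toy cipher (illustrative only): XOR data with a SHA-256 keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)

def split(data, n):
    """Split data into at most n chunks of roughly equal size."""
    size = max(1, -(-len(data) // n))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]

def hide(cover, chunk):
    """Write the bits of chunk into the LSBs of successive cover bytes."""
    bits = [(byte >> s) & 1 for byte in chunk for s in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for chunk")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit
    return bytes(stego)

def reveal(stego, n_bytes):
    """Rebuild n_bytes from the LSBs of the stego bytes."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)

def send(private, key, covers):
    """Encrypt, split, and hide one chunk per cover."""
    chunks = split(keystream_xor(private, key), len(covers))
    stegos = [hide(cover, chunk) for cover, chunk in zip(covers, chunks)]
    return stegos, [len(chunk) for chunk in chunks]

def receive(stegos, lengths, key):
    """Extract the chunks, combine them, and decrypt."""
    encrypted = b"".join(reveal(s, n) for s, n in zip(stegos, lengths))
    return keystream_xor(encrypted, key)

covers = [bytes(range(256))] * 3
stegos, lengths = send(b"meet at the usual place", b"shared key", covers)
print(receive(stegos, lengths, b"shared key"))
```

Each stego byte string has the same length as its cover and differs from it only in least significant bits, which is what makes the transmitted files inconspicuous.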


3.5 POSSIBLE ATTACKS IN STEGANOGRAPHY

Steganography concerns itself with hiding secret data within a cover message, but the technique is open to some possible attacks. A successful attack does not necessarily mean detecting a mark; rendering it useless is enough. Possible attacks on various techniques are:

1. The L.S.B. insertion technique is the most trivial one and has a much greater chance of being detected.
2. The key-dependent approach, in which sender and receiver share a secret key and use a cryptographic method to hide a text in a cover message, is secure only as long as the key is secret. Key management is still a prominent problem.
3. Techniques that rely on the variance in luminosity of the surrounding pixels can be rendered useless by a trivial filtering process.
4. In JPEG compression, a message can be embedded in the frequency domain by altering the components of the image's Discrete Cosine Transform. If several marks are introduced in succession with this technique, detection becomes trivial.
5. In the entropy technique, the entropy of the stegotext will be equal to the entropy of the cover text plus the entropy of the embedded material. Thus, in order to make the embedding process secure, we have to consider two things:
   a) The entropy of the embedded material has to be kept much less than the uncertainty in the opponent's measurement of the entropy of the cover text.
   b) Some way must be found of processing the cover text to reduce its entropy by an amount that can be made up by adding the embedded material. A compression algorithm can be used to remove some unnecessary information from the cover text before the material is embedded.
6. A possible attack on a cover message is for cunning pirates to replace the embedded material with noise.
7. We have to limit the amount of embedded text, because embedding more in the cover message makes it easier to estimate the statistics of the embedded message.

STEGANOGRAPHIC TECHNIQUES


To a computer, an image is an array of numbers that represent light intensities at various points, or pixels. Digital images are typically stored in either 24-bit or 8-bit per pixel files. 8-bit color images can be used to hide information. In 8-bit color images (such as GIF files), each pixel is represented as a single byte. Each pixel merely points to a color index table with 256 possible colors. Grey-scale images are preferred because the shades change very gradually between color entries.

Image compression offers a solution to large image files. Two kinds of image compression are possible:

Lossy compression: It offers high compression, but may lose image data. This may compromise the hidden information when part of the embedded information is lost. It is based on the concept of compromising the accuracy of the reconstructed image in exchange for increased compression. If the resulting distortion can be tolerated, the increase in compression can be significant.

Lossless compression: It maintains the original image data exactly, which is why it is favoured by steganographic techniques.

4.1 Steganography using Linear Feedback Shift Registers:

A Linear Feedback Shift Register (LFSR) is a mechanism for generating a pseudo-random binary sequence. It consists of a series of D flip-flops that are initialized by an initialization vector, also called the seed value. A clock synchronizes the D flip-flops. At every clock tick, a bit is output and the values in the flip-flops are shifted to the right by one. The first flip-flop gets its value from a feedback function, described by a polynomial, that combines certain cells with XOR. Linear Feedback Shift Registers play an important role in generating key streams for various encryption techniques, and they generate pseudo-random sequences with good statistical properties. With a suitably chosen feedback polynomial, the period of the sequence is 2 raised to the power n, minus 1 (2^n - 1), where n is the number of cells in the LFSR.
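As an illustration of the shifting and feedback described above, the following minimal Python sketch models a five-cell register. The seed and the tap positions are assumptions chosen for the example (the taps correspond to a primitive feedback polynomial of degree 5); they are not taken from any particular tool.

```python
def lfsr_step(state, taps):
    """Advance the register one clock tick; return (output_bit, new_state)."""
    feedback = 0
    for t in taps:
        feedback ^= state[t]          # XOR of the tapped cells
    # The last cell is output; the others shift right; feedback enters cell 0.
    return state[-1], (feedback,) + state[:-1]

def lfsr_period(seed, taps):
    """Count clock ticks until the register returns to its seed state."""
    start = tuple(seed)
    state = start
    steps = 0
    while True:
        _, state = lfsr_step(state, taps)
        steps += 1
        if state == start:
            return steps

SEED = (1, 0, 0, 1, 1)   # assumed non-zero initialization vector
TAPS = (2, 4)            # assumed tap positions (0-based cell indices)

print(lfsr_period(SEED, TAPS))   # 31, i.e. 2^5 - 1
```

An all-zero seed must be avoided, because XOR feedback would then keep the register at zero forever.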

Linear Feedback Shift Registers in Steganography:

In the usual LSB techniques, we store the data in the Least Significant Bit of consecutive data elements. The weakness of this technique lies in the fact that data is stored in the same bit position of every byte, so an attacker can easily collect all the data bits and hence recover the secret data. LFSRs help to address this weakness.

[Figure: five-stage LFSR built from D flip-flops FF1 to FF5 (inputs D1 to D5, outputs Q1 to Q5) with XOR feedback]

Consider the output of the last 3 flip-flops of the above figure. At every stage we get a pseudo-random decimal number ranging from 0 to 7. The length of this sequence of numbers is 2^n - 1, after which the sequence starts repeating. The larger the value of n, the longer the period and the less predictable the sequence. Assume the sequence generated is 2,3,0,6,5,1,2,4,1,0,4,6. In this technique we store each data bit in the bit position given by the decimal value shown by the last 3 flip-flops. So we store the data in bit position 2 of the first byte, then in bit position 3 of the second byte, then in the LSB (bit position 0) of the third byte, and so on. Hence the attacker cannot recover the secret data without knowing the seed value that was used to initialize the flip-flops, because a different initialization vector gives a different sequence. The randomness involved in this steganographic technique secures the data to a great extent.
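The position-hopping idea described above can be sketched in Python as follows. The seed, the tap positions, and the convention that the third-from-last cell is the most significant of the three bits are all assumptions made for the example, and the function names are hypothetical.

```python
def bit_positions(seed, taps, count):
    """Return `count` bit positions (0-7) read from the last three LFSR cells."""
    state = tuple(seed)
    positions = []
    for _ in range(count):
        positions.append(state[-3] * 4 + state[-2] * 2 + state[-1])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = (feedback,) + state[:-1]   # shift right by one cell
    return positions

def embed_lfsr(cover, message_bits, seed, taps):
    """Store one message bit per cover byte at an LFSR-chosen bit position."""
    if len(message_bits) > len(cover):
        raise ValueError("cover too small for message")
    stego = bytearray(cover)
    positions = bit_positions(seed, taps, len(message_bits))
    for i, (bit, pos) in enumerate(zip(message_bits, positions)):
        stego[i] = (stego[i] & ~(1 << pos) & 0xFF) | (bit << pos)
    return bytes(stego)

def extract_lfsr(stego, n_bits, seed, taps):
    """Read the bits back using the same seed and taps."""
    positions = bit_positions(seed, taps, n_bits)
    return [(stego[i] >> pos) & 1 for i, pos in enumerate(positions)]

SEED = (1, 0, 0, 1, 1)   # assumed shared secret seed
TAPS = (2, 4)            # assumed tap positions

cover = bytes(range(50, 90))
bits = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]
stego = embed_lfsr(cover, bits, SEED, TAPS)
print(extract_lfsr(stego, len(bits), SEED, TAPS) == bits)
```

One design consideration: when the chosen position is a high-order bit (for example position 7), the change to the cover byte is large and may become visible, so a practical scheme might restrict the positions to the lower bits.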

4.2 LEAST SIGNIFICANT BIT INSERTION STEGANOGRAPHY

LSB insertion is the easiest and one of the most commonly used techniques to implement steganography. It makes use of the fact that a change in the LSB of a data byte of an image is not noticeable to the human eye. The LSB insertion technique can similarly be applied to audio files. The following steps are implemented for LSB insertion:


1. Read the container file and the text byte by byte.
2. Replace the LSB of each container byte with the next text bit.
3. Store the resultant byte in the steganographed file.

The following steps are implemented for the retrieval of data from a file:

1. Read the steganographed file byte by byte.
2. Collect the LSB of every byte.
3. Combine the LSBs to retrieve the hidden data.
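The insertion and retrieval steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the cover is treated as raw sample bytes (a real image file would need its header skipped), and the receiver is assumed to know the length of the hidden text.

```python
def embed_lsb(cover: bytes, secret: bytes) -> bytes:
    """Replace the LSB of successive cover bytes with the bits of secret."""
    # Unpack the secret into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in secret for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover too small for secret")
    stego = bytearray(cover)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit   # clear the LSB, then set it
    return bytes(stego)

def extract_lsb(stego: bytes, n_bytes: int) -> bytes:
    """Collect the LSB of every byte and regroup the bits into bytes."""
    out = bytearray()
    for i in range(n_bytes):
        value = 0
        for byte in stego[8 * i: 8 * i + 8]:
            value = (value << 1) | (byte & 1)
        out.append(value)
    return bytes(out)

cover = bytes(range(256))
stego = embed_lsb(cover, b"hi")
print(extract_lsb(stego, 2))
```

Each cover byte changes by at most 1, which is why the modification is hard to see, and each hidden byte consumes eight cover bytes.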

ADVANTAGES OF LSB INSERTION TECHNIQUE:

1. ENCRYPTION: The data to be embedded is first encrypted and then embedded into the image. Many algorithms can be used for the encryption; at the time of retrieval, the user can obtain the meaningful data only if he knows the algorithm applied and the key used for encryption.

2. SELECTIVE INSERTION: Only selected bytes of the container file are modified. In this case the amount of data that can be stored in an image decreases, but this is acceptable from the security point of view.

3. HIGH LEVEL LSB INSERTION: We hide the data in more than one bit of a byte, taking care that there is no visible distortion in the final file. This has two advantages:

   - More data is stored in the image file.
   - An intruder might not be able to retrieve the data, as he will only look for a single bit in each byte of the file.
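High level LSB insertion can be sketched by generalizing the single-bit case to the k lowest bits of each byte. The sketch below is only an illustration: the function names are hypothetical, the cover is assumed to be raw sample bytes, and the receiver is assumed to know both k and the length of the hidden data.

```python
def to_bits(data):
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> s) & 1 for byte in data for s in range(7, -1, -1)]

def embed_k_lsb(cover, secret, k):
    """Hide secret in the k least significant bits of each cover byte."""
    if not 1 <= k <= 8:
        raise ValueError("k must be between 1 and 8")
    bits = to_bits(secret)
    bits += [0] * (-len(bits) % k)            # pad to a multiple of k
    groups = [bits[i:i + k] for i in range(0, len(bits), k)]
    if len(groups) > len(cover):
        raise ValueError("cover too small for secret")
    mask = (1 << k) - 1
    stego = bytearray(cover)
    for i, group in enumerate(groups):
        value = int("".join(map(str, group)), 2)
        stego[i] = (stego[i] & ~mask & 0xFF) | value
    return bytes(stego)

def extract_k_lsb(stego, n_bytes, k):
    """Read k low bits per byte and regroup them into n_bytes bytes."""
    n_groups = (8 * n_bytes + k - 1) // k     # ceiling division
    bits = []
    for byte in stego[:n_groups]:
        bits.extend((byte >> s) & 1 for s in range(k - 1, -1, -1))
    bits = bits[:8 * n_bytes]                 # drop the padding bits
    return bytes(int("".join(map(str, bits[i:i + 8])), 2)
                 for i in range(0, len(bits), 8))

cover = bytes(range(100, 200))
stego = embed_k_lsb(cover, b"secret", 2)
print(extract_k_lsb(stego, 6, 2))
```

With k bits per byte the capacity grows k-fold, but each byte may now change by up to 2^k - 1, so larger values of k trade invisibility for capacity.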

4.3 MASKING AND FILTERING

Masking and filtering techniques hide information by marking an image in a manner similar to paper watermarks. Because watermarking techniques are more integrated into the image, they may be applied without fear of image destruction from lossy compression. By covering, or masking, a faint but perceptible signal with another signal so that the first becomes non-perceptible, we exploit the fact that the human visual system cannot detect slight changes in certain temporal domains of the image.


Masking techniques are more suitable for use in lossy JPEG images than LSB insertion because of their relative immunity to image operations such as compression and cropping.

4.4 DCT (DISCRETE COSINE TRANSFORM) STEGANOGRAPHY

It is a distinctive form of steganography in which the data is hidden in the image according to the local characteristics of the image. This means that we hide more data in the high frequency components of the image and less data in the low frequency components, so that the influence on the quality of the image is minimal. The technique embeds data into the DCT coefficients of adjacent blocks of the image where there is insignificant deviation between their components. The data is embedded using the LSB insertion technique, where the LSB of the DCT coefficients is changed in accordance with the data to be stored. The data embedding also depends on the type of the image. For a colour image, the data is embedded in the separate R, G and B planes. For a grayscale image the process is straightforward, i.e. the data is stored directly.

Steps taken to perform DCT steganography:-

1. Find the DCT of the source image.
2. Find the DCT coefficients below a predefined threshold value. If there is not a significant difference between the same components of the DCT coefficients of adjacent blocks of the image, data is embedded in the DCT coefficients; if there is a significant difference between the components, data is not embedded.
3. Implement the data embedding using any technique to substitute the data bits. This process is a part of the embedding algorithm.
4. Apply the inverse DCT to the modified DCT coefficients. Now we have the steganographed image (in the same format as the source image) that contains the embedded data.

Steps taken to perform DCT de-steganography:-

1. Find the DCT of the source image (which contains the hidden data).
2. Find the DCT coefficients below a predefined threshold value.
3. Find the bits of data from these DCT coefficients.
4. Retrieve the actual data by combining these bits.
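The embedding and extraction steps above can be illustrated, in simplified form, on a single 8x8 block. The sketch below is not the tool's algorithm: it omits the threshold test and the comparison of adjacent blocks, and instead writes each bit into the parity of a quantized coefficient at fixed positions. The quantization step Q, the coefficient positions, and the function names are all assumptions made for the example. The positions avoid the top row and the leftmost column of the coefficient block.

```python
import math

N = 8    # block size
Q = 16   # assumed quantization step
POSITIONS = [(2, 3), (3, 2), (3, 3), (4, 1)]   # assumed coefficient slots

def _scale(u):
    return math.sqrt(1 / N) if u == 0 else math.sqrt(2 / N)

# Orthonormal DCT-II basis: BASIS[u][x]
BASIS = [[_scale(u) * math.cos((2 * x + 1) * u * math.pi / (2 * N))
          for x in range(N)] for u in range(N)]

def dct2(block):
    """2-D DCT of an 8x8 block (transform along one axis, then the other)."""
    tmp = [[sum(BASIS[u][x] * block[x][y] for x in range(N))
            for y in range(N)] for u in range(N)]
    return [[sum(tmp[u][y] * BASIS[v][y] for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse 2-D DCT of an 8x8 coefficient block."""
    tmp = [[sum(BASIS[u][x] * coeffs[u][v] for u in range(N))
            for v in range(N)] for x in range(N)]
    return [[sum(tmp[x][v] * BASIS[v][y] for v in range(N))
             for y in range(N)] for x in range(N)]

def embed_block(block, bits):
    """Force the parity of selected quantized coefficients to the data bits."""
    coeffs = dct2(block)
    for (u, v), bit in zip(POSITIONS, bits):
        q = round(coeffs[u][v] / Q)
        if (q & 1) != bit:
            q += 1                      # move to the neighbouring level
        coeffs[u][v] = q * Q
    pixels = idct2(coeffs)
    return [[min(255, max(0, round(p))) for p in row] for row in pixels]

def extract_block(block, n_bits):
    """Recompute the DCT and read the parity of the same coefficients."""
    coeffs = dct2(block)
    return [round(coeffs[u][v] / Q) & 1 for (u, v) in POSITIONS[:n_bits]]

block = [[100 + 3 * x + 2 * y for y in range(N)] for x in range(N)]
stego = embed_block(block, [1, 0, 1, 1])
print(extract_block(stego, 4))
```

Because the transform is orthonormal, rounding the pixels back to integers shifts each coefficient by at most 4, which is less than Q/2 = 8, so the parities survive the round trip as long as no pixel is clipped at 0 or 255. A smaller Q would distort the image less but could lose bits to rounding.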


EVALUATION OF DCT TECHNIQUE:-

This technique is very flexible, as the amount of data embedded in the image is not fixed as it is in some other techniques. The amount of data here does not depend on the number of pixels but on the value of the quality coefficient.

Extracting the hidden data without authorization is very difficult with this technique, as one must know the threshold value and quality coefficients that were used at the time of embedding. The technique does not embed any data into certain (top and extreme left) blocks of DCT coefficients.

USE CASE DIAGRAM

ENCODE: The user selects a bitmap image, clicks Encode, enters a password, and enters the secret text.

DECODE: The user clicks Decode, selects the stego image, and enters the password.

Future scope of the project


Our tool offers two-level security, since it uses both data hiding and encryption. The tool is compatible with the .bmp and .gif image formats.

The quality of the cover image does not degrade with data hidden in it as long as the ratio of the size of the data to the size of the cover image is about 1:8. Future work on this project can improve this ratio. Another possible improvement is to make the tool compatible with the .dat and .wav file formats and with motion picture formats, which would enhance the functionality of the tool.
