SECURITY THROUGH OBSCURITY: STEGANOGRAPHY

V. Santhosh Kumar,
Department of Computer Science & Systems Engineering,
Andhra University.
Email: welcome.santhosh@gmail.com

P.V.U. Mahesh,
Department of Computer Science & Engineering,
Gayatri Vidya Parishad.
Email: welcome.mahesh@gmail.com

ABSTRACT
As the information age advances and data becomes increasingly valuable and
sensitive, methods are needed to protect and secure that data. One such
method for transferring data securely over a network is steganography. Many different carrier file
formats can be used, but digital images are the most popular because they are so common on the
Internet. Different applications place different requirements on the steganographic technique
used. For example, some applications may require absolute invisibility of the secret
information, while others require a larger secret message to be hidden. This paper gives an
overview of steganographic methods for hiding data in images and in TCP/IP protocol headers,
both of which are used frequently in open-systems environments such as the Internet.

1. INTRODUCTION
"Steganography is the art and science of communicating in a way which hides
the existence of the communication"[1]. In practice this comes down to using unneeded bits in
an innocent-looking file to store sensitive data. The techniques aim to make it very hard to detect
that anything is hidden inside the file, while the intended recipient can still obtain the hidden
data. In this way, one can hide not only the message itself but also the fact that a message is
being sent[2].
1.1 GOAL OF STEGANOGRAPHY : "In contrast to Cryptography, where the enemy is
allowed to detect, intercept and modify messages without being able to violate certain security
premises guaranteed by a cryptosystem, the goal of Steganography is to hide messages inside
other harmless messages in a way that does not allow any enemy to even detect that there is a
second message present"[3].

2. A DETAILED LOOK AT STEGANOGRAPHY

This section discusses Steganography at length and deals with the different
types of Steganography generally used in practice today along with some of the other
principles that are used in Steganography and some of the Steganographic techniques in use
today. This is where one can look at the nuts and bolts of Steganography and all the different
ways one can use this technology.

Let’s look at a theoretically perfect secret communication. To illustrate
this concept, consider three fictitious characters named Amy, Bret and Crystal. Amy wants to
send a secret message (M) to Bret using a random (R) harmless message to create a cover (C)
which can be sent to Bret without raising suspicion. Amy then changes the cover message (C)
to a stego-object (S) by embedding the secret message (M) into the cover message (C) by
using a stego-key (K). Amy should then be able to send the stego-object (S) to Bret without
being detected by Crystal. Bret will then be able to read the secret message (M) because he
knows the stego-key (K) used to embed it into the cover message (C)[3].
Fig 2.1 A simple Steganographic System[8]

steganography_medium = secret_message + cover_message + key[11]

As Fabien A.P. Petitcolas[10] points out, "in a 'perfect' system, a normal cover should not be
distinguishable from a stego-object, neither by a human nor by a computer looking for
statistical patterns." In practice, however, this is not always the case. In order to embed secret
data into a cover message, the cover must contain a sufficient amount of redundant data or
noise. This is because the embedding process Steganography uses actually replaces this
redundant data with the secret message. This limits the types of data that one can use with
Steganography.

In practice there are three types of steganography protocols used. They are
Pure Steganography, Secret Key Steganography and Public Key Steganography.

2.1 Pure Steganography : It is defined as a steganographic system that does not require the
prior exchange of secret information such as a stego-key[2]. This method of Steganography is the
least secure means by which to communicate secretly because the sender and receiver can rely only
upon the presumption that no other parties are aware of this secret message. On open systems
such as the Internet, that presumption does not hold[14].

2.2 Secret Key Steganography : It is defined as a steganographic system that requires the
exchange of a secret key (stego-key) prior to communication. Secret Key Steganography takes
a cover message and embeds the secret message inside of it by using a secret key (stego-key).
Only the parties who know the secret key can reverse the process and read the secret message.
Unlike Pure Steganography, which relies only on the communication channel going unnoticed,
Secret Key Steganography requires a key exchange, and that exchange can itself be
intercepted. The benefit of Secret Key Steganography is that even if the stego-object is
intercepted, only parties who know the secret key can extract the secret message[14].
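
One way a stego-key can be used, sketched here under stated assumptions, is to seed a pseudorandom generator that decides which cover bytes carry message bits. The function names, the use of SHA-256 to turn the key into a seed, and the choice of LSB replacement are illustrative and are not taken from the cited sources; Python's `random` module is not cryptographically secure, so this is a demonstration only.

```python
import hashlib
import random


def key_positions(stego_key, cover_len, n_bits):
    """Derive n_bits distinct cover positions from the stego-key (hypothetical helper)."""
    seed = int.from_bytes(hashlib.sha256(stego_key.encode()).digest(), "big")
    rng = random.Random(seed)  # illustration only, not a secure generator
    return rng.sample(range(cover_len), n_bits)


def embed(cover, message, stego_key):
    """Write each message bit into the LSB of a key-selected cover byte."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    stego = bytearray(cover)
    for pos, bit in zip(key_positions(stego_key, len(cover), len(bits)), bits):
        stego[pos] = (stego[pos] & 0xFE) | bit
    return bytes(stego)


def extract(stego, n_bytes, stego_key):
    """Re-derive the same positions from the key and read the LSBs back."""
    positions = key_positions(stego_key, len(stego), n_bytes * 8)
    bits = [stego[pos] & 1 for pos in positions]
    return bytes(
        sum(bit << (7 - i) for i, bit in enumerate(bits[k:k + 8]))
        for k in range(0, len(bits), 8)
    )
```

A receiver who knows the key recovers the message with `extract(stego, len(message), key)`; without the key, the positions of the message bits are not known.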

2.3 Public Key Steganography : Public Key Steganography is defined as a steganographic
system that uses a public key and a private key to secure the communication between the
parties wanting to communicate secretly. The sender uses the public key during the
encoding process, and only the private key, which has a direct mathematical relationship with
the public key, can decipher the secret message[10]. Public Key Steganography provides a more
robust way of implementing a steganographic system. It also has multiple levels of security:
unwanted parties must first suspect the use of steganography, and then they would have to
find a way to crack the algorithm used by the public key system before they could read
the secret message[14].

3. ENCODING SECRET MESSAGE IN TEXT

Encoding secret messages in text can be a very challenging task. This is because
text files have a very small amount of redundant data to replace with a secret message. Another
drawback is the ease with which text-based Steganography can be destroyed by an unwanted party,
simply by changing the text itself or by converting the text to some other format (from .TXT to
.PDF, etc.). There are numerous methods by which to accomplish text-based Steganography[9].

3.1 Line-shift encoding: It involves shifting each line of text vertically up or down
by a very small amount, as little as 1/300 of an inch. Whether a line is shifted up or down
relative to a stationary reference line determines the value that it encodes in the secret
message[7].

3.2 Word-shift encoding: It works in much the same way that line-shift encoding works, only
one can use the horizontal spaces between words to equate a value for the hidden message.
This method of encoding is less visible than line-shift encoding but requires that the text
format support variable spacing[7].

3.3 Feature specific encoding: It involves encoding secret messages into formatted text by
changing certain text attributes such as the vertical/horizontal length of letters such as b, d, etc[2].
This is by far the hardest text encoding method to intercept, as each type of formatted text has
a large number of features that can be used for encoding the secret message.

All three of these text-based encoding methods require either the original file or
knowledge of the original file's formatting to be able to decode the secret messages.
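
As a plain-text analogue of word-shift encoding, the sketch below encodes one bit per gap between words: a single space stands for 0 and a double space for 1. This is a simplification introduced for illustration (real word-shift encoding moves words by fractions of a character width in formatted text), and the function names are hypothetical. The decoder relies on the "knowledge of the original formatting" mentioned above, namely that the cover text used single spaces only.

```python
import re


def embed_in_spaces(cover_text, bits):
    """Encode a bit string such as "1010" in the gaps between words."""
    words = cover_text.split()
    if len(bits) > len(words) - 1:
        raise ValueError("cover text has too few word gaps")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        # Double space encodes 1; single space encodes 0 (or no data).
        gap = "  " if i < len(bits) and bits[i] == "1" else " "
        out.append(gap + word)
    return "".join(out)


def extract_from_spaces(stego_text, n_bits):
    """Read the first n_bits gaps back as a bit string."""
    gaps = re.findall(r" +", stego_text)
    return "".join("1" if len(g) == 2 else "0" for g in gaps[:n_bits])
```

Reformatting the text (for example, collapsing repeated spaces) destroys the message, which is the fragility of text-based methods noted at the start of this section.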

4. ENCODING SECRET MESSAGES IN IMAGE

Encoding secret messages in digital images is widely used in the digital world of
today, because it can take advantage of the limited power of the human visual system
(HVS). Almost any plain text, cipher text, image or other media that can be encoded into
a bit stream can be hidden in a digital image. With the continued growth of graphics
power in computers and the research being put into image-based Steganography, this field is
likely to keep growing rapidly.
To the computer, an image is an array of numbers that represent light intensities at
various points (pixels)[12]. These pixels make up the image's raster data. When dealing with
digital images for use with Steganography, 8-bit and 24-bit per pixel image files are typical.
Both have advantages and disadvantages[2]:
● 8-bit images are attractive because of their relatively small size. The drawback
is that only 256 colors are available, which can be a problem during
encoding. A gray-scale color palette is usually used with 8-bit images such as
GIF because its gradual change in color makes modifications harder to detect after the image
has been encoded with the secret message[2].
● 24-bit images offer much more flexibility when used for Steganography. The large
number of colors (over 16 million) goes well beyond what the human visual
system (HVS) can distinguish, which makes an encoded secret message very hard to detect.
The other benefit is that a much larger amount of hidden data can be encoded into a 24-bit
digital image than into an 8-bit digital image[2]: with LSB encoding, 3 bits can be placed in
each pixel of a 24-bit image, whereas only 1 bit can be placed in each pixel of an 8-bit
image[8]. The one major drawback of 24-bit digital images is that their large size (usually in
MB) makes them more suspect than the much smaller 8-bit digital images (usually in KB) when
sent over an open system such as the Internet.

Digital image compression is a good solution for large digital images such as 24-bit images.
There are two types of compression techniques:
● Lossless compression is preferred when there is a requirement that the original
information remain intact (as with steganographic images). The original message can be
reconstructed exactly. This type of compression is typical in GIF and BMP images[12].
● Lossy compression, while also saving space, may not maintain the integrity of the
original image. This method is typical in JPG images and yields very good compression[12].

The popular digital image encoding techniques used today are least significant bit (LSB)
encoding, masking and filtering, transformations, spread spectrum steganography,
statistical steganography, distortion, and cover generation steganography. The following
are some of these techniques.

4.1 Least significant bit (LSB) encoding : It is by far the most popular of the coding
techniques used for digital images. By using the LSB of each byte (8 bits) in an image for a
secret message, one can store 3 bits of data in each pixel for 24-bit images and 1 bit in each
pixel for 8-bit images.
Logic: A 24-bit bitmap has 8 bits representing each of the three color values (red, green,
and blue) at each pixel[2]. If we consider just the blue, there are 2^8 (256) different values of blue.
The difference between, say, 11111111 and 11111110 in the value for blue intensity is likely to
be undetectable by the human eye. If we do the same with the green and the red as well, we get
9 hidden bits in every three pixels, enough for one 8-bit letter of ASCII text[12].
Therefore, the least significant bit can be used (more or less undetectably) for something else
other than color information. As you can see, much more information can be stored in a 24-bit
image file.
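
The LSB logic above can be sketched as follows. The sketch assumes the cover is available as a flat sequence of raw colour bytes (R, G, B, R, G, B, ...) rather than as an image file, and the function names are illustrative; reading and writing an actual BMP or GIF is left out.

```python
def lsb_embed(pixels, message):
    """Hide the message bits in the least significant bit of each colour byte."""
    bits = [(byte >> (7 - i)) & 1 for byte in message for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover too small for this message")
    stego = bytearray(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0b11111110) | bit  # clear the LSB, then set it
    return bytes(stego)


def lsb_extract(pixels, n_bytes):
    """Rebuild n_bytes of message from the LSBs of the first 8 * n_bytes colour bytes."""
    out = bytearray()
    for k in range(n_bytes):
        value = 0
        for b in pixels[8 * k: 8 * k + 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)
```

Embedding the letter 'H' (binary 01001000) into eight colour bytes of value 255 changes them to 254, 255, 254, 254, 255, 254, 254, 254, so no colour value moves by more than one step.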
The main disadvantage of LSB alteration is that it requires a fairly large
cover image to create a usable amount of hiding space. Even nowadays, uncompressed
images of 800 x 600 pixels are not often used on the Internet, so using these might raise
suspicion[5]. Another disadvantage arises when an image concealing a secret is compressed
using a lossy compression algorithm. The hidden message will not survive this operation and
is lost after the transformation[1].
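
To put the cover-size requirement in numbers, the arithmetic can be sketched as below. It assumes one hidden bit per colour byte and ignores file headers and palettes; the function name is illustrative.

```python
def lsb_capacity_bytes(width, height, bits_per_pixel=24):
    """Number of message bytes that fit when one bit is hidden per colour byte."""
    colour_bytes_per_pixel = bits_per_pixel // 8  # 3 for 24-bit, 1 for 8-bit
    hidden_bits = width * height * colour_bytes_per_pixel
    return hidden_bits // 8
```

For an 800 x 600 cover this gives 180,000 bytes of hiding space in a 24-bit image (whose raw pixel data occupies 1,440,000 bytes) and 60,000 bytes in an 8-bit image, i.e. the hidden payload is at most one eighth of the raw pixel data.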

4.2 Masking and filtering : These techniques, usually restricted to 24-bit or grayscale
images, take a different approach to hiding a message. They are effectively similar
to paper watermarks, creating markings in an image[2]. This can be achieved, for example, by
modifying the luminance of parts of the image. While masking does change the visible
properties of an image, it can be done in such a way that the human eye will not notice the
anomalies. Since masking uses visible aspects of the image, it is more robust than LSB
modification with respect to compression, cropping and different kinds of image
processing[14]. The information is not hidden at the "noise" level but is inside the visible part
of the image, which makes it more suitable than LSB modification when a lossy
compression algorithm such as JPEG is used[13].

4.3 Transformations : A more complex way of hiding a secret inside an image comes with
the use and modification of discrete cosine transformations. Discrete cosine transformations
(DCT) are used by the JPEG compression algorithm to transform successive 8 x 8 pixel
blocks of the image into 64 DCT coefficients each[13]. The JSteg algorithm (D. Upham)
applies this idea to the JPEG image format. According to the JSteg algorithm,
Replace sequentially the least-significant bit of discrete cosine transform coefficients
with the message data[7].
Logic: The secret data is inserted into the cover image in the DCT domain. The
signature (secret message) DCT coefficients are encoded using a lattice coding scheme before
embedding. Each block of cover DCT coefficients is first checked for its texture content, and
the signature codes are inserted depending on a local texture measure.
Experimental results indicate that high quality embedding is possible, with no visible
distortions. Signature images can be recovered even when the embedded data is subject to
significant lossy JPEG compression.
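Each DCT coefficient F(u, v) of an 8 x 8 block of image pixels f(x, y) is given by[5]:

\[ F(u, v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y) \cos\frac{(2x+1)\,u\pi}{16} \cos\frac{(2y+1)\,v\pi}{16} \]

where C(x) = 1/√2 when x equals 0 and C(x) = 1 otherwise. After calculating the coefficients,
the following quantizing operation is performed[5]:

\[ F^{Q}(u, v) = \mathrm{round}\!\left( \frac{F(u, v)}{Q(u, v)} \right) \]

where Q(u, v) is a 64-element quantization table. A simple pseudo-code algorithm to hide a
message inside a JPEG image could look like this[1]: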

Input: message, cover image
Output: steganographic image containing message
while data left to embed do
    get next DCT coefficient from cover image
    if DCT ≠ 0 and DCT ≠ 1 then
        get next LSB from message
        replace DCT LSB with message bit
    end if
    insert DCT into steganographic image
end while
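
The pseudo-code above can be turned into a small runnable sketch. It assumes the quantized DCT coefficients are already available as a list of Python integers (JPEG decoding and re-encoding are out of scope), and the function names are illustrative rather than those of any existing tool.

```python
def jsteg_embed(coeffs, bits):
    """Replace the LSB of each usable coefficient (not 0 or 1) with a message bit."""
    out = []
    bit_iter = iter(bits)
    pending = next(bit_iter, None)
    for c in coeffs:
        if pending is not None and c not in (0, 1):
            c = (c & ~1) | pending  # works for negative ints in Python as well
            pending = next(bit_iter, None)
        out.append(c)
    if pending is not None:
        raise ValueError("not enough usable coefficients for this message")
    return out


def jsteg_extract(coeffs, n_bits):
    """Read the LSBs of the usable coefficients, in order."""
    return [c & 1 for c in coeffs if c not in (0, 1)][:n_bits]
```

Skipping 0 and 1 keeps sender and receiver in step: replacing the LSB only moves a coefficient within its pair (2 and 3, -2 and -1, and so on), so a usable coefficient never turns into 0 or 1 and the extractor skips exactly the same positions.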

Although a modification of a single DCT coefficient will affect all 64 image pixels, the LSB of
the quantized DCT coefficient can be used to hide information. Losslessly compressed images are
susceptible to visual alterations when the LSBs are modified. This is not the case with the
method described above, because the embedding takes place in the frequency domain rather than
the spatial domain, so the changes to the cover image are not visible[5].
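
For readers who want to experiment with the frequency domain, the following is a direct and deliberately slow sketch of the forward transform and the quantization step. It assumes the standard JPEG form of the 8 x 8 DCT, with a scale factor of 1/4 and C(x) as defined in the text; the function names and the flat quantization table used in the note below are illustrative.

```python
import math


def dct_8x8(block):
    """Forward DCT of an 8 x 8 block given as a list of 8 rows of 8 numbers."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * c(u) * c(v) * total
    return coeffs


def quantize(coeffs, q_table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [[round(coeffs[u][v] / q_table[u][v]) for v in range(8)]
            for u in range(8)]
```

A uniform block of value 100 yields F(0, 0) = 800 and (up to rounding error) zero everywhere else; with a flat table of 16 the quantized block has a single non-zero entry, 50. Every coefficient is a sum over all 64 pixels, which is why changing one coefficient spreads a small change across the whole block.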

In addition to DCT, images can be processed with fast Fourier transform (FFT). FFT is "an
algorithm for computing the Fourier transform of a set of discrete data values". The FFT
expresses a finite set of data points in terms of its component frequencies. It also solves the
identical inverse problem of reconstructing a signal from the frequency data[8].

Thus the simple logic for encoding and decoding using transforms is:

Hiding the data: Take the DCT or wavelet transform of the cover image and
find the coefficients below a specific threshold. Replace bits of these coefficients with the
bits to be hidden (for example, using LSB insertion), then take the inverse transform and store
the result as a regular image.

Recovering the data: To extract the hidden data, take the transform of the modified image
and find the coefficients below a specific threshold. Extract bits of data from these
coefficients and combine the bits into an actual message.

4.4 Patchwork
Patchwork is a statistical technique that uses redundant pattern encoding to
embed a message in an image[14]. The algorithm adds redundancy to the hidden information
and then scatters it throughout the image.
Logic: A pseudorandom generator is used to select two areas of the image (or patches), patch
A and patch B. All the pixels in patch A are lightened while the pixels in patch B are darkened.
In other words, the intensities of the pixels in one patch are increased by a constant value,
while the pixels of the other patch are decreased by the same constant value. The contrast
change in this patch subset encodes one bit; the changes are typically small and
imperceptible, and they do not change the average luminosity[7].
A disadvantage of the patchwork approach is that only one bit is embedded. One can embed
more bits by first dividing the image into sub-images and applying the embedding to each of
them[10].
The advantage of using this technique is that the secret message is distributed over the entire
image, so should one patch be destroyed, the others may still survive. This, however, depends
on the message size, since the message can only be repeated throughout the image if it is
small enough. If the message is too big, it can only be embedded once[13].
The patchwork approach is independent of the host image and proves to be quite robust,
as the hidden message can survive conversion between lossy and lossless compression[10].
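
The patchwork logic can be sketched on a flat list of grayscale intensities. The key-seeded selection of pixel pairs, the default constant `delta = 2` and the function names are assumptions made for illustration; the detector simply compares the summed brightness of patch A with that of patch B.

```python
import random


def patchwork_pairs(key, n_pixels, n_pairs):
    """Pick two disjoint pseudorandom sets of pixel indices (patch A, patch B)."""
    rng = random.Random(key)
    idx = rng.sample(range(n_pixels), 2 * n_pairs)
    return idx[:n_pairs], idx[n_pairs:]


def patchwork_embed(pixels, key, n_pairs, delta=2):
    """Lighten patch A and darken patch B by the same constant."""
    patch_a, patch_b = patchwork_pairs(key, len(pixels), n_pairs)
    out = list(pixels)
    for i in patch_a:
        out[i] = min(255, out[i] + delta)
    for j in patch_b:
        out[j] = max(0, out[j] - delta)
    return out


def patchwork_statistic(pixels, key, n_pairs):
    """Sum over patch A minus sum over patch B, using the same key."""
    patch_a, patch_b = patchwork_pairs(key, len(pixels), n_pairs)
    return sum(pixels[i] for i in patch_a) - sum(pixels[j] for j in patch_b)
```

For an unmarked image the statistic is expected to be close to zero, while embedding raises it by 2 x delta x n_pairs (when no pixel is clipped at 0 or 255). A detector that knows the key can therefore decide whether the bit is present by comparing the statistic with a threshold.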

Other methods exist that are not discussed in this paper because they are of
less utility than those described above.

5. ENCODING INFORMATION IN A TCP/IP HEADER

The TCP/IP header contains a number of areas where information can be stored
and sent to a remote host in a covert manner. Consider the following diagrams, which are textual
representations of the IP and TCP headers respectively:

IP Header (Numbers represent bits of data from 0 to 32 and the relative position of the fields
in the datagram)

Fig 5.1 Basic IP Header Structure[4]

TCP Header (Numbers represent bits of data from 0 to 32 and the relative position of the
fields in the datagram)

Fig 5.2 Basic TCP header structure[4]


Logic: Within each header there are a multitude of areas that are not used for normal
transmission or are "optional" fields to be set as needed by the sender of the datagrams. An
analysis of the areas of a typical IP header that are either unused or optional reveals many
possibilities where data can be stored and transmitted[6].
For general purposes, this paper focuses on encapsulation of data in the more mandatory
fields, because these fields are less likely to be altered in transit than, say, the IP or TCP
options fields, which are sometimes changed or stripped off by packet filtering mechanisms or
through fragment re-assembly. They are
- The IP packet identification field.
- The TCP initial sequence number field.
- The TCP acknowledged sequence number field.

Data can be placed into these fields. Although the ASCII code of a character can simply be
placed in a field as it is, the result will not look innocent, so special techniques are used to
encode and decode the data for greater safety[4].

5.1 Manipulating IP packet identification field: The identification field of the IP protocol
helps with re-assembly of packet data by remote routers and host systems. Its purpose is to
give a unique value to packets so that, if fragmentation occurs along a route, they can be
accurately re-assembled[4]. In the following example, the lines below show a tcpdump
representation of a packet on a network between two hosts, "nemesis.psionic.com" and
"blast.psionic.com". This is one of the packets received during transmission, and it carries the
character ‘H’ in its IP packet identification field[6].

18:50:13.551117 nemesis.psionic.com.7180 > blast.psionic.com.www: S
537657344:537657344(0) win 512 (ttl 64, id 18432)

Here the id field in parentheses shows the data in the IP packet identification field. The
method works by having the client host construct a packet with the appropriate destination
host and source host information and an encoded IP ID field. This packet is sent to the remote
host, which is listening on a passive socket and decodes the data.

Decoding:...(ttl 64, id 18432/256) [ASCII: 72(H)]

Note that the ID field is represented by an unsigned integer during the packet generation
process of the included program. This program does not perform any type of byte ordering
functions normally used in this process, therefore packet data is converted to the ASCII
equivalent by dividing by 256.
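
The arithmetic of this encoding can be reproduced in a few lines. The sketch below only computes field values and does not build or send packets; the function names are illustrative and are not those of the program referred to above.

```python
import struct


def encode_ip_id(ch):
    """Place the character's ASCII code in the high byte of the 16-bit ID field."""
    return ord(ch) * 256


def decode_ip_id(ip_id):
    """Recover the character by dividing the ID value by 256."""
    return chr(ip_id // 256)


def id_field_bytes(ip_id):
    """The two bytes of the ID field as they appear on the wire (network byte order)."""
    return struct.pack("!H", ip_id)
```

For 'H' (ASCII 72) this gives 72 x 256 = 18432, the id value shown in the tcpdump line, and the two bytes on the wire are 0x48 0x00, so the character sits in the first byte of the field.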

5.2 Manipulating Initial Sequence Number field (ISN): The Initial Sequence Number field
(ISN) of the TCP/IP protocol suite enables a client to establish a reliable protocol negotiation
with a remote server[4]. The method is similar to the one above, but this field is 32 bits wide,
which makes it well suited to transmitting clandestine data. Consider the following line[6]:

18:50:29.071117 nemesis.psionic.com.45321 > blast.psionic.com.www: S
1207959552:1207959552(0) win 512 (ttl 64, id 49408)

Here 1207959552 is ISN. Dividing it by 65536*256 gives 72(i.e, ‘H’)

Decoding:... S 1207959552/16777216 [ASCII: 72(H)]
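
The same arithmetic for the sequence number field, one character per SYN packet, can be sketched as follows. Only the numbers are computed here; constructing raw TCP segments is outside the scope of the sketch, and the function names are illustrative.

```python
ISN_SCALE = 65536 * 256  # 16777216: shifts the ASCII code into the top byte of the 32-bit field


def encode_isn(ch):
    """Sequence number that carries one character."""
    return ord(ch) * ISN_SCALE


def decode_isn(isn):
    """Recover the character from a received sequence number."""
    return chr(isn // ISN_SCALE)


def message_to_isns(message):
    """One initial sequence number per character, i.e. one SYN packet each."""
    return [encode_isn(ch) for ch in message]
```

`encode_isn("H")` returns 1207959552, the sequence number in the tcpdump line above, and `decode_isn(1207959552)` returns 'H'.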

Because of the sheer amount of information one can represent in a 32-bit address space
(4,294,967,296 numbers), the sequence number makes an ideal location for storing data. Aside
from the obvious example given above, a number of other techniques are used to store
information either byte by byte or as bits of information represented through careful
manipulation of the sequence number. The simple algorithm of the covert_tcp program takes
the ASCII value of our data and converts it to a usable sequence number (which is actually
done by the packet generation functions and is converted back to ASCII in a symmetrical
manner).
Other methods for hiding data in the TCP/IP header also exist and vary with the
type of application and its requirements[4].
Data can also be hidden in audio and video files, which are widely used in open-systems
environments but are not discussed in this paper.

6. APPLICATIONS
The three most popular and researched uses for steganography in an open
systems environment are covert channels, embedded data and digital watermarking.
● Covert channels in TCP/IP involve masking identification information in the TCP/IP
headers to hide the true identity of one or more systems. This can be very useful for any
secure communications needs over open systems such as the Internet when absolute secrecy is
needed for an entire communication process and not just one document as mentioned next.
● Embedding data using containers (cover messages) is by far the most popular use
of Steganography today. This method of Steganography is very useful when a party must send
a top secret, private or highly sensitive document over an open systems environment such as
the Internet. Embedding the hidden data into the cover message before sending it provides a
degree of security, because no one other than the intended recipients knows that more than a
harmless message has been sent.
● Digital watermarking is usually used for copyright reasons by companies or
entities that wish to protect their property by either embedding their trademark into their
property or by concealing serial numbers/license information in software, etc. Digital
watermarking is very important in the detection and prosecution of software pirates/digital
thieves.

7. CONCLUSION
Although only some of the main image steganographic techniques were
discussed in this paper, one can see that there exists a large selection of approaches to hiding
information in images. All the major image file formats have different methods of hiding
messages, each with its own strong and weak points. Where one technique lacks
payload capacity, another lacks robustness. For example, the patchwork approach has a
very high level of robustness against most types of attack, but can hide only a very small
amount of information. Least significant bit (LSB) encoding in both BMP and GIF makes up for this,
but both approaches result in suspicious files that increase the probability of detection when in
the presence of a warden. Thus, to decide which steganographic algorithm to
use, an agent has to consider the type of application the algorithm is intended for and whether
some features can be compromised to ensure the security of others.

8. REFERENCES
1. Provos, N. and Honeyman, P., “Hide and Seek: An Introduction to Steganography”,
URL: http://niels.xtdnet.nl/papers/practical.pdf
2. Johnson, Neil F., “Steganography”, 2000,
URL: http://www.jjtc.com/stegdoc/index2.html
3. Wikipedia, the free encyclopedia, “Steganography”,
URL: http://en.wikipedia.org/wiki/Steganography
4. Murdoch, Steven J. and Lewis, Stephen, “Embedding Covert Channels into TCP/IP”,
URL: http://www.cl.cam.ac.uk/users/{fsjm217, srl32g}/
5. Krenn, R., “Steganography and Steganalysis”,
URL: http://www.krenn.nl/univ/cry/steg/article.pdf
6. Rowland, C.H., “Covert channels in the TCP/IP protocol suite”, First Monday 2 (1997),
URL: http://www.firstmonday.org/issues/issue2_5/rowland/
7. “Steganography Links & Whitepapers (Computer Forensics)”,
URL: http://www.forensics.nl/steganography/
8. URL: http://io.acad.athabascau.ca/~grizzlie/Comp607/
9. The WEPIN Store, “Steganography (Hidden Writing)”, 1995,
URL: http://www.wepin.com/pgp/stego.html
10. Petitcolas, Fabien A.P., “The Information Hiding Homepage: Digital Watermarking
and Steganography”,
URL: http://www.cl.cam.ac.uk/~fapp2/steganography/
11. “Forensic Science Communications”, July 2004,
URL: http://www.fbi.gov/hq/lab/fsc/backissu/july2004/research/
12. “Steganography-Tutorial”,
URL: http://www.jjtc.com/stegdoc/stegdoc.html
13. Johnson, N.F. & Jajodia, S., “Exploring Steganography: Seeing the Unseen”,
Computer Journal, February 1998,
URL: http://www.cs.arizona.edu/~collberg/Teaching/620/1999/Handouts/hual1.ps
14. Bender, W., Gruhl, D., Lu, A., Morimoto, N., “Techniques for Data Hiding”,
IBM Systems Journal,
URL: http://www.research.ibm.com/journal/sj/mit/sectiona/bender.pdf
15. Sellars, D., “An Introduction to Steganography”,
URL: http://www.cs.uct.ac.za/courses/CS400W/NIS/papers99/dsellars/stego.html