
AN IMAGE STEGANOGRAPHY IMPLEMENTATION FOR JPEG-COMPRESSED IMAGES
Chapter 1 Introduction
1.1 Introduction:
Steganography is the art and science of writing hidden messages in such a way that no one, apart from the sender and intended recipient, suspects the existence of the message, a form of security through obscurity. The word steganography is of Greek origin and means "concealed writing". The first recorded use of the term was in 1499 by Johannes Trithemius in his Steganographia, a treatise on cryptography and steganography disguised as a book on magic. Generally, messages will appear to be something else: images, articles, shopping lists, or some other covertext and, classically, the hidden message may be in invisible ink between the visible lines of a private letter. The advantage of steganography over cryptography alone is that messages do not attract attention to themselves. Plainly visible encrypted messages, no matter how unbreakable, will arouse suspicion, and may in themselves be incriminating in countries where encryption is illegal.[1] Therefore, whereas cryptography protects the contents of a message, steganography can be said to protect both messages and communicating parties. Steganography includes the concealment of information within computer files. In digital steganography, electronic communications may include steganographic coding inside of a transport layer, such as a document file, image file, program or protocol. Media files are ideal for steganographic transmission because of their large size. As a simple example, a sender might start with an innocuous image file and adjust the color of every 100th pixel to correspond to a letter in the alphabet, a change so subtle that someone not specifically looking for it is unlikely to notice it.
1.2 Ancient steganography

College Name Page 1
The first recorded uses of steganography can be traced back to 440 BC when Herodotus mentions two examples of steganography in The Histories of Herodotus.[2] Demaratus sent a warning about a forthcoming attack to Greece by writing it directly on the wooden backing of a wax tablet before applying its beeswax surface. Wax tablets were in common use then as reusable writing surfaces, sometimes used for shorthand. Another ancient example is that of Histiaeus, who shaved the head of his most trusted slave and tattooed a message on it. After his hair had grown the message was hidden. The purpose was to instigate a revolt against the Persians.
1.3 Steganographic techniques

Three steganographic techniques are available: physical steganography, digital steganography, and printed steganography.
1.3.1 Physical steganography

Steganography has been widely used, including in recent historical times and the present day. Possible permutations are endless and known examples include:
Steganart example. Within this picture, the letter positions of a hidden message are represented by increasing numbers (1 to 20), and a letter value is given by its intersection position in the grid. For instance, the first letter of the hidden message is at the intersection of 1 and 4. So, after a few tries, the first letter of the message seems to be the 14th letter of the alphabet; the last one (number 20) is the 5th letter of the alphabet.
Hidden messages within wax tablets: in ancient Greece, people wrote messages on the wood, then covered it with wax upon which an innocent covering message was written.
Hidden messages on a messenger's body: also in ancient Greece. Herodotus tells the story of a message tattooed on a slave's shaved head, hidden by the growth of his hair, and exposed by shaving his head again. The message allegedly carried a warning to Greece about Persian invasion plans. This method has obvious drawbacks, such as delayed transmission while waiting for the slave's hair to grow, and its one-off use, since additional messages require additional slaves. In WWII, the French Resistance sent some messages written on the backs of couriers using invisible ink.
Hidden messages on paper written in secret inks, under other messages or on the blank parts of other messages.
Messages written in Morse code on knitting yarn and then knitted into a piece of clothing worn by a courier.
Messages written on the back of postage stamps.
During and after World War II, espionage agents used photographically produced microdots to send information back and forth. Microdots were typically minute, about or less than the size of the period produced by a typewriter. WWII microdots needed to be embedded in the paper and covered with an adhesive (such as collodion). This was reflective and thus detectable by viewing against glancing light. Alternative techniques included inserting microdots into slits cut into the edge of post cards.
During World War II, a spy for the Japanese in New York City, Velvalee Dickinson, sent information to accommodation addresses in neutral South America. She was a dealer in dolls, and her letters discussed how many of this or that doll to ship. The stegotext was the doll orders, the concealed 'plaintext' was itself encoded and gave information about ship movements, etc. Her case became somewhat famous and she became known as the Doll Woman.
Cold War counter-propaganda. During 1968, crew members of the USS Pueblo (AGER-2) intelligence ship, held as prisoners by North Korea, communicated in sign language during staged photo opportunities, informing the United States that they were not defectors but were being held captive by the North Koreans. In other photos presented to the US, crew members gave "the finger" to the unsuspecting North Koreans, in an attempt to discredit photos that showed them smiling and comfortable.[3]
1.3.2 Digital steganography
Modern steganography entered the world in 1985 with the advent of the personal computer applied to classical steganography problems. [4] Development following that was slow, but has since taken off, going by the number of 'stego' programs available: Over 725 digital steganography applications have been identified by the Steganography Analysis and Research Center. [5] Digital steganography techniques include:
Image of a tree. By removing all but the last 2 bits of each color component, an almost completely black image results. Making the resulting image 85 times brighter results in the image below.
Image of a cat extracted from above image.
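The extraction described in the two captions above can be sketched in a few lines. This is a minimal illustration, assuming pixels are plain (R, G, B) tuples of 8-bit values; the function name `reveal_low_bits` and the sample pixel values are hypothetical and are not taken from the images in the figures.

```python
def reveal_low_bits(pixels, bits=2):
    """Keep only the lowest `bits` bits of each color component, then brighten.

    With bits=2 the mask is 0b11 and the scale factor is 255 // 3 = 85,
    which matches the "85 times brighter" step in the caption.
    """
    mask = (1 << bits) - 1
    scale = 255 // mask
    return [tuple((component & mask) * scale for component in pixel)
            for pixel in pixels]


# Two made-up pixels standing in for the cover image.
cover = [(200, 117, 54), (13, 255, 128)]
print(reveal_low_bits(cover))  # [(0, 85, 170), (85, 255, 0)]
```

Each component can only take the values 0, 85, 170, or 255 after this step, which is why the recovered picture looks coarse but recognisable.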

Concealing messages within the lowest bits of noisy images or sound files. Concealing data within encrypted data. The data to be concealed is first encrypted before being used to overwrite part of a much larger block of encrypted data. Chaffing and winnowing. Mimic functions convert one file to have the statistical profile of another. This can thwart statistical methods that help brute-force attacks identify the right solution in a ciphertext-only attack.
Concealed messages in tampered executable files, exploiting redundancy in the i386 instruction set. Pictures embedded in video material (optionally played at slower or faster speed). Injecting imperceptible delays to packets sent over the network from the keyboard. Delays in keypresses in some applications (telnet or remote desktop software) can mean a delay in packets, and the delays in the packets can be used to encode data.
Content-Aware Steganography hides information in the semantics a human user assigns to a datagram. These systems offer security against a non-human adversary/warden. Blog-Steganography. Messages are fractionalized and the (encrypted) pieces are added as comments of orphaned web-logs (or pin boards on social network platforms). In this case the selection of blogs is the symmetric key that sender and recipient are using; the carrier of the hidden message is the whole blogosphere.
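The first technique in the list above, concealing a message in the lowest bits of noisy media, can be sketched as follows. This is a minimal illustration under stated assumptions: the cover is a sequence of raw, uncompressed 8-bit samples (for example pixel components), and the helper names `embed` and `extract` are hypothetical. It is not the JPEG-domain method this thesis implements, since lossy compression would destroy bits written this way.

```python
def embed(cover: bytes, message: bytes) -> bytes:
    """Write the message bits, most significant bit first, into the
    least significant bit of successive cover samples."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this message")
    stego = bytearray(cover)
    for index, bit in enumerate(bits):
        stego[index] = (stego[index] & 0xFE) | bit  # change at most the last bit
    return bytes(stego)


def extract(stego: bytes, length: int) -> bytes:
    """Read back `length` bytes from the least significant bits."""
    message = bytearray()
    for n in range(length):
        value = 0
        for sample in stego[n * 8:(n + 1) * 8]:
            value = (value << 1) | (sample & 1)
        message.append(value)
    return bytes(message)


cover = bytes(range(256))      # stand-in for real image samples
stego = embed(cover, b"hi")
print(extract(stego, 2))       # b'hi'
```

Every sample changes by at most one intensity level, which is why the alteration is hard to see in noisy images. The receiver is assumed to know the message length in advance; a practical scheme would have to carry it in the stream.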
1.3.3 Printed steganography

Digital steganography output may be in the form of printed documents. A message, the plaintext, may be first encrypted by traditional means, producing a ciphertext. Then, an innocuous covertext is modified in some way so as to contain the ciphertext, resulting in the stegotext. For example, the letter size, spacing, typeface, or other characteristics of a covertext can be manipulated to carry the hidden message. Only a recipient who knows the technique used can recover the message and then decrypt it. Francis Bacon developed Bacon's cipher as such a technique.
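As a rough sketch of how a Bacon-style cipher carries a message in the appearance of a covertext, the example below encodes each letter as five A/B symbols and then expresses those symbols through letter case. Two assumptions are made for illustration: the 26-letter binary variant of the alphabet is used rather than Bacon's original 24-letter table, and upper/lower case stands in for the two typefaces of the printed original. The function names are hypothetical.

```python
import string


def bacon_encode(message: str) -> str:
    """Map each letter to five A/B symbols (A=0, B=1), 26-letter variant."""
    groups = []
    for ch in message.upper():
        if ch in string.ascii_uppercase:
            code = format(ord(ch) - ord("A"), "05b")
            groups.append(code.replace("0", "A").replace("1", "B"))
    return "".join(groups)


def hide(cover: str, pattern: str) -> str:
    """Carry the pattern in the cover: 'A' -> lower case, 'B' -> upper case."""
    out, used = [], 0
    for ch in cover:
        if ch.isalpha() and used < len(pattern):
            out.append(ch.upper() if pattern[used] == "B" else ch.lower())
            used += 1
        else:
            out.append(ch)
    if used < len(pattern):
        raise ValueError("cover text has too few letters")
    return "".join(out)


def reveal(stego: str, n_letters: int) -> str:
    """Recover n_letters hidden letters from the case of the cover letters."""
    symbols = ["1" if ch.isupper() else "0" for ch in stego if ch.isalpha()]
    symbols = symbols[:5 * n_letters]
    return "".join(chr(int("".join(symbols[i:i + 5]), 2) + ord("A"))
                   for i in range(0, len(symbols), 5))


stego = hide("the quick brown fox", bacon_encode("HI"))
print(stego)             # thE QUiCk brown fox
print(reveal(stego, 2))  # HI
```

As the paragraph above notes, only a recipient who knows the convention (here, five letters per hidden letter and case as the carrier) can read the message back.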
1.4 Organisation of thesis:

Chapter 2 will define steganography, provide a brief history, and explain various methods of steganography. Chapter 3 will review several software applications that provide steganographic services and mention the approaches taken. Chapter 4 will conclude with a brief discussion of the implications of steganographic technology. Chapter 5 will list the resources used in researching this topic and additional readings for those interested in more in-depth understanding of steganography.
Chapter 2 Literature Survey

This chapter gives a brief introduction to cryptography and steganography, provides a brief history, and explains various methods of steganography.
2.1 Cryptography:
Does increased security provide comfort to paranoid people? Or does security provide some very basic protections that we are naive to believe that we don't need? During this time when the Internet provides essential communication between tens of millions of people and is being increasingly used as a tool for commerce, security becomes a tremendously important issue to deal with. There are many aspects to security and many applications, ranging from secure commerce and payments to private communications and protecting passwords. One essential aspect for secure communications is that of cryptography, which is the focus of this chapter. But it is important to note that while cryptography is necessary for secure communications, it is not by itself sufficient. The reader is advised, then, that the topics covered in this chapter only describe the first of many steps necessary for better security in any number of situations.
This paper has two major purposes. The first is to define some of the terms and concepts behind basic cryptographic methods, and to offer a way to compare the myriad cryptographic schemes in use today. The second is to provide some real examples of cryptography in use today. I would like to say at the outset that this paper is very focused on terms, concepts, and schemes in current use and is not a treatise of the whole field. No mention is made here about pre-computerized crypto schemes, the difference between a substitution and transposition cipher, cryptanalysis, or other history. Interested readers should check out some of the books in the bibliography below for this detailed (and interesting!) background information.
2.1.1 THE PURPOSE OF CRYPTOGRAPHY
Cryptography is the science of writing in secret code and is an ancient art; the first documented use of cryptography in writing dates back to circa 1900 B.C. when an Egyptian scribe used non-standard hieroglyphs in an inscription.
Some experts argue that cryptography appeared spontaneously sometime after writing was invented, with applications ranging from diplomatic missives to war-time battle plans. It is no surprise, then, that new forms of cryptography came soon after the widespread development of computer communications. In data and telecommunications, cryptography is necessary when communicating over any untrusted medium, which includes just about any network, particularly the Internet. Within the context of any application-to-application communication, there are some specific security requirements, including:
Authentication: The process of proving one's identity. (The primary forms of host-to-host authentication on the Internet today are name-based or address-based, both of which are notoriously weak.)
Privacy/confidentiality: Ensuring that no one can read the message except the intended receiver.
Integrity: Assuring the receiver that the received message has not been altered in any way from the original.
Non-repudiation: A mechanism to prove that the sender really sent this message.
Cryptography, then, not only protects data from theft or alteration, but can also be used for user authentication. There are, in general, three types of cryptographic schemes typically used to accomplish these goals: secret key (or symmetric) cryptography, public-key (or asymmetric) cryptography, and hash functions, each of which is described below. In all cases, the initial unencrypted data is referred to as plaintext. It is encrypted into ciphertext, which will in turn (usually) be decrypted into usable plaintext. In many of the descriptions below, two communicating parties will be referred to as Alice and Bob; this is the common nomenclature in the crypto field and literature to make it easier to identify the communicating parties. If there is a third or fourth party to the communication, they will be referred to as Carol and Dave. Mallory is a malicious party, Eve is an eavesdropper, and Trent is a trusted third party.
2.1.2 TYPES OF CRYPTOGRAPHIC ALGORITHMS
There are several ways of classifying cryptographic algorithms. For purposes of this paper, they will be categorized based on the number of keys that are employed for encryption and decryption, and further defined by their application and use. The three types of algorithms that will be discussed are (Figure 1):
Secret Key Cryptography (SKC): Uses a single key for both encryption and decryption
Public Key Cryptography (PKC): Uses one key for encryption and another for decryption
Hash Functions: Uses a mathematical transformation to irreversibly "encrypt" information
2.1.2.1 Secret Key Cryptography

With secret key cryptography, a single key is used for both encryption and decryption. As shown in Figure 1A, the sender uses the key (or some set of rules) to encrypt the plaintext and sends the ciphertext to the receiver. The receiver applies the same key (or ruleset) to decrypt the message and recover the plaintext. Because a single key is used for both functions, secret key cryptography is also called symmetric encryption. With this form of cryptography, it is obvious that the key must be known to both the sender and the receiver; that, in fact, is the secret. The biggest difficulty with this approach, of course, is the distribution of the key. Secret key cryptography schemes are generally categorized as being either stream ciphers or block ciphers. Stream ciphers operate on a single bit (byte or computer word) at a time and implement some form of feedback mechanism so that the key is constantly changing. A block cipher is so-called because the scheme encrypts one block of data at a time using the same key on each block. In general, the same plaintext block will always encrypt to the same ciphertext when using the same key in a block cipher whereas the same plaintext will encrypt to different ciphertext in a stream cipher.
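The defining property described above, that one shared key both encrypts and decrypts, can be illustrated with a toy stream-style cipher. This is only a sketch: the keystream construction (SHA-256 over the key and a counter) and the names `keystream` and `xor_cipher` are assumptions chosen for brevity, not a vetted cipher and not one of the algorithms surveyed in this chapter.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: hash the key together with a running counter.
    For illustration only; do not use this to protect real data."""
    stream = bytearray()
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(stream[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, keystream(key, len(data))))


shared_key = b"known to Alice and Bob only"
ciphertext = xor_cipher(shared_key, b"meet at noon")
print(xor_cipher(shared_key, ciphertext))  # b'meet at noon'
```

Because both sides must hold `shared_key`, the sketch also makes the key distribution problem visible: anyone who obtains the key can run the same function and read the message.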
Stream ciphers come in several flavors but two are worth mentioning here. Self-synchronizing stream ciphers calculate each bit in the keystream as a function of the previous n bits in the keystream. It is termed "self-synchronizing" because the decryption process can stay synchronized with the encryption process merely by knowing how far into the n-bit keystream it is. One problem is error propagation; a garbled bit in transmission will result in n garbled bits at the receiving side. Synchronous stream ciphers generate the keystream in a fashion independent of the message stream but by using the same keystream generation function at sender and receiver. While stream ciphers do not propagate transmission errors, they are, by their nature, periodic so that the keystream will eventually repeat. Block ciphers can operate in one of several modes; the following four are the most important:
Electronic Codebook (ECB) mode is the simplest, most obvious application: the secret key is used to encrypt the plaintext block to form a ciphertext block. Two identical plaintext blocks, then, will always generate the same ciphertext block. Although this is the most common mode of block ciphers, it is susceptible to a variety of brute-force attacks.
Cipher Block Chaining (CBC) mode adds a feedback mechanism to the encryption scheme. In CBC, the plaintext is exclusively-ORed (XORed) with the previous ciphertext block prior to encryption. In this mode, two identical blocks of plaintext never encrypt to the same ciphertext.
Cipher Feedback (CFB) mode is a block cipher implementation as a self-synchronizing stream cipher. CFB mode allows data to be encrypted in units smaller than the block size, which might be useful in some applications such as encrypting interactive terminal input. If we were using 1-byte CFB mode, for example, each incoming character is placed into a shift register the same size as the block, encrypted, and the block transmitted. At the receiving side, the ciphertext is decrypted and the extra bits in the block (i.e., everything above and beyond the one byte) are discarded.
Output Feedback (OFB) mode is a block cipher implementation conceptually similar to a synchronous stream cipher. OFB prevents the same plaintext block from generating the same ciphertext block by using an internal feedback mechanism that is independent of both the plaintext and ciphertext bitstreams.
A nice overview of these different modes can be found at progressive-coding.com.
Secret key cryptography algorithms that are in use today include:
Data Encryption Standard (DES): The most common SKC scheme used today, DES was designed by IBM in the 1970s and adopted by the National Bureau of Standards (NBS) [now the National Institute for Standards and Technology (NIST)] in 1977 for commercial and unclassified government applications. DES is a block-cipher employing a 56-bit key that operates on 64-bit blocks. DES has a complex set of rules and transformations that were designed specifically to yield fast hardware implementations and slow software implementations, although this latter point is becoming less significant today since the speed of computer processors is several orders of magnitude faster today than twenty years ago. IBM also proposed a 112-bit key for DES, which was rejected at the time by the government; the use of 112-bit keys was considered in the 1990s; however, conversion was never seriously considered.
DES is defined in American National Standard X3.92 and three Federal Information Processing Standards (FIPS):
o FIPS 46-3: DES
o FIPS 74: Guidelines for Implementing and Using the NBS Data Encryption Standard
o FIPS 81: DES Modes of Operation
Information about vulnerabilities of DES can be obtained from the Electronic Frontier Foundation. Two important variants that strengthen DES are:
o Triple-DES (3DES): A variant of DES that employs up to three 56-bit keys and makes three encryption/decryption passes over the block; 3DES is also described in FIPS 46-3 and is the recommended replacement to DES.
o DESX: A variant devised by Ron Rivest. By combining 64 additional key bits with the plaintext prior to encryption, DESX effectively increases the key length to 120 bits.
More detail about DES, 3DES, and DESX can be found below in Section 5.4.
Advanced Encryption Standard (AES): In 1997, NIST initiated a very public, 4-1/2 year process to develop a new secure cryptosystem for U.S. government applications. The result, the Advanced Encryption Standard, became the official successor to DES in December 2001. AES uses an SKC scheme called Rijndael, a block cipher designed by Belgian cryptographers Joan Daemen and Vincent Rijmen. The algorithm can use a variable block length and key length; the latest specification allowed any combination of key lengths of 128, 192, or 256 bits and blocks of length 128, 192, or 256 bits. NIST initially selected Rijndael in October 2000 and formal adoption as the AES standard came in December 2001. FIPS PUB 197 describes a 128-bit block cipher employing a 128-, 192-, or 256-bit key. The AES process and Rijndael algorithm are described in more detail below in Section 5.9.
CAST-128/256: CAST-128, described in Request for Comments (RFC) 2144, is a DES-like substitution-permutation crypto algorithm, employing a 128-bit key operating on a 64-bit block. CAST-256 (RFC 2612) is an extension of CAST-128, using a 128-bit block size and a variable length (128, 160, 192, 224, or 256 bit) key. CAST is named for its developers, Carlisle Adams and Stafford Tavares, and is available internationally. CAST-256 was one of the Round 1 algorithms in the AES process.
International Data Encryption Algorithm (IDEA): Secret-key cryptosystem written by Xuejia Lai and James Massey in 1992 and patented by Ascom; a 64-bit SKC block cipher using a 128-bit key. Also available internationally.
Rivest Ciphers (aka Ron's Code): Named for Ron Rivest, a series of SKC algorithms.
o RC1: Designed on paper but never implemented.
o RC2: A 64-bit block cipher using variable-sized keys designed to replace DES. Its code has not been made public although many companies have licensed RC2 for use in their products. Described in RFC 2268.
o RC3: Found to be breakable during development.
o RC4: A stream cipher using variable-sized keys; it is widely used in commercial cryptography products, although it can only be exported using keys that are 40 bits or less in length.
o RC5: A block-cipher supporting a variety of block sizes, key sizes, and number of encryption passes over the data. Described in RFC 2040.
o RC6: An improvement over RC5, RC6 was one of the AES Round 2 algorithms.
Blowfish: A symmetric 64-bit block cipher invented by Bruce Schneier;
optimized for 32-bit processors with large data caches, it is significantly faster than DES on a Pentium/PowerPC-class machine. Key lengths can vary from 32 to 448 bits in length. Blowfish, available freely and intended as a substitute for DES or IDEA, is in use in over 80 products.
Twofish: A 128-bit block cipher using 128-, 192-, or 256-bit keys.
Designed to be highly secure and highly flexible, well-suited for large microprocessors, 8-bit smart card microprocessors, and dedicated hardware. Designed by a team led by Bruce Schneier and was one of the Round 2 algorithms in the AES process.
Camellia: A secret-key, block-cipher crypto algorithm developed jointly
by Nippon Telegraph and Telephone (NTT) Corp. and Mitsubishi Electric Corporation (MEC) in 2000. Camellia has some characteristics in common with AES: a 128-bit block size, support for 128-, 192-, and 256-bit key lengths, and suitability for both software and hardware implementations on common 32-bit processors as well as 8-bit processors (e.g., smart cards, cryptographic hardware, and embedded systems). Also described in RFC 3713. Camellia's application in IPsec is described in RFC 4312 and application in OpenPGP in RFC 5581.
MISTY1: Developed at Mitsubishi Electric Corp., a block cipher using a
128-bit key and 64-bit blocks, and a variable number of rounds. Designed for hardware and software implementations, and is resistant to differential and linear cryptanalysis. Described in RFC 2994.
Secure and Fast Encryption Routine (SAFER): Secret-key crypto scheme
designed for implementation in software. Versions have been defined for 40-, 64-, and 128-bit keys.
KASUMI: A block cipher using a 128-bit key that is part of the Third-Generation Partnership Project (3GPP), formerly known as the Universal Mobile Telecommunications System (UMTS). KASUMI is the intended confidentiality and integrity algorithm for both message content and signaling data for emerging mobile communications systems.
SEED: A block cipher using 128-bit blocks and 128-bit keys. Developed
by the Korea Information Security Agency (KISA) and adopted as a national standard encryption algorithm in South Korea. Also described in RFC 4269.
Skipjack: SKC scheme proposed for Capstone. Although the details of the
algorithm were never made public, Skipjack was a block cipher using an 80-bit key and 32 iteration cycles per 64-bit block.
2.1.2.2. Public-Key Cryptography

Public-key cryptography has been said to be the most significant new development in cryptography in the last 300-400 years. Modern PKC was first described publicly by Stanford University professor Martin Hellman and graduate student Whitfield Diffie in 1976. Their paper described a two-key crypto system in which two parties could engage in a secure communication over a non-secure communications channel without having to share a secret key. PKC depends upon the existence of so-called one-way functions, or mathematical functions that are easy to compute whereas their inverse function is relatively difficult to compute. Let me give you two simple examples:
1. Multiplication vs. factorization: Suppose I tell you that I have two numbers, 9 and 16, and that I want to calculate the product; it should take almost no time to calculate the product, 144. Suppose instead that I tell you that I have a number, 144, and I need you to tell me which pair of integers I multiplied together to obtain that number. You will eventually come up with the solution but whereas calculating the product took milliseconds, factoring will take longer because you first need to find the eight pairs of integer factors and then determine which one is the correct pair.
2. Exponentiation vs. logarithms: Suppose I tell you that I want to take the number 3 to the 6th power; again, it is easy to calculate 3^6 = 729. But if I tell you that I have the number 729 and want you to tell me the two integers that I used, x and y so that log_x 729 = y, it will take you longer to find all possible solutions and select the pair that I used.
While the examples above are trivial, they do represent two of the functional pairs that are used with PKC; namely, the ease of multiplication and exponentiation versus the relative difficulty of factoring and calculating logarithms, respectively. The mathematical "trick" in PKC is to find a trap door in the one-way function so that the inverse calculation becomes easy given knowledge of some item of information.
Generic PKC employs two keys that are mathematically related although knowledge of one key does not allow someone to easily determine the other key. One key is used to encrypt the plaintext and the other key is used to decrypt the ciphertext. The important point here is that it does not matter which key is applied first, but that both keys are required for the process to work (Figure 1B). Because a pair of keys is required, this approach is also called asymmetric cryptography.
In PKC, one of the keys is designated the public key and may be advertised as widely as the owner wants. The other key is designated the private key and is never revealed to another party. It is straightforward to send messages under this scheme. Suppose Alice wants to send Bob a message. Alice encrypts some information using Bob's public key; Bob decrypts the ciphertext using his private key. This method could also be used to prove who sent a message; Alice, for example, could encrypt some plaintext with her private key; when Bob decrypts using Alice's public key, he knows that Alice sent the message and Alice cannot deny having sent the message (non-repudiation).
Public-key cryptography algorithms that are in use today for key exchange or digital signatures include:
RSA: The first, and still most common, PKC implementation, named for the three MIT mathematicians who developed it: Ronald Rivest, Adi Shamir, and Leonard Adleman. RSA today is used in hundreds of software products and can be used for key exchange, digital signatures, or encryption of small blocks of data. RSA uses a variable size encryption block and a variable size key. The keypair is derived from a very large number, n, that is the product of two prime numbers chosen according to special rules; these primes may be 100 or more digits in length each, yielding an n with roughly twice as many digits as the prime factors. The public key information includes n and a derivative of one of the factors of n; an attacker cannot determine the prime factors of n (and, therefore, the private key) from this information alone and that is what makes the RSA algorithm so secure. (Some descriptions of PKC erroneously state that RSA's safety is due to the difficulty in factoring large prime numbers. In fact, large prime numbers, like small prime numbers, only have two factors!) The ability for computers to factor large numbers, and therefore attack schemes such as RSA, is rapidly improving and systems today can find the prime factors of numbers with more than 200 digits. Nevertheless, if a large number is created from two prime factors that are roughly the same size, there is no known factorization algorithm that will solve the problem in a reasonable amount of time; a 2005 test to factor a 200-digit number took 1.5 years and over 50 years of compute time (see the Wikipedia article on integer factorization). Regardless, one presumed protection of RSA is that users can easily increase the key size to always stay ahead of the computer processing curve. As an aside, the patent for RSA expired in September 2000, which does not appear to have affected RSA's popularity one way or the other. A detailed example of RSA is presented below in Section 5.3.
Diffie-Hellman: Diffie and Hellman's own algorithm, described in their 1976 paper. D-H is used for secret-key key exchange only, and not for authentication or digital signatures. More detail about Diffie-Hellman can be found below in Section 5.2.
Digital Signature Algorithm (DSA): The algorithm specified in NIST's
Digital Signature Standard (DSS), provides digital signature capability for the authentication of messages.
ElGamal: Designed by Taher Elgamal, a PKC system similar to Diffie-Hellman and used for key exchange.
Elliptic Curve Cryptography (ECC): A PKC algorithm based upon elliptic curves. ECC can offer levels of security with small keys comparable to RSA and other PKC methods. It was designed for devices with limited compute power and/or memory, such as smartcards and PDAs. More detail about ECC can be found below in Section 5.8. Other references include "The Importance of ECC" Web page and the "Online Elliptic Curve Cryptography Tutorial", both from Certicom.
Public-Key Cryptography Standards (PKCS): A set of interoperable standards and guidelines for public-key cryptography, designed by RSA Data Security Inc.
o PKCS #1: RSA Cryptography Standard (Also RFC 3447)
o PKCS #2: Incorporated into PKCS #1.
o PKCS #3: Diffie-Hellman Key-Agreement Standard
o PKCS #4: Incorporated into PKCS #1.
o PKCS #5: Password-Based Cryptography Standard (PKCS #5 V2.0 is also RFC 2898)
o PKCS #6: Extended-Certificate Syntax Standard (being phased out in favor of X.509v3)
o PKCS #7: Cryptographic Message Syntax Standard (Also RFC 2315)
o PKCS #8: Private-Key Information Syntax Standard (Also RFC 5208)
o PKCS #9: Selected Attribute Types (Also RFC 2985)
o PKCS #10: Certification Request Syntax Standard (Also RFC 2986)
o PKCS #11: Cryptographic Token Interface Standard
o PKCS #12: Personal Information Exchange Syntax Standard
o PKCS #13: Elliptic Curve Cryptography Standard
o PKCS #14: Pseudorandom Number Generation Standard is no longer available
o PKCS #15: Cryptographic Token Information Format Standard
Cramer-Shoup: A public-key cryptosystem proposed by R. Cramer and V. Shoup of IBM in 1998.
Key Exchange Algorithm (KEA): A variation on Diffie-Hellman; proposed as the key exchange method for Capstone.
LUC: A public-key cryptosystem designed by P.J. Smith and based on Lucas sequences. Can be used for encryption and signatures, using integer factoring.
A digression: Who invented PKC? I tried to be careful in the first paragraph of this section to state that Diffie and Hellman "first described publicly" a PKC scheme. Although I have categorized PKC as a two-key system, that has been merely for convenience; the real criterion for a PKC scheme is that it allows two parties to exchange a secret even though the communication with the shared secret might be overheard. There seems to be no question that Diffie and Hellman were first to publish; their method is described in the classic paper, "New Directions in Cryptography," published in the November 1976 issue of IEEE Transactions on Information Theory. As shown below, Diffie-Hellman uses the idea that finding logarithms is relatively harder than exponentiation. And, indeed, it is the precursor to modern PKC which does employ two keys. Rivest, Shamir, and Adleman described an implementation that extended this
idea in their paper "A Method for Obtaining Digital Signatures and Public-Key Cryptosystems," published in the February 1978 issue of the Communications of the ACM (CACM). Their method, of course, is based upon the relative ease of finding the product of two large prime numbers compared to finding the prime factors of a large number. Some sources, though, credit Ralph Merkle with first describing a system that allows two parties to share a secret although it was not a two-key system, per se. A Merkle Puzzle works where Alice creates a large number of encrypted keys, sends them all to Bob so that Bob chooses one at random and then lets Alice know which he has selected. An eavesdropper will see all of the keys but can't learn which key Bob has selected (because he has encrypted the response with the chosen key). In this case, Eve's effort to break in is the square of the effort of Bob to choose a key. While this difference may be small it is often sufficient. Merkle apparently took a computer science course at UC Berkeley in 1974 and described his method, but had difficulty making people understand it; frustrated, he dropped the course. Meanwhile, he submitted the paper "Secure Communication Over Insecure Channels" which was published in the CACM in April 1978; Rivest et al.'s paper even makes reference to it. Merkle's method certainly wasn't published first, but did he have the idea first? An interesting question, maybe, but who really knows? For some time, it was a quiet secret that a team at the UK's Government Communications Headquarters (GCHQ) had first developed PKC in the early 1970s. Because of the nature of the work, GCHQ kept the original memos classified. In 1997, however, the GCHQ changed their posture when they realized that there was nothing to gain by continued silence. 
Documents show that a GCHQ mathematician named James Ellis started research into the key distribution problem in 1969 and that by 1975, Ellis, Clifford Cocks, and Malcolm Williamson had worked out all of the fundamental details of PKC, yet couldn't talk about their work. (They were, of course, barred from challenging the RSA patent!) After more than 20 years, Ellis, Cocks, and Williamson have begun to get their due credit.
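The asymmetry that Diffie-Hellman relies on (exponentiation is cheap, finding logarithms is costly) can be sketched in a few lines of Python. This is only a toy illustration: the prime 23, the generator 5, and the private values 6 and 15 are assumptions chosen for readability, and the helper names are hypothetical. Real systems use primes thousands of bits long, for which the brute-force search shown here is infeasible.

```python
# Toy Diffie-Hellman exchange. All numbers are illustrative assumptions and
# far too small for real use.
P = 23  # public prime modulus (toy value)
G = 5   # public generator (toy value)


def public_value(private_key: int) -> int:
    """Modular exponentiation: fast even for very large numbers."""
    return pow(G, private_key, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Each party raises the other's public value to its own private key."""
    return pow(their_public, my_private, P)


def brute_force_log(public: int) -> int:
    """What an eavesdropper must do: search for x with G**x % P == public."""
    for x in range(1, P):
        if pow(G, x, P) == public:
            return x
    raise ValueError("no discrete logarithm found")


alice_private, bob_private = 6, 15
alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Both sides arrive at the same secret without ever transmitting it.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
print(alice_secret == bob_secret)
```

With a 23-element group the eavesdropper's search finishes instantly; the scheme is only useful because the search grows impractically large as the modulus grows, while `pow` stays fast.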
2.1.2.4. Hash Functions
Hash functions, also called message digests and one-way encryption, are algorithms that, in some sense, use no key (Figure 1C). Instead, a fixed-length hash value is computed based upon the plaintext that makes it impossible for either the contents or length of the plaintext to be recovered. Hash algorithms are typically used to provide a digital fingerprint of a file's contents, often used to ensure that the file has not been altered by an intruder or virus. Hash functions are also commonly employed by many operating systems to encrypt passwords. Hash functions, then, provide a measure of the integrity of a file. Hash algorithms that are in common use today include:
Message Digest (MD) algorithms: A series of byte-oriented algorithms that produce a 128-bit hash value from an arbitrary-length message.
o MD2 (RFC 1319): Designed for systems with limited memory, such as smart cards.
o MD4 (RFC 1320): Developed by Rivest, similar to MD2 but designed specifically for fast processing in software.
o MD5 (RFC 1321): Also developed by Rivest after potential weaknesses were reported in MD4; this scheme is similar to MD4 but is slower because more manipulation is made to the original data. MD5 has been implemented in a large number of products although several weaknesses in the algorithm were demonstrated by German cryptographer Hans Dobbertin in 1996.
Secure Hash Algorithm (SHA): Algorithm for NIST's Secure Hash Standard (SHS). SHA-1 produces a 160-bit hash value and was originally published as FIPS 180-1 and RFC 3174. FIPS 180-2 describes five algorithms in the SHS: SHA-1 plus SHA-224, SHA-256, SHA-384, and SHA-512 which can produce hash values that are 224, 256, 384, or 512 bits in length, respectively. SHA-224, -256, -384, and -512 are also described in RFC 4634.
RIPEMD: A series of message digests that initially came from the RIPE (RACE Integrity Primitives Evaluation) project. RIPEMD-160 was designed by Hans Dobbertin, Antoon Bosselaers, and Bart Preneel, and optimized for 32-bit processors to replace the then-current 128-bit hash functions. Other versions include RIPEMD-256, RIPEMD-320, and RIPEMD-128.
HAVAL (HAsh of VAriable Length): Designed by Y. Zheng, J. Pieprzyk and J. Seberry, a hash algorithm with many levels of security. HAVAL can create hash values that are 128, 160, 192, 224, or 256 bits in length.
Whirlpool: A relatively new hash function, designed by V. Rijmen and P.S.L.M. Barreto. Whirlpool operates on messages less than 2^256 bits in length, and produces a message digest of 512 bits. The design of this hash function is very different from that of MD5 and SHA-1, making it immune to the attacks that work on those hashes (see below).
Tiger: Designed by Ross Anderson and Eli Biham, Tiger is designed to be secure, run efficiently on 64-bit processors, and easily replace MD4, MD5, SHA and SHA-1 in other applications. Tiger/192 produces a 192-bit output and is compatible with 64-bit architectures; Tiger/128 and Tiger/160 produce the first 128 and 160 bits, respectively, to provide compatibility with the other hash functions mentioned above.
Hash functions are sometimes misunderstood and some sources claim that no two files can have the same hash value. This is, in fact, not correct. Consider a hash function that provides a 128-bit hash value. There are, obviously, 2^128 possible hash values. But there are a lot more than 2^128 possible files. Therefore, there have to be multiple files (in fact, there have to be an infinite number of files!) that can have the same 128-bit hash value. The difficulty is finding two files with the same hash! What is, indeed, very hard to do is to try to create a file that has a given hash value so as to force a hash value collision, which is the reason that hash functions are used extensively for information security and computer forensics applications. Alas, researchers in 2004 found that practical collision attacks could be launched on MD5, SHA-1, and other hash algorithms. Readers interested in this problem should read the following:
Burr, W. (2006, March/April). Cryptographic hash standards: Where do we go from here? IEEE Security & Privacy, 4(2), 88-91.
Gutman, P., Naccache, D., & Palmer, C.C. (2005, May/June). When hashes collide. IEEE Security & Privacy, 3(3), 68-71.
Klima, V. (March 2005). "Finding MD5 Collisions - a Toy For a Notebook."
Thompson, E. (2005, February). MD5 collisions and the impact on computer forensics. Digital Investigation, 2(1), 36-40.
Wang, X., Feng, D., Lai, X., & Yu, H. (August 2004). "Collisions for Hash Functions MD4, MD5, HAVAL-128 and RIPEMD."
Wang, X., Yin, Y.L., & Yu, H. (February 2005). "Collision Search Attacks on SHA1."
Readers are also referred to the Eindhoven University of Technology HashClash Project Web site. An excellent overview of the situation with hash collisions (circa 2005) can be found in RFC 4270 (by P. Hoffman and B. Schneier, November 2005). And for additional information on hash functions, see David Hopwood's MessageDigest Algorithms page.
At this time, there is no obvious successor to MD5 and SHA-1 that could be put into use quickly; there are so many products using these hash functions that it could take many years to flush out all use of 128- and 160-bit hashes. That said, NIST announced in 2007 their Cryptographic Hash Algorithm Competition to find the next-generation secure hashing method. Dubbed SHA-3, this new scheme will augment FIPS 180-2. A list of submissions can be found at The SHA-3 Zoo. The SHA-3 standard may not be available until 2011 or 2012.
Certain extensions of hash functions are used for a variety of information security and digital forensics applications, such as:
Hash libraries are sets of hash values corresponding to known files. A hash library of known good files, for example, might be a set of files known to be a part of an operating system, while a hash library of known bad files might be of a set of known child pornographic images.
Rolling hashes refer to a set of hash values that are computed based upon a fixed-length "sliding window" through the input. As an example, a hash value might be computed on bytes 1-10 of a file, then on bytes 2-11, 3-12, 4-13, etc.
Fuzzy hashes are an area of intense research and are hash values designed so that two similar inputs produce similar results. Fuzzy hashes are used to detect documents, images, or other files that are close to each other with respect to content. See "Fuzzy Hashing" by Jesse Kornblum for a good treatment of this topic.
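The first two extensions in the list above can be sketched with Python's standard hashlib module. This is a minimal sketch under stated assumptions: SHA-256 is picked only for illustration, the KNOWN_GOOD set and the function names are hypothetical, and the sliding window simply rehashes every slice, whereas production rolling hashes update the value incrementally as the window moves.

```python
import hashlib
from typing import List


def file_digest(data: bytes) -> str:
    """Fixed-length fingerprint of arbitrary-length input (SHA-256 here)."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical hash library of "known good" files.
KNOWN_GOOD = {file_digest(b"trusted system file contents")}


def is_known_good(data: bytes) -> bool:
    """Look a file up in the library by its hash value."""
    return file_digest(data) in KNOWN_GOOD


def rolling_hashes(data: bytes, window: int = 10) -> List[str]:
    """Hash every window-byte slice: bytes 1-10, then 2-11, 3-12, ..."""
    return [
        hashlib.sha256(data[i:i + window]).hexdigest()
        for i in range(len(data) - window + 1)
    ]


sample = bytes(range(13))
print(is_known_good(b"trusted system file contents"))
print(len(rolling_hashes(sample)))  # one hash per window position
```

A 13-byte input with a 10-byte window yields the four windows named in the text (bytes 1-10, 2-11, 3-12 and 4-13).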
2.2 Steganography
The word steganography literally means covered writing as derived from Greek. It includes a vast array of methods of secret communications that conceal the very existence of the message. Among these methods are invisible inks, microdots, character arrangement (other than the cryptographic methods of permutation and substitution), digital signatures, covert channels and spread-spectrum communications. Steganography is the art of concealing the existence of information within seemingly innocuous carriers.
Steganography can be viewed as akin to cryptography. Both have been used throughout recorded history as means to protect information. At times these two technologies seem to converge while the objectives of the two differ. Cryptographic techniques "scramble" messages so if intercepted, the messages cannot be understood. Steganography, in essence, "camouflages" a message to hide its existence and make it seem "invisible", thus concealing the fact that a message is being sent altogether. An encrypted message may draw suspicion while an invisible message will not.
Over the past couple of years, steganography has been the source of a lot of discussion, particularly as it was suspected that terrorists connected with the September 11 attacks might have used it for covert communications. While no such connection has been proven, the concern points out the effectiveness of steganography as a means of obscuring data. Indeed, along with encryption, steganography is one of the fundamental ways by which data can be kept confidential. This article will offer a brief introductory discussion of steganography: what it is, how it can be used, and the true implications it can have on information security.
David Kahn places steganography and cryptography in a table to differentiate between the types and the counter methods used. Here security is defined as methods of "protecting" information where intelligence is defined as methods of "retrieving" information.
Signal Security | Signal Intelligence
Communication Security | Communication Intelligence
  Steganography (invisible inks, open codes, messages in hollow heels) and Transmission Security (spurt radio and spread spectrum systems) | Interception and direction-finding
  Cryptography (codes and ciphers) | Cryptanalysis
  Traffic security (call-sign changes, dummy messages, radio silence) | Traffic analysis (direction-finding, message-flow studies, radio finger printing)
Electronic Security | Electronic Intelligence
  Emission Security (shifting of radar frequencies, spread spectrum) | Electronic Reconnaissance (eaves-dropping on radar emissions)
  Counter-Countermeasures: "looking through" (jammed radar) | Countermeasures (jamming radar and false radar echoes)
Table 1: Kahn's Security Table
Steganography has its place in security. It is not intended to replace cryptography but to supplement it. Hiding a message with steganography methods reduces the chance of a message being detected. However, if that message is also encrypted, then once discovered it must also be cracked (yet another layer of protection).
While we are discussing it in terms of computer security, steganography is really nothing new, as it has been around since the times of ancient Rome. For example, in ancient Rome and Greece, text was traditionally written on wax that was poured on top of stone tablets. If the sender of the information wanted to obscure the message - for purposes of military intelligence, for instance - they would use steganography: the wax would be scraped off and the message would be inscribed or written directly on the tablet; wax would then be poured on top of the message, thereby obscuring not just its meaning but its very existence [1].
According to Dictionary.com, steganography (also known as "steg" or "stego") is "the art of writing in cipher, or in characters, which are not intelligible except to persons who have the key; cryptography" [2]. In computer terms, steganography has evolved into the practice of hiding a message within a larger one in such a way that others cannot discern the presence or contents of the hidden message [3]. In contemporary terms, steganography has evolved into a digital strategy of hiding a file in some form of multimedia, such as an image, an audio file (like a .wav or mp3) or even a video file.
2.2.1 History and Steganography
Throughout history, a multitude of methods and variations have been used to hide information. David Kahn's The Codebreakers provides an excellent accounting of this history [Kahn67]. Bruce Norman recounts numerous tales of cryptography and steganography during times of war in Secret Warfare: The Battle of Codes and Ciphers.
One of the first documents describing steganography is from the Histories of Herodotus. In ancient Greece, text was written on wax covered tablets. In one story Demaratus wanted to notify Sparta that Xerxes intended to invade Greece. To avoid capture, he scraped the wax off of the tablets and wrote a message on the underlying wood. He then covered the tablets with wax again. The tablets appeared to be blank and unused so they passed inspection by sentries without question.
Another ingenious method was to shave the head of a messenger and tattoo a message or image on the messenger's head. After allowing his hair to grow, the message would be undetected until the head was shaved again.
Another common form of invisible writing is through the use of invisible inks. Such inks were used with much success as recently as WWII. An innocent letter may contain a very different message written between the lines [Zim48]. Early in WWII steganographic technology consisted almost exclusively of invisible inks [Kahn67]. Common sources for invisible inks are milk, vinegar, fruit juices and urine. All of these darken when heated. As technology improved and these inks became easy to reveal, more sophisticated inks were developed which react to various chemicals. Some messages had to be "developed" much as photographs are developed with a number of chemicals in processing labs.
Null ciphers (unencrypted messages) were also used. The real message is "camouflaged" in an innocent sounding message. Due to the "sound" of many open coded messages, the suspect communications were detected by mail filters. However "innocent" messages were allowed to flow through. An example of a message containing such a null cipher from [JDJ01] is:
Fishing freshwater bends and saltwater coasts rewards anyone feeling stressed. Resourceful anglers usually find masterful leapers fun and admit swordfish rank overwhelming anyday.
By taking the third letter in each word, the following message emerges [Zevon]:
Send Lawyers, Guns, and Money.
The following message was actually sent by a German Spy in WWII [Kahn67]:
Apparently neutral's protest is thoroughly discounted and ignored. Isman hard hit. Blockade issue affects pretext for embargo on by-products, ejecting suets and vegetable oils.
Taking the second letter in each word the following message emerges:
Pershing sails from NY June 1.
As message detection improved, new technologies were developed which could pass more information and be even less conspicuous. The Germans developed microdot technology which FBI Director J. Edgar Hoover referred to as "the enemy's masterpiece of espionage." Microdots are photographs the size of a printed period having the clarity of standard-sized typewritten pages. The first microdots were discovered masquerading as a period on a typed envelope carried by a German agent in 1941. The message was not hidden, nor encrypted. It was just so small as to not draw attention to itself (for a while). Besides being so small, microdots permitted the transmission of large amounts of data including drawings and photographs [Kahn67].
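The null-cipher reading rule described earlier in this section (take the n-th letter of every word) is simple enough to check mechanically. The following Python sketch is an illustration only; the function name nth_letters is hypothetical, and it treats apostrophes and hyphens as part of a word, which is an assumption about how the words are meant to be counted.

```python
import re


def nth_letters(text: str, n: int) -> str:
    """Return the n-th letter (1-based) of every word, in lower case."""
    # Letters, apostrophes and hyphens count as part of a word (assumption).
    words = re.findall(r"[A-Za-z'-]+", text)
    return "".join(word[n - 1] for word in words if len(word) >= n).lower()


cover = (
    "Fishing freshwater bends and saltwater coasts rewards anyone "
    "feeling stressed. Resourceful anglers usually find masterful "
    "leapers fun and admit swordfish rank overwhelming anyday."
)

print(nth_letters(cover, 3))  # prints "sendlawyersgunsandmoney"
```

The recovered letters still have to be split into words by the reader, which is part of what lets such a message pass as harmless text.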
With many methods being discovered and intercepted, the Office of Censorship took extreme actions such as banning flower deliveries which contained delivery dates, crossword puzzles and even report cards as they can all contain secret messages. Censors even went as far as rewording letters and replacing stamps on envelopes.
With every discovery of a message hidden using an existing application, a new steganographic application is being devised. There are even new twists to old methods. Drawings have often been used to conceal or reveal information. It is simple to encode a message by varying lines, colors or other elements in pictures. Computers take such a method to new dimensions as we will see later.
Even the layout of a document can provide information about that document. Brassil et al authored a series of publications dealing with document identification and marking by modulating the position of lines and words [Brassil-Infocom94, Brassil-Infocom94, Brassil-CISS95]. Similar techniques can also be used to provide some other "covert" information just as 0 and 1 are informational bits for a computer. As in one of their examples, word-shifting can be used to help identify an original document [Brassil-CISS95]. Though not applied as discussed in the series by Brassil et al, a similar method can be applied to display an entirely different message. Take the following sentence (S0):
We explore new steganographic and cryptographic algorithms and techniques throughout the world to produce wide variety and security in the electronic web called the Internet.
and apply some word shifting algorithm (this is sentence S1).
We explore new steganographic and cryptographic algorithms and techniques throughout the world to produce wide variety and security in the electronic web called the Internet.
By overlapping S0 and S1, the following sentence is the result:
We explore new steganographic and cryptographic algorithms and techniques throughout the world to produce wide variety and security in the electronic web called the Internet.
This is achieved by expanding the space before explore, the, wide, and web by one point and condensing the space after explore, world, wide and web by one point in sentence S1. Independently, the sentences containing the shifted words appear harmless, but combining this with the original sentence produces a different message: explore the world wide web.
2.3 Steganography Uses:
Like many security tools, steganography can be used for a variety of reasons, some good, some not so good. Legitimate purposes can include things like watermarking images for reasons such as copyright protection. Digital watermarks (also known as fingerprinting, significant especially in copyrighting material) are similar to steganography in that they are overlaid in files, which appear to be part of the original file and are thus not easily detectable by the average person. Steganography can also be used as a way to make a substitute for a one-way hash value (where you take a variable length input and create a static length output string to verify that no changes have been made to the original variable length input). Further, steganography can be used to tag notes to online images (like post-it notes attached to paper files). Finally, steganography can be used to maintain the confidentiality of valuable information, to protect the data from possible sabotage, theft, or unauthorized viewing.
Unfortunately, steganography can also be used for illegitimate reasons. For instance, if someone was trying to steal data, they could conceal it in another file or files and send it out in an innocent looking email or file transfer. Furthermore, a person with a hobby of saving pornography, or worse, to their hard drive, may choose to hide the evidence through the use of steganography. And, as was pointed out in the concern for terroristic purposes, it can be used as a means of covert communication. Of course, this can be both a legitimate and an illegitimate application.
2.4 Steganography Tools
There are a vast number of tools that are available for steganography. An important distinction that should be made among the tools available today is the difference between tools that do steganography, and tools that do steganalysis, which is the method of detecting steganography and destroying the original message. Steganalysis focuses on this aspect, as opposed to simply discovering and decrypting the message, because this can be difficult to do unless the encryption keys are known.
A comprehensive discussion of steganography tools is beyond the scope of this article. However, there are many good places to find steganography tools on the Net. One good place to start your search for stego tools is on Neil Johnson's Steganography and Digital Watermarking Web site. The site includes an extensive list of steganography tools. Another comprehensive tools site is located at the StegoArchive.com. For steganalysis tools, a good site to start with is Neil Johnson's Steganalysis site. Niels Provos's site is also a great reference site, but is currently being relocated, so keep checking back on its progress.
The plethora of tools available also tends to span the spectrum of operating systems. Windows, DOS, Linux, Mac, Unix: you name it, and you can probably find it.
2.4.1 Working of Steganography Tools:
To show how easy steganography is, I started out by downloading one of the more popular freeware tools out now: F5, then moved to a tool called SecurEngine, which hides text files within larger text files, and lastly a tool that hides files in MP3s called MP3Stego. I also tested one commercial steganography product, Steganos Suite.
F5 was developed by Andreas Westfeld, and runs as a DOS client. A couple of GUIs were later developed: one named "Frontend", developed by Christian Wohne and the other, named "Stegano", by Thomas Biel. I tried F5, beta version 12. I found it very easy to encode a message into a JPEG file, even if the buttons in the GUI are written in German! Users can simply do this by following the buttons, inputting the JPEG file path, then the location of the data that is being hidden (in my case, I used a simple text file created in Notepad), at which point the program prompts the user for a pass phrase. As you can see by the before and after pictures below, it is very hard to tell them apart, embedded message or not.
Granted, the file that I embedded here was very small (it included one line of text: "This is a test. This is only a test."), so not that many pixels had to be replaced to hide my message. But what if I tried to hide a larger file? F5 only hides text files. I tried to hide a larger Word document and although it did hide the file, when I tried to decrypt it, it came out as garbage. However, larger text files seemed to hide in the picture just as well as my small, one-line message.
SecurEngine doesn't seem to be as foolproof as the tools that hide text within pictures. When I hid my small text file in a bigger text file, I found an odd character at the bottom of the encoded file (""). This character was not in the original file. SecurEngine gives users the option of just hiding the image, hiding the image as well as encrypting it, or both. The test message was encrypted and decrypted without issue. SecurEngine also has a feature that helps to "wipe" files (to delete them more securely).
MP3Stego, a tool that hides data in MP3 files, worked very well. The process works like this: you encode a file, a text file for example, with a .WAV file, in order for it to be compressed into MP3 format. One problem that I ran into was that in order to hide data of any size, I had to find a file that was proportional in size. So, for instance, my small text message from the previous exercise was too big to hide in a .WAV file (the one that I originally tried was 121KB, and the text file was around 36 bytes). In order to ultimately hide a file that was 5 bytes (only bearing the message "test."), I found a .WAV file that was 627 KB. The ultimate MP3 file size was 57KB.
Steganos Suite is a commercial software package of numerous stego tools all rolled into one. In addition to a nifty Internet trace destructor function and a computer file shredder, it has a function called the File Manager. This allows users to encrypt and hide files on their hard drive. The user selects a file or folder to hide, and then selects a "carrier" file, which is defined as a graphic or sound file. It will also create one for you if you prefer, if you have a scanner or microphone available. If you don't have a file handy or if you want to create one, the File Manager will search your hard drive for an appropriate carrier. This tool looks for a wider variety of file types than the majority of the freeware tools that I perused (such as .DLL and .DIB files), so if you intend to do quite a bit of file hiding, you might want to invest in a commercial package.
2.5 Steganography and Security
As mentioned previously, steganography is an effective means of hiding data, thereby protecting the data from unauthorized or unwanted viewing. But stego is simply one of many ways to protect the confidentiality of data. It is probably best used in conjunction with another data-hiding method. When used in combination, these methods can all be a part of a layered security approach. Some good complementary methods include:
Encryption - Encryption is the process of passing data or plaintext through a series of mathematical operations that generate an alternate form of the original data known as ciphertext. The encrypted data can only be read by parties who have been given the necessary key to decrypt the ciphertext back into its original plaintext form. Encryption doesn't hide data, but it does make it hard to read!
Hidden directories (Windows) - Windows offers this feature, which allows users to hide files. Using this feature is as easy as changing the properties of a directory to "hidden", and hoping that no one displays all types of files in their explorer.
Hiding directories (Unix) - A directory can be hidden inside an existing directory that holds a lot of files, such as the /dev directory on a Unix implementation, or given a name that starts with three dots (...) rather than the normal single or double dot.
Covert channels - Some tools can be used to transmit valuable data in seemingly normal network traffic. One such tool is Loki. Loki is a tool that hides data in ICMP traffic (like ping).
2.6 Protecting Against Malicious Steganography
Unfortunately, all of the methods mentioned above can also be used to hide illicit, unauthorized or unwanted activity. What can you do to prevent or detect issues with stego? There is no easy answer. If someone has decided to hide their data, they will probably be able to do so fairly easily. The only way to detect steganography is to be actively looking for it in specific files, or to get very lucky. Sometimes an actively enforced security policy can provide the answer: this would require the implementation of company-wide acceptable use policies that restrict the installation of unauthorized programs on company computers.
Using the tools that you already have to detect movement and behavior of traffic on your network may also be helpful. Network intrusion detection systems can help administrators to gain an understanding of normal traffic in and around your network and can thus assist in detecting any type of anomaly, especially an increase in the movement of large images around your network. If the administrator is aware of this sort of anomalous activity, it may warrant further investigation. Host-based intrusion detection systems deployed on computers may also help to identify anomalous storage of image and/or video files.
A research paper by Stefan Hetzel cites two methods of attacking steganography, which really are also methods of detecting it. They are the visual attack (actually seeing the differences in the files that are encoded) and the statistical attack: "The idea of the statistical attack is to compare the frequency distribution of the colors of a potential stego file with the theoretically expected frequency distribution for a stego file." It might not be the quickest method of protection, but if you suspect this type of activity, it might be the most effective. For JPEG files specifically, a tool called Stegdetect, which looks for signs of steganography in JPEG files, can be employed.
Stegbreak, a companion tool to Stegdetect, works to decrypt possible messages encoded in a suspected steganographic file, should that be the path you wish to take once the stego has been detected.
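The statistical attack quoted in this section can be made concrete with a small sketch. The version below is an assumption about how such a comparison might look, not the method of any particular tool: it uses a "pairs of values" chi-square style statistic, on the reasoning that overwriting least significant bits with message data tends to equalize the counts of each value pair (2k, 2k+1). The function name and the synthetic test data are hypothetical, and the all-even cover is deliberately exaggerated to make the effect visible.

```python
import random
from collections import Counter


def pov_chi_square(values) -> float:
    """Chi-square style statistic over value pairs (2k, 2k+1).

    For data whose low bits were overwritten, both members of a pair are
    expected to occur about equally often, so the statistic stays small.
    """
    counts = Counter(values)
    stat = 0.0
    for even in range(0, 256, 2):
        a, b = counts[even], counts[even + 1]
        expected = (a + b) / 2
        if expected > 0:
            stat += (a - expected) ** 2 / expected
    return stat


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Synthetic "cover": only even values, an exaggerated assumption.
cover = [2 * rng.randrange(128) for _ in range(10000)]
# Synthetic "stego": same data with every low bit replaced by a random bit.
stego = [(v & 0xFE) | rng.getrandbits(1) for v in cover]

cover_stat = pov_chi_square(cover)
stego_stat = pov_chi_square(stego)
print(stego_stat < cover_stat)
```

A low statistic means the observed distribution is close to what would be expected of a stego file; real detectors turn this into a probability and scan the file in increasing portions.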
Chapter 3 PC Software that Provides Steganographic Services

3.1 Background
Steganographic software is new and very effective. Such software enables information to be hidden in graphic, sound and apparently "blank" media. Charles Kurak and John McHugh discuss the implications of downgrading an image (security downgrading) when it may contain some other information [Kurak92]. Though not explicitly stated, the author(s) of StegoDos mention embedding viruses in images [StegoDos].
In the computer, an image is an array of numbers that represent light intensities at various points (pixels¹) in the image. A common image size is 640 by 480 and 256 colors (or 8 bits per pixel). Such an image could contain about 300 kilobits of data.
There are usually two types of files used when embedding data into an image. The innocent looking image which will hold the hidden information is a "container." A "message" is the information to be hidden. A message may be plain-text, ciphertext, other images or anything that can be embedded in the least significant bits (LSB) of an image.
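The idea of embedding a message in the least significant bits of a container can be sketched as follows. This is a minimal illustration, assuming the container is available as a flat sequence of 8-bit values; the function names are hypothetical, and real tools add headers, encryption and a pass-phrase-driven choice of positions that are not shown here.

```python
def embed_lsb(container: bytes, message: bytes) -> bytearray:
    """Overwrite the lowest bit of successive container bytes with message bits."""
    bits = [(byte >> shift) & 1 for byte in message for shift in range(7, -1, -1)]
    if len(bits) > len(container):
        raise ValueError("message too large for container")
    stego = bytearray(container)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it to the bit
    return stego


def extract_lsb(stego: bytes, length: int) -> bytes:
    """Read back `length` message bytes from the lowest bits."""
    out = bytearray()
    for i in range(length):
        value = 0
        for b in stego[i * 8:(i + 1) * 8]:
            value = (value << 1) | (b & 1)
        out.append(value)
    return bytes(out)


pixels = bytes(range(256)) * 2        # stand-in for 512 8-bit pixel values
stego = embed_lsb(pixels, b"test.")
print(extract_lsb(stego, 5))

# One bit per 8-bit pixel gives the capacity quoted for a 640 x 480 image.
capacity_bits = 640 * 480             # 307,200 bits, about 300 kilobits
```

Each container byte changes by at most one intensity level, which is why the altered image is so hard to tell apart from the original.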
For example: Suppose we have a 24-bit image 1024 x 768 (this is a common resolution for satellite images, electronic astral photographs and other high resolution graphics). This may produce a file over 2 megabytes in size (1024x768x24/8 = 2,359,296 bytes). All color variations are derived from three primary colors, Red, Green and Blue. Each primary color is represented by 1 byte (8 bits). 24-bit images use 3 bytes per pixel. If information is stored in the least significant bit (LSB) of each byte, 3 bits can be stored in each pixel. The "container" image will look identical to the human eye, even if viewing the picture side by side with the original.
Unfortunately, 24-bit images are uncommon (with exception of the formats mentioned earlier) and quite large. They would draw attention to themselves when being transmitted across a network. Compression would be beneficial if not necessary to transmit such a file. But file compression may interfere with the storage of information.
Kurak and McHugh identify two kinds of compression, lossless and lossy [Kurak92]. Both methods save storage space but may present different results when the information is uncompressed.
¹ A pixel is an instance of color, a point in a picture.
Lossless compression is preferred when there is a requirement that the original information remain intact (as with steganographic images). The original message can be reconstructed exactly. This type of compression is typical in GIF2 and BMP3 images.
Lossy compression, while also saving space, may not maintain the integrity of the original image. This method is typical in JPG images and yields very good compression.
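Why lossy compression is a problem for LSB embedding can be seen with a deliberately crude stand-in for a lossy codec. The sketch below rounds each byte to a multiple of a step size; that rounding is an assumption made purely for illustration and is far simpler than what JPG actually does. The sample values are also assumed.

```matlab
% Eight cover bytes carrying eight hidden bits in their LSBs (assumed sample data).
cover  = uint8([200 201 198 197 203 205 199 202]);
hidden = uint8([1 0 1 1 0 0 1 0]);
stego  = bitor(bitand(cover, uint8(254)), hidden);

% Lossless round trip: the bytes come back exactly, so the bits survive.
losslessOut  = stego;
bitsLossless = bitand(losslessOut, uint8(1));

% Crude lossy stand-in: quantize every byte to a multiple of 4.
step      = 4;
lossyOut  = uint8(round(double(stego) / step) * step);
bitsLossy = bitand(lossyOut, uint8(1));

fprintf('lossless recovers message: %d\n', isequal(bitsLossless, hidden));
fprintf('lossy recovers message:    %d\n', isequal(bitsLossy, hidden));
fprintf('largest pixel error after lossy step: %d\n', ...
        max(abs(double(lossyOut) - double(stego))));
```

The pixel error introduced by the lossy step is small, yet it is enough to wipe out every least significant bit, and with them the message.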
To illustrate the advantage of lossy compression, Renoir's Le Moulin de la Galette was retrieved as a 175,808 byte JPG image 1073 x 790 pixels with 16 million possible colors. The colors were maintained when converting it to a 24-bit BMP file but the file size became 2,649,019 bytes! Converting again to a GIF file, the colors were reduced to 256 colors (8-bit) and the new file is 775,252 bytes. The 256 color image is a very good approximation of Renoir's painting.
Notes: GIF is the Graphic Interchange Format, developed by CompuServe as a device-independent method of storing images. BMP is the Windows and OS/2 bitmap picture file. JPG/JPEG (Joint Photographic Experts Group) is a device-independent method for storing images which supports 24-bit images.
Most steganographic software available neither supports nor recommends using JPG files (an exception is noted later in the paper). The next best alternative to 24-bit images is to use 256 color (or gray-scale) images. These are the most common images found on the Internet in the form of GIF files. Each pixel is represented as a byte (8 bits). Many authors of steganography software and articles stress the use of gray-scale images (those with 256 shades of gray or better) [Arachelian, Aura95, Kurak92, Maroney]. What matters is not whether the image is gray-scale, but the degree to which the colors change between bit values. Gray-scale images are very good because the shades change gradually from byte to byte. The following is a palette containing 256 shades of gray.
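A 256-shade gray palette is easy to construct numerically, which also makes the point about gradual change checkable. In an 8-bit palette image each pixel stores a palette index, so flipping the LSB of an index swaps palette entries 2k and 2k+1. The sketch below compares a sorted gray palette with the same shades in an arbitrary order; the scrambled ordering and the variable names are assumptions for illustration only.

```matlab
% 256-entry gray palette: entry i holds intensity i-1 (0..255).
grayPalette = (0:255).';

% Same shades in a scrambled order (a fixed, arbitrary permutation for repeatability).
scrambled = grayPalette(mod((0:255) * 77, 256) + 1);

% Flipping the LSB of an index pairs entry 2k with entry 2k+1 (0-based).
evenIdx = 1:2:255;   % MATLAB positions of 0-based indices 0,2,4,...
oddIdx  = 2:2:256;   % MATLAB positions of 0-based indices 1,3,5,...

jumpGray      = abs(grayPalette(evenIdx) - grayPalette(oddIdx));
jumpScrambled = abs(scrambled(evenIdx)   - scrambled(oddIdx));

fprintf('sorted gray palette: largest shade jump = %d\n', max(jumpGray));
fprintf('scrambled palette:   largest shade jump = %d\n', max(jumpScrambled));
```

With the sorted palette an LSB flip moves a pixel to the adjacent shade; with an unordered palette the same flip can land on a visibly different color.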
A similar image with 16 shades of gray (four-bit color) may look very close to one with 256 shades of gray, but the palette has fewer variations with which to work. The subtleties permit data to be stored without the human eye catching the changes. Many argue that gray-scale images render the "best" results for steganography. However, using gray-scale or color is not as important as the subtleties in color variation. Consider the following two 256 color palettes.
Figure 3 illustrates subtle changes in color variations. It is difficult to differentiate between many of the colors in this palette. Is the palette in Figure 2 "good" for steganography? Well, it depends. Subtle color changes can be seen in Figure 2, but other color variances seem to be rather drastic. However, one must consider the image in addition to the palette. Obviously, an image with large areas of solid colors is a poor choice, as variances created from the embedded message will be noticeable in the solid areas (a palette as in Figure 3 would offset this). Figure 2 is the palette from a 256 color version of Renoir's Le Moulin de la Galette. Based on embedding this image with text and graphic messages, it is a very good container for holding data.
3.2 Evaluation Method

Various steganographic software packages were explored. The evaluation process was to determine the limitations and flexibility of the software readily available to the public. Message and container files were selected before testing. This proved to be a problem with some packages due to limitations of the software. The images selected had to be altered to fit the constraints of the software, and other containers were used. In all, 25 files were used as containers (many more than I have room to discuss). The files used for evaluation included two "message" files and two "container" files. The "message" files are those to be hidden in the innocent-looking "container" files. Message 1 contains the following plain-text and will be referred to as M1:
Steganography is the art and science of communicating in a way which hides the existence of the communication. In contrast to cryptography, where the "enemy" is allowed to detect, intercept and modify messages without being able to violate certain security premises guaranteed by a cryptosystem, the goal of steganography is to hide messages inside other "harmless" messages in a way that does not allow any "enemy" to even detect that there is a second secret message present.
The satellite photograph is of a major Soviet strategic bomber base near Dolon, Kazakhstan, taken August 20, 1966. An Executive Order, signed by President Clinton on 23 February 1995, has authorized the declassification of satellite photographs collected by the U.S. intelligence community during the 1960s. This and other photographs are available on the Internet via the U.S. Geological Survey - National Mapping Information - EROS Data Center.
The Container Files
Figure 5: Renoir's Le Moulin de la Galette - Container C1
Figure 6: Droeshout engraving of William Shakespeare - Container C2
Notes: Le Moulin de la Galette by Pierre-Auguste Renoir is available via the WebMuseum, Paris. A JPG version of the Droeshout engraving of William Shakespeare is available.
and accessible.
The image of Shakespeare is too small to contain M2, but M1 could be embedded without any degradation of the image. For the most part, all the software tested could handle the 518 byte plain-text message; however, only two packages could handle the image labeled M2. Of the two, only one could reliably handle 24-bit images and other formats consistently: S-Tools by Andy Brown. Next, an attempt was made to embed messages M1 and M2 using each software package. If the software could not process these containers (C1 and C2), other containers were tried. All the software could embed M1 into some container. These files were reviewed before and after applying steganographic methods.
3.3 Software Evaluation

The following software packages were reviewed with respect to steganographic manipulation of images: Hide and Seek v4.1, StegoDos v0.90a, White Noise Storm, and S-Tools for Windows v3.00. Nearly all the authors encourage encrypting messages before embedding them in images as an added layer of protection and reviewing the images after embedding data. Even with the most reliable software tested, there may be some unexpected results.
3.3.1 Hide and Seek v 4.1

Hide and Seek versions 4.1 and 5.0 by Colin Maroney have similar limitations with minimum image sizes (320 x 480). In version 4.1 if the image is smaller than the minimum, then the stego-image is padded with black space. If the cover image is larger, the stego-image is cropped to fit. In version 5.0 the same is true with minimum image sizes. If any image exceeds 1024 x 768, an error message is returned. The Hide and Seek 1.0 for Windows 95 version seems to have these issues resolved and is a much improved steganography tool. Version 4.1 is evaluated here to illustrate limitations of some steganography tools. Hide and Seek 4.1 is free software which contains a series of DOS programs that embed data in GIF files and comes with the source code.
Hide and Seek uses the least significant bit of each pixel to encode characters, 8 pixels per character, and spreads the data throughout the GIF in a somewhat random fashion. The larger the message, the more likely the resulting image will be degraded. Since the data is dispersed "randomly" and the message file header is encrypted, there is no telling what is in an embedded file. Unfortunately, the hidden file can be no longer than about 19,000 bytes because the maximum display used is 320 x 480 pixels and each character takes 8 pixels to hide ((320 x 480) / 8 = 19,200). C2 (Shakespeare) was used to embed M1. The original image of Shakespeare is 222 x 282 pixels with 256 shades of gray. The resulting image was forced to 320 x 480 pixels. Instead of "stretching" the image to fit, large black areas were added to the image, making it 320 x 480. The image on the left is the original C2 and the image on the right is embedded with M1.
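The capacity limit and the dispersal idea described above can be sketched as follows. This is not Maroney's algorithm, whose details are not given here: the seeded permutation standing in for the "somewhat random" spreading, the seed value, and the synthetic gray image are all assumptions.

```matlab
% Capacity: one bit per pixel, eight pixels per character.
rows = 480; cols = 320;
maxChars = (rows * cols) / 8;

% Synthetic 8-bit cover image and a short message (both assumed).
cover   = uint8(mod(reshape(0:rows*cols-1, rows, cols), 256));
message = 'Steganography';
bits    = reshape(dec2bin(double(message), 8).' - '0', 1, []);  % MSB first

% A seed shared by sender and receiver fixes the pixel order.
seed = 1234;
rng(seed);
order = randperm(rows * cols, numel(bits));

% Embed: overwrite the LSB of each selected pixel.
stego = cover;
stego(order) = bitor(bitand(stego(order), uint8(254)), uint8(bits));

% Extract: regenerate the same order from the seed and read the LSBs back.
rng(seed);
order2   = randperm(rows * cols, numel(bits));
outBits  = double(bitand(stego(order2), uint8(1)));
outChars = char(bin2dec(char(reshape(outBits, 8, []).' + '0'))).';

fprintf('capacity: %d characters\n', maxChars);
fprintf('recovered: %s\n', outChars);
```

Without the seed, an observer who reads the LSBs in scan order sees the message bits scattered among unrelated cover bits, which is the point of dispersing them.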
3.3.2 StegoDos
StegoDos is also known as Black Wolf's Picture Encoder version 0.90a. This is public domain software written by Black Wolf (anonymous). It is a series of DOS programs that require far too much effort for the results. It will only work with 320 x 200 images with 256 colors. To encode a message, one must:
1. Run GETSCR. This starts a TSR which will perform a screen capture when PRINTSCREEN is pressed.
2. View the image with third-party image viewing software (not included with StegoDos) and press PRINTSCREEN to save the image in MESSAGE.SCR.
3. Save your message to be embedded in the image as MESSAGE.DAT.
4. Run ENCODE. This will merge MESSAGE.DAT with MESSAGE.SCR.
5. Use a third-party screen capturing program (not included with StegoDos) to capture the new image from the screen.
6. Run PUTSCR and capture the image displayed on the screen.
Decoding the message is not as involved but still requires a third-party program to view the image. To decode a message, one must:
1. Run GETSCR. This starts a TSR which will perform a screen capture when PRINTSCREEN is pressed.
2. View the image containing a message with third-party image viewing software (not included with StegoDos) and press PRINTSCREEN to save the image in MESSAGE.SCR.
3. Run DECODE. This will extract the stored message from MESSAGE.SCR.
Due to the size restrictions, M2 and C1 could not be used. C2 (Shakespeare) and a number of other containers were tested (both color and gray-scale) with M1. Every one of them was obviously distorted. There was little distortion within the C2 image, but it was cropped and fitted into a 320 x 200 pixel image. The image on the left is the original C2 file. The image on the right contains the M1 message:
This application uses the least significant bit method with less success than the others. It also appends an EOF (end of file) character to the end of the message. Even with the EOF character, the message retrieved from the altered image most likely contains garbage at the end. The following is the original message (M1) and a portion of the message extracted from the image created with StegoDos:
Steganography is the art and science of communicating in a way which hides the existence of the communication. In contrast to cryptography, where the "enemy" is allowed to detect, intercept and modify messages without being able to violate certain security premises guaranteed by a cryptosystem, the goal of steganography is to hide messages inside other "harmless" messages in a way that does not allow any "enemy" to even detect that there is a second secret message present.
The original file is 518 bytes. The extracted file is around 8 kilobytes:
Steganography is the art and science of communicating in a way which hides the existence of the communication. In contrast to cryptography, where the "enemy" is allowed to detect, intercept and modify messages without being able to violate certain security premises guaranteed by a cryptosystem, the goal of steganography is to hide messages inside other "harmless" messages in a way that does not allow any "enemy" to even detect that there is a second secret message present.
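The size of the extracted file is consistent with an extractor that reads one LSB from every pixel: 320 x 200 / 8 = 8000 bytes. A minimal sketch of trimming such output at an end-of-file marker follows. The marker value 26 (Ctrl-Z, the conventional DOS text EOF), the one-bit-per-pixel layout, and the random stand-in for the unused LSBs are assumptions, not StegoDos internals.

```matlab
% Raw extraction size if every pixel of a 320 x 200 image yields one bit.
rawBytes = (320 * 200) / 8;

% Simulate the extracted byte stream: message, EOF marker, then leftover noise.
EOF_MARK  = 26;                                  % Ctrl-Z (assumed marker)
message   = 'there is a second secret message present.';
rng(7);                                          % repeatable noise for the demo
noise     = randi([32 255], 1, rawBytes - numel(message) - 1);
extracted = uint8([double(message), EOF_MARK, noise]);

% Keep only the bytes before the first EOF marker.
eofPos = find(extracted == EOF_MARK, 1, 'first');
if isempty(eofPos)
    trimmed = char(extracted);                   % no marker found: keep everything
else
    trimmed = char(extracted(1:eofPos - 1));
end

fprintf('raw extraction: %d bytes, after trimming: %d bytes\n', ...
        numel(extracted), numel(trimmed));
```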
3.3.3 White Noise Storm

White Noise Storm by Ray (Arsen) Arachelian is a very versatile steganography application for DOS. Embedding M1 in the containers C1 and C2 was rather trivial and no degradation could be detected. White Noise Storm was the first software tested that could embed M2 into C1 - notice the "noise" interfering with the image integrity. The image on the left is the original C2. The image on the right contains message M1:
Arachelian encourages encrypting the message before embedding it into an image. White Noise Storm (WNS) also includes an encryption routine to "randomize" the bits within an image.
His use of encryption with steganography is well integrated, but is beyond the scope of this paper. WNS was designed based on the idea of spread spectrum technology and frequency hopping. "Instead of having X channels of communication which are changed with a fixed formula and passkey. Eight channels are spread within a number of 8-bits*W byte channels. W represents a random sized window of W bytes. Each of these eight channels represents one single bit, so each window holds one byte of information and a lot of unused bits. These channels rotate among themselves, for instance bit 1 might be swapped with bit 7, or all the bits may rotate positions at once. These bits change location within the window on the byte level. The rules for this swapping are dictated not only by the passphrase but also by the previous window's random data (similar to DES block encryption)" [Arachelian, RE: Steganography]. WNS also uses the least significant bit (LSB) application of steganography and applies this method to PCX files. The software extracts the LSBs from the container image and stores them in a file. The message is encrypted and applied to these bits to create a "new" set of LSBs. These are then "injected" into the container image to create a new image. The documentation that accompanies White Noise Storm is well organized and explains some of the theory behind the implementation of encryption and steganography. The main disadvantage of applying the WNS encryption method to steganography is the loss of many bits that could be used to hold information. Relatively large files must be used to hold the same amount of information that other methods provide.
3.3.4 S-Tools
Steganography Tools (S-Tools) for Windows v3.00 by Andy Brown is the most versatile steganography tool of all the applications tested. It includes several programs that process GIF and BMP images (ST-BMP.EXE) and audio WAV files (ST-WAV.EXE), and will even hide information in the "unused" areas on floppy diskettes (ST-FDD.EXE). In addition to supporting 24-bit images, S-Tools also includes a barrage of encryption routines (IDEA, MPJ2, DES, 3DES and NSEA) with many options. S-Tools applies the LSB methods discussed before to both images and audio files. Due to the lack of resources, only images were tested. Brown developed a very nice interface with prompts and well-developed on-line documentation. The only apparent limitations were the resources available. There were times when large 24-bit images would bring Windows to a halt. A very useful feature is a status line that displays the largest message size that can be stored in an open container file. This saves the time of attempting to store a message that is too large for a container. After hiding the message, S-Tools displays the "new" image and lets you toggle between the new and original images. At times the new image looked to be grossly distorted, but after saving, the new image looked nearly identical to the original. This may be due to memory limitations. On occasion a saved image was actually corrupted and could not be read. A saved image should always be reviewed before sending it out. S-Tools provided the most impressive results. Unlike the obvious distortions in "A Cautionary Note on Image Downgrading" [Kurak92], S-Tools maintained remarkable image integrity. The following figure illustrates the text message M1 embedded in container C2.
The following is the original C1 (top) and C1 embedded with M2 (airfield):
The following is derived from S-Tools BMP - How it is done by Andy Brown:
"S-Tools works by 'spreading' the bit-pattern of the message file to be hidden across the least-significant bits of the color levels in the image. S-Tools tries to reduce the number of image colors in a manner that preserves as much of the image detail as possible. It is difficult to tell the difference between a 256 color image and one reduced to 32." "S-Tools adds some extra information on to the front of the message file before hiding. 32 bits of time-dependent random garbage is added first. This step means that two identical hidden files that are encrypted in CBC or PCBC mode will never encipher to the same ciphertext. The 32 bit length of the hidden file is then included. This is required for S- Tools to be able to extract the hidden file. Encryption will conceal this value." "To further conceal the presence of a file, S-Tools picks its bits from the image based on the output of a random number generator. This is designed to defeat an attacker who might apply a statistical randomness test to the lower bits of the image to determine whether encrypted data is hidden there (well-encrypted data shows up as pure white noise). The random number generator used by S-Tools is based on the output of the MD5 message digest algorithm, and is not easily (if at all) defeatable" [S-Tools Documentation by Andy Brown].
3.4. Software not tested but worth noting

The following software packages were reviewed but not tested: Jpeg-Jsteg v4 and Stealth v1.1.
3.4.1 Jpeg-Jsteg v4
Cryptography and steganography rely on retrieving a message in its original form without losing any information. Such is the idea behind lossless compression. Since JPG images use lossy encoding to compress their data, it is generally thought that steganography would be infeasible with such images. The Jpeg-Jsteg documentation states: "This version of the Independent JPEG Group's JPEG Software has been modified for 1-bit steganography in JFIF output files" [Independent JPEG Group]. The Jpeg-Jsteg software comes with source code and instructions for compiling the code on various platforms. According to the Independent JPEG Group (IJPG), the JFIF format is composed of lossy and non-lossy stages. Information can be inserted between these stages without corrupting the image. As discussed earlier with Renoir's Le Moulin de la Galette, compression is a great advantage JPG images have over other formats. JPEG images are becoming more abundant on the Internet because large images with unlimited colors can be stored in relatively small files (a 1073 x 790 pixel image with 16 million colors can be stored in a 170 kilobyte file; the same image is over 2 megabytes if converted to a BMP).
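Embedding between the lossy and non-lossy stages can be illustrated on a block of already-quantized DCT coefficients, since everything after quantization is stored without loss. The sketch below hides one bit in the LSB of each usable coefficient and skips coefficients equal to 0 or 1, a convention commonly attributed to Jsteg; it is an assumption here rather than something taken from the Jpeg-Jsteg source, and the coefficient values are invented for the example.

```matlab
% Quantized DCT coefficients of one block in scan order (invented values).
coeffs = [-26 -3 0 -3 -2 -6 2 -4 1 -3 1 1 5 1 2 -1 0 0 1 -1];
bits   = [1 0 1 1 0 0 1 0];

% Coefficients equal to 0 or 1 are left alone so the decoder can skip them too.
usable = find(coeffs ~= 0 & coeffs ~= 1);
if numel(bits) > numel(usable)
    error('message does not fit in this block');
end
target = usable(1:numel(bits));

% Replace the LSB: round down to the even value, then add the message bit.
stego = coeffs;
stego(target) = coeffs(target) - mod(coeffs(target), 2) + bits;

% Decoder: the same skip rule finds the same positions in the stego block.
usableOut = find(stego ~= 0 & stego ~= 1);
outBits   = mod(stego(usableOut(1:numel(bits))), 2);

fprintf('usable coefficients: %d, bits hidden: %d\n', numel(usable), numel(bits));
fprintf('largest coefficient change: %d\n', max(abs(stego - coeffs)));
```

Rounding down to the even value before adding the bit means an embedded coefficient can never become 0 or 1, so the encoder and decoder always agree on which positions carry data.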
3.4.2 Stealth v1.1

Stealth by Henry Hastur is not, in and of itself, a steganographic program or method. It is usually found with steganographic software on the Internet and is used to complement the steganographic methods. Stealth is a filter that strips off the PGP header from a PGP-encrypted file, leaving only the encrypted data. Why is this important? Applying steganography to an encrypted message is more secure than applying it to a "plain text" message. However, many encryption applications add header information to the encrypted message. This header information identifies the method used to encrypt the data. For example, if a cracker has identified hidden data in an image and has successfully extracted the encrypted message, a header for the encryption method would point the cracker in the right direction for additional cryptanalysis. But if the header is removed, the cracker cannot determine the method of encryption. Some steganography software (White Noise Storm and S-Tools) provides this step in security, but others do not.
3.5 Code:
function varargout = steg(varargin)
% STEG M-file for steg.fig
%      STEG, by itself, creates a new STEG or raises the existing
%      singleton*.
%
%      H = STEG returns the handle to a new STEG or the handle to
%      the existing singleton*.
%
%      STEG('CALLBACK',hObject,eventData,handles,...) calls the local
%      function named CALLBACK in STEG.M with the given input arguments.
%
%      STEG('Property','Value',...) creates a new STEG or raises the
%      existing singleton*.  Starting from the left, property value pairs are
%      applied to the GUI before steg_OpeningFunction gets called.  An
%      unrecognized property name or invalid value makes property application
%      stop.  All inputs are passed to steg_OpeningFcn via varargin.
%
%      *See GUI Options on GUIDE's Tools menu.  Choose "GUI allows only one
%      instance to run (singleton)".
%
% See also: GUIDE, GUIDATA, GUIHANDLES

% Copyright 2002-2003 The MathWorks, Inc.

% Edit the above text to modify the response to help steg

% Last Modified by GUIDE v2.5 27-Jun-2007 02:17:56

% Begin initialization code - DO NOT EDIT
gui_Singleton = 1;
gui_State = struct('gui_Name',       mfilename, ...
                   'gui_Singleton',  gui_Singleton, ...
                   'gui_OpeningFcn', @steg_OpeningFcn, ...
                   'gui_OutputFcn',  @steg_OutputFcn, ...
                   'gui_LayoutFcn',  [] , ...
                   'gui_Callback',   []);
if nargin && ischar(varargin{1})
    gui_State.gui_Callback = str2func(varargin{1});
end

if nargout
    [varargout{1:nargout}] = gui_mainfcn(gui_State, varargin{:});
else
    gui_mainfcn(gui_State, varargin{:});
end
% End initialization code - DO NOT EDIT

% --- Executes just before steg is made visible.
function steg_OpeningFcn(hObject, eventdata, handles, varargin)
% This function has no output args, see OutputFcn.
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
% varargin   command line arguments to steg (see VARARGIN)

% Choose default command line output for steg
handles.output = hObject;

% Update handles structure
guidata(hObject, handles);

% UIWAIT makes steg wait for user response (see UIRESUME)
% uiwait(handles.figure1);

% --- Outputs from this function are returned to the command line.
function varargout = steg_OutputFcn(hObject, eventdata, handles)
% varargout  cell array for returning output args (see VARARGOUT);
% hObject    handle to figure
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)

% Get default command line output from handles structure
varargout{1} = handles.output;

% --- Executes on button press in mat.
function mat_Callback(hObject, eventdata, handles)
% hObject    handle to mat (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
mat   % runs mat, which is defined elsewhere and not listed in this report

% --- Executes on button press in itu.
function itu_Callback(hObject, eventdata, handles)
% hObject    handle to itu (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
itu   % runs itu, which is defined elsewhere and not listed in this report

% --- Executes on button press in dec.
function dec_Callback(hObject, eventdata, handles)
% hObject    handle to dec (see GCBO)
% eventdata  reserved - to be defined in a future version of MATLAB
% handles    structure with handles and user data (see GUIDATA)
global var

% Loading the Image
[filename, pathname, filterindex] = uigetfile( ...
    {'*.jpg', 'JPG File (*.jpg)'; ...
     '*.*',   'Any Image file (*.*)'}, ...
    'Pick an image file');

% uigetfile returns 0 when the dialog is cancelled; do nothing in that case.
if isequal(filename, 0)
    return;
end

var = fullfile(pathname, filename);
dec(var)   % dec is defined elsewhere and not listed in this report
Chapter 4 Conclusion & Future Scope

Steganography has its place in security. It is not intended to replace cryptography but to supplement it. Hiding a message with steganographic methods reduces the chance of the message being detected. If that message is also encrypted, then once discovered it must also be cracked (yet another layer of protection). There are an infinite number of steganography applications. This paper explores a tiny fraction of the art of steganography. It goes well beyond simply embedding text in an image. Steganography pertains not only to digital images but also to other media (files such as voice, other text and binaries; other media such as communication channels; the list can go on and on). Consider the following example: A person has a cassette tape of Pink Floyd's "The Wall." The plans of a Top Secret project (e.g., device, aircraft, covert operation) are embedded, using some steganographic method, on that tape. Since the alterations of the "expected contents" cannot be detected (especially by human ears and probably not easily by digital means), these plans can cross borders and trade hands undetected. How do you detect which recording has the message? This is a trivial (and incomplete) example, but it goes far beyond simple image encoding in an image with homogeneous regions. Part of secrecy is selecting the proper mechanisms. Consider encoding using a Mandelbrot image [Hastur]. In and of itself, steganography is not a good solution to secrecy, but neither is simple substitution or short block permutation for encryption. But if these methods are combined, you have much stronger encryption routines (methods). For example (again oversimplified): if a message is encrypted using substitution (substituting one alphabet with another), then permuted (shuffling the text), and then substituted again, the resulting ciphertext is more secure than with substitution or permutation alone. Now, if the ciphertext is embedded in an image, video, voice recording, etc., it is even more secure. If an encrypted message is intercepted, the interceptor knows the text is an encrypted message. With steganography, the interceptor may not know the object contains a message.
References:
1. [Aura95] Tuomas Aura, "Invisible Communication," EET 1995.
2. [Brassil-Infocom95] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Document Marking and Identification using Both Line and Word Shifting," Infocom95.
3. [Brassil-Infocom94] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Electronic Marking and Identification Techniques to Discourage Document Copying," Infocom94.
4. [Brassil-CISS95] J. Brassil, S. Low, N. Maxemchuk, L. O'Gorman, "Hiding Information in Document Images," CISS95.
5. [JDJ01] Neil F. Johnson, Zoran Duric, Sushil Jajodia, Information Hiding: Steganography and Watermarking - Attacks and Countermeasures, Kluwer Academic Press, Norwell, MA, New York, The Hague, London, 2000.
6. [Kahn67] David Kahn, The Codebreakers, The Macmillan Company, New York, NY, 1967.
7. [Kurak92] C. Kurak, J. McHugh, "A Cautionary Note On Image Downgrading," IEEE Eighth Annual Computer Security Applications Conference, 1992, pp. 153-159.
8. [Norman73] Bruce Norman, Secret Warfare, Acropolis Books Ltd., Washington, DC, 1973.
9. [Zevon] Warren Zevon, Lawyers, Guns, and Money. Music track released in the albums Excitable Boy, 1978; Stand in the Fire, 1981; A Quiet Normal Life, 1986; Learning to Flinch, 1993.
10. [Zim48] Herbert S. Zim, Codes and Secret Writing, William Morrow and Company, New York, NY, 1948.
Appendix - A
Steganography: Hiding Data Within Data
Cryptography, the science of writing in secret codes, addresses all of the elements necessary for secure communication over an insecure channel, namely privacy, confidentiality, key exchange, authentication, and non-repudiation. But cryptography does not always provide safe communication. Consider an environment where the very use of encrypted messages causes suspicion. If a nefarious government or Internet service provider (ISP) is looking for encrypted messages, they can easily find them. Consider the following text file; what else is it likely to be if not encrypted?
qANQR1DBwU4D/TlT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy
ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AA5+lqsG/ELBvRa
c9XefaYpbbAZ6z6LkOQ+eE0XASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
z/9Ak4/OLnLiJRk05/2UNE5Z0a+3lcvITMmfGajvRhkXqocavPOKiin3hv7+Vx88
uLLem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9muDGBY0MkjiZIRI7azWnoU9
3KCnmpR60VO4rDRAS5uGl9fioSvze+q8XqxubaNsgdKkoD+tB/4u4c4tznLfw1L2
YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd49BTOFFTWWavAfE
gLYcPrcn4s3EriUgvL3OzPR4P1chNu6sa3ZJkTBbriDoA3VpnqG3hxqfNyOlqAka
mJJuQ53Ob9ThaFH8YcE/VqUFdw+bQtrAJ6NpjIxi/x0FfOInhC/bBw7pDLXBFNaX
HdlLQRPQdrmnWskKznOSarxq4GjpRTQo4hpCRJJ5aU7tZO9HPTZXFG6iRIT0wa47
AR5nvkEKoIAjW5HaDKiJriuWLdtN4OXecWvxFsjR32ebz76U8aLpAK87GZEyTzBx
dV+lH0hwyT/y1cZQ/E5USePP4oKWF4uqquPee1OPeFMBo4CvuGyhZXD/18Ft/53Y
WIebvdiCqsOoabK3jEfdGExce63zDI0=
=MpRf
The message above is a sentence in English that is encrypted using Pretty Good Privacy (PGP), probably the most commonly used e-mail encryption software today. Besides being nonsensical to a casual reader, the other indication that this is encrypted is that the characters comprising the message appear more-or-less at random and do not adhere to the relative frequency counts that one would expect in a non-encrypted message. Encrypted data sticks out like a sore thumb.
Steganography is the science of hiding information. Whereas the goal of cryptography is to make data unreadable by a third party, the goal of steganography is to hide the data from a third party. In this article, I will discuss what steganography is, what purposes it serves, and will provide an example using available software.
STEGANOGRAPHY
There are a large number of steganographic methods that most of us are familiar with (especially if you watch a lot of spy movies!), ranging from invisible ink and microdots to secreting a hidden message in the second letter of each word of a large body of text and spread spectrum radio communication. With computers and networks, there are many other ways of hiding information, such as:
Covert channels (e.g., Loki and some distributed denial-of-service tools use the Internet Control Message Protocol, or ICMP, as the communications channel between the "bad guy" and a compromised system)
Hidden text within Web pages
Hiding files in "plain sight" (e.g., what better place to "hide" a file than with an important sounding name in the c:\winnt\system32 directory?)
Null ciphers (e.g., using the first letter of each word to form a hidden message in an otherwise innocuous text)
Steganography today, however, is significantly more sophisticated than the examples above suggest, allowing a user to hide large amounts of information within image and audio files. These forms of steganography often are used in conjunction with cryptography so that the information is doubly protected; first it is encrypted and then hidden so that an adversary has to first find the information (an often difficult task in and of itself) and then decrypt it.
There are a number of uses for steganography besides the mere novelty. One of the most widely used applications is for so-called digital watermarking. A watermark, historically, is the replication of an image, logo, or text on paper stock so that the source of the document can be at least partially authenticated. A digital watermark can accomplish the same function; a graphic artist, for example, might post sample images on her Web site complete with an embedded signature so that she can later prove her ownership in case others attempt to portray her work as their own. Stego can also be used to allow communication within an underground community. There are several reports, for example, of persecuted religious minorities using steganography to embed messages for the group within images that are posted to known Web sites.
STEGANOGRAPHIC METHODS
The following formula provides a very generic description of the pieces of the steganographic process:
cover_medium + hidden_data + stego_key = stego_medium
In this context, the cover_medium is the file in which we will hide the hidden_data, which may also be encrypted using the stego_key. The resultant file is the stego_medium (which will, of course, be the same type of file as the cover_medium). The cover_medium (and, thus, the stego_medium) are typically image or audio files. In this article, I will focus on image files and will, therefore, refer to the cover_image and stego_image.
Before discussing how information is hidden in an image file, it is worth a fast review of how images are stored in the first place. An image file is merely a binary file containing a binary representation of the color or light intensity of each picture element (pixel) comprising the image. Images typically use either 8-bit or 24-bit color. When using 8-bit color, there is a definition of up to 256 colors forming a palette for this image, each color denoted by an 8-bit value. A 24-bit color scheme, as the term suggests, uses 24 bits per pixel and provides a much better set of colors. In this case, each pixel is represented by three bytes, each byte representing the intensity of one of the three primary colors red, green, and blue (RGB), respectively. The Hypertext Markup Language (HTML) format for indicating colors in a Web page often uses a 24-bit format employing six hexadecimal digits, each pair representing the amount of red, green, and blue, respectively. The color orange, for example, would be displayed with red set to 100% (decimal 255, hex FF), green set to 50% (decimal 127, hex 7F), and no blue (0), so we would use "#FF7F00" in the HTML code.
The size of an image file, then, is directly related to the number of pixels and the granularity of the color definition. A typical 640x480 pixel image using a palette of 256 colors would require a file about 307 KB in size (640 x 480 bytes), whereas a 1024x768 pixel high-resolution 24-bit color image would result in a 2.36 MB file (1024 x 768 x 3 bytes). To avoid sending files of this enormous size, a number of compression schemes have been developed over time, notably Bitmap (BMP), Graphic Interchange Format (GIF), and Joint Photographic Experts Group (JPEG) file types. Not all are equally suited to steganography, however.
GIF and 8-bit BMP files employ what is known as lossless compression, a scheme that allows the software to exactly reconstruct the original image. JPEG, on the other hand, uses lossy compression, which means that the expanded image is very nearly the same as the original but not an exact duplicate. While both methods allow computers to save storage space, lossless compression is much better suited to applications where the integrity of the original information must be maintained, such as steganography. While JPEG can be used for stego applications, it is more common to embed data in GIF or BMP files.
The simplest approach to hiding data within an image file is called least significant bit (LSB) insertion. In this method, we can take the binary representation of the hidden_data and overwrite the LSB of each byte within the cover_image. If we are using 24-bit color, the amount of change will be minimal and indiscernible to the human eye. As an example, suppose that we have three adjacent pixels (nine bytes) with the following RGB encoding:
10010101 00001101 11001001
10010110 00001111 11001010
10011111 00010000 11001011
Now suppose we want to "hide" the following 9 bits of data (the hidden data is usually compressed prior to being hidden): 101101101. If we overlay these 9 bits over the LSB of the 9 bytes above, we get the following (the LSBs of the second, fourth, fifth, and sixth bytes have changed):
10010101 00001100 11001001
10010111 00001110 11001011
10011111 00010000 11001011
Note that we have successfully hidden 9 bits but at a cost of only changing 4, or roughly 50%, of the LSBs.
This description is meant only as a high-level overview. Similar methods can be applied to 8-bit color but the changes, as the reader might imagine, are more dramatic. Gray-scale images, too, are very useful for steganographic purposes. One potential problem with any of these methods is that they can be found by an adversary who is looking. In addition, there are other methods besides LSB insertion with which to insert hidden information.
Without going into any detail, it is worth mentioning steganalysis, the art of detecting and breaking steganography. One form of this analysis is to examine the color palette of a graphical image. In most images, there will be a unique binary encoding of each individual color. If the image contains hidden data, however, many colors in the palette will have duplicate binary encodings since, for all practical purposes, we can't count the LSB. If the analysis of the color palette of a given file yields many duplicates, we might safely conclude that the file has hidden information.
But what files would you analyze? Suppose I decide to post a hidden message by hiding it in an image file that I post at an auction site on the Internet. The item I am auctioning is real so a lot of people may access the site and download the file; only a few people know that the image has special information that only they can read. And we haven't even discussed hidden data inside audio files! Indeed, the quantity of potential cover files makes steganalysis a Herculean task.
A STEGANOGRAPHY EXAMPLE
There are a number of software packages that perform steganography on just about any software platform; readers are referred to Neil Johnson's list of steganography tools at http://www.jjtc.com/Steganography/toolmatrix.htm. Some of the better known packages for Windows NT and Windows 2000 systems include:
Hide4PGP (http://www.heinz-repp.onlinehome.de/Hide4PGP.htm)
MP3Stego (http://www.cl.cam.ac.uk/~fapp2/steganography/mp3stego/)
Stash (http://www.smalleranimals.com/stash.htm)
Steganos (http://www.steganos.com/english/steganos/download.htm)
S-Tools (available from http://www.webattack.com/download/dlstools.shtml)
FIGURE 1. The cover_image (5th wave.gif), hidden_data file (virusdetectioninfo.txt), and stego_key.
The following examples come from Andy Brown's S-Tools for Windows. S-Tools allows users to hide information in BMP, GIF, or WAV files. The basic scheme of the program is straightforward: you drag an image or audio file into the S-Tools active window to act as the cover_medium, drag the hidden_data file onto the cover_medium, and then provide a stego_key for encryption. The result is the stego_medium. All of this is shown in Figure 1:
1. I highlighted the GIF image file 5th wave.gif and dragged it to the S-Tools active window. Note that S-Tools reports that up to 138,547 bytes can be hidden in this image file.
2. I next highlighted a 14 KB text file called virusdetectioninfo.txt and dragged it onto the image file in S-Tools.
3. A dialog box pops up telling me that I am hiding 6,019 bytes of data and asks for a passphrase with which to encrypt the hidden text; the default secret key crypto scheme used by S-Tools is the International Data Encryption Algorithm (IDEA).
FIGURE 3. Extracting hidden information from the image file.
4. Once the image file has been received, the user merely drags the file to S-Tools and right-clicks over the image, specifying the Reveal option. A dialog box will pop up requesting the passphrase. Figure 3 shows the information about the hidden archive file, and allows the user to open the file.
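To make the LSB insertion described under STEGANOGRAPHIC METHODS more concrete, the following MATLAB sketch embeds the nine bits 101101101 into the nine example bytes and then reads them back. It is only a minimal illustration under simple assumptions: the variable names (coverBytes, bits, stegoBytes, recovered) are chosen here for illustration, the bytes are handled as a plain vector rather than read from an image file, and this is not the algorithm used by S-Tools or any other package.

```matlab
% Nine cover bytes: three adjacent 24-bit pixels in R, G, B order.
coverBytes = uint8(bin2dec(['10010101'; '00001101'; '11001001'; ...
                            '10010110'; '00001111'; '11001010'; ...
                            '10011111'; '00010000'; '11001011']))';

% The nine hidden bits, in embedding order.
bits = uint8([1 0 1 1 0 1 1 0 1]);

% Embed: clear the LSB of every byte (mask 11111110), then OR in the bit.
stegoBytes = bitor(bitand(coverBytes, uint8(254)), bits);

% Extract: the hidden bit is simply the LSB of each stego byte.
recovered = bitand(stegoBytes, uint8(1));

% Count how many bytes actually changed during embedding.
numChanged = sum(stegoBytes ~= coverBytes);

disp(dec2bin(stegoBytes, 8));   % one stego byte per row, as 8-bit strings
disp(numChanged);
```

Each byte changes by at most one intensity level, which is why the difference is hard to see in 24-bit color. The extraction step needs no knowledge of the cover image, only the order in which the bytes were visited.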
Appendix B
MATLAB
MATLAB is a numerical computing environment and fourth-generation programming language. Developed by The MathWorks, MATLAB allows matrix manipulation, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs in other languages. Although it is numeric only, an optional toolbox uses the MuPAD symbolic engine, allowing access to computer algebra capabilities. An additional package, Simulink, adds graphical multidomain simulation and Model-Based Design for dynamic and embedded systems. In 2004, MathWorks claimed that MATLAB was used by more than one million people across industry and the academic world.
MATLAB (meaning "matrix laboratory") was invented in the late 1970s by Cleve Moler, then chairman of the computer science department at the University of New Mexico. He designed it to give his students access to LINPACK and EISPACK without having to learn Fortran. It soon spread to other universities and found a strong audience within the applied mathematics community. Jack Little, an engineer, was exposed to it during a visit Moler made to Stanford University in 1983. Recognizing its commercial potential, he joined with Moler and Steve Bangert. They rewrote MATLAB in C and founded The MathWorks in 1984 to continue its development. These rewritten libraries were known as JACKPAC. In 2000, MATLAB was rewritten to use a newer set of libraries for matrix manipulation, LAPACK.
MATLAB was first adopted by control design engineers, Little's specialty, but quickly spread to many other domains. It is now also used in education, in particular the teaching of linear algebra and numerical analysis, and is popular amongst scientists involved with image processing.
Variables
Variables are defined with the assignment operator, =. MATLAB is dynamically typed, meaning that variables can be assigned without declaring their type, except if they are to be treated as symbolic objects[6], and that their type can change. Values can come from constants, from computation involving values of other variables, or from the output of a function. For example:
>> x = 17
x =
    17
>> x = 'hat'
x =
hat
>> x = [3*4, pi/2]
x =
   12.0000    1.5708
>> y = 3*sin(x)
y =
   -1.6097    3.0000
Vectors/Matrices
MATLAB is a "Matrix Laboratory", and as such it provides many convenient ways for creating vectors, matrices, and multi-dimensional arrays. In the MATLAB vernacular, a vector refers to a one-dimensional (1xN or Nx1) matrix, commonly referred to as an array in other programming languages. A matrix generally refers to a 2-dimensional array, i.e. an mxn array where m and n are greater than or equal to 1. Arrays with more than two dimensions are referred to as multidimensional arrays.
MATLAB provides a simple way to define simple arrays using the syntax init:increment:terminator. For instance:
>> array = 1:2:9
array =
     1     3     5     7     9
defines a variable named array (or assigns a new value to an existing variable with the name array) which is an array consisting of the values 1, 3, 5, 7, and 9. That is, the array starts at 1 (the init value), increments with each step from the previous value by 2 (the increment value), and stops once it reaches (or to avoid exceeding) 9 (the terminator value).
>> array = 1:3:9
array =
     1     4     7
The increment value can actually be left out of this syntax (along with one of the colons), to use a default value of 1.
>> ari = 1:5
ari =
     1     2     3     4     5
assigns to the variable named ari an array with the values 1, 2, 3, 4, and 5, since the default value of 1 is used as the incrementer.
Indexing is one-based, which is the usual convention for matrices in mathematics. This is atypical for programming languages, whose arrays more often start with zero.
Matrices can be defined by separating the elements of a row with blank space or comma and using a semicolon to terminate each row. The list of elements should be surrounded by square brackets: []. Parentheses: () are used to access elements and subarrays (they are also used to denote a function argument list).
>> A = [16 3 2 13; 5 10 11 8; 9 6 7 12; 4 15 14 1]
A =
    16     3     2    13
     5    10    11     8
     9     6     7    12
     4    15    14     1
>> A(2,3)
ans =
    11
Sets of indices can be specified by expressions such as "2:4", which evaluates to [2, 3, 4]. For example, a submatrix taken from rows 2 through 4 and columns 3 through 4 can be written as:
>> A(2:4,3:4)
ans =
    11     8
     7    12
    14     1
A square identity matrix of size n can be generated using the function eye, and matrices of any size with zeros or ones can be generated with the functions zeros and ones, respectively.
>> eye(3)
ans =
     1     0     0
     0     1     0
     0     0     1
>> zeros(2,3)
ans =
     0     0     0
     0     0     0
>> ones(2,3)
ans =
     1     1     1
     1     1     1
Most MATLAB functions can accept matrices and will apply themselves to each element. For example, mod(2*J,n) will multiply every element in "J" by 2, and then reduce each element modulo "n". MATLAB does include standard "for" and "while" loops, but using MATLAB's vectorized notation often produces code that is easier to read and faster to execute. This code, excerpted from the function magic.m, creates a magic square M for odd values of n (MATLAB function meshgrid is used here to generate square matrices I and J containing 1:n).
[J,I] = meshgrid(1:n);
A = mod(I+J-(n+3)/2,n);
B = mod(I+2*J-2,n);
M = n*A + B + 1;
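As a quick check of the magic.m excerpt, the sketch below runs it for n = 3 and verifies the defining property of a magic square. The choice n = 3 and the names target, rowsOk, colsOk and diagOk are illustrative assumptions added here; they are not part of magic.m.

```matlab
n = 3;                           % any odd n is intended; 3 keeps the output small

[J, I] = meshgrid(1:n);          % I holds row indices, J holds column indices
A = mod(I + J - (n + 3)/2, n);
B = mod(I + 2*J - 2, n);
M = n*A + B + 1;

% For n = 3, every row, column and diagonal should add up to n*(n^2 + 1)/2.
target = n*(n^2 + 1)/2;
rowsOk = all(sum(M, 2) == target);
colsOk = all(sum(M, 1) == target);
diagOk = (trace(M) == target) && (trace(fliplr(M)) == target);

disp(M);
```

No explicit loop appears anywhere: the index matrices I and J let mod and the arithmetic operators work on all n-by-n entries at once.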
Semicolon
Unlike many other languages, where the semicolon is used to terminate commands, in MATLAB the semicolon serves to suppress the output of the line that it concludes.
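A two-line sketch of the difference (the variable names a and b are arbitrary):

```matlab
a = 2 + 3;      % trailing semicolon: the value is stored, nothing is echoed
b = a * 2       % no semicolon: MATLAB echoes the value of b in the command window
```

Both assignments take place either way; the semicolon only controls whether the result is displayed.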
Graphics
Function plot can be used to produce a graph from two vectors x and y. The code:
x = 0:pi/100:2*pi;
y = sin(x);
plot(x,y)
produces the following figure of the sine function:
Three-dimensional graphics can be produced using the functions surf, plot3 or mesh.
[X,Y] = meshgrid(-10:0.25:10,-10:0.25:10);
f = sinc(sqrt((X/pi).^2+(Y/pi).^2));
mesh(X,Y,f);
axis([-10 10 -10 10 -0.3 1])
xlabel('{\bfx}')
ylabel('{\bfy}')
zlabel('{\bfsinc} ({\bfR})')
hidden off
This code produces a wireframe 3D plot of the two-dimensional unnormalized sinc function.
[X,Y] = meshgrid(-10:0.25:10,-10:0.25:10);
f = sinc(sqrt((X/pi).^2+(Y/pi).^2));
surf(X,Y,f);
axis([-10 10 -10 10 -0.3 1])
xlabel('{\bfx}')
ylabel('{\bfy}')
zlabel('{\bfsinc} ({\bfR})')
This code produces a surface 3D plot of the two-dimensional unnormalized sinc function.
Object-Oriented Programming
MATLAB's support for object-oriented programming includes classes, inheritance, virtual dispatch, packages, pass-by-value semantics, and pass-by-reference semantics.[8]
classdef hello
    methods
        function doit(this)
            disp('hello')
        end
    end
end
When put into a file named hello.m, this can be executed with the following commands:
>> x = hello;
>> x.doit;
hello
Limitations
For a long time there was criticism that because MATLAB is a proprietary product of The MathWorks, users are subject to vendor lock-in. More recently, an additional tool called MATLAB Builder, listed under the Application Deployment tools, has been provided to deploy MATLAB functions as library files that can be used from .NET or Java application building environments. The drawback is that the computer on which the application is deployed needs the MCR (MATLAB Component Runtime) for the MATLAB files to function normally. The MCR can be distributed freely with library files generated by the MATLAB compiler.
MATLAB, like Fortran, Visual Basic and Ada, uses parentheses, e.g. y = f(x), for both indexing into an array and calling a function. Although this syntax can facilitate a switch between a procedure and a lookup table, both of which correspond to mathematical functions, a careful reading of the code may be required to establish the intent.
Many functions behave differently with matrix and vector arguments. Since vectors are matrices of one row or one column, this can give unexpected results. For instance, sum(A), where A is a matrix, gives a row vector containing the sum of each column of A, whereas sum(v), where v is a column or row vector, gives the sum of its elements; hence the programmer must be careful if the matrix argument of sum can degenerate into a single-row array. While sum and many similar functions accept an optional argument to specify a direction, others, like plot, do not, and require additional checks.
There are other cases where MATLAB's interpretation of code may not be consistently what the user intended (e.g. spaces inside brackets are treated as separators in some contexts but not in others, and backslash escape sequences are interpreted by some functions, like fprintf, but not directly by the language parser, since that would be inconvenient for Windows directory paths). What might be considered a convenience for commands typed interactively, where the user can check that MATLAB does what the user wants, may be less supportive of the need to construct reusable code.
Array indexing is one-based, which is the common convention for matrices in mathematics, but does not accommodate sequences that have zero or negative indices. For instance, in MATLAB the DFT (or FFT) is defined with the DC component at index 1 instead of index 0, which differs from the zero-based indexing used in the standard definition of the DFT. This one-based indexing convention is hard coded into MATLAB, making it difficult for a user to define their own zero-based or negative indexed arrays to concisely model an idea having non-positive indices.
Code written for a specific release of MATLAB often does not run with earlier releases as it may use some of the newer features. To give just one example: save('filename','x') saves the variable x in a file. The variable can be loaded with load('filename') in the same MATLAB release. However, if saved with MATLAB version 7 or later, it cannot be loaded with MATLAB version 6 or earlier. As a workaround, in MATLAB version 7 save('filename','x','-v6') generates a file that can be read with version 6. However, executing save('filename','x','-v6') in version 6 causes an error message.
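The sum pitfall mentioned above can be reproduced in a few lines. This is a small sketch with made-up data; the variable names are illustrative. Passing the dimension explicitly is one way to keep the behavior the same when a matrix shrinks to a single row.

```matlab
A = [1 2 3; 4 5 6];        % 2-by-3 matrix
colSums = sum(A);          % column sums as a 1-by-3 row vector: 5 7 9

r = A(1, :);               % the same data reduced to a single row (1-by-3)
s = sum(r);                % now a scalar: the sum of all elements, 6
colSumsSafe = sum(r, 1);   % explicit dimension 1: still column sums, 1 2 3
```

With the explicit second argument, the result keeps the same shape whether the input has one row or many.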
Interactions with other languages

MATLAB can call functions and subroutines written in the C programming language or Fortran. A wrapper function is created allowing MATLAB data types to be passed and returned. The dynamically loadable object files created by compiling such functions are termed "MEX-files" (for MATLAB executable). Libraries written in Java, ActiveX or .NET can be directly called from MATLAB, and many MATLAB libraries (for example XML or SQL support) are implemented as wrappers around Java or ActiveX libraries. Calling MATLAB from Java is more complicated, but can be done with a MATLAB extension, which is sold separately by MathWorks. Through the MATLAB Toolbox for Maple, MATLAB commands can be called from within the Maple Computer Algebra System, and vice versa.
Alternatives
MATLAB has a number of competitors. There are free open source alternatives to MATLAB, in particular GNU Octave, FreeMat, and Scilab which are intended to be mostly compatible with the MATLAB language (but not the MATLAB desktop environment). Among other languages that treat arrays as basic entities (array programming languages) are APL and its successor J, Fortran 95 and 2003, as well as the statistical language S (the main implementations of S are S-PLUS and the popular open source language R). There are also several libraries to add similar functionality to existing languages, such as Perl Data Language for Perl and SciPy together with NumPy and Matplotlib for Python.