
Introduction to Steganalysis Schemes

Multimedia Security

Outline
Steganalysis of LSB encoding
Steganalysis based on JPEG compatibility
Some discussions

Introduction
Steganography
The art of secret communication
Stego content (e.g. images) should not
contain any easily detectable artifacts due
to message embedding
The less information is embedded, the
smaller the probability of introducing
detectable artifacts

Watermarking vs. Steganography

[Figure: steganography and watermarking compared along three axes: fidelity, robustness, and capacity]

Steganalysis of LSB Encoding

Goal
To inspect one or possibly more images for statistical artifacts due to message embedding in color images using the LSB method
To find out which images are likely to contain secret messages
To estimate the reliability of decisions
Type I error (false alarm) and Type II error (miss)

Application Scenarios
Automatic checking
Images sent over the Internet to a certain address pass through an Internet node with a special filter
Forensics expert
Images in a seized computer

LSB Encoding
Replace the LSB of every gray level or color-channel value with a message bit
On average, 50% of the LSBs are changed
Logic behind this scheme
LSBs in scanned or camera-taken images are essentially random
Encrypted (randomized) messages are random
Hence no statistical artifacts are expected to be introduced
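
The embedding step described above can be sketched in a few lines of Python. This is only an illustration, not the code of any particular tool: the names `embed_lsb` and `extract_lsb`, the representation of an image as a list of (R, G, B) tuples, and the sequential placement of bits are all assumptions made for the example.

```python
def embed_lsb(pixels, bits):
    """Return a copy of `pixels` whose first len(bits) channel values carry `bits` in their LSBs.

    pixels: list of (R, G, B) tuples with 8-bit values (assumed representation)
    bits:   list of 0/1 message bits
    """
    flat = [value for pixel in pixels for value in pixel]
    if len(bits) > len(flat):
        raise ValueError("message exceeds the LSB capacity of the image")
    for i, bit in enumerate(bits):
        flat[i] = (flat[i] & ~1) | bit   # clear the LSB, then set it to the message bit
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def extract_lsb(pixels, n_bits):
    """Read back the first n_bits LSBs in the same order used by embed_lsb."""
    flat = [value for pixel in pixels for value in pixel]
    return [value & 1 for value in flat[:n_bits]]


cover = [(10, 20, 30), (11, 21, 31)]
message = [1, 0, 1, 0]
stego = embed_lsb(cover, message)
print(stego)                  # [(11, 20, 31), (10, 21, 31)]
print(extract_lsb(stego, 4))  # [1, 0, 1, 0]
```

Each channel value changes by at most 1, which is why the modification is visually negligible.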

Important Observation
Number of unique colors in cover images
Typically smaller than the number of pixels in the image
Ratio of unique colors to pixels: about 1:2 for high-quality scans in BMP format
1:6 or lower for JPEG images or video

Many true-color images have a relatively small palette
After LSB embedding, the new color palette will have a distinct feature
Many pairs of close colors
Evidence of LSB encoding-based steganography

Formulations
U: number of unique colors in an image
P: number of close color pairs
Two colors $(R_1, G_1, B_1)$ and $(R_2, G_2, B_2)$ are close if $|R_1 - R_2| \le 1$ and $|G_1 - G_2| \le 1$ and $|B_1 - B_2| \le 1$

R: ratio between the number of close pairs of colors and the number of all pairs of colors
$R = P / C(U, 2)$, where $C(\cdot, \cdot)$ is the number of combinations
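
The three quantities U, P, and R can be computed directly from the pixel list. The following is a minimal sketch under the definitions on this slide; the function name `close_color_stats` and the list-of-tuples image representation are assumptions made for illustration. Instead of testing every pair of colors, it looks up the 26 possible neighbors of each unique color.

```python
from math import comb


def close_color_stats(pixels):
    """Return (U, P, R) for a list of (R, G, B) tuples."""
    colors = set(pixels)
    U = len(colors)
    # All non-zero offsets with each component in {-1, 0, 1}
    offsets = [(dr, dg, db)
               for dr in (-1, 0, 1) for dg in (-1, 0, 1) for db in (-1, 0, 1)
               if (dr, dg, db) != (0, 0, 0)]
    ordered_hits = 0
    for r, g, b in colors:
        for dr, dg, db in offsets:
            if (r + dr, g + dg, b + db) in colors:
                ordered_hits += 1
    P = ordered_hits // 2          # each unordered pair was found from both ends
    R = P / comb(U, 2) if U > 1 else 0.0
    return U, P, R


pixels = [(10, 10, 10), (11, 10, 10), (10, 10, 10), (200, 0, 0)]
print(close_color_stats(pixels))   # (3, 1, 0.3333333333333333)
```

In the example, three unique colors give C(3, 2) = 3 possible pairs, of which only (10, 10, 10) and (11, 10, 10) are close.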

The Proposed Scheme


After embedding, U will be increased to $U'$, and we can evaluate the new number of close pairs $P'$
The value of R for an image that does not have a message will be smaller than that of an image that already has a message embedded in it

The Proposed Scheme (cont.)


It is impossible to find a single threshold of R for all images
Due to the large variation of U

Observations for reliable distinguishing
For an image that already contains a large message
Embedding another message in it does not modify R significantly
For an image not containing a message
R increases significantly

Use the relative comparison of R as the decision criterion

Detection Algorithm

To find out whether or not an image has a secret message:
1. Calculate $R = P / C(U, 2)$
2. Embed a test message using LSB embedding in randomly selected pixels
   Size of the test message: 3aMN bits (for M by N color images), where a is the fraction of the LSB capacity used
3. Calculate $R' = P' / C(U', 2)$ for the image with the test message
4. Decide whether the image contains a message
   $R' \approx R$: the image already had a large message hidden
   $R' > R$: the image did not have a message in it

$R'/R$: the separating statistic
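
The four steps can be put together as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function names, the use of Python's `random` module to pick pixels and test bits, the default `a=0.05`, and the list-of-tuples image representation are all hypothetical choices made for the example.

```python
import random
from math import comb


def close_pair_ratio(pixels):
    """R = P / C(U, 2) for a list of (R, G, B) tuples."""
    colors = set(pixels)
    U = len(colors)
    if U < 2:
        return 0.0
    hits = 0
    for r, g, b in colors:
        for dr in (-1, 0, 1):
            for dg in (-1, 0, 1):
                for db in (-1, 0, 1):
                    if (dr, dg, db) != (0, 0, 0) and (r + dr, g + dg, b + db) in colors:
                        hits += 1
    return (hits // 2) / comb(U, 2)   # every close pair was counted twice


def embed_test_message(pixels, a, rng):
    """Overwrite all three LSBs of a random fraction `a` of the pixels (3*a*M*N bits)."""
    out = list(pixels)
    for idx in rng.sample(range(len(out)), round(a * len(out))):
        out[idx] = tuple((v & ~1) | rng.getrandbits(1) for v in out[idx])
    return out


def separating_statistic(pixels, a=0.05, seed=0):
    """Return R'/R; values near 1 suggest that a large message is already present."""
    rng = random.Random(seed)
    R = close_pair_ratio(pixels)
    R_prime = close_pair_ratio(embed_test_message(pixels, a, rng))
    return R_prime / R if R > 0 else float("inf")
```

A fixed seed keeps the sketch reproducible. The statistic is only meaningful in comparison with a threshold, which is the subject of the following slides.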

Limitations
If the secret message is too small
The two ratios will be very close to each other
We cannot distinguish images with and without messages

Experiments
Using an image database of 300 color images
350x250 pixels
JPEG compressed
Capacity for each image: 32.8 KB (350x250x3/8 bytes)

A message of length 20 KB (roughly 2/3 of maximal capacity) was embedded into each image to form a new database of images with messages
The detection algorithm is run on both databases, and message presence is tested by embedding a test message of size 1 KB (a=1/30)

Experimental Results

[Figure: results for the original database and the embedded database, one curve each]

Parameter Optimization

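Model the density functions as Gaussian distributions
$N(\mu, \sigma)$ for images without a message and $N(\mu_s, \sigma_s)$ for images with a secret message of size s

Different sizes of secret messages, denoted as s, and of test messages are tested
Secret messages: 1% to 50%
Test messages: a = 0.01 to 0.5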

Results
$\mu > \mu_s$ for all s
As s decreases, $N(\mu_s, \sigma_s)$ becomes flatter and the peak moves right
As s increases, $N(\mu_s, \sigma_s)$ becomes narrower and the peak moves left
It is easier to separate the two peaks for larger secret message sizes

Threshold Selection
Select the threshold Th so that Type I error = Type II error
(equivalent to minimizing the overall error)

Change the threshold Th to adjust for the importance of not missing an image with a secret message, at the expense of more false alarms
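
Under the Gaussian model, the threshold at which the two error rates coincide has a closed form. The sketch below assumes that the separating statistic follows N(mu, sigma) for images without a message and N(mu_s, sigma_s) for images with one, with mu > mu_s, and that an image is flagged when its statistic falls below Th. The function names and the numeric parameters are hypothetical and not taken from the experiments.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma), sigma being the standard deviation."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def equal_error_threshold(mu, sigma, mu_s, sigma_s):
    """Th solving (Th - mu_s) / sigma_s = (mu - Th) / sigma."""
    return (mu_s * sigma + mu * sigma_s) / (sigma + sigma_s)


def error_rates(th, mu, sigma, mu_s, sigma_s):
    """Return (Type I, Type II) when 'message present' is decided for statistic < th."""
    type_1 = normal_cdf(th, mu, sigma)               # clean image flagged: false alarm
    type_2 = 1.0 - normal_cdf(th, mu_s, sigma_s)     # image with a message passed: miss
    return type_1, type_2


# Hypothetical parameters, chosen only to exercise the functions
mu, sigma, mu_s, sigma_s = 1.30, 0.10, 1.05, 0.05
th = equal_error_threshold(mu, sigma, mu_s, sigma_s)
print(th, error_rates(th, mu, sigma, mu_s, sigma_s))
```

Raising Th above this value lowers the miss rate and raises the false-alarm rate, which is the trade-off described on the slide.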

Experimental Results

Experimental Results (cont.)

Conclusions
The probability of an erroneous prediction is mainly determined by the size of the secret message
The influence of the test message size is much smaller

The optimal test message size is different for different secret message sizes
The detection algorithm mainly targets images with a small number of unique colors
The results for high-quality scanned and losslessly compressed images (U>0.5MN) may be unreliable

Steganalysis Based on JPEG Compatibility

Image Steganography
Image formats
Uncompressed (BMP)
Offering the highest capacity and best overall security

Palette (GIF)
Difficult to provide security with reasonable capacity

Lossy compressed (JPEG, JPEG 2000)
Difficult to hide a message in the JPEG stream in a secure manner while keeping the capacity practical

Goal of this Paper


To show that images may be extremely poor candidates for cover images if
They were initially acquired as JPEG images and later decompressed to a lossless format

For steganographic methods, a minimal amount of distortion is to be achieved to reduce visible artifacts
The act of message embedding will not erase the characteristic structure created by JPEG compression
Analyzing the DCT coefficients of the image makes it possible to recover even the values of the JPEG quantization table

Evidence for steganography
An image stored in a lossless format that bears a strong fingerprint of JPEG compression, yet is not fully compatible with a JPEG compressed image

JPEG Compression

Uncompressed image, 8x8 blocks $B_{orig}$
DCT: $d_k(i)$, $i = 0, \ldots, 63$
Quantization with the JPEG quantization matrix Q: $D_k(i) = \mathrm{round}(d_k(i) / Q(i))$
Zigzag scan
Huffman coder

JPEG Decompression
Huffman decoding
$QD_k(i) = Q(i) \cdot D_k(i)$
Multiply each quantized DCT coefficient by its quantization step

$B_{raw} = \mathrm{DCT}^{-1}(QD)$
Inverse DCT

$B = [B_{raw}]$
Rounded to integers in the range 0-255
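
The quantization and decompression steps for a single 8x8 block can be traced in plain Python. This sketch makes several simplifying assumptions: the lossless zigzag and Huffman stages are skipped, the level shift by 128 used in real JPEG codecs is omitted (as on the slides), the quantization matrix is a hypothetical flat matrix of 4s, and all function names are invented for the example.

```python
from math import cos, pi, sqrt

N = 8
# Orthonormal 8-point DCT-II matrix: row u holds the u-th basis vector.
C = [[(sqrt(1 / N) if u == 0 else sqrt(2 / N)) * cos((2 * x + 1) * u * pi / (2 * N))
      for x in range(N)] for u in range(N)]
CT = [list(col) for col in zip(*C)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * coeffs * C."""
    return matmul(matmul(CT, coeffs), C)


def jpeg_round_trip(block, Q):
    """Quantize and dequantize one block; return (QD, B_raw, B)."""
    d = dct2(block)
    D = [[round(d[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]   # D = round(d / Q)
    QD = [[Q[u][v] * D[u][v] for v in range(N)] for u in range(N)]          # QD = Q * D
    B_raw = idct2(QD)                                                       # real-valued pixels
    B = [[min(255, max(0, round(p))) for p in row] for row in B_raw]        # B = [B_raw]
    return QD, B_raw, B


block = [[100 + (3 * x + 5 * y) % 40 for y in range(N)] for x in range(N)]
Q = [[4] * N for _ in range(N)]          # hypothetical flat quantization matrix
QD, B_raw, B = jpeg_round_trip(block, Q)
rounding_error = sum((B_raw[x][y] - B[x][y]) ** 2 for x in range(N) for y in range(N))
print(rounding_error <= 16)              # True
```

Because the DCT matrix is orthonormal, distances are the same in the pixel domain and the DCT domain, a property the compatibility test relies on.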

Observations
If the block B has no pixels saturated at 0 or 255
$\|B_{raw} - B\|^2 \le 16$, where $\|\cdot\|$ is the $L_2$ norm
Since $|B_{raw}(i) - B(i)| \le 0.5$ for all i
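
The constant 16 follows from the rounding step alone. Written out for the 64 pixels of an 8x8 block, indexed here by i = 0, ..., 63 (a short derivation added for clarity, using only the quantities defined above):

```latex
\|B_{raw} - B\|^2 = \sum_{i=0}^{63} |B_{raw}(i) - B(i)|^2 \le 64 \cdot (0.5)^2 = 16
```

Saturated blocks are excluded because clipping a value to 0 or 255 can move it by more than 0.5.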

The Proposed Scheme


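Question
Given an arbitrary 8x8 block B of pixel values, could this block have arisen through the process of JPEG decompression with the quantization matrix Q (if available)?
With $QD' = \mathrm{DCT}(B)$ and $QD = \mathrm{DCT}(B_{raw})$:
$\|B - B_{raw}\|^2 = \|\mathrm{DCT}(B) - \mathrm{DCT}(B_{raw})\|^2 = \|QD' - QD\|^2$ (by Parseval's equality)
$16 \ge \|QD' - QD\|^2 \ge \sum_i |QD'(i) - Q(i)\,\mathrm{round}(QD'(i)/Q(i))|^2 = S$
- Additional check
- $\sum_i (QD'(i) - q_{p(i)}(i))^2 \le 16$, where $q_{p(i)}(i)$ are integer multiples of $Q(i)$ close to $QD'(i)$
- $B = [\mathrm{DCT}^{-1}(QD)]$, where $QD(i) = q_{p(i)}(i)$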

Algorithm
1. Divide the image into 8x8 blocks
2. Arrange the blocks in a list, and remove all saturated blocks from the list
   T: number of remaining blocks
3. Extract the quantization matrix Q from all T blocks
   If all elements of Q are 1s, the image is not analyzed further

Algorithm (cont.)
4. For each block B, calculate S
5. If S > 16, B is not compatible with JPEG compression; otherwise, perform the additional check
6. After going through all T blocks, if no incompatible block is found, no evidence of steganography is available
7. Repeat the algorithm for different 8x8 divisions to detect cropped images
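
Steps 4 and 5 for a single block can be sketched as follows. The quantization matrix Q is assumed to be known already (step 3 is not shown), the function names are hypothetical, and the additional check is deliberately simplified: it only tries the nearest multiple of Q(i) for every coefficient instead of all nearby multiples, so a mismatch is reported as "undecided" rather than as proof of incompatibility.

```python
from math import cos, pi, sqrt

N = 8
# Orthonormal 8-point DCT-II matrix and its transpose
C = [[(sqrt(1 / N) if u == 0 else sqrt(2 / N)) * cos((2 * x + 1) * u * pi / (2 * N))
      for x in range(N)] for u in range(N)]
CT = [list(col) for col in zip(*C)]


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)] for i in range(N)]


def dct2(block):
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    return matmul(matmul(CT, coeffs), C)


def classify_block(block, Q):
    """Return 'saturated', 'incompatible', 'compatible', or 'undecided' for one 8x8 block."""
    if any(p in (0, 255) for row in block for p in row):
        return "saturated"                      # step 2: such blocks are skipped
    QDp = dct2(block)                           # QD' = DCT(B)
    nearest = [[Q[u][v] * round(QDp[u][v] / Q[u][v]) for v in range(N)] for u in range(N)]
    S = sum((QDp[u][v] - nearest[u][v]) ** 2 for u in range(N) for v in range(N))
    if S > 16:
        return "incompatible"                   # step 5, first branch
    # Simplified additional check: does the nearest lattice point decompress to B?
    rebuilt = [[round(p) for p in row] for row in idct2(nearest)]
    return "compatible" if rebuilt == block else "undecided"


Q = [[16] * N for _ in range(N)]                # hypothetical flat quantization matrix
flat_128 = [[128] * N for _ in range(N)]        # DC coefficient 1024 = 64 * 16
flat_129 = [[129] * N for _ in range(N)]        # every LSB set to 1: DC coefficient 1032
one_bit = [row[:] for row in flat_128]
one_bit[3][4] = 129                             # a single LSB changed
for name, blk in (("flat_128", flat_128), ("flat_129", flat_129), ("one_bit", one_bit)):
    print(name, classify_block(blk, Q))
```

The third block illustrates why the additional check exists: changing a single LSB moves the DCT coefficients by a squared distance of only 1, so S stays below 16 and only the reconstruction comparison reveals the mismatch.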

Extracting the Quantization Matrix

Some Discussions

Reference
J. Fridrich, R. Du and M. Long, "Steganalysis of LSB encoding in color images," ICME 2000, New York, 2000
J. Fridrich, M. Goljan and R. Du, "Steganalysis based on JPEG compatibility," SPIE Multimedia Systems and Applications IV, Denver, 2001
G. Goth, "Steganalysis gets past the hype," IEEE Distributed Systems Online, April 2005
