
1 Image Forgery: JPEG Compression-Based Forgery Detection
Presented by: Hilal Diab. Course professor: Hagit Hel-Or. 18/1/2015

2 What's to come? JPEG compression algorithm, DCT, quantization tables,
Benford's law

3 JPEG That file format for images, right?
Wrong: JPEG is a compression method, similar to ZIP.

4 JPEG (Joint Photographic Experts Group)
JPEG became an image compression standard in 1992. JPEG is a lossy compression algorithm. JPEG can achieve 10:1 compression with little noticeable difference. Explain what lossy means.

5 JPEG Compression decreases from left to right.
The quality of the image increases from left to right. Increasing the compression level too much degrades the image until it is barely recognizable. * Show the difference between the eyes! * Notice that we have green blocks (explained later).

6 JPEG Algorithm: Image -> Color transformation -> DCT -> Quantization -> Encoding

7 YCbCr RGB has a lot of redundant information.
Y′ is the luma component, and Cb and Cr are the blue-difference and red-difference chroma components. Y′CbCr is a way of encoding RGB information, but raw RGB signals are not efficient as a representation for storage and transmission, since they carry a lot of redundancy. The Y′CbCr color space conversion allows greater compression without a significant effect on perceptual image quality.

8 YCbCr We take the luma component and store it at high resolution (humans are very sensitive to changes between light and dark), and we take the two chroma components and store them at a lower resolution and/or downsample the data.
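
To make the color step concrete, here is a minimal Python/NumPy sketch of a BT.601-style RGB to Y′CbCr conversion followed by 4:2:0 chroma subsampling; the exact coefficients and subsampling layout vary between encoders, so treat this as illustrative.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an HxWx3 uint8 RGB image to Y, Cb, Cr (BT.601, full range)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128.0
    return y, cb, cr

def subsample_420(chroma):
    """Keep luma at full resolution; average each 2x2 chroma block (4:2:0)."""
    h, w = chroma.shape
    h, w = h - h % 2, w - w % 2            # crop to even size for simplicity
    c = chroma[:h, :w]
    return c.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Toy usage on a random image
img = np.random.randint(0, 256, (16, 16, 3), dtype=np.uint8)
y, cb, cr = rgb_to_ycbcr(img)
cb_small, cr_small = subsample_420(cb), subsample_420(cr)
print(y.shape, cb_small.shape)             # (16, 16) (8, 8)
```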

9 Downsampling
This image shows that we can have 100 times less color information and not tell the difference. Of course, if we zoom in to the edges we would notice some small differences in color, but for general purposes this is a great solution for conserving space while keeping the image pretty much the same. Source: Computerphile.

10 JPEG Algorithm: Image -> Color transformation -> DCT -> Quantization
Next step: DCT

11 8x8 We take the image and divide it into non-overlapping 8x8 blocks (per channel). We represent each block with an 8x8 block of cosine waves (using the DCT). If the image dimensions do not divide by 8, we either pad with black (bad: ringing artifacts) or repeat the edge pixels (good).

12 DCT What it does is translate a set of data points into a set of coefficients that each describe a cosine wave.

13 DCT Each 8x8 pixel block is encoded with its own DCT.
If we have a signal of length 8, we can represent that data with 8 cosine waves of different frequencies. With these 64 cosine basis blocks, we can create any 8x8 image. If the only frequency contributing to the image were the first one, then that is how the image would look. Each basis block is weighted according to how much it contributes to the image (the coefficients). Show directionality.
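
A small Python sketch of the block transform, using SciPy's dctn for the separable 2-D DCT-II (which matches JPEG's transform up to normalization):

```python
import numpy as np
from scipy.fft import dctn, idctn

# One 8x8 block of pixel values (0..255): a horizontal gradient
block = np.tile(np.arange(0, 256, 32, dtype=np.float64), (8, 1))

coeffs = dctn(block - 128.0, norm='ortho')   # center, then 2-D DCT-II
print(np.round(coeffs, 1))
# Only the first row of coefficients is non-zero: the block varies only
# horizontally, so no vertical frequencies contribute.

# The transform is invertible; before quantization nothing is lost.
recon = idctn(coeffs, norm='ortho') + 128.0
assert np.allclose(recon, block)
```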

14 DCT Notice how the top-left basis block is plain white (the constant DC term);
the bottom-right ones contain all the high frequencies.

15 DCT

16 Quantization Tables The coarser the quantization, the more coefficients are rounded to zero.

17 DCT Between (a) and (b) it is important to mention that we center the data: we subtract 128 from every value (because values range from 0 to 255). In (c), the quantization table is chosen according to how many frequencies we want to get rid of (there is a penalty for every frequency). What we do is divide every coefficient by the corresponding value in the table, and then round the result to the nearest integer.
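
A sketch of that quantize/round step in Python, using the example luminance table from the JPEG standard (often quoted as the "quality 50" table); real encoders scale this table with the quality factor, so the values are illustrative.

```python
import numpy as np
from scipy.fft import dctn, idctn

# Example luminance quantization table (JPEG standard example, ~quality 50)
Q50 = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99]], dtype=np.float64)

def quantize_block(block, Q):
    """Center, DCT, divide by the table, round: these are the JPEG coefficients."""
    return np.round(dctn(block - 128.0, norm='ortho') / Q)

def dequantize_block(jpeg_coeffs, Q):
    """Multiply back and invert the DCT (the rounding loss is not recoverable)."""
    return idctn(jpeg_coeffs * Q, norm='ortho') + 128.0

block = np.random.randint(0, 256, (8, 8)).astype(np.float64)
q = quantize_block(block, Q50)
print(int((q == 0).sum()), "of 64 coefficients were rounded to zero")
```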

18 Quality 100, 50, 25, 10, 1. This slide is here to show what happens when we null out the high frequencies, and then slowly start to null out the lower frequencies as well.

19 Luminance - Chrominance
Luminance is usually quantized less than chrominance because the human eye is more sensitive to it. Both tables here are at the same quality factor.

20 JPEG Algorithm: Image -> Color transformation -> DCT -> Quantization -> Encoding

21 Encoding Encode the coefficients using Huffman encoding

22

23 Detecting Double Compressed JPEG Images
Babak Mahdian and Stanislav Saic

24 Double Quantization Double quantization process -> artifacts in the DCT coefficients histogram. When an image goes through a double quantization process, the histograms of its DCT coefficients show characteristic periodic artifacts (local peaks).

25 The graphs are the magnitudes of the Fourier transform of the histograms of the DCT coefficients.
The first column is single compressed; the rest are double compressed with different qualities. Each row is a different frequency channel. Notice the local peaks in the double compressed cases, and the decaying trend in the single compressed case.
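
A hedged simulation of this effect in Python: synthetic Laplacian-distributed coefficients (an assumption, used only to generate data) are quantized once versus twice, and the FFT magnitude of the resulting coefficient histograms is compared.

```python
import numpy as np

rng = np.random.default_rng(0)
# AC DCT coefficients are often modelled as roughly Laplacian; here that is
# only an assumption used to produce synthetic data for the illustration.
coeffs = rng.laplace(scale=8.0, size=200_000)

single = np.round(coeffs / 3)                       # quantize once, step 3
double = np.round(np.round(coeffs / 5) * 5 / 3)     # step 5, dequantize, step 3

def hist_spectrum(values):
    """Normalized histogram of the quantized values, then |FFT|."""
    h, _ = np.histogram(values, bins=np.arange(-40.5, 41.5))
    return np.abs(np.fft.rfft(h / h.sum()))

print("single:", np.round(hist_spectrum(single)[:12], 3))
print("double:", np.round(hist_spectrum(double)[:12], 3))
# The single-compressed spectrum decays smoothly; the double-compressed one
# shows extra local peaks, caused by the periodic gaps and peaks that the
# second quantization leaves in the coefficient histogram.
```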

26 Detecting Double JPEG Compression
Only check the luminance channel. Only use the lower frequencies. Denoise using an average filter. Remove decaying trends while preserving local peaks. Check only luminance because color is usually downsampled and gives bad results. We only consider low frequencies, because the higher ones are usually quantized to zero. Denoise the HISTOGRAMS of the DCT coefficients. Remove the decaying trend by using local minimum subtraction.

27 Detecting Double JPEG Compression
The detrended signal is H_i(k) minus the local minimum of H_i in a window of length n around k, where n is the length of the minimum filter and H_i is the normalized FFT of the histograms of the DCT coefficients. n is determined in the training process, set according to the wanted accuracy / false positives.
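
A sketch of the local-minimum detrending in Python; the window length n = 9 is a placeholder for the trained parameter mentioned above.

```python
import numpy as np
from scipy.ndimage import minimum_filter1d

def remove_decaying_trend(H, n=9):
    """Subtract a sliding local minimum of length n from H.

    A smooth decaying trend stays close to its local minimum, so it is
    removed, while narrow local peaks (the double-compression artifact)
    survive the subtraction.
    """
    return H - minimum_filter1d(H, size=n)

# Toy example: a decaying trend with a peak added at index 20
H = np.exp(-np.arange(64) / 10.0)
H[20] += 0.5
b = remove_decaying_trend(H)
print(int(np.argmax(b)))   # -> 20: the peak stands out after detrending
```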

28 Classification Use an SVM to classify whether an image is single or double JPEG compressed. The features used are the normalized peak positions in H shown earlier. The learning phase was done on 1000 images and tested on another 1000.
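
A minimal scikit-learn sketch of this classification step; the feature vectors below are synthetic placeholders standing in for the normalized peak positions, and the RBF kernel is an assumption.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)

# Placeholder features: one row per image, standing in for the normalized
# peak positions extracted from the detrended histogram spectra.
X_single = rng.normal(0.2, 0.05, size=(1000, 8))
X_double = rng.normal(0.5, 0.05, size=(1000, 8))
X = np.vstack([X_single, X_double])
y = np.array([0] * 1000 + [1] * 1000)     # 0 = single, 1 = double compressed

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

clf = SVC(kernel='rbf')                   # kernel choice is an assumption
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```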

29 Results Notice that it doesn't work well when the first QF is high.

30 Cons Classification is per quantization step
They assume that images are not in bitmap format. Not good for high quality factors. Not great results when the first quality factor is higher than the second.

31 A Generalized Benford's Law for JPEG Coefficients and its Applications in Image Forensics
Dongdong Fu, Yun Q. Shi, Wei Su

32 Reminder JPEG coefficients are the 8x8 DCT coefficients after quantization.

33 Benford's Law The first digit d of many naturally occurring data sets follows P(d) = log10(1 + 1/d), d = 1, ..., 9.

34 Based On Statistics Statistics from the UCID database show that the first digits of the DCT coefficients are indeed very close to Benford's law.

35 JPEG Coefficients Close, but not quite there.
You can see there is some sort of pattern in the deviation.

36 Generalized Benford's Law
p(d) = N · log10(1 + 1/(s + d^q)), d = 1, ..., 9, where N is a normalization constant. It works well: the SSE (sum of squared errors) against the model is small. The statistics are from a big dataset, with the results averaged. When s = 0 and q = 1 we revert back to Benford's law.
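
A Python sketch of how the first-digit statistics and the SSE against the (generalized) Benford model could be computed; the coefficients are synthetic, and the default parameters N = 1, s = 0, q = 1 (plain Benford's law) are placeholders for the fitted values reported in the paper.

```python
import numpy as np

def first_digit_distribution(coeffs):
    """Empirical distribution of the first significant digit of non-zero coefficients."""
    mags = np.abs(coeffs[coeffs != 0]).astype(np.float64)
    digits = (mags / 10 ** np.floor(np.log10(mags))).astype(int)   # 1..9
    counts = np.bincount(digits, minlength=10)[1:10]
    return counts / counts.sum()

def generalized_benford(N=1.0, s=0.0, q=1.0):
    """p(d) = N * log10(1 + 1/(s + d**q)); with N=1, s=0, q=1 this is Benford's law.
    In practice N, s, q are fitted per quality factor on a large dataset."""
    d = np.arange(1, 10, dtype=np.float64)
    return N * np.log10(1.0 + 1.0 / (s + d ** q))

# Synthetic stand-in for quantized (JPEG) DCT coefficients
rng = np.random.default_rng(2)
coeffs = np.round(rng.laplace(scale=12.0, size=100_000))

empirical = first_digit_distribution(coeffs)
model = generalized_benford()                     # placeholder parameters
sse = float(np.sum((empirical - model) ** 2))
print("first-digit distribution:", np.round(empirical, 3))
print("SSE against the (generalized) Benford model:", round(sse, 4))
```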

37 JPEG Identification Based on experiments, the previous law does NOT apply to double compressed images. We can JPEG compress a given bitmap image and check whether the law applies or not. Just use an SVM to learn from the SSE results and check whether the image was previously JPEG compressed or not.

38 Graphs Easy to spot the difference between the compressed and uncompressed

39 Results 100%

40 Estimating QF Re-compressing an image with the same QF won't change the first-digit statistics very much, so we can find the quality factor that was used: they took an image that was compressed with QF = 80, compressed it again with multiple QFs, and found that QF = 80 gives the closest fit to the generalized law.
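
A simplified illustration of the search idea in Python, using plain quantization steps instead of full JPEG quality factors (a deliberate simplification): requantizing with the step that was actually used changes the first-digit statistics the least.

```python
import numpy as np

def quantize(x, step):
    return np.round(x / step) * step

def first_digits(values):
    mags = np.abs(values[values != 0]).astype(np.float64)
    d = (mags / 10 ** np.floor(np.log10(mags))).astype(int)
    counts = np.bincount(d, minlength=10)[1:10]
    return counts / counts.sum()

rng = np.random.default_rng(3)
# Pretend the "original compression" used quantization step 5
original = quantize(rng.laplace(scale=20.0, size=100_000), 5)
base = first_digits(original)

for step in range(2, 9):
    change = np.abs(first_digits(quantize(original, step)) - base).sum()
    print(f"candidate step {step}: first-digit change = {change:.4f}")
# The change is exactly 0 for step 5, the step that was actually used:
# requantizing with the same step leaves the coefficients untouched.
```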

41 Double-Compression Detection
The left is compressed once; the other is double compressed. We show the distribution of the most significant (first) digit of the JPEG coefficients.

42 Cons Using Benford’s law requires a lot more information, so this will not work on small images. Won't detect recompression of small patches

43 JPEG Error Analysis and Its Applications to Digital Image Forensics
Weiqi Luo, Member, IEEE, Jiwu Huang, Senior Member, IEEE, and Guoping Qiu, Member, IEEE

44 Purpose Identifying JPEG images. Estimating quantization steps.
Detecting quantization tables. Double JPEG compression detection. Existing double JPEG compression detection usually needs to know whether the image is a JPEG or not before it can work, or depends on the quantization tables being known.

45 Observations After JPEG compression with quantization steps equal to or larger than 2, the number of AC coefficients of an image in the range (-1, +1) increases, while the number in the union of the regions (-2, -1] and [+1, +2) decreases significantly. Using the same quantization table again gives better preservation of the original image: recompressing with the same quantization table is better than using QF 100!

46 Reminder

47 JPEG identification is converted into discriminating between Fig. (b) and Fig. (d).
As you can see, the AC coefficients of the recompressed image (d) are different from those in (b). They proposed a feature S as a threshold separating JPEG and uncompressed images.

48 Fun math (ha ha) Rounding error
R is the rounding error; the rest makes sense.

49 Definitions P, P′: the PDFs of d and d′. k: the quantization step.

50 Fun math P is the PDF of the coefficients. d1(i,j) will be quantized to the nearest integer multiple qk, where k is the quantization step and the integer q is determined by the image's content. The 3rd equation holds because d2 and d1′ are almost the same (they differ only by a rounding error), so P2 is a noise-contaminated version of P1.

51 Fun math These are the regions we will check in order to find a difference between JPEG and non-JPEG images.

52 Math.. Both equations result in our feature S.
The bottom increases while the top decreases after quantization, so S for a JPEG image will be close to zero.

53 Identifying JPEG Images
Even with a high quality factor, the feature S falls into 2 categories, one for uncompressed images and one for JPEG. We can find a threshold to separate both cases! * The lower the QF, the smaller the S feature. * Just train a classifier to figure it out.
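
A rough Python sketch of the idea behind feature S: compute the block DCT of the decoded image and compare how many AC coefficients fall in (-1, +1) versus (-2, -1] ∪ [+1, +2). The exact definition of S in the paper is not reproduced here; the ratio below is an illustrative stand-in.

```python
import numpy as np
from scipy.fft import dctn

def block_dct_ac(image):
    """2-D DCT of each non-overlapping 8x8 block; return all AC coefficients."""
    h, w = (image.shape[0] // 8) * 8, (image.shape[1] // 8) * 8
    acs = []
    for i in range(0, h, 8):
        for j in range(0, w, 8):
            c = dctn(image[i:i+8, j:j+8] - 128.0, norm='ortho')
            acs.append(c.ravel()[1:])          # drop the DC term c[0, 0]
    return np.concatenate(acs)

def s_feature(image):
    """Illustrative ratio: mass in (-2,-1] U [1,2) over mass in (-1,1)."""
    d = block_dct_ac(image.astype(np.float64))
    near_one = np.sum((np.abs(d) >= 1) & (np.abs(d) < 2))
    near_zero = np.sum(np.abs(d) < 1)
    return near_one / max(near_zero, 1)

# For an uncompressed image this ratio is relatively large; after JPEG
# compression (quantize, decode, recompute) it drops towards zero.
img = np.random.randint(0, 256, (64, 64)).astype(np.float64)
print("S (uncompressed):", round(s_feature(img), 3))
```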

54 Results

55 Results (Wow) Other methods needed more statistics (larger images) and were sensitive to small sizes. The table entries are ACCURACY / FALSE-POSITIVE rate.

56 Estimating Quantization Steps
Find q(i,j), the quantization step for each DCT frequency (i,j). Meaning we have 8.5% of the coefficients that were rounded to the nearest neighbour.

57 Estimating Quantization Steps
Since most DCT coefficients will be quantized to 0 and rounded to 1, the largest bin could be 1 even though it's not the actual step.

58 Estimating Quantization Steps
If … then … else …, where H is the histogram of the DCT coefficients. Because of the rounding error, we add this step to check whether q = 1.
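
A simplified illustrative estimator in Python (not the exact procedure from the paper): the histogram of the rounded block-DCT coefficients of the decoded image peaks at multiples of the quantization step, so the tallest non-zero bin gives a crude estimate.

```python
import numpy as np

def estimate_step(rounded_coeffs, max_step=64):
    """Crude step estimate: location of the tallest non-zero histogram bin.

    Rounding noise around zero leaks a few counts into bin 1, which is why the
    paper adds a dedicated check for the q = 1 case (only hinted at here).
    """
    values = np.abs(rounded_coeffs.astype(int))
    hist = np.bincount(values[values > 0], minlength=max_step + 1)
    if hist.sum() == 0:
        return None              # everything quantized to zero: undecidable
    return int(np.argmax(hist[:max_step + 1]))

# Toy check: coefficients quantized with step 7 plus small rounding noise
rng = np.random.default_rng(4)
true = 7 * np.round(rng.laplace(scale=10.0, size=50_000) / 7)
observed = np.round(true + rng.normal(scale=0.3, size=true.size))
print(estimate_step(observed))   # expected: 7
```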

59 Results Shows the success rate for finding the quantization steps.
At QF 50, most of the frequencies were quantized to 0.

60 Comparison 16% improvement

61 Detecting Quantization Table (QF)
We will look at the difference between a JPEG and its recompressed versions with different QFs. Notice that we get a higher percentage of equal pixels and equal coefficients when we use the same QF as the one used in the previous JPEG compression. This way, if we take a JPEG and recompress it with a range of QF tables, we can find which one gives the best result and therefore which one was used before.

62 Detecting Quantization Table (QF)
R = |E| / (M · N), where J1 is the JPEG image, J2 is the recompressed image, |E| is the number of equal pixels between J1 and J2, and M, N are the image dimensions (this is the feature from the previous graph). This way we find the QF: find the table that results in the highest fraction of equal pixels.
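
A sketch of the search with Pillow, following the definitions above (R as the fraction of pixels unchanged by recompression). Using Pillow's quality parameter as a stand-in for the unknown encoder's tables, and the candidate range 50-100, are assumptions.

```python
import io
import numpy as np
from PIL import Image

def equal_pixel_ratio(j1, qf):
    """Recompress image j1 at quality qf and return |E| / (M * N)."""
    buf = io.BytesIO()
    j1.save(buf, format="JPEG", quality=qf)
    j2 = Image.open(buf)
    a, b = np.asarray(j1), np.asarray(j2)
    return np.mean(np.all(a == b, axis=-1))   # fraction of equal pixels

def detect_qf(path, candidates=range(50, 101)):
    j1 = Image.open(path).convert("RGB")
    scores = {qf: equal_pixel_ratio(j1, qf) for qf in candidates}
    return max(scores, key=scores.get)        # QF with the most equal pixels

# Usage (hypothetical file name):
# print(detect_qf("suspect.jpg"))
```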

63 Results Method 16 is Benford's law.

64 Summary Forgery detection based on: recompression,
Benford's law, and the double quantization effect.

65 Questions?

66 Thanks for listening! The End

67 References Benford's Law; JPEG compression algorithm; DCT; JPEG and quantization; JPEG.
W. Luo, J. Huang, G. Qiu, "JPEG Error Analysis and Its Applications to Digital Image Forensics," 2010. D. Fu, Y. Q. Shi, W. Su, "A Generalized Benford's Law for JPEG Coefficients and its Applications in Image Forensics," 2007. B. Mahdian, S. Saic, "Detecting Double Compressed JPEG Images," 2009.

