Multimedia Data Hiding

Presentation on theme: "Multimedia Data Hiding"— Presentation transcript:

1 Multimedia Data Hiding
Wade Trappe Wireless Information Network Laboratory Rutgers University

2 Outline
- Basic framework and issues in multimedia data hiding
- Two main embedding mechanisms
  - Spread spectrum additive embedding
  - (Deterministic) relationship enforcement embedding
  - Along with several examples and applications
- Watermark attacks and countermeasures
- Collusion resistance

3 Why Hide Data in Multimedia

4 Crypto is Useful to Protect MM, but Not Enough …
- Encryption
  - Helps to protect confidentiality
  - Protection vanishes after decryption
  - Prefer a way to associate copyright info. with the multimedia source even after decryption/compression/transmission/etc.
- Digital cryptographic signature
  - Helps to authenticate the sender's identity and data integrity
  - Need to attach a separate signature to the data source
- Audio/image/video allows imperceptible changes
  - Opportunities for new and seamless ways of authentication

5 Demands on Info. Security and Protection
- Intellectual property management for digital media
  - Promising electronic marketplace for digital music and movies
  - Napster controversy
- Conventional encryption alone still leaves many problems unsolved
  - Protection from encryption vanishes once data is decrypted
  - Still want to establish ownership and restrict illegal re-distribution
  - How to distinguish changes introduced by compression from malicious tampering?
  - Bit-by-bit accuracy is not always the desired authenticity criterion for MM

6 Multimedia Data Hiding / Digital Watermarking
- What is it?
  - Example: picture in picture, words in words
  - It is "steganography"
  - Secondary information in perceptual digital media data
- Why hide information?
  - Convey information without an additional channel
  - Hidden data is tied to content: can be made difficult to separate data from content
  - Covert communication: adversary does not know it is there!

7 Data Hiding in Perceptual Data
- Perceptual data (audio/image/video) vs. non-perceptual
  - Perceptually no difference: allow small changes in terms of Hamming or Euclidean distance
  - Non-perceptual data often have semantic or functional constraints, hence cannot be easily changed in terms of Hamming or Euclidean distance
- Unequal importance within perceptual (multimedia) data
- High data volume and real-time requirements for multimedia
- Perceptual properties of multimedia allow for imperceptible but lossy processing
  - Including lossy compression (JPEG, MPEG, …) and data hiding
- Data hiding examples in perceptual & non-perceptual data
  - Hiding data in a text message, e.g., change a word to one of its synonyms (Birthday Attack on Hashes)
  - Hiding data in a scanned text message, e.g., change line spacing or pixel values

8 General Framework of Data Hiding
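[Figure: block diagram. The host media and the data to be hidden (e.g., "Hello, World") enter the embed stage, which outputs marked media (w/ hidden data). After compress / process / attack, the resulting test media goes to a player (play / record / …) or to the extract stage, which outputs the extracted data.]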

9 Issues and Challenges
- Tradeoff among conflicting requirements
  - Imperceptibility
  - Robustness & security
  - Capacity
- Key elements of data hiding
  - Perceptual model
  - Embedding one bit
  - Multiple bits
  - Uneven embedding capacity
  - Robustness and security
  - What data to embed
- Upper layers: uneven capacity equalization, error correction, security, ……
- Lower layers: imperceptible embedding of one bit, multiple-bit embedding, coding of embedded data

10 Examples of Multimedia Data Hiding

11 Watermark-based Authentication
- Embed patterns and content features using a lookup table
- High embedding capacity/security via shuffling
  - Locate alteration
  - Differentiate content vs. non-content change (compression)
[Figure labels: unchanged, content changed]

12 Relationship Enforcement Embedding
- Deterministically enforcing a relationship
  - Secondary info. carried solely in the watermarked signal
- Representative: odd-even embedding
  - No need to know the host signal (no host interference)
  - High capacity but limited robustness
  - Robustness achieved by quantization or a tolerance zone
  - Example: enforce the number of black pixels per block to be odd/even to hide data in a binary image
[Figure: feature-value axis with marks at 2kQ, (2k+1)Q, (2k+2)Q, (2k+3)Q, …; odd-even mapping assigns even multiples to "0" and odd multiples to "1"; lookup table mapping is shown as an alternative]
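The odd-even rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheme from the slides: the function names `embed_odd_even` and `extract_odd_even`, the scalar feature, and the step size `q` are all chosen here for the example.

```python
def embed_odd_even(feature: float, bit: int, q: float) -> float:
    """Move `feature` to the nearest multiple of q whose index has parity `bit`."""
    k = round(feature / q)  # index of the nearest multiple of q
    if k % 2 != bit:
        # Wrong parity: step to the neighbouring multiple on the side
        # where the original feature value lies.
        k += 1 if feature / q >= k else -1
    return k * q


def extract_odd_even(feature: float, q: float) -> int:
    """Read the hidden bit back: even index -> 0, odd index -> 1."""
    return round(feature / q) % 2


# Hypothetical feature value 37.0 with step size 8.0.
marked = embed_odd_even(37.0, bit=0, q=8.0)         # nearest even multiple of 8
recovered = extract_odd_even(marked + 3.0, q=8.0)   # distortion smaller than q/2
```

Extraction never consults the host value, which matches the "no host interference" point; the tolerance zone here is half a step on either side of the enforced multiple.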

13 Relationship Enforcement (cont’d)
- General approach: partition the host signal space into sub-regions
  - Each region is labeled with 0 or 1
  - The marked signal is taken from a region close to the original and labeled with the bit to hide
  - Secondary info. carried solely in X'
  - The difference (X' - X) doesn't necessarily reflect the embedded data
- Advanced embedding: combining the two types with techniques suggested by information theory
[Figure: mapping block. Inputs: data to be hidden {b} (1 or 0) and host signal X. Output: marked copy X' = f(b).]

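14 Additive Embedding
- Add a secondary signal to the host media
- Representative: spread spectrum embedding
  - Add a noise-like signal and detect via correlation
  - Good tradeoff between imperceptibility and robustness
  - Limited capacity: the host signal often appears as the major interferer
[Figure: modulation block. The data to be hidden is modulated into a watermark signal W and added to the original source X, giving the marked copy X' = X + W.]
- Detection without the host: $\langle X' + \text{noise},\, W \rangle = \langle W + (X + \text{noise}),\, W \rangle$
- Detection with the host subtracted: $\langle X' + \text{noise} - X,\, W \rangle = \langle W + \text{noise},\, W \rangle$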
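The effect of host interference on correlation detection can be illustrated with a small simulation. This is only a sketch with assumed parameters: the watermark is written as `wmk` (a random ±1 sequence), the host and the channel noise are synthetic Gaussian samples, and the name `correlate` is introduced here for the example.

```python
import random


def correlate(a, b):
    """Average inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


rng = random.Random(7)
n = 2000
host = [rng.gauss(0.0, 10.0) for _ in range(n)]      # strong host signal X
wmk = [rng.choice((-1.0, 1.0)) for _ in range(n)]    # noise-like watermark
marked = [x + w for x, w in zip(host, wmk)]          # X' = X + watermark
test = [m + rng.gauss(0.0, 1.0) for m in marked]     # marked copy plus noise

# Without the host: the statistic is 1 plus host interference whose
# standard deviation is about 10 / sqrt(2000), i.e. roughly 0.22.
blind_stat = correlate(test, wmk)

# With the host subtracted: only the weak channel noise remains.
nonblind_stat = correlate([t - x for t, x in zip(test, host)], wmk)
```

With these assumed numbers the second statistic stays close to 1, while the first fluctuates far more because the host is the dominant interferer.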

15 Spread Spectrum Approach: Cox et al (NECI)
4/17/2017
- Key points
  - Place wmk in the perceptually significant spectrum (for robustness)
    - Modify by a small amount below the just-noticeable difference (JND)
  - Use a long random vector as wmk to avoid artifacts (for imperceptibility & robustness)
- Embedding: $v'_i = v_i + \alpha v_i w_i = v_i (1 + \alpha w_i)$
  - Perform DCT on the entire image and embed wmk in DCT coeff.
  - Choose the N = 1000 largest AC coeff. and scale {v_i} by a random factor
[Figure: embedding pipeline. Original image → 2D DCT → sort → N largest coeff. → v' = v (1 + α w), with w from a random vector generator driven by the wmk seed → recombine with other coeff. → IDCT & normalize → marked image]

16 Cox’s Scheme (cont’d)
- Detection
  - Subtract the original image from the test image before running it through the detector
  - Original detection measure used by Cox et al.: a correlator normalized by |Y|
[Figure: detection pipeline. The test image X' and the original unmarked image X are preprocessed and passed through the DCT, the N largest coefficients are selected, a similarity with wmk is computed and compared with a threshold to decide between X' = X + W + N and X' = X + N.]
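A rough sketch of multiplicative embedding and non-blind similarity detection follows. Several things here are assumptions rather than details from the slides: the embedding rule is taken to be v'_i = v_i (1 + alpha * w_i), the coefficients are synthetic stand-ins for the N largest AC DCT coefficients (no image or DCT is involved), the similarity is normalized by the norm of the extracted vector, and `ALPHA`, the noise level, and the helper names are illustrative.

```python
import math
import random

ALPHA = 0.1   # embedding strength (assumed value)
N = 1000      # number of marked coefficients


def embed(coeffs, wmk, alpha=ALPHA):
    """Multiplicative rule v'_i = v_i * (1 + alpha * w_i)."""
    return [v * (1.0 + alpha * w) for v, w in zip(coeffs, wmk)]


def extract(test_coeffs, orig_coeffs, alpha=ALPHA):
    """Non-blind extraction: invert the embedding rule using the original."""
    return [(t - v) / (alpha * v) for t, v in zip(test_coeffs, orig_coeffs)]


def similarity(extracted, wmk):
    """Correlation normalized by the norm of the extracted vector."""
    dot = sum(e * w for e, w in zip(extracted, wmk))
    return dot / math.sqrt(sum(e * e for e in extracted))


rng = random.Random(2017)
# Synthetic large-magnitude coefficients with random signs.
coeffs = [rng.choice((-1, 1)) * rng.uniform(50.0, 500.0) for _ in range(N)]
wmk = [rng.gauss(0.0, 1.0) for _ in range(N)]


def distort(values):
    """Mild additive noise standing in for processing of the image."""
    return [v + rng.gauss(0.0, 2.0) for v in values]


sim_marked = similarity(extract(distort(embed(coeffs, wmk)), coeffs), wmk)
sim_unmarked = similarity(extract(distort(coeffs), coeffs), wmk)
```

Under these assumptions `sim_marked` is on the order of sqrt(N), while `sim_unmarked` behaves roughly like a standard normal sample, so a fixed threshold of a few units separates the two cases.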

17 Theoretical Foundations
- Optimal detection for On-Off Keying (OOK)
  - OOK under i.i.d. Gaussian noise {d_i}
  - $b \in \{0, 1\}$ represents absence vs. presence of the ownership mark
  - Use a correlator-type detector (recall the review last week)
  - Need to determine how to choose {s_i}
- Neyman-Pearson detection [Poor's book Sec. 2.4]
  - False alarm ~ claiming wmk existence when nothing is embedded
  - Given the max. allowed false-alarm probability, try to minimize the prob. of miss detection
  - Use the likelihood ratio as the detection statistic
  - Determine the threshold according to the false-alarm prob.
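The threshold selection can be made concrete with a short calculation. The sketch below assumes the observation model y_i = b * s_i + d_i with d_i i.i.d. N(0, sigma^2), and a correlator statistic normalized as T = sum(y_i * s_i) / (sigma * sqrt(sum(s_i^2))), so that T is N(0, 1) when b = 0 and N(sqrt(E) / sigma, 1) when b = 1, where E = sum(s_i^2). The model, the normalization, and the function names are assumptions made for this example.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero mean, unit variance


def np_threshold(p_false_alarm: float) -> float:
    """Threshold tau with P(T > tau | nothing embedded) = p_false_alarm."""
    return STD_NORMAL.inv_cdf(1.0 - p_false_alarm)


def miss_probability(p_false_alarm: float, signal_energy: float,
                     noise_std: float) -> float:
    """P(T <= tau | mark present) for the normalized correlator."""
    tau = np_threshold(p_false_alarm)
    shift = math.sqrt(signal_energy) / noise_std  # mean of T when b = 1
    return STD_NORMAL.cdf(tau - shift)


tau = np_threshold(1e-3)
p_miss = miss_probability(1e-3, signal_energy=36.0, noise_std=1.0)
```

Lowering the allowed false-alarm probability raises `tau`, and for a fixed signal energy that raises the miss probability, which is the tradeoff the Neyman-Pearson criterion manages.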

18 Performance of Cox’s Scheme
- Robustness (claimed): scaling, JPEG, dithering, cropping, "printing-xeroxing-scanning", multiple watermarking
- No big surprise with high robustness: equiv. to conveying just 1 bit {0,1} with $O(10^3)$ samples
- Comment
  - Must store the original unmarked image ⇒ "private wmk", "non-blind" detect.
  - Perform image registration if necessary
  - Adjustable parameters: N and α

19 Invisible Robust Wmk: Improved Schemes
- Apply a better human perceptual model
  - A global scaling factor is not suitable for all coefficients
  - Explicitly compute the just-noticeable difference (JND)
    - JND ~ max amount each freq. coeff. can be modified imperceptibly
    - Use α_i for each coeff. ⇒ finely tune wmk strength
  - Better tradeoff between imperceptibility and robustness
    - Try to add a watermark as strong as possible
- Block-DCT based schemes: Podilchuk-Zeng & Swanson et al.
  - Existing visual model for block DCT: JPEG quality factors

20 Compare Cox & Podilchuk Schemes
[Figure: three images: original, Cox, Podilchuk]
- Cox: whole-image DCT; embed in the 1000 largest coeff.
- Podilchuk: block-DCT; embed to all "embeddables"

21 Compare Cox & Podilchuk Schemes (cont’d)

22 Video Example
- 1st & 30th Mpeg4.5Mbps frame of original, marked, and their luminance difference
- Human visual model for imperceptibility: protect smooth areas and sharp edges

23 Some Watermark Attacks

24 Watermark Attacks: What and Why?
- Attacks: intentionally obliterate watermarks
  - Remove a robust watermark
  - Make the watermark undetectable (e.g., miss synchronization)
  - Create uncertainty in detection (e.g., multiple ownership claims)
  - Forge a valid (fragile) watermark
  - Bypass the watermark detector
- Why study attacks?
  - Identify weaknesses
  - Propose improvements
  - Understand advantages and limitations of each solution

25 “Innocent Tools” Exploited by Attackers
- Recovery of lost blocks for resilient multimedia transmission of JPEG/MPEG
  - Good quality by edge-directed interpolation: Jung et al.; Zeng-Liu
- Remove a robust watermark by block replacement
[Figure: edge estimation followed by edge-directed interpolation]

26 Attack effective on block-DCT based spread-spectrum watermark
[Figure: 512x512 Lenna in three versions: marked original (no distortion), JPEG 10%, and after the proposed attack. Threshold: 3 ~ 6]
- The attacked scheme claimed high robustness & quality by fine-tuning wmk strength for each region

27 Secure Digital Music Initiative Challenge
- International consortium ~ 180+ companies/organizations
  - Currently pursuing a watermark-based solution for access and copy control on digital music
  - Use watermark to convey copy/access control policy
- Public challenge (9/15-10/8/2000)
  - Attacks on four robust watermark technologies
- Non-traditional research values
  - Reveal a real industrial problem and state-of-the-art technologies
  - Present an emulated competitive environment for better understanding of audio watermarking
  - Lead to a few research problems

28 SDMI Challenge Setup
[Figure: challenge setup. Obtained from SDMI: Sample-1 (original), Sample-2 (marked), and Sample-3 (marked); a black-box (unknown) embedder adds the watermark (a special signal). The black-box detector answers "Watermark Found" for any marked audio. Job for "attackers": attack Sample-3 to produce Sample-4 (attacked). GOAL: the detector answers "Watermark NOT Found" for Sample-4.]

29 Can Ear Tell Difference?
Liang Zhu Top of World Comparison among 3 samples: original, applying attack-1 to orig., applying attack-2 to orig. 2 / 0 / 3

30 Learning from SDMI Challenge
- Princeton University's successful attacks
  - Blind attacks: warping, jittering
  - Attacks based on studying orig.-marked pairs: deliberate filtering / subtraction / randomization
- Research issues
  - What embedding framework/algorithm gives sufficient robustness as well as security, esp. when orig.-marked pairs are available to the attacker?
  - Is watermarking useful for copy/access control?
    - Hard to get a complete solution with technology alone: business model, pricing model, etc.
    - Improved watermark tech. could be part of the solution: make attacks non-trivial and keep honest people honest

31 Summary and Conclusions
- Data hiding in digital multimedia serves a variety of purposes and involves multiple disciplines
  - Tradeoff among many criteria
  - Important to think both as designer and as attacker
- Emerging problems
  - How to effectively combine with other security mechanisms
- Data hiding in the market
  - Digital cameras with authentication watermark module
  - Plug-in for image editors
  - Video watermark proposals for DVD copy control
  - On-going SDMI effort for digital music
  - "Digital Rights Management (DRM)" for multimedia data

32 Summary
- Comparisons of data hiding in MM vs. non-perceptual
- Basic framework and issues in multimedia data hiding
- Two main embedding mechanisms
  - Spread spectrum additive embedding
  - (Deterministic) relationship enforcement embedding
  - Along with several examples and applications
- Watermark attacks and countermeasures

33 Suggested reading
- I. Cox, J. Kilian, T. Leighton, T. Shamoon: "Secure Spread Spectrum Watermarking for Multimedia", IEEE Transactions on Image Processing, vol.6, no.12, pp , 1997. Download from IEEE online journal, or
- C. Podilchuk and W. Zeng: "Image Adaptive Watermarking Using Visual Models", IEEE Journal on Selected Areas in Communications (JSAC), vol.16, no.4, May 1998. Download from IEEE online journal
- M. Wu and B. Liu: "Multimedia Data Hiding", Springer Verlag, to appear Dec An earlier dissertation version is at

34 Digital Fingerprinting and Tracing Traitors
- Recent advances in communications allow for convenient information sharing
- Severe damage is created by unauthorized information leakage, e.g., pirated content or classified documents
- Promising countermeasure: robustly embed digital fingerprints
  - Insert an ID or "fingerprint" to identify each customer
  - Prevent improper redistribution of multimedia content
[Figure: a content company sells "A Beautiful Mind" to Alice, Bob, and Carl; their copies carry fingerprints w1, w2, w3]

35 Embedded Fingerprinting for Multimedia
[Figure: two pipelines. Sell content: original media, fingerprint for the customer (Alice), compress. Fingerprint tracing: suspicious copy, extract, search database, candidate.]

36 Collusion Scenarios
- Collusion: a cost-effective attack against multimedia fingerprints
  - Users with the same content but different fingerprints come together to produce a new copy with diminished or attenuated fingerprints
- Colluders cannot arbitrarily manipulate embedded fingerprint bits
  - Different from Boneh-Shaw's marking assumption (1995)
- Result of fair collusion: energy of embedded fingerprints decreases
[Figure: averaging attack and interleaving attack]
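The energy loss under an averaging attack can be checked numerically. The sketch below assumes orthogonal ±1 fingerprints (rows of a Sylvester Hadamard matrix), a synthetic Gaussian host, unit embedding strength, and non-blind detection; the helper names and the choice of colluders are illustrative only.

```python
import random


def hadamard(n):
    """Sylvester construction; n must be a power of two. Rows are orthogonal."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def correlate(a, b):
    """Average inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


LENGTH = 64
rng = random.Random(3)
host = [rng.gauss(0.0, 10.0) for _ in range(LENGTH)]
fingerprints = hadamard(LENGTH)[:8]   # 8 users with orthogonal fingerprints
copies = [[x + w for x, w in zip(host, fp)] for fp in fingerprints]

colluders = [1, 2, 4]                 # hypothetical colluding users
colluded = [sum(copies[u][i] for u in colluders) / len(colluders)
            for i in range(LENGTH)]

# Non-blind detection: subtract the host, then correlate with each fingerprint.
residual = [c - x for c, x in zip(colluded, host)]
stats = [correlate(residual, fp) for fp in fingerprints]
energy = correlate(residual, residual)   # per-sample fingerprint energy left
```

With K = 3 colluders each colluder's statistic drops from 1 to 1/3 and the residual fingerprint energy drops to 1/3 per sample, while innocent users stay at 0.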

37 Anti-Collusion Fingerprinting
- Build anti-collusion fingerprinting to trace traitors and colluders
- Potential impact
  - Gather digital evidence and trace culprits
  - Deter unauthorized dissemination in the first place: let the bad guys know their high risk of being caught
  - Potential civilian use for digital rights management (DRM)
    - DRM business ~ $96M in 2000 and projected $3.5B in 2005

38 Additive Embedding Overview
- Additive embedding is spread spectrum watermarking (Cox et al.)
  - Content signal
  - User j's watermark is a noise-like signal
  - The watermark is scaled and added to the content to make y_j
[Figure: embedding diagram with labels s_j, w_j, x, α, y_j]
- Orthogonal modulation: the number of waveforms equals the number of users
- Code modulation: code bits b_ij modulate orthogonal spread spectrum sequences (e.g., antipodal bits); other choices for b_ij are possible and remain a direction to explore
- Typical WNR: -20 dB in blind detection, 0 dB in non-blind detection
- Detection via hypothesis testing: correlator for AWGN

39 Additive Fingerprint Detection
- Detection can be formulated as a hypothesis testing problem
- The optimal detector can be calculated from assumptions on the distortion (host media and noise from attacks)
- If the distortion is $N(0, \sigma^2)$, then the optimal detector is a correlator
- Example (antipodal code modulation): detection is similar to what you do in digital communication
[Figure: detection statistic T_N for b = 1 and b = -1]
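Antipodal code modulation with correlator detection can be sketched as follows. The example assumes non-blind detection, a unit scaling factor, orthogonal ±1 basis sequences taken from a Sylvester Hadamard matrix, and a synthetic Gaussian host and noise; `embed`, `detect`, and the 4-bit code are hypothetical names and values chosen for illustration.

```python
import random


def hadamard(n):
    """Sylvester construction; n must be a power of two. Rows are orthogonal."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def embed(host, bits, basis, alpha=1.0):
    """y = x + alpha * sum_i b_i * u_i, with antipodal bits b_i in {-1, +1}."""
    return [x + alpha * sum(b * u[i] for b, u in zip(bits, basis))
            for i, x in enumerate(host)]


def detect(test, host, basis):
    """Correlate the host-subtracted signal with each sequence; keep the sign."""
    residual = [t - x for t, x in zip(test, host)]
    stats = [sum(r * c for r, c in zip(residual, u)) / len(u) for u in basis]
    return [1 if s > 0 else -1 for s in stats]


LENGTH = 256
rng = random.Random(11)
basis = hadamard(LENGTH)[1:5]                        # four orthogonal sequences
host = [rng.gauss(0.0, 10.0) for _ in range(LENGTH)]
bits = [1, -1, -1, 1]                                # one user's code bits

marked = embed(host, bits, basis)
attacked = [y + rng.gauss(0.0, 1.0) for y in marked]  # additive Gaussian noise
decoded = detect(attacked, host, basis)
```

Each statistic is the embedded bit plus averaged noise, so taking the sign is the same decision rule used for antipodal signalling in digital communication.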

40 Approaches to Tracing Colluders
- Colluder identification via orthogonal fingerprints
  - Prior works by Cox et al., Stone, Kilian et al.
  - Advantage in distinguishing individual fingerprints
  - Disadvantage in fingerprint attenuation during collusion
- Colluder identification via correlated fingerprints
  - Boneh-Shaw codes may be too long to be reliably embedded and extracted (Su et al.) ~ millions of bits for 1000 users
  - Prefer to trace as many colluders as possible instead of only tracing one as in the B-S code

41 Our Proposed Approach
- Overall: consider fingerprint encoding, embedding & detection
- Build correlated fingerprints in two steps
  - Anti-collusion fingerprint codes resist up to K colluders
    - Any subset of up to K users shares a unique set of code bits
    - Shared bits get sustained and are used to identify colluders
  - Use antipodal coded modulation to embed fingerprint codes via orthogonal spread spectrum sequences
- Jointly consider fingerprint detection and decoding

42 16-bit ACC for Detecting ≤ 3 Colluders Out of 20
[Figure: User-1 with code ( -1, -1, -1, -1, 1, 1, 1, 1, …, 1 ) and User-4 with code ( -1, 1, 1, 1, 1, 1, …, -1, 1, 1, 1 ) collude by averaging; the extracted fingerprint code ( -1, 0, 0, 0, 1, …, 0, 0, 0, 1, 1, 1 ) uniquely identifies Users 1 & 4]
- Embed fingerprint via HVS-based spread spectrum embedding in the block-DCT domain

43 Anti-Collusion Fingerprint Codes
- Simplified assumption: fingerprint codes follow the logic-AND operation after collusion
- K-resilient AND ACC code
  - A binary code C = {c1, c2, …, cn}
  - The logical AND of any combination of up to K codevectors is distinct from the AND of any other combination of up to K codevectors
  - Example: {(1110), (1101), (1011), (0111)}
- ACC code via combinatorial design
  - Balanced Incomplete Block Design (BIBD)
- Simple example: ACC code via (7,3,1) BIBD for handling up to 2 colluders among 7 users

44 Balanced Incomplete Block Design (BIBD)
- Construction example of (7,3,1) BIBD code
  - X = {1,2,3,4,5,6,7}
  - A = {123, 145, 246, 167, 347, 257, 356}
- A (v, k, λ=1)-BIBD is a (k-1)-resilient AND ACC
  - Defined as a pair (X, A)
    - X is a set of v points
    - A is a collection of blocks of X, each with k points
    - Every pair of distinct points is in exactly λ blocks
  - # blocks: $n = \lambda (v^2 - v) / (k^2 - k)$
- Code length for n = 1000 users: $O(n^{0.5})$ ~ dozens to hundreds of bits
  - Shorter than prior art by Boneh-Shaw, $O((\log n)^6)$ ~ millions of bits
[Figure: incidence diagram over points 1 2 3 4 5 6 7]
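The (7,3,1) example is small enough to verify by brute force. The sketch below assumes that each user's codevector is the bit complement of the corresponding block's incidence vector (a bit is 0 exactly at the points in the user's block), which is one way to turn the design into an AND code; that mapping and the helper names are assumptions made for this example.

```python
from itertools import combinations

POINTS = range(1, 8)
BLOCKS = [{1, 2, 3}, {1, 4, 5}, {2, 4, 6}, {1, 6, 7},
          {3, 4, 7}, {2, 5, 7}, {3, 5, 6}]

# BIBD check: every pair of distinct points lies in exactly one block.
pair_counts = {pair: sum(1 for b in BLOCKS if set(pair) <= b)
               for pair in combinations(POINTS, 2)}
is_bibd = all(count == 1 for count in pair_counts.values())

# One codevector per block (user): 0 at the block's points, 1 elsewhere.
CODES = [tuple(0 if p in block else 1 for p in POINTS) for block in BLOCKS]


def collude_and(users):
    """Bitwise AND of the codevectors of the given users."""
    return tuple(min(CODES[u][i] for u in users) for i in range(len(POINTS)))


# 2-resilience check: ANDs over all groups of 1 or 2 users are distinct.
signatures = {}
for size in (1, 2):
    for users in combinations(range(len(CODES)), size):
        signatures.setdefault(collude_and(users), []).append(users)
is_2_resilient = all(len(groups) == 1 for groups in signatures.values())
```

The 7 single users and 21 pairs produce 28 distinct AND results, consistent with a (k-1) = 2 resilient code for 7 users with 7-bit codevectors.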

45 ACC Codes Under Averaging Collusion
- BIBD-based ACC codes under averaging collusion
  - Can distinguish colluded bits from sustained bits statistically with appropriate modulation or embedding
  - The set of sustained bits is unique with respect to the colluder set

46 Colluder Detector Design: Two Approaches
- Hard detection: detect the bit values, then estimate the colluders from these values
  - Uses the fact that the combination of codevectors uniquely identifies colluders
  - Everyone is initially suspected as guilty, and each '1' bit narrows down the set
- Soft detection: possible candidates
  - Sorting: use the largest detection statistics to optimize the likelihood function, first determining the bit values and then estimating the colluder set
  - Sequential: iteratively update the likelihood function and directly identify the colluder set
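The hard-detection idea of narrowing the suspect set can be sketched on the (7,3,1) code. The example assumes error-free bit decisions that follow the logic-AND collusion model, and codevectors formed as the bit complement of each block's incidence vector; the function names and the chosen colluders are hypothetical.

```python
POINTS = range(1, 8)
BLOCKS = [{1, 2, 3}, {1, 4, 5}, {2, 4, 6}, {1, 6, 7},
          {3, 4, 7}, {2, 5, 7}, {3, 5, 6}]
# One codevector per user: 0 at the points of the user's block, 1 elsewhere.
CODES = [tuple(0 if p in block else 1 for p in POINTS) for block in BLOCKS]


def collude_and(colluders):
    """Bits observed after collusion under the logic-AND assumption."""
    return tuple(min(CODES[u][i] for u in colluders)
                 for i in range(len(POINTS)))


def hard_detect(observed_bits):
    """Start with every user as a suspect; each '1' bit removes the users
    whose codevector has a 0 at that position."""
    suspects = set(range(len(CODES)))
    for position, bit in enumerate(observed_bits):
        if bit == 1:
            suspects = {u for u in suspects if CODES[u][position] == 1}
    return sorted(suspects)


accused = hard_detect(collude_and([0, 3]))   # users indexed from 0
```

A sustained '1' can only occur where every colluder holds a '1', so each such bit safely clears users with a '0' there; for up to 2 colluders in this code the remaining suspects are exactly the colluders.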

47 ACC Experiment with Gaussian Signals
- Soft decoding gives more accurate colluder identification than hard decoding
- Joint decoding and colluder identification gives better performance than separating the two steps
- Sequential colluder identification gives a good tradeoff between performance and computational complexity

48 ACC Experiments with Images
- Experiments with 1, 2, and 3 colluders were performed using the Lenna image under both blind and non-blind detection
- Fingerprints were embedded in perceptually significant DCT coeffs (Podilchuk-Zeng)
- The fingerprinted images had no visible distortion, with a PSNR of 41.2 dB
- Colluded images were compressed using JPEG with QF 50%

49 Summary
- Important to design anti-collusion fingerprints for multimedia
  - Collusion is a cost-effective attack against fingerprinting
  - Anti-collusion fingerprints allow us to trace traitors and deter unauthorized information leakage
- Proposed anti-collusion fingerprinting for multimedia by combining code design and code modulation
  - Anti-collusion codes developed using BIBDs
    - Use sustained bits to trace colluders
    - Code length is much shorter than the prior Boneh-Shaw code
  - Joint decoding and colluder identification gives better performance than separating decoding and colluder identification

