An Exploration in the Detection of Hidden Data in Audio Bit Streams Presented by: John Monk CS 525, Spring Semester 2002 http://cs.uccs.edu/~jmmonk/project
The Problem The cyber threat to our security is increasing daily and is difficult to detect and defeat. Attack and exploitation techniques are being developed faster than defensive measures can be researched and implemented. One field where this is true is steganography (the practice of information hiding). Examples range from subtly altering a JPEG file to embed hidden data, to embedding encrypted messages into streaming video and audio.
The Problem - Continued In a real-world environment, this practice is difficult to detect in real time for several reasons: There is too much network traffic. There is no guarantee that all the RTP packets will take the same route to their destination, which makes monitoring difficult and expensive. Traditional tools like protocol analyzers, while very good at detecting problems with framing, transmission, checksums, etc., give no real insight into the content of the data being observed.
Proposal I investigated whether embedded hidden data in an audio bit stream can be detected through more qualitative methods (i.e., does the "sound" change when embedded hidden data is present? If so, what characteristics does the change have?). Ideally, this research would yield a set of analytical formulas that describe the relationship between the hidden data, the voice (carrier) data, the measurements, and the results.
Materials Agilent Technologies has a VoIP test and measurement tool called the VQT. The VQT is an objective voice quality test system that provides detailed test and analysis capabilities for voice quality on modern telephony networks. Its scoring analysis exposes impairments to voice quality by testing for voice clarity, delay, echo, silence suppression, DTMF, and signal loss.
Materials CoolEdit2000 – Used for its spectral analysis tools. MP3Stego – Software for encrypting and embedding a text message into an MP3. http://www.cl.cam.ac.uk/~fapp2/steganography/mp3stego
Methods
1. Choose a .wav file to convert to an MP3.
2. Create a small text file to be hidden in the resultant MP3.
3. Run: encode -E hidden_text.txt -P pass svega.wav svega_stego.mp3
   This compresses svega.wav (mono, 44.1 kHz, 16-bit encoded) and hides hidden_text.txt. The hidden text is encrypted using pass as a password. The output file is svega_stego.mp3.
4. Compare the original recording with the recording containing the encrypted, hidden message.
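Step 3 above can be scripted so that the same encoder call is repeated for several recordings. The following is a minimal sketch, not part of the original experiment: the helper names build_encode_command and run_if_available are hypothetical, and it assumes the MP3Stego encoder is reachable on the PATH under the name encode, as in the command on the slide.

```python
import shutil
import subprocess


def build_encode_command(text_file, password, wav_in, mp3_out, encoder="encode"):
    """Return the argument list for the encoder call shown in the Methods slide."""
    return [encoder, "-E", text_file, "-P", password, wav_in, mp3_out]


def run_if_available(argv):
    """Run the command only if its executable is on PATH; return True if it ran."""
    if shutil.which(argv[0]) is None:
        return False
    subprocess.run(argv, check=True)
    return True


cmd = build_encode_command("hidden_text.txt", "pass", "svega.wav", "svega_stego.mp3")
print(" ".join(cmd))
# prints: encode -E hidden_text.txt -P pass svega.wav svega_stego.mp3
```

Passing the arguments as a list avoids shell quoting problems if a file name or password contains spaces.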
Comparison of Waveforms in the Time Domain The waveforms are almost identical!
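"Almost identical" can be given a number by measuring the signal-to-noise ratio between the two recordings. The following is a minimal sketch under stated assumptions, not the procedure used in the experiment (which relied on visual inspection in CoolEdit2000): it assumes both recordings have already been decoded to mono 16-bit WAV files and are sample-aligned, and the function names are hypothetical. MP3 encoders normally add a delay, so real files would need alignment first.

```python
import math
import struct
import wave


def read_mono_16bit(path):
    """Read a mono, 16-bit PCM WAV file and return its samples as a list of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected a mono, 16-bit WAV file")
        frames = wav.readframes(wav.getnframes())
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))


def rms(samples):
    """Root-mean-square level of a non-empty sample sequence."""
    if not samples:
        raise ValueError("need at least one sample")
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(reference, test):
    """SNR of `test` relative to `reference` in dB; infinite if they are identical."""
    n = min(len(reference), len(test))
    noise_rms = rms([test[i] - reference[i] for i in range(n)])
    if noise_rms == 0.0:
        return math.inf
    return 20.0 * math.log10(rms(reference[:n]) / noise_rms)


# Synthetic stand-in for the two recordings: a 440 Hz tone and a copy
# whose samples are each shifted by +1 or -1.
tone = [10000.0 * math.sin(2 * math.pi * 440 * t / 44100) for t in range(1000)]
altered = [s + (1.0 if t % 2 == 0 else -1.0) for t, s in enumerate(tone)]
print(round(snr_db(tone, altered), 1), "dB")
```

A high SNR says only that the sample values are close; it does not by itself show whether hidden data is present.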
Comparison of Waveforms in the Frequency Domain From a frequency analysis point of view, there are no distinguishing characteristics between the original recording and the recording with the hidden message.
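The same comparison can be made bin by bin on the magnitude spectra of short frames. The following is a minimal sketch, not the CoolEdit2000 analysis used in the experiment: it uses a naive discrete Fourier transform from the standard library for clarity (an FFT library would be the practical choice), assumes both frames are sample-aligned and of equal length, and the function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes (divided by n) for bins 0..n/2 of a real frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_difference(frame_a, frame_b):
    """Per-bin absolute difference between the magnitude spectra of two frames."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same length")
    spec_a = magnitude_spectrum(frame_a)
    spec_b = magnitude_spectrum(frame_b)
    return [abs(a - b) for a, b in zip(spec_a, spec_b)]


# Synthetic stand-in: a tone with exactly 4 cycles in a 64-sample frame, and a
# copy with a small alternating perturbation (energy at the highest bin).
n = 64
tone = [math.sin(2 * math.pi * 4 * t / n) for t in range(n)]
altered = [s + 0.001 * (-1) ** t for t, s in enumerate(tone)]

diff = spectral_difference(tone, altered)
worst_bin = max(range(len(diff)), key=lambda k: diff[k])
print("largest difference at bin", worst_bin)
```

In this synthetic case the perturbation is concentrated in one bin and is easy to spot; the slide reports that no such distinguishing feature appeared for the MP3Stego output.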
Perceptual Clarity (PAMS) There seems to be a slight amount of signal loss and a slight increase in distortion due to the hidden data. However, the overall measurement score is very good.
Perceptual Clarity (PESQ) There seems to be a slight amount of signal loss occurring in the same places as detected in the PAMS measurement. However, the overall measurement score is very good.
Perceptual Clarity (PSQM) There seems to be a slight amount of signal loss occurring in the same places as detected in the other two measurements. However, the overall measurement score and the estimated MOS are very good.
Results Making (objective) qualitative observations does not seem to be a feasible way to detect hidden data in bit streams. There may be some minor alterations of the original recording when data is hidden in it, but the difference is so small that it is not perceptible. The bit stream with hidden data is dependent on the encoding scheme, which is not limited to the obvious transforms (DCT, wavelet, fractal, etc.). There may not be any dependable universal criterion by which to determine whether hidden data is present.
Discussion The encoding scheme used in this experiment is very efficient. The data is first compressed, then encrypted, and then hidden in the MP3 bit stream. The hiding process takes place at the heart of the Layer III encoding process, namely in the inner_loop function (l3loop.c). The inner loop quantizes the input data and increases the quantizer step size until the quantized data can be coded with the available number of bits. Another loop checks that the distortions introduced by the quantization do not exceed the threshold defined by the psychoacoustic model. I believe that it is this adherence to the psychoacoustic model that makes it very difficult to “hear” the hidden data.
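The loop structure described above can be illustrated with a toy model. The following is a sketch under heavy assumptions, not the code in l3loop.c: the uniform quantizer and the bit-cost function are simplified stand-ins for the Layer III power-law quantizer and Huffman coding, and all names are hypothetical. The optional required_parity argument is also an assumption, modeled loosely on MP3Stego's public description of carrying each hidden bit in the parity of the coded block length.

```python
def quantize(coeffs, step):
    """Uniform quantizer (toy stand-in for the Layer III power-law quantizer)."""
    return [int(abs(c) / step + 0.5) * (1 if c >= 0 else -1) for c in coeffs]


def toy_bit_cost(quantized):
    """Stand-in for Huffman coding: magnitude bits plus a sign bit per nonzero value."""
    return sum(abs(q).bit_length() + 1 for q in quantized if q != 0)


def inner_loop(coeffs, available_bits, step=1.0, growth=1.1,
               required_parity=None, max_iter=1000):
    """Grow the step size until the block fits the bit budget.

    If required_parity is 0 or 1 (assumption, see lead-in), keep growing the
    step until the bit count also has that parity.
    """
    for _ in range(max_iter):
        quantized = quantize(coeffs, step)
        bits = toy_bit_cost(quantized)
        parity_ok = required_parity is None or bits % 2 == required_parity
        if bits <= available_bits and parity_ok:
            return quantized, step, bits
        step *= growth
    raise ValueError("no step size satisfied the constraints")


def distortion(coeffs, quantized, step):
    """Mean squared quantization error; an outer loop would compare this
    against the threshold from the psychoacoustic model."""
    return sum((c - q * step) ** 2 for c, q in zip(coeffs, quantized)) / len(coeffs)


coeffs = [100.0, -50.0, 25.0, 12.0, -6.0, 3.0]
plain = inner_loop(coeffs, available_bits=20)
hidden = inner_loop(coeffs, available_bits=20, required_parity=1)
print("plain bits:", plain[2], "with hidden bit:", hidden[2])
```

In this toy model the parity constraint only pushes the step size slightly further, so the extra quantization error is small; that is consistent with the point above that the hidden data is hard to hear while the distortion stays under the psychoacoustic threshold.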
Summary Detecting hidden data in bit streams is still a difficult problem with no good answers yet. A plausible attack on data hidden in MP3 or other compressed formats is to decompress the bit stream and then recompress it, although this can result in a severe loss of quality.
References
Multimedia Security Research at Purdue University - http://dynamo.ecn.purdue.edu/~ace/water2/digwmk.html
“A Tutorial on Digital Watermarking”, Fernando Perez-Gonzalez and Juan R. Hernandez, Dept. Tecnologias de las Comunicaciones, ETSI Telecom., Universidad de Vigo, 36200 Vigo, Spain.
“On the Limits of Steganography”, Ross J. Anderson and Fabien Petitcolas. IEEE Journal on Selected Areas in Communications, 16(4):474-481, May 1998.