Experiments in Software Watermarking Bradford P. Cuppy B.S. University of Evansville Fri, Nov 8, 2002.


1 Experiments in Software Watermarking
Bradford P. Cuppy, B.S., University of Evansville
Fri, Nov 8, 2002

2 Introduction
Software piracy costs billions of dollars.
Code segments are stolen by others, denying the original author proper credit for the work accomplished.
Watermarking can serve as legal proof of ownership.
Software watermarking is a very new field, whereas video, graphic and audio watermarking are well established.
This presentation covers my watermark, which is based on the Planted Plane Cubic Tree (PPCT), the experiments on it, its strengths, attacks on it, and comparisons to what others have done.

3 Related Research
Types of watermarking systems
– Private (my watermark)
– Semi-private
– Public
Types of software watermarks
– Static data (e.g. the output of “strings netscape”)
– Dynamic software
– Dynamic data
– Easter egg
– Dynamic execution trace
– Others (fingerprinting, license mark)

4 Attacks on Watermarks
Three types of attacks
– Additive: the counterfeiter adds his own watermark
– Subtractive: the watermark is removed
– Distortive: obfuscation, or decompilation and recompilation, alters the watermark until it is no longer recognizable

5 Protecting Watermarks
According to the Palsberg paper, there are three defenses for watermarks
– Randomization
– Obfuscation
– Tamperproofing
These are covered in later slides.
Reference: Jens Palsberg, Sowmya Krishnaswamy, Minseok Kwon, Di Ma, Qiuyun Shao, Yi Zhang, “Experience with Software Watermarking”, 2000 Annual Computer Security Applications Conference, New Orleans, Section 3

6 My Software Watermark
My watermark uses a modified Planted Plane Cubic Tree (PPCT), which is based on the dynamic data structure watermark.
The watermark is basically a binary tree with a root node.
The watermark cannot be accessed while the program is running normally.
It is accessible only when the program is run under the GNU Debugger (gdb).
Accessibility is determined by setting a couple of special variables within the debugger.
The watermark can then be found in a core dump.

7 Comparison of Binary Trees
PPCT Tree
A PPCT is like a binary tree, except that there is a root node “R” which points to the starting node of the binary tree, and the binary tree nodes point back to root “R”.

8 Comparison of Binary Trees (cont’d)
My Modified PPCT Tree
My modified PPCT leaves out the pointers from the binary tree nodes back to the root “R”.
Questions concerning binary trees
– Why a binary tree? Why not another kind of tree or graph?
Binary trees used as dynamic data structure watermarks are not apparent to an attacker, since the program produces no distinctive output when it is executed, even in debug mode. Other watermarks, such as the Easter egg, are apparent and can be found and destroyed.
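The two node layouts compared on slides 7 and 8 can be pictured with a short C sketch. This is an illustration only, not the code used in the experiments: the type names `ppct_node` and `mod_node` and the helpers `mod_node_new` and `mod_tree_free` are hypothetical, and the original PPCT is simplified to the description given on slide 7 (every node carries a pointer back to “R”).

```c
#include <stdlib.h>

/* Original layout as described on slide 7 (simplified): each node of the
 * binary tree also points back to the root "R". Names are hypothetical. */
struct ppct_node {
    struct ppct_node *left;
    struct ppct_node *right;
    struct ppct_node *root;   /* back pointer to "R" */
};

/* Modified layout from slide 8: forward pointers only. */
struct mod_node {
    struct mod_node *left;
    struct mod_node *right;
};

/* Hypothetical helper: allocate one forward-only node.
 * Returns NULL if the allocation fails. */
static struct mod_node *mod_node_new(struct mod_node *left,
                                     struct mod_node *right)
{
    struct mod_node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Hypothetical helper: release a whole forward-only tree. */
static void mod_tree_free(struct mod_node *n)
{
    if (n == NULL)
        return;
    mod_tree_free(n->left);
    mod_tree_free(n->right);
    free(n);
}
```

With one pointer fewer per node, the forward-only layout occupies less memory per node than the layout with a back pointer.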

9 Modified PPCT versus Original PPCT
How is my modified PPCT better than the original PPCT?
– There is no need to point back to the root node, since tools like those available with Java for looking at byte-code binaries are not available with C/C++. GNU software only has the GNU Debugger (gdb) for going through programs.
– The watermark is found in a core dump after execution of the code.
– The raw memory locations have to be set through the GNU Debugger (gdb) before executing the program.
– The public would get the binary with the symbols stripped (“-s” option).
– The modified tree is easier to build with the available functions.

10 Mod PPCT v. PPCT (cont’d)
Is the modified PPCT easier or harder to break?
– In the original PPCT, the whole watermark can be found regardless of which node the cracker starts on, since every node has a pointer to the root node.
– In the modified PPCT, the cracker must start on the root node in order to manipulate the whole watermark. If he starts anywhere else, he won’t be able to destroy the whole watermark; parts of it would remain.
– The tree is smaller in structure, since it has forward pointers only.
– The tree is more difficult to find, since it is more compact.
– The opposite example is the Easter egg watermark, which is very big and complex and therefore an easy target, as mentioned in the Palsberg paper.
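The starting-node argument on this slide can be checked with a small reachability count. The sketch below assumes a forward-only node type named `mod_node` and a function named `reachable`; both names are hypothetical and chosen for illustration.

```c
#include <stddef.h>

/* Forward-only node, as in the modified PPCT (hypothetical name). */
struct mod_node {
    struct mod_node *left;
    struct mod_node *right;
};

/* Count the nodes an attacker can reach from `start` when only
 * forward pointers exist. Anything above `start` stays invisible. */
static size_t reachable(const struct mod_node *start)
{
    if (start == NULL)
        return 0;
    return 1 + reachable(start->left) + reachable(start->right);
}
```

For a six-node tree whose root has a single child with two subtrees, `reachable` returns 6 from the root but only the size of the subtree when started on an inner node. With a back pointer to “R” in every node, any node would lead to the whole tree.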

11 Mod PPCT v. PPCT (cont’d)
Why was I motivated to modify it?
– A PPCT under Java can be easily referenced with the available Java tools, whereas under GNU C/C++ the only tool to use is the GNU Debugger (gdb).
– The equivalent tools for C/C++ would have to be written, which would require a thorough knowledge of how the binaries are written by the compiler.
– Under C/C++, the PPCT cannot be traversed due to the limitations of the tools.
– Simplicity of coding.
How is the watermark discovered?
– The tree is traversed once, then a core dump is forced, and the watermark is discovered in the core dump file.
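The discovery step (traverse once, then force a core dump) might look roughly like the sketch below. It is a guess at the general shape, not the code from the experiments: the names `wm_node`, `wm_root`, `wm_gate`, `wm_traverse` and `wm_reveal` are hypothetical, and the gate variable stands in for the “special variables” that the slides say are set from the debugger. Whether a core file is actually written also depends on system settings such as the core size limit.

```c
#include <stddef.h>
#include <stdlib.h>

struct wm_node {
    struct wm_node *left;
    struct wm_node *right;
};

/* Keeping the root in a global makes the tree locatable from the
 * core file (hypothetical design choice). */
static struct wm_node *volatile wm_root;

/* Hypothetical gate: stays 0 in normal runs, so nothing happens.
 * In this sketch it would be set to a nonzero value from gdb. */
static volatile int wm_gate = 0;

/* Visit every node once and return how many were visited. */
static size_t wm_traverse(const struct wm_node *n)
{
    if (n == NULL)
        return 0;
    return 1 + wm_traverse(n->left) + wm_traverse(n->right);
}

/* Traverse the tree; if the gate is open, abort() raises SIGABRT,
 * whose default action terminates the process and may write a core
 * file that contains the tree. */
static size_t wm_reveal(struct wm_node *root)
{
    size_t visited;

    wm_root = root;
    visited = wm_traverse(root);
    if (wm_gate != 0)
        abort();
    return visited;
}
```

In a normal run the gate stays closed and `wm_reveal` simply returns the node count, so the program shows no distinctive behaviour.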

12 Alternative Watermarks
What alternative watermarks have I looked at?
– Static watermark: a display line, such as the copyright message “©1995 by Netscape, Inc.” that is shown by running “strings netscape” in Unix. It can simply be removed with a binary/hex editor. Very easily cracked.
– Code watermarks are easily susceptible to distortive de-watermarking attacks.
– The Easter egg watermark shows a special display after a special input sequence is performed. An example is entering “about:mozilla” in the URL field in Netscape, which shows a fire-breathing dragon instead of shooting stars.
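To make the weakness of the static watermark concrete, here is a minimal sketch. The notice text, `wm_static_notice` and `scrub` are all hypothetical; `scrub` simulates what an attacker does by hand in a hex editor, overwriting every copy of a known byte sequence in a program image held in memory.

```c
#include <stddef.h>
#include <string.h>

/* A static data watermark: a plain string stored in the binary.
 * The text is a made-up example. */
static const char wm_notice[] = "(c) 2002 Example Author";

/* Referencing the string from a function keeps it in the binary. */
const char *wm_static_notice(void)
{
    return wm_notice;
}

/* Simulated hex-editor attack: blank out every occurrence of `needle`
 * in `image` and return the number of occurrences overwritten. */
static size_t scrub(unsigned char *image, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    size_t hits = 0;
    size_t i;

    if (n == 0 || n > len)
        return 0;
    for (i = 0; i + n <= len; i++) {
        if (memcmp(image + i, needle, n) == 0) {
            memset(image + i, ' ', n);  /* same length, image stays valid */
            hits++;
            i += n - 1;                 /* skip past the blanked region */
        }
    }
    return hits;
}
```

Because the overwrite keeps the length unchanged, the rest of the image is untouched, which is why this kind of mark is so cheap to remove.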

13 Placement of Watermark
Watermarking functions and subroutines are added at the source code stage. Nothing is done after compilation.
The watermark subroutines are placed in various parts of the program by hand.
– Each program and its functions are unique enough that this cannot be done automatically.
– The only way this can be done automatically is if specific standards, such as a naming convention, are set and followed by the programmer when writing the source code.
– The disadvantage of automated watermark insertion would be its consistency, which makes the watermark easier to find.
The programmer must decide how to place the watermarking subroutines.
– The functions are mixed in with functions that are required by the program.
– The watermark functions are renamed to names that look similar to the names of the program’s required functions.
Watermarking subroutines need to be integrated into each unique program, and programs vary quite a bit.
In the diagram on the next slide, the yellow word “watermark” represents an access to a watermark subroutine or variable.

14 Place of Watermark - Diagram

15 Experimental Results
Used John the Ripper v1.6 as the platform.
Early versions of the watermark were easily found with the Unix “nm” command.
Fixes included:
– Stripping the symbol table from the compiled binaries (“-s” option)
– Mixing watermark and non-watermark functions in the source code
– Including tamperproofing
– Putting the conditional statement that reaches the watermark further down in the main code of the program, and requiring execution by a different name
The size and execution time of the watermarked and non-watermarked versions differ very little.

16 Strength Standards for Watermarks
Three protection mechanisms for software watermarks
– Randomization
– Obfuscation
– Tamperproofing
Randomization
– Weave the watermark into the code, that is, mix the watermark functions with program functions in the source code, making it harder to compare watermarked and non-watermarked versions.
– My watermark was randomized by taking different watermarking subroutines and placing them in different parts of the watermarked program.

17 Strength (cont’d)
Obfuscation
– Dynamic and static opaque predicates, where an opaque predicate is defined here as a conditional statement that is triggered in order to show the watermark
– Variable splitting and merging, such as splitting x into x1, x2 and merging y1, y2 into y
– Renaming, which is changing the names of variables
– Renaming variables is moot once symbols are stripped from the binary, since variables are then referred to by address
– My watermark was obfuscated by using static predicates and by renaming the function and variable names so that they “blend” in with other parts of the program.
Tamperproofing
– The program depends on the watermark to function
– Check a hash or CRC value of the program
– Parent pointer, which is the point of origin for the binary tree
– Check the form of the watermark, such as whether it is still a PPCT or has been changed into a simple binary tree
– My program depends on the watermark in order to function.
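Two of the mechanisms listed above, a static opaque predicate and a program that depends on the watermark to function, can be sketched together in C. This is an illustration under stated assumptions, not the experimental code: `opaque_true`, `wm_count`, `wm_bias` and `total` are hypothetical names, and the intact watermark is assumed to be a four-node tree.

```c
#include <stddef.h>

struct wm_node {
    struct wm_node *left;
    struct wm_node *right;
};

/* Static opaque predicate: x * (x + 1) is a product of two consecutive
 * integers, so it is always even (unsigned wraparound keeps parity).
 * The test is always true, but that is not obvious at a glance. */
static int opaque_true(unsigned x)
{
    return (x * (x + 1u)) % 2u == 0u;
}

/* Number of nodes reachable from the watermark root. */
static size_t wm_count(const struct wm_node *n)
{
    if (n == NULL)
        return 0;
    return 1 + wm_count(n->left) + wm_count(n->right);
}

/* Form check: zero only while the tree still has its assumed four nodes. */
static long wm_bias(const struct wm_node *root)
{
    return (long)wm_count(root) - 4;
}

/* Ordinary program logic that silently depends on the watermark:
 * a damaged tree skews every term of the sum. */
static long total(const long *v, size_t len, const struct wm_node *root)
{
    long sum = 0;
    long bias = wm_bias(root);
    size_t i;

    if (opaque_true((unsigned)len)) {   /* branch is always taken */
        for (i = 0; i < len; i++)
            sum += v[i] + bias;
    }
    return sum;
}
```

With the tree intact, `total` returns the plain sum of the inputs. Removing a node changes the bias, so the program keeps running but produces wrong results, which is the intent of making it depend on the watermark.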

18 Attacking my Watermark
To attack my watermark, the attacker will need to:
– Find out the type of watermark, which can be discovered only by using a debugger, setting the special variable (raw memory location), and then executing the program
– Use a debugger or decompiler to go through the binary code
The Reverse Engineering Compiler (REC), found at www.backerstreet.com/rec/rec.html, was used.
Decompilers for GNU C/C++ are crude, and the source code derived from them is very large.
The original source is about 390 lines; the decompiled source was about 26 thousand lines.
– A rough estimate of the time needed to break the watermark is more than an 8-hour day

19 Comparison to Other Watermarks
Benchmarked against Dr. Collberg’s watermark and Dr. Palsberg’s watermark (Java Wiz).
My watermark contains all 3 elements: randomization, obfuscation and tamperproofing.
Dr. Collberg’s and Dr. Palsberg’s watermarks contain only 2 elements, randomization and obfuscation. There is no tamperproofing in their watermarks.
On randomization, both Dr. Collberg and I use weaving and unusual means to access the watermark, whereas Dr. Palsberg uses only weaving.
On obfuscation, Dr. Collberg’s watermark and mine differ quite a bit.
– Dr. Collberg’s watermark uses static opaque predicates, padding, variable split/merge, renaming, method inlining/outlining, and modification of the inheritance hierarchy
– Mine uses only static opaque predicates
– Dr. Palsberg’s uses only dynamic opaque predicates

20 Conclusion
Image and audio watermarking are well established.
Software watermarking is a relatively new field with a lot of potential to be explored.
The technology has the potential to become a cat-and-mouse game between pirates and software authors/owners.
My watermark is one of many steps towards perfecting the field of software watermarking. It is not the end of the research and work once and for all, just a stepping stone towards the never-ending, elusive goal of the ultimate software watermark.
END

