PZAPR: Parallel Zip Archive Password Recovery
CSCI High Perf Sci Computing, Univ. of Colorado, Spring 2011
Neelam Agrawal, Rodney Beede, Yogesh Virkar
Topics
  o The Team
  o Introduction
  o Framework
  o Brute Force
  o Dictionary
  o Password Verification Process
  o Data Collection
  o Results & Conclusions
  o Questions
Introduction
ZipCrypto was the first ZIP encryption scheme
  o Easily defeated
AES-256
  o Standard
  o Integrated into the ZIP spec in 2003
Password recovery for ZIP files is not new
  o Proprietary companies
Open source solution
  o Free (if you have the hardware)
Framework
MPI with C++ & C
3 Components
  o Password Generator
      Brute Force
      Dictionary
  o Password Verification
Command Parameters
  o Log Path
  o Zip Path
  o Method (BRUTE | DICTIONARY)
  o Dictionary Path
Framework (cont)
Initialize decrypt engine
Initialize password generator
Loop
  o Next Password (BRUTE | DICTIONARY)
  o No more? YES ==> END
  o AttemptPassword()
  o Correct? YES ==> Tell everyone else I found it ==> END
  o Anyone else find it? YES ==> END, NO ==> next password
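The flowchart can be sketched as a single loop per process. This is only an illustration: SearchHooks, run_search, and the three hook names are hypothetical stand-ins for the password generator, the decrypt engine, and the inter-process notification, and how that notification is carried out (for example with MPI messages) is left to the hook.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical hooks standing in for the boxes in the flowchart.
struct SearchHooks {
    // Writes the next candidate into `out`; returns false when no more remain.
    std::function<bool(std::string& out)> next_password;
    // Returns true when the candidate is the correct password.
    std::function<bool(const std::string& candidate)> attempt_password;
    // Reports this process's result and returns true if any process succeeded.
    std::function<bool(bool found_locally)> anyone_found;
};

// Returns true and fills `password` only when this process found the password.
bool run_search(const SearchHooks& hooks, std::string& password) {
    assert(hooks.next_password && hooks.attempt_password && hooks.anyone_found);
    std::string candidate;
    while (hooks.next_password(candidate)) {           // NO MORE? -> END
        const bool correct = hooks.attempt_password(candidate);
        const bool someone_found = hooks.anyone_found(correct);
        if (correct) {                                 // Correct? YES
            password = candidate;
            return true;
        }
        if (someone_found) {                           // Anyone else find it? YES
            return false;
        }
    }
    return false;
}
```

Passing the generator as a hook keeps the loop identical for the BRUTE and DICTIONARY methods.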
Brute Force
All alphanumeric passwords from 1 to 7 characters long
  o 0-9, A-Z, a-z
  o 62 possible characters
3,579,345,993,194 possible passwords
  o 62^7 + 62^6 + ... + 62^1
Traditional increment
  o 'a' + 1 ==> 'b'
  o 'az' + 1 ==> 'b0'
  o Not feasible in parallel: each password depends on the previous one
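The password count on this slide is the sum of 62^k for k = 1 to 7. A minimal sketch that reproduces it; keyspace_size is a hypothetical helper name, and the 64-bit arithmetic is only safe while the total fits in an unsigned 64-bit integer (true for 62 characters up to length 10).

```cpp
#include <cassert>
#include <cstdint>

// Counts all passwords of length 1..max_len over an alphabet of
// alphabet_len characters: alphabet_len^1 + ... + alphabet_len^max_len.
std::uint64_t keyspace_size(std::uint64_t alphabet_len, unsigned max_len) {
    assert(alphabet_len >= 1);
    std::uint64_t total = 0;
    std::uint64_t power = 1;
    for (unsigned len = 1; len <= max_len; ++len) {
        power *= alphabet_len;  // now alphabet_len^len
        total += power;
    }
    return total;
}
```

Calling keyspace_size(62, 7) gives the 3,579,345,993,194 quoted above.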
Brute Force - Algorithm
Pick a number from 1 to about 3.58 trillion (the number of possible passwords)
  o Called position
Know the password at that position without incrementing
The Algorithm:
f(position) = factor 1 * (ALPHA_LEN)^(n - 1)
            + factor 2 * (ALPHA_LEN)^(n - 2)
            + ...
            + factor n-1 * (ALPHA_LEN)^(n - (n-1))
            + factor n * (ALPHA_LEN)^(n - n)
Brute Force - Algorithm (cont)
ALPHA_LEN => Alphabet length
  o Number of possible characters
  o 62 (easy to expand)
Brute Force - Algorithm (cont)
n = PASSWORD LENGTH
  o Start at the maximum possible (7)
  o Based on position, find the longest length whose total password count (all passwords up to that length) is < position
  o Password length is 1 more than that length
Brute Force - Algorithm (cont)
factor i is the ith character of the password
  o No factor can be zero
  o Must borrow from the previous factor if zero
factor i points to an alphabet array index
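The length rule can be sketched as follows. This is an illustration under assumptions: password_length is a hypothetical name, and the loop counts up from length 1 instead of starting at the maximum as the slide describes, which gives the same answer.

```cpp
#include <cassert>
#include <cstdint>

// Returns the length n of the password found at a 1-based position.
unsigned password_length(std::uint64_t position, std::uint64_t alphabet_len = 62) {
    assert(position >= 1);
    unsigned length = 1;
    std::uint64_t block = alphabet_len;  // passwords with exactly `length` characters
    std::uint64_t covered = block;       // passwords with at most `length` characters
    while (covered < position) {
        block *= alphabet_len;
        covered += block;
        ++length;
    }
    return length;
}
```

For example, positions 1 to 62 have length 1, and positions 63 to 3,906 have length 2.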
Brute Force - Example
position = 1,000,000
ALPHA_LEN = 62
n = 4 (password length)
f(1,000,000) = factor 1 * (62)^(3)
             + factor 2 * (62)^(2)
             + factor 3 * (62)^(1)
             + factor 4 * (62)^(0)
factors = 4, 12, 9, 2
Brute Force - Example (cont)
factors = 4, 12, 9, 2
  o Correspond to alphabet indexes
const char PASSWORD_ALPHABET[] = {
    '\0',  // always idx 0
    '0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
    'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
    'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
    'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'
};
PASSWORD = '3', 'B', '8', '1' or "3B81"
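The worked example can be reproduced with a short conversion routine. This is a sketch, not the project's code: position_to_password is a hypothetical name, the factors are extracted from the last character backwards, and the alphabet is stored without the '\0' placeholder, so factor i maps to kAlphabet[i - 1].

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Converts a 1-based position into its password. Factors run from 1 to 62;
// a zero remainder becomes 62 and borrows one from the next higher factor.
std::string position_to_password(std::uint64_t position) {
    static const char kAlphabet[] =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    const std::uint64_t kAlphaLen = 62;
    assert(position >= 1);
    std::string password;
    while (position > 0) {
        std::uint64_t factor = position % kAlphaLen;
        position /= kAlphaLen;
        if (factor == 0) {       // no factor can be zero
            factor = kAlphaLen;  // use the last character instead
            position -= 1;       // and borrow from the previous factor
        }
        password.insert(password.begin(), kAlphabet[factor - 1]);
    }
    return password;
}
```

With position = 1,000,000 the loop extracts the factors 2, 9, 12, 4 in that order, which spell "3B81" once reversed into place. Because each position is converted independently, processes can work on disjoint sets of positions without sharing state.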
Dictionary Attack Mode
Defeating a cipher or authentication mechanism by
  o Searching likely possibilities
  o i.e. searching only part of the key space
Not brute force
Assumption: potentially weak passwords
Building Dictionary
Tool used: John the Ripper
  o Permutations
  o Combinations
Command
  o john --wordlist=all.lst --rules --stdout | unique mangled.lst
Building Dictionary (2)
Rules
  o l (convert to lowercase)
  o C (lowercase the first character, and uppercase the rest)
  o r (reverse: "Fred" ==> "derF")
  o f (reflect: "Fred" ==> "FredderF")
  o d (duplicate: "Fred" ==> "FredFred")
  o and many more!
Time to permute: a little over 4 hours
Newer versions
  o John the Ripper supports OpenMP directives (Source: openwall.info/wiki/john/parallelization)
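To make the five listed rules concrete, here is a small reimplementation of their effect on a single word. It is an illustration only, not John the Ripper's code; apply_rule is a hypothetical helper that covers just the rule letters shown on the slide.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Applies one of the rule letters from the slide to a word.
std::string apply_rule(char rule, std::string word) {
    switch (rule) {
    case 'l':  // convert to lowercase
        for (char& c : word)
            c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
        return word;
    case 'C':  // lowercase the first character, uppercase the rest
        for (std::size_t i = 0; i < word.size(); ++i) {
            const unsigned char c = static_cast<unsigned char>(word[i]);
            word[i] = static_cast<char>(i == 0 ? std::tolower(c) : std::toupper(c));
        }
        return word;
    case 'r':  // reverse
        std::reverse(word.begin(), word.end());
        return word;
    case 'f':  // reflect: the word followed by its reverse
        return word + std::string(word.rbegin(), word.rend());
    case 'd':  // duplicate
        return word + word;
    default:
        assert(false && "rule letter not covered by this sketch");
        return word;
    }
}
```

Each input word yields one extra candidate per rule, which is why the mangled list is much larger than the original word list.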
Reading the Dictionary: Initialization
Reading the Dictionary: Indexing
Indexing uses
  o displacement array
  o rank
  o per-process word count
Load is evenly distributed
  o E.g. n = 103 words, m = 10 processes
  o n / m = 103 / 10 = 10 (integer division)
  o n % m = 103 % 10 = 3
  o rank 0: 11 words
  o rank 1: 11 words
  o rank 2: 11 words
  o ranks 3-9: 10 words each
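The even split can be sketched as below. WordRange and partition_words are hypothetical names, and the assumption is that the displacement of a rank is the index of its first word in the dictionary: the first n % m ranks each take one extra word.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Slice of the dictionary owned by one rank.
struct WordRange {
    std::size_t displacement;  // index of the first word for this rank
    std::size_t count;         // per-process word count
};

// Splits n words across m processes as evenly as possible.
std::vector<WordRange> partition_words(std::size_t n, std::size_t m) {
    assert(m > 0);
    std::vector<WordRange> ranges(m);
    const std::size_t base = n / m;   // words every rank receives
    const std::size_t extra = n % m;  // ranks that receive one more
    std::size_t next = 0;
    for (std::size_t rank = 0; rank < m; ++rank) {
        ranges[rank].count = base + (rank < extra ? 1 : 0);
        ranges[rank].displacement = next;
        next += ranges[rank].count;
    }
    return ranges;
}
```

For n = 103 and m = 10 this gives ranks 0 to 2 eleven words each and ranks 3 to 9 ten words each, with rank 3 starting at word index 33.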
Requirements for Cracking a Zip File
Zip file format
Extracting information from the zip file
Verifying the password
Decrypting the file data
Used Dr. Brian Gladman's code
  o C library for AES encryption
  o Used by WinZip
Zip file format
HEADER | FILE NAME | EXTRA FIELD | SALT | PASSWORD VERIFIER | ENCRYPTED FILE DATA | AUTHENTICATION CODE (MAC)
Password Verification Process
Inputs: zip file, given password
  o Read the salt and the stored password verifier from the zip file
  o Derive a password verifier from the given password and the salt
  o Compare it with the stored password verifier
  o No match ==> Return False
  o Match ==> Decrypt the data and compute its MAC
  o Compare it with the MAC stored in the zip file
  o No match ==> Return False
  o Match ==> Return True
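The two-stage check can be sketched as follows. Everything named here is an assumption for illustration: ZipEntry, CryptoHooks, and attempt_password are hypothetical, and the actual key derivation, AES decryption, and MAC computation (provided by the AES library in the project) are passed in as hooks instead of being implemented.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Fields extracted from the zip file (hypothetical container).
struct ZipEntry {
    std::string salt;
    std::string password_verifier;
    std::string encrypted_data;
    std::string mac;
};

// Hypothetical stand-ins for the real crypto routines.
struct CryptoHooks {
    // Derives the password verifier from a candidate password and the salt.
    std::function<std::string(const std::string& password,
                              const std::string& salt)> derive_verifier;
    // Decrypts the data with the candidate and returns the computed MAC.
    std::function<std::string(const std::string& password,
                              const ZipEntry& entry)> decrypt_and_mac;
};

// Returns true only if both the verifier and the MAC match.
bool attempt_password(const ZipEntry& entry, const std::string& password,
                      const CryptoHooks& hooks) {
    assert(hooks.derive_verifier && hooks.decrypt_and_mac);
    // Stage 1: cheap check that rejects most wrong passwords early.
    if (hooks.derive_verifier(password, entry.salt) != entry.password_verifier) {
        return false;
    }
    // Stage 2: expensive check, reached only when the verifier matches.
    return hooks.decrypt_and_mac(password, entry) == entry.mac;
}
```

Ordering the cheap comparison first means the decryption step runs only for the few candidates that pass the verifier check.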
Speed ups
Reducing file handling operations
Quick 2-byte check (the password verifier)
Parallel implementation on GPU
Data Collection & Testing
Frost
  o 32-bit, 700 MHz, 512 MB RAM
Janus
  o 64-bit, 2.8 GHz, 2 GB RAM
  o Ran in 32-bit mode
Test Types
  o Brute and Dictionary
  o Nodes: 128, 1024, 2048, 4096
  o Password positions: First, Middle, Last, Never
Model
  o Passwords / time unit for X nodes
  o Time to solution for X nodes
Results (Estimated Time: Brute, Janus )
Results (Estimated Time: Brute, Janus vs Frost)
Results (Estimated Time: Dictionary, Janus )
Results (Estimated Time: Dict., Janus vs Frost)
Conclusions
Max throughput (Janus)
  o Brute = 172 passwords / second per processor
  o Dictionary = 86 passwords / second per processor
Brute (Janus)
  o 7 alphanumeric = 60 days with 4096 processors
  o 8 alphanumeric = 9.9 years with 4096 processors
  o 10 alphanumeric = years with 4096 processors
Dictionary (Janus)
  o 1 billion = 47.3 minutes with 4096 processors
  o 100 billion = hours with 4096 processors
Conclusion
  o Choose good passwords
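The time estimates follow from dividing the number of candidates by the aggregate throughput. A minimal sketch, assuming the throughput figures are per processor and scale linearly with processor count (an idealization); estimated_seconds is a hypothetical helper name.

```cpp
#include <cassert>

// Estimated worst-case time in seconds to try every candidate, assuming
// each processor sustains rate_per_processor passwords per second.
double estimated_seconds(double candidates, double rate_per_processor,
                         double processors) {
    assert(rate_per_processor > 0 && processors > 0);
    return candidates / (rate_per_processor * processors);
}
```

For example, estimated_seconds(1e9, 86, 4096) / 60 is about 47.3 minutes, matching the dictionary figure, and the 7-character brute force key space at 172 passwords per second comes out near 59 days, in line with the roughly 60 days quoted.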
Questions?