Computer System Basics 1 Number Systems & Text Representation Computer Forensics BACS 371
Computer System Basics Number Systems Decimal (base 10) Binary (base 2) Octal (base 8) Hexadecimal (base 16) Conversions Little Endian vs. Big Endian Text Representation ASCII EBCDIC Unicode
Number Systems Decimal – base 10 Binary – base 2 Octal – base 8 Hexadecimal – base 16
Decimal Number System
Base 10
Uses digits 0~9
Based on powers of 10 (place values 100,000  10,000  1,000  100  10  1)
Example: 327,194
  3 * 10^5 = 300,000
  2 * 10^4 =  20,000
  7 * 10^3 =   7,000
  1 * 10^2 =     100
  9 * 10^1 =      90
  4 * 10^0 =       4
  TOTAL    = 327,194
Binary Number System
Base 2
Uses digits 0~1
Based on powers of 2 (place values 32  16  8  4  2  1)
Example: 110101
  1 * 2^5 = 32
  1 * 2^4 = 16
  0 * 2^3 =  0
  1 * 2^2 =  4
  0 * 2^1 =  0
  1 * 2^0 =  1
  110101 (base 2) = 53 (base 10)
Octal Number System
Base 8
Uses digits 0~7
Based on powers of 8 (place values 4,096  512  64  8  1)
Example: 70265
  7 * 8^4 = 28,672
  0 * 8^3 =      0
  2 * 8^2 =    128
  6 * 8^1 =     48
  5 * 8^0 =      5
  70265 (base 8) = 28,853 (base 10)
Hexadecimal Number System
Base 16
Uses digits 0~9 and A, B, C, D, E, F
Based on powers of 16 (place values 1,048,576  65,536  4,096  256  16  1)
Example: 3F7A0E
  3 * 16^5 = 3,145,728
  F * 16^4 =   983,040
  7 * 16^3 =    28,672
  A * 16^2 =     2,560
  0 * 16^1 =         0
  E * 16^0 =        14
  3F7A0E (base 16) = 4,160,014 (base 10)
Base 10  Base 16
10       A
11       B
12       C
13       D
14       E
15       F
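The place-value expansions on the number system slides all follow one rule: multiply each digit by the base raised to its position, then add the products. A minimal Python sketch of that rule follows. The function name `to_decimal` and the `DIGITS` lookup string are illustrative choices, not part of the course material.

```python
# Digit symbols for bases up to 16; A-F stand for 10-15.
DIGITS = "0123456789ABCDEF"

def to_decimal(text, base):
    """Sum digit * base**position, counting positions from the right."""
    total = 0
    for position, digit in enumerate(reversed(text.upper())):
        value = DIGITS.index(digit)  # raises ValueError for unknown symbols
        if value >= base:
            raise ValueError(f"digit {digit!r} is not valid in base {base}")
        total += value * base ** position
    return total

print(to_decimal("327194", 10))  # 327194
print(to_decimal("110101", 2))   # 53
print(to_decimal("70265", 8))    # 28853
print(to_decimal("3F7A0E", 16))  # 4160014
```

Python's built-in `int("3F7A0E", 16)` performs the same conversion. The explicit loop is shown only because it mirrors the by-hand method.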
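Number System Comparison
Decimal  Binary  Octal  Hexadecimal
0        0000    0      0
1        0001    1      1
2        0010    2      2
3        0011    3      3
4        0100    4      4
5        0101    5      5
6        0110    6      6
7        0111    7      7
8        1000    10     8
9        1001    11     9
10       1010    12     A
11       1011    13     B
12       1100    14     C
13       1101    15     D
14       1110    16     E
15       1111    17     F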
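Going the other way, from decimal to another base, is usually done by hand with repeated division: divide by the base, record the remainder, and repeat until the quotient is zero. The remainders read in reverse order are the digits. The sketch below assumes a hypothetical helper named `from_decimal` and uses it to print a comparison of the values 0 through 15.

```python
DIGITS = "0123456789ABCDEF"

def from_decimal(number, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    if number == 0:
        return "0"
    digits = []
    while number > 0:
        number, remainder = divmod(number, base)
        digits.append(DIGITS[remainder])  # least significant digit comes first
    return "".join(reversed(digits))

print("Decimal  Binary  Octal  Hex")
for n in range(16):
    print(f"{n:<7}  {from_decimal(n, 2):<6}  {from_decimal(n, 8):<5}  {from_decimal(n, 16)}")
```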
Number System Representations
The same value (77 decimal) written in each notation:
Binary: 01001101b
Octal: 115o (note: trailing character is a lowercase ‘oh’)
Hexadecimal: 0x4D (note: leading character is a zero), 4Dh, 4D₁₆
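To illustrate these notations, the sketch below parses the trailing `b`, `o`, and `h` markers and the leading `0x` prefix and shows that they all name the same value. The function `parse_notation` is a hypothetical helper written for this example. It handles only the forms listed on the slide and does not attempt to resolve ambiguous input such as a bare hexadecimal string that happens to end in `b`.

```python
def parse_notation(text):
    """Return the integer named by a slide-style literal such as '115o' or '0x4D'."""
    t = text.strip().lower()
    if t.startswith("0x"):   # leading zero-x marks hexadecimal
        return int(t[2:], 16)
    if t.endswith("h"):      # trailing h marks hexadecimal
        return int(t[:-1], 16)
    if t.endswith("o"):      # trailing lowercase oh marks octal
        return int(t[:-1], 8)
    if t.endswith("b"):      # trailing b marks binary
        return int(t[:-1], 2)
    return int(t, 10)        # no marker: assume decimal

for literal in ["01001101b", "115o", "0x4D", "4Dh"]:
    print(literal, "=", parse_notation(literal))

# Python's own literal prefixes and format codes for the same value.
print(0b1001101, 0o115, 0x4D)
print(format(77, "b"), format(77, "o"), format(77, "X"))
```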
Little Endian vs. Big Endian
Deals with the order in which the bytes of a multi-byte value are stored in Intel-based versus non-Intel-based computers.
Intel-based are normally PC-type computers
Non-Intel-based are normally mainframe computers
Little Endian: the least significant byte is stored first, at the lowest address (Intel-based), so 0x12345678 is stored as 78 56 34 12
Big Endian: the most significant byte is stored first (mainframe), so 0x12345678 is stored as 12 34 56 78
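Byte order is easiest to see by storing one value both ways. The sketch below uses Python's built-in `int.to_bytes` and `int.from_bytes`. The sample value 0x12345678 and the helper `hex_bytes` are chosen only for illustration.

```python
def hex_bytes(data):
    """Format bytes the way a hex editor shows them, lowest address first."""
    return " ".join(f"{b:02X}" for b in data)

value = 0x12345678

little = value.to_bytes(4, byteorder="little")  # least significant byte first
big = value.to_bytes(4, byteorder="big")        # most significant byte first

print(hex_bytes(little))  # 78 56 34 12
print(hex_bytes(big))     # 12 34 56 78

# Reading bytes with the wrong byte order gives a different number.
print(hex(int.from_bytes(little, byteorder="little")))  # 0x12345678
print(hex(int.from_bytes(little, byteorder="big")))     # 0x78563412
```

When reading a hex dump from an Intel-based machine, multi-byte integers therefore appear with their bytes reversed relative to how the number is normally written.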
Text Representations Text values stored in a computer can be in several formats ASCII EBCDIC Unicode (various types) By far, the most common is ASCII
ASCII
ASCII, pronounced "ask-key", is the common code for microcomputer equipment
American Standard Code for Information Interchange
Proposed by ANSI in 1963, and finalized in 1968
The standard ASCII character set consists of 128 decimal numbers ranging from zero through 127 assigned to letters, numbers, punctuation marks, and the most common special characters
The first 32 codes are reserved for “non-printing” or “control” characters, which supported the original teletype systems
The Extended ASCII Character Set also consists of 128 decimal numbers and ranges from 128 through 255 representing additional special, mathematical, graphic, and foreign characters
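The code ranges described above can be checked directly with Python's built-in `ord` and `chr`. In the sketch below, `classify` is a hypothetical helper. It also treats code 127 (DEL) as a control character, which goes slightly beyond the "first 32 codes" wording on the slide.

```python
def classify(code):
    """Place a single-byte code in the ranges described on the ASCII slide."""
    if not 0 <= code <= 255:
        raise ValueError("expected a single-byte value (0-255)")
    if code < 32 or code == 127:
        # Codes 0-31 are the control characters; 127 (DEL) is one as well.
        return "control"
    if code < 128:
        return "printable"
    return "extended"

for ch in ["A", "a", "0", " ", "\n"]:
    code = ord(ch)  # character to code number
    print(repr(ch), code, f"0x{code:02X}", f"{code:08b}", classify(code))

print(chr(0x48), chr(0x69))  # code number back to character: H i
```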
ASCII Table
Extended ASCII Table
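Text Binary Converters
ools/binary.shtml
binary.php
TEXT: Hello World
BINARY: 01001000 01100101 01101100 01101100 01101111 00100000 01010111 01101111 01110010 01101100 01100100
Hex: 48 65 6C 6C 6F 20 57 6F 72 6C 64
Text  Binary    Octal  Hex
H     01001000  110    48
e     01100101  145    65
l     01101100  154    6C
l     01101100  154    6C
o     01101111  157    6F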
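The conversion that such online tools perform can be reproduced in a few lines. The sketch below assumes the text is plain ASCII, and `byte_table` is a hypothetical helper name. It lists each character of "Hello World" with its binary, octal, and hexadecimal code.

```python
def byte_table(text):
    """Return (character, binary, octal, hex) rows for an ASCII string."""
    rows = []
    for byte in text.encode("ascii"):  # raises UnicodeEncodeError on non-ASCII
        rows.append((chr(byte), f"{byte:08b}", f"{byte:03o}", f"{byte:02X}"))
    return rows

rows = byte_table("Hello World")
for char, binary, octal, hexa in rows:
    print(f"{char!r:5} {binary}  {octal}  {hexa}")

print(" ".join(row[3] for row in rows))  # 48 65 6C 6C 6F 20 57 6F 72 6C 64
```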
WinHex View
EBCDIC Extended Binary Coded Decimal Interchange Code Originally used by IBM mainframes Totally different encoding scheme from ASCII and Unicode Still used, but not as prevalent as in the past
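To see how different EBCDIC is from ASCII, the same text can be encoded both ways. The sketch below uses Python's built-in `cp037` codec, one of several EBCDIC code pages. Choosing `cp037` is an assumption made for illustration, since real evidence may use a different code page.

```python
def hex_bytes(data):
    return " ".join(f"{b:02X}" for b in data)

text = "Hello"

ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")  # cp037: an EBCDIC code page (assumed here)

print("ASCII :", hex_bytes(ascii_bytes))   # 48 65 6C 6C 6F
print("EBCDIC:", hex_bytes(ebcdic_bytes))  # C8 85 93 93 96

# Decoding with the matching codec recovers the text.
print(ebcdic_bytes.decode("cp037"))        # Hello
```

Bytes that look like meaningless extended characters when viewed as ASCII may simply be EBCDIC text, so trying an EBCDIC decode is a reasonable step when mainframe data is suspected.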
Unicode
Character coding standard used in NTFS
“Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”
Three varieties of Unicode Transformation Format
UTF-8: 8-bit units, 1 to 4 bytes per character; identical to ASCII for the first 128 characters
UTF-16: 16-bit units, 2 or 4 bytes per character
UTF-32: 32 bits (4 bytes) per character
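The size difference between the three transformation formats can be measured by encoding a few characters. In the sketch below the sample characters and the helper `encoded_sizes` are illustrative. The little-endian codec names (`utf-16-le`, `utf-32-le`) are used so that Python does not prepend a byte order mark to the output.

```python
# Sample characters, written as escapes so the source file stays plain ASCII.
samples = {
    "Latin capital A": "A",
    "e with acute": "\u00e9",
    "euro sign": "\u20ac",
    "emoji (outside the 16-bit range)": "\U0001F600",
}

def encoded_sizes(ch):
    """Return the number of bytes one character needs in each format."""
    return {name: len(ch.encode(name)) for name in ("utf-8", "utf-16-le", "utf-32-le")}

for label, ch in samples.items():
    print(f"{label:34} {encoded_sizes(ch)}")

# For the first 128 characters, UTF-8 and ASCII produce identical bytes.
print("Hello".encode("utf-8") == "Hello".encode("ascii"))  # True
```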
Why do we care? As a forensic analyst, you will be working with different number systems and encoding schemes. You need to understand how to convert between the different number systems and, if necessary, perform the conversions by hand. You also need to understand hexadecimal and ASCII well enough to be able to interpret “hex dumps.”
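A hex dump normally shows three columns: the offset of each row, the bytes in hexadecimal, and the same bytes as ASCII with non-printing codes replaced by a dot. The sketch below produces a generic dump in that style. The function `hexdump` and its layout are assumptions made for illustration and do not reproduce the exact format of WinHex or any other tool.

```python
def hexdump(data, width=16):
    """Return dump lines: offset, hex bytes, and printable ASCII."""
    lines = []
    pad = width * 3 - 1  # width of a full hex column, used to align short rows
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII (32-126) as-is; everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08X}  {hex_part.ljust(pad)}  {text_part}")
    return lines

sample = b"Hello World\x00\x01 forensic evidence\r\n"
for line in hexdump(sample):
    print(line)
```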
But wait…. There’s more! All the encoding schemes covered only apply to “text” data. There are different encoding methods for other types of digital evidence (e.g., numbers, dates, times, executable programs, …). The computer stores everything as 1’s and 0’s and the way you (and the computer) interpret groups of bits depends upon the context.
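The point about context can be demonstrated by reading one group of four bytes in several ways. The sketch below uses Python's standard `struct` module. The sample bytes are chosen for illustration only.

```python
import struct

raw = bytes.fromhex("48 65 6C 6C")  # four bytes taken from the start of "Hello"

as_text = raw.decode("ascii")               # interpreted as ASCII text
as_le_int = struct.unpack("<I", raw)[0]     # 32-bit unsigned int, little endian
as_be_int = struct.unpack(">I", raw)[0]     # 32-bit unsigned int, big endian
as_le_float = struct.unpack("<f", raw)[0]   # 32-bit float, little endian

print("text:               ", as_text)         # Hell
print("little-endian int:  ", hex(as_le_int))  # 0x6c6c6548
print("big-endian int:     ", hex(as_be_int))  # 0x48656c6c
print("little-endian float:", as_le_float)
```

None of these readings is more "correct" than the others. The file format or data structure the bytes came from determines which interpretation applies.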