Data Representation
Data representation
- Encoding
- Analogue v. digital
- Sampling
- File systems (drives, folders, naming)
Data representation
- Humans work with:
  - Text (plain, formatted)
  - Numbers (integers, fractions, big and small)
  - Graphics
  - Video
  - Sound
- Computers use 1 or 0
- How do we resolve this mismatch?
Data encoding What does this say? How do we know? ●●● — — — ●●● How do we know? Information has been encoded
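The dot and dash pattern on this slide can only be read by someone who shares the encoding table. A minimal sketch, assuming the pattern is Morse code and using a deliberately tiny, hypothetical lookup table (MORSE_TO_LETTER and decode_morse are illustrative names, not part of the original slides):

```python
# Hypothetical two-entry table: just enough Morse to read the slide's pattern.
# Dots are written as "." and dashes as "-", with spaces between letters.
MORSE_TO_LETTER = {"...": "S", "---": "O"}


def decode_morse(message: str) -> str:
    """Decode space-separated Morse groups using the shared table."""
    return "".join(MORSE_TO_LETTER[group] for group in message.split())


print(decode_morse("... --- ..."))  # SOS
```

Without the table the marks are just marks; sender and receiver must agree on the encoding before any information gets across.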
Encoding
- Finding a way to represent data so that it can be handled by computers
- Many different encoding methods, e.g.
  - Text – ASCII, Unicode etc.
  - Numbers – integers, fixed point, floating point
  - Pictures – 2-bit, 8-bit, 16-bit, 24-bit colour and different file formats
  - Sound – WAV, MP3, OGG etc.
  - Video – AVI, MPG, MOV etc.
Encoding examples - ASCII
- American Standard Code for Information Interchange
- Represents English-language characters using numbers
- Originally 7 bits per character -> 128 options
- Alphanumeric and control characters (e.g. BS = backspace)
- 8th bit often used for parity checking
- Extended ASCII uses 8 bits = 256 options
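The ASCII points above can be tried out directly. This is a small sketch, not a definitive implementation: with_even_parity is a hypothetical helper, and even parity is an assumption (the slides do not say whether odd or even parity is meant).

```python
def with_even_parity(ch: str) -> int:
    """Return an 8-bit value: 7-bit ASCII code plus an even-parity 8th bit."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    ones = bin(code).count("1")
    parity_bit = ones % 2  # set only when the 7-bit code has an odd number of 1s
    return (parity_bit << 7) | code


print(2 ** 7, 2 ** 8)  # 128 256: options with 7 bits and with 8 bits
print(ord("\b"))       # 8: BS (backspace) is a control character

for ch in "Hi!":
    # character, decimal code, 7-bit pattern, 8-bit pattern with parity
    print(ch, ord(ch), format(ord(ch), "07b"), format(with_even_parity(ch), "08b"))
```

With even parity, every transmitted byte carries an even number of 1s, so a single flipped bit can be detected (though not located or corrected).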
ASCII code (7 bit)
Encoding examples - Unicode
- ASCII is just one of many encoding schemes
- ASCII only supports the basic Latin alphabet
- Other encoding schemes are required to support different alphabets, accented letters, mathematical symbols etc.
- Having many schemes leads to problems of incompatibility
- Unicode aims to replace all “legacy” character sets
Encoding examples - Unicode
- Unicode provides a unique number for every character
  - no matter what the platform
  - no matter what the program
  - no matter what the language
- Stored using 8-, 16- or 32-bit code units (UTF-8, UTF-16, UTF-32); a single character may need more than one unit
- Currently at version 8.0
- Full details available at the Unicode website: www.unicode.org
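The difference between a character's unique number (its code point) and the bytes used to store it can be seen with a short sketch. The function name encoded_sizes and the four sample characters are chosen for illustration only.

```python
def encoded_sizes(ch: str) -> tuple:
    """Return the byte length of one character in UTF-8, UTF-16 and UTF-32."""
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),  # "-le" variants omit the byte-order mark
        len(ch.encode("utf-32-le")),
    )


# Sample characters written as escapes: A, e-acute, euro sign, an emoji.
for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

The code point is the same on every platform; only the number of bytes varies with the chosen encoding form.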
Unicode 8.0 code charts
Numbers
- Numbers are a fundamental part of life
- Many different forms
  - 1, 2, 3, 4, 5 …
  - £17.99
  - 47%
  - Earth to Sun: 149.6×10^9 m
  - ½, ¾
  - √2
  - Negative numbers
Numbers
- Computers can only use 0 or 1 => binary (base 2)
- Encoding schemes are needed to represent all other types of number in binary
- Next week we will look at
  - Binary: base 2
  - Decimal: base 10
  - Hexadecimal: base 16
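As a preview of the three bases, here is a minimal sketch that writes one whole number in base 2, 10 and 16 by repeated division. to_base is a hypothetical helper written for this illustration; Python's built-in bin(), hex() and int() do the same job.

```python
DIGITS = "0123456789ABCDEF"


def to_base(n: int, base: int) -> str:
    """Write a non-negative whole number in the given base (2 to 16)."""
    if n == 0:
        return "0"
    out = []
    while n > 0:
        n, remainder = divmod(n, base)  # peel off the lowest digit
        out.append(DIGITS[remainder])
    return "".join(reversed(out))       # digits were produced lowest first


print(to_base(47, 2), to_base(47, 10), to_base(47, 16))  # 101111 47 2F
print(int("101111", 2), int("2F", 16))                   # 47 47
```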
Analogue
- A continuously varying signal
- Varies in frequency, amplitude, or both
- A signal that is constantly changing
- Analogue can represent any value within its range
- Accurate BUT vulnerable to noise
Digital
- A signal that changes in discrete steps
- In the simplest case, signal levels are either on or off
- Often thought of as either 1 or 0, especially in computers, but digital can represent many different values using discrete levels
- Because the possible values are limited, digital signals are less vulnerable to noise
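Why limited values help against noise can be shown with a toy example. This is only a sketch: the noise values are made up for illustration, and recover_bits with its 0.5 threshold is a hypothetical receiver, not a description of any real hardware.

```python
def recover_bits(levels, threshold=0.5):
    """Snap each received level back to the nearest allowed value, 0 or 1."""
    return [1 if level > threshold else 0 for level in levels]


sent = [1, 0, 1, 1, 0]
noise = [0.1, -0.2, 0.15, -0.3, 0.25]            # made-up interference
received = [bit + n for bit, n in zip(sent, noise)]

print(received)                # levels are no longer exactly 0 or 1
print(recover_bits(received))  # [1, 0, 1, 1, 0], the same as what was sent
```

An analogue receiver has no way to tell which part of the received level is signal and which is noise; the digital receiver can discard the noise as long as it stays below the threshold.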
Analogue to digital conversion
- Most things we deal with are analogue, e.g.
  - Colours
  - Sounds
  - Time
  - Weights
  - Lengths etc.
- Computers are digital ...
Sampling
Sampling
- Most common way to convert analogue data into digital form
- Take measurements at set intervals (time or space)
- Measure the value of the analogue source to a defined degree of accuracy => resolution
- Higher resolution gives a more accurate result but increases storage
- Store the digitised values
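The sampling steps can be sketched in a few lines. The names sample_and_quantise and sine_1hz are hypothetical, and the signal is assumed to stay between -1 and 1; real converters differ in many details.

```python
import math


def sample_and_quantise(signal, duration_s, sample_rate_hz, bits):
    """Measure signal(t) at set intervals and store each value as a whole-number level."""
    levels = 2 ** bits                     # number of values each sample can take
    n_samples = int(duration_s * sample_rate_hz)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz             # time of this measurement
        value = signal(t)                  # assumed to lie in the range -1 to 1
        samples.append(round((value + 1) / 2 * (levels - 1)))
    return samples


def sine_1hz(t):
    """Stand-in analogue source: one full wave per second."""
    return math.sin(2 * math.pi * t)


# 8 samples per second at 3 bits per sample: levels run from 0 to 7.
print(sample_and_quantise(sine_1hz, 1.0, 8, 3))
```

Raising sample_rate_hz or bits tracks the original wave more closely, at the cost of more stored numbers or more bits per number.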
Resolution
- Two aspects to resolution
  - Sampling rate
  - No. of possible values per data point
- Sampling rate examples
  - Dots per inch -> pictures
  - Sampling frequency -> sounds
- Values per data point examples
  - 16-bit, 24-bit, 32-bit colour -> graphics
  - Bit rate -> MP3 files
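Both aspects of resolution multiply into the storage needed for uncompressed data. A rough sketch, with audio_bytes and image_bytes as hypothetical helpers and the example figures (CD-style audio, a 6 x 4 inch picture at 300 dots per inch) chosen purely for illustration:

```python
def audio_bytes(sample_rate_hz, bits_per_sample, channels, seconds):
    """Uncompressed sound: samples per second x bits per sample x channels x time."""
    return sample_rate_hz * bits_per_sample * channels * seconds // 8


def image_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed picture: number of dots x bits per dot."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels * bits_per_pixel // 8


# One minute of stereo sound at 44,100 samples per second, 16 bits each.
print(audio_bytes(44_100, 16, 2, 60))  # 10584000 bytes

# A 6 x 4 inch picture at 300 dots per inch in 24-bit colour.
print(image_bytes(6, 4, 300, 24))      # 6480000 bytes
```

Formats such as MP3 store far less than this by compressing the data, which is a separate step from sampling.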
End result?
- All these forms of information end up as 1s and 0s…
- which have to be stored or transmitted…
File systems
- Each operating system has facilities to manage files in various ways, e.g.
  - Create
  - Move
  - Copy
  - Delete
  - Rename
  - Open
  - Execute
File systems - Windows
- Each drive has a letter, e.g. C:
- Files are organised in folders (used to be called directories)
- Top level of each drive is the root folder, e.g. C:\
- File names are not case sensitive: MyFile.txt = myfile.txt
- Some characters are not allowed as they have special meanings: \ / ? : * " > < |
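The two naming rules can be expressed as simple checks. This is a sketch of the rules as stated on the slide only: is_valid_name and same_file_name are hypothetical helpers, Windows applies further rules not covered here, and casefold() is just an approximation of how Windows compares names.

```python
# The nine characters listed above as not allowed in Windows file names.
FORBIDDEN = set('\\/?:*"<>|')


def is_valid_name(name: str) -> bool:
    """True if the name is non-empty and uses none of the forbidden characters."""
    return bool(name) and not any(ch in FORBIDDEN for ch in name)


def same_file_name(a: str, b: str) -> bool:
    """Compare two names the case-insensitive way (approximation)."""
    return a.casefold() == b.casefold()


print(is_valid_name("MyFile.txt"))                 # True
print(is_valid_name("what?.txt"))                  # False: contains ?
print(same_file_name("MyFile.txt", "myfile.txt"))  # True
```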
File systems - Windows
- Files generally have an extension which tells Windows what type of file it is, e.g.
  - myfile.txt = plain text file
  - myfile.doc = Word document
- Windows maintains a database of file extensions and the program that opens each type
File systems - Windows
- File names must be unique within a folder but can be duplicated in different folders
- Full name of a file includes the path to find it
  - Starts with the drive letter
  - Folders are separated by the \ character
  - e.g. D:\documents\college work\Year 1\CS\PC hardware.pptx
- Note: spaces are allowed but can cause problems…
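The parts of the example path can be picked apart with Python's standard pathlib module. PureWindowsPath only analyses the text of the path, so this sketch runs on any operating system and does not need the file to exist.

```python
from pathlib import PureWindowsPath

# The example path from the slide, as a raw string so \ is kept literally.
path = PureWindowsPath(r"D:\documents\college work\Year 1\CS\PC hardware.pptx")

print(path.drive)        # D:
print(path.parts)        # root, then each folder, then the file name
print(path.parent.name)  # CS (the folder that holds the file)
print(path.name)         # PC hardware.pptx
print(path.suffix)       # .pptx (the extension)
```

The spaces in "college work", "Year 1" and "PC hardware" are part of the names; when such a path is typed at a command prompt it usually has to be wrapped in quotes so it is not split at the spaces.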
File systems – Linux / Android
- Based on the Filesystem Hierarchy Standard (FHS)
- No drive letters
- Everything, including physical drives, appears in directories (folders) under the root directory (/)
- Levels in file paths are separated by / (where else have you seen this?)
- File names are case sensitive
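The contrast with Windows paths can be seen using the same pathlib module. The paths below are hypothetical examples invented for this sketch; only their text is analysed, so no real files are needed.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical Linux-style path: no drive letter, everything hangs off /.
linux_path = PurePosixPath("/home/student/Notes.txt")
print(linux_path.root)   # /
print(linux_path.parts)  # ('/', 'home', 'student', 'Notes.txt')

# Case matters on Linux, so these are two different names...
print(linux_path == PurePosixPath("/home/student/notes.txt"))  # False

# ...whereas Windows paths are compared without regard to case.
print(PureWindowsPath(r"C:\Users\Notes.txt") == PureWindowsPath(r"c:\users\notes.txt"))  # True
```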
Ubuntu
- Drives appear as device files under /dev, e.g. sda is the first SATA drive
- Their contents are mounted as folders, e.g. under /mnt
Android
- Android is built on the Linux kernel
- Same file system (ish)
- Often has sdcard/ as a symbolic link to /mnt/sdcard
Apple iOS
- “To keep the system simple, users of iOS devices do not have direct access to the file system and apps are expected to follow this convention.” (iOS Developer Library)
- Each app runs in its own “sandbox”
- “an app is generally prohibited from accessing or creating files outside its containers”
- Must use the operating system to access “things such as the user’s contacts or music”