Dale & Lewis Chapter 3 Data Representation


Analog and digital information
The real world is continuous and infinite, while data on computers are finite → we need to approximate real-world data for our computational needs.
Analog data: information represented in a continuous form.
Digital data: information represented in a discrete form.

Analog and digital information

Noise in signals

Digitizing a signal
Sample the signal at discrete times and approximate each sample by one of a set of discrete levels.
The pieces are numbered.
The binary number system is used to represent the numbers.
n bits can represent 2^n numbers.
Q: how many bits are needed to represent m numbers?
The number of bits that can be easily addressed in a computer sets some constraints.
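The question on this slide can be checked with a few lines of Python. This is a minimal sketch: the function name `bits_needed` is made up for illustration, and it assumes the m numbers are simply labelled 0 to m - 1.

```python
def bits_needed(m: int) -> int:
    """Smallest n with 2**n >= m, i.e. the ceiling of log2(m)."""
    if m < 1:
        raise ValueError("m must be at least 1")
    # The largest label is m - 1; its binary length is the answer.
    # Integer arithmetic avoids floating-point rounding in math.log2.
    return (m - 1).bit_length()


for m in (2, 26, 128, 256, 1000):
    print(m, "values need", bits_needed(m), "bits")
```

Because computers address whole bytes, a result such as 5 bits for 26 letters would usually still be stored in 8 bits.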

Representing text
English language character set: 26 letters (both upper and lower case), punctuation, numeric digits, etc.
How many bits can we use?
What about other languages?

ASCII character set
American Standard Code for Information Interchange.
Each character is coded as a byte (8 bits): a 7-bit code plus 1 check bit.
Later, all 8 bits were used in the “extended character set”.
128 characters encoded (2^7):
95 visible characters
33 invisible (control) characters

7-bit ASCII character set

ASCII Table
The table above is sorted by decimal value.
These decimal values really represent binary sequences.
The character J is in position 74: 01001010 in binary, or 4A in hexadecimal.
The character j is in position 106: 01101010 in binary, or 6A in hexadecimal.
Notice anything? There is a purpose for that!

The Unicode character set
Originally a 16-bit standard with 65,536 possible codes (later versions extend beyond 16 bits).
Enough to cover the principal languages of the world.
Superset of ASCII: the first 128 codes of Unicode are the same as ASCII, and the first 256 match the ISO 8859-1 (Latin-1) extended character set.
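Returning to the J and j comparison in the ASCII table: the two codes differ in exactly one bit, the one worth 32. The sketch below prints both codes and flips that bit to change case. The helper name `toggle_case` is made up for illustration, and it only handles ASCII letters.

```python
CASE_BIT = 0b00100000  # the bit worth 32

for ch in ("J", "j"):
    code = ord(ch)  # code point; for ASCII characters this is the ASCII value
    print(ch, code, format(code, "08b"), format(code, "02X"))

# The upper- and lower-case codes differ only in CASE_BIT.
assert ord("J") ^ ord("j") == CASE_BIT


def toggle_case(ch: str) -> str:
    """Flip the case of a single ASCII letter; leave anything else alone."""
    if not (ch.isascii() and ch.isalpha()):
        return ch
    return chr(ord(ch) ^ CASE_BIT)


print(toggle_case("J"), toggle_case("j"))
```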

Text compression
Keyword encoding: substitute frequently used words with single characters,
e.g. “as” → ^, “the” → ~, “and” → +, “that” → $, etc.
Problems:
These characters can’t be part of the text.
Frequently used words tend to be short, so there is not much gain.
Word variations are not handled, e.g. “The” vs. “the”.
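Keyword encoding with the four substitutions listed on this slide might look like the sketch below. The function names are made up for illustration, and the code assumes that words are separated by single spaces; a word with punctuation attached (such as "the,") is left unencoded.

```python
# Substitution table from the slide.
KEYWORDS = {"as": "^", "the": "~", "and": "+", "that": "$"}
SYMBOLS = {symbol: word for word, symbol in KEYWORDS.items()}


def keyword_encode(text: str) -> str:
    # The reserved symbols cannot appear in the original text,
    # otherwise decoding would be ambiguous.
    if any(symbol in text for symbol in SYMBOLS):
        raise ValueError("text already contains a reserved symbol")
    # Whole-word, case-sensitive replacement: "The" is not replaced.
    return " ".join(KEYWORDS.get(word, word) for word in text.split(" "))


def keyword_decode(encoded: str) -> str:
    return " ".join(SYMBOLS.get(word, word) for word in encoded.split(" "))


original = "the cat and the dog sat as that"
encoded = keyword_encode(original)
print(encoded, len(original), len(encoded))
```

The capitalised "The" passing through unchanged is the word-variation problem named on the slide.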

Run-length encoding
Replace a long series of a repeated character with a special short code,
e.g. replace “AAAAAAA” with *A7.
This is equivalent to replacing 01000001 01000001 01000001 01000001 01000001 01000001 01000001 with 00101010 01000001 00000111.
Note that repetitions shorter than 4 characters are not worth encoding.
Also note that the repetition number is encoded in binary, not ASCII, so that repetitions longer than 9 can be captured.
Used in limited-palette image compression and fax machines.
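A minimal sketch of this scheme on bytes follows. The slide does not say how a literal * in the data or a run longer than one byte can count should be handled, so two assumptions are made here: a literal * is always written as a flagged run, and runs are split at 255. The function names are made up for illustration.

```python
FLAG = ord("*")   # 00101010, the marker used on the slide
MIN_RUN = 4       # shorter runs are cheaper to copy verbatim
MAX_RUN = 255     # the count is stored in a single binary byte


def rle_encode(data: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(data):
        byte = data[i]
        run = 1
        while i + run < len(data) and data[i + run] == byte and run < MAX_RUN:
            run += 1
        if run >= MIN_RUN or byte == FLAG:
            # Flag, repeated byte, binary count. A literal '*' always takes
            # this form so the decoder never misreads it (assumption).
            out += bytes([FLAG, byte, run])
        else:
            out += bytes([byte]) * run
        i += run
    return bytes(out)


def rle_decode(encoded: bytes) -> bytes:
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == FLAG:
            out += bytes([encoded[i + 1]]) * encoded[i + 2]
            i += 3
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


packed = rle_encode(b"AAAAAAA")
print([format(b, "08b") for b in packed])
```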

Huffman encoding
Generalization of Morse code.
Morse code (dots & dashes) is based on the distribution of letters in general English usage.
Huffman encoding is based on the distribution in a given message.
Algorithm:
Encoding:
Build a frequency table of letter usage.
Build the code and encode the message.
Decoding:
Huffman code has the prefix property.
Prefix property: no code is the front part of another code.
Decoding processes the bit stream until a match is found.
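The encoding steps can be sketched with a priority queue, as below. This is one common way to build a Huffman code, not necessarily the construction used in the lecture. The function names are made up for illustration, and the exact code words depend on how ties between equal frequencies are broken, so they need not match any particular published table.

```python
import heapq
from collections import Counter


def huffman_codes(message: str) -> dict:
    """Build a prefix code from the letter frequencies of `message`."""
    if not message:
        raise ValueError("message must not be empty")
    freq = Counter(message)          # step 1: frequency table
    if len(freq) == 1:               # a single symbol still needs one bit
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:             # step 2: merge the two lightest subtrees
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in left.items()}
        merged.update({sym: "1" + code for sym, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(message: str, codes: dict) -> str:
    return "".join(codes[ch] for ch in message)


codes = huffman_codes("DOORBELL")
bits = huffman_encode("DOORBELL", codes)
print(codes)
print(bits, len(bits))
```

Frequent letters (O and L in this message) end up with codes no longer than those of the rarer letters, which is where the saving comes from.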

Example of Huffman encoding/decoding
Message: DOORBELL
Encoding: 1011110110111101001100100
Compression ratio (vs ASCII): 25/64 = 0.39 (25 bits against 8 characters × 8 bits = 64 bits)
Decode:
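The code table for this example is not included in the transcript. The sketch below assumes a table (A = 00, E = 01, L = 100, O = 110, R = 111, B = 1010, D = 1011) that reproduces the 25-bit string above. Treat it as an assumption, not as the table shown on the slide. Decoding reads the bit stream until the bits collected so far match a code, which works because of the prefix property.

```python
# Assumed code table (not in the transcript); it reproduces the
# 25-bit encoding of DOORBELL given in the example.
CODES = {"A": "00", "E": "01", "L": "100", "O": "110",
         "R": "111", "B": "1010", "D": "1011"}


def encode(message: str) -> str:
    return "".join(CODES[ch] for ch in message)


def decode(bits: str) -> str:
    lookup = {code: sym for sym, code in CODES.items()}
    out = []
    current = ""
    for bit in bits:
        current += bit
        if current in lookup:        # a match is found: emit and start over
            out.append(lookup[current])
            current = ""
    if current:
        raise ValueError("bit stream ends in the middle of a code")
    return "".join(out)


bits = encode("DOORBELL")
print(bits, len(bits), len(bits) / (8 * len("DOORBELL")))
print(decode(bits))
```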