# Chapter 3 Data Representation


Data and Computers
Computers are multimedia devices, dealing with many categories of information. Computers store, present, and help modify numbers, text, audio, images and graphics, and video.

Analog and Digital Information
Computers are finite. Computer memory and other hardware devices have only so much room to store and manipulate a certain amount of data. The goal of data representation is to represent enough of the world to satisfy our computational needs and our senses of sight and sound.

Analog and Digital Information
Information can be represented in one of two ways: analog or digital. Analog data: A continuous representation, analogous to the actual information it represents. Digital data: A series of discrete representations, breaking the information up into separate elements.

Analog and Digital Information
A mercury thermometer exemplifies analog data as it continually rises and falls in direct proportion to the temperature. Digital displays only show discrete information.

Analog and Digital Information
Computers cannot work well with analog information, so we digitize it by sampling it at discrete intervals and representing each interval by a numeric value.

Electronic Signals
An analog signal continually fluctuates up and down in voltage. But a digital signal has only a high or low state, corresponding to the two binary digits.
Figure: An analog and a digital signal

Electronic Signals
All electronic signals (both analog and digital) degrade as they move down a line. That is, the voltage of the signal fluctuates due to environmental effects.
Figure: Degradation of analog and digital signals

Electronic Signals (Cont’d)
Even when a digital signal has deteriorated, its two states can still be distinguished by comparing the voltage to a threshold. Periodically, a digital signal can be reclocked to regain its original shape. No such process is available for analog signals.
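
The threshold comparison can be illustrated with a minimal Python sketch. The voltage levels (0 V and 5 V), the 2.5 V threshold, the noise range, and the function names are all assumptions chosen for illustration; they do not come from the slides.

```python
import random

def regenerate(voltages, threshold=2.5, low=0.0, high=5.0):
    """Snap each noisy voltage back to the clean low or high level."""
    return [high if v > threshold else low for v in voltages]

def to_bits(voltages, threshold=2.5):
    """Read one bit from each voltage by comparing it with the threshold."""
    return [1 if v > threshold else 0 for v in voltages]

bits = [1, 0, 1, 1, 0, 0, 1]
clean = [5.0 if b else 0.0 for b in bits]

# Simulate degradation: add up to +/-1.5 V of noise to every level (assumed range).
rng = random.Random(0)
noisy = [v + rng.uniform(-1.5, 1.5) for v in clean]

# As long as the noise never crosses the threshold, the original signal is recovered.
restored = regenerate(noisy)
```

Because the noise in this sketch stays below the gap between each level and the threshold, `restored` equals `clean`. An analog signal has no such reference levels to snap back to.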

Representing Audio Data

Representing Audio Information
We perceive sound when a series of air compressions vibrate a membrane in our ear, which sends signals to our brain. A stereo sends an electrical signal to a speaker to produce sound. This signal is an analog representation of the sound wave. The voltage in the signal varies in direct proportion to the sound wave.

Representing Audio Information
To digitize the signal we periodically measure the voltage of the signal and record the appropriate numeric value. The process is called sampling. In general, a sampling rate of around 40,000 times per second is enough to create a reasonable sound reproduction. The standard sampling rate for CDs is 44.1 kHz. The Pro Audio standard is 48 kHz.
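
As a rough sketch of the sampling step, the Python below measures a signal at evenly spaced instants and records each measurement as an integer. The 440 Hz test tone, the 16-bit sample width, and the function names are assumptions made for this example only.

```python
import math

def sample(signal, duration_s, rate_hz):
    """Measure signal(t) at evenly spaced instants, rate_hz times per second."""
    count = int(duration_s * rate_hz)
    return [signal(n / rate_hz) for n in range(count)]

def quantize(value, bits=16):
    """Map a value in [-1.0, 1.0] to a signed integer of the given bit width."""
    max_level = 2 ** (bits - 1) - 1
    return round(value * max_level)

def tone(t):
    """A 440 Hz sine wave standing in for the analog voltage."""
    return math.sin(2 * math.pi * 440 * t)

# One millisecond of audio at the CD sampling rate of 44.1 kHz.
samples = [quantize(v) for v in sample(tone, 0.001, 44_100)]
```

Each entry of `samples` is the numeric value recorded for one measurement of the voltage.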

Figure 3.8 Sampling an audio signal
BEWARE!! The loss of information that this figure suggests is NOT true!

Representing Audio Information
The potential loss of peak values suggested in the previous slide is a myth. The time lapse between samples is much too short for any such loss.

Representing Audio Information
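The human ear hears sounds between 20 Hz and 20,000 Hz. Sampling at more than twice the highest audible frequency (that is, more than 40,000 samples per second, which the 44,100 Hz CD rate satisfies) eliminates any potential loss of data within that range. For a complete explanation refer to the Nyquist–Shannon sampling theorem.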
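
The arithmetic behind this claim can be sketched in a few lines of Python. The function names are hypothetical, and `apparent_frequency` uses the standard frequency-folding rule for a pure tone; note that the sampling theorem strictly requires a rate greater than twice the highest frequency.

```python
def min_sampling_rate(max_frequency_hz):
    """Lower bound on the sampling rate: twice the highest frequency."""
    return 2 * max_frequency_hz

def apparent_frequency(frequency_hz, rate_hz):
    """Frequency a pure tone appears to have after sampling at rate_hz."""
    folded = frequency_hz % rate_hz
    return min(folded, rate_hz - folded)

audible_limit = 20_000
cd_rate = 44_100

bound = min_sampling_rate(audible_limit)

# A 20 kHz tone is below half the CD rate, so it is captured as 20 kHz.
kept = apparent_frequency(20_000, cd_rate)

# A 30 kHz tone is above half the CD rate, so it would be misread as a lower tone.
aliased = apparent_frequency(30_000, cd_rate)
```

The second case shows why frequencies above half the sampling rate must be filtered out before sampling, while everything in the audible range survives intact.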

Audio Formats
Several audio formats are in use: WAV, AU, AIFF, VQF, and MP3. MP3 is dominant. MP3 is short for MPEG-1 Audio Layer III (the format was later extended in MPEG-2). MP3 employs both lossy and lossless compression. First it analyses the frequency spread and compares it to mathematical models of human psychoacoustics (the study of the interrelation between the ear and the brain), and it discards information that can’t be heard by humans. Then the bit stream is compressed using a form of Huffman encoding to achieve additional compression.
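
To give a feel for the lossless stage, here is a minimal sketch of textbook Huffman coding in Python. It is not the MP3 bit-stream format, which relies on predefined code tables; the function name and the sample string are assumptions for illustration.

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Return a dict mapping each symbol to its Huffman bit string."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

data = "aaaaaaaabbbbccd"
codes = huffman_codes(data)
encoded = "".join(codes[s] for s in data)
```

Frequent symbols receive short codes, so the 15 symbols here take 25 bits instead of the 30 bits a fixed 2-bit code would need.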

Representing Audio Information
A compact disc (CD) stores audio information digitally. On the surface of the CD are microscopic pits that represent binary digits. A low intensity laser is pointed at the disc. The laser light reflects strongly if the surface is smooth and reflects poorly if the surface is pitted.

Representing Audio Information
Figure A CD player reading binary information

Representing Graphic Images

Representing Images and Graphics
Colour is our perception of the various frequencies of light that reach the retinas of our eyes. Our retinas have three types of colour photoreceptor cones which respond to different sets of frequencies. These photoreceptor categories correspond to the colours of red, green, and blue.

Representing Images and Graphics
Colour is often expressed in a computer as an RGB (red, green, blue) value, which is actually three numbers that indicate the relative contribution of each of these three primary colours. For example, an RGB value of (255, 255, 0) maximizes the contribution of red and green, and minimizes the contribution of blue. The resulting colour is a bright yellow.
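
A small Python sketch of an RGB value follows. The hexadecimal `#RRGGBB` notation is the one commonly used in web pages; it is not mentioned in the slides, and the function name is an assumption.

```python
def rgb_to_hex(red, green, blue):
    """Format an RGB triple (each component 0..255) as #RRGGBB."""
    for component in (red, green, blue):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in 0..255")
    return f"#{red:02X}{green:02X}{blue:02X}"

# Maximum red and green, no blue: bright yellow.
yellow = rgb_to_hex(255, 255, 0)
```

Each pair of hexadecimal digits holds one of the three numbers, so `yellow` is `#FFFF00`.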

Representing Images and Graphics
Figure Three-dimensional color space

Representing Images and Graphics
The amount of data that is used to represent a colour is called the colour depth. HiColor is a term that indicates a 16-bit colour depth. Five bits are used for each number in an RGB value and the extra bit is sometimes used to represent transparency. TrueColor indicates a 24-bit colour depth. Therefore, each number in an RGB value gets eight bits.

Representing Images and Graphics
HiColor uses 5 bits for each number. Since $2^5 = 32$, there are 32 different levels for each of the 3 primary colours. So there are $32^3$ (or $2^{15}$) possible colours. This is a total of 32,768 different colours. TrueColor uses eight bits for each colour component: $2^8 \times 2^8 \times 2^8 = 2^{24}$, or 16,777,216 colours. Some monitors can use as many as 32 bits for colour depth. This is potentially 4,294,967,296 colours!
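
The counts above can be checked with a short Python sketch. The function names are hypothetical, and the bit layout in `pack_hicolor` (red in the high bits, blue in the low bits) is an assumption; actual hardware layouts vary.

```python
def colour_count(bits_per_component, components=3):
    """Number of distinct colours for a given number of bits per component."""
    return (2 ** bits_per_component) ** components

def pack_hicolor(red, green, blue):
    """Pack three 5-bit components (0..31) into the low 15 bits of a 16-bit value."""
    return (red << 10) | (green << 5) | blue

hicolor = colour_count(5)       # 32 levels per primary
truecolor = colour_count(8)     # 256 levels per primary
thirty_two_bit = 2 ** 32

# Brightest yellow in HiColor: maximum red and green, no blue.
packed_yellow = pack_hicolor(31, 31, 0)
```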

Representing Images and Graphics
The human eye is able to distinguish about 200 intensity levels in each of the three primaries red, green, and blue. All in all, up to 10 million different colours can be distinguished. So modern monitors are examples of solutions without a problem. If the human eye can distinguish only 10 million colours, why develop monitors that can display over 4 billion?

Indexed Colour
A particular application such as a browser may support only a certain number of specific colours, creating a palette from which to choose. For example, Netscape Navigator’s colour palette has only 216 colours.
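
One way to picture an indexed palette is the sketch below. It assumes the 216 colours are built from six evenly spaced levels per primary (6 × 6 × 6 = 216), which is how the browser-safe palette is usually described but is not stated in the slides; the names are hypothetical.

```python
from itertools import product

# Six assumed levels per primary colour.
LEVELS = [0, 51, 102, 153, 204, 255]
PALETTE = list(product(LEVELS, repeat=3))

def nearest_index(rgb):
    """Index of the palette entry closest to rgb (squared distance)."""
    def distance(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, rgb))
    return min(range(len(PALETTE)), key=lambda i: distance(PALETTE[i]))

# An arbitrary colour is stored as a small index into the palette.
index = nearest_index((250, 250, 10))
stored_colour = PALETTE[index]
```

Storing an index rather than a full RGB value is what keeps indexed images small, at the cost of approximating colours that are not in the palette.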

Digitized Images and Graphics
Digitizing a picture is the act of representing it as a collection of individual dots, called pixels. The number of pixels used to represent an image is called the resolution. As an example, the resolution of many monitors is 1024 × 768, or 786,432 pixels. If the colour of each pixel is stored as 24 bits (3 bytes) of data, the screen alone requires 2,359,296 bytes (about 2.25 megabytes) of memory.

Digitized Images and Graphics
Figure A digitized picture composed of few individual pixels

Digitized Images and Graphics
Figure A digitized picture composed of many individual pixels

Digitized Images and Graphics
The storage of image information on a pixel-by-pixel basis is called a raster-graphics format. There are several popular raster file formats, including BMP (bitmap), GIF (Graphics Interchange Format), and JPEG (Joint Photographic Experts Group).

Vector Graphics
Instead of assigning colours to pixels as we do in raster graphics, a vector-graphics format describes an image in terms of lines and geometric shapes. A vector graphic is a series of commands that describe a line’s direction, thickness, and colour. The file size for these formats tends to be small because every pixel does not need to be represented.

Vector Graphics
Vector graphics can be resized mathematically, and these changes can be calculated dynamically as needed. This makes them particularly useful for defining scalable fonts. However, vector graphics is not a good technique for representing real-world images.
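
Resizing mathematically can be sketched as follows. The `Line` type and its fields are hypothetical, chosen to mirror the "direction, thickness, and colour" description rather than any real vector file format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Line:
    """One drawing command: a straight line with a thickness and a colour."""
    x1: float
    y1: float
    x2: float
    y2: float
    thickness: float
    colour: str

def scale(shapes, factor):
    """Resize a drawing by multiplying every coordinate and thickness."""
    return [
        Line(s.x1 * factor, s.y1 * factor, s.x2 * factor, s.y2 * factor,
             s.thickness * factor, s.colour)
        for s in shapes
    ]

drawing = [Line(0, 0, 10, 5, 1, "red")]
doubled = scale(drawing, 2)
```

The doubled drawing is still a single command, whereas doubling a raster image would quadruple its pixel count.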

Representing Video Data

Representing Video
A video codec (COmpressor/DECompressor) refers to the methods used to shrink the size of a movie to allow it to be played on a computer or over a network. Almost all video codecs use lossy compression to minimize the huge amounts of data associated with video.

Representing Video
To simulate motion, movies need to record (and play back) at least 12 frames per second. However, good sound quality requires 24 frames/s. 24 frames/s = 1,440 frames/minute = 86,400 frames/hour

Representing Video
If each frame has a resolution of 1024 × 768,* there are 786,432 pixels in a frame. If the colour of each pixel is stored as 24 bits (3 bytes) of data, one frame alone requires 2,359,296 bytes (about 2.25 MB) of memory. An hour of film, then, requires 203,843,174,400 bytes (194,400 MB, or roughly 190 gigabytes) of storage, just for the images.
*This is a very conservative resolution.
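
The storage figures can be reproduced with a short Python sketch. The function names are hypothetical, and megabytes are taken as 2^20 bytes, which is the convention the slide's MB figure implies.

```python
def frame_bytes(width, height, bits_per_pixel=24):
    """Bytes needed to store one uncompressed frame."""
    return width * height * bits_per_pixel // 8

def video_bytes(width, height, fps, seconds, bits_per_pixel=24):
    """Bytes needed to store uncompressed video, images only."""
    return frame_bytes(width, height, bits_per_pixel) * fps * seconds

one_frame = frame_bytes(1024, 768)
frames_per_hour = 24 * 3600
one_hour = video_bytes(1024, 768, fps=24, seconds=3600)

# Convert to megabytes of 2**20 bytes each.
one_hour_mb = one_hour / 2 ** 20
```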

Representing Video
The first step in compressing video is to reduce the amount of information stored for a frame. This problem is essentially the same as that faced when compressing still images.
Spatial compression: A technique based on removing redundant information within a frame.
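
As a simple illustration of redundancy within a frame, the sketch below run-length encodes one row of pixels. Run-length encoding is a stand-in chosen for brevity, and it is lossless; real video codecs use more elaborate and mostly lossy methods. The function names are assumptions.

```python
def rle_encode(row):
    """Collapse runs of identical pixels into (value, run_length) pairs."""
    runs = []
    for pixel in row:
        if runs and runs[-1][0] == pixel:
            runs[-1][1] += 1
        else:
            runs.append([pixel, 1])
    return [(value, length) for value, length in runs]

def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original row."""
    return [value for value, length in runs for _ in range(length)]

# A row with large uniform areas: 12 pixels shrink to 3 pairs.
row = [0] * 6 + [255] * 2 + [0] * 4
encoded_row = rle_encode(row)
```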

Representing Video
Each compressed frame will still be quite large. Moreover, each one is a still picture that looks very much like the one before it. After all, how much can change in 1/24 of a second? Why should we waste space to duplicate all of the identical information?

Representing Video
We can save even more space by recognizing that between two frames, most of the image hasn’t changed. Storing only the changes (deltas) from one frame to the next is much more efficient.
Temporal compression: A technique based on storing differences between consecutive frames.
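
Storing deltas can be sketched in Python as below. Frames are modelled as flat lists of pixel values and the function names are hypothetical; real codecs work on blocks of pixels and use motion estimation rather than per-pixel comparison.

```python
def frame_delta(previous, current):
    """List (index, new_value) for every pixel that changed."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]

def apply_delta(previous, delta):
    """Rebuild the next frame from the previous frame plus its delta."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame

frame1 = [10] * 8
frame2 = [10, 10, 10, 99, 98, 10, 10, 10]

# Only the two changed pixels need to be stored for the second frame.
delta = frame_delta(frame1, frame2)
rebuilt = apply_delta(frame1, delta)
```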