
EE 382 Processor Design, Winter 98/99, Michael Flynn

1 Client and Server processors
Client incorporates:
– Multimedia (sound and video)
– Imaging (3D)
– Security and network accessibility
– Wireless communications
Server incorporates:
– High-speed processing
– Management of large memory and file store complexes

2 Client processors
Modern processors are being enhanced to support multimedia, security, etc. Most of the recent interesting processor developments have been in client processors:
– largest market, not dominated by clock speed, and more amenable to low-power implementation
– "system on a die": includes DSP arithmetic and as much structured memory as possible

3 Multimedia
Includes video, audio, and 3D graphic imaging, as well as subsidiary functions such as music (composition and rendering), voice recognition, handwriting recognition, and animation.
Closely coupled to the display / presentation technology (raster line or pixel density, audio speaker fidelity / range).

4 Still Images / Video / Audio
The problem is compression and meeting real-time constraints:
– a B/W still image, 512 x 512 pixels, represents about 1/4 MB (8 b/pixel); color (3 B/pixel) is almost 1 MB; use 1 MB as a typical image
– video requires 30 frames/sec, i.e. 30 MB/sec; 1 hour is 108 GB
– audio requires 44k samples/sec at 3 B/sample on 2 or more channels; about 1/4 MB/sec
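The raw data rates on this slide can be checked with a few lines of arithmetic. This is only a sketch: the function names are illustrative, and it assumes decimal megabytes and exactly two audio channels, which is what makes the slide's round numbers work out.

```python
# Raw (uncompressed) data rates, using the slide's round numbers.
MB = 10**6  # assumption: decimal megabytes

def still_image_bytes(width=512, height=512, bytes_per_pixel=1):
    """Size of one uncompressed still image in bytes."""
    return width * height * bytes_per_pixel

def video_bytes_per_sec(frame_bytes=1 * MB, frames_per_sec=30):
    """Raw video rate, taking 1 MB as the typical frame size."""
    return frame_bytes * frames_per_sec

def audio_bytes_per_sec(samples_per_sec=44_000, bytes_per_sample=3, channels=2):
    """Raw audio rate for the given sample format."""
    return samples_per_sec * bytes_per_sample * channels

bw_image = still_image_bytes(bytes_per_pixel=1)     # 262,144 B, about 1/4 MB
color_image = still_image_bytes(bytes_per_pixel=3)  # 786,432 B, almost 1 MB
video_rate = video_bytes_per_sec()                  # 30,000,000 B/s
one_hour = video_rate * 3600                        # 108,000,000,000 B = 108 GB
audio_rate = audio_bytes_per_sec()                  # 264,000 B/s, about 1/4 MB/s
```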

5 Still Images
Lossless vs. lossy compression:
– a simple bounded Huffman code gives 3:1 lossless compression
JPEG is the standard:
– offers (say) 25:1 lossy compression
– tradeoffs: image quality vs. file size vs. computational complexity

6 JPEG
The image is partitioned into 8 x 8 pixel blocks:
– Transform into the frequency domain by DCT (the high-frequency components are at the high index values of the resultant 8 x 8 matrix and are often 0).
– Quantize (the lossy step): map values to a few numbers.
– Zig-zag access, to visit the low-frequency (nonzero) values first.
– Huffman (run-length) encode the values.
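The quantize, zig-zag, and run-length steps can be sketched as follows. This is a simplified illustration, not the JPEG standard's exact procedure: it assumes a single quantizer step size (real JPEG uses a table with one step per coefficient), uses a 4 x 4 block for brevity, and stops before the Huffman stage. All function names are made up for the example.

```python
def quantize(block, step):
    """Lossy step: divide every coefficient by one step size and round."""
    return [[round(v / step) for v in row] for row in block]

def zigzag_order(n=8):
    """Return (row, col) pairs in zig-zag order for an n x n block."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_length(values):
    """Encode as (zero_run, value) pairs; trailing zeros become one 'EOB'."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    if run:
        pairs.append("EOB")
    return pairs

# Hypothetical DCT output: energy concentrated at the low-frequency corner.
coeffs = [[80, 24, 0, 0],
          [-16, 8, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
q = quantize(coeffs, step=8)
scanned = [q[r][c] for r, c in zigzag_order(4)]
encoded = run_length(scanned)
```

Because the zig-zag scan reaches the nonzero low-frequency values first, the long tail of zeros collapses into a single end-of-block marker.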

7 Discrete cosine transform (DCT)
Map X (spatial domain) to Y (frequency domain):
– more compact representation; use an 8 x 8 pixel block ($n = 8$)
– $y[u,v] = \frac{4\,C(u)\,C(v)}{n^2} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} x[j,k] \cos\frac{(2j+1)u\pi}{2n} \cos\frac{(2k+1)v\pi}{2n}$
– $C(w) = 1$ for $w = 1, 2, \ldots$ and $C(w) = 1/\sqrt{2}$ for $w = 0$
– better than the discrete Fourier transform, but needs more computation
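A direct, unoptimized evaluation of the DCT formula on this slide might look like the sketch below. It follows the slide's scaling factor $4C(u)C(v)/n^2$ literally and takes the input block as a list of rows; practical encoders use fast factorizations instead of this four-deep loop. The name `dct2` is illustrative.

```python
import math

def dct2(x):
    """2-D DCT of an n x n block, evaluated directly from the definition."""
    n = len(x)

    def c(w):
        # Normalization factor: 1/sqrt(2) for the DC index, 1 otherwise.
        return 1 / math.sqrt(2) if w == 0 else 1.0

    y = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for j in range(n):
                for k in range(n):
                    total += (x[j][k]
                              * math.cos((2 * j + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * k + 1) * v * math.pi / (2 * n)))
            y[u][v] = 4 * c(u) * c(v) / n**2 * total
    return y

# A flat block has no spatial variation, so only the DC term y[0][0] is nonzero.
flat = [[100] * 8 for _ in range(8)]
y = dct2(flat)
```

The flat-block case shows the "more compact representation" claim in its simplest form: 64 identical pixels reduce to one nonzero coefficient.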

8 DCT basis functions (figure)

9 Block diagram of JPEG encoder (figure)

10 Video
Popular standards:
– H.263 (video conferencing)
– MPEG-1 (VHS quality)
– MPEG-2 (broadcast quality)
– MPEG-4 (uses VOPs to achieve high quality with good compression); more complex, an emerging standard

11 Typical compression
Image size, quality and delay are factors:
– Lossless: 3:1
– JPEG: 25:1
– MPEG-1: 100:1; uses 352 x 288 CIF; 1-2 sec delay
– H.263: maybe 300:1; QCIF 176 x 144; 1/4 sec delay
– MPEG-2: 4 x CIF; uses lower Q; longer delay

12 MPEG frames
Three types of frames:
– I: intra-picture, like lossy JPEG
– P: predicted picture, motion prediction based on an earlier I (or P); coded as a motion vector plus error terms; because the error terms are small, quantizing gives good compression
– B: bidirectional picture, motion prediction based on past and future I or P
– the result is a GOP (group of pictures), e.g. IPBBPBBPBBPBBPBBI

13 I frames
In MPEG, typically use 1 I frame per 15 frames.
In H.263, maybe 1 I frame per 300 frames.
An I frame takes (maybe) 4 x as many bits to represent as a P or B frame.
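Taking these rough figures at face value, the share of the bit budget spent on I frames follows from simple arithmetic. The sketch assumes exactly one I frame per group and that P and B frames cost the same (one unit each), which is a simplification; the function name is illustrative.

```python
from fractions import Fraction

def i_frame_bit_share(frames_per_i, i_cost=4):
    """Fraction of the bits spent on the single I frame in each group,
    with every P or B frame costing 1 unit and the I frame costing i_cost."""
    return Fraction(i_cost, i_cost + (frames_per_i - 1))

mpeg_share = i_frame_bit_share(15)    # 4/18 = 2/9, roughly 22% of the bits
h263_share = i_frame_bit_share(300)   # 4/303, roughly 1.3% of the bits
```

Under these assumptions, the much sparser I frames account for part of the higher compression quoted for H.263.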

14 MPEG block diagram (figure)

15 P frames
Motion prediction is computationally intensive. It is based on macroblocks: 2 x 2 blocks (16 x 16 pixels) of luminance, one 8 x 8 Cr block, and one 8 x 8 Cb block; color is interleaved (called 4:2:0).

16 Motion estimation process (figure)

17 Forward motion compensation (figure)

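18 Motion estimation
Computation intensive:
– Compute the SAD (sum of absolute differences) for all neighboring macroblock positions (stepping by 1 pixel): $\mathrm{SAD} = \sum_{i,j} |x_{i,j} - y_{i,j}|$, where $x$ is the current macroblock and $y$ is the candidate block in the reference frame; repeat across all macroblocks.
– Find the location that minimizes the SAD.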
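An exhaustive (full-search) version of this SAD minimization can be sketched as below. The search radius, the frame layout (lists of rows of integer luminance values), and the function names are assumptions made for the example; real encoders usually restrict or accelerate the search.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    return sum(abs(cur[by + i][bx + j] - ref[by + dy + i][bx + dx + j])
               for i in range(size) for j in range(size))

def full_search(cur, ref, bx, by, size=16, search=7):
    """Try every displacement within +/- search pixels (1-pixel steps) and
    return (best_sad, dx, dy) for the displacement with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if not (0 <= by + dy and by + dy + size <= height
                    and 0 <= bx + dx and bx + dx + size <= width):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Tiny synthetic frames: one bright pixel that sits 2 right and 1 down
# in the reference frame relative to its position in the current frame.
cur = [[0] * 12 for _ in range(12)]
ref = [[0] * 12 for _ in range(12)]
cur[5][5] = 255
ref[6][7] = 255
motion = full_search(cur, ref, bx=4, by=4, size=4, search=2)
```

With a 16 x 16 block and a search radius of 7, each macroblock costs 225 candidate positions of 256 absolute differences each, which is why the slide calls this step computation intensive.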

19 Bidirectional motion compensation (figure)

20 Block diagram of MPEG encoder (figure)

21 Instructions / pixel
JPEG: about 320 to compress; 280 to decode.
MPEG-1: about 1100 to compress; about 80 to decode. Note the problem in motion estimation: we need 352 x 288 x 1100 x 30 instr/sec = 3.3 GIPS for MPEG-1 to compress.
MPEG-2 uses bigger frames, better motion estimation and color: maybe 20 GIPS.
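The MPEG-1 instruction-rate estimate is a straightforward product, reproduced here as a quick check. The function name is illustrative; the per-pixel instruction counts are the slide's own approximate figures.

```python
def instr_per_second(width, height, instr_per_pixel, frames_per_sec):
    """Instructions per second needed to process every pixel of every frame."""
    return width * height * instr_per_pixel * frames_per_sec

mpeg1_encode = instr_per_second(352, 288, 1100, 30)  # 3,345,408,000, about 3.3 GIPS
mpeg1_decode = instr_per_second(352, 288, 80, 30)    # 243,302,400, about 0.24 GIPS
```

The encode-to-decode ratio of roughly 14:1 reflects the cost of motion estimation, which only the encoder performs.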

22 Video memory
Even if we have enough arithmetic bandwidth, memory (cache) access is a problem.
A single CIF frame takes 200 to 400 kB and won't fit into an L2 cache of less than (say) 1 or 2 MB.
Worse is the behavior of the L1 D cache: there are NO hits after a line is used.
Solution: prefetch and stride-prediction caches at L1.

23 Audio
Frequency range: 20 Hz to 20 kHz, sampled at 2x.
Sample rates:
– 8k: telephone
– 22k: personal computers
– 32k: digital audio and TV
– 44k: CDs
– 48k: HDTV, DAT

24 Audio
Dynamic range: 0 to 120 dB, about 20 bits.
Phasing: 2 or more channels to locate the source.
Clipping: the ear tolerates about 200 ms delay; after 300 ms it becomes annoying.
Bit rates: 44k x 20 x 2 = 1.76 Mbps, or (PCs) 22k x 16 = 352 kbps.
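The bit-rate and dynamic-range figures can be reproduced with the short sketch below. It assumes the usual rule that each bit of linear PCM adds 20·log10(2), about 6 dB, of dynamic range; the function names are illustrative.

```python
import math

def pcm_bit_rate(samples_per_sec, bits_per_sample, channels=1):
    """Uncompressed PCM bit rate in bits per second."""
    return samples_per_sec * bits_per_sample * channels

def bits_for_dynamic_range(range_db):
    """Bits of linear PCM needed to span a dynamic range given in dB."""
    db_per_bit = 20 * math.log10(2)  # about 6.02 dB per bit
    return range_db / db_per_bit

hifi_rate = pcm_bit_rate(44_000, 20, channels=2)  # 1,760,000 bps
pc_rate = pcm_bit_rate(22_000, 16)                # 352,000 bps
bits_needed = bits_for_dynamic_range(120)         # a little under 20 bits
```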

25 Audio
Can do better by compression: use ADPCM and send the difference between adjacent pulses.
G.722 standard: 16k samples/sec with ADPCM, to fit into 64 kbps.
G.728 uses a linear predictive coder and achieves 16 kbps. It models voice as a linear filter, matches the sample against a codebook, and sends the index into the receiver's codebook.
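The idea of sending differences between adjacent samples can be shown with plain difference coding. This sketch is deliberately not ADPCM as used in G.722: it omits the adaptive quantizer and predictor, so it is lossless and only illustrates why differences are cheaper to send than raw samples. The names and sample values are made up.

```python
def delta_encode(samples):
    """Replace each sample by its difference from the previous one."""
    previous, deltas = 0, []
    for s in samples:
        deltas.append(s - previous)
        previous = s
    return deltas

def delta_decode(deltas):
    """Rebuild the samples by accumulating the differences."""
    total, samples = 0, []
    for d in deltas:
        total += d
        samples.append(total)
    return samples

samples = [1000, 1004, 1009, 1011, 1008]   # hypothetical slowly varying signal
deltas = delta_encode(samples)             # [1000, 4, 5, 2, -3]
restored = delta_decode(deltas)
```

After the first value, the differences fit in a few bits each, whereas the raw samples need many more; ADPCM goes further by quantizing the differences with a step size that adapts to the signal.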

26 MPEG audio
Compresses digital audio signals (PCM).
Uses 32 subband filters (512 taps); samples are shifted in 32 at a time.
Computation is $s(i) = \sum_{n=0}^{511} x(t-n)\,H_i(n)$, a sum over 512 terms per sample, where $H_i(n)$ is the impulse response of the i-th filter.
Thus we have 512 multiply-accumulates per sample: about 22 Mops/sec.
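The per-filter computation is an ordinary FIR sum, sketched below together with the operation count. The sketch assumes a 44k samples/sec input and treats samples before the start of the signal as zero; the function names are illustrative, and a real MPEG audio filter bank shares work across its 32 filters rather than evaluating each one this way.

```python
def subband_output(x, h, t):
    """One output of one filter: the sum over n of x[t - n] * h[n].
    Samples before the start of x are treated as zero."""
    return sum(x[t - n] * h[n] for n in range(len(h)) if t - n >= 0)

def macs_per_second(macs_per_sample=512, samples_per_sec=44_000):
    """Multiply-accumulates per second at a given input sample rate."""
    return macs_per_sample * samples_per_sec

# Toy 2-tap filter on a short signal, just to show the indexing.
example = subband_output([1, 2, 3, 4], [1, 1], t=3)   # x[3] + x[2]
mac_rate = macs_per_second()                          # 22,528,000, about 22 Mops/sec
```

The figure of 512 multiply-accumulates per input sample is consistent with 32 filters of 512 taps each, evaluated once for every 32 new samples.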

27 MPEG audio
Sample rates: 32, 44, 48 kHz.
Mono, stereo or joint stereo.
Bit rates: 64 kbps to 128 kbps, with several layers and levels of coder complexity to get better bit rates and/or better quality.
Computationally, it probably requires 5 to 100 million multiply-adds per second (16 b).

28 MPEG audio encoder (figure)

