What Choices Make A Killer Video Processor Architecture? Jonah Probell Ultra Data Corp

Presentation transcript:

What Choices Make A Killer Video Processor Architecture? Jonah Probell Ultra Data Corp

© Copyright 2004 Jonah Probell
Slide 2: Outline
Overview of Ultra Data UD3000
Software programmability
Parallelism
 – VLIW
 – SIMD
 – Multiprocessing
Appropriate use of on- and off-chip memory
 – Optimal organization of data structures in DRAM
Deterministic performance
 – 5-port regfile
 – 2-port on-chip memory
 – DMA controller instead of caches

Slide 3: Nobody's Video Decoder Chip
[Block diagram] Labels: SDRAM; SDRAM controller; high-speed interconnect; Video Decode Processor; Host / audio processor; Peripheral bus bridge; peripheral bus; Video post-processing; Video output (S-video / raw 24-bit RGB or 8/16-bit YCrCb); Audio output (I2S / SPDIF / raw); I2C, SATA, timers; DVD optical interface; SATA & I2C busses; Optics sled; Audio / Video DACs

Slide 4: The Ultra Data UD3000
[Block diagram] Labels: Outer Loop Processor 0; Outer Loop Processor 1 (instruction extensions); Inner Loop Processor 0; Inner Loop Processor 1 (instruction extensions); Inner Loop Processor 2; Crossbar Switch Fabric; System Bus Bridge; Smart 2-D DMA Controller; 2-port DMEM, 2-port DMEM, …; FIFO …; Test & Set
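The slide names a "Smart 2-D DMA Controller" but does not describe its programming model. As a rough illustration of what a two-dimensional transfer generally means (copying a rectangular block out of a larger frame into a contiguous on-chip buffer), here is a minimal Python sketch. The function name `dma_2d_copy` and all of its parameters are hypothetical and are not taken from the UD3000.

```python
def dma_2d_copy(src, src_stride, x, y, width, height):
    """Copy a width x height rectangle starting at (x, y) out of a frame.

    src        -- bytes-like frame buffer, one byte per pixel (assumption)
    src_stride -- number of bytes between the starts of consecutive frame lines
    Returns a contiguous bytearray of width * height bytes, the way a local
    data memory buffer might hold the block.
    """
    dst = bytearray(width * height)
    for row in range(height):
        # Each line of the rectangle is a short contiguous burst in the frame.
        start = (y + row) * src_stride + x
        dst[row * width:(row + 1) * width] = src[start:start + width]
    return dst


# Hypothetical 8x8 frame whose pixel values equal their linear addresses.
frame = bytes(range(64))
block = dma_2d_copy(frame, src_stride=8, x=2, y=1, width=3, height=2)
```

A processor that hands this kind of descriptor to a DMA engine can keep computing while the block arrives, which is one common reason to prefer explicit DMA over caches when predictable timing matters.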

Slide 5: H.264 Main Profile Decode
[Diagram: decode tasks mapped onto processors]
Processors: OLP 0; OLP 1; ILP 0; ILP 1; ILP 2; DMA ctrl
Tasks: CABAC; CAVLC; interpolation; inverse transform; apply deltas; deblocking thresholds; deblocking filter; load prediction source; store block

Slide 6: The Inner Loop Processor
[Block diagram] Labels: Data Aligner; IMEM; Control Unit; 32-bit RISC; Program Counter; Loads & Stores; Vector Unit; 64-bit SIMD data; Multiply Acc; Data packing; 3-port Regfile; 5-port Regfile; Switch Fabric; width labels: 32, 64

Slide 7: Video Codec Standards
ITU-T standards: H.261, H.263
ITU-T / MPEG joint standards: H.262 / MPEG-2, H.264 / MPEG-4 Part 10 AVC
MPEG standards: MPEG-1, MPEG-4
On2 Technologies standards: VP3, VP4, VP5, VP6
DivX Networks standard: DivX
Microsoft standard: Windows Media Video

Slide 8: VLIW Parallelism
Sequential DSP program: load, multiply, load, multiply, load, multiply, shift, store, add, branch
VLIW DSP program: load, multiply, store, shift, branch, add
[Diagram] Labels: program sequencer; regfile; data memory; ALU (+ - x & | ! >> <<)
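The contrast on this slide, one operation per instruction versus several operations issued together, can be sketched as a toy scheduler that packs a sequential program into wide instruction words. This is a simplified greedy illustration and not the UD3000 toolchain: the unit names (`mem`, `mul`, `alu`), the dependency lists, and the function `schedule_vliw` are all assumptions made for the example.

```python
def schedule_vliw(ops):
    """Greedily pack operations into VLIW bundles.

    ops -- list of (name, unit, deps) in program order, where deps holds the
           indices of earlier operations that must finish first. Every
           ordering constraint has to be listed explicitly in deps
           (simplifying assumption).
    Returns a list of bundles, each a dict mapping functional unit -> name.
    """
    bundle_of = []  # bundle index assigned to each operation
    bundles = []
    for name, unit, deps in ops:
        # An operation may issue no earlier than one bundle after its inputs.
        slot = max((bundle_of[d] + 1 for d in deps), default=0)
        # Each functional unit can take only one operation per bundle.
        while slot < len(bundles) and unit in bundles[slot]:
            slot += 1
        while len(bundles) <= slot:
            bundles.append({})
        bundles[slot][unit] = name
        bundle_of.append(slot)
    return bundles


program = [
    ("load a", "mem", []),
    ("mul a", "mul", [0]),
    ("load b", "mem", []),
    ("add", "alu", [1, 2]),
]
bundles = schedule_vliw(program)
```

In this toy program the second load shares a bundle with the multiply, so four sequential operations fit in three instruction words.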

Slide 9: SIMD Parallelism
[Diagram: levels of data granularity]
frame of macroblocks
macroblock of pixels
8x8 block of pixels
4x4 block of pixels
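Slide 6 lists "64-bit SIMD data" for the vector unit, which leaves room for eight 8-bit pixels per word. The sketch below shows the general idea of operating on all eight lanes at once, using a standard SIMD-within-a-register trick for wrap-around byte addition. It is an illustration only: the UD3000's actual vector instructions are not given in the slides, and the helper names `pack`, `unpack`, and `packed_add_u8` are hypothetical.

```python
LOW7 = 0x7F7F7F7F7F7F7F7F  # low seven bits of every byte lane
HIGH = 0x8080808080808080  # top bit of every byte lane


def pack(pixels):
    """Pack eight 8-bit pixel values into one 64-bit word (lane 0 lowest)."""
    return int.from_bytes(bytes(pixels), "little")


def unpack(word):
    """Split a 64-bit word back into eight 8-bit pixel values."""
    return list(word.to_bytes(8, "little"))


def packed_add_u8(a, b):
    """Add eight byte lanes at once, each lane wrapping modulo 256.

    The low seven bits of each lane are added normally; their carry lands in
    the lane's own top bit and cannot spill into the neighbouring lane. The
    top bits are then combined with XOR, which is addition without carry-out.
    """
    return ((a & LOW7) + (b & LOW7)) ^ ((a ^ b) & HIGH)


row_a = [0, 1, 127, 128, 200, 255, 16, 250]
row_b = [0, 2, 1, 128, 100, 1, 240, 10]
summed = unpack(packed_add_u8(pack(row_a), pack(row_b)))
```

Real video kernels usually want saturating rather than wrapping arithmetic when applying deltas to pixels; wrap-around is used here only because it keeps the lane-isolation idea short.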

Slide 10: Multiprocessor Parallelism
[Diagram 1: symmetric parallel multiprocessing] Software: system; video codec (motion estimation, prediction, transform & compression, deblocking). Hardware: CPU 0, CPU 1, CPU 2.
[Diagram 2: pipelined multiprocessing] Software: system; video codec (motion estimation, prediction, transform & compression, deblocking). Hardware: CPU 0, CPU 1, CPU 2.
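In the pipelined arrangement each processor owns one stage of the codec and passes its result to the next stage. A minimal way to picture that hand-off is a chain of worker threads joined by FIFO queues. The sketch below is a software analogy only: the stage functions are placeholders that tag a block with the stage name, and `run_pipeline` is a hypothetical helper, not anything described in the slides.

```python
import queue
import threading


def stage_worker(fn, inbox, outbox):
    """Run one pipeline stage: take an item, process it, pass it on."""
    while True:
        item = inbox.get()
        if item is None:        # end-of-stream marker
            outbox.put(None)
            return
        outbox.put(fn(item))


def run_pipeline(stages, items):
    """Feed items through the stages, one worker thread per stage."""
    fifos = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage_worker, args=(fn, fifos[i], fifos[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for worker in workers:
        worker.start()
    for item in items:
        fifos[0].put(item)
    fifos[0].put(None)
    results = []
    while True:
        out = fifos[-1].get()
        if out is None:
            break
        results.append(out)
    for worker in workers:
        worker.join()
    return results


# Placeholder stages: each just records that it handled the block.
stages = [
    lambda block: block + ["prediction"],
    lambda block: block + ["transform"],
    lambda block: block + ["deblocking"],
]
decoded = run_pipeline(stages, [[n] for n in range(4)])
```

Because every stage has a single worker and the queues are FIFOs, blocks leave the pipeline in the order they entered, while different blocks occupy different stages at the same time.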

Slide 11: Data Bandwidths
[Diagram] Labels: bitstream source; video chip; SDRAM temporary data storage; display device

Slide 12: DRAM Optimal Data Ordering
DRAM: 1k byte rows
[Figure 1] Frame mapped to DRAM rows as a C-style two-dimensional array
[Figure 2] Frame mapped to DRAM rows as square groups
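The difference between the two mappings can be made concrete by counting how many DRAM rows a block fetch touches under each one. The sketch below assumes one byte per pixel, a 720-pixel-wide frame, and 32x32-pixel square groups (so one group fills exactly one 1k byte row). Only the 1k byte row size comes from the slide; the other figures and all function names are assumptions for illustration.

```python
ROW_BYTES = 1024   # DRAM row size from the slide
TILE = 32          # assumed square group: 32 x 32 one-byte pixels = 1024 bytes


def linear_address(x, y, frame_width):
    """C-style two-dimensional array: lines stored one after another."""
    return y * frame_width + x


def tiled_address(x, y, frame_width):
    """Square groups: each 32x32 tile occupies one whole DRAM row."""
    tiles_per_line = -(-frame_width // TILE)   # ceiling division
    tile_index = (y // TILE) * tiles_per_line + (x // TILE)
    return tile_index * ROW_BYTES + (y % TILE) * TILE + (x % TILE)


def dram_rows_touched(address_fn, x0, y0, width, height, frame_width):
    """Count distinct DRAM rows hit when reading a width x height block."""
    rows = set()
    for y in range(y0, y0 + height):
        for x in range(x0, x0 + width):
            rows.add(address_fn(x, y, frame_width) // ROW_BYTES)
    return len(rows)


# Hypothetical 16x16 prediction block fetched at (100, 50) in a 720-wide frame.
linear_rows = dram_rows_touched(linear_address, 100, 50, 16, 16, 720)
tiled_rows = dram_rows_touched(tiled_address, 100, 50, 16, 16, 720)
```

Under these assumptions a 16x16 block can straddle at most four tiles, so the tiled mapping opens at most four DRAM rows per fetch, whereas the linear mapping opens a new row every line or two.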

Slide 13: Deterministic Performance

Slide 14: The Inner Loop Processor
[Block diagram] Labels: Data Aligner; IMEM; Control Unit; 32-bit RISC; Program Counter; Loads & Stores; Vector Unit; 64-bit SIMD data; Multiply Acc; Data packing; 3-port Regfile; 5-port Regfile; Switch Fabric; width labels: 32, 64

Slide 15: The Ultra Data UD3000
[Block diagram] Labels: Outer Loop Processor 0; Outer Loop Processor 1 (instruction extensions); Inner Loop Processor 0; Inner Loop Processor 1 (instruction extensions); Inner Loop Processor 2; Crossbar Switch Fabric; System Bus Bridge; Smart 2-D DMA Controller; 2-port DMEM, 2-port DMEM, …; FIFO …; Test & Set

Slide 16: A Killer Video Processor Architecture
Software programmability
Parallelism
 – VLIW
 – SIMD
 – Multiprocessing
Appropriate use of on- and off-chip memory
 – Optimal organization of data structures in DRAM
Deterministic performance
 – 5-port regfile
 – 2-port on-chip memory
 – DMA controller instead of caches

Slide 17: Acknowledgements
This presentation is © Copyright 2004 Jonah Probell. ALL RIGHTS RESERVED.
Certain information for this document was derived from publicly available documents of Ultra Data Corp., UB Video Inc., On2 Technologies Inc., and Wikipedia.
All trademarks mentioned in this document are property of their respective owners and are hereby acknowledged.
Jonah Probell (781)