PlayStation 2 Architecture: Hardware Design


System Overview
- The listing below is a clean view of the design behind the PlayStation 2 hardware

Core CPU
- General-purpose MIPS variant
- 128-bit SIMD integer multimedia extensions
- ICACHE and DCACHE
- Scratch Pad RAM
- Dedicated FPU coprocessor
[Diagram labels: Emotion Engine, CPU core, FPU, SPR 16 KB, I$ 16 KB, D$ 8 KB]

Dedicated FPU
- FPU: Floating-Point Processing Unit
- This unit is used to handle fast floating-point operations
- PlayStation 2 is optimized for 32-bit operations
- The "double" data type and other 64-bit floating-point operations are much slower and cause major bottlenecks
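
To make the 32-bit versus 64-bit distinction concrete, here is a minimal sketch in plain Python (not PS2 code). It rounds a value to single precision using the standard `struct` module, so the result is what a 4-byte float can actually hold. The helper name `to_float32` is an assumption made for illustration. The sketch shows storage size and precision only; it does not model the PS2 FPU's rounding behaviour or its speed.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest 32-bit float value."""
    # Packing as "<f" stores the value in 4 bytes; unpacking reads it back.
    return struct.unpack("<f", struct.pack("<f", x))[0]


# A 32-bit float occupies 4 bytes, a 64-bit double occupies 8.
float_bytes = struct.calcsize("<f")
double_bytes = struct.calcsize("<d")

value = 0.1
single = to_float32(value)

print(float_bytes, double_bytes)
print(single == value)             # the two precisions disagree on 0.1
print(abs(single - value) < 1e-8)  # but only by a very small amount
```

For typical game math (positions, normals, colours) that small difference is usually acceptable, which is consistent with the slide's advice to stay with 32-bit floats.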

SIMD
- SIMD: Single Instruction, Multiple Data
- 128-bit SIMD allows a single operation to be applied to four 32-bit integers or floats at once
- The operations that can be performed are specific to the CPU
- SIMD is especially useful in games because of all their complex vector and matrix math

How SIMD Works
- Given two packed data elements, the operation is applied to every pair of corresponding components in the two elements
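
The lane-by-lane behaviour described above can be sketched in plain Python. This is an illustration of the idea only, not the CPU's actual instruction set. The names `packed_add`, `pack_u32x4`, and `unpack_u32x4` are hypothetical, and the lane order within the 128-bit value (lane 0 in the lowest bits) is an assumption.

```python
from typing import Sequence, Tuple

LANES = 4        # four components per packed element
LANE_BITS = 32   # 4 x 32 bits = 128 bits
LANE_MASK = (1 << LANE_BITS) - 1


def packed_add(a: Sequence[float], b: Sequence[float]) -> Tuple[float, ...]:
    """Model one SIMD add: every lane of a is added to the same lane of b."""
    assert len(a) == LANES and len(b) == LANES
    return tuple(x + y for x, y in zip(a, b))


def pack_u32x4(lanes: Sequence[int]) -> int:
    """Pack four unsigned 32-bit integers into one 128-bit integer."""
    assert len(lanes) == LANES
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & LANE_MASK) << (LANE_BITS * i)
    return value


def unpack_u32x4(value: int) -> Tuple[int, ...]:
    """Split a 128-bit integer back into its four 32-bit lanes."""
    return tuple((value >> (LANE_BITS * i)) & LANE_MASK for i in range(LANES))


position = (1.0, 2.0, 3.0, 4.0)
offset = (10.0, 20.0, 30.0, 40.0)
print(packed_add(position, offset))   # (11.0, 22.0, 33.0, 44.0)

quadword = pack_u32x4((1, 2, 3, 4))
print(quadword.bit_length() <= 128)   # True
print(unpack_u32x4(quadword))         # (1, 2, 3, 4)
```

In hardware the four additions happen in a single instruction; the Python loop only stands in for that parallel step.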

Typical System Layout: Cache Dependency
- The cache is found on the CPU and has faster access times than system memory

Cache
- The purpose of cache is to reduce the time it takes to fetch repeatedly executed instructions or to access data values
- ICACHE: Instruction Cache
- DCACHE: Data Cache
- SPR: Scratch Pad RAM

How Cache Works
[Diagram labels: CPU fetch, ICACHE, DCACHE, system RAM. Priority: cache > system memory]
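
The "cache before system memory" priority can be illustrated with a small simulation. This is a minimal sketch of a generic direct-mapped cache, not a model of the Emotion Engine's real cache organization. The class name `DirectMappedCache` and the 64-byte line size are assumptions made for the example; only the 8 KB capacity is taken from the D$ figure on the Core CPU slide.

```python
class DirectMappedCache:
    """Generic direct-mapped cache that only tracks hits and misses."""

    def __init__(self, size_bytes: int, line_bytes: int = 64):
        self.line_bytes = line_bytes                # assumed line size
        self.num_lines = size_bytes // line_bytes
        self.tags = [None] * self.num_lines         # one tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Return True on a hit; on a miss, fill the line from 'system RAM'."""
        line_number = address // self.line_bytes
        index = line_number % self.num_lines
        tag = line_number // self.num_lines
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag                      # fetch from system memory
        self.misses += 1
        return False


# Walk a 4 KB array of 4-byte values twice through an 8 KB data cache.
dcache = DirectMappedCache(size_bytes=8 * 1024)
for _ in range(2):
    for address in range(0, 4 * 1024, 4):
        dcache.access(address)

print(dcache.hits, dcache.misses)   # 1984 64
```

The first pass misses once per cache line and the second pass hits every time, because the whole array fits in the cache. Data sets larger than the cache lose that benefit.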

Hardware Controllers
- A controller is a device used to interface and communicate with a piece of hardware
- Every major component has a controller for its interface
- The user application will typically use registers or interrupt calls to access the controller devices

DMA Controller
- DMA: Direct Memory Access
- DMAC is the arbiter for the main bus
- Used to transfer data between processes
- Allows for some parallelism
[Diagram labels: DMA controller (10 channels), CPU core, I$, D$, 128-bit]

Vector Units
- PlayStation 2 has two vector units that are similar but not the same
- VU0 is the CPU's alternate processing unit
- VU1 is the GS's alternate processing unit
- Each unit has a direct pipeline to its alternate processor
- Vector units are designed for vectors (imagine that)

DMAC and Graphics
- DMAC feeds VU1 with the data it needs, and does so with no CPU intervention
- Data that is transferred to VU1 resides in system RAM
- The CPU is now free to process any instructions that hit in the instruction cache
- The CPU can also access any information in the data cache
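
The parallelism described above can be sketched as a tick-by-tick simulation: a DMA channel copies a block from system RAM to VU1 while the CPU carries on with its own work. This is a toy model under stated assumptions. The names `DmaChannel` and `run_frame` are hypothetical, one list element stands in for one transferred unit of data, and the CPU "work" is an arbitrary stand-in operation.

```python
class DmaChannel:
    """Copies a block from 'system RAM' to a destination, one item per tick."""

    def __init__(self, source, dest):
        self.source = source
        self.dest = dest
        self.position = 0

    @property
    def busy(self) -> bool:
        return self.position < len(self.source)

    def tick(self) -> None:
        if self.busy:
            self.dest.append(self.source[self.position])
            self.position += 1


def run_frame(ram_block, cpu_work_items):
    """Run the DMA transfer and the CPU work side by side, counting ticks."""
    vu1_memory = []
    channel = DmaChannel(ram_block, vu1_memory)
    pending = list(cpu_work_items)
    cpu_results = []
    ticks = 0
    while channel.busy or pending:
        channel.tick()                                # no CPU involvement
        if pending:
            cpu_results.append(pending.pop(0) * 2)    # stand-in for CPU work
        ticks += 1
    return vu1_memory, cpu_results, ticks


vu1_memory, cpu_results, ticks = run_frame([1, 2, 3, 4], [10, 20, 30])
print(vu1_memory)    # [1, 2, 3, 4]
print(cpu_results)   # [20, 40, 60]
print(ticks)         # 4
```

Done one after the other, the same copy and the same CPU work would take 7 ticks in this model; overlapped, they finish in 4. The sketch ignores bus contention, so it shows the best case.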

VU Architecture
- VU0 and VU1 each have access to 32 float registers and 16 integer registers
- Float registers are not your average PC-style registers; they are 128 bits in size
- 128 bits can conveniently fit 4 float values at once (very similar to SIMD architecture)
- Integer registers are typically used as loop counters and address calculators
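
The register layout above can be sketched as a small Python model: 32 float registers of four components each, plus 16 integer registers used for counting and addressing. The names `VectorUnitRegisters` and `sum_vectors`, and the choice of which register holds what, are assumptions made for illustration. Real VU code is written in VU assembly, which this sketch does not attempt to reproduce.

```python
class VectorUnitRegisters:
    """Toy register file: 32 x (4 floats) and 16 integers."""

    NUM_FLOAT_REGS = 32
    NUM_INT_REGS = 16
    COMPONENTS = 4   # 128 bits / 32 bits per float

    def __init__(self):
        self.vf = [[0.0] * self.COMPONENTS for _ in range(self.NUM_FLOAT_REGS)]
        self.vi = [0] * self.NUM_INT_REGS


def sum_vectors(regs: VectorUnitRegisters, data_memory):
    """Add up a list of 4-component vectors using only the register file."""
    regs.vf[1] = [0.0] * regs.COMPONENTS       # accumulator
    regs.vi[1] = len(data_memory)              # loop counter
    regs.vi[2] = 0                             # address into data memory
    while regs.vi[1] > 0:
        regs.vf[2] = [float(c) for c in data_memory[regs.vi[2]]]   # load
        regs.vf[1] = [a + b for a, b in zip(regs.vf[1], regs.vf[2])]
        regs.vi[2] += 1                        # next address
        regs.vi[1] -= 1                        # count down
    return regs.vf[1]


regs = VectorUnitRegisters()
total = sum_vectors(regs, [(1, 2, 3, 4), (10, 20, 30, 40)])
print(total)   # [11.0, 22.0, 33.0, 44.0]
```

The float registers carry the data, four components at a time, while the integer registers only steer the loop, which matches the division of labour on the slide.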

VU0
- VU0 has two bus lines
- One bus is dedicated to the CPU
- The other bus is used to communicate with all other devices
- Access to shared bus lines always needs to be monitored
- VU0 has 4 KB of $ (4 KB I$ and 4 KB D$)
[Diagram labels: VU0 with I$ 4 KB and D$ 4 KB; dedicated bus to the CPU core; shared bus to system RAM]

Shared Buses and VU0
- Why do we need to monitor shared buses?
  - Only one process can access shared devices at a time
  - Any access operation through a shared bus causes all other processes to wait
- Using the VU registers and reducing RAM access will help reduce shared-bus access
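
The cost of sharing a bus can be made visible with a short simulation in which only one device transfers per cycle and everyone else waits. This is a sketch under stated assumptions: the function name `simulate_shared_bus` is hypothetical, and the round-robin order is chosen for simplicity rather than taken from the PS2's actual arbitration scheme.

```python
from collections import deque
from typing import Dict


def simulate_shared_bus(transfers_needed: Dict[str, int]) -> Dict[str, int]:
    """Grant the bus to one device per cycle; count cycles each device waits."""
    remaining = dict(transfers_needed)
    waits = {name: 0 for name in remaining}
    queue = deque(name for name, count in remaining.items() if count > 0)
    while queue:
        owner = queue.popleft()          # this device owns the bus this cycle
        remaining[owner] -= 1
        for name in queue:               # everyone still queued has to wait
            waits[name] += 1
        if remaining[owner] > 0:
            queue.append(owner)          # go to the back of the line
    return waits


# VU0 and the CPU both need the shared bus twice.
print(simulate_shared_bus({"VU0": 2, "CPU": 2}))   # {'VU0': 1, 'CPU': 2}

# VU0 keeps its data in registers and never touches the shared bus.
print(simulate_shared_bus({"VU0": 0, "CPU": 2}))   # {'VU0': 0, 'CPU': 0}
```

In the second run nobody waits, which is the point of the last bullet: data kept in VU registers never has to compete for the shared bus.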

VU1
- VU1 has two bus lines
- The main bus is dedicated to the GS
- Has almost identical functionality to VU0
- The main purpose of VU1 is to process data before it reaches the GS
- VU1 has 16 KB of $ (16 KB I$ and 16 KB D$)
[Diagram labels: VU1 with I$ 16 KB and D$ 16 KB; dedicated bus to the GS core; shared bus to system RAM]

Review
- PlayStation 2 is like having four 300 MHz processors: CPU + VU0 + VU1 + GS
- Cache utilization is the key to reaching the limits of this system
- VU0 is primarily for CPU vector operations
- VU1 is dedicated to geometry processing
- GS manages hardware support of triangle rasterization