Adam Kunk Anil John Pete Bohman.  Released by IBM in 2010 (~ February)  Successor of the Power6  Clock Rate: 2.4 GHz - 4.25 GHz  Feature size: 45.

Presentation transcript:

1 Adam Kunk, Anil John, Pete Bohman

2  Released by IBM in 2010 (~ February)  Successor of the Power6  Clock Rate: 2.4 GHz - 4.25 GHz  Feature size: 45 nm  ISA: Power ISA v 2.06  Cores: 4, 6, 8  Cache: L1, L2, L3 References: [1]

3  PERCS – Productive, Easy-to-use, Reliable Computing System  DARPA-funded contract that IBM won to develop the Power7 ($244 million contract, 2006) ▪ The contract was to develop a petascale supercomputer architecture before 2011 under the HPCS (High Productivity Computing Systems) project.  IBM, Cray, and Sun Microsystems received HPCS grants for Phase II.  IBM was chosen for Phase III in 2006. References: [1], [2]

4  Side note:  The Blue Waters system was meant to be the first supercomputer to use PERCS technology.  However, the contract was cancelled because of cost and complexity.

5 POWER processor roadmap:
POWER4/4+ (2001), "First Dual Core in Industry":  Dual Core  Chip Multi Processing  Distributed Switch  Shared L2  Dynamic LPARs (32)  180nm
POWER5/5+ (2004), "Hardware Virtualization for Unix & Linux":  Dual Core & Quad Core Md  Enhanced Scaling  2 Thread SMT  Distributed Switch +  Core Parallelism +  FP Performance +  Memory bandwidth +  130nm, 90nm
POWER6/6+ (2007), "Fastest Processor In Industry":  Dual Core  High Frequencies  Virtualization +  Memory Subsystem +  Altivec  Instruction Retry  Dyn Energy Mgmt  2 Thread SMT +  Protection Keys  65nm
POWER7/7+ (2010), "Most POWERful & Scalable Processor in Industry":  4,6,8 Core  32MB On-Chip eDRAM  Power Optimized Cores  Mem Subsystem ++  4 Thread SMT++  Reliability +  VSM & VSX  Protection Keys+  45nm, 32nm
POWER8: Future
References: [3]

6 Cores:  8 Intelligent Cores / chip (socket)  4 and 6 Intelligent Cores available on some models  12 execution units per core  Out of order execution  4 Way SMT per core  32 threads per chip  L1 – 32 KB I Cache / 32 KB D Cache per core  L2 – 256 KB per core Chip:  32MB Intelligent L3 Cache on chip [Diagram: chip layout showing eight Core/L2 pairs, the eDRAM L3 cache, the memory interface (Memory++), GX, SMP fabric, and POWER bus.] References: [3]
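The per-core and per-chip cache figures on this slide can be combined with simple arithmetic. The sketch below is only an illustration for the 8-core configuration; the constant and function names are hypothetical, and the "L3 per core" value is just the 32 MB total divided evenly, not a statement about how the hardware partitions the cache.

```python
# Cache figures as listed on the slide (8-core chip assumed).
CORES = 8
L1_I_KB = 32      # L1 instruction cache per core
L1_D_KB = 32      # L1 data cache per core
L2_KB = 256       # L2 cache per core
L3_MB = 32        # on-chip eDRAM L3, shared by the whole chip


def cache_totals(cores=CORES):
    """Return aggregate cache sizes for one chip (illustrative helper)."""
    return {
        "l1_total_kb": cores * (L1_I_KB + L1_D_KB),
        "l2_total_kb": cores * L2_KB,
        "l3_per_core_mb": L3_MB / cores,  # even split, an assumption
    }


if __name__ == "__main__":
    for name, value in cache_totals().items():
        print(name, value)
```

For eight cores this gives 512 KB of L1 and 2048 KB of L2 across the chip, and 4 MB of L3 per core if the 32 MB is divided evenly.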

7

8  Each core implements “aggressive” out-of-order (OoO) instruction execution  The processor has an Instruction Sequence Unit capable of dispatching up to six instructions per cycle to a set of queues  Up to eight instructions per cycle can be issued to the Instruction Execution units References: [4]

9  The Power7 processor has a set of 12 execution units:  2 fixed point units  2 load store units  4 double precision floating point units  1 vector unit  1 branch unit  1 condition register unit  1 decimal floating point unit References: [4]
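As a quick consistency check, the unit counts listed on this slide can be tallied to confirm that they add up to the stated 12 execution units per core. The dictionary keys below are descriptive labels chosen for this sketch, not official IBM unit names.

```python
# Execution unit counts per core, copied from the slide.
EXECUTION_UNITS = {
    "fixed point": 2,
    "load store": 2,
    "double precision floating point": 4,
    "vector": 1,
    "branch": 1,
    "condition register": 1,
    "decimal floating point": 1,
}


def total_units(units=EXECUTION_UNITS):
    """Sum the per-type counts to get execution units per core."""
    return sum(units.values())


if __name__ == "__main__":
    print("execution units per core:", total_units())
```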

10

11

12

13  Simultaneous Multithreading  SMT1: Single instruction execution thread per core  SMT2: Two instruction execution threads per core  SMT4: Four instruction execution threads per core  This means that an 8-core Power7 can execute 32 threads simultaneously
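The thread count quoted above is the product of the core count and the SMT mode. A minimal sketch, assuming every core runs in the same SMT mode (the function name is hypothetical):

```python
# SMT modes described on the slide: threads per core.
SMT_MODES = {"SMT1": 1, "SMT2": 2, "SMT4": 4}


def hardware_threads(cores, smt_mode):
    """Threads a chip can run at once: cores times threads per core."""
    return cores * SMT_MODES[smt_mode]


if __name__ == "__main__":
    # Core counts offered for the Power7: 4, 6, and 8.
    for cores in (4, 6, 8):
        for mode in SMT_MODES:
            print(cores, "cores,", mode, "->", hardware_threads(cores, mode))
```

With 8 cores in SMT4 mode this gives the 32 threads stated on the slide.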

14 [Figure: execution-unit occupancy (FX0, FX1, FP0, FP1, LS0, LS1, BRX, CRL) under four threading models: single thread out of order, S80 HW multi-thread, POWER5 2 Way SMT, and POWER7 4 Way SMT. Legend: no thread executing, thread 0 executing, thread 1 executing, thread 2 executing, thread 3 executing.] References: [3]

15

16

17  (Look at section 2.1.4 in http://www.redbooks.ibm.com/redpapers/pdfs/redp4639.pdf)

18  (Look at section 2.1.6 in http://www.redbooks.ibm.com/redpapers/pdfs/redp4639.pdf)

19

20

21  1. http://en.wikipedia.org/wiki/POWER7  2. http://en.wikipedia.org/wiki/PERCS  3. Central PA PUG POWER7 review.ppt  http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCEQFjAA&url=http%3A%2F%2Fwww.ibm.com%2Fdeveloperworks%2Fwikis%2Fdownload%2Fattachments%2F135430247%2FCentral%2BPA%2BPUG%2BPOWER7%2Breview.ppt&ei=3El3T6ejOI-40QGil-GnDQ&usg=AFQjCNFESXDZMpcC2z8y8NkjE-v3S_5t3A

22  4. http://www.redbooks.ibm.com/redpapers/pdfs/redp4639.pdf

