Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 May 2, 2006 Session 29.

1 Computer Science and Engineering Copyright by Hesham El-Rewini Advanced Computer Architecture CSE 8383 May 2, 2006 Session 29

2 Contents: Group work, Exams, Assignments, Project Presentations, Literature Search, Lectures

3 Put-it-all-together: Memory System Design; Pipeline Design Techniques; Multiprocessors; Shared Memory Systems; Message Passing Systems; Multiprocessor Systems-on-Chips; Network Computing

4 Put-it-all-together: Memory System Design

5 Memory Hierarchy (figure): levels are CPU Registers, Cache, Main Memory, Secondary Storage; properties compared across levels are Latency, Bandwidth, Speed, Cost per bit

6 Pentium IV two-level cache (figure): Processor, Cache Level 1 (L1), Cache Level 2 (L2), Main Memory

7 Placement Policies: how to map memory blocks (lines) to cache block frames (line frames). Policies: Direct Mapping, Fully Associative, Set Associative

8 Direct Mapping (figure): memory blocks 0 to 4095; cache block frames 0 to 127, each with a 5-bit tag. Address fields: Tag (5 bits), Block frame (7 bits), Word (4 bits)

9 Example: Fully Associative (figure): memory blocks 0 to 4095; cache block frames 0 to 127, each with a 12-bit tag. Address fields: Tag (12 bits), Word (4 bits)

10 Example: Set Associative (figure): memory blocks 0 to 4095; cache block frames 0 to 127 grouped into Set 0 to Set 31, each frame with a 7-bit tag. Address fields: Tag (7 bits), Set (5 bits), Word (4 bits)
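
The three placement policies differ only in how an address is split into fields. The following is a minimal Python sketch, assuming the geometry suggested by the slide figures (4096 memory blocks, 128 cache block frames, 16 words per block, 32 sets of 4 frames, hence a 16-bit word address). The function names are illustrative and do not come from the slides.

```python
# Assumed geometry, read off the slide figures: 16-bit word address,
# 4-bit word offset, 128 block frames, 32 sets of 4 frames each.
WORD_BITS = 4
FRAME_BITS = 7   # direct mapping: 128 block frames
SET_BITS = 5     # set associative: 32 sets


def split_direct(addr):
    """Return (tag, block_frame, word) for direct mapping: 5 + 7 + 4 bits."""
    word = addr & ((1 << WORD_BITS) - 1)
    frame = (addr >> WORD_BITS) & ((1 << FRAME_BITS) - 1)
    tag = addr >> (WORD_BITS + FRAME_BITS)
    return tag, frame, word


def split_fully_associative(addr):
    """Return (tag, word): 12 + 4 bits; the block may go in any frame."""
    return addr >> WORD_BITS, addr & ((1 << WORD_BITS) - 1)


def split_set_associative(addr):
    """Return (tag, set_index, word): 7 + 5 + 4 bits."""
    word = addr & ((1 << WORD_BITS) - 1)
    set_index = (addr >> WORD_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (WORD_BITS + SET_BITS)
    return tag, set_index, word


# Memory block 129, word 3: block 129 maps to frame 129 mod 128 = 1 (tag 1)
# under direct mapping, and to set 129 mod 32 = 1 (tag 4) with 32 sets.
addr = (129 << WORD_BITS) | 3
print(split_direct(addr))
print(split_fully_associative(addr))
print(split_set_associative(addr))
```

In every scheme the tag is whatever remains of the block number after the index bits are removed, which is why the fully associative tag is the full 12-bit block number.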

11 Put-it-all-together: Pipeline Design Techniques

12 Pipeline (figure): a task is divided into sub-tasks 1, 2, ..., n; a stream of tasks flows through pipeline stages 1, 2, ..., n

13 5 Tasks on 4-stage pipeline (figure): Task 1 to Task 5 plotted against Time 1 to 8

14 Speedup: an n-stage pipeline, each stage taking time t, processes a stream of m tasks. T(Seq) = n * m * t. T(Pipe) = n * t + (m - 1) * t. Speedup = (n * m) / (n + m - 1)
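
The speedup expression can be checked numerically. This is a small sketch, assuming equal stage times as on the slide; the function name is illustrative.

```python
def pipeline_speedup(n, m):
    """Speedup of an n-stage pipeline over sequential execution of m tasks.

    Assumes every stage takes the same time t, so t cancels out:
    T(Seq) = n * m * t and T(Pipe) = (n + m - 1) * t.
    """
    return (n * m) / (n + m - 1)


print(pipeline_speedup(4, 5))       # 5 tasks on a 4-stage pipeline
print(pipeline_speedup(4, 10_000))  # approaches n = 4 as m grows
```

For 5 tasks on 4 stages the pipeline needs 4 + 5 - 1 = 8 time units instead of 20, and for long task streams the speedup tends toward the number of stages.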

15 Linear Pipeline: processing stages are linearly connected; perform fixed function. Synchronous Pipeline: clocked latches between Stage i and Stage i+1; equal delays in all stages. Asynchronous Pipeline (Handshaking)

16 Reservation Table (figure): stages S1 to S4 against Time, with one X per stage

17 Non Linear Pipelines: Variable functions; Feed-Forward; Feedback

18 3 stages & 2 functions (figure): stages S1, S2, S3; functions X and Y

19 Reservation Tables for X & Y (figure): two tables with rows S1, S2, S3; the X table marks 3, 2, and 3 cells per row, the Y table marks 2, 1, and 3 cells per row

20 State Diagram (figure): states 1011010, 1111111, 1011011; arc labels 3, 6, 8+, 3*, 1*
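
The states of such a diagram are collision vectors derived from a reservation table. The sketch below is illustrative only: the cell positions of the X table are not recoverable from the slide text, so `table_x` is an assumed 8-cycle table chosen to have the same number of marks per row. The function names are hypothetical.

```python
def forbidden_latencies(table):
    """Latencies that make two initiations collide in some stage.

    `table` maps a stage name to the clock cycles in which it is busy.
    """
    forbidden = set()
    for cycles in table.values():
        for a in cycles:
            for b in cycles:
                if b > a:
                    forbidden.add(b - a)
    return forbidden


def collision_vector(table):
    """Bit string C_m ... C_1, where C_i = 1 means latency i is forbidden."""
    forbidden = forbidden_latencies(table)
    m = max(forbidden, default=0)
    return "".join("1" if i in forbidden else "0" for i in range(m, 0, -1))


def next_state(state, initial, latency):
    """State reached by a new initiation `latency` cycles later.

    Shift the current state right by the latency, then OR in the
    initial collision vector. States are bit strings of equal width.
    """
    shifted = int(state, 2) >> latency
    return format(shifted | int(initial, 2), f"0{len(initial)}b")


# Assumed reservation table for function X (not taken from the slide).
table_x = {"S1": [1, 6, 8], "S2": [2, 4], "S3": [3, 5, 7]}
cv = collision_vector(table_x)
print(sorted(forbidden_latencies(table_x)))
print(cv)
for latency in (1, 3, 6, 8):  # the permitted latencies for this table
    print(latency, next_state(cv, cv, latency))
```

With this assumed table the initial collision vector and the two states reached from it coincide with the three bit strings on the slide, which suggests, but does not prove, that the slide uses a table of this shape.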

21 Put-it-all-together: Multiprocessors; Shared Memory Systems; Message Passing Systems; Multiprocessor Systems-on-Chips; Network Computing

22 Types of Parallelism (Flynn's Taxonomy)
Single Instruction Stream, Single Data Stream: SISD (Uniprocessors)
Single Instruction Stream, Multiple Data Stream: SIMD (Array Processors, Vector)
Multiple Instruction Stream, Single Data Stream: MISD
Multiple Instruction Stream, Multiple Data Stream: MIMD (Multiprocessors, Multicomputers)

23 Amdahl's Law (figure): a trip from A to B. Speeds: Walk 4 miles/hour, Bike 10 miles/hour, Car-1 50 miles/hour, Car-2 120 miles/hour, Car-3 600 miles/hour. Figure labels: 200 miles; 20 hours, must walk
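
The travel analogy can be worked through numerically. The sketch assumes one reading of the figure labels: 200 miles may be covered by any vehicle, while a further stretch always takes 20 hours on foot. That reading, and all names in the code, are assumptions.

```python
# Assumed reading of the slide: 200 miles can be covered by any vehicle,
# and a further stretch always takes 20 hours on foot.
SPEEDS_MPH = {"Walk": 4, "Bike": 10, "Car-1": 50, "Car-2": 120, "Car-3": 600}
DISTANCE_MILES = 200
WALK_ONLY_HOURS = 20


def trip_hours(speed_mph):
    """Total travel time: the fixed walking part plus the 200-mile part."""
    return WALK_ONLY_HOURS + DISTANCE_MILES / speed_mph


baseline = trip_hours(SPEEDS_MPH["Walk"])
for name, speed in SPEEDS_MPH.items():
    hours = trip_hours(speed)
    print(f"{name:6s} {hours:6.2f} h  speedup {baseline / hours:.2f}")
```

Under this reading, no vehicle can bring the trip below 20 hours, so the speedup over walking stays below 70 / 20 = 3.5 however fast the car is.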

24 Amdahl's Law (plot): Speedup (0 to 25) against % Serial (10% to 99%), with curves for 4 CPUs, 16 CPUs, and 1000 CPUs
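
The curves in such a plot follow from the standard statement of Amdahl's law, speedup = 1 / (s + (1 - s) / p) for serial fraction s and p processors. The formula is not spelled out on the slide; the sketch below is a minimal illustration with assumed sample points.

```python
def amdahl_speedup(serial_fraction, cpus):
    """Amdahl's law: speedup of a fixed-size problem on `cpus` processors."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)


for cpus in (4, 16, 1000):
    row = [round(amdahl_speedup(s, cpus), 2) for s in (0.1, 0.5, 0.9)]
    print(cpus, row)
```

Even with 1000 CPUs, a 10% serial fraction caps the speedup just below 10, since the limit as p grows is 1 / s.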

25 Gustafson-Barsis Law (1988): Gordon Bell Prize; overcoming the conceptual barrier established by Amdahl's law; scale the problem to the size of the parallel system; no fixed size problem

26 Amdahl vs. Gustafson-Barsis (plot): Speedup (0 to 100) against % Serial (10% to 99%), with one curve for Gustafson-Barsis and one for Amdahl
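
The two laws can be compared side by side. The sketch uses the usual formulas, which are not written on the slides: Amdahl's fixed-size speedup 1 / (s + (1 - s) / p) and the Gustafson-Barsis scaled speedup s + (1 - s) * p, where s is the serial fraction (measured on the parallel system in the scaled case). The choice of 100 CPUs is an assumption for illustration.

```python
def amdahl_speedup(serial_fraction, cpus):
    """Fixed-size speedup: the problem stays the same as CPUs are added."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)


def gustafson_speedup(serial_fraction, cpus):
    """Scaled speedup: the parallel part grows with the number of CPUs."""
    return serial_fraction + (1.0 - serial_fraction) * cpus


CPUS = 100  # assumed machine size, for illustration only
for percent in (10, 50, 90, 99):
    s = percent / 100
    print(f"{percent:3d}% serial  Amdahl {amdahl_speedup(s, CPUS):6.2f}"
          f"  Gustafson-Barsis {gustafson_speedup(s, CPUS):6.2f}")
```

The scaled speedup falls linearly with the serial fraction, while the fixed-size speedup collapses quickly, which is the gap the comparison plot illustrates.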

27 SIMD Systems (figure): a von Neumann computer controlling an array of processor (P) and memory (M) pairs joined by some interconnection network. One control unit. Lockstep. All Ps do the same or nothing

28 MIMD Shared Memory Systems (figure): processors (P), each with a cache (C), connected through interconnection networks to memory modules (M) that form a global memory. One global memory. Cache Coherence. All Ps have equal access to memory

29 Cache Coherent NUMA (figure): nodes, each with memory (M), cache (C), and processor (P), joined by an interconnection network. Each P has part of the shared memory. Non-uniform memory access

30 MIMD Distributed Memory Systems (figure): processor (P) and memory (M) pairs joined by interconnection networks; example networks include nodes with 4-bit labels (0000 to 1111), a switch (S), and a LAN/WAN. No shared memory. Message Passing. Topology
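
The 4-bit node labels in the figure are consistent with a 4-dimensional hypercube, one common message-passing topology. Assuming that reading, the sketch below shows how neighbors and hop counts follow from the labels; the function names are illustrative.

```python
def hypercube_neighbors(node, dimensions=4):
    """Nodes whose label differs from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hop_count(src, dst):
    """Minimum number of links between two nodes (Hamming distance)."""
    return bin(src ^ dst).count("1")


node = 0b1010
print([format(n, "04b") for n in hypercube_neighbors(node)])
print(hop_count(0b0000, 0b1111))
```

Each node has as many links as the cube has dimensions, and a message never needs more hops than that number.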

31 Cluster Architecture (figure): nodes, each with M, C, P, I/O, and OS, joined by an interconnection network beneath Middleware and a Programming Environment. Label: Home cluster

32 Grids (figure label: Internet): dependable, consistent, pervasive, and inexpensive access to high-end computing. Geographically distributed platforms

33 Multi-core: Gate delay does not reduce much. The frequency and performance of each core are the same as, or a little less than, those of the previous generation. Figure labels: Generation N cores; Technology Generation N; Technology Generation N+1

34 From HT to Many-Core (plot): increasing HW threads (1, 10, 100) over 2003 to 2013. HT, then the Multi-core Era (scalar and parallel applications), then the Many-core Era (massively parallel applications). Intel predicts 100's of cores on a chip in 2015

35 Four Eras
Era: 1970 | 1980 | 1990 | 2000 | Beyond 2000
Parallelism Level: Processor level | Machine level (In box) | LAN level | WAN level | Chip level
Architecture: Vector | SMP / MPP | Cluster | Grid | Multi-Core
Threads: One | Multiple
Interconnection Network: None | Bus, switch, mesh, hypercube | Ethernet, Switch | Internet | On Chip
System: Custom | Commodity | Combination | SoC
Programming: Vector Fortran | C*, C-Linda, Occam, many others | PVM, MPI, HPF, … | MPI, OpenMP, … | ?

36 Degree of Coupling (figure): architectures in order SIMD, SMP, CC-NUMA, DMPC, Cluster, Grid, grouped as SIMD, MIMD Shared Memory, and MIMD Distributed Memory. Scales: Supported Grain Sizes (fine, coarse); Communication Speed (slow, fast); coupling (loose, tight). Annotation: On Chip!

37 Good Luck to You!!!

