+ CS 325: CS Hardware and Software Organization and Architecture Multicore Computers 1.



+ Outline
• Introduction
• Motivation for Multi-Core
• What is a multi-core processor?
• Properties of multi-core systems
• Applications that benefit from multi-core
• Multiprocessor memory types
• Multi-core design
• Symmetric multi-core processor
• Asymmetric multi-core processor
• Advantages & disadvantages of multi-core

+ Hardware Performance Issues
• Microprocessors have seen an exponential increase in performance
  ▪ Improved organization
  ▪ Increased clock frequency
• Increase in parallelism
  ▪ Pipelining
  ▪ Superscalar (multi-issue)
  ▪ Simultaneous multithreading (SMT)
• Diminishing returns
  ▪ More complexity requires more logic
  ▪ Increasing chip area for coordinating and signal transfer logic
  ▪ Harder to design, make and debug

+ Introduction
• Flood of computer tasks (1990s)
  ▪ Increasing number of computer users
  ▪ Server management
  ▪ We need better performance from PCs and servers.
  → These demands accelerated the development of microprocessors.
• Emergence of the multi-core processor (2000s)
  ▪ Improvements over single core
  ▪ Put multiple execution cores on one die

+ Increased Complexity
• Power requirements grow exponentially with chip density and clock frequency
  ▪ Can use more chip area for cache
  ▪ Cache (memory) transistors are smaller
  ▪ Cache needs an order of magnitude less power than logic
• By 2016: >100 billion transistors on a 300 mm² die
  ▪ >1 billion transistors for logic

+ Increased Complexity
• Multicore has the potential for near-linear improvement
  ▪ Needs some programming effort
  ▪ Won't work for all problems
• It is unlikely that one core can use all of a huge cache effectively, so add processing units (cores) to make an MPSoC (Multiprocessor System-on-Chip)
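The "near-linear improvement" claim, and the caveat that it will not work for all problems, can be made concrete with Amdahl's law. The slide does not name this law; it is brought in here as a standard model, and the function name and sample fractions below are illustrative assumptions. A minimal sketch:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only `parallel_fraction` of the work scales.

    Amdahl's law: speedup = 1 / ((1 - p) + p / n).
    This is a model, not a measurement; real chips also lose time to
    memory contention and synchronization.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, chosen only to show the trend.
    for p in (0.5, 0.9, 0.99):
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (2, 4, 8)
        )
        print(f"parallel fraction {p:.2f} -> {row}")
```

Under this model the speedup approaches the core count only when almost all of the work is parallel; any serial portion caps the gain no matter how many cores are added.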

+ Power and Memory Considerations
[Figure. Annotations: "Less action", "More action", "We passed 50%!!!", "Is this a RAM or a processor?"]

+ Chip Utilization of Transistors
[Figure. Labels: Cache, CPU]

+ Effective Applications for Multicore Processors
• Database (e.g. SELECT *)
• Servers handling independent transactions
• Multi-threaded native applications
  ▪ Lotus Domino, Siebel CRM
• Multi-process applications
  ▪ Oracle, SAP, PeopleSoft
• Java applications
  ▪ The Java VM is multi-threaded, with scheduling and memory management
  ▪ Sun's Java Application Server, IBM WebSphere, Tomcat
• Multi-instance applications
  ▪ One application running multiple times

+ Motivation for Multi-Core
• Exploits shrinking feature size and increased transistor density
• Increases functional units per chip
• Limits energy consumption per operation
• Constrains growth in processor complexity

+ Multi-Core Computer
• A multi-core processor is a processing system composed of two or more independent cores (or CPUs). The cores are typically integrated onto a single integrated circuit die (known as a chip multiprocessor or CMP).
• A many-core processor is one in which the number of cores is large enough that traditional multi-processor techniques are no longer efficient.
  ▪ This is somewhere in the range of several tens of cores, and likely requires a network on chip.

+ Multi-Core Computer
A dual-core processor contains two independent microprocessors. A dual-core setup is somewhat comparable to having multiple, separate processors installed in the same computer, but because the two processors are plugged into the same socket, the connection between them is faster. Ideally, a dual-core processor is nearly twice as powerful as a single-core processor. In practice, performance gains are about 50%: a dual-core processor is likely to be about one-and-a-half times as powerful as a single-core processor.

+ Multi-Core Computer
• A multi-core processor implements multiprocessing in a single physical package.
• Cores may or may not share caches.
• Cores may communicate by message passing or through shared memory.
• All cores are identical in symmetric multi-core systems. Example: Intel Core 2 Duo.
• Cores are not identical in asymmetric multi-core systems. Example: IBM Cell processor.

+ CMP benefits
• With a shared on-chip cache memory, communication events can be reduced to just a handful of processor cycles.
• With such low latencies, communication delays have a much smaller impact on overall performance.
• Threads can also be much smaller and still be effective.
• Automatic parallelization becomes more feasible.

+ Core i7 and Duo Let us review these two Intel architectures… 15

+ Individual Core Architecture
• Intel Core Duo uses superscalar cores
  ▪ More than one instruction can be executed during a clock cycle
• Intel Core i7 uses simultaneous multithreading (SMT)
  ▪ Scales up the number of threads supported (an extended superscalar architecture)
  ▪ Example: 4 SMT cores, each supporting 4 threads, appear to software as 16 cores (the i7 supports 2 threads per core)
[Figure. Labels: Core i7, Core 2 Duo]

+ Intel x86 Multicore Organization - Core Duo
• Introduced 2006
• Two x86 superscalar cores, shared L2 cache
• Dedicated L1 cache per core
  ▪ 32 KB instruction and 32 KB data
• Thermal control unit per core
  ▪ Manages chip heat dissipation: sensors detect heat and the clock speed is throttled
  ▪ Maximizes performance within thermal constraints
  ▪ Improved ergonomics (quiet fan)
• Advanced Programmable Interrupt Controller (APIC)
  ▪ Inter-processor interrupts between cores
  ▪ Routes interrupts to the appropriate core
  ▪ Includes a timer so the OS can self-interrupt a core

+ Intel x86 Multicore Organization - Core Duo
• Power management logic
  ▪ Monitors thermal conditions and CPU activity
  ▪ Adjusts voltage (and thus power consumption)
  ▪ Can switch individual logic subsystems on/off to save power
  ▪ Split-bus transactions can sleep on one end
• 2 MB shared L2 cache
  ▪ Dynamic allocation
  ▪ MESI support for L1 caches
  ▪ Extended to support multiple Core Duo chips in SMP (not SMT)
  ▪ L2 data shared between local cores (fast) or external
• Bus interface is the front-side bus (FSB)

+ Intel x86 Multicore Organization - Core i7
• Introduced November 2008
• Four x86 SMT processors
• Dedicated L2, shared L3 cache
• Speculative pre-fetch for caches
• On-chip DDR3 memory controller
  ▪ Three 8-byte channels (192 bits) giving 32 GB/s
  ▪ No front-side bus (just like labs 1 & 2 with the SDRAM controller)
• QuickPath Interconnect (QPI)
  ▪ Cache-coherent point-to-point link
  ▪ High-speed communication between processor chips
  ▪ Total bandwidth 25.6 GB/s
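The two bandwidth figures on this slide can be checked with simple arithmetic. The slide gives only the channel count, channel width and totals; the transfer rates below (DDR3 at 1333 MT/s, QPI at 6.4 GT/s carrying 2 bytes per transfer in each direction) are assumptions added here to make the numbers work out, and the function names are illustrative. A rough sketch:

```python
def memory_bandwidth_gb_s(channels: int, bytes_per_transfer: int,
                          mega_transfers_per_s: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * bytes_per_transfer * mega_transfers_per_s * 1e6 / 1e9


def qpi_bandwidth_gb_s(giga_transfers_per_s: float,
                       bytes_per_transfer: int = 2,
                       directions: int = 2) -> float:
    """Peak link bandwidth in GB/s, summed over both directions."""
    return giga_transfers_per_s * bytes_per_transfer * directions


if __name__ == "__main__":
    # From the slide: three channels, 8 bytes each.
    print("memory width in bits:", 3 * 8 * 8)
    # ASSUMPTION: DDR3-1333, i.e. 1333 mega-transfers per second.
    print("memory GB/s:", memory_bandwidth_gb_s(3, 8, 1333))
    # ASSUMPTION: 6.4 GT/s, 2 bytes per transfer, two directions.
    print("QPI GB/s:", qpi_bandwidth_gb_s(6.4))
```

With these assumed rates the results land on the slide's 192 bits, roughly 32 GB/s and 25.6 GB/s.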

+ What applications benefit from multi-core?
• Database servers
• Web servers
• Telecommunication markets
• Multimedia applications
• Scientific applications
• In general, applications with thread-level parallelism (as opposed to instruction-level parallelism)

+ Multi-core architectures
• Replicate multiple processor cores on a single die.
• The cores fit on a single processor socket.

+ The cores run in parallel
• Each core (core 1 to core 4) runs several threads.
• Within a core, threads are time-sliced (as on a uniprocessor).

+ Programming for multi-core
• Programmers must use threads or processes.
• Write parallel algorithms.
• The OS will map threads/processes to cores.
• Spread the workload across multiple cores.
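As an illustration of spreading a workload across cores, the sketch below splits a sum of squares into chunks and hands them to a worker pool from Python's standard `concurrent.futures` module; the operating system then decides which core runs each worker. The workload and the helper names (`split_range`, `sum_of_squares`, `parallel_sum_of_squares`) are hypothetical and chosen only for the example.

```python
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def split_range(n, parts):
    """Split range(n) into `parts` contiguous (start, stop) chunks of near-equal size."""
    step, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        chunks.append((start, stop))
        start = stop
    return chunks


def sum_of_squares(chunk):
    """Work done by one worker: sum i*i over its own chunk."""
    start, stop = chunk
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, workers=None, use_processes=True):
    """Sum i*i for i in range(n) using one chunk per worker.

    In standard CPython, threads share the global interpreter lock, so
    CPU-bound work only runs on several cores when processes are used.
    """
    workers = workers or os.cpu_count() or 1
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=workers) as pool:
        # The pool schedules chunks onto workers; the OS maps workers to cores.
        return sum(pool.map(sum_of_squares, split_range(n, workers)))
```

When using processes, call `parallel_sum_of_squares(...)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on every platform.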

+ Examples
• Editing a photo while recording a TV show through a digital video recorder.
• Downloading software while running an anti-virus program.
• "Anything that can be threaded today will map efficiently to multi-core."
• BUT: some applications are difficult to parallelize. Examples? Piped processes.

+ Multiprocessor memory types
• Shared memory: there is one (large) common shared memory for all processors.
• Distributed memory: each processor has its own (small) local memory, and its content is not replicated anywhere else.
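The two memory models lead to two programming styles, which the sketch below contrasts. It runs on ordinary Python threads, so the second function only imitates the distributed-memory discipline (private data, results exchanged as messages); it is not a real multi-machine setup. All function names are illustrative.

```python
import queue
import threading


def _run(worker, chunks):
    """Start one thread per chunk and wait for all of them to finish."""
    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def shared_memory_sum(values, workers=4):
    """Shared-memory style: every worker updates one common total."""
    total = [0]                      # single location visible to all workers
    lock = threading.Lock()

    def worker(chunk):
        partial = sum(chunk)
        with lock:                   # protect the shared location
            total[0] += partial

    _run(worker, [values[i::workers] for i in range(workers)])
    return total[0]


def message_passing_sum(values, workers=4):
    """Distributed-memory style: private data in, one message out per worker."""
    mailbox = queue.Queue()

    def worker(private_chunk):
        mailbox.put(sum(private_chunk))   # communicate only by sending a result

    # Each worker gets its own copy of its slice; nothing else is shared.
    _run(worker, [list(values[i::workers]) for i in range(workers)])
    return sum(mailbox.get() for _ in range(workers))


if __name__ == "__main__":
    data = list(range(100))
    print(shared_memory_sum(data), message_passing_sum(data))
```

The shared-memory version needs a lock because all workers touch the same location; the message-passing version avoids shared state at the cost of explicit communication.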

+ Microprocessor Design Taking the idea of superscalar operations to the next level, it is possible to put multiple microprocessor cores onto a single chip, and have the cores operate in parallel with one another. 26

+ Symmetric Multi-core Processor (SMP)
• A symmetric multi-core processor has multiple cores on a single chip, and all of those cores are identical.
• Example: the Intel i series (i3, i5, i7), which can have either 2 cores on chip ("i3") or 4 cores on chip ("i5/i7").
• The cores in an i series CPU are identical, and each can function independently of the others.
• A mixture of scheduling software and hardware is required to farm tasks out to each core.

+ Symmetric Multi-core Processor Applications
• Personal computers
• Servers/clusters

+ Asymmetric Multi-core Processor
• An asymmetric multi-core processor is one that has multiple cores on a single chip, but those cores might be of different designs.
• For instance, there could be 2 general-purpose cores and 2 vector cores on a single chip.

+ Asymmetric Multi-core Processor (ASMP): Cell Processor Applications
• Supercomputing
  ▪ The IBM Roadrunner supercomputer is a hybrid of general-purpose CISC Opteron processors and Cell processors.

+ Asymmetric Multi-core Processor (ASMP): Cell Processor Applications
• Home cinema
  ▪ Toshiba is considering producing HDTVs using Cell. They have already presented a system that decodes 48 standard-definition MPEG-2 streams, which lets a viewer choose a channel from dozens of thumbnail videos displayed on the screen at the same time.

+ Asymmetric Multi-core Processor (ASMP): Cell Processor Applications
• Video processing card
  ▪ Some companies, such as Leadtek, have plans to release a PCI-E card based on the Cell to allow "faster than real time" transcoding of H.264, MPEG-2 and MPEG-4 video.

+ Asymmetric Multi-core Processor (ASMP): Cell Processor Applications
• Console video games
  ▪ The first major commercial application of Cell was Sony's PlayStation 3 game console.
  ▪ Its Cell processor is clocked at 3.2 GHz, with seven of its eight Synergistic Processing Element (SPE) cores operational.

+ Asymmetric Multi-core Processor (ASMP): Cell Processor Future
• Based on its unique features, Cell can bridge the gap between conventional desktop processors and more specialized high-performance processors, such as the NVIDIA and ATI graphics processors (GPUs).

+ Challenges resulting from multi-core
• Aggravates the memory wall
  ▪ Memory bandwidth
    - Way to get data out of memory banks
    - Way to get data into the multi-core processor array
  ▪ Memory latency
  ▪ Fragments L3 cache
• Pins become the strangle point
  ▪ Rate of pin growth projected to slow and flatten
  ▪ Rate of bandwidth per pin (pair) projected to grow slowly
• Requires mechanisms for efficient inter-processor coordination
  ▪ Synchronization
  ▪ Mutual exclusion
  ▪ Context switching
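Two of the coordination mechanisms listed above, synchronization and mutual exclusion, can be shown in a few lines with Python's standard `threading` module. The two-phase workload and the name `run_phases` are made up for illustration; the barrier and the lock are the parts that matter.

```python
import threading


def run_phases(workers=4):
    """Two-phase computation coordinated by a barrier and a lock."""
    slots = [0] * workers            # phase 1 output, one slot per worker
    neighbour_sums = [0] * workers   # phase 2 output
    finished = 0                     # shared counter updated by every worker
    finished_lock = threading.Lock()
    barrier = threading.Barrier(workers)

    def worker(i):
        nonlocal finished
        slots[i] = (i + 1) * 10      # phase 1: publish own value
        barrier.wait()               # synchronization: wait until all slots are written
        # Phase 2 reads a neighbour's slot, which is safe only after the barrier.
        neighbour_sums[i] = slots[i] + slots[(i + 1) % workers]
        with finished_lock:          # mutual exclusion: one updater at a time
            finished += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return neighbour_sums, finished


if __name__ == "__main__":
    print(run_phases())
```

Without the barrier, a worker could read a neighbour's slot before it was written; without the lock, concurrent updates to the counter could be lost.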

+ Advantages of Multi-core
• Cache circuitry can operate at a much higher clock rate than is possible when the signals have to travel off-chip.
• Signals between different CPUs (cores) travel shorter distances, so they degrade less.
• These higher-quality signals allow more data to be sent in a given time period.
• A dual-core processor uses slightly less power than two coupled single-core processors.

+ Disadvantages of Multi-core
• The ability of multi-core processors to increase application performance depends on the use of multiple threads within applications.
• Most current video games will run faster on a 3 GHz single-core processor than on a 2 GHz dual-core processor (of the same core architecture).
• Two processing cores sharing the same system bus and memory bandwidth limit the real-world performance advantage.
• If a single core is close to being memory-bandwidth limited, going to dual-core might give only a 30% to 70% improvement.
• If memory bandwidth is not a problem, a 90% improvement can be expected.
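The 3 GHz single-core versus 2 GHz dual-core comparison can be explored with a simplified model in which performance scales with clock rate and only a fraction of the work can be split across cores. This Amdahl-style model is an assumption added here, not something the slide states, and it ignores the memory-bandwidth effects mentioned above. The function names are illustrative.

```python
def relative_performance(clock_ghz, cores, parallel_fraction, baseline_clock_ghz):
    """Performance relative to one core running at `baseline_clock_ghz`.

    ASSUMED model: speed scales linearly with clock rate, and only
    `parallel_fraction` of the work is divided among the cores.
    """
    scaled_time = (1.0 - parallel_fraction) + parallel_fraction / cores
    return (clock_ghz / baseline_clock_ghz) / scaled_time


def break_even_parallel_fraction(slow_clock_ghz, fast_clock_ghz, cores):
    """Parallel fraction at which `cores` slower cores match one faster core.

    Solves (slow / fast) / ((1 - p) + p / cores) = 1 for p.
    """
    if cores < 2:
        raise ValueError("need at least 2 cores for a break-even point")
    return (1.0 - slow_clock_ghz / fast_clock_ghz) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    p = break_even_parallel_fraction(2.0, 3.0, 2)
    print(f"break-even parallel fraction: {p:.3f}")
    # Hypothetical fractions for a mostly serial and a mostly parallel program.
    for frac in (0.3, 0.9):
        perf = relative_performance(2.0, 2, frac, 3.0)
        print(f"parallel fraction {frac}: dual-core at {perf:.2f}x the single core")
```

In this model the 2 GHz dual-core only overtakes the 3 GHz single-core once about two thirds of the work runs in parallel, which is consistent with the slide's remark about lightly threaded games.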

+ Conclusion
• Multi-core processors represent an important new trend in computer architecture.
  ▪ Decreased power consumption and heat generation.
  ▪ Minimized wire lengths and interconnect latencies.
• They enable true thread-level parallelism with great energy efficiency and scalability.
• To utilize their full potential, applications will need to move from a single-threaded to a multi-threaded model.
  ▪ Parallel programming techniques are likely to gain importance.
• The difficult problem is not building multi-core hardware, but programming it in a way that lets mainstream applications benefit from the continued growth in CPU performance.