Presentation on theme: "Ahmad Aljebaly Department of Computer Science Western Michigan University."— Presentation transcript:
Ahmad Aljebaly Department of Computer Science Western Michigan University
Introduction Motivation for Multi-Core What is multi-core processor? Properties of Multi-core systems Applications benefit from multi-core Multiprocessor memory types Multi-core design Symmetric multi-core processor Asymmetric multi-core processor Advantages & disadvantages of multi-core
First microprocessor (1970s) ▪ Intel 4004 PCs spread around the world (1980s) ▪ Microprocessors of up to 32 bits ▪ AMD followed Intel's technology
Flood of computer tasks (1990s) ▪ Growing number of computer users ▪ Server management → building databases ▪ PCs and servers needed better performance. → These demands accelerated the development of microprocessors. Emergence of the multi-core processor (2000s) ▪ Single-core improvements reached their limits ▪ The approach to improving microprocessor technology was rethought: put multiple execution cores on one die
Exploits smaller feature sizes and increased density Increases functional units per chip (spatial efficiency) Limits energy consumption per operation Constrains growth in processor complexity
A multi-core processor is a processing system composed of two or more independent cores (or CPUs). The cores are typically integrated onto a single integrated circuit die (known as a chip multiprocessor or CMP), or they may be integrated onto multiple dies in a single chip package. A many-core processor is one in which the number of cores is large enough that traditional multiprocessor techniques are no longer efficient (this threshold is somewhere in the range of several tens of cores) and likely requires a network on chip.
A dual-core processor contains two independent microprocessors. A dual-core set-up is somewhat comparable to having multiple, separate processors installed in the same computer, but because the two processors are actually plugged into the same socket, the connection between them is faster. Ideally, a dual-core processor is nearly twice as powerful as a single-core processor. In practice, performance gains are said to be about fifty percent: a dual-core processor is likely to be about one-and-a-half times as powerful as a single-core processor.
A multi-core processor implements multiprocessing in a single physical package. Cores in a multi-core device may be coupled together tightly or loosely. For example, cores may or may not share caches, and they may implement message passing or shared memory inter-core communication methods. Common network topologies to interconnect cores include: bus, ring, 2-dimensional mesh, and crossbar. All cores are identical in symmetric multi-core systems, and they are not identical in asymmetric multi-core systems. Just as with single-processor systems, cores in multi-core systems may implement architectures such as superscalar, vector processing, or multithreading.
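As a rough illustration of why the interconnect topology matters, the sketch below compares the worst-case number of hops between two cores on a ring, a 2-dimensional mesh, and a crossbar. The function names and the idealized hop model (one hop per link, bidirectional ring, square mesh) are assumptions made for this example, not part of the original slides.

```python
import math


def ring_max_hops(n_cores: int) -> int:
    """Worst case on a bidirectional ring: travel halfway around."""
    return n_cores // 2


def mesh_max_hops(rows: int, cols: int) -> int:
    """Worst case on a 2-D mesh: one corner to the opposite corner."""
    return (rows - 1) + (cols - 1)


def crossbar_max_hops(n_cores: int) -> int:
    """A crossbar links every pair of cores directly."""
    return 1 if n_cores > 1 else 0


if __name__ == "__main__":
    # Compare the three topologies for square core counts.
    for n in (4, 16, 64):
        side = math.isqrt(n)
        print(n, ring_max_hops(n), mesh_max_hops(side, side), crossbar_max_hops(n))
```

Under this simplified model the ring's worst case grows linearly with the core count, while the mesh grows with its square root; the crossbar stays at one hop but needs far more wiring, which the model does not capture.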
Multi-core processors are widely used across many application domains including: general-purpose, embedded, network, digital signal processing, and graphics. The amount of performance gained by the use of a multi-core processor is strongly dependent on the software algorithms and implementation. Multi-core processing is a growing industry trend as single core processors rapidly reach the physical limits of possible complexity and speed. Companies that have produced or are working on multi-core products include AMD, ARM, Broadcom, Intel, and VIA.
With a shared on-chip cache memory, communication events can be reduced to just a handful of processor cycles. With such low latencies, communication delays have a much smaller impact on overall performance. Threads can also be much smaller and still be effective. Automatic parallelization becomes more feasible.
Cores will be shared with a wide range of other applications dynamically. Load can no longer be considered symmetric across the cores. Cores will likely not be symmetric as accelerators become common for scientific hardware. Source code will often be unavailable, preventing compilation against the specific hardware configuration.
Database servers Web servers Telecommunication markets Multimedia applications Scientific applications In general, applications with Thread-level parallelism (as opposed to instruction-level parallelism)
Replicate multiple processor cores on a single die. The cores fit on a single processor socket.
[Figure: four cores (core 1, core 2, core 3, core 4) running several threads]
Programmers must use threads or processes. Spread the workload across multiple cores. Write parallel algorithms. OS will map threads/processes to cores
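The workflow on this slide (split the work, hand the pieces to threads or processes, combine the results) can be sketched as follows. This is a minimal illustration, not a prescribed method: the helper names `split`, `sum_squares`, and `parallel_sum_of_squares` are invented for the example. It defaults to a thread pool so it runs anywhere; in CPython, CPU-bound work only spreads across cores when a process pool (`concurrent.futures.ProcessPoolExecutor`) is passed as `executor_cls`, typically under an `if __name__ == "__main__":` guard.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split data into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_squares(chunk):
    """Work done by one worker on its own chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4, executor_cls=ThreadPoolExecutor):
    """Spread the workload across workers, then combine the partial results."""
    chunks = split(data, workers)
    with executor_cls(max_workers=workers) as pool:
        # The OS decides which core runs each worker.
        return sum(pool.map(sum_squares, chunks))
```

For example, `parallel_sum_of_squares(list(range(10)), workers=3)` splits the ten numbers into chunks of 4, 3, and 3 elements and adds up the three partial sums.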
Most major operating systems support multi-core today. The OS perceives each core as a separate processor. The OS scheduler maps threads/processes to different cores.
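Because the OS presents each core as a separate processor, a program can simply ask how many processors it sees. The sketch below uses Python's standard `os` module; the function names are made up for this example, and `os.sched_getaffinity` is only available on some platforms (such as Linux), hence the fallback. The count covers logical processors, so it may include hardware threads as well as physical cores.

```python
import os


def visible_cores() -> int:
    """Number of logical processors the OS reports (at least 1)."""
    return os.cpu_count() or 1


def usable_cores() -> int:
    """Processors this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Platform-specific: the scheduler may restrict a process to a subset.
        return len(os.sched_getaffinity(0))
    return visible_cores()


if __name__ == "__main__":
    print("visible:", visible_cores(), "usable:", usable_cores())
```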
Editing a photo while recording a TV show through a digital video recorder. Downloading software while running an anti-virus program. “Anything that can be threaded today will map efficiently to multi-core”. BUT: some applications are difficult to parallelize.
Better performance ▪ For multitasking ▪ e.g. burning a CD while doing graphics work at the same time Power consumption and heat generation ▪ Caused by increases in CPU clock speed
Saves room on the motherboard ▪ Two single cores → one die ▪ The freed space can be used more efficiently Simplicity ▪ Several separate single-core processors need additional systems to control them. Economic efficiency ▪ A dual-core processor is much cheaper than two single-core processors
Shared memory: In this model, there is one (large) common shared memory for all processors. Distributed memory: In this model, each processor has its own (small) local memory, and its content is not replicated anywhere else.
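The difference between the two models can be sketched in a few lines. This is only an emulation under stated assumptions: both functions use Python threads on one machine, and the names `shared_memory_total` and `message_passing_total` are invented for the example. In the first, every worker updates one common variable; in the second, each worker keeps a private copy of its data and communicates only by sending a message.

```python
import queue
import threading


def shared_memory_total(values, workers=4):
    """Shared-memory style: all workers update one common total."""
    total = [0]
    lock = threading.Lock()

    def work(chunk):
        partial = sum(chunk)
        with lock:  # the shared location must be protected
            total[0] += partial

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


def message_passing_total(values, workers=4):
    """Distributed-memory style: private local data, results sent as messages."""
    inbox = queue.Queue()

    def work(local_data):
        inbox.put(sum(local_data))  # only the result leaves the worker

    threads = [threading.Thread(target=work, args=(list(values[i::workers]),))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(workers))
```

Both functions return the same total; what differs is where the data lives and how the workers coordinate.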
Taking the idea of superscalar operations to the next level, it is possible to put multiple microprocessor cores onto a single chip, and have the cores operate in parallel with one another.
A symmetric multi-core processor is one that has multiple cores on a single chip, and all of those cores are identical. ▪ Example: Intel Core 2: ▪ The Intel Core 2 is an example of a symmetric multi-core processor. The Core 2 can have either 2 cores on chip ("Core 2 Duo") or 4 cores on chip ("Core 2 Quad"). The cores in a Core 2 chip are identical, and each can function independently of the others. A mixture of scheduling software and hardware is required to farm tasks out to each core.
All cores which exist in a die are exactly identical
A symmetric multi-core processor is a processor whose cores are all exactly the same: every core has the same architecture and the same capabilities. Because the cores are interchangeable, an arbitration unit is needed to give each core a specific task. Software that uses techniques like multithreading makes the best use of a multi-core processor like the Intel Core 2.
Applications ▪ Personal computers ▪ Servers / supercomputers
An asymmetric multi-core processor is one that has multiple cores on a single chip, but those cores might be different designs. For instance, there could be 2 general purpose cores and 2 vector cores on a single chip. ▪ Example: Cell Processor: ▪ IBM's Cell processor, used in the Sony PlayStation 3 video game console, is an asymmetric multi-core processor. The Cell has 9 processor cores on board: one general purpose processor and 8 data-processing cores. The one general purpose core, known as the Power Processor Element (PPE), controls the communication between the other cores and distributes computing tasks to them for processing. The other 8 cores are known as Synergistic Processor Elements (SPEs), and are specially designed to have high floating-point throughput, especially with vector operations.
▪ In an asymmetric multi-core processor, the chip has multiple cores onboard, but the cores might be different designs. ▪ Each core will have different capabilities.
An example of an asymmetric multi-core processor is the IBM Cell processor. The IBM Cell processor has 1 Power Processor Element (PPE) that controls the chip, and 8 Synergistic Processor Elements (SPEs) that are designed for high mathematical throughput. [Figure: layout of the IBM Cell processor] In the diagram, notice that the SPE cores are drawn connected only to the PPE, and not to each other (on the actual chip, all elements share the Element Interconnect Bus). Notice also that the PPE core is much larger than the individual SPE cores.
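The division of labor described for the Cell (one controlling core handing vector work to several data-processing cores) can be imitated in software. The sketch below is only an analogy built from Python threads and queues: `ppe_dispatch` and `spe_worker` are hypothetical names chosen to mirror the PPE and SPE roles, and nothing here reflects the real Cell programming interface.

```python
import queue
import threading


def spe_worker(tasks, results):
    """Stand-in for a data-processing core: runs vector kernels until told to stop."""
    while True:
        item = tasks.get()
        if item is None:  # stop signal from the controller
            break
        task_id, a, b = item
        results.put((task_id, sum(x * y for x, y in zip(a, b))))


def ppe_dispatch(vector_pairs, n_workers=8):
    """Stand-in for the controlling core: distributes dot products, gathers results."""
    tasks, results = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=spe_worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for task_id, (a, b) in enumerate(vector_pairs):
        tasks.put((task_id, a, b))
    for _ in workers:
        tasks.put(None)  # one stop signal per worker
    for w in workers:
        w.join()
    out = [None] * len(vector_pairs)
    while not results.empty():
        task_id, value = results.get()
        out[task_id] = value
    return out
```

The controller does no arithmetic itself; it only hands out work and collects answers, which is the asymmetry the slide describes.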
Applications Super Computing: ▪ IBM's latest supercomputer, IBM Roadrunner, is a hybrid of general-purpose CISC AMD Opteron processors and Cell processors.
Applications Home cinema ▪ Toshiba is considering producing HDTVs using Cell. They have already presented a system that decodes 48 standard-definition MPEG-2 streams. This lets a viewer choose a channel from dozens of thumbnail videos displayed on the screen at the same time.
Applications Video Processing Card ▪ Some companies, such as Leadtek, have plans to release a PCI-E card based upon the Cell to allow for "faster than real time" transcoding of H.264, MPEG-2 and MPEG-4 video.
Applications Console Video Games ▪ The first major commercial application of Cell was in Sony's PlayStation 3 game console. ▪ The console's Cell processor is clocked at 3.2 GHz and has seven of its eight SPEs operational, which lets Sony increase the manufacturing yield of the processor. Only six of the seven SPEs are accessible to developers, as one is reserved by the OS.
Future Based on its unique features, Cell can bridge the gap between conventional desktop processors (such as the Athlon 64 and Core 2 families) and more specialized high-performance processors, such as the NVIDIA and ATI graphics processors (GPUs). Cell will expand its intended use in current and future digital distribution systems, as well as in high-definition displays and recording equipment and computer entertainment systems.
Future of SMP (relative to ASMP) ▪ Easy to put many cores on one integrated circuit ▪ Easier to program than ASMP, because all the cores are identical ▪ Easy to sustain the pace of development ▪ Applicable to any type of system (general usage)
Future of SMP (relative to ASMP) ▪ Not well suited to certain specialized systems, such as audio/video processing and data compression ▪ Wastes silicon and power, because it is built for general-purpose use ▪ Less efficient than ASMP. These are the reasons why ASMP has emerged.
Relies on effective exploitation of multiple-thread parallelism ▪ Needs a parallel computing model and a parallel programming model. Aggravates the memory wall ▪ Memory bandwidth: a way to get data out of memory banks and a way to get data into the multi-core processor array ▪ Memory latency. Fragments L3 cache. Pins become a choke point ▪ Rate of pin growth projected to slow and flatten ▪ Rate of bandwidth per pin (pair) projected to grow slowly. Requires mechanisms for efficient inter-processor coordination ▪ Synchronization ▪ Mutual exclusion ▪ Context switching
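Two of the coordination mechanisms named on this slide, synchronization and mutual exclusion, can be shown side by side in a short sketch. It uses Python's standard `threading.Barrier` and `threading.Lock`; the function name `two_phase_sum` and the two-phase workload are invented for the illustration.

```python
import threading


def two_phase_sum(n_threads=4):
    """Each thread writes its own slot, then adds a neighbor's slot to a shared total."""
    slots = [0] * n_threads
    total = 0
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)

    def work(i):
        nonlocal total
        slots[i] = i + 1  # phase 1: write own slot
        barrier.wait()    # synchronization: wait until every slot is written
        neighbor = slots[(i + 1) % n_threads]
        with lock:        # mutual exclusion: one updater at a time
            total += neighbor

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

Without the barrier a thread could read a neighbor's slot before it was written; without the lock two threads could interleave their updates to `total`. Both mechanisms add waiting time, which is part of the cost the slide points to.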
Cache coherency circuitry can operate at a much higher clock rate than is possible if the signals have to travel off-chip. Because signals between different CPUs travel shorter distances, they degrade less. These higher quality signals allow more data to be sent in a given time period, since individual signals can be shorter and do not need to be repeated as often. A dual-core processor uses slightly less power than two coupled single-core processors.
The ability of multi-core processors to increase application performance depends on the use of multiple threads within applications. Most current video games will run faster on a 3 GHz single-core processor than on a 2 GHz dual-core processor (of the same core architecture). Two processing cores sharing the same system bus and memory bandwidth limit the real-world performance advantage. If a single core is close to being memory bandwidth limited, going to dual-core might only give a 30% to 70% improvement. If memory bandwidth is not a problem, a 90% improvement can be expected.
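The dependence on threading can be made concrete with Amdahl's law, a standard model that the slides do not name and that is brought in here purely as an illustration. The sketch assumes performance scales linearly with clock speed and ignores memory bandwidth; `amdahl_speedup` and `relative_performance` are names made up for the example.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup when only part of the work can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_performance(clock_ghz: float, parallel_fraction: float, cores: int) -> float:
    """Performance relative to a 1 GHz single core (assumes linear clock scaling)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    single = relative_performance(3.0, 0.0, 1)  # 3 GHz single core
    for p in (0.0, 0.5, 0.9):
        dual = relative_performance(2.0, p, 2)  # 2 GHz dual core
        print(f"parallel fraction {p}: single {single:.2f}, dual {dual:.2f}")
```

Under these assumptions the 2 GHz dual-core only catches up with the 3 GHz single-core once about two thirds of the program's work can run in parallel, which is consistent with the slide's point about video games that make little use of multiple threads.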
All computers are now parallel computers! Multi-core processors represent an important new trend in computer architecture. Decreased power consumption and heat generation. Minimized wire lengths and interconnect latencies. They enable true thread-level parallelism with great energy efficiency and scalability. To utilize their full potential, applications will need to move from a single-threaded to a multi-threaded model. Parallel programming techniques are likely to gain importance. The difficult problem is not building multi-core hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance. The software industry needs to get back into the state where existing applications run faster on new hardware.
http://en.wikipedia.org/wiki/Multi-core_(computing) ▪ Olukotun, Kunle and Hammond, Lance. The future of microprocessors. Queue, Volume 3, Issue 7, September 2005. ▪ www.princeton.edu/~jdonald/research/hyperthreading/garg_report.pdf ▪ Zheltov, Sergey N. and Bratanov, Stanislav V. Multi-threading for Experts: Synchronization. Technical Report. Intel. 2005. (WWW document, referenced 17.11.2005). Available: http://www.intel.com/cd/ids/developer/asmo-na/eng/183321.htm
Question: Give a definition and an example for each of: 1. A symmetric multi-core processor 2. An asymmetric multi-core processor. Answer: A symmetric multi-core processor is one that has multiple cores on a single chip, and all of those cores are identical. ▪ Example: Intel Core 2. An asymmetric multi-core processor is one that has multiple cores on a single chip, but those cores might be different designs. ▪ Example: Cell Processor.