IFETCE/ME/CSE/B.V.R.Raju/Iyear/Isem/CP7103/MCA/Unit-1/PPt/Ver1.01.

2 Classes of computers
Based on their physical size, performance and application areas, computers are divided into four categories:
- Micro (hand-held, desktop, laptop)
- Mini
- Mainframe
- Super

3 Micro Computers
- A small, low-cost digital computer, usually consisting of a microprocessor, a storage unit, an input channel and an output channel, all of which may be on one chip mounted on one or several PC boards.
- Adding a power supply, connecting cables, peripherals, an OS and software programs gives a complete microcomputer system.
- The smallest of the computer family.
- Originally designed for individual users only, but they have become powerful tools for many businesses and, when networked together, can serve more than one user.
- e.g. IBM PC, Pentium 100, Pentium 200, Apple Macintosh

4 Desktop Computer:
- A PC (personal computer) intended for stand-alone use by an individual.
- Consists of a system unit, a display monitor, a keyboard, internal hard disk storage and other peripheral devices.
- Not very expensive.
- Manufacturers: Apple, IBM, Dell and Hewlett-Packard.

5 Laptop:
- A portable computer. It resembles a notebook and is also known as a notebook computer.
- Has the features of a normal desktop.
- Advantage: it can be used anywhere at any time (for example, while traveling).
- Needs no external power supply while in use; a rechargeable battery is enough.
- Expensive compared to desktop computers.

6 Hand-held Computers:
- A PDA (Personal Digital Assistant) can conveniently be stored in a pocket and is used while the user is holding it.
- Slightly bigger than a calculator.
- Uses a pen or electronic stylus as the input device.
- Also called palmtop computers.
- No disk drive; instead they have small cards to store programs and data. They can be connected to a printer to get output.
- Limited memory, and not as powerful as desktop computers.

7 Mini Computers
- A small digital computer whose processing and storage capacity exceeds that of a microcomputer.
- Speed lies between that of a mainframe and a microcomputer.
- Size: about that of a two-drawer filing cabinet.
- Also called a mid-range computer.
- Supports from 4 to about 200 simultaneous users.
- A multi-user system for real-time control and engineering work.

8 Mainframe Computer
- Large, expensive computers capable of simultaneous data processing for hundreds or thousands of users.
- Used to store, manage and process large amounts of data that need to be reliable, secure and centralized.
- Support large volumes of data processing, high-performance online transaction processing systems, and extensive data storage and retrieval.
- Mainframes are the second largest class of computers.

9 Super Computers
- Special-purpose machines, specially designed to maximise the number of FLOPS (Floating Point Operations Per Second): more than 1 gigaflops.
- Highest processing speed, used to solve engineering and scientific problems.
- Contain a number of CPUs that operate in parallel.
- Speed: 400 to 10,000 MFLOPS.
- Resolve complex mathematical equations in a few hours.
- The fastest, costliest and most powerful computers.
- Application areas: aerodynamics, meteorology, plasma physics, military strategy and cinematics.

13 Why multi-core?
- It is difficult to make single-core clock frequencies even higher.
- Deeply pipelined circuits bring:
  - heat problems
  - clock problems
  - efficiency (stall) problems
- Doubling issue rates above today's 3-6 instructions per clock, say to 6 to 12 instructions, is extremely difficult. The processor would have to:
  - issue 3 or 4 data memory accesses per cycle,
  - rename and access more than 20 registers per cycle, and
  - fetch 12 to 24 instructions per cycle.
- Many new applications are multithreaded.
- The general trend in computer architecture is a shift towards more parallelism.

14 A Multi-Core Processor is a Special Kind of Multiprocessor

15 Multi-Core Architectures
Replicate multiple processor cores on a single die.

16 Multi-Core CPU Chip
- The cores fit on a single processor socket.
- Also called CMP (Chip Multi-Processor).

17 Trends in Technology

        Capacity        Speed (latency)
Logic:  2x in 3 years   2x in 3 years
DRAM:   4x in 3 years   2x in 10 years
Disk:   4x in 3 years   2x in 10 years

DRAM Generations
Year   Size      Cycle Time
1980   64 Kb     250 ns
1983   256 Kb    220 ns
1986   1 Mb      190 ns
1989   4 Mb      165 ns
1992   16 Mb     120 ns
1996   64 Mb     110 ns
1998   128 Mb    100 ns
2000   256 Mb    90 ns
2002   512 Mb    80 ns
2006   1024 Mb   60 ns

Change from 1980 to 2006: about 16000:1 in capacity, about 4:1 in latency.
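
The capacity and latency ratios quoted for the DRAM generations can be checked from the 1980 and 2006 entries. This is a minimal sketch in Python; the variable names are illustrative and not part of the slides, and it assumes 1 Mb = 1024 Kb.

```python
# 1980 and 2006 entries of the DRAM Generations data
# (sizes in Kb, cycle times in ns). Assumption: 1 Mb = 1024 Kb.
size_1980_kb, cycle_1980_ns = 64, 250
size_2006_kb, cycle_2006_ns = 1024 * 1024, 60

capacity_ratio = size_2006_kb / size_1980_kb    # growth in capacity
latency_ratio = cycle_1980_ns / cycle_2006_ns   # improvement in cycle time

print(f"capacity improved about {capacity_ratio:.0f}:1")
print(f"latency improved about {latency_ratio:.1f}:1")
```

The exact capacity ratio is 16384:1, which the slide rounds to 16000:1, while cycle time improved only about fourfold over the same period.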

19 Power, Energy & Cost
- A multi-core processor uses less power than the equivalent number of separate single-core processors.
- Cache coherency circuitry can operate at a much higher clock rate than is possible if the signals have to travel off-chip.
- Signals between different CPUs travel shorter distances, so those signals degrade less.
- These higher-quality signals allow more data to be sent in a given time period, since individual signals can be shorter and do not need to be repeated as often.
- The ability of multi-core processors to increase application performance depends on the use of multiple threads within applications.
- Most current video games will run faster on a 3 GHz single-core processor than on a 2 GHz dual-core processor (of the same core architecture).
- Two processing cores sharing the same system bus and memory bandwidth limits the real-world performance advantage.
- If a single core is close to being memory-bandwidth limited, going to multi-core might only give a 30% to 70% improvement.
- If memory bandwidth is not a problem, a 90% improvement can be expected.
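
The comparison between a 3 GHz single-core and a 2 GHz dual-core processor can be made concrete with a simple model. The sketch below uses Amdahl's law, which the slides do not mention; it is brought in here as an assumption, and it ignores the memory-bandwidth limits described on this slide. The function names and the parallel fractions tried are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Speedup over one core when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


def relative_performance(clock_ghz: float, parallel_fraction: float, cores: int) -> float:
    """Crude score: clock rate times Amdahl speedup (same core architecture assumed)."""
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


for p in (0.0, 0.5, 0.9):
    single = relative_performance(3.0, p, cores=1)
    dual = relative_performance(2.0, p, cores=2)
    winner = "3 GHz single-core" if single > dual else "2 GHz dual-core"
    print(f"parallel fraction {p:.1f}: single={single:.2f}, dual={dual:.2f} -> {winner}")
```

Under this model the dual-core part wins only when more than two thirds of the work can run in parallel, which is consistent with mostly single-threaded games running faster on the higher-clocked single core.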

20 Dependability
- Database servers
- Web servers (Web commerce)
- Compilers
- Multimedia applications
- Scientific applications, CAD/CAM
- In general: applications with thread-level parallelism, as opposed to instruction-level parallelism

21 Measuring, Reporting and Summarizing Performance

22 Measuring, Reporting and Summarizing Performance
- All computers are now parallel computers!
- Multi-core processors represent an important new trend in computer architecture.
- Decreased power consumption and heat generation.
- Minimized wire lengths and interconnect latencies.
- They enable true thread-level parallelism with great energy efficiency and scalability.
- To utilize their full potential, applications will need to move from a single-threaded to a multi-threaded model.
- Parallel programming techniques are likely to gain importance.
- The difficult problem is not building multi-core hardware, but programming it in a way that lets mainstream applications benefit from the continued exponential growth in CPU performance.
- The software industry needs to get back into the state where existing applications run faster on new hardware.

23 Processor-DRAM Performance
To illustrate the performance impact, assume a single-issue pipelined CPU with CPI = 1 using non-ideal memory. The minimum cost of a full memory access, in terms of the number of wasted CPU cycles:

Year   CPU speed (MHz)   CPU cycle (ns)   Memory access (ns)   Minimum CPU cycles or instructions wasted
1986   8                 125              190                  190/125 - 1 = 0.5
1989   33                30               165                  165/30 - 1 = 4.5
1992   60                16.6             120                  120/16.6 - 1 = 6.2
1996   200               5                110                  110/5 - 1 = 21
1998   300               3.33             100                  100/3.33 - 1 = 29
2000   1000              1                90                   90/1 - 1 = 89
2003   2000              0.5              80                   80/0.5 - 1 = 159
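
The last column of the Processor-DRAM figures follows from one formula: cycles wasted = memory access time / CPU cycle time - 1. The following is a minimal sketch that recomputes it from the slide's own numbers; the function and variable names are illustrative.

```python
def wasted_cycles(memory_access_ns: float, cpu_cycle_ns: float) -> float:
    """Minimum CPU cycles lost on a full memory access, for a CPI = 1 pipeline."""
    return memory_access_ns / cpu_cycle_ns - 1


# (year, CPU cycle in ns, memory access in ns), taken from the slide
rows = [
    (1986, 125, 190),
    (1989, 30, 165),
    (1992, 16.6, 120),
    (1996, 5, 110),
    (1998, 3.33, 100),
    (2000, 1, 90),
    (2003, 0.5, 80),
]

for year, cycle_ns, access_ns in rows:
    print(f"{year}: {wasted_cycles(access_ns, cycle_ns):.1f} cycles wasted")
```

Memory access time fell from 190 ns to 80 ns over these years while the CPU cycle fell from 125 ns to 0.5 ns, so the cost of a memory access measured in CPU cycles grew from under one to well over a hundred.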

24 Main Memory
- Main memory generally uses DRAM (Dynamic Random Access Memory), which uses a single transistor to store a bit but requires a periodic data refresh (roughly every 8 ms).
- Cache uses SRAM (Static Random Access Memory):
  - No refresh (6 transistors/bit vs. 1 transistor/bit for DRAM).
  - Capacity: DRAM holds about 4-8 times as much as SRAM.
  - Cost and cycle time: SRAM is about 8-16 times more expensive and about 8-16 times faster than DRAM.
- Main memory performance:
  - Memory latency:
    - Access time: the time between a memory access request and the moment the requested information is available to the cache/CPU.
    - Cycle time: the minimum time between requests to memory (greater than the access time in DRAM, to allow address lines to become stable).
  - Memory bandwidth: the maximum sustained data transfer rate between main memory and the cache/CPU.

25 Quantitative Principles of Computer Design
- Exploits reduced feature size and increased density
- Increases functional units per chip (spatial efficiency)
- Limits energy consumption per operation
- Constrains growth in processor complexity

26 The Cores Run in Parallel

27 Classes of Parallelism
- Concurrent events: multiprogramming, multiprocessing, or multicomputing.
- Parallelism: pipelining, vectorization, concurrency, simultaneity, data parallelism, partitioning, interleaving, overlapping, multiplicity, replication, time sharing, space sharing, multitasking, multiprogramming, multithreading, and distributed computing at different processing levels.

28 Classes of Parallelism
- Parallelism in hardware (uniprocessor)
  - Pipelining, superscalar, VLIW, etc.
- Parallelism in hardware (SIMD, vector processors, GPUs)
- Parallelism in hardware (multiprocessor)
  - Shared-memory multiprocessors
  - Distributed-memory multiprocessors
  - Chip multiprocessors, multi-cores
- Parallelism in hardware (multicomputers, clusters)
- Parallelism in software
  - Task parallelism
  - Data parallelism
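
The two software classes named on this slide, task parallelism and data parallelism, can be contrasted in a few lines. This is a sketch using Python's standard `concurrent.futures` thread pool; the data and the `square` helper are made up for illustration. In CPython the global interpreter lock means these threads show the structure of the two styles rather than a real speedup for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

data = list(range(1, 9))


def square(x: int) -> int:
    return x * x


with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: the same operation applied to different elements.
    squares = list(pool.map(square, data))

    # Task parallelism: different operations submitted as independent tasks.
    total_future = pool.submit(sum, data)
    largest_future = pool.submit(max, data)
    total, largest = total_future.result(), largest_future.result()

print(squares, total, largest)
```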

29 ILP, DLP, TLP and RLP
1  ILP  Instruction Level Parallelism
2  LLP  Loop Level Parallelism
3  DLP  Data Level Parallelism
4  TLP  Thread Level Parallelism(S)
5  RLP  Request Level Parallelism

31 Loop Level Parallelism
- Instructions: the iterations of a loop.
- Accomplished by unrolling the loop.
- Hardware: dynamic, if the block size is increased.
- Software: static, done by the compiler.
- Loop unrolling is achieved by replicating the loop body multiple times and adjusting the loop termination code (creating multiple copies of the loop body).
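
Replicating the loop body and adjusting the loop termination code can be shown on a simple summation. The sketch below is written by hand in Python purely for illustration; in practice a compiler performs this transformation, and the unroll factor of 4 and the function names are arbitrary choices, not taken from the slides.

```python
def sum_rolled(values):
    """Original loop: one element per iteration."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def sum_unrolled_by_4(values):
    """Same loop unrolled by 4: four copies of the body per iteration."""
    total = 0
    n = len(values)
    limit = n - n % 4              # adjusted termination: largest multiple of 4
    for i in range(0, limit, 4):
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
    for i in range(limit, n):      # leftover iterations when n is not a multiple of 4
        total += values[i]
    return total


print(sum_rolled(list(range(10))), sum_unrolled_by_4(list(range(10))))
```

The unrolled version runs fewer loop-control steps per element, and the four statements in its body touch different elements, which is what lets hardware or a compiler overlap them.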

32 Thread Level Parallelism
- A thread is a short sequence of instructions schedulable as a unit by a processor.
- This is parallelism on a coarser scale.
- A server can serve each client in a separate thread (Web server, database server).
- A computer game can do AI, graphics, and physics in three separate threads.
- Single-core superscalar processors cannot fully exploit TLP.
- Multi-core architectures are the next step in processor evolution: explicitly exploiting TLP.
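
The server case from this slide, one thread per client, might look like the following sketch. It uses Python's standard `threading` module; `handle_client` and its reply text are hypothetical stand-ins for real request handling, and no networking is involved.

```python
import threading


def handle_client(client_id: int, results: dict, lock: threading.Lock) -> None:
    """Hypothetical request handler: builds a reply for one client."""
    reply = f"hello client {client_id}"
    with lock:  # protect the shared dictionary
        results[client_id] = reply


results: dict = {}
lock = threading.Lock()
threads = [
    threading.Thread(target=handle_client, args=(client_id, results, lock))
    for client_id in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait until every client has been served

print(sorted(results.items()))
```

Each client is handled independently of the others, which is the kind of coarse-grained work that multiple cores can run at the same time.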

35 Inter-Core Bus

36 Multithreading
- Permits multiple independent threads to execute SIMULTANEOUSLY on the SAME core.
- Weaves together multiple "threads" on the same core.
- Example: if one thread is waiting for a floating-point operation to complete, another thread can use the integer units.

39 SMT Architecture

40 Multi-Core vs SMT
Advantages/disadvantages?
- Multi-core:
  - Since there are several cores, each is smaller and not as powerful, but easier to design and manufacture.
  - Great with thread-level parallelism.
- SMT:
  - Can have one large and fast superscalar core.
  - Great performance on a single thread.
  - Mostly still only exploits instruction-level parallelism.

41 CMP Architecture

42 Limitations of Single core processors
- Smarter Brain
  - (e.g. 386 -> 486 -> Pentium -> P2 -> P3 -> P4)
- Larger Memory
  - Larger caches, DRAM, disk
- Smaller Head
  - Fewer chips (integrate more things onto a chip)
- More Power Consumption
  - A few watts -> 120+ watts!
- More Complex
  - -> 1 billion transistors; design + verification complexity

43 Multi-Core System
In contrast to multi-core systems, the term multi-CPU refers to multiple physically separate processing units (which often contain special circuitry to facilitate communication between each other). The terms many-core and massively multi-core are sometimes used to describe multi-core architectures with an especially high number of cores (tens or hundreds).

44 A multi-core processor is a single computing component with two or more independent actual CPUs (called "cores"), which are the units that read and execute program instructions. The instructions are ordinary CPU instructions such as add, move data, and branch, but the multiple cores can run multiple instructions at the same time, increasing overall speed for programs amenable to parallel computing. Manufacturers typically integrate the cores onto a single integrated circuit die (known as a chip multiprocessor or CMP), or onto multiple dies in a single chip package.

45 Number of cores   Common names
1    single-core
2    dual-core
3    tri-core / triple-core
4    quad-core
5    penta-core
6    hexa-core
7    hepta-core
8    octa-core / octo-core
9    nona-core
10   deca-core
11   hendeca-core
12   dodeca-core
13   trideca-core
14   tetradeca-core
15   pentadeca-core
16   hexadeca-core
17   heptadeca-core
18   octadeca-core
19   enneadeca-core
20   icosa-core

46 Multicore processors share the cache and MMU with short interconnects

47 What Is Multicore Processing?
Multicore (or multi-core) processing uses software designed to run multiple applications in parallel, or asynchronously, on a multicore processor. The efficiency of multicore processing depends on how well the software application is optimized to take advantage of the multiple cores, on the composition of those multicore processors, and on the speed of the external interfaces and related hardware components in the system. Multicore processing can be especially useful for low-latency applications, with the largest boost in performance likely to be noticed as improved response time while running CPU-intensive processes.

49 The Multi core era
A multi-core processor is a single computing component with two or more independent actual central processing units (called "cores"), which are the units that read and execute program instructions. [1] The instructions are ordinary CPU instructions such as add, move data, and branch, but the multiple cores can run multiple instructions at the same time, increasing overall speed for programs amenable to parallel computing. [2] Manufacturers typically integrate the cores onto a single integrated circuit die (known as a chip multiprocessor or CMP), or onto multiple dies in a single chip package.

50 Case studies of Multi core Architectures
