Published by Tamsin Palmer. Modified over 8 years ago.
1
COMPUTER ARCHITECTURE CS 6354
Fundamental Concepts: Computing Models
Samira Khan, University of Virginia, Jan 26, 2016
The content and concepts of this course are adapted from CMU ECE 740.
2
AGENDA Review from last lecture Why study computer architecture? Fundamental concepts – Computing models 2
3
LAST LECTURE RECAP What it means/takes to be a good (computer) architect – Roles of a computer architect (look everywhere!) Levels of transformation Abstraction layers, their benefits, and the benefits of comfortably crossing them Two example problems and solution ideas – Solving DRAM Scaling with system-level detection and mitigation – Merging memory and storage with non-volatile memories Course Logistics Assignments: HW (today), Review Set 1 (Saturday) 3
4
REVIEW: KEY TAKEAWAY Breaking the abstraction layers (between components and transformation hierarchy levels) and knowing what is underneath enables you to solve problems and design better future systems Cooperation between multiple components and layers can enable more effective solutions and systems 4
5
HOW TO DO THE PAPER REVIEWS 1: Brief summary – What is the problem the paper is trying to solve? – What are the key ideas of the paper? Key insights? – What is the key contribution to the literature at the time it was written? – What are the most important things you take away from it? 2: Strengths (most important ones) – Does the paper solve the problem well? 3: Weaknesses (most important ones) – This is where you should think critically. Every paper/idea has a weakness. This does not mean the paper is necessarily bad. It means there is room for improvement and future research can accomplish this. 4: Can you do (much) better? Present your thoughts/ideas. 5: What have you learned/enjoyed/disliked in the paper? Why? The review should be short and concise (~half a page to a page)
6
AGENDA Review from last lecture Why study computer architecture? Fundamental concepts – Computing models 6
7
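AN ENABLER: MOORE’S LAW Moore, “Cramming more components onto integrated circuits,” Electronics Magazine, 1965. Component counts double roughly every two years (the 1965 paper projected a doubling every year; Moore revised the rate to every two years in 1975) Image source: Intel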
8
Number of transistors on an integrated circuit doubles ~ every two years Image source: Wikipedia 8
9
RECOMMENDED READING Moore, “Cramming more components onto integrated circuits,” Electronics Magazine, 1965. Only 3 pages A quote: “With unit cost falling as the number of components per circuit rises, by 1975 economics may dictate squeezing as many as 65 000 components on a single silicon chip.” Another quote: “Will it be possible to remove the heat generated by tens of thousands of components in a single silicon chip?” 9
10
WHAT DO WE USE THESE TRANSISTORS FOR? Your readings for this week should give you an idea… Patt, “Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution,” Proceedings of the IEEE 2001. Mutlu and Subramanian, “Research Problems and Opportunities in Memory Systems,” SUPERFRI 2015.
11
WHY STUDY COMPUTER ARCHITECTURE? Enable better systems: make computers faster, cheaper, smaller, more reliable, … – By exploiting advances and changes in underlying technology/circuits Enable new applications – Life-like 3D visualization 20 years ago? – Virtual reality? – Personalized genomics? Personalized medicine? Enable better solutions to problems – Software innovation is built on trends and changes in computer architecture – > 50% performance improvement per year has enabled this innovation Understand why computers work the way they do
12
COMPUTER ARCHITECTURE TODAY (I) Today is a very exciting time to study computer architecture Industry is in a large paradigm shift (to multi-core and beyond) – many different potential system designs possible Many difficult problems motivating and caused by the shift – Power/energy constraints → multi-core? – Complexity of design → multi-core? – Difficulties in technology scaling → new technologies? – Memory wall/gap – Reliability wall/issues – Programmability wall/problem – Huge hunger for data and new data-intensive applications No clear, definitive answers to these problems
13
COMPUTER ARCHITECTURE TODAY (II)
These problems affect all parts of the computing stack – if we do not change the way we design systems
No clear, definitive answers to these problems
Computing stack (figure), top to bottom: User, Problem, Algorithm, Program/Language, Runtime System (VM, OS, MM), ISA, Microarchitecture, Logic, Circuits, Electrons
– Many new demands from the top (Look Up)
– Fast changing demands and personalities of users (Look Up)
– Many new issues at the bottom (Look Down)
14
COMPUTER ARCHITECTURE TODAY (III) Computing landscape is very different from 10-20 years ago Both UP (software and humanity trends) and DOWN (technologies and their issues), FORWARD and BACKWARD, and the resulting requirements and constraints Examples (figure labels): Heterogeneous Processors and Accelerators, General Purpose GPUs, Hybrid Main Memory, Persistent Memory/Storage Every component and its interfaces, as well as entire system designs, are being re-examined
15
COMPUTER ARCHITECTURE TODAY (IV) You can revolutionize the way computers are built, if you understand both the hardware and the software (and change each accordingly) You can invent new paradigms for computation, communication, and storage Recommended book: Thomas Kuhn, “The Structure of Scientific Revolutions” (1962) – Pre-paradigm science: no clear consensus in the field – Normal science: dominant theory used to explain/improve things (business as usual); exceptions considered anomalies – Revolutionary science: underlying assumptions re-examined
17
… BUT, FIRST … Let’s understand the fundamentals… You can change the world only if you understand it well enough… – Especially the past and present dominant paradigms – And, their advantages and shortcomings – tradeoffs – And, what remains fundamental across generations – And, what techniques you can use and develop to solve problems 17
18
AGENDA Review from last lecture Why study computer architecture? Fundamental concepts – Computing models 18
19
WHAT IS A COMPUTER? Three key components Computation Communication Storage (memory) 19
20
WHAT IS A COMPUTER?
Block diagram (figure):
– Processing: control (sequencing) and datapath
– Memory (program and data)
– I/O
21
THE VON NEUMANN MODEL/ARCHITECTURE Also called stored program computer (instructions in memory). Two key properties: Stored program – Instructions stored in a linear memory array – Memory is unified between instructions and data The interpretation of a stored value depends on the control signals Sequential instruction processing – One instruction processed (fetched, executed, and completed) at a time – Program counter (instruction pointer) identifies the current instr. – Program counter is advanced sequentially except for control transfer instructions When is a value interpreted as an instruction? 21
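The two properties above can be sketched with a toy interpreter. This is a minimal illustration, not a real ISA: the opcodes (LOAD, ADD, STORE, JNZ, HALT), the `(op, arg)` encoding, and the single accumulator are all assumptions made for the example. Instructions and data share one memory list, and a word is treated as an instruction only when the program counter points at it.

```python
# Toy stored-program machine. The opcodes and the (op, arg) encoding are
# hypothetical, chosen only to illustrate the von Neumann model.
def run(memory, max_steps=1000):
    pc = 0    # program counter (instruction pointer)
    acc = 0   # single accumulator register
    for step in range(1, max_steps + 1):
        # Fetch: the word at PC is interpreted as an instruction. If PC ever
        # pointed at a plain data word, this unpacking would fail.
        op, arg = memory[pc]
        pc += 1                    # default: advance sequentially
        if op == "LOAD":
            acc = memory[arg]      # the word at arg is interpreted as data
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "JNZ":          # control transfer overrides sequential order
            if acc != 0:
                pc = arg
        elif op == "HALT":
            return step            # number of instructions executed
    raise RuntimeError("step limit reached")

# Unified memory: addresses 0-4 hold instructions, 5-6 hold data.
memory = [
    ("LOAD", 5),    # 0: acc = counter
    ("ADD", 6),     # 1: acc = acc + (-1)
    ("STORE", 5),   # 2: counter = acc
    ("JNZ", 0),     # 3: loop while counter != 0
    ("HALT", 0),    # 4
    3,              # 5: counter
    -1,             # 6: constant -1
]
executed = run(memory)
print(executed, memory[5])
```

Only the fetch path decides that a word is an instruction, which is one way to answer the question the slide poses.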
22
THE VON NEUMANN MODEL/ARCHITECTURE Recommended reading – Burks, Goldstine, Von Neumann, “Preliminary discussion of the logical design of an electronic computing instrument,” 1946. Stored program Sequential instruction processing
23
THE VON NEUMANN MODEL (OF A COMPUTER)
Block diagram (figure):
– MEMORY: Mem Addr Reg, Mem Data Reg
– PROCESSING UNIT: ALU, TEMP
– CONTROL UNIT: IP, Inst Register
– INPUT
– OUTPUT
24
THE VON NEUMANN MODEL (OF A COMPUTER) Q: Is this the only way that a computer can operate? A: No. Qualified Answer: But, it has been the dominant way – i.e., the dominant paradigm for computing – for N decades
25
THE DATA FLOW MODEL (OF A COMPUTER) Von Neumann model: An instruction is fetched and executed in control flow order – As specified by the instruction pointer – Sequential unless explicit control flow instruction Dataflow model: An instruction is fetched and executed in data flow order – i.e., when its operands are ready – i.e., there is no instruction pointer – Instruction ordering specified by data flow dependence Each instruction specifies “who” should receive the result An instruction can “fire” whenever all operands are received – Potentially many instructions can execute at the same time Inherently more parallel
26
VON NEUMANN VS DATAFLOW
Consider a Von Neumann program – What is the significance of the program order? – What is the significance of the storage locations?
Sequential: v <= a + b; w <= b * 2; x <= v - w; y <= v + w; z <= x * y
Dataflow (figure): a and b feed a + node and a *2 node; their results feed a - node and a + node; those results feed a * node that produces z
Which model is more natural to you as a programmer?
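To make the contrast concrete, here is a small sketch that runs the five statements of this slide in data flow order. The graph encoding and the function name `run_dataflow` are assumptions for illustration. In each "wave", every node whose operands are all available fires together, so the waves show how much parallelism the sequential listing hides.

```python
import operator

# Result name -> (operation, operands). String operands name other values;
# non-string operands are constants. This encoding is hypothetical.
GRAPH = {
    "v": (operator.add, ("a", "b")),
    "w": (operator.mul, ("b", 2)),
    "x": (operator.sub, ("v", "w")),
    "y": (operator.add, ("v", "w")),
    "z": (operator.mul, ("x", "y")),
}

def run_dataflow(graph, inputs):
    values = dict(inputs)    # tokens produced so far
    pending = dict(graph)    # nodes that have not fired yet
    waves = []
    while pending:
        # A node is ready when every named operand already has a value.
        ready = [name for name, (_, srcs) in pending.items()
                 if all(not isinstance(s, str) or s in values for s in srcs)]
        if not ready:
            raise ValueError("deadlock: some operand is never produced")
        results = {}
        for name in ready:
            fn, srcs = pending.pop(name)
            args = [values[s] if isinstance(s, str) else s for s in srcs]
            results[name] = fn(*args)
        values.update(results)   # commit the whole wave at once
        waves.append(sorted(ready))
    return values, waves

values, waves = run_dataflow(GRAPH, {"a": 3, "b": 4})
print(values["z"], waves)
```

Note that the dictionary order of `GRAPH` plays no role in the result: only the dependences do, which is the point of the slide's question about program order.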
27
MORE ON DATA FLOW In a data flow machine, a program consists of data flow nodes – A data flow node fires (is fetched and executed) when all its inputs are ready, i.e., when all inputs have tokens Data flow node and its ISA representation
28
DATA FLOW NODES 28
29
AN EXAMPLE DATA FLOW PROGRAM (figure: data flow graph whose final node produces OUT)
30
CONTROL FLOW VS. DATA FLOW 30
31
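Dataflow Machine: Instruction Templates
Each arc in the graph has an operand slot in the program
Template fields: Opcode, Destination 1, Destination 2, Operand 1, Operand 2 (each operand slot has a presence bit)
1: + 3L 4L
2: * 3R 4R
3: - 5L
4: + 5R
5: * out
Graph (figure): inputs a and b; node 1 (+) and node 2 (*7) feed node 3 (-) and node 4 (+), which feed node 5 (*); intermediate values are labeled x and y
Dennis and Misunas, “A Preliminary Architecture for a Basic Data Flow Processor,” ISCA 1974.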
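The template table can be exercised with a short token-passing sketch. Several details are assumptions layered on the slide: the `Template` class, representing a presence bit as "the slot key exists in a dictionary", the constant 7 being pre-loaded into the right operand slot of template 2, and the routing of input a to slot 1L and input b to slots 1R and 2L. The sketch is single-shot, so operand slots are never reset after firing.

```python
import operator
from dataclasses import dataclass, field

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

@dataclass
class Template:
    opcode: str
    dests: list                                   # (template number, "L"/"R") or ("out", None)
    operands: dict = field(default_factory=dict)  # slot -> value; key present = presence bit set

def make_program():
    # Mirrors the five templates listed on the slide.
    return {
        1: Template("+", [(3, "L"), (4, "L")]),
        2: Template("*", [(3, "R"), (4, "R")], {"R": 7}),  # constant operand (assumed)
        3: Template("-", [(5, "L")]),
        4: Template("+", [(5, "R")]),
        5: Template("*", [("out", None)]),
    }

def run(program, initial_tokens):
    tokens = list(initial_tokens)   # each token: (destination, slot, value)
    outputs = []
    while tokens:
        dest, slot, value = tokens.pop()
        if dest == "out":
            outputs.append(value)
            continue
        template = program[dest]
        template.operands[slot] = value          # store operand, set presence bit
        if len(template.operands) == 2:          # both operands present: fire
            result = OPS[template.opcode](template.operands["L"],
                                          template.operands["R"])
            # Send one result token to every destination named in the template.
            tokens.extend((d, s, result) for d, s in template.dests)
    return outputs

a, b = 3, 4
inputs = [(1, "L", a), (1, "R", b), (2, "L", b)]
out = run(make_program(), inputs)
out_reversed = run(make_program(), list(reversed(inputs)))
print(out, out_reversed)
```

Feeding the input tokens in a different order gives the same output, since firing depends only on operand presence and not on arrival order.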
32
DATA FLOW SUMMARY Availability of data determines order of execution A data flow node fires when its sources are ready Programs represented as data flow graphs (of nodes) Data Flow at the ISA level has not been (as) successful Data Flow implementations under the hood (while preserving sequential ISA semantics) have been successful – Out of order execution – Hwu and Patt, “HPSm, a high performance restricted data flow architecture having minimal functionality,” ISCA 1986. 32
33
DATA FLOW CHARACTERISTICS Data-driven execution of instruction-level graphical code – Nodes are operators – Arcs are data (I/O) – As opposed to control-driven execution Only real dependencies constrain processing No sequential I-stream – No program counter Operations execute asynchronously Execution triggered by the presence of data 33
34
DATA FLOW ADVANTAGES/DISADVANTAGES Advantages – Very good at exploiting irregular parallelism – Only real dependencies constrain processing Disadvantages – Debugging difficult (no precise state) – Interrupt/exception handling is difficult (what is precise state semantics?) – Implementing dynamic data structures difficult in pure data flow models – Too much parallelism? (Parallelism control needed) – High bookkeeping overhead (tag matching, data storage) – Instruction cycle is inefficient (delay between dependent instructions), memory locality is not exploited
35
ANOTHER WAY OF EXPLOITING PARALLELISM SIMD: – Concurrency arises from performing the same operations on different pieces of data MIMD: – Concurrency arises from performing different operations on different pieces of data Control/thread parallelism: execute different threads of control in parallel → multithreading, multiprocessing – Idea: Use multiple processors to solve a problem
36
FLYNN’S TAXONOMY OF COMPUTERS Mike Flynn, “Very High-Speed Computing Systems,” Proc. of IEEE, 1966 SISD: Single instruction operates on single data element SIMD: Single instruction operates on multiple data elements – Array processor – Vector processor MISD: Multiple instructions operate on single data element – Closest form: systolic array processor, streaming processor MIMD: Multiple instructions operate on multiple data elements (multiple instruction streams) – Multiprocessor – Multithreaded processor 36
37
SIMD PROCESSING Concurrency arises from performing the same operations on different pieces of data – Single instruction multiple data (SIMD) – E.g., dot product of two vectors Contrast with thread (“control”) parallelism – Concurrency arises from executing different threads of control in parallel Contrast with data flow – Concurrency arises from executing different operations in parallel (in a data driven manner) SIMD exploits instruction-level parallelism – Multiple instructions concurrent: instructions happen to be the same
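The dot product example can be sketched as follows. This is plain Python that only models the SIMD idea; the lane count `LANES` and the helper names `simd_mul` and `simd_add` are assumptions, and each helper call stands in for one instruction applied to all lanes.

```python
LANES = 4  # hypothetical SIMD width

def simd_mul(xs, ys):
    # One "instruction": the same multiply applied to every lane.
    return [x * y for x, y in zip(xs, ys)]

def simd_add(xs, ys):
    # One "instruction": the same add applied to every lane.
    return [x + y for x, y in zip(xs, ys)]

def dot(a, b):
    if len(a) != len(b) or len(a) % LANES != 0:
        raise ValueError("vectors must have equal length, a multiple of LANES")
    acc = [0] * LANES                     # one partial sum per lane
    for i in range(0, len(a), LANES):
        acc = simd_add(acc, simd_mul(a[i:i + LANES], b[i:i + LANES]))
    return sum(acc)                       # final reduction across lanes

result = dot([1, 2, 3, 4, 5, 6, 7, 8], [8, 7, 6, 5, 4, 3, 2, 1])
print(result)
```

The multiplies in different lanes are independent, which is where the concurrency comes from; only the last reduction step combines values across lanes.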
38
Single instruction operates on multiple data elements – In time or in space Multiple processing elements Time-space duality – Array processor: Instruction operates on multiple data elements at the same time – Vector processor: Instruction operates on multiple data elements in consecutive time steps 38
39
ARRAY VS. VECTOR PROCESSORS
Instruction stream: LD VR ← A[3:0]; ADD VR ← VR, 1; MUL VR ← VR, 2; ST A[3:0] ← VR
ARRAY PROCESSOR (figure; each row is one time step, each column one processing element):
LD0 LD1 LD2 LD3
AD0 AD1 AD2 AD3
MU0 MU1 MU2 MU3
ST0 ST1 ST2 ST3
Same op @ same time; different ops @ same space
VECTOR PROCESSOR (figure; each row is one time step):
LD0
LD1 AD0
LD2 AD1 MU0
LD3 AD2 MU1 ST0
AD3 MU2 ST1
MU3 ST2
ST3
Different ops @ time; same op @ space
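The two timing diagrams can be regenerated from a simple rule each. The sketch below assumes, as the figure appears to, that the array processor has one processing element per data element, and that the vector processor has one pipelined functional unit per operation that accepts one element per time step and forwards results element by element. The function names are hypothetical.

```python
OPS = ["LD", "AD", "MU", "ST"]   # the four instructions in the slide's stream
N = 4                            # number of data elements, A[3:0]

def array_schedule(ops, n):
    # All n processing elements execute the same op in the same time step.
    return [[f"{op}{i}" for i in range(n)] for op in ops]

def vector_schedule(ops, n):
    # Element i enters the unit for op j at time step i + j.
    steps = [[] for _ in range(n + len(ops) - 1)]
    for j, op in enumerate(ops):
        for i in range(n):
            steps[i + j].append(f"{op}{i}")
    return steps

array_steps = array_schedule(OPS, N)
vector_steps = vector_schedule(OPS, N)
for label, schedule in (("array", array_steps), ("vector", vector_steps)):
    print(label)
    for t, work in enumerate(schedule):
        print(f"  t={t}: {' '.join(work)}")
```

Under these assumptions the array processor finishes in 4 steps using 4 full processing elements, while the vector processor takes 7 steps using one unit per operation type.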
40
SCALAR PROCESSING Conventional form of processing (von Neumann model) add r1, r2, r3 40
41
SIMD ARRAY PROCESSING Array processor 41
42
VLIW PROCESSING Very Long Instruction Word – We will get back to this later 42
43
VECTOR PROCESSORS A vector is a one-dimensional array of numbers Many scientific/commercial programs use vectors for (i = 0; i<=49; i++) C[i] = (A[i] + B[i]) / 2 A vector processor is one whose instructions operate on vectors rather than scalar (single data) values Basic requirements – Need to load/store vectors → vector registers (contain vectors) – Need to operate on vectors of different lengths → vector length register (VLEN) – Elements of a vector might be stored apart from each other in memory → vector stride register (VSTR) Stride: distance between two elements of a vector
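The roles of VLEN and VSTR can be sketched by running the slide's 50-element loop on a modeled vector unit. Everything here is an assumption for illustration: the register length `MAX_VLEN`, the helper names, the flat `mem` list, and the memory layout. Because 50 elements do not fit in one 16-element register, the sketch processes the loop in chunks and sets the vector length per chunk, a technique usually called strip mining, which this slide does not itself describe. It also uses true division, so results are floats.

```python
MAX_VLEN = 16   # hypothetical maximum number of elements in a vector register

def vload(mem, base, vlen, stride):
    # Load vlen elements starting at base, stride apart (VLEN and VSTR).
    return [mem[base + i * stride] for i in range(vlen)]

def vstore(mem, base, vlen, stride, vr):
    for i in range(vlen):
        mem[base + i * stride] = vr[i]

def vadd(x, y):
    return [p + q for p, q in zip(x, y)]

def vdiv_scalar(x, s):
    return [p / s for p in x]

def vector_average(mem, a, b, c, n, stride=1):
    done = 0
    while done < n:
        vlen = min(MAX_VLEN, n - done)   # set VLEN for this chunk
        off = done * stride
        va = vload(mem, a + off, vlen, stride)
        vb = vload(mem, b + off, vlen, stride)
        vstore(mem, c + off, vlen, stride, vdiv_scalar(vadd(va, vb), 2))
        done += vlen

# Assumed layout: A at 0..49, B at 50..99, C at 100..149, unit stride.
n = 50
mem = [float(i) for i in range(n)] + [2.0 * i for i in range(n)] + [0.0] * n
vector_average(mem, a=0, b=50, c=100, n=n)
print(mem[100:105])
```

With a stride other than 1, the same routines would gather elements that are spaced apart in memory, for example one column of a row-major matrix.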
44
VECTOR PROCESSOR ADVANTAGES + No dependencies within a vector – Pipelining, parallelization work well – Can have very deep pipelines, no dependencies! + Each instruction generates a lot of work – Reduces instruction fetch bandwidth + Highly regular memory access pattern – Interleaving multiple banks for higher memory bandwidth – Prefetching + No need to explicitly code loops – Fewer branches in the instruction sequence 44