Presentation on theme: "Lecture 0: Introduction EEN 312: Processors: Hardware, Software, and Interfacing Department of Electrical and Computer Engineering Fall 2012, Dr. Rozier."— Presentation transcript:

1 Lecture 0: Introduction EEN 312: Processors: Hardware, Software, and Interfacing Department of Electrical and Computer Engineering Fall 2012, Dr. Rozier (UM)

2 Welcome to EEN 312!

3 Professor Eric Rozier (pronounced "ROSE-E-A")

4 Who am I? BS in Computer Science from William and Mary

5 Who am I? BS in Computer Science from William and Mary Studied models of agricultural pests (flour beetles).

6 Who am I? BS in Computer Science from William and Mary Studied models of agricultural pests (flour beetles). And load balancing of super computers.

7 Who am I? First job – NASA Langley Research Center

8 Who am I? First job – NASA Langley Research Center Researched problems in aeroacoustics

9 Who am I? First job – NASA Langley Research Center Researched problems in aeroacoustics – Primarily on the XV-15

10 Who am I? First job – NASA Langley Research Center Researched problems in aeroacoustics – Primarily on the XV-15 – Precursor to the better known V-22

11 Who am I? PhD in CS/ECE from the University of Illinois

12 Who am I? PhD in CS/ECE from the University of Illinois Studied non-linear dynamics of transactivation networks in economically important species…

13 Who am I? PhD in CS/ECE from the University of Illinois Studied non-linear dynamics of transactivation networks in economically important species… corn…

14 Who am I? PhD in CS/ECE from the University of Illinois Worked with the NCSA on problems in super computing, reliability, and big data.

15 Who am I? PhD in CS/ECE from the University of Illinois Worked with the NCSA on problems in super computing, reliability, and big data. Research led to patented advances with IBM

16 Who am I? Served as a visiting scientist and IBM Fellow at the IBM Almaden Research Center in San Jose, CA Helped advance the state of the art in fault-tolerance and our understanding of why systems fail

17 Who am I? Postdoctoral work at the Information Trust Institute – Worked on the Blue Waters supercomputer, the first sustained petaflop machine – Designed new fault-tolerant methods for data protection on large-scale systems

18 Who am I? Assistant Professor at UM ECE

19 Who am I? Assistant Professor at UM ECE – Head of the Trustworthy Systems Lab

20 Who am I? Assistant Professor at UM ECE – Head of the Trustworthy Systems Lab – Working on problems in: Cloud computing Big Data Reliability Security Compliance

21 How to get in touch with me? Office – Department of Electrical and Computer Engineering – Fifth Floor, Room 517 Contact Information – Email: e.rozier@miami.edu – Phone: 8-9752 Currently looking for motivated students – Research projects and papers

22 Office Hours Office – Department of Electrical and Computer Engineering – Fifth Floor, Room 517 Hours: Tuesday 10:00a – 11:00a, Thursday 10:00a – 11:00a, or by appointment

23 COURSE SUMMARY AND OVERVIEW

24 EEN 312 Processors: Hardware, Software, and Interfacing – Class: MM 102 – Lab: McArthur Engineering Building 402 Class website – http://performalumni.org/erozier2/een312.html

25

26 The syllabus…

27 Grades
Grade Component | Percentage
Midterm I | 10%
Midterm II | 10%
Laboratory Projects | 50%
Final Exam | 30%

28 Grades Guaranteed Grades A+’s are assigned on the basis of exceptional work, scoring 99 or 100 for the entire course.

29 Labs Labs are a HUGE component of this course – Lab sessions will be held based on the session you have been assigned and registered for. – Labs for this class will be very demanding. It is unlikely you will finish them during the assigned sessions. – You will need to make good use of your assigned laboratory time to seek guidance from your TAs, but you should expect to spend significant time outside of lab working on your lab assignments.

30 Active Learning After 2 weeks we tend to remember: – Passive learning 10% of what we read 20% of what we hear 30% of what we see 50% of what we hear and see – Active learning 70% of what we say 90% of what we say and do

31 Bloom’s Taxonomy Evaluation Synthesis Analysis Application Comprehension Knowledge

32 Training Good Engineers Understanding processors isn’t our only goal – Critical Reading – Critical Reasoning Ask questions! Think through problems! Challenge assumptions!

33 312

34 304 118

35 Course overview Understanding the abstractions beneath your applications and programs. We will focus on: – How programs are translated into machine language. – How hardware executes machine instructions. – How computers are organized and designed.

36 Course Components Class time – High-level concepts – Hands-on exercises and application – Discussions Labs – The heart of the course – 1-2 weeks each – In-depth exploration of an aspect of system design and organization Exams – 2 Midterms + 1 Final – Test your understanding of concepts and mathematics

37 Textbook

38 Be sure to get the 4th edition! Available from the bookstore – New: $89.95 – Used: $67.50 Available online – Softback: $61.98 (Amazon) – Kindle: $71.99 – Kindle Rental: ~$35 The textbook is essential for this course.

39 LABORATORIES

40 Laboratories TAs – Yilin Yan y.yan4@umiami.edu – Murat Aykin m.aykin@umiami.edu Lab Sections – Wednesday 2:30 – 4:50p – Friday 2:30 – 4:50p

41 Lab Procedure Labs will be completed in groups of 2-3. – You may complete labs as a group, but you must each hand in a separate lab assignment. – You may change groups with each lab.

42 Raspberry Pi

43 Lab Pis

44 We have a set of 16 Raspberry Pis available for the class. Each group will be assigned one for each lab. – Don’t use an unassigned Pi! – Some of our labs will have the potential to reboot the platform, or worse! One group per Pi! Pis used for the lab are accessible from the school network.

45 Laboratory Assignments The labs for this class will require a lot of time. Start them early. Labs will be assigned in class on Tuesday before the first lab session. – It is recommended you prepare any questions for your first laboratory session in advance! Labs are typically due at the beginning of your lab session, 2 weeks after they are assigned.

46 Laboratory Assignments Each student is allocated 3 slip days for the semester. – A slip day can be used to extend the due date for a laboratory by 1 day, no questions asked. – You should indicate on your submitted assignment how many slip days are being used. No other extensions will be granted except in the case of a documented emergency. – Late work suffers A -20% on the first day it is late. A -40% on the second day it is late. A -60% on the third day it is late. No credit for four days or more late.

47 Examinations – Midterm I – February 13th in class – Midterm II – March 3rd in class – Final Exam – May 1st from 11:00a – 1:30p in MM-102.

48 Course Plan University of Miami Honor Code is in effect – Open hands policy on assignments Late policy – Late assignments are only accepted if arrangements are made ahead of time Electronic device policy – Laptops and tablets are ok as long as they’re being used for class – Silence cell phones please
Week | Topics | Lab / Exam
1 | Introduction, Computer Organization, Performance | Lab 0
2 | Instructions, operands, load/store, and numbers | Lab 1
3 | Branches, conditions, loops, procedures and the stack |
4 | Arithmetic, ALUs, Processors | Lab 2
5 | Data path, control, pipelining | MIDTERM I
6 | Jumps, branches, and pipelines | Lab 3
7 | Pipeline hazards, branch prediction, exceptions |
8 | Memory hierarchy, caches, addressing | Lab 4
3/11 | Spring Break |
9 | Cache performance, block replacement, caching algorithms | Lab 4
10 | Virtual memory, paging, page faults, protection | Lab 5
11 | Intro to storage systems | MIDTERM II
12 | Storage systems, reliability, deduplication, RAID, flash and PCM | Lab 6
13 | Connecting processors, memory, and I/O |
14 | Parallel processing, concurrency, and course synthesis | No Lab

49 ON ABSTRACTIONS

50 Abstraction and Reality Most courses in CS/ECE emphasize abstractions – Abstract data types – Abstract analysis Abstractions have limits – Reality raises its ugly heads as bugs in design and implementation. – Understanding the details of underlying systems becomes important!

51 Some Realities What is an int? What is a float?

52 Some Realities Reality #1 An int is not an integer! A float is not a real number! Example: Is x^2 >= 0?

53 Some Realities An int is not an integer! A float is not a real number! Example: Is x^2 >= 0? – Floats? Yes. – Ints? 40000 * 40000 -> 1600000000 50000 * 50000 -> ?? Doesn’t behave like an integer!
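A minimal C sketch of the overflow above (assuming a 32-bit int, which is typical on the platforms we will use; signed overflow is technically undefined behavior, so the wrapped value is only what you will usually observe):

    #include <stdio.h>

    int main(void) {
        int a = 40000 * 40000;  /* 1,600,000,000: still fits below INT_MAX (2,147,483,647) */
        int b = 50000 * 50000;  /* 2,500,000,000: exceeds INT_MAX, so the result wraps */
        printf("%d\n", a);      /* prints 1600000000 */
        printf("%d\n", b);      /* typically prints -1794967296; signed overflow is undefined behavior */
        return 0;
    }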

54 Some Realities Is addition associative? – Does (x + y) + z = x + (y + z)? – Ints? Yes. – Floats? No! (see the sketch below)
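A short C sketch of float non-associativity; the specific values (1e20 and 3.14) are my own illustration, not from the slide:

    #include <stdio.h>

    int main(void) {
        float x = 1e20f, y = -1e20f, z = 3.14f;
        /* Grouped one way, the huge values cancel first and 3.14 survives */
        printf("%f\n", (x + y) + z);  /* 3.140000 */
        /* Grouped the other way, 3.14 is absorbed into -1e20 and lost */
        printf("%f\n", x + (y + z));  /* 0.000000 */
        return 0;
    }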

55 Computer Arithmetic It isn’t random! – Operations have mathematical properties they adhere to. May not be the ones we assume as “usual” – Finite representation in the hardware matters! Observation: – Understanding the hardware implementation is necessary.

56 What kind of abstractions are we using? Code found in BSD implementation of getpeername

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

57 Intended Usage Code found in BSD implementation of getpeername

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, MSIZE);
        printf("%s\n", mybuf);
    }

58 Malicious Usage Code found in BSD implementation of getpeername

    /* Kernel memory region holding user-accessible data */
    #define KSIZE 1024
    char kbuf[KSIZE];

    /* Copy at most maxlen bytes from kernel region to user buffer */
    int copy_from_kernel(void *user_dest, int maxlen) {
        /* Byte count len is minimum of buffer size and maxlen */
        int len = KSIZE < maxlen ? KSIZE : maxlen;
        memcpy(user_dest, kbuf, len);
        return len;
    }

    #define MSIZE 528

    void getstuff() {
        char mybuf[MSIZE];
        copy_from_kernel(mybuf, -MSIZE);
        ...
    }
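One possible hardening, sketched under the assumption that KSIZE and kbuf are defined as above; the name copy_from_kernel_safe and the exact checks are my own illustration, not the actual BSD fix. Rejecting a negative maxlen and doing the comparison with an unsigned size_t keeps -MSIZE from sneaking past the minimum computation and into memcpy:

    /* Sketch of a safer version: refuse negative lengths up front */
    int copy_from_kernel_safe(void *user_dest, int maxlen) {
        if (maxlen < 0)
            return -1;  /* negative request: nothing is copied */
        size_t len = (size_t)KSIZE < (size_t)maxlen ? (size_t)KSIZE : (size_t)maxlen;
        memcpy(user_dest, kbuf, len);
        return (int)len;
    }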

59 Some Realities Reality #2 Knowing assembly is essential to your future!

60 Some Realities Reality #2 Knowing assembly is essential to your future! – You will probably NEVER write assembly programs outside of this class. Compilers are better at it than you are. – But…

61 Some Realities Reality #2 Knowing assembly is essential to your future! – You will probably NEVER write assembly programs outside of this class. Compilers are better at it than you are. – Understanding assembly is key to understanding machine execution. Behavior of programs with bugs. Performance tuning. System Software. Malware analysis.

62 Some Realities Reality #3 Memory matters! – Random access memory is an abstraction with little basis in the physical world. – Memory is not unbounded. Memory must be allocated and managed. Many applications are memory dominated. – Memory reference bugs are very difficult to track down. – Memory performance is non-uniform.

63 Memory Referencing Bug Example Result varies based on architecture.

    double fun(int i) {
        volatile double d[1] = {3.14};
        volatile long int a[2];
        a[i] = 1073741824;  /* Possibly out of bounds */
        return d[0];
    }

    fun(0) ➙ 3.14
    fun(1) ➙ 3.14
    fun(2) ➙ 3.1399998664856
    fun(3) ➙ 2.00000061035156
    fun(4) ➙ 3.14, then segmentation fault

64 Memory Referencing Bug Example

    double fun(int i) {
        volatile double d[1] = {3.14};
        volatile long int a[2];
        a[i] = 1073741824;  /* Possibly out of bounds */
        return d[0];
    }

    fun(0) ➙ 3.14
    fun(1) ➙ 3.14
    fun(2) ➙ 3.1399998664856
    fun(3) ➙ 2.00000061035156
    fun(4) ➙ 3.14, then segmentation fault

Explanation: location accessed by fun(i)
    4 | Saved state
    3 | d7 ... d4
    2 | d3 ... d0
    1 | a[1]
    0 | a[0]

65 Memory Referencing Errors C/C++ do not provide memory protection – Out of bound references – Invalid pointer values – Abuses of allocation Can lead to hard to debug situations – Dependent on architecture and compiler – Action at a distance

66 Memory Performance Example How big of a difference does this simple change make? A 21x slowdown!

    void copyji(int src[2048][2048], int dst[2048][2048]) {
        int i, j;
        for (j = 0; j < 2048; j++)
            for (i = 0; i < 2048; i++)
                dst[i][j] = src[i][j];
    }

    void copyij(int src[2048][2048], int dst[2048][2048]) {
        int i, j;
        for (i = 0; i < 2048; i++)
            for (j = 0; j < 2048; j++)
                dst[i][j] = src[i][j];
    }
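A rough timing harness (my own sketch, not from the slides) that could be used to see the gap between the two loop orders; the exact numbers depend on the machine, compiler, and optimization flags:

    #include <stdio.h>
    #include <time.h>

    #define N 2048
    static int src[N][N], dst[N][N];  /* static: 16 MB each, too large for the stack */

    void copyij(int s[N][N], int d[N][N]) {      /* row-major traversal */
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                d[i][j] = s[i][j];
    }

    void copyji(int s[N][N], int d[N][N]) {      /* column-major traversal */
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                d[i][j] = s[i][j];
    }

    int main(void) {
        clock_t t0 = clock();
        copyij(src, dst);
        clock_t t1 = clock();
        copyji(src, dst);
        clock_t t2 = clock();
        printf("copyij: %f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
        printf("copyji: %f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
        printf("%d\n", dst[N - 1][N - 1]);  /* keep dst live so the copies are not optimized away */
        return 0;
    }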

67 Memory Performance Example

68 Some Realities Reality #4 There is more to performance than asymptotic complexity. – Constants matter too! – Exact op count doesn’t even fully describe the situation! – Must optimize at all levels: algorithm, data representation, functions, loops.

69 Some Realities Reality #4 There is more to performance than asymptotic complexity. – Need to understand how systems work to optimize them! How are programs compiled? How are they executed? How do we measure performance? How do we find bottlenecks? How do we improve performance without affecting the code?

70 Performance of Matrix Multiply Same computer. Same compiler. Same flags. Exactly the same number of operations. Why does this happen? Chart: the best code (K. Goto) runs about 160x faster than the naive triple loop.

71 Performance of Matrix Multiply Reasons for 20x improvement: Blocking or tiling, loop unrolling, array scalarization, instruction scheduling, search to find best choice. Effect: Fewer register spills, L1/L2 cache misses, and TLB misses. Memory hierarchy and other optimizations: 20x Vector instructions: 4x Multiple threads: 2x

72 Some Realities Reality #5 Computers do more than execute programs – Need to get data in and out I/O systems are critical to performance and reliability. – Communicate over networks How to cope with unreliable media? Dealing with concurrency? Cross platform issues?

73 COMPUTER ORGANIZATION

74 Classes of Computers Desktop Computers – General purpose, run a variety of software for many applications – Subject to cost/performance tradeoffs Server Computers – Network based – High capacity, performance, and reliability – Range from small servers to super computers Embedded Computers – Parts of systems, cyberphysical controllers – Power/performance/cost constraints

75 Market Trends

76 Components of a Computer All computers have similar philosophies of organization. Get input, perform computation, produce output. – User interface: Display, keyboard, sensors – Storage devices: Hard disk, CD/DVD, flash – Communication: Network, wifi, etc. – Compute: CPU, GPU, Memory

77 Internals of a Computer

78 Internals of a Processor (CPU) Datapath: Performs operations on data. Control: sequences datapath, memory Cache memory – Small fast SRAM memory for immediate access to data.

79 Abstractions and the CPU Abstractions help us deal with complexity and hide low level details. Instruction set architecture (ISA) – The hardware/software interface Application binary interface – ISA + system software interface

80 Why Abstractions? What is an instruction?

81 Why Abstractions? What is an instruction? – A collection of bits the computer understands and can “execute” or perform. – Example: 00000010001100100100000000100000 – Tells a computer to add two numbers. – How does the computer know?

82 Why Abstractions

    00000010001100100100000000100000

Fields: op code | rs | rt | rd | shamt | function code

What the heck does this even mean? add $t0, $s1, $s2
op code – code for the basic operation of the instruction
rs – the first register source operand
rt – the second register source operand
rd – the register destination operand; receives the result of the operation
shamt – shift amount
function code – selects the specific variant of the operation indicated by the op code
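A small C sketch (my own illustration) that extracts the six fields from that 32-bit word using the bit widths above (6, 5, 5, 5, 5, 6); for add $t0, $s1, $s2 it should report op=0, rs=17, rt=18, rd=8, shamt=0, funct=32:

    #include <stdio.h>

    int main(void) {
        unsigned int instr = 0x02324020u;  /* 00000010001100100100000000100000 */
        unsigned int op    = (instr >> 26) & 0x3Fu;  /* bits 31..26: op code           */
        unsigned int rs    = (instr >> 21) & 0x1Fu;  /* bits 25..21: first source reg  */
        unsigned int rt    = (instr >> 16) & 0x1Fu;  /* bits 20..16: second source reg */
        unsigned int rd    = (instr >> 11) & 0x1Fu;  /* bits 15..11: destination reg   */
        unsigned int shamt = (instr >> 6)  & 0x1Fu;  /* bits 10..6:  shift amount      */
        unsigned int funct = instr & 0x3Fu;          /* bits 5..0:   function code     */
        printf("op=%u rs=%u rt=%u rd=%u shamt=%u funct=%u\n", op, rs, rt, rd, shamt, funct);
        return 0;
    }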

83 The Hardware

84 Why abstractions This: add $t0, $s1, $s2 Is easier than this: 00000010001100100100000000100000

85 Why abstractions You know what is even easier? This: c = a + b; For a human at least…

86 Abstraction Layers High-level language – Level of abstraction is close to the problem domain. – Allows us to be productive! – Allows the code to be machine portable Different machines have different instructions!

87 Abstraction Layers Assembly language – Assembly language is created by a compiler. – The compiler takes a high-level language and produces the instructions necessary to accomplish the indicated algorithm. – Assembly language is a symbolic version of binary instructions.

88 Abstraction Layers Machine Language – Created by the assembler. – The assembler translates symbolic assembly language into the binary machine-language representation the computer actually understands.

89 WRAP UP

90 For next time Read Chapter 1, Sections 1.1 – 1.5, and 1.8. Start Lab 0 early!!! Many thanks to Drs. Lee and Seshia for their text and materials

