Computer Architecture Education: Black Boxes Proved Harmful
Yale N. Patt
The University of Texas at Austin
Workshop on Computer Architecture Education
Toronto, June 24, 2017
What I want to do today
- Point out a number of black box education examples and the problems of the black box approach
- My approach in the freshman course
- My approach in the senior course
- My approach in the graduate course
Examples
- Trace-driven simulation
  - Cache pollution vs. cache prefetching
  - Actual benefit of branch prediction (NOT prediction accuracy)
- Direct-mapped caches vs. set-associative caches
  - Which has the faster cycle time?
- FPGAs
  - We understand: when flexibility trumps performance
  - What about cost: area, time, energy – how much?
- Machine Learning
  - See it works!
  - How does it work? Why does it work?
  - A REAL example with a late-stage PhD student
  - Are we asking for a real catastrophe?
- SimpleScalar
  - Performance due to bugs
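A minimal sketch of why prediction accuracy alone does not tell you the benefit of branch prediction. It uses a textbook-style CPI model; the function names and every default number (base CPI, branch fraction, penalty) are illustrative assumptions, not measurements from any real machine.

```python
# Illustrative model only: all parameter values are assumptions.

def cpi(base_cpi, branch_fraction, accuracy, penalty_cycles):
    """Cycles per instruction, charging a fixed penalty per misprediction."""
    mispredicts_per_instruction = branch_fraction * (1.0 - accuracy)
    return base_cpi + mispredicts_per_instruction * penalty_cycles


def speedup(old_accuracy, new_accuracy, base_cpi=1.0,
            branch_fraction=0.2, penalty_cycles=20):
    """Speedup from improving the predictor, everything else held fixed."""
    old = cpi(base_cpi, branch_fraction, old_accuracy, penalty_cycles)
    new = cpi(base_cpi, branch_fraction, new_accuracy, penalty_cycles)
    return old / new


if __name__ == "__main__":
    # Same accuracy improvement, two different misprediction penalties.
    for penalty in (5, 20):
        gain = speedup(0.90, 0.95, penalty_cycles=penalty)
        print(f"penalty={penalty:2d} cycles: speedup={gain:.3f}")
```

The same five-point accuracy gain is worth much more on the machine with the longer penalty, which is something an accuracy number by itself cannot show.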
On the other hand...
- Picasso vis-à-vis Velázquez
- Norm Jouppi et al. vis-à-vis Google’s TPU
The Freshman Course
- Start with what they “know”
  - The transistor as light switch
  - Not quantum mechanics
- Choose a computer model that is simple (the LC-3)
  - As the genius said: simple, but still rich
- Continually build on what they know
- Continually raise the level of abstraction
- Memorize as little as absolutely necessary
- Try very hard not to introduce magic
The Motivated Bottom-up Approach: Overview
2. Operations on bits, bytes (arithmetic, logical)
3. The transistor
4. Digital Logic
5. The finite state machine
6. The Von Neumann model
7. The LC-3 Instruction set architecture
8. Programming and Debugging
9. Assembly Language programming
10. Data Structures (stacks, linked lists, character strings, trees)
11. Physical I/O, traps, interrupts
12. Transition to C
13. Programming in C
The ISA
The Data Path
The State Machine
Some observations
- At Michigan, from a Mechanical Engineering professor:
  - I want my students to take your course
  - I can teach them vibration, friction, etc.
  - They also need to know how computers work
  - There are 55 microprocessors in every new car
- Students I talk to in my freshman course at UT
  - Intro to Computing, required of all ECE and BioMed majors
  - I teach it every other year
  - 388 freshmen in Fall 2015 (most with super AP credit)
What a Computer is to an Engineer (and not just a Computer Engineer)
- A tool used to solve problems (e.g., MATLAB)
  - Computers process algorithmically
  - Computers process numbers
- An embedded processor that controls a system (airplane, factory, heart monitor, traffic flow)
  - Sensors
  - Actuators
  - Functions
  - Concept of State
What ALL engineers need to know in 2017 (in addition to Physics and Math)
- How the computer works
- How the numbers are represented
- How an algorithm “works” on a computer
- From sensors (inputs) via “programs” to actuators (outputs)
What an engineer does not need a course in
- Excel
- Word
- Web Browsing
- Rote learning of programming
Some comments on my choice of the LC-3
- First, I wanted an anti-black-box
  - Something we could make absolutely concrete
- A subset of a real ISA is not without glitches
- A clean sheet of paper can guarantee pedagogy
- We can still get the “wow” effect (on a simulator)
- Students can debug their own programs
- We can still build on this in later courses
LC-3: What it has, and what it doesn’t have
- 8 registers, NZP condition codes, 15 opcodes, 16-bit address space
- Two addressing modes (PC+offset, Reg+offset)
- Privileged memory
- Memory-mapped I/O
- What we left out
  - Endian-ness
  - Lots of data types (no floats, no bytes, only 16-bit integers)
  - Lots of addressing modes
  - All but 15 essential opcodes
    - Resisting pressure for MUL, SHF, ADDC
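To make the feature list concrete, here is a minimal sketch of three LC-3 instructions (ADD, LD, LDR) covering the eight registers, the NZP condition codes, and both addressing modes. The class and function names (`TinyLC3`, `sext`, `demo`) are made up for illustration, the 0x3000 start address is just the conventional user-program origin, and this is nowhere near a full simulator: no privilege, no I/O, no other opcodes.

```python
# Sketch only: three opcodes out of fifteen; names are hypothetical.

def sext(value, bits):
    """Sign-extend the low `bits` bits of `value` to a Python int."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


class TinyLC3:
    def __init__(self):
        self.reg = [0] * 8           # 8 registers, R0..R7
        self.mem = [0] * (1 << 16)   # 16-bit address space of 16-bit words
        self.pc = 0x3000
        self.cc = "Z"                # exactly one of N, Z, P is set

    def set_cc(self, dr):
        value = self.reg[dr]
        if value == 0:
            self.cc = "Z"
        elif value & 0x8000:         # bit 15 set means negative
            self.cc = "N"
        else:
            self.cc = "P"

    def step(self):
        ir = self.mem[self.pc]
        self.pc = (self.pc + 1) & 0xFFFF   # offsets are relative to the incremented PC
        opcode = ir >> 12
        dr = (ir >> 9) & 0x7
        if opcode == 0b0001:               # ADD (register or imm5 form)
            sr1 = (ir >> 6) & 0x7
            operand = sext(ir, 5) if ir & 0x20 else self.reg[ir & 0x7]
            self.reg[dr] = (self.reg[sr1] + operand) & 0xFFFF
        elif opcode == 0b0010:             # LD: PC + offset9
            self.reg[dr] = self.mem[(self.pc + sext(ir, 9)) & 0xFFFF]
        elif opcode == 0b0110:             # LDR: base register + offset6
            base = (ir >> 6) & 0x7
            self.reg[dr] = self.mem[(self.reg[base] + sext(ir, 6)) & 0xFFFF]
        else:
            raise NotImplementedError(f"opcode {opcode:04b} not in this sketch")
        self.set_cc(dr)


def demo():
    m = TinyLC3()
    m.mem[0x3000] = 0b0010_000_000000010    # LD  R0, PC+2   (reads mem[0x3003])
    m.mem[0x3001] = 0b0001_001_000_1_11111  # ADD R1, R0, #-1
    m.mem[0x3002] = 0b0110_010_000_000000   # LDR R2, R0, #0
    m.mem[0x3003] = 0x4000
    m.mem[0x4000] = 0x8001
    for _ in range(3):
        m.step()
    return m.reg[0], m.reg[1], m.reg[2], m.cc


if __name__ == "__main__":
    r0, r1, r2, cc = demo()
    print(f"R0={r0:#06x} R1={r1:#06x} R2={r2:#06x} CC={cc}")
```

Every step is visible: fetch, increment the PC, decode fields by bit position, execute, set one condition code. Nothing in it has to be taken on faith.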
What I learned about students
- Freshmen can handle serious meat
- Students don’t need glitz
- Computer architecture can begin with freshmen
- Students can debug their own programs
  - If you get rid of the black box so they know what’s going on
The Senior Course
- Lectures attempt to emphasize fundamentals
  - Features in various ISAs, Microarchitectures
  - Physical memory, Virtual memory, caches
  - Process state, exceptions, interrupts
  - Pipelining, Branch Prediction
  - Out-of-order execution, in-order retirement
  - Single-thread and multi-thread parallelism
  - Cache coherence, memory consistency
- Labs attempt to reinforce selected lecture topics (LC-3b)
  - Lab 1: Write a program in LC-3b Assembly Language, and an Assembler
  - Lab 2: Provide control signals: datapath, state machine, eq
  - Lab 3: Add exception handling
  - Lab 4: Add virtual memory
  - Lab 5: Pipelined design
The LC-3b ISA
- We added virtual memory
- The NOT became the XOR (retaining compatibility!)
- Word addressing became byte addressing
  - Ergo, endian-ness
- LDI, STI were scrapped
- A comprehensive SHF instruction was added
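A small sketch of why turning NOT into XOR can retain compatibility. As I understand the encodings, the LC-3 NOT uses opcode 1001 with its low six bits all ones; if XOR reuses that opcode with bit 5 selecting a 5-bit immediate, the old NOT bit pattern decodes as XOR with an immediate of -1, which complements every bit. The helper names below are hypothetical.

```python
# Sketch only: models the arithmetic, not the full instruction decode.

def sext5(imm5):
    """Sign-extend a 5-bit immediate field."""
    imm5 &= 0x1F
    return imm5 - 0x20 if imm5 & 0x10 else imm5


def xor_imm(src, imm5):
    """XOR a 16-bit value with a sign-extended 5-bit immediate."""
    return (src ^ sext5(imm5)) & 0xFFFF


def lc3_not(src):
    """Bitwise complement of a 16-bit value."""
    return ~src & 0xFFFF


if __name__ == "__main__":
    for value in (0x0000, 0x00FF, 0x1234, 0xFFFF):
        # imm5 = 0b11111 sign-extends to -1, i.e. all ones in 16 bits.
        print(f"{value:#06x}: XOR={xor_imm(value, 0b11111):#06x} "
              f"NOT={lc3_not(value):#06x}")
```

So an old program's NOT instructions keep their meaning, while the new instruction gains the register and general-immediate forms.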
The Graduate Course
- The term project: Implement an ISA (subset of x86)
  - Team of three students, project worth half the grade
  - Start with a clean sheet of paper
  - Structural-level Verilog
- Students decide what to put into their design
  - Students responsible for data path, state machine, control signals, buses, cache and memory, I/O
  - Students decide what accelerators
    - What it will do to cycle time (teams push cycle time)
    - What interconnect will be required
  - How many stages in the pipeline
    - Decode: One or two cycles?
    - What effect on performance (e.g., misprediction penalty)?
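The "one or two cycles for decode" question can be sketched as a trade between cycle time and misprediction penalty. This is an illustrative back-of-the-envelope model, not what the course teams actually compute: the `Design` class, the function name, and every number (cycle times, penalties, branch fraction) are assumptions chosen only to show the shape of the trade-off.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    cycle_ns: float        # clock period (assumed value)
    penalty_cycles: int    # cycles lost per mispredicted branch (assumed value)


def time_per_instruction_ns(design, mispredict_rate,
                            base_cpi=1.0, branch_fraction=0.2):
    """Average time per instruction = CPI x cycle time."""
    cpi = base_cpi + branch_fraction * mispredict_rate * design.penalty_cycles
    return cpi * design.cycle_ns


# Hypothetical pair: splitting decode shortens the clock slightly
# but adds one stage to the misprediction penalty.
one_cycle_decode = Design("one-cycle decode", cycle_ns=1.00, penalty_cycles=10)
two_cycle_decode = Design("two-cycle decode", cycle_ns=0.98, penalty_cycles=11)

if __name__ == "__main__":
    for rate in (0.05, 0.20):
        for d in (one_cycle_decode, two_cycle_decode):
            t = time_per_instruction_ns(d, rate)
            print(f"mispredict rate {rate:.2f}, {d.name}: {t:.4f} ns/instr")
```

With these made-up numbers the two-cycle decode wins when branches are predicted well and loses when they are not, so the answer depends on the rest of the design, which is the point of making the students decide.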
Finally,
- In my view, black boxes = magic
- Magic impedes fundamental understanding
- Lack of understanding limits how high you can soar
- …and that is harmful!
Thank you!