The Register File and ALU


1 The Register File and ALU
CS/COE 0447 Jarrett Billingsley

2 Class announcements no!!!!!!!!!!!!!! CS447

3 A taste of design

4 Tug-of-war
register file design is constrained by many competing factors
compilers love lots of identical registers! …but there are diminishing returns.
ISA says instructions have 2 operands and 1 destination …except for this one instruction that has 2 destinations.
with lots of registers, function calls are faster! …but context switches are slower.
fast L1 cache? not as many regs needed
multi-issue CPU: need to read 4 regs and write 2
humans like intuitive assembly language!
more registers means more silicon…

5 It doesn't have to be this way
CISC CPUs usually have small sets of registers, and many have special purposes or behaviors:
8086 (16 bits): ax bx cx dx si di sp bp
z80 (8 bits): a f b c d e h l ix iy sp
6502 (8 bits): A X Y
PDP8 (12 bits): AC
RISC CPUs usually have 32* mostly-interchangeable registers: MIPS, RISC, SPARC, ARMv8, AVR, RISC-V…
32/64 bits: r0 r1 r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r18 r19 r20 r21 r22 r23 r24 r25 r26 r27 r28 r29 r30 r31
*or 32 at a time
why is this? well, what do you remember about the differences between RISC and CISC?
- RISC is made for compilers to write programs; CISC is made for humans to write programs.
- humans are weird, can only keep a few things in their brains at once, and need help with names and stuff.
- compilers want lots of identical registers because then it's much easier to produce machine code algorithmically.

6 Monkeys and Ladders, revisited
the thing is… sometimes, the hose is real.
it's hard to tell if we're all doing something "just because" or because there's a good reason.
don't waste time reinventing the wheel. first, find out why the wheel is shaped that way.
- then you can decide if it's a bullshit reason or not.
- 32 registers just happens to be a nice number of registers. 16 might even be better, but there's not much of a compelling reason to switch.
image credit: throwcase.com

7 A word of advice
you will see many imperfect designs in your life
but in problem-solving, perfection isn't always the goal
everyone has to work within the constraints they're given
also don't be a judgmental ass about someone else's design
because one, it's shitty, and two, they know more about why it was designed that way, so you're just being presumptuous
9 times out of 10 it's because their boss told them to
- I'm not a wise man but I play one on TV

8 The register file

9 The MIPS register file
soooo we're gonna need 32 registers, right?
[diagram: registers 0 through 31]
god, this is gonna be a pain to demonstrate
let's just look at the first 4 instead.

10 1/8 of the MIPS register file
soooo we're gonna need 4 registers, right?
except, register 0 is named zero, and it's always 0, and if you write to it, it's still 0, and if you read from it, it's always 0, so maybe it should just be 0. the constant 0.
[diagram: the constant 0, plus registers 1, 2, and 3, each with D, Q, and en pins]
now let's work on reading from the registers.
registers constantly output whatever value they store.
how will we choose which register to read, though?
- 0.
- choOOOoooooOOOOooooose……….. that means a mux.

11 Reading from one register
we'll use a mux to decide which register to read.
so what decides which register to read?
add t0, t1, t2
[diagram: registers 1, 2, and 3 (holding 83, 4, and 29) feed a mux; its select input picks which value comes out]
uh, but this mux only reads one register.
so what do you think we'll need?
this is a muxtopus: a mux with a bunch of legs coming out of it.
- this muxtopus is missing a few legs
- how the registers get from your assembly text into that control signal input is next week
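a minimal Python sketch of the 4-input mux from this slide, assuming it's built out of three 2-input muxes (that internal structure is an assumption, not something the slide shows). the function name `mux4` and the argument names are made up for illustration; the values 83, 4, and 29 are the ones in the diagram.

```python
def mux4(in0, in1, in2, in3, select):
    """4-to-1 mux: a 2-bit select decides which input reaches the output."""
    s1, s0 = (select >> 1) & 1, select & 1
    low = in1 if s0 else in0      # 2-to-1 mux over inputs 0 and 1
    high = in3 if s0 else in2     # 2-to-1 mux over inputs 2 and 3
    return high if s1 else low    # last 2-to-1 mux picks between the pairs

# input 0 is the constant 0; inputs 1-3 are the register outputs.
value = mux4(0, 83, 4, 29, select=1)
```

all four inputs are "live" the whole time, since registers constantly output their values; the select input only decides which one gets through.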

12 Another mux!
we duplicate that circuitry to be able to read 2 registers at once!
now we can read from two different registers at the same time.
or, if we like, the same register twice…
[diagram: registers 1, 2, and 3 (holding 83, 4, and 29) feed two muxes, each with its own select input]
- this is why adding a second output to RAM is impractical: look at all the extra wires just for 4 locations!
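here's a rough sketch of the two-read-port idea in Python. list indexing stands in for each mux, and the names `read_ports`, `rs`, and `rt` are assumptions chosen for illustration (the slide itself only shows two select inputs).

```python
# index 0 is the constant 0; registers 1-3 hold the slide's example values.
registers = [0, 83, 4, 29]

def read_ports(regs, rs, rt):
    """Two independent muxes looking at the same register outputs."""
    port_a = regs[rs]   # first mux
    port_b = regs[rt]   # second mux, wired to the very same registers
    return port_a, port_b

different = read_ports(registers, 1, 3)
same_twice = read_ports(registers, 3, 3)
```

nothing stops both selects from pointing at the same register, because reading doesn't change anything.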

13 Writing to a register
writing is sort of the mirror image. sort of.
what decides which register to write?
add t0, t1, t2
we're making a choice here, but a mux doesn't really make sense…
do we always write to a register?
bne t0, t2, top
we can also use the write enables to make choices.
[diagram: registers 1, 2, and 3 (holding 83, 4, and 29), each with D, Q, and en pins]
so let's think about the logic of when each of these registers should change.
- register 1 should change if we've got a destination AND the destination is 1.
- register 2 should change if we've got a destination AND the destination is 2.
- etc…

14 A first attempt
we only want to write to one register at a time… if any.
we'll need a write enable signal for the register file as a whole.
it's 1 (true) for instructions that have a destination.
every register needs an AND gate, cause that's what we said.
but… how do I check if the destination == 1? or 2, or 3?
could try a comparator…
and repeat for every damn register…
this feels verbose.
[diagram: WE and dest inputs; a comparator (dest = 1) and an AND gate drive register 1's en]
- when the register file's WE=0, no registers change. when its WE=1, 1 register changes.
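the comparator-plus-AND-gate attempt can be sketched like this in Python. `write_enables` is a hypothetical helper name; the logic is just "WE AND (dest == i)" repeated for each register, with slot 0 forced off because it's the constant 0.

```python
def write_enables(we, dest, num_regs=4):
    """One comparator and one AND gate per register: en[i] = WE AND (dest == i)."""
    enables = [False]  # slot 0 is the constant 0, so it never gets written
    for i in range(1, num_regs):
        is_dest = (dest == i)          # the comparator
        enables.append(we and is_dest) # the AND gate
    return enables

add_like = write_enables(we=True, dest=2)   # an instruction with a destination
bne_like = write_enables(we=False, dest=2)  # no destination, so nothing changes
```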

15 Chekhov's Gun
I said this component was pointless 95% of the time. well, now it has a point.
a demultiplexer forwards its input to one of its outputs. the rest will be 0.
now, a register is only written when WE=1 and it is selected as the destination register.
[diagram: WE feeds a DEMUX whose select input is dest; each output drives one register's en]
- is this a demuxtopus?
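a small Python sketch of the demux doing the same job in one component. `demux4` is an illustrative name, and treating output 0 as unconnected (since register 0 is a constant) is an assumption carried over from the earlier slides.

```python
def demux4(value, select):
    """1-to-4 demux: the selected output gets the input, the rest get 0."""
    return [value if i == select else 0 for i in range(4)]

# WE goes in as the demux input, dest is the select.
enables_writing = demux4(1, 3)   # WE=1, dest=3: only register 3 is enabled
enables_idle = demux4(0, 3)      # WE=0: every enable is 0, nothing changes
```

output 0 would just be left dangling here, which is how writes to the zero register get ignored.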

16 The last thing to hook up
the input data has to go to the destination register.
but because of our write enable circuitry…
we can hook up the data input to all the registers' data inputs.
the write enable signal is like a door: only the register with WE=1 will be changed.
we're done!
[diagram: Data (74 in the example) is wired to every register's D input]
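to see why broadcasting the data is safe, here's a sketch of one clock edge in Python. `clock_edge` is a made-up name, and the example assumes the 74 from the slide is being written to register 1 (the slide's diagram doesn't survive well enough to be sure which register it targets).

```python
def clock_edge(regs, data, we, dest):
    """Every register sees the same data; only the enabled one keeps it."""
    enables = [we and i == dest for i in range(len(regs))]
    new_regs = [data if en else old for old, en in zip(regs, enables)]
    new_regs[0] = 0  # slot 0 is the constant 0 no matter what
    return new_regs

before = [0, 83, 4, 29]
after = clock_edge(before, data=74, we=True, dest=1)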

17 The MIPS register file
now we have the completed register file:
there's one input or write port.
there are two output or read ports. each port can read a different register.
there's the clock input, of course…
a single write enable…
and inputs to select the source and destination registers.
[diagram: Register File block with WE, rd, rs, and rt inputs]

18 The ALU

19 Arithmetic and Logic Unit
we talked about this last time. it does arithmetic and logic…
remember the lab last week? the thing where you can choose to add, subtract, multiply, or divide?
mmmmmmyep. that's an ALU.

20 It really is that straightforward
an ALU can be entirely made of combinational logic.
do everything, but only pick the thing you need.
[diagram: inputs A and B feed + and - blocks; Op selects which result comes out]
- the Op(eration) signal controls what the ALU does
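"do everything, but only pick the thing you need" can be sketched in Python like so. the slide only shows add and subtract; the `and` and `or` entries, the op names, and the 32-bit mask are assumptions added to round out the example.

```python
MASK = 0xFFFFFFFF  # assuming 32-bit values

def alu(a, b, op):
    """Compute every result, then let Op pick one (like a mux on the outputs)."""
    results = {
        "add": (a + b) & MASK,
        "sub": (a - b) & MASK,
        "and": a & b,
        "or": a | b,
    }
    return results[op]

total = alu(83, 29, "add")
difference = alu(29, 83, "sub")   # wraps around like 32-bit hardware would
```

in hardware all of those results really do exist at once; the unselected ones are just thrown away.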

21 Bit slicing
the book makes the ALU like this.
this approach is called bit slicing: build a 1-bit ALU, then copy-and-paste it
then they make it more confusing???
it might be what they really use when designing a chip, but it's not great for learning
it does the same thing but is way harder to understand
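for comparison, here's roughly what bit slicing looks like as a Python sketch: a 1-bit slice, copied once per bit, with the carry chained from each slice to the next. the function names, the three ops, and the 8-bit default width are assumptions for illustration, not the book's exact design.

```python
def alu_slice(a, b, carry_in, op):
    """One bit of the ALU: returns (result_bit, carry_out)."""
    if op == "and":
        return a & b, 0
    if op == "or":
        return a | b, 0
    if op == "add":
        # a full adder
        total = a ^ b ^ carry_in
        carry_out = (a & b) | (carry_in & (a ^ b))
        return total, carry_out
    raise ValueError(f"unknown op: {op}")

def sliced_alu(a, b, op, bits=8):
    """Copy-and-paste the slice `bits` times, chaining the carries."""
    result, carry = 0, 0
    for i in range(bits):
        bit, carry = alu_slice((a >> i) & 1, (b >> i) & 1, carry, op)
        result |= bit << i
    return result

total = sliced_alu(83, 29, "add")
```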

22 What about multiply and divide though??
MIPS also does them separately because they're slow.
it sends the multiplication off to a separate unit.
then we run other stuff…
…and later, ask for the result.
mult t0, t1
add t2, t3, t4
and t2, t2, a2
...
move v0, t2
mflo v1
[diagram: the Main ALU (+-) next to a separate ×÷ Unit]
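a sketch of the separate multiply unit in Python, assuming unsigned 32-bit operands to keep it short (the real `mult` is signed). the class name `MulDivUnit` is made up; `mult`, `mflo`, and `mfhi` are named after the MIPS instructions, where the 64-bit product is split across the HI and LO registers.

```python
MASK = 0xFFFFFFFF

class MulDivUnit:
    """Holds the product in its own HI/LO registers until someone asks for it."""

    def __init__(self):
        self.hi = 0
        self.lo = 0

    def mult(self, a, b):
        product = (a & MASK) * (b & MASK)   # unsigned, for simplicity
        self.hi = (product >> 32) & MASK    # upper 32 bits
        self.lo = product & MASK            # lower 32 bits

    def mflo(self):
        return self.lo

    def mfhi(self):
        return self.hi

unit = MulDivUnit()
unit.mult(6, 7)      # mult t0, t1
# ...the main ALU is free to run add, and, move here...
v1 = unit.mflo()     # mflo v1
```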

23 Hey, that's neat actually
by making the multiply/divide unit separate from the rest of the CPU, we can do fun things like overclock it.
maybe the CPU runs at 2 GHz, but the divider at 8 GHz
now the divider does 4 steps on each main CPU cycle!
but… what should happen if the program tries to get the quotient before the division is done?
how do we "pause" the CPU when that happens?
these issues are (mostly) solved by superscalar execution
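one possible way to picture the "pause" is a stall: the CPU just burns cycles until the divider says it's done. the Python sketch below assumes a 32-step division (one step per quotient bit, an assumption) and uses the slide's 4 steps per CPU cycle; `Divider` and `read_quotient` are hypothetical names, and this is only one answer to the slide's question, not necessarily the one the course has in mind.

```python
class Divider:
    STEPS_NEEDED = 32        # assumption: one step per quotient bit
    STEPS_PER_CPU_CYCLE = 4  # 8 GHz divider / 2 GHz CPU

    def __init__(self):
        self.steps_left = 0
        self.quotient = 0

    def start(self, a, b):
        self.quotient = a // b            # the answer "arrives" once steps run out
        self.steps_left = self.STEPS_NEEDED

    def tick(self):
        """One main CPU cycle passes."""
        self.steps_left = max(0, self.steps_left - self.STEPS_PER_CPU_CYCLE)

    @property
    def busy(self):
        return self.steps_left > 0

def read_quotient(divider):
    """Stall until the divider is done; returns (quotient, stall_cycles)."""
    stalls = 0
    while divider.busy:
        divider.tick()
        stalls += 1
    return divider.quotient, stalls

d = Divider()
d.start(83, 4)
d.tick()   # the CPU runs two unrelated instructions first
d.tick()
quotient, stalls = read_quotient(d)
```

the more useful work the program does between starting the division and asking for the result, the fewer cycles get wasted on stalling.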

24 Save Our Silicon
an ALU can be a pretty sizeable chunk of space
you might reuse the ALU hardware for multiple purposes
and t0, t2, t5 (inputs t2 and t5, output t2 & t5)
bne t0, 10, lab1 (inputs t0 and 10, output t0 - 10)
b lab2 (inputs PC and 24, output PC + 24)
but we can't do all three at the same time.
so either duplicate parts of the ALU, or use multi-cycle.

25 Ah, whatever, use as much silicon as you want
a multi-issue CPU can run multiple instructions in parallel
add t0, t0, a0
sub t1, t1, a1
these are two independent calculations. let's do them at the same time!
now we need four read ports and two write ports.
and 2 ALUs, of course.
[diagram: the Register File feeds t0 and a0 to ALU 1, and t1 and a1 to ALU 2]
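a tiny Python sketch of issuing that pair together. it assumes the two instructions are independent (no checking is done), uses a dict keyed by register name in place of a real register file, and `issue_pair` plus the starting values are invented for illustration.

```python
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b}

def issue_pair(regs, first, second):
    """Each instruction is (op, rd, rs, rt). Assumes the pair is independent."""
    (op1, rd1, rs1, rt1), (op2, rd2, rs2, rt2) = first, second
    # four read ports: every source is read before anything is written
    a1, b1, a2, b2 = regs[rs1], regs[rt1], regs[rs2], regs[rt2]
    result1 = OPS[op1](a1, b1)   # ALU 1
    result2 = OPS[op2](a2, b2)   # ALU 2
    # two write ports: both results land on the same clock edge
    regs[rd1] = result1
    regs[rd2] = result2

regs = {"t0": 10, "t1": 20, "a0": 3, "a1": 5}
issue_pair(regs, ("add", "t0", "t0", "a0"), ("sub", "t1", "t1", "a1"))
```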

