
Hardware Compilation Gordon J. Pace December 2007.


1 Hardware Compilation Gordon J. Pace December 2007

2 Introduction Circuits can be used to implement algorithms … And are obviously as expressive as software … But would you want to design a circuit for a task by hand? How can high level languages be automatically compiled to hardware?

3 But … Running compiled code as software involves simply putting the program in memory … But running code compiled to hardware requires building the circuit. Or does it? One can use programmable logic devices (such as FPGAs, field-programmable gate arrays).

4 Programmable Logic Devices Not as fast as ASIC (application-specific integrated circuit), and not as big, and not as cheap… But you can download any circuit onto such a board! Consists of a large array of gates which can be interconnected as desired. May include more complex units on-board than simple logic gates.

5 This lecture I will be introducing the concepts behind hardware compilation, by starting with a simple imperative parallel language and slowly extending it. I will describe hardware using Lava, a hardware description language embedded in Haskell, which will enable me to show you the code of an actual hardware compiler.

6 But before we start … A short reminder about circuit components and a Lava primer

7 Hardware Components We will be using the following basic components in our circuits: and, or, not, delay

8 Building a Multiplexer [Circuit diagram: inputs sel, i0, i1; output o] o = if sel then i1 else i0
    mux (sel, (i0, i1)) = or2 (m0, m1)
      where m0 = and2 (i0, inv sel)
            m1 = and2 (i1, sel)
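The multiplexer can be tried out without Lava by modelling a signal as a list of Booleans, one per clock tick. This is a minimal sketch under that assumption: `Signal`, `inv`, `and2` and `or2` below are plain-Haskell stand-ins for the Lava primitives of the same names, not Lava itself.

```haskell
-- Assumption: a signal is a list of Booleans, one value per clock tick.
-- These definitions imitate the Lava primitives; they are not Lava.
type Signal = [Bool]

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- The multiplexer from the slide: o = if sel then i1 else i0.
mux :: (Signal, (Signal, Signal)) -> Signal
mux (sel, (i0, i1)) = or2 (m0, m1)
  where
    m0 = and2 (i0, inv sel)  -- passes i0 while sel is low
    m1 = and2 (i1, sel)      -- passes i1 while sel is high
```

For example, `mux ([False, True, False, True], ([True, True, False, False], [False, True, True, False]))` evaluates to `[True, True, False, False]`: the output follows `i0` on ticks where `sel` is low and `i1` where it is high.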

9 Building a Register [Circuit diagram: inputs write, i; output o] o = if write then i else keep the previous value
    register (write, i) = o
      where o = mux (write, (old, i))
            old = delay low o
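The feedback through `delay` can be simulated in plain Haskell because lists are lazy: the delayed signal provides its first value before the rest of the output is known. The sketch below again assumes a list-of-Booleans model of signals; `Signal`, `low`, `delay`, `inv`, `and2`, `or2` are stand-ins for the Lava primitives, not Lava itself.

```haskell
-- Assumption: a signal is a lazy list of Booleans, one value per clock tick.
type Signal = [Bool]

low :: Signal
low = repeat False

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- One-tick delay: first the initial value, then the input one tick late.
delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

mux :: (Signal, (Signal, Signal)) -> Signal
mux (sel, (i0, i1)) = or2 (and2 (i0, inv sel), and2 (i1, sel))

-- The register from the slide: take i when write is high, else keep the old value.
register :: (Signal, Signal) -> Signal
register (write, i) = o
  where
    o   = mux (write, (old, i))
    old = delay low o  -- the output of the previous tick, initially low
```

With `write = [True, False, False, True, False]` and `i = [True, False, False, False, True]`, `register (write, i)` evaluates to `[True, True, True, False, False]`: the value written at tick 0 is held until the next write at tick 3.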

10 Back to Hardware Compilation How can one compile a high-level program into a hardware circuit? [Block diagram: inputs n, pow; output result]
    main (input n, pow: int16, output result: int16) {
      local e: int16;
      result := 1;
      e := pow;
      while (e > 0) {
        result := result * n;
        e := e - 1;
      }
    }

11 But how should it work? Option 1: Execute the algorithm at every clock tick What about non-terminating programs? Behaviour is identical in all clock ticks Algorithms can only describe combinational circuits What about long combinational logic? How long should a clock tick take?

12 But how should it work? Option 2: Program runs over several clock ticks When will the result be available? How will we know that the result is ready? Should the control over clock ticks be in the hands of the programmer or the compiler?

13 Mini-Flash An imperative parallel language with one output variable

14 Mini-Flash Syntax program ::= Skip | Delay | Emit | program ; program | if signal then program else program | while signal do program | program || program

15 Informal semantics of Mini-Flash Skip terminates immediately and does nothing. Delay takes one clock tick to terminate. ; is sequential composition. || is fork-join parallel composition – execute in parallel, and terminate once both sides have terminated. Conditionals and loops work as usual.

16 What about Emit ? The program has only one output wire, Which is always low unless an Emit instruction has been executed in that clock tick. The Emit instruction terminates immediately.

17 Macros We will be defining macros to avoid giving long, difficult to understand programs as examples, and to avoid having to deal with function definitions. emit1 = Emit; Delay

18 Example: An Oscillator oscillator = forever { Delay; emit1 } where forever is defined as: forever P = while high do P

19 Example: Reading Input Copy the input to the output: copy in = forever { if in then emit1 else Delay } Copy input to output but avoiding two sequential high values: safecopy in = forever { if in then { emit1; Delay } else Delay }

20 Example: Waiting for a Signal Block the program until a signal becomes high: wait in = while (inv in) do Delay

21 A final example: A detonator Two persons get a controller each. Each controller has two buttons, which must be pressed in sequence (possibly with a pause in between) to enable. Both controllers must be triggered to enable the detonator. detonator ((a, b), (c, d)) = { wait a; wait b } || { wait c; wait d }; Emit [Block diagram: inputs a, b, c, d]

22 Compiling Mini-Flash From programs to circuits.

23 The Problem Unlike execution of software on traditional processors, hardware is inherently parallel. How do we perform sequential composition? How do we know a circuit has finished its computation?

24 The Shape of Circuits to Come Solution: The circuits will have an extra input start to start their execution and an extra output finish to mark their termination. [Block diagram: wires start, finish, emit, inputs]

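25 Compiling Skip [Circuit diagram: wires start, finish, emit; constant low]

26 Compiling Delay [Circuit diagram: wires start, finish, emit; constant low]

27 Compiling Emit [Circuit diagram: wires start, finish, emit]

28 Compiling P;Q [Circuit diagram: blocks P and Q; wires start, finish, emit]

29 Compiling Conditionals [Circuit diagram: blocks P and Q; wires start, finish, emit, cond]

30 Compiling Parallel Composition [Circuit diagram: blocks P, Q and a synchroniser; wires start, finish, emit]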

31 The Synchroniser Outputs high when both inputs have become high … Can be implemented using the other language constructs (1):
    synchronise (f1, f2) =
      forever {
        wait (or2 (f1, f2));
        if (and2 (f1, f2)) then Skip else { Delay; wait (or2 (f1, f2)) };
        emit1
      }
(1) Well, this synchroniser almost always works …
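For comparison, a synchroniser can also be sketched directly at gate level, with one register per input remembering that the input has already gone high. This is not the construction on the slide, only one possible alternative; it assumes a list-of-Booleans model of signals, and `Signal`, `low`, `delay`, `inv`, `and2`, `or2` are plain-Haskell stand-ins for the Lava primitives.

```haskell
-- Assumption: a signal is a lazy list of Booleans, one value per clock tick.
type Signal = [Bool]

low :: Signal
low = repeat False

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

-- Alternative sketch (not the slide's definition): output high in the tick
-- where both inputs have been seen, then forget both and start again.
synchroniser :: (Signal, Signal) -> Signal
synchroniser (f1, f2) = done
  where
    got1  = or2 (f1, seen1)                    -- f1 high now or in an earlier tick
    got2  = or2 (f2, seen2)
    done  = and2 (got1, got2)
    seen1 = delay low (and2 (got1, inv done))  -- remembered until done fires
    seen2 = delay low (and2 (got2, inv done))
```

With `f1` high at tick 1 and `f2` high at tick 3, `synchroniser ([False, True, False, False, False], [False, False, False, True, False])` evaluates to `[False, False, False, True, False]`.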

32 Compiling Loops [Circuit diagram: block P; wires start, finish, emit, cond]

33-38 Beware! Compiling: while high do Skip [Circuit diagram built up over six slides: the loop circuit with wires start, finish, emit, condition high and body Skip, with constants high and low propagated step by step] Combinational loop!

39 Beware! Compiling: while high do Skip To avoid combinational loops, ensure that bodies of while loops always take time to execute.

40 Writing a Mini-Flash Compiler From programs to circuits in one page of code!

41 Mini-Flash Syntax
    data MiniFlash
      = Skip
      | Delay
      | Emit
      | MiniFlash :>: MiniFlash
      | IfThenElse Signal (MiniFlash, MiniFlash)
      | While Signal MiniFlash
      | MiniFlash :|: MiniFlash

42 Mini-Flash in Lava The type of circuits produced:
    type MiniFlashCircuit = Signal -> (Signal, Signal)
The compiler:
    compile :: MiniFlash -> MiniFlashCircuit

43 Compiling Skip [Circuit diagram: wires start, finish, emit; constant low]
    compile Skip start = (finish, emit)
      where finish = start
            emit   = low

44 Compiling Delay [Circuit diagram: wires start, finish, emit; constant low]
    compile Delay start = (finish, emit)
      where finish = delay low start
            emit   = low

45 Compiling Emit [Circuit diagram: wires start, finish, emit]
    compile Emit start = (finish, emit)
      where finish = start
            emit   = start

46 Compiling P;Q [Circuit diagram: blocks P and Q; wires start, finish, emit]
    compile (p :>: q) start = (finish, emit)
      where (middle, emit1) = compile p start
            (finish, emit2) = compile q middle
            emit = or2 (emit1, emit2)

47 Compiling Conditional [Circuit diagram: blocks P and Q; wires start, finish, emit]
    compile (IfThenElse c (p, q)) start = (finish, emit)
      where (finish1, emit1) = compile p (and2 (start, c))
            (finish2, emit2) = compile q (and2 (start, inv c))
            emit   = or2 (emit1, emit2)
            finish = or2 (finish1, finish2)

48 Compiling Parallel Composition [Circuit diagram: synchroniser; wires start, finish, emit]
    compile (p :|: q) start = (finish, emit)
      where (finish1, emit1) = compile p start
            (finish2, emit2) = compile q start
            emit   = or2 (emit1, emit2)
            finish = synchroniser (finish1, finish2)

49 Compiling Loops [Circuit diagram: wires start, finish, emit]
    compile (While c p) start = (finish, emit)
      where start1 = or2 (start, finish1)
            (finish1, emit) = compile p (and2 (c, start1))
            finish = and2 (inv c, start1)
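The compile clauses from the preceding slides can be assembled into one file and simulated without Lava. The sketch below assumes that a signal is a lazy, infinite list of Booleans (one per clock tick) and re-implements `low`, `high`, `inv`, `and2`, `or2` and `delay` on that model. The `run` helper and the tick-0 start pulse are illustrative additions, and the synchroniser follows the Mini-Flash definition given earlier, caveat included.

```haskell
-- Assumption: a signal is a lazy, infinite list of Booleans, one per clock tick.
-- These primitives imitate Lava's; they are not Lava itself.
type Signal = [Bool]

low, high :: Signal
low  = repeat False
high = repeat True

inv :: Signal -> Signal
inv = map not

and2, or2 :: (Signal, Signal) -> Signal
and2 (a, b) = zipWith (&&) a b
or2  (a, b) = zipWith (||) a b

-- One-tick delay: first the initial value, then the input one tick late.
delay :: Signal -> Signal -> Signal
delay initial s = take 1 initial ++ s

data MiniFlash
  = Skip | Delay | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Signal (MiniFlash, MiniFlash)
  | While Signal MiniFlash
  | MiniFlash :|: MiniFlash

type MiniFlashCircuit = Signal -> (Signal, Signal)

-- The clauses from the slides, written compactly: (finish, emit) per construct.
compile :: MiniFlash -> MiniFlashCircuit
compile Skip  start = (start, low)
compile Delay start = (delay low start, low)
compile Emit  start = (start, start)
compile (p :>: q) start = (finish, or2 (emitP, emitQ))
  where
    (middle, emitP) = compile p start
    (finish, emitQ) = compile q middle
compile (IfThenElse c (p, q)) start = (or2 (finishP, finishQ), or2 (emitP, emitQ))
  where
    (finishP, emitP) = compile p (and2 (start, c))
    (finishQ, emitQ) = compile q (and2 (start, inv c))
compile (While c p) start = (and2 (inv c, start1), emit)
  where
    start1          = or2 (start, finish1)
    (finish1, emit) = compile p (and2 (c, start1))
compile (p :|: q) start = (synchroniser (finishP, finishQ), or2 (emitP, emitQ))
  where
    (finishP, emitP) = compile p start
    (finishQ, emitQ) = compile q start

-- Macros from the examples.
emit1 :: MiniFlash
emit1 = Emit :>: Delay

forever :: MiniFlash -> MiniFlash
forever = While high

wait :: Signal -> MiniFlash
wait s = While (inv s) Delay

oscillator :: MiniFlash
oscillator = forever (Delay :>: emit1)

-- Start pulse: high in the first tick only (an assumption of this sketch).
pulse :: Signal
pulse = True : low

-- The synchroniser written in Mini-Flash itself and started at tick 0.
synchroniser :: (Signal, Signal) -> Signal
synchroniser (f1, f2) = snd (compile (forever body) pulse)
  where
    anyDone  = or2 (f1, f2)
    bothDone = and2 (f1, f2)
    body = wait anyDone
             :>: (IfThenElse bothDone (Skip, Delay :>: wait anyDone) :>: emit1)

-- Hypothetical helper: the first n ticks of the emit wire.
run :: Int -> MiniFlash -> [Bool]
run n p = take n (snd (compile p pulse))
```

`run 6 oscillator` evaluates to `[False, True, False, True, False, True]`, and `run 4 ((Delay :|: (Delay :>: Delay)) :>: Emit)` evaluates to `[False, False, True, False]`, since the slower branch finishes at tick 2. A loop whose body takes no time, such as `While high Skip`, makes this simulation diverge, which is the list-model counterpart of a combinational loop.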

50 Sanity Check for Loop Body
    takesTime Skip                  = False
    takesTime Emit                  = False
    takesTime Delay                 = True
    takesTime (IfThenElse _ (p, q)) = takesTime p && takesTime q
    takesTime (p :>: q)             = takesTime p || takesTime q
    takesTime (p :|: q)             = takesTime p || takesTime q
    takesTime (While _ _)           = False
Warning! This may give false negatives
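One way to use this check is to walk the whole program and test every loop body before compiling. The sketch below repeats `takesTime` so that it stands alone; `loopBodiesTakeTime` is a hypothetical helper that does not appear on the slides, and `Signal` is modelled as a list of Booleans rather than Lava's type.

```haskell
type Signal = [Bool]  -- assumption: stand-in for Lava's Signal type

data MiniFlash
  = Skip | Delay | Emit
  | MiniFlash :>: MiniFlash
  | IfThenElse Signal (MiniFlash, MiniFlash)
  | While Signal MiniFlash
  | MiniFlash :|: MiniFlash

-- Conservative check: True only if the program certainly takes at least one tick.
takesTime :: MiniFlash -> Bool
takesTime Skip                  = False
takesTime Emit                  = False
takesTime Delay                 = True
takesTime (IfThenElse _ (p, q)) = takesTime p && takesTime q
takesTime (p :>: q)             = takesTime p || takesTime q
takesTime (p :|: q)             = takesTime p || takesTime q
takesTime (While _ _)           = False

-- Hypothetical helper (not on the slides): every while body, at any nesting
-- depth, must pass takesTime.
loopBodiesTakeTime :: MiniFlash -> Bool
loopBodiesTakeTime (While _ p)           = takesTime p && loopBodiesTakeTime p
loopBodiesTakeTime (p :>: q)             = loopBodiesTakeTime p && loopBodiesTakeTime q
loopBodiesTakeTime (p :|: q)             = loopBodiesTakeTime p && loopBodiesTakeTime q
loopBodiesTakeTime (IfThenElse _ (p, q)) = loopBodiesTakeTime p && loopBodiesTakeTime q
loopBodiesTakeTime _                     = True
```

The helper rejects `While c Skip` and accepts `While c (Emit :>: Delay)`. It also rejects `While c (IfThenElse c (Delay, Skip))`, whose body takes the Delay branch whenever the loop is entered; the check cannot see this because it ignores the conditions.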

51 Extending the Language

52 Assignments Extending the language to still have one output variable, but set using assignments rather than Emit signals. Changes the shape of the circuits produced: [Block diagram: wires start, finish, value, inputs, assign]

53 Assignments (continued) assign: high when the output is being assigned a value. value: contains the value which is being assigned.

54 Assignments Combining the output of two circuits now requires more logic: [Circuit diagram: inputs value1, assign1, value2, assign2; outputs value, assign]

55 Assignments (continued) What happens if two parallel blocks assign to the variable at the same time?
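The combining logic can be sketched as follows. The slides leave the simultaneous-assignment case open, so this sketch picks one policy purely as an assumption: the second block wins. Signals are modelled as lists (one element per tick), values as `Int`, and the name `combine` is hypothetical.

```haskell
-- Assumption: an output is a pair of per-tick streams (assign flag, value).
type Output = ([Bool], [Int])

-- Hypothetical combinator: assign is the OR of both flags; value is taken
-- from the second block when it assigns, otherwise from the first.
combine :: Output -> Output -> Output
combine (assign1, value1) (assign2, value2) = (assign, value)
  where
    assign = zipWith (||) assign1 assign2
    value  = zipWith3 pick assign2 value1 value2
    pick secondAssigns v1 v2 = if secondAssigns then v2 else v1
```

For instance, `combine ([True, False, True, False], [1, 0, 3, 0]) ([False, True, True, False], [0, 2, 9, 0])` evaluates to `([True, True, True, False], [1, 2, 9, 0])`. At tick 2 both blocks assign and the second block's 9 is chosen; other policies (priority to the first block, or flagging an error) are equally possible.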

56 Assignments At the top level, we can extract the actual value of the output: [Circuit diagram: wires start, finish, value, inputs, assign; a register (reg) produces the actual value]
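The top-level register can be sketched in the same list model: it loads `value` in ticks where `assign` is high and otherwise keeps what it held. The name `outputRegister`, the `Int` value type and the explicit initial value are assumptions made for this illustration.

```haskell
-- Hypothetical top-level register: one element per clock tick.
outputRegister :: Int -> ([Bool], [Int]) -> [Int]
outputRegister initial (assign, value) = actual
  where
    -- Load the new value when assign is high, else keep the previous one.
    actual   = zipWith3 step assign value previous
    previous = initial : actual  -- the register contents one tick earlier
    step assigning new old = if assigning then new else old
```

`outputRegister 0 ([False, True, False, False, True], [7, 5, 7, 7, 2])` evaluates to `[0, 5, 5, 5, 2]`: the values on the `value` wire are ignored except in the ticks where `assign` is high.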

57 Reading the Output Variable We simply add another input wire (which could be bundled with the inputs we already have) and connect it at the top level: [Circuit diagram: wires start, finish, value, inputs, assign; a register (reg) produces the actual value, which is fed back as an input]

58 Multiple Output Variables Can be done by replicating the output wires: [Block diagram: wires start, finish, inputs; outputs emit 1, emit 2, …, emit n]

59 Multiple Output Variables And replicating the logic used to combine the outputs. [Circuit diagram: inputs emitP 1 … emitP n and emitQ 1 … emitQ n; outputs emit 1 … emit n]

60 Multiple Output Variables (continued) Note that the number of outputs has to be statically determined at compile-time!

61 Other Possible Extensions Channel communication. Common blocks of code (reusing the same hardware is not always possible). Adding in-built functions (e.g. arithmetic). Dynamic variable allocation (well, sort of).

62 Conclusions

63 Hardware Compilation So algorithms can be compiled directly into circuits. The extra start and finish wires handle the logic of the program counter in software. After all, hardware is not so different from software.

64 Other Issues Arising Compact placement of gates is an NP-hard problem. How do we decide placement at compile time? Some compilation/synthesis techniques compile algorithms without explicit timing. The compilation presented here is very naïve. How can one optimise (number of gates, power consumption, combinational depth, etc)? A single algorithm can now be realised in a combination of software and hardware. How can one partition code for software-hardware codesign effectively?

65 Extra slides

66 Example: Waiting for an edge Detect a rising edge: rising in = forever { wait (inv in); wait in; emit1 } A falling edge: falling in = rising (inv in) Any edge: edge in = rising in || falling in

