

Introduction to Silicon Programming in the Tangram/Haste language
Material adapted from lectures by: Prof.dr.ir. Kees van Berkel [Dr. Johan Lukkien] [Dr.ir. Ad Peeters] at the Technical University of Eindhoven, the Netherlands

Philips Research, Kees van Berkel, Ad Peeters, TU/e
Handshake signaling and data: push channel versus pull channel
[Figure: two channels, each connecting an active side to a passive side through a request wire a_r, an acknowledge wire a_k and data wires a_d; one is drawn as a push channel, the other as a pull channel.]

Handshake signaling: push channel
[Timing diagram: req a_r and ack a_k against time, with the early, broad and late validity schemes for data a_d.]

Data bundling
In order to maintain event ordering at both sides of a channel, the circuit must satisfy the data bundling constraint:
– for a push channel: the delay along the request wire must exceed the delay of the data wires;
– for a pull channel: the delay along the acknowledge wire must exceed the delay of the data wires.
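
The bundling constraint can be phrased as a simple delay comparison. The sketch below is only an illustration: the function name `bundling_ok` and its parameters are hypothetical, and the delays are plain numbers in one common unit rather than values taken from any real circuit.

```python
def bundling_ok(channel_kind, request_delay, acknowledge_delay, data_delay):
    """Return True when the data bundling constraint holds.

    channel_kind is "push" or "pull" (hypothetical labels for this sketch).
    All delays must be given in the same unit.
    """
    if channel_kind == "push":
        # Data is bundled with the request, so the request must be slower.
        return request_delay > data_delay
    if channel_kind == "pull":
        # Data is bundled with the acknowledge, so the acknowledge must be slower.
        return acknowledge_delay > data_delay
    raise ValueError("channel_kind must be 'push' or 'pull'")


print(bundling_ok("push", request_delay=3, acknowledge_delay=1, data_delay=2))
```

For a push channel only the request delay matters, for a pull channel only the acknowledge delay, which is why the unused argument has no influence on the result.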

Handshake signaling: pull channel
[Timing diagram: req a_r and ack a_k against time, with the early, broad and late validity schemes for data a_d.]
When data wires are invalid: multiple and incomplete transitions allowed.

Tangram assignment x := f(y,z)
[Figure: handshake circuit for the assignment, drawn in several steps; labels include the variables x, y and z, the operator f, and the ports yw, zw, xw0, xw1 and xr.]

Four-phase data transfer
[Timing diagram: a four-phase data transfer between channels b and c; the traces against time include b_r, b_d / c_d, b_a / c_r and c_a.]

Handshake latch
 [ [ w  ; [w  : r d := w d ] [] r  ; r  ] ]
1-bit handshake latch:
$w_d \land w_r \rightarrow r_d{\uparrow}$
$\lnot w_d \land w_r \rightarrow r_d{\downarrow}$
$w_k = w_r$
$r_k = r_r$
[Figure: latch x with write port w (w_d, w_r) and read port r (r_d).]

N-bit handshake latch
[Figure: N-bit latch with write wires w_r, w_k, w_d1 ... w_dN and read wires r_r, r_k, r_d1 ... r_dN.]
Area, delay, energy:
– area: 2(N+1) gate equivalents
– delay per cycle: 4 gate delays
– energy per write cycle: *2N transitions, on average

Transferrer
 [ [ a  : (b   ; c   )] ; [ a  : (b   ; c d := b d ; c   ; c d :=  )] ]
[Figure: transferrer with activation port a (a_r, a_k), channel b (b_r, b_k, b_d) and channel c (c_r, c_k, c_d).]

Multiplexer
 [ [ a  : c   ; a  : (c d := a d ; c   ; c d :=  ) [] b  : c   ; b  : (c d := b d ; c   ; c d :=  ) ] ]
Restriction: $\lnot a_r \lor \lnot b_r$ must hold at all times!
[Figure: multiplexer (|) with inputs a and b and output c.]

Multiplexer realization
[Figure: data circuit and control circuit.]

Logic/arithmetic operator
 [ [ a  : (b   || c   ) ] ; [ a  : ((b   || c   ) ; a d := f(b d, c d ))] ]
Cheaper realization (delay sensitive):
 [ [ a  : (b   || c   ) ] ; [ a  : ((b   || c   ) ; a d := f(b d, c d ))] ; “delay” ; a d :=  ]
[Figure: operator f with inputs b and c and output a.]

A one-place fifo buffer
byte = type [0..255]
& BUF1 = main proc (a?chan byte & b!chan byte).
  begin x: var byte
  | forever do a?x ; b!x od
  end
[Figure: BUF1 with input channel a and output channel b.]

A one-place fifo buffer (handshake circuit)
[Figure: handshake circuit for BUF1, built up in steps; labels include the sequencer (;), variable x and channels a and b.]

2-place buffer
byte = type [0..255]
& BUF1 = proc (a?chan byte & b!chan byte).
  begin x: var byte
  | forever do a?x ; b!x od
  end
& BUF2: main proc (a?chan byte & c!chan byte).
  begin b: chan byte
  | BUF1(a,b) || BUF1(b,c)
  end
[Figure: two BUF1 instances in series, connected by channels a, b and c.]
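
To see why two BUF1 instances in series give a capacity of two, the following sketch counts how many values a chain of one-place buffers absorbs when nobody ever reads the final output. It is a behavioural model only: the function name `accepted_without_output` and the cell-list representation are assumptions of this sketch, not part of Tangram.

```python
def accepted_without_output(n_stages, offered):
    """Count the offered values a chain of one-place buffers can absorb
    when the environment never reads the last output channel."""
    cells = [None] * n_stages              # cells[i] models variable x of stage i
    accepted = 0
    for value in offered:
        # Stored values ripple one place toward the output where there is room.
        for i in range(n_stages - 1, 0, -1):
            if cells[i] is None and cells[i - 1] is not None:
                cells[i], cells[i - 1] = cells[i - 1], None
        if cells[0] is not None:
            break                          # first stage still full: a?x blocks
        cells[0] = value                   # a?x of the first stage
        accepted += 1
    return accepted


print(accepted_without_output(1, [10, 20, 30]))   # one BUF1
print(accepted_without_output(2, [10, 20, 30]))   # BUF1(a,b) || BUF1(b,c)
```

A single stage blocks after one value, while the two-stage chain blocks after two, which matches the name of the slide.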

Two-place ripple buffer

Two-place wagging buffer
byte = type [0..255]
& wag2: main proc (a?chan byte & b!chan byte).
  begin x,y: var byte
  | a?x ; forever do (a?y || b!x) ; (a?x || b!y) od
  end
[Figure: handshake circuit of the wagging buffer between channels a and b.]
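
The alternation between x and y can be traced with a small sequential model. This is a sketch under one assumption: each parallel pair such as (a?y || b!x) is executed here as "output first, then input", which is one legal interleaving because the two commands touch different variables. The function name `wag2` mirrors the program; the tagged-tuple output is an invention of this sketch.

```python
def wag2(inputs):
    """Trace of: a?x ; forever do (a?y || b!x) ; (a?x || b!y) od."""
    source = iter(inputs)
    outputs = []
    try:
        x = next(source)                  # a?x
        while True:
            outputs.append(("x", x))      # b!x, in parallel with ...
            y = next(source)              # ... a?y
            outputs.append(("y", y))      # b!y, in parallel with ...
            x = next(source)              # ... a?x
    except StopIteration:
        pass                              # input exhausted: end of the trace
    return outputs


print(wag2([10, 20, 30, 40]))
```

The values leave in arrival order, and the tags show that the two variables take turns, so neither value has to be copied from one variable to the other.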

Two-place ripple register
…
  begin x0, x1: var byte
  | forever do b!x1 ; x1:=x0 ; a?x0 od
  end

4-place ripple register
byte = type [0..255]
& rip4: main proc (a?chan byte & b!chan byte).
  begin x0, x1, x2, x3: var byte
  | forever do b!x3 ; x3:=x2 ; x2:=x1 ; x1:=x0 ; a?x0 od
  end

4-place ripple register
– area: $N\,(A_{\mathrm{var}} + A_{\mathrm{seq}})$
– cycle time: $T_c = (N+1)\,T_{:=}$
– cycle energy: $E_c = N\,E_{:=}$
[Figure: variables x0, x1, x2, x3 and the successive transfers between them during one cycle.]
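
The ripple register and its (N+1) factor can be checked with a short model. The sketch assumes that all variables start at a common initial value (the slides do not say what the registers hold initially) and that every communication or assignment counts as one sequential action; `ripple_register` is a hypothetical helper name.

```python
def ripple_register(n, inputs, initial=0):
    """Model of: forever do b!x[n-1] ; x[n-1]:=x[n-2] ; ... ; x1:=x0 ; a?x0 od.

    Returns the output sequence and the number of sequential actions executed.
    """
    x = [initial] * n                     # assumed initial register contents
    outputs = []
    actions = 0
    for value in inputs:
        outputs.append(x[n - 1])          # b!x[n-1]
        actions += 1
        for i in range(n - 1, 0, -1):     # x[i] := x[i-1], highest index first
            x[i] = x[i - 1]
            actions += 1
        x[0] = value                      # a?x0
        actions += 1
    return outputs, actions


outs, actions = ripple_register(4, [1, 2, 3, 4, 5, 6])
print(outs, actions // 6)
```

Each cycle performs one output, N-1 assignments and one input, so N+1 actions in sequence, and each value appears at b exactly N cycles after it entered at a.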

Introducing vacancies
…
  begin x0, x1, x2, x3, v: var byte
  | forever do
      (b!x3 ; x3:=x2 ; x2:=v) || (v:=x1 ; x1:=x0 ; a?x0)
    od
  end
What is wrong?

Introducing vacancies
  forever do
    ((b!x3 ; x3:=x2) || (v:=x1 ; x1:=x0 ; a?x0)) ; x2:=v
  od
or:
  forever do
    ((b!x3 ; x3:=x2) || (v:=x1 ; x1:=x0)) ; (x2:=v || a?x0)
  od
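
One plausible reading of "what is wrong?" is that the first version lets one parallel branch read v while the other writes it. The sketch below checks parallel branches for such shared-variable conflicts. The (writes, reads) encoding of each command and the name `conflicts` are assumptions of this sketch, and channels are deliberately left out of the analysis.

```python
def conflicts(branch_a, branch_b):
    """Variables written by one parallel branch and read or written by the other.

    Each branch is a list of (writes, reads) pairs, one pair per command.
    """
    def accessed(branch):
        writes, reads = set(), set()
        for w, r in branch:
            writes |= set(w)
            reads |= set(r)
        return writes, reads

    wa, ra = accessed(branch_a)
    wb, rb = accessed(branch_b)
    return (wa & (wb | rb)) | (wb & (wa | ra))


send_x3 = ([], ["x3"])          # b!x3
x3_x2 = (["x3"], ["x2"])        # x3:=x2
x2_v = (["x2"], ["v"])          # x2:=v
v_x1 = (["v"], ["x1"])          # v:=x1
x1_x0 = (["x1"], ["x0"])        # x1:=x0
recv_x0 = (["x0"], [])          # a?x0

print(conflicts([send_x3, x3_x2, x2_v], [v_x1, x1_x0, recv_x0]))   # first version
print(conflicts([send_x3, x3_x2], [v_x1, x1_x0, recv_x0]))         # repaired version
```

In both repaired programs x2:=v runs only after the parallel part has finished, so no variable is shared between concurrently running branches.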

“Synchronous” 4-place ripple register
  forever do
    (s0:=m0 || s1:=m1 || s2:=m2 || b!m3) ;
    (a?m0 || m1:=s0 || m2:=s1 || m3:=s2)
  od
[Figure: chain of variables m0, s0, m1, s1, m2, s2, m3 between a and b, shown at successive steps of the cycle.]

4-place wagging register
  forever do
    b!x1 ; x1:=x0 ; a?x0 ;
    b!y1 ; y1:=y0 ; a?y0
  od
[Figure: two two-place registers (x0, x1 and y0, y1) sharing channels a and b, shown at successive steps.]

8-place register, 4-way wagging
  forever do
    b!u1 ; u1:=u0 ; a?u0 ;
    b!v1 ; v1:=v0 ; a?v0 ;
    b!x1 ; x1:=x0 ; a?x0 ;
    b!y1 ; y1:=y0 ; a?y0
  od
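
Both wagging registers follow one pattern: several two-place ripple registers served in turn. The sketch below models that pattern for any number of ways. It assumes a common initial value for all variables, which the slides do not specify, and `wagging_register` is a hypothetical name.

```python
def wagging_register(ways, inputs, initial=0):
    """Model of a wagging register built from `ways` two-place ripple registers.

    Per turn the active way executes: b!v1 ; v1:=v0 ; a?v0.
    """
    regs = [[initial, initial] for _ in range(ways)]   # [v0, v1] for each way
    outputs = []
    turn = 0
    for value in inputs:
        v = regs[turn]
        outputs.append(v[1])     # b!v1
        v[1] = v[0]              # v1 := v0
        v[0] = value             # a?v0
        turn = (turn + 1) % ways # next way takes the following turn
    return outputs


print(wagging_register(2, [1, 2, 3, 4, 5, 6, 7]))
print(wagging_register(4, list(range(1, 11))))
```

With 2 ways a value reappears 4 turns after it entered, with 4 ways 8 turns later, while each value is copied only once on its way through. That is the point of wagging compared with rippling through every place.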

Four 8 × 8 shift registers compared

Tangram/Haste
– Purpose: programming language for asynchronous VLSI circuits.
– Creator: Philips Research Labs (proto-Tangram 1986; release 2 in 1998).
– Inspiration: Hoare’s CSP, Dijkstra’s GCL.
– Lectures: no formal introduction; manual hand-out (learn by example, learn by doing).
– Main tools: compiler, analyzer, simulator, viewer.

2-place buffer
byte = type [0..255]
& BUF1 = proc (a?chan byte & b!chan byte).
  begin x: var byte
  | forever do a?x ; b!x od
  end
& BUF2: main proc (a?chan byte & c!chan byte).
  begin b: chan byte
  | BUF1(a,b) || BUF1(b,c)
  end
[Figure: two BUF1 instances in series, connected by channels a, b and c.]

Median filter
median: main proc (a? chan W & b! chan W).
  begin x,y,z: var W & xy, yz, zx: var bool
  | forever do
      ((z:=y ; y:=x) || yz:=xy)
    ; a?x
    ; (xy:= x<=y || zx:= z<=x)
    ; if zx=xy then b!x
      or xy=yz then b!y
      or yz=zx then b!z
      fi
    od
  end
[Figure: Median with input channel a and output channel b.]
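
The three-comparison trick in the median filter is easy to check in software. The sketch below follows the program step by step. It assumes that x, y and z start at a common initial value with xy true (the slide leaves initial values open) and that the first true guard of the if statement is taken; `median_filter` is a hypothetical name.

```python
def median_filter(samples, initial=0):
    """Sliding median of three, following the Tangram median program."""
    x = y = z = initial
    xy = True                    # x <= y holds for equal initial values
    out = []
    for sample in samples:
        z, y = y, x              # (z:=y ; y:=x)
        yz = xy                  # yz:=xy: the old x<=y is the new y<=z
        x = sample               # a?x
        xy = x <= y              # xy:= x<=y
        zx = z <= x              # zx:= z<=x
        if zx == xy:
            out.append(x)        # b!x
        elif xy == yz:
            out.append(y)        # b!y
        else:                    # here yz == zx must hold
            out.append(z)        # b!z
    return out


print(median_filter([5, 1, 4, 2, 8]))
```

Only two comparisons are computed per sample, because the third one (yz) is the previous cycle's xy carried along with the shift. Of three booleans at least two are equal, so some guard always holds.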

Greatest Common Divisor
gcd: main proc (ab?chan <<byte,byte>> & c!chan byte).
  begin x,y: var byte
  | forever do
      ab?<<x,y>>
    ; do x<y then y:= y-x
      or x>y then x:= x-y
      od
    ; c!x
    od
  end
[Figure: GCD with input channel ab and output channel c.]
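
The GCD program computes the greatest common divisor by repeated subtraction, using the two auto assignments y := y-x and x := x-y. A minimal Python sketch of that loop follows. It assumes both inputs are positive, since a zero input would keep the loop running forever; the name `gcd_subtractive` is hypothetical.

```python
def gcd_subtractive(x, y):
    """Greatest common divisor by repeated subtraction.

    Mirrors a guarded loop that runs while x < y or x > y holds.
    """
    if x <= 0 or y <= 0:
        raise ValueError("both inputs must be positive")   # assumption of this sketch
    while x != y:            # some guard (x<y or x>y) is still true
        if x < y:
            y = y - x        # y := y-x
        else:
            x = x - y        # x := x-y
    return x                 # c!x


print(gcd_subtractive(12, 18))
```

The loop ends exactly when neither guard holds, that is when x equals y, and that common value is the result sent on c.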

Nacking Arbiter
nack : main proc (a?chan bool & b!chan bool).
  begin na,nb: var bool
  | <<na,nb>> := <<…>>
  ; forever do
      sel probe(a) then a!nb || na:= na#nb
      or  probe(b) then b!na || nb:= nb#na
      les
    od
  end
[Figure: nacking arbiter with channels a and b.]

C : Tangram → handshake circuit
[Figure: C(T) drawn as a handshake circuit T with channels a and b; C(R;S) drawn as a sequencer (;) connected to the circuits for R and S, with channels a and c.]

C : Tangram → handshake circuit
[Figure: C(R;S), continued: the sequencer (;) with the circuits for R and S and channels a and c, and a further version with a mixer (|) on channel b.]

C : Tangram → handshake circuit
[Figure: C(R || S), drawn as a parallel component (||) connected to the circuits for R and S; further labels: o, |, r, x, i.]

Tangram compilation
[Diagram: C maps a Tangram program T to a handshake circuit and E maps the handshake circuit to a VLSI circuit; H and || both lead to a handshake process.]
$H \cdot T = \| \cdot C \cdot T$

VLSI programming of asynchronous circuits
[Diagram: Tangram program → compiler → handshake circuit → expander → asynchronous circuit (netlist of gates); the simulator gives feedback on behavior, area, time, energy and test coverage.]

Tangram tool box
Let Rlin4.tg be a Tangram program:
htcomp -B Rlin4
  – compiles Rlin4.tg into Rlin4.hcl, a handshake circuit
htmap Rlin4
  – produces Rlin4*.v files, a CMOS standard-cell circuit
htsim Rlin4 a b
  – executes Rlin4.hcl with files a, b for input/output
htview Rlin4
  – provides interactive viewing of simulation results

Tangram program “Conway”
B1 = type [0..1]
& B2 = type <<B1,B1>>
& B3 = type <<B1,B1,B1>>
& P = …
& Q = …
& R = …
& conway: main proc (a?chan B2 & d!chan B3).
  begin b,c: chan B1
  | P(a,b) || Q(b,c) || R(c,d)
  end
[Figure: P, Q and R in series, connected by channels a, b, c and d.]

Tangram program “Conway”
& P = proc(a?chan B2 & b!chan B1).
  begin x: var B2
  | forever do a?x ; b!x.0 ; b!x.1 od
  end
& Q = proc(b?chan B1 & c!chan B1).
  begin y: var B1
  | forever do b?y ; c!y od
  end
& R = proc(c?chan B1 & d!chan B3).
  begin x,y,z: var B1
  | forever do c?x ; c?y ; c?z ; d!<<x,y,z>> od
  end
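
The Conway pipeline repacks a stream of pairs into a stream of triples: P unpacks each pair, Q passes single values on, and R collects three values at a time. The generator sketch below models only this data flow, not the handshaking. The function names follow the Tangram procedures; dropping an incomplete final triple is a choice made in this sketch for finite test inputs.

```python
def P(pairs):
    """forever do a?x ; b!x.0 ; b!x.1 od"""
    for x in pairs:
        yield x[0]
        yield x[1]


def Q(values):
    """forever do b?y ; c!y od"""
    for y in values:
        yield y


def R(values):
    """forever do c?x ; c?y ; c?z ; send the triple on d od"""
    values = iter(values)
    while True:
        try:
            x = next(values)
            y = next(values)
            z = next(values)
        except StopIteration:
            return               # finite input: drop an incomplete triple
        yield (x, y, z)


def conway(pairs):
    """P(a,b) || Q(b,c) || R(c,d), modelled as a chain of generators."""
    return R(Q(P(pairs)))


print(list(conway([(1, 0), (1, 1), (0, 1)])))
```

Three input pairs carry six values, which come out as two triples in the same order.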

VLSI programming for …
Low costs:
  – introduce resource sharing.
Low delay (high throughput):
  – introduce parallelism.
Low energy (low power):
  – reduce activity; …

VLSI programming for low costs
Keep it simple!!
Introduce resource sharing: commands, auxiliary variables, expressions, operators.
Enable resource sharing, by:
  – reducing parallelism
  – making similar commands equal

Command sharing
  S ; … ; S
becomes
  P : proc(). S
  P() ; … ; P()
[Figure: two separate instances of S, versus a single instance of S reached through a mixer (|).]

Command sharing: example
  a?x ; … ; a?x
becomes
  ax : proc(). a?x
  ax() ; … ; ax()
[Figure: handshake circuits before and after sharing a?x; labels include mixers (|), channel a and the write port xw of x.]

Procedure definition vs declaration
Procedure definition: P = proc (). S
  – provides a textual shorthand (expansion)
  – each call generates a copy of the resource, i.e. no sharing
Procedure declaration: P : proc (). S
  – defines a sharable resource
  – each call generates access to this resource

Command sharing
– Applies only to sequentially used commands.
– Saves resources almost always (i.e. when the command is more costly than a mixer).
– Impact on delay and energy is often favorable.
– Introduced by means of procedure declaration.
– Makes the Tangram program less readable.
– Therefore, apply after the program is correct & sound.
– Should really be applied by the compiler.

Sharing of auxiliary variables
x:=E is an auto assignment when E depends on x. This is compiled as aux:=E ; x:=aux, where aux is a “fresh” auxiliary variable.
With multiple auto assignments to x, as in:
  x:=E ; ... ; x:=F
auxiliary variables can be shared, as in:
  aux:=E ; aux2x() ; ... ; aux:=F ; aux2x()
with
  aux2x : proc(). x:=aux

Expression sharing
  x:=E ; … ; a!E
becomes
  f : func(). E
  x:=f() ; … ; a!f()
[Figure: two instances of E with outputs e0 and e1, versus a single shared E serving both e0 and e1.]

Expression sharing
– Applies only to sequentially used expressions.
– Often saves resources (i.e. when the expression is more costly than the demultiplexer).
– Introduced by means of function declarations.
– Makes the Tangram program less readable.
– Therefore, apply after the program is correct & sound.
– Should really be applied by the compiler.

Operator sharing
Consider
  x0 := y0+z0 ; … ; x1 := y1+z1.
Operator + can be shared by introducing
  add : func(a,b? var T): T. a+b
and applying it as in
  x0 := add(y0,z0) ; … ; x1 := add(y1,z1).

Operator sharing: the costs
Operator sharing may introduce multiplexers on (all) inputs of the operator and a demultiplexer on its output.
This form of sharing only reduces costs when:
  – the operator is expensive,
  – some input(s) and/or the output are common.
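
The trade-off can be made concrete with a toy area model. Everything in the sketch below is hypothetical: the function name `sharing_gain`, the linear cost model, and the example numbers are chosen only to illustrate the two conditions, and they are not measurements of any Tangram circuit.

```python
def sharing_gain(operator_cost, mux_cost, demux_cost,
                 distinct_inputs, distinct_outputs):
    """Area saved by sharing one operator between two uses (negative means a loss).

    distinct_inputs: number of operator inputs that differ between the two
    uses; each one needs a multiplexer.
    distinct_outputs: True when the two results go to different destinations,
    which needs a demultiplexer.
    """
    unshared = 2 * operator_cost
    shared = operator_cost + distinct_inputs * mux_cost
    if distinct_outputs:
        shared += demux_cost
    return unshared - shared


print(sharing_gain(100, 20, 20, distinct_inputs=2, distinct_outputs=True))
print(sharing_gain(30, 20, 20, distinct_inputs=2, distinct_outputs=True))
print(sharing_gain(30, 20, 20, distinct_inputs=1, distinct_outputs=False))
```

With these made-up numbers the expensive operator is worth sharing even with all inputs and the output multiplexed, while the cheap operator pays off only when one input and the output are common.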

Operator sharing: example
Consider
  x := y+z0 ; … ; x := y+z1.
Operator + can be shared by introducing
  add2y : proc(b? var T). x:=y+b
and applying it as in
  add2y(z0) ; … ; add2y(z1).

Greatest Common Divisor
gcd: main proc (ab?chan <<byte,byte>> & c!chan byte).
  begin x,y: var byte
  | forever do
      ab?<<x,y>>
    ; do x<y then y:= y-x
      or x>y then x:= x-y
      od
    ; c!x
    od
  end
[Figure: GCD with input channel ab and output channel c.]

Assignment: make GCD smaller
Both assignments (y:= y-x and x:= x-y) are auto assignments and hence require an auxiliary variable.
The program requires 4 arithmetic resources (twice < and –).
Reduce the costs of GCD by saving on auxiliary variables and arithmetic resources. (Beware the costs of multiplexing!)
Use of ff variables is not allowed for this exercise.