Synchronization Ideas. Charles E. Dike, Intel Corporation.


1. Synchronization Ideas
Charles E. Dike, Intel Corporation

2. Introduction
Tutorial
Share some ideas about synchronization and metastability
Introduce NEW, IMPROVED theory on metastability
Charles Dike (cdike@ichips.intel.com)

3. Why and where synchronize?
Reduce latency between independent clock domains.
– Asynchronous domain to synchronous clock.
– Synchronous clock to an independent synchronous clock.
Benefit: higher performance in critical circuits.
[Figure: four example domains: an asynchronous circuit, a pausable clock at 1.8 GHz, a synchronous clock at 3.0 GHz, and a synchronous clock at 1.5 GHz]

4. Design Direction
[Figure: MEM, FPU, and ALU blocks repeated across design generations along a "value added" axis: 80s towards 100 MHz, 90s towards 1 GHz, 00s multi-GHz]

5. Chip Area Networks
Late 00s: multi-GHz

6. I believe...
We must be able to synchronize all domains to a PLL-controlled clock
Interconnect on chip will be asynchronous (GALS)
We need to minimize latency
There will be two basic synchronizer uses: near neighbor and the chip net

7. Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability

8. Generic Synchronizer
Handles self-timed to synchronous interfaces and vice versa
Supports synchronous to synchronous interfaces
Can handle streaming data
Adaptable to any speed range
Possibly used over the chip network

9. Two flop synch
[Figure: two D flip-flops in series (#1 and #2), clocked by CLK, with the signal VALID]

10. Single latch synch
[Figure: an SR latch with two pairs of D flip-flops, one pair on CLK1 and one on CLK2; labeled signals are REQ, ACK, Write Valid, Read Valid, latch output, sender clock, and receiver clock]

11. Multi latch synch
[Figure: two copies of the single latch stage, each with REQ, ACK, Write Valid, Read Valid, CLK1, and CLK2]

12. General Case
[Figure: a write pointer and a read pointer, each shown as a single 1 in a row of 0s, and a status register shown as 1 1 1 1 1 0 0 0 0 0; synchronizers with latency padding produce EMPTY, FULL, and SYNC; inputs are Write Clock, Write Enable (EN), and Read Clock]

13. Empty case
[Figure: detail of the EMPTY logic around the write pointer, read pointer, and status register, built from resettable D flip-flops (one with an enable) and a synchronizer; labeled signals are Write Pointer a, Write Pointer b, Read Pointer a, Read Pointer b, Read Clock, Write Clock, Write Enable, and EMPTY]

14. General Case
[Figure: same diagram as slide 12]

15. Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in the StrongArm processor
The Myrinet pipeline synchronization scheme
Latest understanding of metastability

16. Simple Synchronizer
Constrained by frequency ratio
Supports synchronous to synchronous interfaces
Does it support asynch to synch? Yes, with restrictions.
Possibly used in local neighbor synchronizers

17. Simple Synchronizer
[Figure: chain of four D flip-flops with signals A, A1, A2, A3 and labels w, x, y, z; a divide-by-2 block with FAST CLK, SLOW CLK, and SYNC. MI* = Metastable Immune.]

18. timing1
[Figure: the same circuit with a timing diagram over cycles 1 to 6 showing FAST CLOCK, SLOW CLOCK, A, A1, A2, A3, and SYNC]

19. timing2
[Figure: the same circuit with a timing diagram over cycles 1 to 6 showing FAST CLOCK, SYNC, SLOW CLOCK, and CHEATER CLOCK]

20. timing3
[Figure: the same circuit with a timing diagram over cycles 1 to 6 showing FAST CLOCK, SYNC, SLOW CLOCK, and CHEATER CLOCK]

21. timing4
[Figure: variants of the circuit with three and five D flip-flops, with a timing diagram over cycles 1 to 6 showing FAST CLOCK, SYNC, SLOW CLOCK, and SLOW CLOCK#]

22. transfers
[Figure: timing diagram over cycles 1 to 6 (FAST CLOCK, SYNC, SLOW CLOCK, CHEATER CLOCK) with two flip-flop pairs illustrating the fast-to-slow transfer and the slow-to-fast transfer]

23. Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability

24. Pipeline Synchronizer
Supports synchronous to synchronous interfaces
Supports asynch to synch and vice versa
Possibly used in local neighbor synchronizers
Essentially a distributed FIFO and synchronizer

25. Pipeline Synchronizer
[Figure: three cascaded stages, each marked S, with ports Ri, Ai, Di on the input side and Ro, Ao, Do on the output side]

26. ME element
[Figure: stage S built around an ME element, with ports R1, R0, A1, A0 and signals X and REQ]

27. Fifo element
[Figure: FIFO stage with ports Ri, Ai, Di, Ro, Ao, Do; a block labeled C connects Ri, Ai, Ro, Ao alongside a data path]

28. Async to sync
[Figure: three cascaded S stages crossing between an asynchronous side and a synchronous side]

29. Sync to async
[Figure: three cascaded stages with S elements crossing between a synchronous side and an asynchronous side]

30. Points to ponder #1
All synchronizing interfaces have one thing in common: a latching element that holds data while metastabilities are being resolved.
There is no way to avoid the latency which is required to resolve metastabilities.
To minimize latency the latching element characteristics can be improved.
We will be required to understand and use this knowledge. This is the future of digital design.

31. Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability

32. Role of the Synchronizing Flop
Reorients incoming information to a clock edge
Its performance determines system failure rate or latency

33. Real Life
There is no magic bullet
There is a lot of misinformation about metastability
To date many circuits have been overdesigned through planning and luck
Whenever a circuit fails because its frequency is too high, the ultimate cause of failure is metastability
There is no way to synchronize a signal faster than about the time it takes to pass a signal through six static gates

34. Metastability is....
[Figure: latch schematic with SET and RESET inputs, internal nodes NODE A and NODE B, and output OUT]

35. Technical terms
$T_w$ (window size): likelihood of entering a metastable state, in units of time
Tau ($\tau$): rate at which metastability resolves, in units of time
MTBF (Mean Time Between Failures): $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
Thermal noise: $V_n^2 = 4kT/C$
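As a rough illustration of the MTBF expression on this slide, the sketch below evaluates it in Python. The function name and the operating point are assumptions made for the example, not values taken together from the talk, and $f_d$ and $f_c$ are read in the usual way as the data transition rate and the clock frequency.

```python
import math

def mtbf_seconds(t, tau, t_w, f_d, f_c):
    """MTBF = exp(t / tau) / (T_w * f_d * f_c), with all times in seconds."""
    return math.exp(t / tau) / (t_w * f_d * f_c)

# Assumed, illustrative operating point (hypothetical values).
t_resolve = 1e-9   # time allowed for metastability to resolve
tau = 20e-12       # resolution time constant
t_w = 15e-12       # metastability window
f_d = 1e8          # data transition rate, Hz
f_c = 3e9          # clock frequency, Hz

base = mtbf_seconds(t_resolve, tau, t_w, f_d, f_c)
one_more_tau = mtbf_seconds(t_resolve + tau, tau, t_w, f_d, f_c)

print(f"MTBF at t = 1 ns: {base:.3e} s")
print(f"Gain from one extra tau: {one_more_tau / base:.3f}x")
```

The second line of output shows the property that drives the rest of the talk: every extra $\tau$ of resolution time multiplies the MTBF by $e$.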

36. Simple jamb latch
[Figure: jamb latch schematic with DATA, CLOCK, and RESET inputs, NODE A, NODE B, and OUT, beside a plot of propagation delay versus time of data after clock]

37. Simple jamb latch
[Figure: the same schematic and plot, with the annotation "~RC time constant"]

38. Rough Histogram
[Figure: propagation delay versus time of data after clock, on linear and log scales; the window $T_w$ is marked, and the slope of the log-scale plot is the $\tau$]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$

39. Why is the theory a problem?
It assumes a uniform distribution of data about the clock
– What happens when data always violates the setup/hold window?
It is not detailed enough
– Doesn’t consider a deterministic region
– Doesn’t account for thermal noise
People tend to extrapolate the theory improperly
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$

40. Overview of refined theory
Not everything past a normal propagation is a metastable event
The $T_w$ window can’t be improved by input edge rates
$T_w$ has a complex relationship to $\tau$ based on load
The MTBF formula needs to be modified due to non-uniform distribution of data about the clock input

41. Schematic

42. Simulation of a typical latching device

43. Test case
[Figure: test setup with a resettable D flip-flop, pulse generators #1 and #2, two delay elements, a PC, and a TEK 11801-B oscilloscope with a trigger input]

44. Measuring real data
[Figure: measured waveforms; annotation: advancing time]

45. Histogram
[Figure: histogram against time with the inflection point marked; annotation: 0.6 mV/0.1 ps]

46. Histogram
[Figure: histogram against time with the inflection point marked; annotation: 0.6 mV/0.1 ps]

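47. Measured versus Basic
[Figure: measured propagation delay versus time of data after clock (log scale), compared with the basic model; $T_w$ is marked, the slope is the $\tau$, and the measured curve is annotated 0.6 mV/0.1 ps]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$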

48. $\tau$ Simulated....
[Figure: simulation circuit with a battery and a voltage-controlled switch; R1 = 100 Ω and R1 = 100 MΩ]

49. Tau Simulated
$\tau = \dfrac{|t_1 - t_2|}{\ln(V_2 / V_1)}$
Where: $V_1$ = voltage at time $t_1$, $V_2$ = voltage at time $t_2$
[Figure: two plots over 1.0 to 1.4 ns. Top: latch outputs at nodes 1 and 2, 0.0 to 1.5 volts. Bottom: semilog difference between latch outputs, $10^{0}$ down to $10^{-6}$ volts, with $t_1$ and $t_2$ marked.]
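The two-point extraction of tau on this slide can be sketched as follows. This is a minimal example under stated assumptions: the function name is hypothetical, and the "simulated" latch output difference is a synthetic exponential generated with a known time constant rather than real simulation data.

```python
import math

def tau_from_two_points(t1, v1, t2, v2):
    """tau = |t1 - t2| / |ln(V2 / V1)| for an exponentially diverging voltage.

    v1 and v2 are the differences between the two latch outputs at t1 and t2.
    """
    return abs(t1 - t2) / abs(math.log(v2 / v1))

# Synthetic stand-in for the semilog plot: the difference between the latch
# outputs grows as exp(t / tau). TRUE_TAU and V0 are assumed values.
TRUE_TAU = 20e-12
V0 = 1e-30

def difference_voltage(t):
    return V0 * math.exp(t / TRUE_TAU)

t1, t2 = 1.0e-9, 1.2e-9
estimate = tau_from_two_points(t1, difference_voltage(t1),
                               t2, difference_voltage(t2))
print(f"estimated tau = {estimate * 1e12:.2f} ps")
```

Any two points on the straight portion of the semilog curve give the same answer; points taken near the rails, where the outputs saturate, would not.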

50. $V_n^2 = 4kT/C = 4kTBR$
$k = 1.38 \times 10^{-23}$ J/K
$B = 1/\tau \approx 5 \times 10^{10}$ Hz
$R \approx 400\ \Omega$
$T = 300$ K
$\tau = 20$ ps
$V_n \approx 0.6$ mV
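The arithmetic on this slide can be checked with a few lines of Python. The sketch assumes the left-hand side of the noise expression is the mean-square noise voltage, so that the quoted result is its square root; the function and variable names are illustrative only.

```python
import math

K_BOLTZMANN = 1.38e-23  # J/K, the value quoted on the slide

def thermal_noise_volts(temp_k, bandwidth_hz, resistance_ohm):
    """RMS thermal noise voltage: sqrt(4 k T B R)."""
    return math.sqrt(4 * K_BOLTZMANN * temp_k * bandwidth_hz * resistance_ohm)

tau = 20e-12          # s
resistance = 400.0    # ohms
temperature = 300.0   # kelvin
bandwidth = 1 / tau   # about 5e10 Hz

v_n = thermal_noise_volts(temperature, bandwidth, resistance)

# Equivalent form 4kT/C, taking C = tau / R (an assumption that tau is RC).
capacitance = tau / resistance
v_n_from_c = math.sqrt(4 * K_BOLTZMANN * temperature / capacitance)

print(f"V_n = {v_n * 1e3:.2f} mV")
```

With these inputs the result is roughly 0.58 mV, in line with the slide's figure of about 0.6 mV, and the two forms agree because $B R = 1/C$ when $\tau = RC$.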

51. Putting it all together (A)
[Figure: plot with a horizontal axis from -50 to 250 picoseconds and a vertical axis in decades from 0.18 fs to 1.80 ns; region marked: normal]

52. Putting it all together (B)
[Figure: same plot; regions marked: deterministic and "?"]

53. Putting it all together (C)
[Figure: same plot with a second vertical axis in decades from 1.80 µV to 1.80 V; the thermal noise point and the deterministic region are marked]

54. Putting it all together (D)
[Figure: same plot; annotation T = 19 ps; regions marked: deterministic and true metastability]

55. Putting it all together (E)
[Figure: same plot; annotations $T_w$ = 15 ps and T = 19 ps; regions marked: deterministic and true metastability]

56. Worst case: $\mathrm{MTBF} = \dfrac{e^{(t - \mathrm{deter})/\tau}}{T_w f_d f_c}$
Simple case: $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
Expected: $\mathrm{MTBF} = \dfrac{e^{(t - 0.5 \cdot \mathrm{deter})/\tau}}{T_w f_d f_c}$
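The three variants differ only in how much of the deterministic region is subtracted from the available time, which the following sketch makes explicit with a single weight parameter. The function name, the weight parameter, and all numeric values are assumptions for illustration; "deter" is taken to be the duration of the deterministic region.

```python
import math

def mtbf_with_deterministic_region(t, tau, t_w, f_d, f_c, deter, weight):
    """Return exp((t - weight * deter) / tau) / (t_w * f_d * f_c) in seconds.

    weight = 0.0 gives the simple case, 0.5 the expected case,
    and 1.0 the worst case.
    """
    return math.exp((t - weight * deter) / tau) / (t_w * f_d * f_c)

# Assumed, illustrative operating point (hypothetical values).
params = dict(t=1e-9, tau=20e-12, t_w=15e-12, f_d=1e8, f_c=3e9, deter=100e-12)

simple = mtbf_with_deterministic_region(weight=0.0, **params)
expected = mtbf_with_deterministic_region(weight=0.5, **params)
worst = mtbf_with_deterministic_region(weight=1.0, **params)

for label, value in (("simple", simple), ("expected", expected), ("worst", worst)):
    print(f"{label:>8}: {value:.3e} s")
```

Because the deterministic time sits in the exponent, ignoring it overstates the MTBF by a factor of $e^{\mathrm{deter}/\tau}$ relative to the worst case.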

57. Points to ponder #2
Jakov Seizovic postulated a “malicious” asynchronous signal: no matter how we position the sampling window, and no matter how small we make the sampling window, the asynchronous transition will appear in that window.
This case has to be assumed when interfacing to a signal of unknown probability distribution.
We know something about just how malicious a signal can be.

58. Exploring

59. Worst case bound

60. [Figure: annotations: < 0.1 ps, uniform distribution, 12 ps jitter, not worst case bound]

61. Final comments
With the proper synchronizing device it may be possible to synchronize a signal within a single clock cycle. The constraints are:
– You require about $35\tau$ in order to get the MTBF out to about 1 century.
– Each typical static gate delay is equivalent to about $5\tau$ in a properly designed synchronizing flop.
– The metastability MTBF of a device should probably be an order of magnitude better than the mechanical MTBF.
– You must assume a ‘malicious’ input to the synchronizer. Nevertheless, this only adds about $5\tau$ to the delay.
– Standard flop designs are generally very poor synchronizers. Use a jamb structure. It has the best transconductance.
– You should never require more than two synchronizing flops in series.
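The first two constraints can be sanity-checked by solving the simple MTBF formula for the number of time constants. The sketch below is only an estimate under assumed inputs: the window size, data rate, and clock frequency are hypothetical, and the function name is illustrative.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600

def taus_required(mtbf_target_s, t_w, f_d, f_c):
    """Solve MTBF = exp(n) / (T_w * f_d * f_c) for n = t / tau."""
    return math.log(mtbf_target_s * t_w * f_d * f_c)

# Assumed inputs: 15 ps window, 100 MHz data rate, 3 GHz clock.
n_taus = taus_required(100 * SECONDS_PER_YEAR, t_w=15e-12, f_d=1e8, f_c=3e9)

# The slide equates one static gate delay with about 5 tau.
TAUS_PER_GATE_DELAY = 5
gate_delays = n_taus / TAUS_PER_GATE_DELAY

print(f"taus needed for a 1-century MTBF: {n_taus:.1f}")
print(f"equivalent static gate delays:    {gate_delays:.1f}")
```

For these inputs the count comes out in the mid-to-high 30s, the same range as the slide's figure of about 35, which corresponds to roughly seven gate delays. Because the dependence is logarithmic, changing the clock or data rate by an order of magnitude shifts the requirement by only about 2.3 time constants.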

62. Conclusion
There are several ways to communicate between independent domains
I believe more asynchronous domains will appear that are embedded within synchronous designs
– Latency must be reduced to maximize the use of asynchronous designs.
– This is a burden that asynch designers must bear
– We need to know the limitations of synchronization and metastability
Chip area networks are coming and they will open up opportunities for asynchronous design

63. References
T. Sakurai, “Optimization of CMOS Arbiter and Synchronizer Circuits with Submicrometer MOSFET’s,” IEEE J. Solid State Circuits, vol. 23, no. 4, pp. 901-906, Aug. 1988.
L. Kleeman and A. Cantoni, “Metastable Behavior in Digital Systems,” IEEE Design & Test of Computers, pp. 4-19, Dec. 1987.
I. E. Sutherland, “Micropipelines,” Turing Award Lecture, Communications of the ACM, vol. 32, no. 6, pp. 720-738, 1989.
J. N. Seizovic, “Pipeline Synchronization,” Proc. Int’l Symp. Advanced Research in Asynchronous Circuits and Systems, CS Press, 1994.
C. Dike and E. Burton, “Miller and Noise Effects in a Synchronizing Flip-Flop,” IEEE J. Solid State Circuits, vol. 34, no. 6, pp. 849-855, June 1999.
A. Van der Ziel, Noise in Measurements. New York: Wiley, 1976.

64. Overview of present theory
Everything past a normal propagation is considered a metastable event
A deterministic region doesn’t exist
$T_w$ has no fixed relationship to $\tau$
The MTBF formula assumes a uniform distribution of data about the clock input
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$

