PradeepKumar S K Asst. Professor Dept. of ECE, KIT, TIPTUR. PradeepKumar S K, Asst.


1 PradeepKumar S K, Asst. Professor, Dept. of ECE, KIT, Tiptur.

2 ASIC Typical Design Steps
[Figure: schedule chart, time in weeks, with the time to mask order marked. Steps shown: Top Level Design, Unit Block Design, Unit Block Verification, Integration and Synthesis, Trial Netlists, System Level Verification, Timing Convergence & Verification, DVT Prep, Fabrication, DVT.]
Typical ASIC design can take up to two years to complete.

3 SoC Typical Design Steps
[Figure: schedule chart, time in weeks, with the time to mask order marked. Steps shown: Top Level Design, Unit Block Design, Unit Block Verification, Integration and Synthesis, Trial Netlists, System Level Verification, Timing Convergence & Verification, DVT Prep, Fabrication, DVT. Chart values legible in the transcript: 4, 2.]
With the increasing complexity of ICs and decreasing geometry, the IC vendor steps of placement, layout and fabrication are unlikely to be greatly reduced. In fact, there is a greater risk that the timing convergence steps will involve more iteration.
- Need to reduce the time before the vendor steps.
- Need to consider layout issues up front.

4 SoC Typical Design Steps
[Figure: schedule chart, time in weeks, with the time to mask order marked. Steps shown: Top Level Design, Unit Block Design, Unit Block Verification, Integration and Synthesis, Trial Netlists, System Level Verification, Timing Convergence & Verification, DVT Prep, Fabrication, DVT. Chart values legible in the transcript: 24, 4, 2.]
- SoC architecture already defined. Flexible to scale in frequency and complexity. Allows new IP cores and new technology to be integrated.
- Separate the design of the reusable IP from the design of the SoC. Build the SoC from a library of tested IP.
- Unit design consists only of any additional core features, or of wrapping new IP to enable integration.
- Reusable IP is purchased from external sources, developed from in-house designs, or designed as a separate project off the critical SoC development path.

5 Design Methodology: A Front-End ASIC Design Flow

6 Design Methodology: A Back-End Design Flow, or Generic Physical Flow

7 ASIC Methodology

8 SOC Methodology

9 SOC Methodology Evolving...

10 How to Design an SOC

11 How to Design an SOC

12 How to Design an SOC

13 How to Design an SOC

14 How to Design an SOC

15 System on Chip - Testing
SOCs are complex designs combining logic, memory and mixed-signal circuits in a single IC.
Main SOC testing challenges:
- Core-level test: embedded cores are tested as part of the system.
- Test access: because there is no physical access to the core peripheries, an electronic access mechanism is required.
- SOC-level test: the SOC test is a single composite test that includes the individual core tests, the UDL (user-defined logic) test, and test scheduling.
Test data volume for core-based SOC designs is very high. New techniques are required to reduce testing time, test cost, and the memory requirements of the automatic test equipment (ATE).

16 Verification
Today about 70% of design cost and effort is spent on verification. At companies developing ICs, verification teams are often almost twice as large as the RTL design teams. Traditionally, chip design verification has focused on simulation. However, new verification techniques are emerging.

17 Design for Integration
A key issue in SOC design is the integration of silicon IPs (cores). Integration of IPs directly affects the complexity of SOC designs and also influences verification of the SOC. Verification is faster and easier if the SOC interconnect is simple and unified (use an on-chip communication system or an intelligent on-chip bus).
There is no standard for on-chip buses (OCBs); they are chosen almost exclusively according to the specific application for which they will be used and the designer's preference.
Two main types of OCBs and their characteristics:
OCB        | Speed / Bandwidth | Arbitration | Example
System     | High              | Complex     | ARM AHB
Peripheral | Low               | Simple      | PCI Bus

18 A Typical Gateway SoC Architecture
An example diagram of a typical gateway VoIP (Voice over Internet Protocol) system-on-a-chip. A gateway VoIP SoC is a device used for functions such as vocoders, echo cancellation, data/fax modems, and VoIP protocols.

19 A Traditional SOC Architecture (bus-based)
In a typical SOC, there are complex data flows and multiple cores such as CPUs, DSPs, DMA, and peripherals. Resource sharing therefore becomes an issue, and communication between IPs becomes very complicated.

20 Sonics’ Silicon Backplane Used in SOC Design Architecture
The CPU, the DMA, and the DSP engine all share the same bus (the CPU or system bus). There are also dedicated data links, many control wires between blocks, and peripheral buses between subsystems. As a result, there is interdependency between blocks and a lot of wires in the chip, so verification, test, and physical design all become difficult.
A solution to this system integration problem is to use an intelligent on-chip interconnect that unifies all the traffic into a single entity. An example is Sonics’ SMART Interconnect SiliconBackplane MicroNetwork.
When compared to a traditional CPU bus, an on-chip interconnect such as Sonics’ SiliconBackplane has the following advantages:
- Higher efficiency
- Flexible configuration
- Guaranteed bandwidth and latency
- Integrated arbitration

21 Sonics’ SiliconBackplane MicroNetwork Used in SOC Design Architecture
A MicroNetwork is a heterogeneous, integrated network that unifies, decouples, and manages all of the communication between processors, memories, and input/output devices.

22 The basic WiseNET SoC architecture
The architecture includes:
- the ultralow-power dual-band radio transceiver (Tx and Rx),
- a sensor interface with a signal conditioner and two analog-to-digital converters (ANA_FE),
- a digital control unit based on a Cool-RISC microcontroller (μC) with on-chip low-leakage memory,
- several time bases and digital interfaces,
- a power management block (POW).

23 Networks on a chip

24 Chip Design Flows
- Custom Design Flow
- RTL to GDSII Design Flow (aka ASIC Design Flow)
- IP-based Design Flow
- Platform-based Design Flow

25 RTL-to-GDSII Design Flow – at 10,000 ft
- RTL creation and verification
- Block-level synthesis, timing analysis and ATPG
- Floorplanning
- Top-level RTL synthesis, timing analysis and ATPG
- Physical Design
- Parasitic Extraction
- Top-level timing analysis and ATPG
- Handoff
[Figure labels: blocks B1 and B2; top.v(hd), B1.v(hd), B2.v(hd), top.gdsii, sc.lib, macros, constraints, reports]

26 IP Reuse - Challenges
- IP vendors have multiple clients with different requirements; one size may not fit all.
- Does the IP meet all functional requirements? Standards compliance?
- IP verification becomes a challenge.
- Multiple vendors may supply IP; extra effort is needed to interface them (“Heterogeneous IP”).
- Does the IP meet area and performance constraints? Sometimes there may be surprises after the IP is integrated.
- We may need to add extra hardware for protocol translation.

27 IP integration challenges
- Verification in the context of the SoC
- Performance
- Standards compliance
- Functionality
- Unconnected pins, unknown values, etc.
Three students (A, B, C) form a team to write a report. B and C write sections; A integrates them. What problems might come up during integration? The page count exceeds the limit, the file provided by C is not compatible, the file provided by B may have a virus, repetitions, referencing problems, etc.

28 Example
- You are designing a 32-bit two’s complement multiplier and wish to use an adder as an IP, but only a 16-bit adder is available.
- One IP assumes that inputs arrive MSB-first, whereas another IP provides its outputs LSB-first.
- One IP assumes Big-Endian storage of data; another IP stores data in Little-Endian format.
- The voltage levels of two IPs may be different.
Interface wrappers are used to overcome these problems. Wrappers result in overheads.
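
As a rough illustration of the wrapper idea on this slide, the following Python sketch models three of the mismatches in software: a byte-order (endianness) wrapper, a bit-order wrapper, and a 32-bit addition built from a 16-bit adder. All function names are hypothetical and chosen only for this example; a real wrapper would be written in an HDL and sit between the IP ports.

```python
MASK16 = 0xFFFF


def swap_endianness(word, width_bytes=4):
    """Reverse the byte order of an unsigned word (big-endian <-> little-endian)."""
    return int.from_bytes(word.to_bytes(width_bytes, "big"), "little")


def reverse_bit_order(bits):
    """Turn an MSB-first bit sequence into LSB-first order (or the reverse)."""
    return list(reversed(bits))


def adder16(a, b, carry_in=0):
    """Model of the available 16-bit adder IP: returns (sum, carry_out)."""
    total = (a & MASK16) + (b & MASK16) + carry_in
    return total & MASK16, total >> 16


def add32(a, b):
    """Wrapper: 32-bit addition using two passes through the 16-bit adder."""
    low, carry = adder16(a, b)                   # lower halves first
    high, _ = adder16(a >> 16, b >> 16, carry)   # upper halves plus carry
    return (high << 16) | low                    # result wraps modulo 2**32
```

In this model, the second pass through `adder16` is one form of the overhead the slide mentions: the wrapper needs either an extra step or a second adder instance compared with a native 32-bit adder.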

29 Eight elements for judging the quality of Silicon Hardware/Software IP
Courtesy: Synopsys (2006)

30 IP cores
- Hard core, Soft core
- Hardware IP, Software IP
- Design IP, Test IP, Verification IP
- Processors, Memories, Hardware Accelerators, Peripherals, Analog
Search for some IP vendors who provide processor cores, memory cores, design libraries, analog IP, and IP for USB 2.0.

31 Hard IP and Soft IP
Hard IP
- Technology dependent
- Predictable: already proved in silicon
- More protected against illegal usage
- Limitation: cannot be customized
Soft IP
- Technology independent
- Risk
- Can be modified to suit the needs of the SoC
Application-specific Hard IP tries to combine the best of both worlds.

32 Hardware/Software Co-design
Two important steps in IP integration:
- Provide a hardware interface: pin/signal mapping, protocol translation, buffering.
- Provide a software driver: access to IP functionality through the OS.
Implement some functions in hardware and others in software for an area/performance/power tradeoff.
Example of JPEG: DCT in hardware, other functions in software.

33 IP Standards
OCP-IP is a well-known standard that mitigates the problem of interfacing IP from multiple vendors. Many IP cores are OCP-compliant.

34 IP-based Design – Decisions impact system cost, power, performance
- Including an IP can increase chip cost, but can bring down system cost.
- Examples: DDR2 Interface, Power Management, Security IP
- Knowledge of the target market is essential to make such decisions.

35 Ideal ESL Design Flow (© 2008 Sudeep Pasricha & Nikil Dutt)

36 Platform Based Design
Most System-on-Chip designs need:
- One or more processors
- Digital Signal Processors
- Hardware Accelerators
- Peripheral IP (touch screen, etc.)
- Connectivity IP
- Embedded Memories
- Analog/Mixed-Signal IP
Custom designing the SoC for each application has advantages, but is infeasible!
What are the limitations of the top-down approach?

37 Design for Timing Closure: Logic Design Issues
Interfaces and Timing Closure
One of the major issues compounding the problem of timing closure for large chips is the uncertainty in interconnect wire delays. Proper design of block interfaces can solve the problem.
Ex: In deep submicron technologies, the wire delay (wire load capacitance plus the RC delay of the wires between gates) can be much larger than the intrinsic delay of the gate.
The problem is that the architect and designer do not know which wires will require additional buffering until physical design.
Timing-driven place and route tools can help with some of these timing problems by attempting to place critical timing paths so as to minimize total wire length.
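
To make the wire-delay point concrete, here is a minimal Python sketch that uses the Elmore approximation for a distributed RC wire (delay ≈ 0.5 · R_total · C_total), which grows with the square of the wire length. The per-millimetre resistance and capacitance and the gate delay are assumed values for illustration only; they are not taken from the slides or from any particular process.

```python
def wire_delay_ps(length_mm, r_ohm_per_mm, c_ff_per_mm):
    """Elmore delay of a distributed RC wire, 0.5 * R_total * C_total, in ps."""
    r_total = r_ohm_per_mm * length_mm       # ohms
    c_total = c_ff_per_mm * length_mm        # femtofarads
    return 0.5 * r_total * c_total * 1e-3    # ohm * fF = 1 fs = 1e-3 ps


def wire_dominates(length_mm, gate_delay_ps, r_ohm_per_mm=200.0, c_ff_per_mm=200.0):
    """True when the unbuffered wire delay exceeds the gate's intrinsic delay.

    The default r and c values are assumptions made for this sketch.
    """
    return wire_delay_ps(length_mm, r_ohm_per_mm, c_ff_per_mm) > gate_delay_ps
```

With these assumed numbers, a 1 mm wire costs about 20 ps while a 5 mm wire costs about 500 ps, so whether a path needs buffering depends strongly on placement, which is not known until physical design.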

38 Macro Interfaces

39 Continued…
Block A and Block B can be designed independently, without consideration of their relative position on the chip. To meet timing closure, buffers can be inserted at the top level to drive long wires between blocks, without requiring redesign of Blocks A and B.
This kind of defensive timing design is useful in all large chip designs, but is essential for reuse-based SoC design. The IP designer does not know the timing context in which the block will be used. Output wires may be short, or they may be many millimeters long. Defensive timing design is the only way to ensure that timing problems will not limit the use of the IP in multiple designs.
This interface is critical for high-performance designs, and usually requires special design.

40 Sub block Interfaces

41 Continued……
Registering the outputs of the sub blocks is sufficient to provide locality in timing closure. Because Subblock 1 is relatively close to Subblock 2, there is only a very small chance that the output wires from Subblock 1 to Subblock 2 will be long enough to cause timing problems. Wire load estimates, synthesis results, and the timing constraints we give the physical design tools should all be accurate enough to achieve rapid timing closure in physical design.
There are several issues with this approach:
- When is a block large enough that we must register outputs?
- When is a block large enough that we must register both inputs and outputs?
- When can we break these rules, and how do we minimize timing risks when we do?

42 Issue 1
Any block that is synthesized as a unit should have its outputs registered. Synthesis, and time budgeting for synthesis, is where we start striving for timing closure.

43 Issue 2
Any block that is floor planned as a unit should have its inputs and outputs registered. With blocks that are floor planned as stand-alone units, especially reusable blocks, we do not necessarily know how long the wires on their outputs and inputs will be. Registering all interfaces gives us the best chance of achieving timing closure for an arbitrary chip with an arbitrary floorplan.

44 Issue 3
We should violate these guidelines only when we absolutely need to, and only when we understand the timing and floor planning implications of doing so.
Example: The PCI specification requires several levels of logic between the PCI bus and the first flop in the PCI interface block for several critical control signals. In this case we cannot register all the inputs of the PCI bus directly; instead, we must floor plan the chip so that the PCI block is very close to the I/O pads for those critical control signals.

45 PCI vs. PCI-X

46 Synchronous vs. Asynchronous Design Style
- The system should be synchronous and register (flip-flop) based.
- Latches should be used only to implement small memories or FIFOs. These memories and FIFOs should be synchronous and edge triggered.
In the past, latch-based designs have been popular, especially for some processor designs. Multi-phase, non-overlapping clocks were used to clock the various pipeline stages. Latches were viewed as offering greater density and higher performance than register (flop) based designs.

47 Continued….
The cost of the increased complexity of latch-based design has risen significantly with the increase in design size and the need for design reuse. Latch timing involves a choice between:
- Guaranteeing that the data is set up before the leading clock edge at one stage
- Allowing data to arrive as late as one setup time before the trailing clock edge at the next stage

48 [Figure]

49 Continued……
True latch-based designs are not appropriate for SoC designs. Some LSSD design styles are effectively register-based, however, and are acceptable if used correctly.
Example: Zycad Accelerators

50 Clocking
SoC designs almost always require multiple clock domains. In the canonical design, the high-speed bus and the low-speed bus will have separate clocks. The interface/USB block may require yet another clock to match the data rate of its external interface.

51 Clock planning
Problem: Distributing a clock to tens or hundreds of thousands of registers with a skew low enough to avoid hold time problems will stress even the best clock tree synthesis tool. Every additional clock domain is an opportunity for disaster, since every time data crosses clock domains, there is an opportunity for metastability and corrupted data.
Rule: The system clock generation and control logic should be separate from all functional blocks of the system. Keeping the clocking logic in a separate module allows the designer to modify it for a specific process or tool without having to touch the functional design.
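
The link between clock skew and hold time problems can be sketched with the usual hold check: the earliest new data must not reach the capturing flop before its hold window closes. The Python below is an illustrative model only; the function name and the picosecond values in the usage note are assumptions made for this example, not figures from the slides.

```python
def hold_slack_ps(clk_to_q_min, logic_min, hold, skew):
    """Hold slack in ps for a flop-to-flop path.

    skew is the capture-clock arrival time minus the launch-clock arrival
    time. A negative result means a hold violation.
    """
    earliest_data_arrival = clk_to_q_min + logic_min
    required_after_edge = hold + skew
    return earliest_data_arrival - required_after_edge
```

For example, with an assumed 100 ps clock-to-Q delay, 50 ps of logic and a 50 ps hold time, the path has 100 ps of slack at zero skew, but 200 ps of skew turns that into a 100 ps violation.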

52 Whenever a flip-flop sees a setup or hold time violation, it can enter a state in which its output is unpredictable. This is known as the metastable (quasi-stable) state; at the end of the metastable state, the flip-flop settles to either '1' or '0'. The whole process is known as metastability. In the figure below, Tsu is the setup time and Th is the hold time. Whenever the input signal D does not meet the Tsu and Th of the given D flip-flop, metastability can occur.

53 What are the cases in which metastability occurs?
Metastability can occur whenever the setup or hold time is violated, so we need to see when signals violate this timing requirement:
- When the input signal is an asynchronous signal.
- When the clock skew/slew is too large (rise and fall times exceed the tolerable values).
- When interfacing two domains operating at two different frequencies, or at the same frequency but with different phase.
- When the combinational delay is such that the flip-flop data input changes in the critical window (setup + hold window).
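
The critical window described above can be expressed as a simple check: a data transition is unsafe if it falls later than Tsu before the clock edge and earlier than Th after it. The following Python sketch is illustrative only; the function name and the picosecond values in the usage note are assumptions made for this example.

```python
def in_critical_window(data_change, clock_edge, t_setup, t_hold):
    """True if D changes inside the setup + hold window around a clock edge.

    All times share one unit (for example ps). A change exactly at the
    window boundary is treated as meeting the requirement.
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    return window_start < data_change < window_end
```

With an assumed clock edge at 1000 ps, Tsu = 100 ps and Th = 50 ps, a data change at 950 ps falls inside the window, while one at 800 ps does not. An asynchronous input can change at any time, so nothing prevents it from landing in this window.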

54 [Figure]

55 Continued…
- Required clock frequencies and associated phase-locked loops
- External timing requirements (setup/hold and output timing) needed to interface to the rest of the system
- Skew requirements between different, but related, clock domains (the difference between clock delays for Clock 1 and Clock 2)

56 Continued… Clock Delays for Hard Blocks
Clocks to hard macros present a special problem. Often they have a large insertion delay, which must be taken into account during clock tree implementation. Hard blocks should have their own clock net in a separate clock domain. This net can be used to compensate for the insertion delay. Once the insertion delay is determined, you can address it in either of two ways:
- Eliminate the delay using a PLL (to de-skew).
- Balance the clock insertion delay with the clock delay for the rest of the logic by giving the hard macro an early version of the clock.
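
The second option, giving the hard macro an early version of the clock, comes down to simple arithmetic: the macro's clock is launched earlier by the difference between its insertion delay and the delay of the clock tree feeding the rest of the logic. The Python below is a minimal sketch of that bookkeeping; the function names and the delay values in the usage note are hypothetical.

```python
def early_clock_offset_ps(macro_insertion_delay, logic_tree_delay):
    """How much earlier (in ps) the macro clock must leave the clock source."""
    return macro_insertion_delay - logic_tree_delay


def clock_arrivals_ps(source_time, macro_insertion_delay, logic_tree_delay):
    """Arrival times at the macro's internal flops and at the ordinary flops
    when the macro is fed the early clock computed above."""
    offset = early_clock_offset_ps(macro_insertion_delay, logic_tree_delay)
    macro_arrival = (source_time - offset) + macro_insertion_delay
    logic_arrival = source_time + logic_tree_delay
    return macro_arrival, logic_arrival
```

With an assumed 900 ps macro insertion delay and a 400 ps clock tree for the rest of the logic, the macro clock leaves 500 ps early and both groups of flops see the edge at the same time.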

57 Reset
Synchronous reset:
- Advantage: Is easy to synthesize; reset is just another synchronous input to the design.
- Disadvantage: Requires a free-running clock, especially at power-up, for reset to occur.
Asynchronous reset:
- Advantage: Does not require a free-running clock.
- Advantage: Uses a separate input on the flop, so it does not affect flop data timing.
- Disadvantage: Is harder to implement; reset is a special signal, like clock. Usually, a tree of buffers is inserted at place and route.
- Disadvantage: Makes static timing analysis and cycle-based simulation more difficult, and can make the automatic insertion of test structures more difficult.
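
The behavioural difference between the two reset styles can be shown with a small software model. This is a simplified Python sketch, not RTL: the class and method names are invented for this example, and `None` stands in for the unknown power-up value of a flop.

```python
class SyncResetFlop:
    """D flip-flop whose reset is sampled only on a clock edge."""

    def __init__(self):
        self.q = None  # None models the unknown power-up state

    def clock_edge(self, d, reset):
        # Reset behaves like any other synchronous input.
        self.q = 0 if reset else d


class AsyncResetFlop:
    """D flip-flop whose reset acts immediately, with or without a clock."""

    def __init__(self):
        self.q = None
        self.reset = False

    def set_reset(self, reset):
        self.reset = reset
        if reset:
            self.q = 0  # takes effect without waiting for a clock edge

    def clock_edge(self, d):
        self.q = 0 if self.reset else d
```

In this model, a `SyncResetFlop` stays at `None` until `clock_edge` is called, which mirrors the need for a free-running clock at power-up, whereas an `AsyncResetFlop` reaches 0 as soon as `set_reset(True)` is called.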

58 Timing Exceptions and Multicycle Paths
In general, the standard model of reuse is for a fully synchronous system. Asynchronous signals and other timing exceptions should be avoided; they make chip-level integration significantly more difficult.
The optimization tools (synthesis and physical synthesis) work best with fully synchronous designs. Once the clock frequency is defined, these tools can work to ensure that every path from flop to flop meets this timing constraint.
Any exceptions to this model (asynchronous signals, multicycle paths, or test signals that do not need to meet this timing constraint) must be identified. Otherwise, the optimization tools will focus on optimizing these (false) long paths, and will not properly optimize the real critical timing paths.
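
A short calculation shows why an unidentified multicycle path misleads the tools. The Python sketch below computes flop-to-flop setup slack with an optional number of allowed cycles; the function name and the delay values in the usage note are assumptions made for this example.

```python
def setup_slack_ps(period, clk_to_q, logic_delay, setup, cycles=1):
    """Setup slack in ps for a flop-to-flop path.

    cycles is the number of clock periods the path is allowed to take
    (1 for an ordinary path, more for a declared multicycle path).
    A negative result means the path fails timing.
    """
    return cycles * period - (clk_to_q + logic_delay + setup)
```

With an assumed 1000 ps clock period, 100 ps clock-to-Q, 1500 ps of logic and a 50 ps setup time, the path appears to fail by 650 ps when treated as single-cycle, yet has 350 ps of margin once it is declared as a two-cycle path. Left undeclared, it would look like the worst path in the design and attract optimization effort it does not need.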

59 Design for Verification: Verification Strategy
Design teams consistently list timing closure and verification as the major problems in chip design. The aim is to reduce both the number of iterations and the time each iteration takes.
The objective of verification is to ensure that the block or chip being verified is 100% functionally correct.
The best strategy for minimizing defects is bottom-up verification: finding and fixing bugs is easier in small designs. The major challenge in bottom-up verification is developing a test bench for the macro. With modern high-level verification languages, creating test benches at the macro level is considerably easier than before.
The system-level verification strategy must be developed and documented before macro selection or design begins.

60 Continued………
The macro-level verification strategy must be developed and documented before the design of macros and major blocks for the chip begins. This strategy should be based on bottom-up verification. The verification strategy also determines the kinds of test benches required for system or chip-level verification.

61 System Interconnect and On-Chip Buses
In the early days of reuse-based design, the wide variety of buses in common use presented a major problem.
Ex: Every chip in a project had a unique bus.
Solution: Adopt a standard bus, so that reusable blocks can be developed with a single interface that allows them to be used in a wide variety of chips.
Meanwhile, design teams were under pressure to develop complex chips on very aggressive schedules while struggling with specialized buses.

62 Continued…..
- ARM, MIPS, and IBM PowerPC are standard processors for various segments.
- The ARM processors are designed to work with ARM’s AMBA bus.
- MIPS has its own EC bus.
- IBM has developed its CoreConnect bus, which is available to IBM customers.
- The AMBA bus has become the closest thing to an industry-wide standard for on-chip interconnect. CoreConnect is clearly the other key player in this area.
Example: Integrating a Third-Party AGP Core

63 Basic Interface Issues
Why do we need to take care while designing a bus? To reduce power while still meeting the bandwidth requirements of the various blocks in the system.
A private bus connects the processor to its cache and perhaps to other memories that it accesses frequently. For performance reasons, these memories are often designed for a specific process, rather than for general reuse.

64 Types and characteristics: High speed bus, Low speed bus
A high-speed bus (Advanced High-performance Bus, or AHB, in the case of AMBA):
- A high-performance interface between the processor and the other high-bandwidth blocks in the design.
- This bus is typically pipelined and supports maximum bandwidth.
- The AHB has a two-stage pipeline, and supports split transactions as well as a variety of burst mechanisms.
A low-speed bus (Advanced Peripheral Bus, or APB, in the case of AMBA):
- A separate bus for low-speed peripherals and peripherals rarely accessed by the processor.
- By reducing the clock speed on the APB, it is possible to reduce the power consumption of the bus and the peripherals.
In very high performance designs, it may be necessary to use a layered bus architecture to meet the required bandwidth.
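
The power saving from a slower APB clock follows from the standard first-order model of dynamic power, P = α · C · V² · f, which is linear in clock frequency. The Python sketch below applies that model; the activity factor, capacitance, supply voltage and frequencies in the usage note are assumed values for illustration and do not describe any real AMBA implementation.

```python
def dynamic_power_w(activity, capacitance_f, vdd, freq_hz):
    """First-order dynamic power in watts: alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd ** 2 * freq_hz


def power_ratio(slow_hz, fast_hz, activity=0.2, capacitance_f=1e-9, vdd=1.2):
    """Dynamic power at slow_hz relative to fast_hz, all else being equal.

    The default activity, capacitance and voltage are assumptions made
    for this sketch.
    """
    slow = dynamic_power_w(activity, capacitance_f, vdd, slow_hz)
    fast = dynamic_power_w(activity, capacitance_f, vdd, fast_hz)
    return slow / fast
```

Under this model, running the peripheral bus at an assumed 50 MHz instead of 100 MHz halves its dynamic power, independent of the other parameters. Static (leakage) power is not covered by this formula.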

65 Questions
- Waterfall and spiral model
- Top-down and bottom-up approach
- A canonical SOC design and its challenges
- Write a short note on specification.
- Differentiate between formal specification and executable specification.
- System design process
- Differentiate between soft and hard IP. In the canonical SOC design, divide the blocks into hard and soft IP.
- Differentiate between full-custom and semi-custom design.
- Discuss the significance of full-custom in design reuse.
- What is timing closure?
- Explain how to interface macros and sub blocks.
- Discuss synchronous and asynchronous design style.
- Write a short note on clocking in SOC design.
- Discuss the advantages and disadvantages of synchronous and asynchronous reset.
- What are the interfacing issues while designing an SOC?
- Why are mux-based buses preferred?
- Write a short note on IP-to-IP interfaces.
- Discuss static and dynamic power while designing a low-power SOC.
- Write a short note on power-reduction techniques.
- Explain the different types of clock gating.

