Heterogeneous Logic Blocks

1 1 Heterogeneous Logic Blocks
1. Mixture of two different sizes of LUTs:
   - Larger LUT and cluster sizes: higher speed
   - Smaller sizes: more area efficient
     - It is up to the CAD tool to select the resource
2. Mixture of PAL-like LBs and LUT-based LBs:
   - PAL blocks: improved circuit speed
   - LUT blocks: area efficiency
3. Mixture of "specific-purpose logic" (SP) and general-purpose (GP) LBs:
   - SP LBs: superior area, speed, and power consumption
   - If the function is not used, the silicon area is wasted

2 2 Heterogeneous Logic Blocks
Key questions:
1. Which kinds of SP functions?
2. What should be the ratio SP/GP?
3. What can be done about SP LBs not used in a specific application?
   - Rose's golden rule: "build structures that are always useful, even if that use is less than perfectly efficient."
   - "The more useful a hard structure is, across a wider range of applications, then the greater its net benefit - provided the cost of the extra functionality is not excessive."
   - Rose. Hard vs. Soft: The Central Question of Pre-Fabricated Silicon. In Proceedings of the 34th International Symposium on Multiple-Valued Logic (ISMVL'04), 2004.

3 3 Hard Blocks
Common hard blocks in modern FPGAs:
- Memory
- Multipliers
- MAC for DSP applications
- Microprocessors

4 Embedded Memories

5 5 Memory in Altera Flex10K

6 6 Memory in FLEX 10K

7 7

8 8 Heterogeneous Logic Blocks
Each EAB (Embedded Array Block) provides:
- 2048 bits if used as memory
  - Dual-port RAM, ROM, FIFO, ...
- 100-600 gates if used as logic

9 9 Configuration as memory
Each 2048-bit EAB can be configured as:
- 256x8: A[7..0], D[7..0]
- 512x4: A[8..0], D[3..0]
- 1024x2: A[9..0], D[1..0]
- 2048x1: A[10..0], D0
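Every configuration on this slide divides the same 2048 bits, so the address width follows from the data width. A minimal sketch of that arithmetic, assuming a hypothetical helper named `eab_configs` (an illustration, not an Altera API):

```python
EAB_BITS = 2048  # capacity of one EAB, as stated on the slides


def eab_configs(data_widths=(8, 4, 2, 1)):
    """Return (depth, width, address_bits) for each data width.

    Hypothetical helper for illustration; it only does the arithmetic.
    """
    configs = []
    for width in data_widths:
        depth = EAB_BITS // width            # number of words
        addr_bits = depth.bit_length() - 1   # log2(depth) for a power of two
        configs.append((depth, width, addr_bits))
    return configs


for depth, width, addr_bits in eab_configs():
    print(f"{depth}x{width}: {addr_bits} address lines, {width} data lines")
```

Halving the data width doubles the depth and adds one address line.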

10 10 Configuration as memory
- EABs can be used independently
- EABs can be combined for a larger memory
- Example: two 512x4 EABs (each with A[8..0], D[3..0]) share the address lines A[8..0] and together provide D[7..0]
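Combining two 512x4 blocks into one wider memory can be modeled in a few lines. This is only a behavioral sketch: the names `make_eab` and `read_512x8` are hypothetical, and the assignment of one block to the upper nibble and the other to the lower nibble is an assumption.

```python
def make_eab(contents):
    """Model one 512x4 memory as a list of 512 four-bit values."""
    assert len(contents) == 512
    assert all(0 <= v < 16 for v in contents)
    return list(contents)


def read_512x8(eab_hi, eab_lo, address):
    """Both blocks see the same A[8..0]; their outputs form D[7..0]."""
    return (eab_hi[address] << 4) | eab_lo[address]


# Split some 8-bit test data across the two blocks.
data = [(3 * a) & 0xFF for a in range(512)]
eab_hi = make_eab([d >> 4 for d in data])    # D[7..4]
eab_lo = make_eab([d & 0xF for d in data])   # D[3..0]
```

Reading any address through `read_512x8` returns the original 8-bit word.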

11 11 Altera Cyclone III Architecture

12 12 Cyclone III

13 13 Configuration as a logic function
- An EAB can be used as a LUT, e.g. a square-root unit (one EAB with 8 inputs and 8 outputs).
- Advantage over an implementation with several LEs: predictable delay and higher speed.
- An EAB can be used on its own, or several EABs can be combined to implement a more complex function.
- Remember question 3: What can be done about SP LBs not used in a specific application?
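The square-root example can be sketched as a 256-entry table with 8-bit entries. The slides do not say how the 8 output bits are encoded; the 4.4 fixed-point format below (4 integer bits, 4 fraction bits) is an assumption chosen so that all 8 output bits are used.

```python
import math

# 256x8 table: address = 8-bit input x, data = floor(sqrt(x) * 16).
# math.isqrt(x << 8) equals floor(sqrt(x * 256)) = floor(16 * sqrt(x)).
SQRT_ROM = [math.isqrt(x << 8) for x in range(256)]


def sqrt_lookup(x):
    """Return sqrt(x) in assumed 4.4 fixed point with a single table read."""
    return SQRT_ROM[x]


print(sqrt_lookup(16) / 16)  # fixed-point result scaled back to a real number
```

The lookup takes one memory access regardless of the input value, which is where the predictable delay comes from.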

14 14 Cyclone III M9K

15 15 Memory Modes
- Embedded shift register mode
- ROM mode
- FIFO buffer
- Single/dual-port

16 16 Memory Volume in Cyclone III

17 17 Memory Modes
Simple dual-port mode:
- Supports simultaneous read and write operations to different locations.
True dual-port mode:
- Supports any combination of two-port operations:
  - two reads,
  - two writes,
  - one read and one write,
  at two different clock frequencies.

18 18 Memory Block Megafunctions
- Memory blocks can be instantiated with the Quartus MegaWizard.
- They can also be instantiated directly in your VHDL/Verilog code.
  - Refer to "RAM Megafunction User Guide," 2007, http://www.altera.com/literature/ug/ug_ram.pdf

19 19 Altera Stratix II Embedded Memory

20 20 TriMatrix Memory Structure

21 21 Stratix II RAM Blocks

22 22 Stratix IV RAM Blocks

23 23 Applications of Embedded Memory
- 4x4 multiplier (or any complex mathematical function, e.g. the B-th root of A)
- For larger multipliers, we use several 4x4 multipliers and several adders.
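A sketch of both points, assuming unsigned operands: a 256x8 table acts as the 4x4 multiplier, and four table reads plus shifted additions give an 8x8 product. The names `MUL4_ROM`, `mul4`, and `mul8` are hypothetical.

```python
# 256x8 table: address = {a[3..0], b[3..0]}, data = a * b (at most 225).
MUL4_ROM = [(addr >> 4) * (addr & 0xF) for addr in range(256)]


def mul4(a, b):
    """4x4 unsigned multiply as a single table read."""
    return MUL4_ROM[(a << 4) | b]


def mul8(a, b):
    """8x8 unsigned multiply from four 4x4 lookups and adders."""
    a_hi, a_lo = a >> 4, a & 0xF
    b_hi, b_lo = b >> 4, b & 0xF
    return ((mul4(a_hi, b_hi) << 8)
            + ((mul4(a_hi, b_lo) + mul4(a_lo, b_hi)) << 4)
            + mul4(a_lo, b_lo))


print(mul8(200, 123))
```

The shifts by 8 and 4 weight the partial products; in hardware they are wiring, so only the additions cost logic.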

24 24 Applications of Embedded Memory
Constant multiplier (in DSP and control systems):
- The constant determines the pattern of the EAB contents.
- If the constant changes at run time, the new pattern can be loaded into the EAB.
- The precision of the multiplier can be adjusted by choosing the number of output bits (to save resources).
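The three points can be illustrated with a small table generator. This is a sketch under assumptions not stated on the slide: unsigned 8-bit input, an unsigned 8-bit constant, and precision reduced by keeping only the most significant bits of the 16-bit product. The name `build_constant_rom` is hypothetical.

```python
def build_constant_rom(constant, out_bits=8):
    """Table contents for x * constant with x in 0..255.

    Only the top `out_bits` of the 16-bit product are stored, so fewer
    output bits mean a narrower (cheaper) memory and a coarser result.
    """
    assert 0 <= constant < 256 and 1 <= out_bits <= 16
    shift = 16 - out_bits
    return [(x * constant) >> shift for x in range(256)]


rom = build_constant_rom(200)                    # contents depend on the constant
full = build_constant_rom(200, out_bits=16)      # full-precision variant
rom = build_constant_rom(50)                     # constant changed: reload the table
```

Changing the constant means regenerating and reloading the table; the surrounding logic stays the same.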

25 25 Applications of Embedded Memory
- FSMs with complex state transitions
- General-purpose FSM:
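One common way to build an FSM from a memory block, sketched here under assumptions that are not taken from the slides: the address is formed from the current state and the input, and the stored word holds the next state and the output. The "101" sequence detector and all names below are purely illustrative.

```python
# (state, input_bit) -> (next_state, output_bit) for a detector of "101"
# with overlapping matches. State 0: idle, 1: seen "1", 2: seen "10".
TRANSITIONS = {
    (0, 0): (0, 0), (0, 1): (1, 0),
    (1, 0): (2, 0), (1, 1): (1, 0),
    (2, 0): (0, 0), (2, 1): (1, 1),
}

# 8-word memory: address = {state[1..0], input}, data = {next_state[1..0], output}.
# The unused state 3 keeps the default word 0 (go to state 0, output 0).
FSM_ROM = [0] * 8
for (state, bit), (next_state, out) in TRANSITIONS.items():
    FSM_ROM[(state << 1) | bit] = (next_state << 1) | out


def run_fsm(bits):
    """Feed input bits through the table, one memory read per step."""
    state, outputs = 0, []
    for bit in bits:
        word = FSM_ROM[(state << 1) | bit]
        state, out = word >> 1, word & 1
        outputs.append(out)
    return outputs


print(run_fsm([1, 0, 1, 0, 1, 1]))
```

Changing the machine only requires new memory contents, which is what makes the structure general purpose.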

26 26 Applications of Memory

27 27 Applications of Embedded Memory
Transcendental functions (sine, ..., logarithm, ...), which are hard to compute algorithmically and hard to implement in hardware:
- Function argument: applied to the address lines.
- Result: appears on the data output.
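A sine table makes the address/data roles concrete. The sketch assumes a 256x8 memory that covers only the first quadrant (0 to just under pi/2) with an unsigned 8-bit amplitude; both choices and the names are illustrative, not taken from the slides.

```python
import math

ADDR_BITS = 8   # assumed: 256 table entries
OUT_MAX = 255   # assumed: unsigned 8-bit output

# Address a represents the angle (pi/2) * a / 256; data is the scaled sine.
SINE_ROM = [round(OUT_MAX * math.sin((math.pi / 2) * a / (1 << ADDR_BITS)))
            for a in range(1 << ADDR_BITS)]


def sine_lookup(address):
    """Argument goes in on the address lines, result comes out as data."""
    return SINE_ROM[address]
```

The remaining three quadrants can be obtained from the same table by mirroring the address and negating the result.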

28 28 Applications of Embedded Memory
Large code converters: e.g. a converter from an 8-bit code to a 10-bit code

29 29 Xilinx Virtex II Pro (Digital Clock Manager)

30 30 Xilinx Virtex II Pro

31 31 Xilinx Virtex 4

32 32 Virtex 5

33 Computation-Oriented Tiles

34 34 Virtex Family

35 35 18x18 multipliers for computational and DSP tasks

36 36 Virtex II Pro family devices (Digital Clock Manager)

37 37 18x18 multipliers
In Virtex 5: DSP48E slices
- 25 x 18 two's complement multiplication
- One adder, one subtractor, and an accumulator
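A behavioral sketch of a 25 x 18 two's complement multiply-accumulate step. The 48-bit accumulator width is an assumption based on the slice's name, not something stated on the slide, and `to_signed` and `mac` are hypothetical helper names rather than a Xilinx API.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac(acc, a, b):
    """acc + a * b with a 25-bit a, an 18-bit b, and an assumed 48-bit accumulator."""
    product = to_signed(a, 25) * to_signed(b, 18)
    return to_signed(acc + product, 48)   # wraps around on overflow


# Accumulate a short dot product.
acc = 0
for a, b in [(3, 4), (-2, 5), (7, -1)]:
    acc = mac(acc, a, b)
print(acc)
```

The wrap-around in `to_signed` models a fixed-width register; real designs usually size the accumulator so that it does not overflow.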

38 38 Multipliers in Altera Cyclone III

39 39 Embedded Multipliers

40 40 Embedded Multipliers
- Each embedded multiplier can be configured as
  - one 18 × 18 multiplier, or
  - two 9 × 9 multipliers.
- For multiplications larger than 18 × 18, the Quartus II software cascades multiple embedded multipliers.
- There is no restriction on the data width,
  - but the greater the data width, the slower the multiplication.
- Soft multipliers can also be implemented using Cyclone III M9K memory blocks.
  - This increases the number of available multipliers.

41 41 Number of Multipliers

42 42 Multiplier Block Architecture

43 43 9-Bit Mode

44 44 Multiplier Megafunctions
For instantiating multipliers, refer to:
- Quartus User Guide, Synthesis, http://www.altera.com/literature/hb/qts/qts_qii5v1_03.pdf

45 45 Stratix II DSP Blocks

46 46 Stratix II DSP Blocks

47 47 Stratix II DSP Blocks

48 48 Stratix II DSP Blocks

49 49

50 50 Stratix Architecture

51 51 Ratio-Based Architectures
If an application does not need multipliers, they provide little benefit.
- One approach: multiple sub-families within a device family, with different ratios of soft logic to hard logic.
- The designer can select the device with the most appropriate ratio.
  - Benefit: minimizes "wasted" area
  - Cost: the FPGA vendor must support a larger number of devices
[Figure: soft/hard ratio of different devices]

52 52 Ratio-Based Architectures
Virtex 4/Virtex 5 sub-families:
1. LX: focus on soft logic and memory
2. SX: focus on arithmetic computational units
3. FX: focus on a processor and high-speed serial interfaces
Virtex 6:
1. LXT: high-performance logic with advanced serial connectivity
2. SXT: highest signal processing capability with advanced serial connectivity
3. HXT: highest bandwidth serial connectivity

53 53 Xilinx Virtex 4

54 54 Virtex 5

55 Embedded Processors

56 56 System-Level Design
- Until recently, the CPU and its peripherals were implemented as discrete chips.
Two scenarios:
- Memory connected to the CPU via a general-purpose processor bus
- Tightly coupled memory (TCM) connected to the processor via a dedicated bus

57 57 Embedded System Design
- Dedicated chips for the CPU and peripherals lead to:
  - High area cost
  - Low reliability
- For a relatively small amount of memory, the memory integrated in the FPGA is used.

58 58 Challenges
- Deciding on the hardware/software partitioning.
- The design environment must support hardware/software co-verification.

59 59 SoPC
SoC:
- A chip that integrates the major functional elements of a complete end product.
Complex FPGAs contain:
- CPU
- Memory
- Arithmetic units (multipliers, ...)
- Peripheral modules
- Logic
- Result: a whole system on a programmable chip (SoPC)

60 60 Microprocessor Cores
Two types:
- Hard core
  - Implemented as a hardwired component
  - E.g. PowerPC in Xilinx
  - E.g. ARM in Altera
  - E.g. MIPS in QuickLogic
- Soft core
  - Logic blocks are configured to act as microprocessor(s)
  - E.g. MicroBlaze in Xilinx
  - E.g. Nios II in Altera
  - E.g. Q90C1 in QuickLogic

61 61 Hard Microprocessor Cores
Two scenarios:
1. Locate the core in a strip to the side of the FPGA fabric.
   - Easier for tools, because the main FPGA fabric is identical for devices with or without the hard core
   - The FPGA vendor can embed many additional functions in the strip to complement the microprocessor.
   - Altera: ARM in Excalibur

62 62 Hard Microprocessor Cores
Two scenarios:
2. Embed the core(s) directly into the main FPGA fabric.
   - Design tools must take the presence of these blocks in the fabric into account.
   - The memory used by the core comes from embedded RAM blocks.
   - Proximity to the main FPGA fabric gives speed advantages.
   - Xilinx: PowerPC in Virtex II-Pro, Virtex 4, and Virtex 5.

63 63 Hard Microprocessor Cores
2. (cont.) Embed the core(s) directly into the main FPGA fabric.
   - No dedicated processor bus or peripheral bus.
   - These buses must be implemented using FPGA logic.
   - Advantage: flexibility to define the architecture of the embedded system.
   - Disadvantage: the processor cannot perform useful work until the FPGA logic is configured.

64 64 Soft Processor Core
Disadvantages:
- Generally slower
- Larger
Advantage:
- Can often be customized to exactly suit the needs of the application
  - This gains back some of the lost performance and area efficiency.

65 65 Soft Microprocessor Cores
Firm or soft:
- Soft: delivered as an RTL netlist that will be synthesized
- Firm: already placed and routed
Peripherals in soft or firm form:
- E.g. memory controllers, interrupt controllers, communication functions, timer counters.
- Refer to the FPGA vendor's library.
Xilinx:
- MicroBlaze: 32-bit microprocessor (~1000 logic cells)
- PicoBlaze: 8-bit microprocessor (~150 logic cells)
Altera:
- Nios II: 32-bit microprocessor

66 66 References
[Xilinx] www.xilinx.com
[Altera] www.altera.com

