Presentation on theme: "PARTIAL RECONFIGURATION DESIGN. 2 Partial Reconfiguration Partial Reconfiguration : Ability to reconfigure a portion of the FPGA while the remainder."— Presentation transcript:
2 Partial Reconfiguration Partial Reconfiguration: Ability to reconfigure a portion of the FPGA while the remainder of the design is still operational. Application areas: In-the-field hardware upgrades and updates to remote sites; Runtime reconfiguration (RTR) −Swap tasks in and out; Adaptive hardware algorithms (EHW). Benefits: Reduced device count; Reduced power consumption; More efficient use of available board space. Very few devices on the market: −Xilinx 6200: obsolete −Xilinx Virtex
3 Partial Reconfiguration Design Flows: 1. JBits approach (for a few old devices) 2. Modular Design Flow 3. Difference-Based Approach
4 Partial Reconfiguration Bitstream: Configuration data that can be downloaded into the device via the configuration port. Packet: Fragment of the complete bitstream sent to the device to reconfigure the needed part of the device. Configuration memory: Part of the processor memory dedicated to reconfiguration data. Dirty packets: Marked packets showing the changes made between the last configuration and the present one.
5 Virtex Devices Partial reconfiguration in Virtex: Frames: Smallest unit of reconfiguration. Frames in Xilinx devices: Virtex, Virtex II, Virtex II-Pro: −The whole column. Virtex 4, Virtex 5: −Only a complete tile. −Tile width and height differ between devices. [Figure: TASK 1, logical shared memory, TASK 2, CLB; from Banerjee07]
7 Bitstream Manipulation with JBits JBits: A tool to allow the end user to set the content of the LUT and make connections inside the FPGA “directly” −by changing reconfiguration data. JBits API: A set of java classes, methods: −to set the LUT-values −to define the interconnections −to read back the content of the FPGA currently in use.
8 JBits Methods set(row, col, SliceNumber, Type, value): Sets the LUT of type F or G in slice 0 or 1 of the CLB at (row, col) to the array value −value: 16 entries for a 4-input LUT. connect(outpin, inpin): Connects the pin outpin to the pin inpin anywhere inside the FPGA (terminals of CLBs).
9 JBits Methods JBits IP Cores: Hundreds of predefined cores: −adders, −subtractors, −multipliers, −CORDIC processor, −encoders and decoders, −network modules. Can be used directly in designs. Can also be combined to generate more complex cores.
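The 16-entry value passed to set() is the truth table of a 4-input function. The Python sketch below is not JBits code; it only illustrates how such an array could be derived. The helper name lut4_table and the input ordering (first argument = least significant index bit) are assumptions made for this illustration and may differ from the ordering JBits expects.

```python
def lut4_table(func):
    """Return the 16 truth-table entries of a 4-input Boolean function.

    Entry i holds func(a, b, c, d), where a is bit 0 of i and d is bit 3.
    This ordering is an assumption for illustration only.
    """
    table = []
    for i in range(16):
        inputs = [(i >> bit) & 1 for bit in range(4)]
        table.append(int(bool(func(*inputs))))
    return table


# Truth tables for a 4-input AND and a 4-input XOR (parity).
and4 = lut4_table(lambda a, b, c, d: a and b and c and d)
xor4 = lut4_table(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Only the last entry of and4 is 1, because AND is true only when all four inputs are 1.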
10 JBits Building components from scratch: Too difficult. Bitstream subtraction is used instead: 1. Implement as many full (top-level) bitstreams as required. 2. Perform the subtraction among them: −The differences are the parts that will be used to configure the device later. Procedure: 1. Two bitstreams are scanned and compared element by element. 2. Only the difference is copied into the resulting bitstream.
11 Approaches to PR Partial Reconfiguration in Xilinx Devices: 1. Module-based PR: −Implement each component separately. −Constrain components to be placed at a given location. −The complete bitstream is finally built as the sum of all partial bitstreams. 2. Difference-based PR: −Implement the complete bitstreams separately. −Implement fixed parts + reconfigurable parts with components constrained to the same location in all the bitstreams. −Compute the difference of two bitstreams to obtain the partial bitstream needed to move from one configuration to the next one.
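The scan-and-compare procedure can be sketched in Python as follows. This is a simplified model, not a tool for real devices: it treats a bitstream as a flat sequence of equally sized frames and ignores headers, configuration commands, and CRC words. The function name diff_frames and the frame size used in the example are hypothetical.

```python
def diff_frames(base: bytes, target: bytes, frame_size: int) -> dict:
    """Return {frame_index: new_frame_data} for every frame that differs.

    Simplified model: both inputs are assumed to be raw frame data of the
    same length, with no header, command, or CRC words.
    """
    if len(base) != len(target) or len(base) % frame_size:
        raise ValueError("inputs must have equal length, a multiple of frame_size")
    changed = {}
    for offset in range(0, len(base), frame_size):
        new_frame = target[offset:offset + frame_size]
        if new_frame != base[offset:offset + frame_size]:
            # Copy only the frames that differ into the result.
            changed[offset // frame_size] = new_frame
    return changed


# Three 4-byte frames; only the middle frame differs.
base = bytes(12)
target = bytes([0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0])
partial = diff_frames(base, target, frame_size=4)
```

Here partial holds a single entry for frame 1, which is all that would need to be written to move from base to target in this model.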
13 Module-Based PR Reconfigurable Module (RM): A distinct portion of an FPGA design that is reconfigured while the rest of the device remains in active operation. Properties of RMs: An RM’s height is the full height of the device −or the height of a frame in Virtex 4, 5, 6. An RM’s width: min = 4 slices, max = full-device width (in four-slice increments). Horizontal placement must always be on a four-slice boundary; the leftmost placement being x = 0, 4, 8, … All logic resources encompassed by the width of the module are considered part of the RM's bitstream "frame" −This includes slices, TBUFs, block RAMs, multipliers, IOBs, and most importantly, all routing resources. Clocking logic (BUFGMUX, CLKIOBs) is always separate from the reconfigurable module. −Clocks have separate bitstream frames.
14 Module-Based PR Properties of RMs: IOBs immediately above the top edge and below the bottom edge of an RM are part of that RM’s resources. If an RM occupies the leftmost/rightmost slice column, all IOBs on that edge are part of that RM’s resources. RMs communicate with other modules (both fixed and reconfigurable) by using a special bus macro. The implementation must be designed so that the static portions of the design do not rely on the state of the module under reconfiguration while reconfiguration is taking place. −The implementation should ensure proper operation of the design during the reconfiguration process. −Explicit handshaking (module ready/not-ready) logic may be required. The state of the storage elements inside the RM is preserved during and after the reconfiguration process. −Designs can take advantage of this fact to utilize "prior state" information after a new configuration is loaded.
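The width and placement rules above can be checked mechanically. The following Python sketch assumes slice columns are numbered from x = 0 at the left edge; the function name rm_placement_errors and the 48-column device width in the example are hypothetical and not taken from any Xilinx tool.

```python
def rm_placement_errors(x_left: int, width: int, device_width: int) -> list:
    """List the rule violations for an RM spanning slice columns
    x_left .. x_left + width - 1 on a device device_width slices wide."""
    errors = []
    if width < 4:
        errors.append("width below the 4-slice minimum")
    if width % 4:
        errors.append("width is not a multiple of 4 slices")
    if x_left % 4:
        errors.append("left edge is not on a 4-slice boundary")
    if x_left < 0 or x_left + width > device_width:
        errors.append("module extends beyond the device")
    return errors


ok = rm_placement_errors(x_left=8, width=12, device_width=48)
misaligned = rm_placement_errors(x_left=6, width=12, device_width=48)
```

An empty list means the placement satisfies the four-slice rules; misaligned reports that the left edge is off the four-slice grid.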
15 Module-Based PR Initially developed to allow several engineers to cooperatively work on the same project. Procedure: Project leader: −Identifies the components of the whole project, −Estimates the amount of resources consumed by each component, −Defines locations for the components on the device Engineers: −develop the single parts independently.
18 1. Design Entry and Synthesis Engineers: Develop modules using an HDL. Synthesize them. Verify them. Leader: Completes top-level design entry: −Modules are black boxes with ports. −Defines top-level netlist. Synthesizes it. Verifies it.
19 2. Initial Budgeting Leader assigns top-level constraints (in a constraint file): Pin locations, Area constraints for each module: −by hand or −by floorplanner, Timing constraints, Insertion of bus macros: ….
20 Bus Macros in Xilinx FPGAs Problem in partial reconfiguration: The resulting application may not work if the signals connecting the fixed and dynamic parts do not use the same paths in the two configurations. Solution: A bus macro provides fixed communication channels that can be used by a reconfigurable module. −tri-state lines −not affected by the reconfiguration
22 Module-Based PR Module-Based PR Flow: Bus macros guarantee fixed communication channels among RMs and the fixed part. −One must ensure that signals will not be routed on the wrong paths after the reconfiguration. −Routing resources used for such inter-module signals must not change when a module is reconfigured. −With JBits, these signals must be routed manually again!
23 Bus Macro Bus macro: The HDL code should ensure that any reconfigurable-module signal used to communicate with another module first passes through a bus macro. Each bus macro provides 4 bits of inter-module communication. −If A communicates with B via 32 bits, then eight (32/4) bus macros need to be instantiated.
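The 32/4 = 8 arithmetic generalizes to widths that are not a multiple of four by rounding up. A minimal Python sketch, with a hypothetical helper name:

```python
BITS_PER_BUS_MACRO = 4  # each bus macro carries 4 bits, as stated above


def bus_macros_needed(signal_bits: int) -> int:
    """Number of 4-bit bus macros needed for signal_bits inter-module bits."""
    if signal_bits < 0:
        raise ValueError("signal_bits must be non-negative")
    # Ceiling division: a partly used macro still has to be instantiated.
    return -(-signal_bits // BITS_PER_BUS_MACRO)


macros_for_32_bits = bus_macros_needed(32)
```

For the 32-bit example this gives eight macros; a 33-bit interface would need nine, with three bits of the last macro unused.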
26 Automatic Bus Macro Placement for Partially Reconfigurable FPGA Designs, Jeffrey M. Carver, Richard N. Pittman, FPGA09
27 3. Implementing Active Modules Team members implement their modules in parallel. −Both fixed and partially reconfigurable modules. −Synthesis, place, route, but always in the context of the top-level logic and constraints. If the area specified for a module cannot contain the physical logic for the module, then the area must be resized in the constraint file generated during the initial budgeting. Verification by timing analysis. −May need iteration of floorplanning.
28 4. Module Assembling Team leader assembles the previously implemented modules into one top-level design. More details in [Lim02]
30 Difference-Based Partial Reconfiguration Useful for making small on-the-fly changes to design parameters such as logic equations, …. Procedure: 1. Designer makes small logic changes using FPGA_Editor: −changing I/Os −block RAM contents −LUT programming −muxes −flip-flop initialization and reset values −pull-ups or pull-downs on external pins −block RAM write modes −Changing any property or value that would impact routing is not recommended due to the risk of internal contention. 2. Designer uses BitGen to generate a bitstream that programs only the difference between the two versions. −Very quick switching
31 Difference-Based Partial Reconfiguration Viewing a block
34 Difference-Based Partial Reconfiguration More details in [Eto07]
35 Early Access Design Flow Differences: Types of bus macros: −Synchronous and asynchronous −Wide- and narrow-type bus macros −Wide covers more CLBs −…
36 Early Access Design Flow Xilinx enhanced modular design flow. Differences: Support for partial reconfiguration of Virtex 4 & 5.
Direct Bitstream Manipulation in Virtex [Upegui05]
38 Direct Bitstream Manipulation in Virtex Addressing LUT contents of a bitstream: For the Virtex family, XAPP151 describes the bitstream in detail. For Virtex-II and later, the bitstream format has not been documented −But LUT contents can be located in the configuration bitstream.
40 Frame Description (XC2V40) Details in [Upegui05] Unknown functionality
41 LUT Contents In the Virtex family: LUT configurations are stored inverted: −a 4-input AND function is stored as 0111 1111 1111 1111 (7F FF) instead of 1000 0000 0000 0000 (80 00). Bit order is swapped in F-LUTs relative to G-LUTs: −in G-LUT: 7F FF −in F-LUT: FF FE. Benefits of direct manipulation: E.g. evolving circuits in a very flexible way; Run-time reconfiguration
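The two transformations can be reproduced in a few lines of Python. This sketch reads "bit order is swapped" as a full reversal of the 16 bits, which is an assumption, although it matches the AND example above. The function name stored_lut_bits is hypothetical.

```python
def stored_lut_bits(truth_table: int, lut: str) -> int:
    """Map a 16-bit truth table to the value stored in the bitstream.

    G-LUT: the truth table is inverted.
    F-LUT: inverted, then the 16 bits are reversed (assumed reading of
    "bit order is swapped", consistent with the AND example).
    """
    inverted = ~truth_table & 0xFFFF
    if lut == "G":
        return inverted
    if lut == "F":
        return int(f"{inverted:016b}"[::-1], 2)
    raise ValueError("lut must be 'F' or 'G'")


AND4 = 0x8000  # 1000 0000 0000 0000: output is 1 only when all inputs are 1
g_value = stored_lut_bits(AND4, "G")  # 0x7FFF
f_value = stored_lut_bits(AND4, "F")  # 0xFFFE
```

Both results agree with the values on the slide: 7F FF for the G-LUT and FF FE for the F-LUT.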
42 References [Bobda07] Christophe Bobda, “Introduction to Reconfigurable Computing: Architectures, Algorithms and Applications,” Springer, 2007. [Virtex-4] “Virtex-4 Configuration Guide,” http://www.xilinx.com. [Lim02] Davin Lim and Mike Peattie, “Two Flows for Partial Reconfiguration: Module Based or Small Bit Manipulations,” XAPP290, v1, www.xilinx.com, 2002. [Eto07] Emi Eto, “Difference-Based Partial Reconfiguration,” XAPP290, v2, www.xilinx.com, 2007. [Upegui05] A. Upegui and E. Sanchez, “Evolving hardware by dynamically reconfiguring Xilinx FPGAs,” Evolvable Systems: From Biology to Hardware, LNCS 3637, 2005.