Presentation on theme: "PARTIAL RECONFIGURATION USING FPGAs: ARCHITECTURE"— Presentation transcript:
1 PARTIAL RECONFIGURATION USING FPGAs: ARCHITECTURE

2 Agenda
- Introduction
- Partial Reconfiguration Basics
- Design Considerations
- Advantages of Partial Reconfiguration
- Challenges of Partial Reconfiguration
- Application Examples
- Case Study

3 Introduction
Basic premise: hardware reconfiguration is allowed during execution of an application.
Some interesting applications:
- Dynamic instruction set architecture
- Software-defined radio
- Video encoding techniques
- Cryptography
- Networking protocols
[Figure: FPGA chip hosting Design A, Design B, and Design C]

4 Introduction
Classification of FPGAs with respect to configuration capabilities
- Dynamic Partial Reconfiguration: reconfiguring only a part of the device at run time while the rest of the device keeps executing.
- Useful for systems that can time-share the FPGA resources.

5 Introduction: Benefits
- Area reduction
- Power reduction
- Hardware reuse
- Flexibility
- Performance improvement
- Higher level of parallelism
- Time-sliced resource sharing
- Fast system start: load a basic module to enable a fast system boot-up, then load peripheral modules later.
- Smaller bitstream sizes
- Application portability: encapsulation of a reconfigurable system into a portable application.

6 Partial Reconfiguration Basics
Each vendor's products can have different characteristics and utilities. Some common terms are listed below.
Terminology:
- Reconfigurable Partition (RP)
- Dynamic Partial Reconfiguration (DPR)
- Partially Reconfigurable Module (PRM)
- Configuration Memory (CM)
- Frames
- Partial Bitstream
- Merged Bitstream
- Static Logic (Base Region)
- Bus Macro

7 Partial Reconfiguration Basics: Structure Overview
Overall structure:
- CLBs: configurable logic blocks
- IOBs: input/output blocks
- DSP48s: Xilinx's digital signal processing units
- BRAMs: block random access memories
- FIFOs: first-in, first-out buffers
- DCMs: digital clock managers
Figure 1. Virtex-4 LX15 FPGA layout (regions labeled CLBs, IOBs, DSP48s, BRAMs and FIFOs, DCMs and clock distribution)

9 Partial Reconfiguration Basics: Bitstream and Frames
- FPGAs are reprogrammed by writing bits into the configuration memory (CM).
- The CM is organized in small blocks called frames.
- Multiple frames are required to program a column of tiles (after Virtex-II).
- Frames contain both routing and logic tile configuration.
- Virtex-6 frame size: 81 x 32 bits (81 words).
- Typical bitstreams for Virtex-6 are in the range of 43 Mb to 190 Mb.

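As a rough worked example of the frame arithmetic on this slide, the sketch below converts a bitstream size into an approximate frame count. It assumes that "Mb" means 10^6 bits and ignores header, command, and padding words, so the results are order-of-magnitude estimates only. The function names are illustrative and do not come from any vendor tool.

```python
WORDS_PER_FRAME = 81   # Virtex-6 frame size quoted on the slide
BITS_PER_WORD = 32


def frame_bits() -> int:
    """Number of configuration bits in one frame."""
    return WORDS_PER_FRAME * BITS_PER_WORD


def approx_frames(bitstream_megabits: float) -> int:
    """Approximate number of whole frames in a bitstream.

    Assumption: 1 Mb = 1,000,000 bits and the bitstream is pure frame data.
    """
    total_bits = int(bitstream_megabits * 1_000_000)
    return total_bits // frame_bits()


if __name__ == "__main__":
    print("bits per frame:", frame_bits())
    for size_mb in (43, 190):
        print(size_mb, "Mb is roughly", approx_frames(size_mb), "frames")
```

Under these assumptions a frame holds 2592 bits, so the quoted bitstream range corresponds to tens of thousands of frames.
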
10 Partial Reconfiguration Basics: Bitstream
- Different columns of the FPGA fabric can have different bitstreams.
- PR overhead for full flexibility.
- It is possible to reduce bitstream size through:
- Compression techniques
- Partial reconfiguration

11 Partial Reconfiguration Basics: Frames
- Row address: 0 to 9.
- Top/bottom row: given with respect to HCLK; together with the row address it locates the tile.
- Major address: columns, 0 onwards.
- Minor address: number of frames in the tile.
- Block type: logic blocks, BRAMs, routing blocks.

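The address fields listed on this slide can be pictured as bit fields packed into a single frame address word. The sketch below is a minimal illustration, assuming field widths of 3 bits (block type), 1 bit (top/bottom), 5 bits (row), 8 bits (major) and 7 bits (minor). These widths and the names `FrameAddress`, `pack` and `unpack` are assumptions for illustration; check the actual layout against the configuration user guide for the target device.

```python
from typing import NamedTuple


class FrameAddress(NamedTuple):
    block_type: int  # logic, BRAM, ... (encoding is device specific)
    bottom: int      # 0 = top half, 1 = bottom half, relative to HCLK
    row: int         # row address
    major: int       # column index
    minor: int       # frame index within the column


# (field name, assumed width in bits), most significant field first.
FIELDS = (("block_type", 3), ("bottom", 1), ("row", 5), ("major", 8), ("minor", 7))


def pack(addr: FrameAddress) -> int:
    """Pack the fields into one integer, most significant field first."""
    word = 0
    for (name, width), value in zip(FIELDS, addr):
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word: int) -> FrameAddress:
    """Recover the fields from a packed frame address."""
    values = []
    for _, width in reversed(FIELDS):
        values.append(word & ((1 << width) - 1))
        word >>= width
    return FrameAddress(*reversed(values))


if __name__ == "__main__":
    addr = FrameAddress(block_type=0, bottom=1, row=3, major=12, minor=5)
    word = pack(addr)
    print(hex(word), unpack(word))
```

Keeping the field table in one place means a different device family only needs a different `FIELDS` tuple.
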
12 Partial Reconfiguration Basics: Bus Macros
- Bus macros are the means of communication between PRMs and the static design.
- All connections between PRMs and the static design must pass through a bus macro, with the exception of clock signals.
- Types of bus macros:
- Tri-state buffer (TBUF) based bus macros
- Slice-based (or LUT-based) bus macros

13 Partial Reconfiguration Basics: Xilinx Bus Macros (Tri-State Buffer Based)
- Used as connection points linking the static and reconfigurable parts.
- Introduced in 2002.
- Fixed positions on the FPGA fabric.
- Present along a thin vertical slice.
- Extra hardware required.
- No longer supported in modern FPGAs.

14 Partial Reconfiguration Basics: Xilinx Bus Macros (LUT Based)
- LUTs and the switch matrix act as the connection points (2004).
- Signals cross the boundary between static and reconfigurable regions in a predefined manner.
- Uses 2 LUTs per wire.
- Increased latency and area.
- No longer used; partition pins replace bus macros.

15 Partial Reconfiguration Basics: Partition Pins
- Partition pins are the logical and physical connection between static logic and reconfigurable logic.
- Automatically created for all RP ports.
- Also referred to as proxy LUTs.
- Each one is a single LUT1.
- No special instantiations required.
- Not bidirectional.

16 Partial Reconfiguration Basics: Methods of Reconfiguration
Externally:
- Serial configuration port
- JTAG (Boundary Scan) port
- SelectMAP port
Internally:
- Through the internal configuration access port (ICAP), using an embedded microcontroller or state machine
[Table: Summary of configuration options]

17 Partial Reconfiguration Basics: Reconfiguration via a Processor

18 Partial Reconfiguration Basics: ICAP Interface
- Port to read and write the FPGA configuration at run time.
- Enables a user to write software for an embedded processor that modifies the circuit structure and functionality while the circuit is operating.
- Allows for automated run-time reconfiguration.

19 Partial Reconfiguration Basics: ICAP Interface
- Storage device
- Bus system
- DMA to storage device
- Readback support
- Configuration manager

20 Design Considerations: Partitioning Style
The partitioning style can be:
- Island style
- Slot based
- Grid based

21 Design Considerations: Placement Flexibility
- Partitioning style affects placement and flexibility.
- A partition defines the smallest atomic area to which a module can be assigned.
- Island style: suffers from fragmentation.
- Slot style: also suffers from fragmentation, but to a lesser extent. Offered by the current vendors Xilinx and Altera.
- Grid style: reduced fragmentation, but difficult to support.
- To enhance flexibility, the PR module must be placed and routed in every region where it may need to be configured. This puts additional stress on bitstream size.

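The fragmentation difference between island and slot styles can be illustrated with a one-dimensional toy model. The sketch below assumes that area is counted in abstract "columns", that an island holds exactly one module, and that a slot-based module occupies a whole number of adjacent slots. The sizes and function names are hypothetical and chosen only to show the effect.

```python
def island_waste(island_columns: int, module_columns: int) -> int:
    """Unused columns when one module occupies a whole island."""
    if module_columns > island_columns:
        raise ValueError("module does not fit in the island")
    return island_columns - module_columns


def slot_waste(slot_columns: int, module_columns: int) -> int:
    """Unused columns when a module occupies the fewest whole slots it needs."""
    slots_needed = -(-module_columns // slot_columns)  # integer ceiling division
    return slots_needed * slot_columns - module_columns


if __name__ == "__main__":
    module = 5  # hypothetical module width in columns
    print("island of 12 columns wastes", island_waste(12, module))
    print("slots of 4 columns waste", slot_waste(4, module))
```

In this toy case the same module leaves 7 columns unused in a 12-column island but only 3 when the region is divided into 4-column slots, which matches the slide's point that finer partitions reduce, but do not remove, fragmentation.
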
22 Design Considerations: Resources
- The different logic primitives are laid out column-wise.
- This layout must be considered when placing modules.
- Depending on the type of logic primitives used by the module (SLICEX, SLICEM, etc.), relocation may or may not be possible.

23 Design Considerations: Power
- Power reduction is one of the potential advantages of PR, but PR itself requires power.
- Power during PR is spent in:
1. Configuration data access
- The configuration controller
- Off-chip or on-chip memory access
- The programming interface (ICAP, SelectMAP, etc.)
2. Actual configuration of FPGA resources
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration." Microprocessors and Microsystems (2014).

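To make the trade-off on this slide concrete, here is a deliberately simplified energy estimate. It is not the model from the cited paper: it assumes each contributor draws a constant power for the whole reconfiguration and that reconfiguration time is bitstream size divided by interface throughput. All parameter names and numbers are hypothetical.

```python
def reconfig_time_s(bitstream_bits: float, throughput_bps: float) -> float:
    """Time to push a partial bitstream through the configuration interface."""
    return bitstream_bits / throughput_bps


def reconfig_energy_j(bitstream_bits: float, throughput_bps: float,
                      p_controller_w: float, p_memory_w: float,
                      p_interface_w: float, p_fabric_w: float) -> float:
    """Energy of one reconfiguration under a constant-power assumption."""
    total_power_w = p_controller_w + p_memory_w + p_interface_w + p_fabric_w
    return total_power_w * reconfig_time_s(bitstream_bits, throughput_bps)


def break_even_s(energy_j: float, power_saved_w: float) -> float:
    """How long the new configuration must run before PR pays for itself."""
    return energy_j / power_saved_w


if __name__ == "__main__":
    # Hypothetical numbers: 3.2 Mb partial bitstream, 3.2 Gb/s interface.
    energy = reconfig_energy_j(3_200_000, 3.2e9, 0.05, 0.10, 0.02, 0.03)
    print("energy per reconfiguration (J):", energy)
    print("break-even time (s):", break_even_s(energy, 0.01))
```

The break-even figure is the useful output: if a task runs for less time than that, swapping it in costs more energy than it saves.
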
24 Design Considerations: Power
[Figure: task-switching power graph, T1 to T2 and T2 to T1]
Bonamy, R., et al. "Power Consumption Models for the Use of Dynamic and Partial Reconfiguration." Microprocessors and Microsystems (2014).

25 Design Considerations: Design Flows
Module-based PR:
- Implement each component separately.
- Constrain components to be placed at a given location.
- The complete bitstream is finally built as the sum of all partial bitstreams.
Difference-based PR:
- Implement the complete bitstreams separately.
- Implement the fixed parts plus the reconfigurable parts, with components constrained to the same location in all the bitstreams.
- Compute the difference between two bitstreams to obtain the partial bitstream needed to move from one configuration to the next.

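The difference-based flow can be pictured as a frame-level diff. The sketch below is conceptual only: it models a configuration as a flat list of words split into fixed-size frames, whereas real vendor tools emit command-wrapped bitstreams. The helper names `split_frames`, `diff_frames` and `apply_diff` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Frame = Tuple[int, ...]


def split_frames(words: Sequence[int], words_per_frame: int) -> List[Frame]:
    """Cut a flat word list into equally sized frames."""
    if len(words) % words_per_frame:
        raise ValueError("word count is not a multiple of the frame size")
    return [tuple(words[i:i + words_per_frame])
            for i in range(0, len(words), words_per_frame)]


def diff_frames(old: List[Frame], new: List[Frame]) -> Dict[int, Frame]:
    """Return only the frames that differ, keyed by frame index."""
    if len(old) != len(new):
        raise ValueError("configurations must cover the same frames")
    return {i: n for i, (o, n) in enumerate(zip(old, new)) if o != n}


def apply_diff(base: List[Frame], diff: Dict[int, Frame]) -> List[Frame]:
    """Rewrite only the changed frames on top of the base configuration."""
    result = list(base)
    for index, frame in diff.items():
        result[index] = frame
    return result


if __name__ == "__main__":
    old = split_frames([0] * 12, words_per_frame=4)
    changed = [0] * 12
    changed[5] = 0xFF                      # one word changes, inside frame 1
    new = split_frames(changed, words_per_frame=4)
    partial = diff_frames(old, new)
    print(partial)                         # only frame 1 is present
    assert apply_diff(old, partial) == new
```

Because only differing frames are kept, a small logic change yields a small partial bitstream, which is why switching with this flow is quick.
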
27 Design Considerations: Difference-Based PR
- Useful for making small on-the-fly changes to design parameters such as logic equations and filter parameters.
- Procedure: the designer makes small logic changes using FPGA_Editor, such as changing:
- I/Os
- Block RAM contents
- LUT programming
- Multiplexers
- Flip-flop initialization and reset values
- Pull-ups or pull-downs on external pins
- Block RAM write modes
- Changing any property or value that would impact routing is not recommended due to the risk of internal contention.
- BitGen is then used to generate a bitstream that programs only the difference between the two versions.
- Very quick switching.

28 Design Considerations: Difference-Based PR
LUT equation changes

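A LUT equation change amounts to rewriting the LUT's truth-table bits. The sketch below computes that truth table as an integer for a small Boolean function, assuming the common convention that input I0 is the least significant bit of the table index. The function name `lut_init` is illustrative, and the mapping of these bits onto actual configuration frames is device specific and not modeled here.

```python
from typing import Callable


def lut_init(func: Callable[..., int], n_inputs: int) -> int:
    """Truth table of func as an integer: bit k holds the output for input index k."""
    init = 0
    for index in range(1 << n_inputs):
        # bits[0] is I0, the least significant bit of the index (assumed convention).
        bits = [(index >> i) & 1 for i in range(n_inputs)]
        if func(*bits):
            init |= 1 << index
    return init


if __name__ == "__main__":
    and_init = lut_init(lambda a, b: a & b, 2)
    or_init = lut_init(lambda a, b: a | b, 2)
    print(hex(and_init), hex(or_init))
    # Bits that must change to turn the AND LUT into an OR LUT.
    print(bin(and_init ^ or_init))
```

Changing a 2-input AND into an OR touches only two truth-table bits and no routing, which is the kind of edit this flow is suited to.
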
29 Design Considerations: Difference-Based PR
Changing BRAM contents

30 Challenges of Partial Reconfiguration
- Complicated design flow.
- Manual assistance is needed when reconfiguring different target devices.
- Security issues.
- Decreased performance compared to full configuration: Xilinx reports 10% degradation in clock frequency when using PR.
[Figure: Xilinx PR implementation flow: HDL design description, HDL synthesis, set design constraints, placement analysis, implement static design and PR modules, merge, final bitstreams; some steps are marked as manual]

32 Application Examples of Partial Reconfiguration
Evolution architectures:
- Artificial neural networks
- Evolvable hardware platforms
- Fuzzy systems
- Modular robotics
Speed-up:
- Crypto (asymmetric)
Area saving:
- Networking (exchange packet filters according to traffic)
- Modulation/frequency/encryption hopping in military radios
Digital signal processing:
- JPEG encoder/decoder systems
- Edge detection applications

33 Case Study: Fault Tolerance, Self-Healing Architecture
Fault-tolerant processor:
- IF, MAC, and ALU are the PRMs.
- Different configurations are available for each module.
- The focus is on the self-healing feature more than on performance itself.

34 Case Study: Reconfigurable Crypto Processor
- The processor can choose from different crypto algorithms.
- Major area savings.
- Some power savings too.

35 Case Study: Fast Start-Up
- Fast start-up is a two-step configuration.
- Useful in time-critical systems to initiate a swift system start-up.
- Example: automotive safety.

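The benefit of a two-step configuration can be estimated with simple arithmetic: load time is roughly bitstream size divided by interface bandwidth. The sketch below assumes a hypothetical 8-bit configuration interface clocked at 50 MHz, a 43 Mb full bitstream (the lower bound quoted earlier for Virtex-6), and a hypothetical 5 Mb initial bitstream. It ignores power-on delays and storage read latency.

```python
def config_time_ms(bitstream_bits: float, bus_width_bits: int, clock_hz: float) -> float:
    """Idealized load time: one bus-width transfer per clock cycle."""
    return 1000.0 * bitstream_bits / (bus_width_bits * clock_hz)


if __name__ == "__main__":
    BUS_WIDTH = 8           # hypothetical interface width in bits
    CLOCK_HZ = 50_000_000   # hypothetical configuration clock

    full_ms = config_time_ms(43_000_000, BUS_WIDTH, CLOCK_HZ)
    initial_ms = config_time_ms(5_000_000, BUS_WIDTH, CLOCK_HZ)
    print("full bitstream (ms):", full_ms)
    print("initial bitstream only (ms):", initial_ms)
```

With these assumed numbers the time-critical part of the design is running after about 12.5 ms instead of 107.5 ms, and the remaining modules are loaded afterwards.
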