3 Selecting a Target FPGA. Factors to be considered: dynamic reconfigurability; reconfiguration time; partial vs. full reconfiguration; granularity.
4 Dynamic reconfigurability. The ability of an FPGA to modify its operation during runtime. Correct FPGA selection is very important. The primary advantages of run-time reconfiguration are reduced power consumption, hardware reuse, and flexibility. The main problem is the speed of reconfiguration.
5 Granularity. Two main architectures. Coarse-grained: a smaller number of larger, more powerful logic blocks; advantage: faster because routing is easier. Fine-grained: a large number of small logic blocks; advantages: good utilization and easy conversion to ASIC.
6 Xilinx Virtex Architecture. The architecture is coarse-grained. The basic cell of the Virtex FPGA is the configurable logic block (CLB). Each CLB contains circuitry that allows it to perform arithmetic efficiently. LUTs can be configured as SRAM cells. The device contains programmable input/output blocks (IOBs) interconnected with the CLBs.
8 Configuration of Xilinx Virtex. Fully and partially reconfigurable. Module-based partial reconfiguration: distinct portions of an FPGA are referred to as reconfigurable modules; used for independent design applications. Small-bit manipulation: accomplished by making a small change to the design; switching configurations is fast because the difference bitstream is much smaller than the full-device bitstream.
9 Lattice ORCA Architecture. Coarse-grained architecture. Four basic elements: programmable logic cells (PLCs), programmable input/output cells (PIOs), embedded block RAMs (EBRs), and system-level features. The programmable functional unit (PFU) is the basic functional unit; it contains eight 4-input LUTs, eight latches/FFs, and one additional flip-flop for arithmetic functions.
11 Configuration of Lattice ORCA. Fully and partially reconfigurable. Partial reconfiguration is done by setting a bitstream option that tells the FPGA not to reset the entire configuration. Options are available that allow one portion of the FPGA to remain in operation while another is being reconfigured. Off-chip reconfiguration.
12 Atmel AT40K Architecture. Fine-grained architecture. Symmetrical array of identical cells. Distributed 10 ns SRAM capability. 8-sided core cells with direct horizontal, vertical, and diagonal cell-to-cell connections. Small cells lead to a large number of cells, which in turn leads to greater functionality. Each cell can implement two Boolean functions of 3 inputs or one function of 4 inputs.
14 Configuration of Atmel AT40K. Fully and partially reconfigurable. Off-chip reconfiguration. Configuration data is transferred in Master mode, Slave mode, or Synchronous RAM mode. Master mode: auto-configuring; an internal oscillator provides the configuration clock for clocking in configuration data. Synchronous RAM mode: the device receives a 32-bit-wide bitstream; the configuration memory is a memory-mapped address space to which the user has full read/write access.
15 Slave mode: configuration is always initiated by an external signal. In Slave Serial mode the device receives serial configuration data. In Slave Parallel mode the device receives either 8-bit-wide or 16-bit-wide parallel data.
16 Altera APEX 20K Architecture. The architecture is coarse-grained. It combines LUT-based logic, product-term-based logic, and memory in one device. Signal interconnections are provided by the FastTrack Interconnect. The device consists of an array of MegaLAB structures. Each MegaLAB structure consists of a group of logic array blocks (LABs), one Embedded System Block (ESB), and a MegaLAB interconnect. Each logic element (LE) contains a four-input LUT that can implement any function of four variables.
19 Altera APEX 20K configuration. Only fully reconfigurable. During operation, the device stores its configuration data in SRAM cells. Active configuration: both the target and the configuration device generate control and synchronization signals. Passive configuration: a microprocessor controls the configuration process. Off-chip reconfiguration.
20 Conclusion. A wide variety of dynamically reconfigurable FPGA devices is available today. Partial reconfiguration takes much less time (~4-5 ms) than full reconfiguration (~12 ms).
21 Run-Time FPGA Reconfiguration for Power-/Cost-Optimized Real-Time Systems Marisha Rawlins
22 Introduction. Run-time partial reconfiguration is useful to designers who want to develop applications that demand adaptive and flexible hardware. Using partial reconfiguration can result in power and area savings. An intelligent system is needed to manage reconfiguration in order to save power and meet timing constraints in real-time systems.
27 Basic System Approach: Address assignment. Each module is assigned a default address at design time. New modules are assigned valid logic addresses within the legal address range when they are loaded.
28 Basic System Approach: Transferring data. From modules: the bus arbiter calls every existing address in turn; if the module's busy signal is asserted, the arbiter selects the next address; the module's request signal is asserted if the module wants to send data. To modules: the main controller asserts a Data-In signal if external data is available for the selected module.
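The polling scheme on this slide can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions: the `Module` class, the `outbox` queue, and the `poll_once` function are hypothetical names, and the slide does not say how the real arbiter orders addresses or clears the request signal.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    address: int
    busy: bool = False       # asserted while the module cannot be served
    request: bool = False    # asserted when the module wants to send data
    outbox: list = field(default_factory=list)  # hypothetical pending data

def poll_once(modules):
    """Call every existing address once and collect (address, data) pairs."""
    transfers = []
    for module in sorted(modules, key=lambda m: m.address):
        if module.busy:
            continue  # busy signal asserted: the arbiter selects the next address
        if module.request and module.outbox:
            transfers.append((module.address, module.outbox.pop(0)))
            # Assumption: the request stays asserted while data remains queued.
            module.request = bool(module.outbox)
    return transfers
```

Calling `poll_once` repeatedly models successive arbiter rounds; a module that was busy in one round is served in a later round once its busy signal is released.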
29 Basic System Approach: Data transfer schemes and replacing modules. Data transfer schemes: the main controller knows the module's data transfer time; a time stamp is sent before the data. Replacing modules: if a module that is not currently configured on the FPGA is needed, an unused module is replaced; a context save is done using the data and state I/O lines; data and state are transferred to the local memory of the main controller.
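A minimal sketch of the replacement step, assuming a fixed number of module slots and a dictionary standing in for the main controller's local memory. The class and method names (`MainController`, `ensure_loaded`) and the rule of preferring an empty slot over evicting a module are assumptions made for illustration, not details given on the slide.

```python
class MainController:
    def __init__(self, num_slots):
        self.slots = [None] * num_slots  # module name held by each slot
        self.context = {}                # data/state of modules on the FPGA
        self.local_memory = {}           # saved contexts of replaced modules

    def ensure_loaded(self, name, in_use=()):
        """Return the slot holding `name`, replacing an unused module if needed."""
        if name in self.slots:
            return self.slots.index(name)
        empty = [s for s, m in enumerate(self.slots) if m is None]
        unused = [s for s, m in enumerate(self.slots) if m not in in_use]
        candidates = empty or unused
        if not candidates:
            raise RuntimeError("no unused module can be replaced")
        slot = candidates[0]
        evicted = self.slots[slot]
        if evicted is not None:
            # Context save: move data and state into the controller's memory.
            self.local_memory[evicted] = self.context.pop(evicted, {})
        self.slots[slot] = name
        # Restore a previously saved context, or start with an empty one.
        self.context[name] = self.local_memory.pop(name, {})
        return slot
```

Restoring the saved context on reload is an extra assumption; the slide only states that data and state are saved when a module is replaced.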
30 Basic System Approach: Bus realization. Interfaces are in a fixed place. All modules can be implemented in every possible function column.
31 Initial Results. Using partial reconfiguration, the application can be implemented on an XCV200E instead of an XCV300E: smaller area, lower power, lower cost. It can still meet the timing constraints of a real-time system.
34 Conclusions. It is possible to use dynamic partial reconfiguration to save both power and area. An intelligent run-time system is needed to ensure that power is saved and that timing constraints are met when using partial reconfiguration in real-time systems. Measuring the power consumed during partial reconfiguration helps determine the design of the run-time system and the feasibility of using dynamic partial reconfiguration.
36 Outline. Benefits of field-programmable gate arrays (FPGAs). Overview of partial reconfiguration (PR) in FPGAs. Partial reconfiguration in the Xilinx Virtex-4. Software-defined radio and partial reconfiguration: What is software-defined radio? Why is PR applicable to software-defined radio? PR designs of software-defined radios (SDRs). Conclusions.
37 Why are FPGAs being used? FPGAs are now being used as replacements for application-specific integrated circuits (ASICs) in many space-based applications. FPGAs provide: reconfigurable architectures; a low-cost solution; rapid development times; additional benefits in each new generation beyond the expected larger size and faster speed.
38 Design exploration on the latest FPGAs. Exploring and evaluating designs on the latest FPGAs can be difficult. It is a continuous endeavor, because performance and capabilities improve significantly with each release. Each vendor's products can have different characteristics and utilities, and there are often unique capabilities, so side-by-side evaluation of designs across different products is not straightforward. One advancement of particular importance is partial reconfiguration (PR). Three vendors provide some degree of this feature: Xilinx, Atmel, and Lattice.
39 What is Partial Reconfiguration? Partial reconfiguration (PR) is the ability to reconfigure a portion of an FPGA. The real advantage arises when PR is done during runtime, also known as dynamic reconfiguration. Dynamic reconfiguration allows a portion of an FPGA to be reconfigured while the remainder continues operating without any loss of data. Two types of regions: static (keeps operating) and reconfigurable (can be reconfigured with a new module). Figure: an FPGA with a static region (static modules, including a central controlling agent, ICAP, and memory controller) and two partial reconfiguration regions, PRR 1 for reconfigurable modules (PRMs) A and B and PRR 2 for modules C and D.
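The split between static and reconfigurable regions can be modeled as a small data structure. The sketch below is an assumption-laden illustration: the `Fpga` class and its fields are hypothetical, and the region and module names simply mirror the labels in the slide's figure (PRR 1 with modules A and B, PRR 2 with modules C and D).

```python
class Fpga:
    def __init__(self, static_modules, regions):
        self.static_modules = tuple(static_modules)  # never touched at run time
        self.regions = regions                       # region name -> allowed PRMs
        self.active = {name: None for name in regions}

    def reconfigure(self, region, module):
        """Load `module` into `region`; every other region keeps its module."""
        if module not in self.regions[region]:
            raise ValueError(f"{module} is not a PRM of {region}")
        self.active[region] = module


fpga = Fpga(
    static_modules=["controller", "icap", "mem_controller"],
    regions={"PRR1": {"A", "B"}, "PRR2": {"C", "D"}},
)
fpga.reconfigure("PRR1", "A")
fpga.reconfigure("PRR2", "C")
fpga.reconfigure("PRR1", "B")  # PRR2 still holds module C afterwards
```

The model captures only the bookkeeping: which module occupies which region, and that a swap in one region leaves the static modules and the other region unchanged.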
40 Partial Reconfiguration in Virtex-4. Exploring partial reconfiguration for a design requires significant knowledge of the targeted device. Xilinx FPGAs are widespread, so the Virtex-4 family was chosen as the example FPGA. Current performance and limitations need to be evaluated: reconfiguration speeds and methods; design hierarchy limitations related to PR modules (PRMs) and the number of allowed PR regions.
41 Virtex-4 LX15 FPGA Layout. Overall structure: CLBs (configurable logic blocks), IOBs (input-output buffers), DSP48s (Xilinx's digital signal processing units), BRAMs (block random access memories), FIFOs (first-in first-out buffers), DCMs (digital clock managers). Figure 1. Virtex-4 LX15 FPGA layout (areas labeled CLBs, IOBs, DSP48s, BRAMs and FIFOs, DCMs and clock distribution).
42 Virtex-4 Reconfiguration. The FPGA is reconfigured by writing bits into configuration memory (CM). Configuration data is organized into frames that target specific areas of the FPGA through frame addresses. To reconfigure any portion of a frame, the partial bitstream must contain configuration data for the whole frame. Reconfiguration times depend heavily on the size and organization of the PR regions. Virtex-II allowed column-based PR only; Virtex-4 devices allow arbitrarily sized PR regions. Virtex-4 frames are composed of bit words; the LX15 has 3,740 frames. There are four methods of reconfiguring a device, each with applications where it is desirable. Externally: serial configuration port, JTAG (boundary scan) port, SelectMAP port. Internally: through the internal configuration access port (ICAP), using an embedded microcontroller or state machine.
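The whole-frame rule can be illustrated with a short calculation. This sketch makes two loudly labeled assumptions that are not stated on the slide: a frame length of 41 words of 32 bits (the figure documented for Virtex-4 in the Xilinx configuration guide), and a flat bit index into configuration memory, whereas real frame addresses are hierarchical. The function names are hypothetical.

```python
FRAME_WORDS = 41   # assumption: Virtex-4 frame length in words
WORD_BITS = 32     # assumption: word width in bits
FRAME_BITS = FRAME_WORDS * WORD_BITS

def frames_touched(first_bit, last_bit):
    """Frame indices covering the inclusive bit range [first_bit, last_bit]."""
    return range(first_bit // FRAME_BITS, last_bit // FRAME_BITS + 1)

def partial_payload_bits(first_bit, last_bit):
    """Configuration bits the partial bitstream must carry for that range."""
    return len(frames_touched(first_bit, last_bit)) * FRAME_BITS
```

Under these assumptions, changing a single bit still costs a full frame of payload, and a change that straddles a frame boundary costs two frames, which is why the organization of a PR region affects reconfiguration time as much as its size does.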
43 Reconfiguration Speeds. Table 1 summarizes the configuration speeds of the four options. Table 2 shows example configuration sizes and times for the four options. Values are based on estimates from Xilinx's PlanAhead software when targeting an approximate PR region slice utilization of 90%. Table 1. Summary of Configuration Options. Table 2. Example Configuration Sizes and Times to Configure with JTAG and SelectMAP/ICAP.
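As a rough guide to how such times arise, the transfer time of a bitstream is its size divided by the port throughput. The sketch below is a back-of-the-envelope model only: the function name and the example figures (a 1,000,000-bit partial bitstream, a 100 MHz clock, 1-bit and 8-bit ports) are hypothetical and are not taken from Table 1 or Table 2. It ignores protocol overhead and the time needed to fetch the bitstream from memory.

```python
def config_time_s(bitstream_bits, port_width_bits, clock_hz):
    """Ideal transfer time: one port-width word per clock cycle."""
    if port_width_bits <= 0 or clock_hz <= 0:
        raise ValueError("port width and clock must be positive")
    return bitstream_bits / (port_width_bits * clock_hz)


# Hypothetical comparison of a serial (1-bit) and a byte-wide (8-bit) port.
serial_time = config_time_s(1_000_000, 1, 100e6)
parallel_time = config_time_s(1_000_000, 8, 100e6)
```

In this idealized model the byte-wide port is exactly eight times faster than the serial one at the same clock, so port width and clock rate matter as much as the size of the PR region.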
44 Reconfiguration using an Embedded Microcontroller. Many Xilinx FPGAs have embedded hard processor cores. The processor core can run C/C++ code, which makes reconfigurable designs extremely flexible because no external control is needed. Reconfiguration steps: (1) reconfiguration is triggered within the FPGA; (2) the processor core loads the desired configuration data from external reconfiguration memory, which could be ROM, flash, or static RAM loaded at startup or filled by the FPGA itself; (3) the processor reconfigures the PR region through the ICAP primitive. Figure 2. PR design using embedded microcontroller.
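The three reconfiguration steps can be mimicked in software. The sketch below is a simulation under assumptions, not a driver: `FakeIcap` is a hypothetical stand-in for the ICAP primitive, `bitstream_store` stands in for the external reconfiguration memory, and the 32-bit big-endian word format is chosen only for illustration.

```python
class FakeIcap:
    """Hypothetical stand-in that records the words written to the ICAP."""
    def __init__(self):
        self.words = []

    def write(self, word):
        self.words.append(word & 0xFFFFFFFF)


def reconfigure(region, module, bitstream_store, icap):
    """Run when reconfiguration is triggered (step 1); returns words written."""
    # Step 2: load the configuration data from reconfiguration memory.
    data = bitstream_store[(region, module)]
    if len(data) % 4:
        raise ValueError("bitstream length must be a multiple of 4 bytes")
    # Step 3: push the data through the ICAP, one 32-bit word at a time.
    for offset in range(0, len(data), 4):
        icap.write(int.from_bytes(data[offset:offset + 4], "big"))
    return len(data) // 4
```

A real controller would also have to handle errors reported by the configuration logic; the sketch leaves that out.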
45 PR Design Hierarchy. The flow from hardware description language (HDL) to configuration bitstream is extremely complicated. The top-level module should contain sub-modules that are either static or reconfigurable. All communication, except global signals such as clocks, must be explicitly declared using 8-bit bus macros. The current PR design flow allows multiple partial reconfiguration regions, for example one PR region for PRM A and one for PRM B. Figure 3. Example design showing two PR regions.
46 PR Design Hierarchy, cont. The required hierarchy adds a significant amount of effort when converting an existing static design into one that is ready for PR. Having all PRMs at the top level will often require routing many signals to and from another module deep within the main static module. Figure 4. Transceiver design with turbo coding and concatenated convolutional + Reed-Solomon coding: (a) hierarchical view before PR partitioning; (b) required design partitioning.
47 PR Software Support. Initially there was very little software support to assist in generating PR designs and bitstreams. More recently, Xilinx's Early Access (EA) PR flow, with the integration of Xilinx's PlanAhead tool, greatly reduces the complexity of PR.
48 Software-Defined Radios and Partial Reconfiguration. Whether PR is appropriate for a given application depends heavily on the FPGA family. One field in which PR can be applied to great advantage is software-defined radio.
49 Why Software-Defined Radio? Wireless communication is becoming ubiquitous in the latest generation of portable electronics: cell phones, cameras, MP3 players, etc. Satellites need an overwhelming number of communication standards to communicate with this wide array of devices. Hardware designs that attempt to provide compatibility with current standards will likely become obsolete shortly after their release. Solution: use software-defined radios (SDRs). SDRs run on a generic hardware platform that allows communication parameters to be defined by software during runtime. Common communication functions affected by these parameters include modulation, demodulation, filtering, frequency selection, and frequency hopping.
50 What is Software-Defined Radio? SDRs are split into three basic sections. Radio frequency (RF) section: acts as a transceiver, performing conversion up or down to the intermediate frequency. Intermediate frequency (IF) section: performs all the necessary signal processing, such as modulation, demodulation, and filtering. Baseband section: processes the digital data, i.e., assembles or disassembles the data payload. Reconfigurable SDR design: essentially, creating a modular design on an FPGA that can load the desired functions as needed. The reconfigurable nature of an FPGA allows the functionality of a specific block to be reconfigured while the remainder of the design continues to function. This provides a unique opportunity to create an extremely flexible and compact design. Figure 5. Block diagram of a generic software-defined radio.
51 PR Designs for Software-Defined Radios. The application of PR was studied for three SDR types: simplex spread-spectrum transceiver with FEC; dynamic bandwidth resource allocation transceiver; cognitive radio.
52 Simplex Spread-Spectrum Transceiver with FEC. Simplex transceiver: transmit or receive capabilities are used at a given time, but never both at the same time. The waveform is assumed to require forward error correction (FEC) and direct-sequence spread spectrum (DSSS). Two PR regions are defined: one for the Tx modulator or the Rx demodulator; one for the Tx FEC encoder, the Rx DSSS acquisition engine, or the Rx FEC decoder. Figure 6. Simplex spread-spectrum transceiver with FEC.
53 Dynamic Bandwidth Resource Allocation Transceiver. Dynamic bandwidth resource allocation (DBRA) systems alter the communication waveform dynamically to match channel conditions, adjusting the configuration to keep the bit error rate at a given threshold. Four PR regions are defined: one for Tx binary phase-shift keying (BPSK) or Tx Gaussian minimum-shift keying (GMSK); one for a turbo encoder or a convolutional encoder followed by Reed-Solomon (RS) encoding; one for Rx BPSK or Rx 8-ary phase-shift keying (8PSK); one for a turbo decoder or a concatenated Viterbi decoder followed by an RS decoder. Figure 7. Dynamic bandwidth resource allocation transceiver.
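How a DBRA controller might pick the module for one PR region can be sketched as a simple threshold rule. Everything below is a hypothetical policy, not the design from the slides: the function name, the `headroom` parameter, and the switching rule are assumptions. The only fact relied on is that BPSK tolerates a noisier channel than 8PSK, while 8PSK carries more bits per symbol.

```python
def choose_rx_demodulator(current, measured_ber, threshold, headroom=0.1):
    """Pick the module to load into the Rx demodulator PR region."""
    if measured_ber > threshold:
        return "BPSK"   # channel too noisy: fall back to the robust scheme
    if measured_ber < threshold * headroom:
        return "8PSK"   # comfortable margin: trade it for throughput
    return current      # inside the hysteresis band: avoid reconfiguring
```

The hysteresis band between `threshold * headroom` and `threshold` is there so that a bit error rate hovering near the threshold does not trigger a partial reconfiguration on every measurement.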
54 Cognitive Radio Receiver. Steps of the cognitive radio (CR) function: scan the available spectrum using an FFT; locate energy; create a channel that attempts to match the spectral shape; perform modulation recognition; try to demodulate. One PR region: one module for modulation recognition and one for the demodulator. The FFT module is left static because the spectrum is monitored frequently. If reconfiguration is very fast, the FFT can also be made reconfigurable. Figure 8. Cognitive radio receiver.
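The first two steps, scanning the spectrum and locating energy, can be illustrated with a small sketch. It is only a stand-in for the FFT module: a direct O(n^2) DFT replaces a real FFT for the sake of a dependency-free example, and the function names and the `threshold_ratio` detection rule are assumptions.

```python
import cmath
import math

def power_spectrum(samples):
    """Power per frequency bin, computed with a direct DFT (not a fast FFT)."""
    n = len(samples)
    spectrum = []
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def occupied_bins(samples, threshold_ratio=0.5):
    """Bins whose power is at least `threshold_ratio` of the strongest bin."""
    spectrum = power_spectrum(samples)
    peak = max(spectrum)
    return [k for k, p in enumerate(spectrum) if p >= threshold_ratio * peak]


# A real-valued tone at bin 3 of a 32-point scan also appears at bin 32 - 3.
tone = [math.cos(2 * math.pi * 3 * t / 32) for t in range(32)]
bins = occupied_bins(tone)
```

The later steps (matching the spectral shape, modulation recognition, demodulation) would operate on the bins found here and are not modeled.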
55 Conclusions. Exploring the usefulness of PR in the field of software-defined radio has shown both its feasibility and its benefits. The design-time overhead of creating a PR design is acceptable, but it requires progressing through a slow learning curve before any results are obtained. The full benefits of PR will not be evident until it becomes commonplace in industry and vendors devote more resources to supporting the PR design flow and keeping documentation up to date. However, the adaptivity of PR combined with the demand for SDRs makes a strong argument for pursuing PR.