1 Optimizing Dynamic Logic Realizations for Partial Reconfiguration of Field Programmable Gate Arrays
Matthew G. Parris, University of Central Florida
Committee Members: Annie S. Wu, Jooheung Lee, and Ronald F. DeMara

2 Agenda
- Contributions of Thesis
- Previous Work
- Evolvable Hardware Optimization Strategies
- Partial Reconfiguration & Architectural Analysis
- Dynamic Processor Allocation Strategies
- Conclusion and Future Work

3 Contributions of Thesis
- Novel Taxonomy
  - Classify current FPGA fault-handling methods
- FPGA Repair Optimization
  - Improve the performance of a Genetic Algorithm
- Architectural Analysis
  - Demonstrate benefits of newer FPGA devices
- Adaptive Architecture Implementation
  - Exploit benefits of Partial Reconfiguration

4 Previous Work
SRAM Field Programmable Gate Arrays (FPGA)
[Figure: Programmable Logic Block (PLB) containing a LUT, a mux, and a flip-flop; signal labels a, b, c, d, in, clock, q, y. From: The Design Warrior’s Guide to FPGAs by Clive Maxfield]

5 Previous Work
- Unlimited Programmability
  - Quickly test prototypes on final H/W architecture
  - Patch design flaws while in use
  - Repair radiation faults
- Ideal target for space applications

6 Previous Work
- Manufacturer-provided
  - Increase production yield of FPGAs
  - Architectural / hardware modifications
- User-provided
  - Integrate fault-handling methods into FPGA application

7 Previous Work
- A-priori Allocation
  - Assign spare resources during design process
- Dynamic Processes
  - Assign spare resources or determine repair during run-time

8 Previous Work
[Table residue: fault-handling methods compared against metrics]
- Methods: Sub-PLB Spares, PLB Spares, Incremental Rerouting, GA Repair, Augmented GA Repair, TMR w/ Single Module Repair, Online BIST, Competing Configurations
- Metrics: Resources, Operational Delay, Fault Latency, Unavailability, Fault Occlusion, Repair Granularity, Fault Tolerance, Fault Coverage, Critical Requirements
- Granularity levels: Fine-grained, Medium-grained, Coarse-grained

9 Previous Work
Genetic Algorithm Fault-Handling
- Some other method detects a fault
- Create a population of candidate solutions
- Test each candidate to evaluate performance
- Apply genetic operators to create new individuals
  - Crossover
  - Mutation
- Repeat process until complete repair is found
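The loop described on this slide can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the thesis implementation: candidates are bit strings, the hypothetical `TARGET` stands in for the outputs of a fault-free circuit, and the fitness function `matching_outputs`, the truncation selection, the elitism, and all parameter values are choices made here for illustration only.

```python
import random

# Hypothetical stand-in for the outputs of a fault-free circuit.
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1]


def matching_outputs(candidate):
    """Fitness: number of positions where the candidate matches TARGET."""
    return sum(c == t for c, t in zip(candidate, TARGET))


def evolve(fitness, genome_len, pop_size=20, mutation_rate=0.05,
           max_generations=500, seed=0):
    """Toy generational GA over bit strings; returns (best, generations)."""
    rng = random.Random(seed)
    complete_repair = genome_len  # assumption: a full repair scores genome_len
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for generation in range(max_generations + 1):
        # Test each candidate to evaluate performance.
        scored = sorted(pop, key=fitness, reverse=True)
        if fitness(scored[0]) == complete_repair:
            return scored[0], generation
        parents = scored[:pop_size // 2]   # truncation selection (assumption)
        children = [scored[0]]             # elitism: keep the best unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]      # one-point crossover
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]     # bit-flip mutation
            children.append(child)
        pop = children
    return max(pop, key=fitness), max_generations


best, generations = evolve(matching_outputs, len(TARGET))
print(matching_outputs(best), "of", len(TARGET),
      "outputs correct after", generations, "generations")
```

Because the best candidate is copied unchanged into each new generation, the best fitness never decreases from one generation to the next in this sketch.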

10 Evolvable Hardware Optimization Strategies
Optimize GA fault-handling method
- Some partition methods are based on similarity between individuals
  - Requires similarity function that may not be possible, and also incurs undesired computation
- Age-layered Population Structure (ALPS)
  - Used to evolve higher-fit antenna designs
  - Partition population of candidate solutions based on age of individual
  - Negligible additional computation
  - Contains best individual within one sub-population to prevent convergence of the population
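Partitioning by age needs only an integer comparison per individual, which is why the extra computation is negligible. A minimal sketch follows, assuming linearly spaced age limits; the names `age_layer` and `partition_by_age`, the `"age"` field, and the values `age_gap=20` and `num_layers=10` are illustrative assumptions, since the transcript does not state the age-limit scheme.

```python
def age_layer(age, age_gap=20, num_layers=10):
    """Return the age-level index for an individual of the given age.

    Assumes linear age limits: level i holds ages in [i*age_gap, (i+1)*age_gap).
    The top level has no upper age limit.
    """
    return min(age // age_gap, num_layers - 1)


def partition_by_age(population, age_gap=20, num_layers=10):
    """Split a list of individuals (dicts with an "age" key) into age levels."""
    layers = [[] for _ in range(num_layers)]
    for individual in population:
        level = age_layer(individual["age"], age_gap, num_layers)
        layers[level].append(individual)
    return layers


population = [{"id": i, "age": age} for i, age in enumerate([0, 5, 19, 20, 45, 400])]
for level, members in enumerate(partition_by_age(population)):
    if members:
        print("age-level", level, [m["id"] for m in members])
```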

11 Evolvable Hardware Optimization Strategies
Optimize GA fault-handling method
[Figure: a standard GA population compared with a population partitioned into age-levels 0 through 9; label: Repair]

12 Evolvable Hardware Optimization Strategies Individuals increasing in age

13 Evolvable Hardware Optimization Strategies Evolution of competitive individuals

14 Evolvable Hardware Optimization Strategies Best Individuals at each Generation (averaged over 100 runs)

15 Evolvable Hardware Optimization Strategies
Reasons for sluggish performance
- Partitioning the population into sub-populations (restricts the rate at which genetic information is communicated)
- Replacing the bottom age-level every 20 generations (causes ALPS to be less deterministic)
- Beginning population size of ALPS is 1/10 of standard (700 generations are needed to saturate capacity)

16 Evolvable Hardware Optimization Strategies
Propose new selection strategy for crossover genetic operator
[Figure: Old Selection Strategy (combined) versus New Selection Strategy (separate); labels: Parent 1, Parent 2, Choice 1, Choice 2, Pop 0, Pop 1, Pops 0&1, "Choose with probability p"]
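The figure on this slide did not survive extraction, so the following is only one plausible reading of its labels, not a confirmed description of the thesis method. It assumes that in the combined strategy Parent 2 is drawn from the pooled sub-populations 0 and 1, while in the separate strategy one sub-population is first chosen with probability `p` and Parent 2 is then drawn from it alone. The tournament selection and every function name here are illustrative assumptions.

```python
import random


def tournament(pool, fitness, rng, k=2):
    """Pick the fittest of k randomly sampled individuals (assumed operator)."""
    return max(rng.sample(pool, min(k, len(pool))), key=fitness)


def select_parents_combined(pop_lower, pop_current, fitness, rng):
    """Assumed 'combined' strategy: Parent 2 comes from the pooled populations."""
    parent1 = tournament(pop_current, fitness, rng)
    parent2 = tournament(pop_lower + pop_current, fitness, rng)
    return parent1, parent2


def select_parents_separate(pop_lower, pop_current, fitness, rng, p=0.5):
    """Assumed 'separate' strategy: choose one population with probability p,
    then draw Parent 2 from that population only."""
    parent1 = tournament(pop_current, fitness, rng)
    source = pop_lower if rng.random() < p else pop_current
    parent2 = tournament(source, fitness, rng)
    return parent1, parent2


rng = random.Random(0)
pop0 = [3, 5, 8]      # toy individuals; fitness is the value itself
pop1 = [10, 12, 15]
print(select_parents_combined(pop0, pop1, lambda x: x, rng))
print(select_parents_separate(pop0, pop1, lambda x: x, rng, p=0.5))
```

Under this reading, the separate strategy lets `p` control how often genetic material from the younger sub-population enters a crossover, regardless of how fit those individuals are relative to the older ones.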

17 Evolvable Hardware Optimization Strategies Best Individuals at each Generation (averaged over 100 runs)

18 Evolvable Hardware Optimization Strategies

19 Partial Reconfiguration and Architectural Analysis
Overview
- Partial reconfiguration modifies a portion of the FPGA
- Multiple modules may reside within reconfigurable area

20 Previous Work Spare Configs: Fine-grained

21 Previous Work Online Recovery: Competitive Configurations

22 Partial Reconfiguration and Architectural Analysis
Benefits of Partial Reconfiguration
- Reconfiguration: time-multiplex between functions (extend the number of available resources with time)
- Partial: module granularity reduced
  - Unchanged portion of FPGA is not affected by configuration
  - Smaller bitstream filesize
  - Shorter reconfiguration time
  - Lower storage requirements
- Result: significantly more combinations of hardware arrangements with similar storage requirements

23 Partial Reconfiguration and Architectural Analysis
Device: xc2vp30-7ff896, 80 CLB configuration frame

Module      | Bitstream Filesize (bytes) | Area Allocated (slices) | Area Used (slices) | Time to Configure (seconds)
Full Device | 1,448,817                  | 13,696                  |                    | 7
MD5         | 320,597 (22.1%)            | 1,280 (9.3%)            | 389 (2.8%)         | 2 (28.6%)
SHA-1       | 356,702 (24.6%)            | 1,280 (9.3%)            | 457 (3.3%)         | 2 (28.6%)

2.8–3.3% resource usage versus 22.1–24.6% bitstream filesize

24 Partial Reconfiguration and Architectural Analysis Overview of partial reconfiguration design

25 Partial Reconfiguration and Architectural Analysis FPGA Implementation and Resource Utilization

26 Partial Reconfiguration and Architectural Analysis
Device: xc4vfx60-11ff672, 16 CLB configuration frame

Module      | Bitstream Filesize (bytes) | Area Allocated (slices) | Area Used (slices)
Full Device | 2,625,438                  | 25,280                  |
MD5         | 95,962 (3.7%)              | 1,280 (5.1%)            | 405 (1.6%)
SHA-1       | 97,619 (3.7%)              | 1,280 (5.1%)            | 472 (1.9%)

1.6–1.9% resource usage versus 3.7% bitstream filesize
V-II: 320,597 bytes versus V-4: 95,962 bytes (70% reduction)
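The percentages quoted for the two devices can be re-derived from the raw byte and slice counts in the tables. The short sketch below does that arithmetic; the dictionary layout and function names are illustrative, while the numbers are copied from the slides.

```python
# Figures copied from the two device tables: (bitstream bytes, slices used).
DEVICES = {
    "xc2vp30": {
        "full_bytes": 1_448_817,
        "full_slices": 13_696,
        "modules": {"MD5": (320_597, 389), "SHA-1": (356_702, 457)},
    },
    "xc4vfx60": {
        "full_bytes": 2_625_438,
        "full_slices": 25_280,
        "modules": {"MD5": (95_962, 405), "SHA-1": (97_619, 472)},
    },
}


def percent(part, whole):
    """Percentage of whole represented by part, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


def module_ratios(device, module):
    """Return (bitstream % of full device, slice usage % of full device)."""
    info = DEVICES[device]
    bitstream_bytes, used_slices = info["modules"][module]
    return (percent(bitstream_bytes, info["full_bytes"]),
            percent(used_slices, info["full_slices"]))


def md5_reduction_percent():
    """Reduction in MD5 partial bitstream size from xc2vp30 to xc4vfx60."""
    old = DEVICES["xc2vp30"]["modules"]["MD5"][0]
    new = DEVICES["xc4vfx60"]["modules"]["MD5"][0]
    return percent(old - new, old)


for device in DEVICES:
    for module in DEVICES[device]["modules"]:
        print(device, module, module_ratios(device, module))
print("MD5 bitstream reduction (%):", md5_reduction_percent())
```

On the older device the partial bitstream is far larger than the logic it carries would suggest (about 22% of the file for under 3% of the slices); on the newer device the two ratios are much closer, which is the point of the finer configuration frame.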

27 Dynamic Processor Allocation Strategies
- Increase Reconfigurable Areas from 1 to 8
- Implement Adaptable Architecture for Video Processing Functions
  - Discrete Cosine Transform (DCT)
  - Motion Estimation
- Video functions are sufficiently different in resources to require reconfiguration

28 Dynamic Processor Allocation Strategies Location of 8 PEs on a V4SX device

29 Dynamic Processor Allocation Strategies

PE  | Slices within Area (Slice Utilization) | Bitstream Filesize (bytes)
PE0 | 320 (94.38%)                           | 22,306
PE1 | 384 (95.05%)                           | 27,794
PE2 | 384 (84.38%)                           | 28,306
PE3 | 384 (92.97%)                           | 28,158
PE4 | 320 (91.25%)                           | 22,306
PE5 | 384 (88.54%)                           | 27,354
PE6 | 384 (87.76%)                           | 27,618
PE7 | 384 (95.57%)                           | 27,654

30 Dynamic Processor Allocation Strategies

Configuration              | Bitstream Filesize | Configuration Time
Non-PR
1x1 Full 2D-DCT            | 1,712,614 bytes    | 17 ms
4x4 DCT & 4 ME PEs         | 1,712,614 bytes    | 17 ms
8x8 Full 2D-DCT            | 1,712,614 bytes    | 17 ms
3 H/W Arrangements         | 4.90 MB            | 17 ms / 17 ms (Best/Worst)
PR
Initial (8x8)              | 1,712,614 bytes    | 17 ms
8 Full Precision PEs       | 8 × 28,306 bytes   | 8 × ms
8 Partial Precision PEs    | 8 × 28,306 bytes   | 8 × ms
8 Empty PEs                | 8 × 10,586 bytes   | 8 × ms
16 H/W Arrangements        | 2.15 MB            | 0.106 / 2.265 ms (Best/Worst)
PR
Initial (8x8)              | 1,712,614 bytes    | 17 ms
8 Full Precision PEs       | 8 × 28,306 bytes   | 8 × ms
8 Partial Precision PEs    | 8 × 28,306 bytes   | 8 × ms
8 Empty PEs                | 8 × 10,586 bytes   | 8 × ms
8 Motion Estimation PEs    | 8 × 28,306 bytes   | 8 × ms
80 H/W Arrangements        | 2.36 MB            | 0.106 / 2.265 ms (Best/Worst)
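The storage totals in this table follow from adding up the listed bitstreams, as the sketch below shows. Two things are assumptions rather than statements from the slides: that "MB" means 2^20 bytes (which reproduces the 4.90, 2.15, and 2.36 figures), and that the 55-fold storage claim made later in the conclusions compares 80 full-device bitstreams against the 80-arrangement PR library. That comparison happens to reproduce the number, but the slides do not spell out the derivation. Function names are illustrative.

```python
FULL_BITSTREAM = 1_712_614    # bytes, full-device configuration
PE_BITSTREAM = 28_306         # bytes, per-PE partial bitstream as listed
EMPTY_PE_BITSTREAM = 10_586   # bytes, per-PE empty bitstream
NUM_PES = 8
MIB = 1024 * 1024             # assumption: "MB" on the slide means 2**20 bytes


def non_pr_storage(arrangements):
    """Bytes needed when every arrangement is a full-device bitstream."""
    return arrangements * FULL_BITSTREAM


def pr_storage(pe_variants):
    """Bytes for one initial full bitstream, pe_variants sets of 8 PE
    bitstreams, and one set of 8 empty-PE bitstreams."""
    return (FULL_BITSTREAM
            + pe_variants * NUM_PES * PE_BITSTREAM
            + NUM_PES * EMPTY_PE_BITSTREAM)


print("Non-PR, 3 arrangements (MB):", round(non_pr_storage(3) / MIB, 2))
print("PR, 16 arrangements (MB):   ", round(pr_storage(2) / MIB, 2))
print("PR, 80 arrangements (MB):   ", round(pr_storage(3) / MIB, 2))

# Assumed reading of the storage claim: 80 full bitstreams versus the PR library.
print("Storage ratio:", round(non_pr_storage(80) / pr_storage(3), 1))

# Configuration-time ratios against the 17 ms full configuration.
print("Time ratio (worst case):", round(17 / 2.265, 1))
print("Time ratio (best case): ", round(17 / 0.106, 1))
```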

31 Dynamic Processor Allocation Strategy
Benefits of Partial Reconfiguration
- Reconfiguration: time-multiplex between functions (extend the number of available resources with time)
- Partial: module granularity reduced
  - Unchanged portion of FPGA is not affected by configuration
  - Smaller bitstream filesize
  - Shorter reconfiguration time
  - Lower storage requirements
- Result: significantly more combinations of hardware arrangements with similar storage requirements

32 Conclusion and Future Work
Evolvable Hardware
- Non-deterministic methods can repair faulty digital circuits
- Time required justified by ability to exploit faults
- Increase complete repair occurrence rate 5-fold
- Future Improvements
  - make use of fault location
  - optimize genetic algorithm parameters

33 Conclusion and Future Work
Partial Reconfiguration
- Newer partial reconfiguration flow allows rectangular areas
  - Allows static resources to maximize FPGA area
- Newer architecture allows:
  - multiple rectangular areas within one column of resources
  - reduced configuration granularity for modules
  - storage and configuration time reduced to roughly 30% of the V-II values (the MD5 bitstream shrinks from 320,597 to 95,962 bytes, a 70% reduction)

34 Conclusion and Future Work
Dynamic Processors
- Utilizes newer software design flow and newer FPGA hardware architecture
  - Storage reduced 55-fold
  - Time reduced 8- to 160-fold
- Benefits make reconfiguration possible for fast processes such as video functions
- Time multiplexing may enable smaller FPGA devices to compete with larger devices not utilizing partial reconfiguration

35 Conclusion and Future Work
Future Work
- Develop self-contained partial reconfiguration solution
- Continue to challenge and improve reconfiguration process and hardware design
  - enable FPGAs to be standard hardware platform for evolvable/adaptable systems

36 Publication
Huang, J., Parris, M., Lee, J., and DeMara, R. F. Scalable FPGA Architecture for DCT Computation Using Dynamic Partial Reconfiguration. Accepted to the International Conference on Engineering of Reconfigurable Systems and Algorithms.

37 Previous Work Spare Resources: Sub-PLB Spares

38 Previous Work Offline Recovery: Incremental Rerouting

39 Previous Work Online Recovery: Online BIST

40 Evolvable Hardware Optimization Strategies
