FPGA-Based System Design: Chapter 7. Copyright © 2004 Prentice Hall PTR. Topics: Hardware/software co-design.


1 Topics
• Hardware/software co-design.

2 Why put CPUs on FPGAs?
• Shrink a board to a chip.
• What CPUs do best:
  – Irregular code.
  – Code that takes advantage of a highly optimized datapath.
• What FPGAs do best:
  – Data-oriented computations.
  – Computations with local control.

3 System design
• True concurrency increases system performance.
  – CPU and accelerator should run in parallel.
• CPU cost is a non-linear function of performance.
  – Accelerator will be smaller, faster, lower power.

4 Hardware/software partitioning
[Figure: code partitioned between the CPU and the accelerator.]
    if (foo < 8) {
        for (i=0; i<N; i++)
            x[i] = y[i]*z[i];
    }
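One way to read the partitioning on this slide is sketched below: the CPU keeps the irregular control decision and the data-oriented loop is handed to the accelerator. The function names `accel_vector_multiply` and `run_partitioned` are hypothetical, and the accelerator is modeled as an ordinary C function only so the sketch compiles; in a real system that loop would be implemented in FPGA fabric.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical software stand-in for the accelerator. In a real design
   this loop would be hardware, not code running on the CPU. */
void accel_vector_multiply(int *x, const int *y, const int *z, size_t n)
{
    assert(x != NULL && y != NULL && z != NULL);
    for (size_t i = 0; i < n; i++)
        x[i] = y[i] * z[i];
}

/* CPU side: evaluates the condition and offloads the loop.
   Returns 1 if the work was sent to the accelerator, 0 otherwise. */
int run_partitioned(int foo, int *x, const int *y, const int *z, size_t n)
{
    if (foo < 8) {
        accel_vector_multiply(x, y, z, n);
        return 1;
    }
    return 0;
}
```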

5 Methodology
• Measure the application.
• Identify what to put onto the accelerator.
• Build interfaces.

6 Concurrency
• Concurrent applications provide the most speedup.
[Figure: the CPU runs "if (a > b)..." while the accelerator runs "x[i] = y[i] * z[i]"; there are no data dependencies between them.]

7 Concurrency analysis
• Data dependencies.
    z = x * y;
    w = z - v;
• Control dependencies.
    if (a < b) u = r + s;
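A minimal sketch of the data-dependence half of this analysis: each statement is summarized by the set of variables it reads and writes, and two statements may run concurrently only if neither touches what the other writes. The bitmask encoding, `stmt_t`, and `stmts_independent` are illustrative assumptions, not part of the slides, and control dependencies are not modeled here.

```c
#include <stdbool.h>

/* One bit per program variable; the names mirror the slide's examples. */
enum {
    VAR_X = 1 << 0, VAR_Y = 1 << 1, VAR_Z = 1 << 2, VAR_W = 1 << 3,
    VAR_V = 1 << 4, VAR_R = 1 << 5, VAR_S = 1 << 6, VAR_U = 1 << 7
};

/* Read and write sets of a single statement. */
typedef struct {
    unsigned reads;
    unsigned writes;
} stmt_t;

/* True if a and b have no read-after-write, write-after-read,
   or write-after-write conflict. */
bool stmts_independent(stmt_t a, stmt_t b)
{
    bool raw = (a.writes & b.reads) != 0;
    bool war = (a.reads & b.writes) != 0;
    bool waw = (a.writes & b.writes) != 0;
    return !(raw || war || waw);
}
```

With `z = x * y` encoded as reads {x, y} and writes {z}, and `w = z - v` as reads {z, v} and writes {w}, the check reports a dependence, so the second statement must wait for the first.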

8 Partitioning
• Can divide the application into several processes that run concurrently.
• Process partitioning exposes opportunities for parallelism.
[Figure: code divided into Process 1, Process 2, and Process 3: "if (i>b) …", "for (i=0; i<N; i++) …", "for (j=0; j<N; j++)..."]

9 Partitioning programs
• Reasonable partitioning points:
  – If statements, etc.
  – Loop nests.

10 Multi-threaded systems
• Single thread: [figure not preserved]
• Multi-thread: [figure not preserved]

11 Performance analysis
• Single threaded:
  – Find longest possible execution path.
• Multi-threaded with no synchronization:
  – Find the longest of several execution paths.
• Multi-threaded with synchronization:
  – Find the worst-case synchronization conditions.

12 Multi-threaded performance analysis
• Synchronization causes the delay along one path to affect the delay along another.
[Figure: execution paths with delays t_a, t_b, t_c, t_d joined at a synchronization point.]
Delay = max(t_a, t_b) + t_d
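The delay expression on this slide can be evaluated directly. The sketch below assumes t_a and t_b are the delays of the two paths leading into the synchronization point and t_d is the delay of the path that follows it; the function name `sync_delay` is hypothetical.

```c
#include <assert.h>

/* Delay through a synchronization point: the slower of the two
   incoming paths gates the start of the outgoing path. */
double sync_delay(double t_a, double t_b, double t_d)
{
    assert(t_a >= 0.0 && t_b >= 0.0 && t_d >= 0.0);
    double slowest_in = (t_a > t_b) ? t_a : t_b;
    return slowest_in + t_d;
}
```

For example, with t_a = 3, t_b = 5, and t_d = 2 the path after the synchronization point cannot finish before time 7, even though its own thread was ready at time 3.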

13 Control
• Need to signal between CPU and accelerator.
  – Data ready.
  – Complete.
• Implementations:
  – Shared memory.
  – Handshake.
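A minimal sketch of the two signals carried through shared memory, assuming a single mailbox with a "data ready" flag owned by the CPU and a "complete" flag owned by the accelerator. The `mailbox_t` layout and the function names are hypothetical, and the accelerator is modeled in software (it squares its operand) purely for illustration. A real system would also need cache flushes or memory barriers, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared-memory mailbox between CPU and accelerator. */
typedef struct {
    volatile bool data_ready; /* set by CPU, cleared by accelerator */
    volatile bool complete;   /* set by accelerator, cleared by CPU */
    volatile int operand;
    volatile int result;
} mailbox_t;

/* CPU: publish the operand, then raise "data ready". */
void cpu_submit(mailbox_t *mb, int operand)
{
    assert(mb != NULL);
    mb->operand = operand;
    mb->complete = false;
    mb->data_ready = true;
}

/* Accelerator (software model): consume the operand if one is waiting. */
void accel_service(mailbox_t *mb)
{
    assert(mb != NULL);
    if (!mb->data_ready)
        return;
    mb->result = mb->operand * mb->operand;
    mb->data_ready = false;
    mb->complete = true;
}

/* CPU: poll for "complete"; returns true and stores the result if done. */
bool cpu_collect(mailbox_t *mb, int *out)
{
    assert(mb != NULL && out != NULL);
    if (!mb->complete)
        return false;
    *out = mb->result;
    mb->complete = false;
    return true;
}
```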

14 Keeping the accelerator fed
• Must get data in, must get data out.
• Data transfer costs:
  – flush CPU cache;
  – device driver;
  – bus transactions.

15 Memory buffers
• Must keep accelerator fed.
  – Buffer size in accelerator depends on amount of data needed at a time, delays in obtaining needed values.
• Streaming generally requires small buffers:
  – x[i] = y[i] * z[i];
• Values with long lifetimes need more buffer space.
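For the streaming case, a small fixed-depth FIFO is often enough, because each value is consumed shortly after it arrives. The sketch below is a software model under that assumption; `FIFO_DEPTH`, `fifo_t`, and the function names are hypothetical, and the depth of 4 is an arbitrary illustrative choice, not a figure from the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FIFO_DEPTH 4 /* assumed depth, chosen only for illustration */

/* Fixed-size circular buffer modeling an accelerator input FIFO. */
typedef struct {
    int data[FIFO_DEPTH];
    size_t head;  /* index of the oldest element */
    size_t tail;  /* index of the next free slot */
    size_t count; /* number of stored elements */
} fifo_t;

void fifo_init(fifo_t *f)
{
    assert(f != NULL);
    f->head = f->tail = f->count = 0;
}

/* Returns false when the buffer is full: the producer must stall. */
bool fifo_push(fifo_t *f, int value)
{
    assert(f != NULL);
    if (f->count == FIFO_DEPTH)
        return false;
    f->data[f->tail] = value;
    f->tail = (f->tail + 1) % FIFO_DEPTH;
    f->count++;
    return true;
}

/* Returns false when the buffer is empty: the accelerator is starved. */
bool fifo_pop(fifo_t *f, int *value)
{
    assert(f != NULL && value != NULL);
    if (f->count == 0)
        return false;
    *value = f->data[f->head];
    f->head = (f->head + 1) % FIFO_DEPTH;
    f->count--;
    return true;
}
```

A failed push corresponds to the CPU side waiting on a full buffer, and a failed pop to an accelerator that is not being kept fed.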

16 Allocation
• How do we decide what goes on the CPU, what goes on the FPGA?
• Allocation puts functions on the CPU or FPGA.

17 Speedup
• Speedup (time saved) for one iteration:
  – t_SW - t_HW - t_I - t_O
• May be able to set up many iterations at once:
  – N*(t_SW - t_HW) - t_I - t_O
• Here t_SW and t_HW are the software and hardware execution times, and t_I and t_O are the input and output transfer times.
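A small worked sketch of the speedup estimate. It assumes the gain is measured as time saved, that is, software execution time minus hardware execution time minus the input and output transfer times, so a positive result means the accelerator helps. The function and parameter names are hypothetical.

```c
#include <assert.h>

/* Time saved for a single iteration: the transfer costs are paid
   on every call. */
double speedup_one(double t_sw, double t_hw, double t_in, double t_out)
{
    assert(t_sw >= 0.0 && t_hw >= 0.0 && t_in >= 0.0 && t_out >= 0.0);
    return t_sw - t_hw - t_in - t_out;
}

/* Time saved when n iterations are set up at once: the transfer
   costs are paid only once for the whole batch. */
double speedup_batched(unsigned n, double t_sw, double t_hw,
                       double t_in, double t_out)
{
    assert(t_sw >= 0.0 && t_hw >= 0.0 && t_in >= 0.0 && t_out >= 0.0);
    return (double)n * (t_sw - t_hw) - t_in - t_out;
}
```

With t_sw = 10, t_hw = 2, t_in = 3, and t_out = 1 (arbitrary units), one iteration saves 4 units, while a batch of 5 iterations saves 5 * 8 - 4 = 36 units, which shows why amortizing the transfer cost matters.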

18 Drivers
• Need interface between CPU and accelerator:
  – transfer data values;
  – start, stop computation.
• If computation time is very predictable, a simpler communication scheme may be possible.

19 Debugging
• Hard to test a CPU/accelerator system:
  – Hard to control and observe the accelerator without the CPU.
  – Software on CPU may have bugs.
• Build separate test benches for CPU code, accelerator.
• Test integrated system after components have been tested.
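One common way to build a separate test bench for the accelerator is to compare its output against a trusted software reference model on the same inputs. The sketch below assumes the accelerator computes the element-wise product used elsewhere in these slides; `vecmul_fn`, `reference_vecmul`, `buggy_vecmul`, and `count_mismatches` are hypothetical names, and the "buggy" device under test is included only to show a failure being caught.

```c
#include <assert.h>
#include <stddef.h>

/* Common signature for the reference model and the device under test. */
typedef void (*vecmul_fn)(int *x, const int *y, const int *z, size_t n);

/* Trusted software reference ("golden") model. */
void reference_vecmul(int *x, const int *y, const int *z, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] = y[i] * z[i];
}

/* Deliberately wrong model: adds instead of multiplying. */
void buggy_vecmul(int *x, const int *y, const int *z, size_t n)
{
    for (size_t i = 0; i < n; i++)
        x[i] = y[i] + z[i];
}

/* Runs the device under test and the reference on the same inputs and
   returns the number of elements that differ. The caller supplies the
   two scratch buffers, each holding at least n elements. */
size_t count_mismatches(vecmul_fn dut, const int *y, const int *z,
                        int *got, int *expected, size_t n)
{
    assert(dut != NULL && y != NULL && z != NULL);
    assert(got != NULL && expected != NULL);
    dut(got, y, z, n);
    reference_vecmul(expected, y, z, n);
    size_t bad = 0;
    for (size_t i = 0; i < n; i++)
        if (got[i] != expected[i])
            bad++;
    return bad;
}
```

Note that an input pair such as (2, 2) gives the same answer for addition and multiplication, so a test bench needs enough varied vectors to expose a bug like this one.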

