Instructor: Dr. Phillip Jones

1 Instructor: Dr. Phillip Jones
CPRE 583 Reconfigurable Computing
Lecture 11: Fri 10/1/2010 (Design Patterns)
Instructor: Dr. Phillip Jones
Reconfigurable Computing Laboratory, Iowa State University, Ames, Iowa, USA

2 Announcements/Reminders
HW2: Due Wed 10/6
  Problem 2 will have a separate deadline (to be announced)
MP2: Due Fri 10/1 (you can work in pairs)
  Bonus: 3 pts for working in simulation, 2 pts in hardware
  For hardware you will have to take a close look at tcpdump, as your circuit is likely being exposed to a condition that is not tested by the testbench.
  Make sure to read the README file in the MP2 distribution
    Contains info on how to fix a Gigabit core licensing issue that ISE has
Start thinking of class projects and forming teams
  Submit teams and project ideas: Mon 10/11 midnight
  Project proposal presentations: Wed 10/20

3 Initial Project Proposal Slides (5-10 slides)
Project team list: Name, Responsibility (who is project leader)
  Team size: 3-4 (5 case-by-case)
Project idea
  Motivation (why is this interesting, useful)
  What will be the end result
  High-level picture of final product
High-level plan
  Break project into milestones
  Provide initial schedule: I would initially schedule aggressively to have the project complete by Thanksgiving. Issues will pop up to cause the schedule to slip.
  System block diagrams
  High-level algorithms (if any)
  Concerns
    Implementation
    Conceptual
Research papers related to your project idea

4 Projects Ideas: Relevant conferences
FPL, FPT, FCCM, FPGA, DAC, ICCAD, Reconfig, RTSS, RTAS, ISCA, Micro, Super Computing, HPCA, IPDPS


6 Weekly Project Updates
The current state of your project write-up
  Even in the early stages of the project you should be able to write a rough draft of the Introduction and Motivation sections
The current state of your final presentation
  Your initial project proposal presentation (due Wed 10/20) should make a starting point for your final presentation
What things are working and not working
What roadblocks you are running into

7 Projects: Target Timeline
Teams formed and idea: Mon 10/11
  Project idea in PowerPoint, 3-5 slides
    Motivation (why is this interesting, useful)
    What will be the end result
    High-level picture of final product
  Project team list: Name, Responsibility
High-level plan/proposal: Wed 10/20
  PowerPoint, 5-10 slides
    System block diagrams
    High-level algorithms (if any)
    Concerns
      Implementation
      Conceptual
    Related research papers (if any)

8 Projects: Target Timeline
Work on projects: 10/ /8
  Weekly update reports
    More information on updates will be given
Presentations: Last Wed/Fri of class
  Present / demo what is done at this point
  15-20 minutes (depends on number of projects)
Final write-up and software/hardware turned in: Day of final (TBD)

9 Project Grading Breakdown
50% Final Project Demo
30% Final Project Report
  30% of your project report grade will come from your 5-6 project updates, due Fridays at midnight
20% Final Project Presentation

10 Overview
Class Project (example from 2008)
Common Design Patterns

11 What you should learn
Introduction to common Design Patterns & Compute Models

12 Outline
Design patterns
  Why are they useful?
  Examples
Compute models

14 References
[1] Scott Hauck, Andre DeHon. Reconfigurable Computing (2008). Chapter 5: Compute Models and System Architectures.
[2] Andre DeHon. Design Patterns for Reconfigurable Computing. FCCM 2004.
[3] Lawrence Snyder. Type Architectures, Shared Memory, and the Corollary of Modest Potential. Annual Review of Computer Science (1986).
[4] E. Gamma. Design Patterns: Abstraction and Reuse of Object Oriented Design (1992).
[5] C. Alexander. The Timeless Way of Building (1979).

15 Design Patterns
Design patterns are solutions to recurring problems.

16 Reconfigurable Hardware Design
“Building good reconfigurable designs requires an appreciation of the different costs and opportunities inherent in reconfigurable architectures” [2]
“How do we teach programmers and designers to design good reconfigurable applications and systems?” [2]
Traditional approach:
  Read lots of papers for different applications
  Over time, figure out ad hoc tricks
Better approach?
  Use design patterns to provide a more systematic way of learning how to design
  It has been shown in other realms that studying patterns is useful
    Object-oriented software [4]
    Architecture of buildings [5]

17 Common Language
Provides a means to organize and structure the solution to a problem
Provides a common ground from which to discuss a given design problem
Enables solutions to be shared in a consistent manner (reuse)

18 Describing a Design Pattern [2]
10 attributes suggested by Gamma (Design Patterns, 1995)
  Name: Standard name
  Intent: What problem is being addressed, and how?
  Motivation: Why use this pattern
  Applicability: When can this pattern be used
  Participants: What components make up this pattern
  Collaborations: How do components interact
  Consequences: Trade-offs
  Implementation: How to implement
  Known Uses: Real examples of where this pattern has been used
  Related Patterns: Similar patterns, patterns that can be used in conjunction with this pattern, and when you would choose a similar pattern instead of this pattern

19 Example Design Pattern
Coarse-grain Time-Multiplexing
Template Specialization

20 Coarse-grain Time-Multiplexing
[Figure: computation graph (A, B, M1, M2, M3) split into Configuration 1 and Configuration 2, with a Temp buffer]

21 Coarse-grain Time-Multiplexing
Name: Coarse-grained Time-Multiplexing
Intent: Enable a design that is too large to fit on a chip all at once to run as multiple subcomponents
Motivation: Method to share limited fixed resources to implement a design that is too large as a whole

22 Coarse-grain Time-Multiplexing
Applicability (Requirements):
  Configuration can be done on a large time scale
  One of the following holds:
    No feedback loops in the computation
    Feedback loop only spans the current configuration
    Feedback loop is very slow
Participants:
  Computational graph
  Control algorithm
Collaborations:
  Control algorithm manages when sub-graphs are loaded onto the device

23 Coarse-grain Time-Multiplexing
Consequences:
  Platforms often take millions of cycles to reconfigure
  Need an application that will run for tens of millions of cycles before needing to reconfigure
  May need large buffers to store data during a reconfiguration
Known Uses:
  Video processing pipeline: [Villasenor] “Video Communications using Rapidly Reconfigurable Hardware”, Transactions on Circuits and Systems for Video Technology, 1995
  Automatic target recognition: [Villasenor] “Configurable Computer Solutions for Automatic Target Recognition”, FCCM 1996
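
The trade-off between reconfiguration time and run time can be made concrete with a little arithmetic. The sketch below uses illustrative cycle counts (assumptions, not measurements of any particular platform) and a hypothetical helper `useful_fraction` to show what fraction of total cycles is spent computing rather than reconfiguring.

```python
def useful_fraction(run_cycles: int, reconfig_cycles: int) -> float:
    """Fraction of total cycles spent computing rather than reconfiguring."""
    return run_cycles / (run_cycles + reconfig_cycles)


# Hypothetical platform that needs 1 million cycles to reconfigure.
RECONFIG_CYCLES = 1_000_000

for run_cycles in (1_000_000, 10_000_000, 100_000_000):
    fraction = useful_fraction(run_cycles, RECONFIG_CYCLES)
    print(f"run {run_cycles:>11,} cycles -> {fraction:.1%} useful")
```

Under these assumed numbers, running for only as long as a reconfiguration takes wastes half the time, which is why the application needs to run much longer than the reconfiguration itself.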

24 Coarse-grain Time-Multiplexing
Implementation:
  Break design into multiple sub-graphs that can be configured onto the platform in sequence
  Design a controller to orchestrate the configuration sequencing
  Take steps to minimize configuration time
Related patterns:
  Streaming Data
  Queues with Back-pressure
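
The implementation steps above can be sketched as a small software model. This is only an illustration, not the lecture's code: each configuration is modelled as a plain Python function, and the names `run_time_multiplexed`, `config_1`, and `config_2` are hypothetical. The list `temp` stands in for the buffer that holds intermediate data while the device is reconfigured.

```python
from typing import Callable, List, Sequence

# Each "configuration" is modelled as a function that maps a block of
# input bytes to a block of output bytes (a stand-in for one sub-graph).
Configuration = Callable[[List[int]], List[int]]


def run_time_multiplexed(configs: Sequence[Configuration],
                         block: List[int]) -> List[int]:
    """Apply the sub-graphs one after another to a whole block of data.

    `temp` holds the complete intermediate result between stages, which is
    the role a Temp buffer plays while the next configuration is loaded.
    """
    temp = list(block)
    for stage in configs:      # controller: sequence the configurations
        temp = stage(temp)     # run the currently loaded sub-graph
    return temp


# Hypothetical two-stage design operating on 8-bit values.
def config_1(data: List[int]) -> List[int]:
    return [(x * 3) & 0xFF for x in data]


def config_2(data: List[int]) -> List[int]:
    return [(x + 1) & 0xFF for x in data]


print(run_time_multiplexed([config_1, config_2], [1, 2, 3]))  # [4, 7, 10]
```

Processing a whole block per stage, rather than one value at a time, is what amortizes each reconfiguration over many cycles of useful work.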


26 Coarse-grain Time-Multiplexing
Assume:
  1.) Reconfiguration takes 10 thousand clocks
  2.) 100 MHz clock
  3.) We need to process for 100x the time spent in reconfiguration to get the needed speedup
  4.) A and B each produce one byte per clock
[Figure: computation graph (A, B, M1, M2, M3) split into Configuration 1 and Configuration 2, with a Temp buffer]

27 Coarse-grain Time-Multiplexing
Assume:
  1.) Reconfiguration takes 10 thousand clocks
  2.) 100 MHz clock
  3.) We need to process for 100x the time spent in reconfiguration to get the needed speedup
  4.) A and B each produce one byte per clock
[Figure: computation graph (A, B, M1, M2, M3) split into Configuration 1 and Configuration 2, with a Temp buffer]
What constraint does this place on Temp?
  1 MB buffer
What if the data path is changed from 8-bit to 64-bit?
  8 MB buffer
  Likely need off-chip memory
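
The buffer sizes on this slide follow directly from the four assumptions. The sketch below reproduces the arithmetic for a single stream that produces a fixed number of bytes per clock; the helper name `temp_buffer_bytes` is hypothetical, and "1 MB" is taken to mean 10^6 bytes.

```python
def temp_buffer_bytes(reconfig_clocks: int, run_multiple: int,
                      bytes_per_clock: int) -> int:
    """Bytes one stream produces during a single processing interval."""
    run_clocks = reconfig_clocks * run_multiple
    return run_clocks * bytes_per_clock


RECONFIG_CLOCKS = 10_000   # assumption 1: reconfiguration takes 10k clocks
CLOCK_HZ = 100_000_000     # assumption 2: 100 MHz clock
RUN_MULTIPLE = 100         # assumption 3: run 100x the reconfiguration time

run_clocks = RECONFIG_CLOCKS * RUN_MULTIPLE
print(run_clocks)                                            # 1000000 clocks
print(run_clocks / CLOCK_HZ)                                 # 0.01 seconds
print(temp_buffer_bytes(RECONFIG_CLOCKS, RUN_MULTIPLE, 1))   # 1000000 (8-bit path)
print(temp_buffer_bytes(RECONFIG_CLOCKS, RUN_MULTIPLE, 8))   # 8000000 (64-bit path)
```

One million clocks at one byte per clock gives the 1 MB figure; widening the data path to 64 bits (8 bytes per clock) gives 8 MB.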

28 Template Specialization
[Figure: three LUT templates, each with inputs A(1), A(0) and outputs C(3), C(2), C(1), C(0): empty LUTs, LUTs filled for multiply by 3 (entries 3, 6, 9), and LUTs filled for multiply by 5 (entries 5, 10, 15)]

29 Template Specialization
Name: Template Specialization
Intent: Reduce the size or time needed for a computation
Motivation: Use early-bound data and slowly changing data to reduce circuit size and execution time

30 Template Specialization
Applicability:
  When circuit specialization can be adapted quickly
  Example: Can treat LUTs as small memories that can be written. No interconnect modifications
Participants:
  Template cell: Contains specialization configuration
  Template filler: Manages what and how a configuration is written to a template cell
Collaborations:
  Template filler manages template cell

31 Template Specialization
Consequences:
  Cannot optimize as much as when a circuit is fully specialized for a given instance
  Overhead is needed to allow the template to implement several specializations
Known Uses:
  Multiply-by-Constant
  String Matching
Implementation (Multiply-by-Constant):
  Use LUT as memory to store the answer
  Use a controller to update this memory when a different constant should be used
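
A minimal software model of the multiply-by-constant implementation is sketched below. The class names `TemplateCell` and `TemplateFiller` borrow the participant names from the pattern description, but the code itself is an illustrative assumption, not the lecture's implementation: the "LUT" is just a Python list used as a small writable memory.

```python
class TemplateCell:
    """A LUT treated as a small writable memory (the template cell)."""

    def __init__(self, input_bits: int) -> None:
        self.table = [0] * (1 << input_bits)   # "empty" LUT contents

    def lookup(self, a: int) -> int:
        return self.table[a]


class TemplateFiller:
    """Decides what gets written into a template cell (the template filler)."""

    def fill_multiply_by(self, cell: TemplateCell, constant: int) -> None:
        for a in range(len(cell.table)):
            cell.table[a] = a * constant       # precomputed answer for input a


cell = TemplateCell(input_bits=2)
filler = TemplateFiller()

filler.fill_multiply_by(cell, 3)
print(cell.table)        # [0, 3, 6, 9]

# Respecialize for a different constant by rewriting only the table contents.
filler.fill_multiply_by(cell, 5)
print(cell.lookup(3))    # 15
```

Only the table contents change between specializations; the cell's shape (number of inputs) stays fixed, which mirrors the requirement that no interconnect modifications are needed.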

32 Template Specialization
Related patterns: Constructor, Exception, Template


34 Template Specialization
[Figure: multiply-by-3 template: four LUTs with inputs A(1), A(0) drive outputs C(3), C(2), C(1), C(0) (entries 3, 6, 9)]
Multiply by a constant of 2: support inputs of 0-7

35 Template Specialization
[Figure: multiply-by-3 template: four LUTs with inputs A(1), A(0) drive outputs C(3), C(2), C(1), C(0) (entries 3, 6, 9)]

36 Template Specialization
[Figure: multiply-by-3 template (inputs A(1), A(0); entries 3, 6, 9) alongside a multiply-by-2 template with inputs A(2), A(1), A(0) (entries 2, 4, 6, 8, 10, 12, 14); outputs C(3), C(2), C(1), C(0)]
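
Since each output bit C(3)..C(0) comes from its own LUT, the contents of each LUT are one column of the product table. The sketch below derives those columns for the multiply-by-2 case with inputs 0-7; the function name `lut_bit_columns` is hypothetical, and the code assumes one single-output LUT per output bit.

```python
from typing import List


def lut_bit_columns(constant: int, input_bits: int,
                    output_bits: int) -> List[List[int]]:
    """Return one truth-table column per output bit, most significant first.

    Each column lists, for inputs 0 .. 2**input_bits - 1, the value of that
    output bit of (input * constant).
    """
    products = [a * constant for a in range(1 << input_bits)]
    if max(products) >= (1 << output_bits):
        raise ValueError("product does not fit in the requested output width")
    return [[(p >> bit) & 1 for p in products]
            for bit in reversed(range(output_bits))]


# Multiply by 2 with a 3-bit input (0-7) and a 4-bit output C(3)..C(0).
names = ("C(3)", "C(2)", "C(1)", "C(0)")
for name, column in zip(names, lut_bit_columns(2, 3, 4)):
    print(name, column)
```

The same call with `constant=3, input_bits=2` gives the columns for the multiply-by-3 template with inputs A(1), A(0).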

37 Catalog of Patterns (Just a start) [2]
[2] identifies 89 patterns
Area-Time Tradeoff
  Basic (Implementation): Coarse-grain Time-Multiplex
  Parallel (Expression): Dataflow, Data Parallel
  Parallel (Implementation): SIMD, Communicating FSM
Reducing Area or Time
  Reuse Hardware (Implementation): Pipelining
  Specialization (Implementation): Template
Communications
  Layout (Expression/Implementation): Systolic
Memory
Numbers and Functions


39 Next Lecture Continue Compute Models

40 Lecture Notes:

