Power and Temperature-Aware Microarchitecture: A preliminary look. Karthik Ramani, University of Utah, September 28th, 2004.


1 Power and Temperature-Aware Microarchitecture: A preliminary look
Karthik Ramani, University of Utah, September 28th, 2004

2 Motivation
- ITRS Roadmap: reasons for increasing power consumption
- Higher chip operating frequencies
- Increased gate leakage of transistors
- Higher interconnect capacitances and resistances
  - Lack of an interconnect architecture design tool until 2009
  - Inability of the interconnect to scale for performance beyond 2009

3 Heterogeneous Interconnects: A starting point
- Two sets of interconnects
  - Low-delay, high-power wires
  - Low-power wires (high delay)
- Easier to target instructions
- Augurs well for a more sophisticated model

4 Interconnect transfers: Types
- Bypassed register value
- Ready register value
- Address transfer
- Store value
- Load value

5 Bypassed Register Values
- Operands produced in a cluster that are immediately required by another cluster
- Criticality based on two factors
  - Operand arrival time at the cluster
  - Actual issue time of the sourcing instruction
- Criticality changes at runtime
  - Needs a dynamic predictor
[Figure: Rename & Dispatch feeding three clusters, each with an IQ, Regfile, and FU. Labels: producing instruction completing execution at cycle 120; consumer instruction dispatched at cycle 100.]

6 The Data Criticality Predictor
- A table indexed by the lower-order bits of the instruction address, updated dynamically to indicate the criticality of data
- Difference between arrival time and usage calculated for each operand of an instruction
- Difference < threshold: critical
- Difference > threshold: non-critical
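The predictor on slide 6 can be sketched as a small table of criticality bits. This is a minimal illustration, not the author's implementation: the class name, the table size (`index_bits`), the threshold of 4 cycles, the 4-byte instruction alignment, and the treatment of a difference exactly equal to the threshold (non-critical here) are all assumptions, since the slides do not specify them.

```python
class CriticalityPredictor:
    """Criticality bits indexed by the low-order bits of an instruction address."""

    def __init__(self, index_bits=10, threshold=4):
        # index_bits and threshold are illustrative assumptions, not slide values.
        self.mask = (1 << index_bits) - 1
        self.threshold = threshold
        # Untrained entries default to critical so no operand is delayed by mistake.
        self.table = [True] * (1 << index_bits)

    def _index(self, pc):
        # Assumes 4-byte instructions: drop the byte offset, keep the low-order bits.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        """Return True if data from this instruction is predicted critical."""
        return self.table[self._index(pc)]

    def update(self, pc, arrival_cycle, use_cycle):
        """Record the observed gap between operand arrival and its actual use."""
        difference = use_cycle - arrival_cycle
        # Difference < threshold: critical; otherwise non-critical.
        self.table[self._index(pc)] = difference < self.threshold


predictor = CriticalityPredictor()
# Operand arrived at cycle 100 but was not used until cycle 110: non-critical.
predictor.update(0x1000, arrival_cycle=100, use_cycle=110)
print(predictor.predict(0x1000))
```

Because the table is indexed by only a few address bits, two instructions can alias to the same entry. A real design would have to weigh table size against that interference, which is the complexity question raised later on slide 9.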

7 Summary of transfers
Critical                        Non-critical
Load values                     Store value
Effective address, unpredicted  Effective address, predicted
Bypassed register value         Ready register value
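The table on slide 7 maps naturally onto a wire-selection function for the two interconnect sets from slide 3. The sketch below is hypothetical: the enum and function names are invented for illustration, and letting bypassed register values consult a criticality prediction is an assumption based on slide 5's remark that their criticality changes at runtime.

```python
from enum import Enum, auto


class Transfer(Enum):
    LOAD_VALUE = auto()
    STORE_VALUE = auto()
    ADDRESS_UNPREDICTED = auto()
    ADDRESS_PREDICTED = auto()
    BYPASSED_REGISTER = auto()
    READY_REGISTER = auto()


class Wire(Enum):
    LOW_DELAY = "low-delay, high-power"
    LOW_POWER = "low-power, high-delay"


# Fixed classifications taken from the two columns of the slide 7 table.
ALWAYS_CRITICAL = {Transfer.LOAD_VALUE, Transfer.ADDRESS_UNPREDICTED}
ALWAYS_NON_CRITICAL = {
    Transfer.STORE_VALUE,
    Transfer.ADDRESS_PREDICTED,
    Transfer.READY_REGISTER,
}


def select_wire(transfer, predicted_critical=True):
    """Pick a wire set for one transfer.

    predicted_critical only matters for bypassed register values (an
    assumption of this sketch); it defaults to critical, as in the table.
    """
    if transfer in ALWAYS_NON_CRITICAL:
        return Wire.LOW_POWER
    if transfer in ALWAYS_CRITICAL:
        return Wire.LOW_DELAY
    return Wire.LOW_DELAY if predicted_critical else Wire.LOW_POWER


print(select_wire(Transfer.STORE_VALUE).value)
print(select_wire(Transfer.BYPASSED_REGISTER, predicted_critical=False).value)
```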

8 Result summary
- Two kinds of non-critical transfers
  - Data that are not immediately used: 36%
  - Verification of address predictions: 13%
- Criticality-based case
  - 49% of all data transfers go through the power-optimized wires
  - Performance penalty: only 2.5%
  - Potential energy savings of around 50% in the interconnects

9 Things that are missing
- Power modeling for the processor as a whole
- Implications on transient temperature variations for varying workloads
- Lack of a good on-chip interconnect power/temperature simulator
- Complexity-effective design for the criticality predictor

10 Interconnect simulator: Problems
- Should:
  - Account for the number of wires in the particular process
  - Deal with a 3-D space for routing of wires
  - Satisfy the design rule constraints
  - More of a layout optimization problem

11 What we propose to do
- Wattch: incorporated into a scalable 16-cluster system
- HotSpot: transient temperature model
- HotLeakage: leakage power model
- Build a prototype layout to satisfy the above requirements
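To give a feel for what a transient temperature model computes, here is a one-node lumped thermal RC sketch integrated with forward Euler. It is not HotSpot's code or API: HotSpot models a whole network of thermal resistances and capacitances over the floorplan, while this collapses the chip to a single node. The function name and all numeric values are illustrative assumptions.

```python
def simulate_temperature(power_trace, r_th, c_th, t_amb, dt):
    """Integrate C * dT/dt = P - (T - T_amb) / R with forward Euler.

    power_trace: power samples in watts, one per time step
    r_th: thermal resistance to ambient (K/W)
    c_th: thermal capacitance (J/K)
    t_amb: ambient temperature; dt: time step in seconds
    Returns the temperature after each step.
    """
    temp = t_amb
    temps = []
    for power in power_trace:
        heat_out = (temp - t_amb) / r_th  # heat flowing to ambient, in watts
        temp += dt * (power - heat_out) / c_th
        temps.append(temp)
    return temps


# Illustrative values only: a constant 10 W load for 5 s at 10 ms steps.
trace = simulate_temperature([10.0] * 500, r_th=2.0, c_th=0.5, t_amb=45.0, dt=0.01)
print(round(trace[-1], 2))  # approaches t_amb + P * r_th as time grows
```

The time constant is `r_th * c_th`, so the step `dt` should be kept well below it for the Euler update to stay stable. A varying `power_trace` is what makes the transient behavior interesting for the workload question raised on slide 9.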

12 Wattch
- Power model from Princeton University
- Simulates an out-of-order (o-o-o) processor (Alpha 21264)
- Caveat: interconnects are not accurately simulated
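Wattch-style models rest on the standard CMOS dynamic power relation P = a * C * Vdd^2 * f, which also explains why the higher frequencies and interconnect capacitances on slide 2 raise power. The sketch below only evaluates that formula; the function name and the numbers are illustrative assumptions, not Wattch parameters.

```python
def dynamic_power(activity, capacitance, vdd, frequency):
    """Dynamic power in watts: activity factor * C (F) * Vdd^2 (V^2) * f (Hz)."""
    return activity * capacitance * vdd ** 2 * frequency


# Illustrative values only: 1 nF switched capacitance, 1.2 V, 2 GHz, 50% activity.
base = dynamic_power(0.5, 1e-9, 1.2, 2e9)
# Doubling the capacitance (for example, longer wires) doubles the power.
longer_wires = dynamic_power(0.5, 2e-9, 1.2, 2e9)
print(base, longer_wires / base)
```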

13 Wattch Modified
- Wattch uses a single instruction window logic
- Issue queue model
  - Separate Int and FP wakeup logic
  - Separate Int and FP selection logic
  - Helps in efficient distribution

14 Wattch Modified
- Single result bus, FUs and register files
- Distributed units
  - Separate integer and floating-point register files
  - Separate integer and floating-point execution and result bus units

15 Wattch Modified
- Wattch: simple Alpha 21264
- Modified for a scalable 16-cluster system
  - Modular: easy to adapt and test
  - Caveat: there is a lot of scope for improvement

16 Visual Feature Recognition: Elastic Bunch Graph Matching (EBGM)

17 History
- No particular algorithm known
- Many algorithms for face and object recognition
- Few feature recognition benchmarks, like FERET
- Eigenfaces: traditionally known for face recognition

18 Motivation: EBGM
[Figure: Steps in face recognition. Boxes: flesh toning, segmentation, face detection, face recognition.]
No segmentation needed in EBGM!

19 EBGM
[Figure: Steps involved in EBGM. Boxes: normalization/preprocessing, face graph creation, face identification.]
Looks easy

20 EBGM: Mathematically
- Image descriptions are based on a wavelet transform
- Gabor jets are extracted from each landmark
- Local image information around each node is the key
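A Gabor jet, as mentioned on slide 20, is the vector of complex responses of a bank of Gabor wavelets at one landmark. The sketch below uses the commonly cited EBGM parameterization (5 frequencies, 8 orientations, sigma = 2*pi), which the slides do not state and which is therefore an assumption. The function names, the truncated window `radius`, and the zero padding at image borders are also illustrative choices. The image is a plain list of rows of grey values.

```python
import cmath
import math

SIGMA = 2 * math.pi  # envelope width relative to wavelength (assumed, not from the slides)


def gabor_kernel_value(dx, dy, nu, mu):
    """Complex Gabor wavelet at offset (dx, dy) for frequency nu, orientation mu."""
    k = math.pi * 2 ** (-(nu + 2) / 2)
    phi = mu * math.pi / 8
    kx, ky = k * math.cos(phi), k * math.sin(phi)
    envelope = (k * k / SIGMA ** 2) * math.exp(-k * k * (dx * dx + dy * dy) / (2 * SIGMA ** 2))
    # Subtracting exp(-sigma^2 / 2) removes the kernel's response to uniform brightness.
    wave = cmath.exp(1j * (kx * dx + ky * dy)) - math.exp(-SIGMA ** 2 / 2)
    return envelope * wave


def gabor_jet(image, x, y, radius=8, n_freq=5, n_orient=8):
    """Return n_freq * n_orient complex coefficients describing the patch at (x, y)."""
    height, width = len(image), len(image[0])
    jet = []
    for nu in range(n_freq):
        for mu in range(n_orient):
            acc = 0j
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:  # zero padding outside
                        acc += image[yy][xx] * gabor_kernel_value(dx, dy, nu, mu)
            jet.append(acc)
    return jet


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of coefficient magnitudes (phase is ignored)."""
    mags_a = [abs(c) for c in jet_a]
    mags_b = [abs(c) for c in jet_b]
    norm = math.sqrt(sum(a * a for a in mags_a) * sum(b * b for b in mags_b))
    return sum(a * b for a, b in zip(mags_a, mags_b)) / norm if norm else 0.0


# Tiny synthetic image, used only to exercise the functions.
image = [[(x * y) % 7 for x in range(16)] for y in range(16)]
jet = gabor_jet(image, 8, 8, radius=4)
print(len(jet), jet_similarity(jet, gabor_jet(image, 5, 9, radius=4)))
```

Each jet costs one windowed sum per kernel, so 40 kernels over a 17 x 17 window already means more than 11,000 complex multiply-adds per landmark. That is consistent with the "compute intensive" remark on the following slide.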

21 EBGM: What is missing?
- Landmark localization is less reliable
- Difficult to track small differences in face orientation now
- Compute-intensive Gabor jets

22 Questions?
Thank you

