1
Power and Temperature-Aware Microarchitecture: A preliminary look
Karthik Ramani
University of Utah, September 28th, 2004
2
Motivation
ITRS Roadmap: reasons for increasing power consumption
- Higher chip operating frequencies
- Increased gate leakage of transistors
- Higher interconnect capacitances and resistances
- Lack of an interconnect architecture design tool until 2009
- Inability of the interconnect to scale for performance beyond 2009
3
Heterogeneous Interconnects: A starting point
- Two sets of interconnects
  - Low-delay, high-power wires
  - Low-power, high-delay wires
- Easier to target instructions
- Augurs well for a more sophisticated model
4
Interconnect transfers: Types
- Bypassed register value
- Ready register value
- Address transfer
- Store value
- Load value
5
Bypassed Register Values
- Operands produced in a cluster that are immediately required by another cluster
- Criticality based on two factors
  - Operand arrival time at the cluster
  - Actual issue time of the sourcing instruction
- Criticality changes at runtime
- Needs a dynamic predictor
[Figure: Rename & Dispatch feeding three clusters, each with an IQ, Regfile, and FU. Consumer instruction dispatched at cycle 100; producing instruction completing execution at cycle 120.]
6
The Data Criticality Predictor
- A table indexed by the lower-order bits of the instruction address, updated dynamically to indicate the criticality of data
- Difference between arrival time and usage time calculated for each operand of an instruction
  - Difference < Threshold: Critical
  - Difference > Threshold: Non-critical
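The predictor described above can be sketched as a small table of criticality bits. This is a minimal sketch, not the authors' design: the class name, the table size, the threshold value, the initial "critical" default, and treating a difference equal to the threshold as non-critical are all assumptions made for illustration.

```python
class CriticalityPredictor:
    """Table of criticality bits indexed by low-order instruction-address bits."""

    def __init__(self, index_bits=10, threshold=2):
        # index_bits and threshold are hypothetical values, not from the slides.
        self.mask = (1 << index_bits) - 1
        self.threshold = threshold
        # Assumption: start by predicting "critical", the performance-safe default.
        self.table = [True] * (1 << index_bits)

    def _index(self, pc):
        # Keep only the lower-order bits of the instruction address.
        return pc & self.mask

    def predict(self, pc):
        """Return True if the operand is predicted to be critical."""
        return self.table[self._index(pc)]

    def update(self, pc, arrival_cycle, issue_cycle):
        """Record the outcome once arrival and usage times are both known."""
        difference = issue_cycle - arrival_cycle
        # Difference < threshold: critical. Otherwise: non-critical (assumed for ==).
        self.table[self._index(pc)] = difference < self.threshold


predictor = CriticalityPredictor(index_bits=4, threshold=3)
# Operand arrived at cycle 100 but was not used until cycle 110: non-critical.
predictor.update(0x40, arrival_cycle=100, issue_cycle=110)
# Operand used in the same cycle it arrived: critical.
predictor.update(0x41, arrival_cycle=121, issue_cycle=121)
```

Because only the low-order address bits are used, two instructions can share an entry (here `0x40` and `0x50` both map to entry 0), which is the usual cost of an untagged table.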
7
Summary of transfers
Critical | Non-critical
Load values | Store value
Effective address unpredicted | Effective address predicted
Bypassed register value | Ready register value
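One way to read the summary above is as a steering policy that picks a wire set per transfer. The sketch below is illustrative only: the enum and function names are hypothetical, and letting the dynamic predictor demote a bypassed register value to the low-power wires is an assumption drawn from the earlier slide on bypassed values.

```python
from enum import Enum, auto


class Transfer(Enum):
    LOAD_VALUE = auto()
    STORE_VALUE = auto()
    ADDRESS_UNPREDICTED = auto()
    ADDRESS_PREDICTED = auto()
    BYPASSED_REGISTER = auto()
    READY_REGISTER = auto()


class Wires(Enum):
    LOW_DELAY = "low-delay, high-power"
    LOW_POWER = "low-power, high-delay"


# Transfers listed as non-critical in the summary.
ALWAYS_NON_CRITICAL = {
    Transfer.STORE_VALUE,
    Transfer.ADDRESS_PREDICTED,
    Transfer.READY_REGISTER,
}


def choose_wires(transfer, predicted_critical=True):
    """Pick a wire set; predicted_critical only matters for bypassed values."""
    if transfer in ALWAYS_NON_CRITICAL:
        return Wires.LOW_POWER
    # Assumption: bypassed values follow the runtime criticality prediction.
    if transfer is Transfer.BYPASSED_REGISTER and not predicted_critical:
        return Wires.LOW_POWER
    return Wires.LOW_DELAY
```

For example, `choose_wires(Transfer.BYPASSED_REGISTER, predicted_critical=False)` selects the low-power wires, while a load value always takes the low-delay wires.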
8
Result summary
- Two kinds of non-critical transfers
  - Data that are not immediately used: 36%
  - Verification of address predictions: 13%
- Criticality-based case: 49% of all data transfers go through the power-optimized wires
- Performance penalty: only 2.5%
- Potential energy savings of around 50% in the interconnects
9
Things that are missing
- Power modeling for the processor as a whole
- Implications on transient temperature variations for varying workloads
- Lack of a good on-chip interconnect power/temperature simulator
- Complexity-effective design for the criticality predictor
10
Interconnect simulator: Problems
- Should account for the number of wires in the particular process
- Deal with a 3-D space for routing of wires
- Satisfy the design rule constraints
- More of a layout optimization problem
11
What we propose to do
- Wattch: incorporated into a scalable 16-cluster system
- HotSpot: transient temperature model
- HotLeakage: leakage power model
- Build a prototype layout to satisfy the above requirements
12
Wattch
- Power model from Princeton University
- Simulates an out-of-order (o-o-o) processor (Alpha 21264)
- Caveat: interconnects are not accurately simulated
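As background for the power model mentioned above, Wattch-style estimates rest on the standard CMOS dynamic power relation P = a * C * Vdd^2 * f, evaluated per hardware unit. The sketch below only illustrates that relation; the function name and all numeric values are hypothetical and are not Wattch's actual parameters or API.

```python
def dynamic_power(capacitance_f, vdd_v, freq_hz, activity):
    """Dynamic power in watts: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical per-unit numbers, chosen only to show the calculation:
# (switched capacitance in farads, activity factor)
units = {
    "issue_queue": (0.8e-9, 0.4),
    "register_file": (1.2e-9, 0.3),
    "result_bus": (0.5e-9, 0.2),
}
VDD_V = 1.5      # assumed supply voltage
FREQ_HZ = 1.0e9  # assumed clock frequency

per_unit = {
    name: dynamic_power(cap, VDD_V, FREQ_HZ, act)
    for name, (cap, act) in units.items()
}
total_watts = sum(per_unit.values())
```

Modeling each unit separately is what makes it possible to replace a single shared structure with per-cluster copies and re-sum the total.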
13
Wattch Modified
- Wattch: uses a single instruction window logic
- Modified: issue queue model
  - Separate Int and FP wakeup logic
  - Separate Int and FP selection logic
  - Helps in efficient distribution
14
Wattch Modified
- Wattch: single result bus, FUs, and register files
- Modified: distributed units
  - Separate integer and floating-point register files
  - Separate integer and floating-point execution and result bus units
15
Wattch Modified
- Wattch: simple Alpha 21264
- Modified for a scalable 16-cluster system
- Modular: easy to adapt and test
- Caveat: there is a lot of scope for improvement
16
Visual Feature Recognition
Elastic Bunch Graph Matching (EBGM)
17
History
- No particular algorithm known
- Many algorithms for face and object recognition
- Few feature recognition benchmarks, such as FERET
- Eigenfaces: traditionally known for face recognition
18
Motivation: EBGM
- Steps in face recognition: flesh toning, segmentation, face detection, face recognition
- No segmentation needed in EBGM!
19
EBGM
- Steps involved in EBGM: normalization/preprocessing, face graph creation, face identification
- Looks easy
20
EBGM: Mathematically
- Image descriptions are based on a wavelet transform
- Gabor jets are extracted from each landmark
- Local image information around each node is the key
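To make the jet idea above concrete, the sketch below computes a Gabor jet at one landmark and compares two jets by magnitude. It follows one commonly cited EBGM formulation (5 frequencies, 8 orientations, sigma = 2*pi, DC-free kernels); those parameter values, the truncation radius, and all function names are assumptions for illustration, not details taken from the slides.

```python
import cmath
import math

SIGMA = 2 * math.pi  # assumed kernel width parameter


def gabor_kernel_value(dx, dy, kx, ky, sigma=SIGMA):
    """Complex Gabor wavelet sampled at offset (dx, dy) for wave vector (kx, ky)."""
    k2 = kx * kx + ky * ky
    r2 = dx * dx + dy * dy
    envelope = (k2 / sigma ** 2) * math.exp(-k2 * r2 / (2 * sigma ** 2))
    # Subtracting exp(-sigma^2 / 2) removes the kernel's DC response.
    carrier = cmath.exp(1j * (kx * dx + ky * dy)) - math.exp(-sigma ** 2 / 2)
    return envelope * carrier


def gabor_jet(image, x, y, n_freqs=5, n_orients=8, radius=8):
    """Return the list of complex filter responses (the jet) at pixel (x, y)."""
    height, width = len(image), len(image[0])
    jet = []
    for nu in range(n_freqs):
        k = math.pi * 2 ** (-(nu + 2) / 2)
        for mu in range(n_orients):
            phi = mu * math.pi / n_orients
            kx, ky = k * math.cos(phi), k * math.sin(phi)
            acc = 0j
            # Convolution truncated to a (2*radius+1)^2 window around the landmark.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    px, py = x + dx, y + dy
                    if 0 <= px < width and 0 <= py < height:
                        acc += image[py][px] * gabor_kernel_value(-dx, -dy, kx, ky)
            jet.append(acc)
    return jet


def jet_similarity(jet_a, jet_b):
    """Normalized dot product of jet magnitudes (phase is ignored)."""
    mags_a = [abs(c) for c in jet_a]
    mags_b = [abs(c) for c in jet_b]
    dot = sum(a * b for a, b in zip(mags_a, mags_b))
    norm = math.sqrt(sum(a * a for a in mags_a) * sum(b * b for b in mags_b))
    return dot / norm if norm else 0.0


# Tiny synthetic image; a real system would use a normalized face image.
image = [[(3 * x + 5 * y) % 7 for x in range(16)] for y in range(16)]
jet_1 = gabor_jet(image, 8, 8, radius=4)
jet_2 = gabor_jet(image, 3, 12, radius=4)
score = jet_similarity(jet_1, jet_2)
```

Each landmark costs 40 filter evaluations over a pixel window, so jet extraction grows quickly with the number of landmarks and the window size.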
21
EBGM: What is missing?
- Landmark localization is less reliable
- Small differences in face orientation are currently difficult to track
- Gabor jets are compute-intensive
22
Questions?
Thank you