
Published by Reese Brinley. Modified about 1 year ago.

1
Energy Security with Scalably Verifiable Dynamic Power Management
Luwa Matthews, Meng Zhang, Daniel J. Sorin
Duke University ECE

2
Case for Dynamic Power Management
Chips have hit the power density ceiling
Source: Hennessy and Patterson, Computer Architecture

3
Case for Dynamic Power Management
Datacenters consume increasing amounts of power
Reducing cloud electricity consumption by half would save as much as the UK consumes
DPM goal: maximize performance at fixed power budget
[Figure: cloud map of Europe. Source: hp.com]

4
Dynamic Power Management
DPM involves:
- dynamically adjusting power states: clock gating, DVFS, etc.
- ensuring safety of power allocations to resources
DPM can be performed at several granularities
- cores, chips, racks, datacenters, etc.
[Figure: n cores in a CMP request power from the DPM, which grants or denies each request]

5
Dynamic Power Management (continued)
[Figure: n machines in a datacenter request power from the DPM, which grants or denies each request]

6
Case for Verifiable DPM
DPM can greatly improve energy efficiency
Unverified DPM could
- overshoot power budget, causing system damage
- underutilize resources
- deadlock
- leave the system not energy secure
Want formal verification
- prove correctness for all possible DPM situations
- DPM correct in all situations implies system is energy secure

7
Why Scalably Verifiable DPM is Hard
CMPs and datacenters have many computing resources (CRs)
n computing resources with S power states per CR give S^n possible DPM states
Checking S^n states is intractable for typical values of S and n
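A quick way to see the blow-up is to count the joint states directly. The sketch below is only illustrative: the function name is hypothetical, and S = 5 is borrowed from the five power states introduced later in the deck.

```python
def dpm_state_count(num_power_states: int, num_resources: int) -> int:
    """Number of joint power-state assignments for n CRs: S ** n."""
    return num_power_states ** num_resources


# S = 5 is an assumption taken from the five power states (L, LM, M, MH, H).
for n in (2, 8, 64):
    print(f"n = {n:2d}: {dpm_state_count(5, n)} states")
```

Even at n = 64 the count is far beyond what exhaustive model checking can enumerate, which is the motivation for an inductive proof.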

8
Hypothesis and Assumptions
Problem: verification of existing DPM protocols is unscalable
Hypothesis: we can design DPM such that it is scalably verifiable
- key idea: design DPM amenable to inductive verification
- change architecture to match verification methodologies
Approach:
- abstract away details of computing resources
- abstract power states, e.g., Medium power
- focus on decision algorithm (not DVFS or power gating)

9
Outline
- Background and Motivation
- Fractal DPM
- Experimental Evaluation
- Conclusions

10
Our Inductive Approach
Induction is key to scalable verification
- can prove DPM correct for arbitrary number of computing resources
Base case: small-scale system with few CRs is correct
- small enough that it's easy to verify with existing tools
Inductive step: system behaves the same at every scale
- fractal behavior
Prove base case + prove inductive step: DPM scheme is correct for any number of CRs

11
Attaining Scalable Verification - base case of induction
CRs request power from DPM controller
DPM controller grants or denies each request
Few states, so it is easy to verify that DPM is correct
- note: over-simplified base case for now
[Figure: CR sends Request Power to DPM-C; DPM-C replies Grant/Deny]

12
Attaining Scalable Verification - inductive proof
Base case
- refine our base case a little
- need all types of structures: CR, DPM-C, Root DPM-C
[Figure: base-case tree with Root DPM-C, DPM-C, and CRs]

13
Attaining Scalable Verification - inductive step
Behavior must be fractal
[Figure: CR sends Request Power to DPM-C; DPM-C replies Grant/Deny]

14
Can scale system by replacing a CR with a larger system
{DPM-C + 2 CRs} behaves just like 1 CR
[Figure: two copies of the CR / DPM-C pair exchanging Request Power and Grant/Deny]

15
Observed externally, node behaves like a single CR
[Figure: a DPM controller with CRs beneath it exchanges Request Power and Grant/Deny with its parent DPM controller]

16
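[State diagram: computing resource (CR)]
START -> Ready (State: P)
Ready -> Block: CR requests R (State: P, Request: R)
Block -> Ready: Parent GRANTS R; State becomes R; CR sends ACK to Parent
Block -> Ready: Parent DENIES R; State stays P; CR sends ACK to Parent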
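The CR state diagram can be sketched as a small class. This is one reading of the slide, not the authors' model: the class and method names are hypothetical, and the initial state "L" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComputingResource:
    state: str = "L"               # current power state P (initial value assumed)
    pending: Optional[str] = None  # outstanding request R while blocked

    @property
    def blocked(self) -> bool:
        return self.pending is not None

    def request(self, r: str) -> str:
        """Ready -> Block: send a request for power state r to the parent."""
        if self.blocked:
            raise RuntimeError("CR is blocked on an earlier request")
        self.pending = r
        return r

    def on_reply(self, granted: bool) -> str:
        """Block -> Ready: adopt R on a grant, keep P on a deny, then ACK."""
        if not self.blocked:
            raise RuntimeError("no outstanding request")
        if granted:
            self.state = self.pending
        self.pending = None
        return "ACK"


cr = ComputingResource()
cr.request("H")
print(cr.on_reply(granted=True), cr.state)
```

In both branches the CR acknowledges the parent before returning to Ready, which is what lets the parent controller unblock.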

17
[State diagram: DPM controller with a parent]
START -> State: P:X
Child REQUESTS R (State: P:X, Request: R:X):
- if R:X = H:H: DENY Child R
- if Avg(P:X) = Avg(R:X) and R:X != H:H: GRANT Child R
- if Avg(P:X) != Avg(R:X): request Avg(R:X) from Parent and block
  - if Parent GRANTS Avg(R:X): GRANT Child R
  - if Parent DENIES Avg(R:X): DENY Child R
After replying to the child, controller blocks until Child sends ACK

18
[State diagram: root DPM controller]
START -> State: P:X
Child REQUESTS R (State: P:X, Request: R:X):
- if R:X != H:H: GRANT Child R
- if R:X = H:H: DENY Child R
After replying to the child, controller blocks until Child sends ACK
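The two controller diagrams reduce to one decision rule. The sketch below is a hedged reading of them: the function name and return tuples are hypothetical, the H:H check is assumed to take priority over the Avg comparison, and Avg is assumed to round up over the levels L=0 ... H=4, which is inferred from the deck's examples Avg(MH:H) = H and Avg(MH:L) = M.

```python
LEVELS = ["L", "LM", "M", "MH", "H"]


def avg(a: str, b: str) -> str:
    """Averaged state of two children, rounding up (assumption from slide examples)."""
    total = LEVELS.index(a) + LEVELS.index(b)
    return LEVELS[-(-total // 2)]


def controller_decision(p: str, x: str, r: str, is_root: bool = False):
    """Child in state p (sibling in state x) requests r.

    Returns ("DENY", None), ("GRANT", None), or ("ASK_PARENT", state),
    where state is the averaged request forwarded to the parent.
    """
    if r == "H" and x == "H":
        return ("DENY", None)           # R:X = H:H would break the invariant
    if is_root or avg(p, x) == avg(r, x):
        return ("GRANT", None)          # externally visible state is unchanged
    return ("ASK_PARENT", avg(r, x))    # parent must approve the new average


print(controller_decision("H", "L", "MH"))  # local grant: Avg stays M
print(controller_decision("L", "L", "H"))   # deferred: forwards a request for M
```

When the parent replies to an ASK_PARENT, the controller simply relays that grant or deny to the child.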

19
Attaining Scalable Verification - inductive proof
Inductive Step: Two Observational Equivalences
"Looking-down" equivalence check
Observed externally (from P1, P2), A and A' behave the same
[Figure: small system A and large system A', each observed from P1 and P2]

20
Attaining Scalable Verification - inductive proof
Inductive Step: Two Observational Equivalences
"Looking-up" equivalence check
Observed externally (from P1, P2), B and B' behave the same
By induction, protocol is correct for all scales (Zhang, MICRO 2010)
[Figure: small system B and large system B', each observed from P1 and P2]

21
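Fractal DPM Design
CR can be in 1 of 5 power states: L(ow), LM, M(ed), MH and H(igh)
DPM controller state is written as the pair of its children's states, e.g., H:L
Parent DPM controller "sees" child DPM controller in averaged state
Avg(H:L) = M
[Figure: controller H:L over CRs in H and L; its parent, whose other child is a CR in L, is in state M:L]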

22
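Fractal DPM Design
CR can be in 1 of 5 power states: L(ow), LM, M(ed), MH and H(igh)
DPM controller state is written as the pair of its children's states, e.g., MH:H
Parent DPM controller "sees" child DPM controller in averaged state
Avg(MH:H) = H
[Figure: controller MH:H over CRs in MH and H; its parent, whose other child is a CR in L, is in state H:L]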
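The averaging rule can be sketched recursively over a tree of nested pairs. The numeric ordering L=0 ... H=4 and the round-up rule are assumptions inferred from the two examples on these slides (Avg(H:L) = M, Avg(MH:H) = H); the function names are hypothetical.

```python
LEVELS = ["L", "LM", "M", "MH", "H"]


def avg(a: str, b: str) -> str:
    """Average two power states, rounding up (assumption: Avg(MH:H) = H)."""
    total = LEVELS.index(a) + LEVELS.index(b)
    return LEVELS[-(-total // 2)]


def observed_state(node) -> str:
    """State the parent sees: a CR's own state, or the Avg of a controller's children."""
    if isinstance(node, str):
        return node
    left, right = node
    return avg(observed_state(left), observed_state(right))


def controller_state(node) -> str:
    """Controller state written as the pair of its children's observed states."""
    left, right = node
    return f"{observed_state(left)}:{observed_state(right)}"


# The two trees drawn on the slides: a controller over two CRs, plus a sibling CR in L.
print(controller_state((("H", "L"), "L")))
print(controller_state((("MH", "H"), "L")))
```

Under these assumptions the root controllers come out as M:L and H:L, matching the states shown in the figures.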

23
Fractal DPM Design - fractal invariant
Fractal design + inductive proof: invariant must also be fractal
- invariant must apply at every scale of system
- not OK to specify, e.g., <75% of all CRs are in H state
Our fractal invariant: children of DPM controller not both in H
[Figure: example trees with controller states H:L and H:MH; a controller in state H:H is ILLEGAL]

24
Translating Fractal Invariant to System-Wide Cap
We must have a fractal invariant for a fractal design
But most people are interested in system-wide invariants
We prove (not shown) that our fractal invariant implies a system-wide power cap
Max power for n CRs is (n-1)MH + H
- i.e., (n-1) CRs in state MH and one CR in state H
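The cap can be sanity-checked by brute force on small trees. This is not the authors' proof, only a sketch under stated assumptions: power states map to linear units L=0 ... H=4, Avg rounds up, the CRs sit in a balanced binary tree, and all function names are hypothetical.

```python
from itertools import product

L, LM, M, MH, H = range(5)  # assumed numeric power units per state


def build(leaves):
    """Arrange a flat tuple of CR states into a balanced binary tree."""
    if len(leaves) == 1:
        return leaves[0]
    mid = len(leaves) // 2
    return (build(leaves[:mid]), build(leaves[mid:]))


def check(node):
    """Return (state seen by the parent, True if the fractal invariant holds below)."""
    if isinstance(node, int):
        return node, True
    (a, ok_a), (b, ok_b) = check(node[0]), check(node[1])
    legal = ok_a and ok_b and not (a == H and b == H)
    return -(-(a + b) // 2), legal


def max_legal_power(n):
    """Largest total power over all assignments that satisfy the invariant."""
    best = 0
    for leaves in product(range(5), repeat=n):
        _, legal = check(build(leaves))
        if legal:
            best = max(best, sum(leaves))
    return best


for n in (2, 4):
    print(n, max_legal_power(n), (n - 1) * MH + H)
```

For these small cases the exhaustive maximum agrees with (n-1)MH + H; larger n grows as 5^n and is better left to the inductive argument.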

25
Fractal DPM Design - illustration
CR requests MH
[Figure: CR in state H sends Req. MH to its controller (H:L); the root controller is in M:L]

26
Fractal DPM Design - illustration
CR requests MH
Request granted; it doesn't violate the invariant
Granting the request doesn't change the controller's Avg state
- Avg(H:L) = Avg(MH:L) = M
Controller blocks waiting for ack
[Figure: controller, now MH:L, sends Grant MH to the CR and blocks]

27
Fractal DPM Design - illustration
CR sets its state
CR sends ack to Controller
[Figure: CR, now in MH, sends ack to the blocked controller (MH:L)]

28
Fractal DPM Design - illustration
Controller unblocks
[Figure: tree after the controller unblocks]

29
Fractal DPM Design - illustration
Computing Resource requests H
[Figure: all CRs in L, controller in L:L; one CR sends Req. H]

30
Fractal DPM Design - illustration
CR requests H from its Controller
Controller defers request to its parent
- new request is M (not H) because Avg(H:L) = M
[Figure: CR sends Req. H to its controller (L:L); controller sends Req. M to its parent]

31
[Figure: Req. H goes from the CR to the Intermediary Controller, which sends Req. M to the Final Controller; Grnt M and Grnt H flow back down]

32
Fractal DPM Design - illustration
Root grants request to Controller, blocks
[Figure: root controller, now M:L, sends Grant M to the controller (L:L) and blocks]

33
Fractal DPM Design - illustration
Controller grants request to CR, blocks
[Figure: controller, now H:L, sends Grant H to the CR and blocks; root (M:L) is still blocked after Grant M]

34
Fractal DPM Design - illustration
Acks percolate up tree from CR
[Figure: CR, now in H, sends ack to its blocked controller (H:L)]

35
Fractal DPM Design - illustration
Acks percolate up tree from CR
Controllers unblock upon receiving ack
[Figure: controller (H:L) forwards ack to the blocked root (M:L)]


37
Verification Procedure
Use model checker to verify base case
- we use the well-known, automated Murphi model checker
Use same model checker to verify observational equivalences
- use prior aggregation method for equivalence check (Park, TCAD 2000)

38
Outline
- Background and Motivation
- Fractal DPM
- Experimental Evaluation
- Conclusions

39
Experimental Evaluation - characterizing performance
Goal of DPM is to optimize performance
Per-allocation perf = f(requested power, allocated power)?
Might vary with implementation; we want an abstraction

40
Experimental Evaluation - characterizing performance
Performance as a function of requested power and allocated power
CR requests the power needed for max perf
[Plot: performance vs. allocated power (L, M, H), with curves reaching max perf for Req L, Req M, and Req H]

41
Experimental Evaluation - performance optimality
DPM cannot grant all requests
Some allocations are better than others
[Figure: CR1, CR2, CR3, CR4 send Req. M, Req. H, Req. MH, and Req. L to the DPM]

42
Experimental Evaluation - performance optimality
DPM cannot grant all requests
Some allocations are better than others
[Figure: CR1, CR2, CR3, CR4 send Req. M, Req. H, Req. MH, and Req. L to the DPM; allocation L, MH, M, H is labeled optimal and H, MH, M, L suboptimal]

43
Experimental Evaluation - performance optimality
DPM cannot grant all requests
Some allocations are better than others
Oracle DPM always picks the best-performing legal allocation
- i.e., allocation with [formula not recovered]
- Oracle DPM is constrained by the system-wide invariant, not the fractal one
- Oracle DPM is unimplementable
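An oracle of this kind can be sketched as an exhaustive search. Everything specific here is an assumption for illustration: power states are linear units L=0 ... H=4, the system-wide cap is (n-1)MH + H, and the per-CR performance model is a simple saturating one, min(requested, allocated), standing in for the unspecified f(requested power, allocated power).

```python
from itertools import product

L, LM, M, MH, H = range(5)  # assumed numeric power units per state


def perf(requested: int, allocated: int) -> int:
    """Assumed model: perf grows with allocated power and saturates at the request."""
    return min(requested, allocated)


def oracle_allocation(requests):
    """Search every allocation under the system-wide cap; keep the best total perf."""
    n = len(requests)
    cap = (n - 1) * MH + H
    best_alloc, best_perf = None, -1
    for alloc in product(range(5), repeat=n):
        if sum(alloc) > cap:
            continue  # violates the system-wide power cap
        total = sum(perf(r, a) for r, a in zip(requests, alloc))
        if total > best_perf:
            best_alloc, best_perf = alloc, total
    return best_alloc, best_perf


# Requests from the slide: M, H, MH, L. They fit under the cap, so all are met.
print(oracle_allocation((M, H, MH, L)))
# Four requests for H exceed the cap, so the oracle must ration.
print(oracle_allocation((H, H, H, H)))
```

The search is exponential in the number of CRs, which is one concrete sense in which such an oracle is impractical as a real DPM.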

44
Fractal Inefficiency - cost of fractal behavior
Our fractal invariant implies system-wide cap > n*MH
Violating the fractal invariant does not imply overshooting the system-wide power cap
- Legal: total power = 4MH (CRs MH, MH, MH, MH; controllers MH:MH, MH:MH; root MH:MH)
- Illegal: total power = 4MH, but violates fractal invariant (CRs M, M, H, H; controllers M:M, H:H; root M:H)
Some safe power requests are denied: fractal inefficiency

45
Simulating Fractal DPM vs Oracle DPM
Simulate Fractal and Oracle DPMs managing 8 resources (4 shown)
At each time step, every CR makes a random power request to both DPMs
[Figure: CRs send Req. LM, Req. M, Req. H, Req. MH to the Fractal DPM and to the Oracle DPM]

46
Simulating Fractal DPM vs Oracle DPM (continued)
[Figure: for requests LM, M, H, MH, both the Fractal DPM and the Oracle DPM allocate LM, M, H, MH]

47
Simulating Fractal DPM vs Oracle DPM (continued)
[Figure: next time step; CRs send Req. M and Req. H to the Fractal DPM and to the Oracle DPM]

48
Simulating Fractal DPM vs Oracle DPM (continued)
[Figure: allocations for this time step; one DPM shows M, M, H, LM and the other M, M, H, H]

49
Simulating Fractal DPM vs Oracle DPM
Simulate Fractal and Oracle DPMs managing 8 resources (4 shown)
At each time step, every CR makes a random power request to both DPMs
Can compare total system perf. of Fractal DPM vs Oracle DPM
- Perf = f(request, allocation)
[Figure: allocations for this time step; one DPM shows M, M, H, M and the other M, H, LM, H]

50
Results
Millions of time steps simulated
For each time step, system perf and % system perf loss are computed [formulas not recovered]
[Plot: CDF (%) of time steps vs. % system performance loss]

51
Results (continued)
On 72.6% of time steps, Fractal DPM ≡ Oracle DPM

52
Results (continued)
On 99.9% of time steps, Fractal DPM is < 20% off from Oracle

53
Results (continued)
Worst case, Fractal DPM is < 36.4% off from Oracle

54
Conclusions
We show how a scalably verifiable DPM can be built
Fractal behavior enables one-time verification for all scales
Entire verification is done in a model checker
Fractal invariants lead to some inefficient power allocations
- very few, and they cause little loss in performance
DPM protocol correct for all scales implies system is more energy secure

55
Thanks
Questions?

56
Quantifying Performance Cost
Simulate Fractal DPM managing 8 computing resources
All CRs make random power requests (L to H) every time step
Compare performance with oracle DPM (optimal allocator)
[Plot: linear perf % loss and asymptotic perf % loss]

57
Case for Scalably Verifiable DPM
Naïve DPM cannot provide a strong power guarantee in a system with churn/failures


59
Case for Scalably Verifiable DPM
Naïve DPM cannot provide a strong power guarantee in a system with churn/failures
- DPM scheme correct for n nodes need not be correct for n+x nodes
Hence, the need for DPM with a one-time proof of correctness for all scales of system

60
Attaining Scalable Verification
System Model
- abstract away specific resources managed and implementation mechanism
- focus on DPM decision algorithm: verification on model checkers
- seek one-time DPM verification for any number of resources managed
  - inductive proof useful
Fractal behavior in DPM allocation decision
- hierarchical structure enables inductive proof

61
Proof of Scalable Correctness of our DPM - Base Case
Run exhaustive check of base case in Murphi
Base case is correct if there is no invariant violation
[Figure: Root DPM Controller, DPM Controller, Leaf Node]

62
Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Down"
Q = Avg(X:Y)
[Figure: Small System: a node in state Q sends Req. P. Large System: a child requests X' from controller X:Y, which sends Req. P]

63
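Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Down"
P = Avg(X':Y)
[Figure: Small System: the node receives Grant P and moves to state P. Large System: controller, now X':Y, receives Grant P and sends Grant X' to the child]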

64
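Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Up"
Z = Avg(X,Y)
[Figure: Small System: a node with sibling Z sends Req P. Large System: the sibling is replaced by a subtree with children X and Y; Req P and Req P' are shown]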

65
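Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Up"
Z = Avg(X,Y)
[Figure: Small System: the node with sibling Z receives Grant P. Large System: the sibling is replaced by a subtree with children X and Y; Grant P and Grant P' are shown]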

