Energy Security with Scalably Verifiable Dynamic Power Management
Luwa Matthews, Meng Zhang, Daniel J. Sorin
Duke University ECE


1 Energy Security with Scalably Verifiable Dynamic Power Management
Luwa Matthews, Meng Zhang, Daniel J. Sorin
Duke University ECE

2 Case for Dynamic Power Management
- Chips have hit a power density ceiling
- Source: Hennessy and Patterson, Computer Architecture

3 Case for Dynamic Power Management
- Datacenters consume increasing amounts of power
- Reducing cloud electricity consumption by half would save as much electricity as the UK consumes (source: hp.com)
- DPM goal: maximize performance at a fixed power budget
[Figure: cloud map of Europe]

4 Dynamic Power Management
DPM involves:
- dynamically adjusting power states (clock gating, DVFS, etc.)
- ensuring safety of power allocations to resources
DPM can be performed at several granularities:
- cores, chips, racks, datacenters, etc.
[Diagram: n cores in a CMP; each sends Request Power to the DPM, which answers grant or deny]

5 Dynamic Power Management
DPM involves:
- dynamically adjusting power states (clock gating, DVFS, etc.)
- ensuring safety of power allocations to resources
DPM can be performed at several granularities:
- cores, chips, racks, datacenters, etc.
[Diagram: n machines in a datacenter; each sends Request Power to the DPM, which answers grant or deny]

6 Case for Verifiable DPM
DPM can greatly improve energy efficiency
Unverified DPM could:
- overshoot the power budget (system damage)
- underutilize resources
- deadlock
- not be energy secure!
Want formal verification:
- prove correctness for all possible DPM situations
- DPM correct in all situations => system is energy secure

7 Why Scalably Verifiable DPM is Hard
- CMPs and datacenters have many computing resources (CRs)
- n computing resources, S power states per CR => S^n possible DPM states
- Checking S^n states is intractable for typical values of S and n
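To make the S^n growth on this slide concrete, here is a minimal Python sketch. The helper name `dpm_state_count` is hypothetical, and S = 5 is borrowed from the five power states (L, LM, M, MH, H) introduced later in the talk; the values of n are arbitrary illustrations.

```python
def dpm_state_count(num_states: int, num_resources: int) -> int:
    """Joint power-state assignments for n CRs with S power states each (S^n)."""
    return num_states ** num_resources


# S = 5 matches the five power states used later in the talk (an assumption here).
for n in (2, 8, 64, 1024):
    digits = len(str(dpm_state_count(5, n)))
    print(f"S=5, n={n}: S^n has {digits} decimal digit(s)")
```

Even before any protocol state (pending requests, blocked controllers) is added, the count of joint power states alone outgrows what an exhaustive check can enumerate.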

8 Hypothesis and Assumptions
Problem: verification of existing DPM protocols is unscalable
Hypothesis: we can design DPM such that it is scalably verifiable
- key idea: design DPM amenable to inductive verification
- change architecture to match verification methodologies
Approach:
- abstract away details of computing resources
- abstract power states (e.g., Medium power)
- focus on decision algorithm (not DVFS or power gating)

9 Outline
- Background and Motivation
- Fractal DPM
- Experimental Evaluation
- Conclusions

10 Our Inductive Approach
Induction is key to scalable verification => can prove DPM correct for an arbitrary number of computing resources
Base case: a small-scale system with few CRs is correct
- small enough that it's easy to verify with existing tools
Inductive step: system behaves the same at every scale => fractal behavior
Prove base case + prove inductive step => DPM scheme is correct for any number of CRs

11 Attaining Scalable Verification - base case of induction
- CRs request power from the DPM controller (DPM-C)
- DPM controller grants or denies each request
- Few states => easy to verify that DPM is correct
- Note: over-simplified base case for now
[Diagram: CR sends Request Power to DPM-C; DPM-C replies Grant/Deny]

12 Attaining Scalable Verification - inductive proof
Base case:
- Refine our base case a little
- Need all types of structures: CR, DPM-C, Root DPM-C
[Diagram labels: CR, DPM-C, CR, Root DPM-C]

13 Attaining Scalable Verification - inductive step
- Behavior must be fractal
[Diagram: CR sends Request Power to DPM-C; DPM-C replies Grant/Deny]

14 Can scale the system by replacing a CR with a larger system
- {DPM-C + 2 CRs} behaves just like 1 CR
[Diagram: two copies of CR and DPM-C exchanging Request Power and Grant/Deny]

15 Observed externally, a node behaves like a single CR
[Diagram: DPM Controller with CRs below a parent DPM Controller, exchanging Request Power and Grant/Deny]

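16 [State diagram: computing resource (CR)]
- START -> Ready (State: P)
- Ready: CR requests R -> Block (State: P, Request: R)
- Block: parent GRANTS R -> State: R; CR sends ACK to parent -> Ready
- Block: parent DENIES R -> State: P; CR sends ACK to parent -> Ready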

17 [State diagram: DPM controller with a parent]
- START -> State: P:X
- Child REQUESTS R -> State: P:X, Request: R:X
- if R:X = H:H: DENY child R
- if Avg(P:X) = Avg(R:X) and R:X != H:H: GRANT child R, then Block
- if Avg(P:X) != Avg(R:X): request Avg(R:X) from parent, then Block
  - if parent GRANTS Avg(R:X): GRANT child R
  - if parent DENIES Avg(R:X): DENY child R
- Block: child sends ACK -> unblock

18 [State diagram: DPM controller, no request to a parent]
- START -> State: P:X
- Child REQUESTS R -> State: P:X, Request: R:X
- if R:X != H:H: GRANT child R, then Block
- if R:X = H:H: DENY child R
- Block: child sends ACK -> unblock
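The following Python sketch is one possible reading of the controller state diagrams on slides 17 and 18, reduced to the grant/deny decision only. The function names, the `ask_parent` callback, and the round-half-up averaging rule are assumptions made for illustration (the averaging rule is inferred from the examples Avg(H:L) = M and Avg(MH:H) = H later in the talk); blocking and ACK handling are not modeled.

```python
LEVELS = ("L", "LM", "M", "MH", "H")  # the five power states, lowest to highest


def avg(left: str, right: str) -> str:
    """Averaged state of a pair, rounding half up (an inferred assumption)."""
    total = LEVELS.index(left) + LEVELS.index(right)
    return LEVELS[(total + 1) // 2]


def controller_decide(current, sibling, requested, ask_parent=None):
    """Decide a child's request R, given its current state P and its sibling's state X.

    ask_parent is a hypothetical callback taking an averaged state and returning
    True (parent grants) or False (parent denies); None means there is no parent.
    """
    # Fractal invariant: the two children may never both be in H.
    if requested == "H" and sibling == "H":
        return "DENY"
    # No parent to consult (slide 18): any request that keeps the invariant is granted.
    if ask_parent is None:
        return "GRANT"
    # The externally visible averaged state is unchanged: decide locally.
    if avg(current, sibling) == avg(requested, sibling):
        return "GRANT"
    # Otherwise defer: request the new averaged state from the parent.
    return "GRANT" if ask_parent(avg(requested, sibling)) else "DENY"


# CR in H with sibling L asks for MH: Avg(H:L) = Avg(MH:L) = M, so no parent call.
print(controller_decide("H", "L", "MH", ask_parent=lambda state: False))
# CR in L with sibling L asks for H: the controller asks its parent for M.
print(controller_decide("L", "L", "H", ask_parent=lambda state: state == "M"))
```

Passing the parent as a callback keeps the sketch recursive in spirit: a parent controller can implement `ask_parent` by calling `controller_decide` on its own state.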

19 Attaining Scalable Verification - inductive proof
Inductive step: two observational equivalences
"Looking-down" equivalence check:
- Observed externally (from P1, P2), A and A' behave the same
[Diagram: Small System containing A; Large System containing A'; observers P1, P2]

20 Attaining Scalable Verification - inductive proof
Inductive step: two observational equivalences
"Looking-up" equivalence check:
- Observed externally (from P1, P2), B and B' behave the same
By induction, the protocol is correct for all scales (Zhang, MICRO 2010)
[Diagram: Small System containing B; Large System containing B'; observers P1, P2]

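21 Fractal DPM Design
- A CR can be in 1 of 5 power states: L(ow), LM, M(ed), MH and H(igh)
- DPM controller state is written as the pair of its children's states (e.g., children H and L give H:L)
- Parent DPM controller "sees" child DPM controller in averaged state
- Avg(H:L) = M
[Diagram: CRs H and L under controller H:L; that controller (seen as M) and CR L under controller M:L]

22 Fractal DPM Design
- A CR can be in 1 of 5 power states: L(ow), LM, M(ed), MH and H(igh)
- DPM controller state is written as the pair of its children's states
- Parent DPM controller "sees" child DPM controller in averaged state
- Avg(MH:H) = H
[Diagram: CRs MH and H under controller MH:H; that controller (seen as H) and CR L under controller H:L]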
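The slides give Avg only by example (Avg(H:L) = M, Avg(MH:H) = H). A minimal Python sketch consistent with those examples follows; the rank encoding L=0 through H=4 and the round-half-up rule are assumptions inferred from the examples, not something the slides state.

```python
LEVELS = ("L", "LM", "M", "MH", "H")  # assumed ranks 0..4, lowest to highest


def avg(left: str, right: str) -> str:
    """Averaged state a parent sees for a controller in state left:right."""
    total = LEVELS.index(left) + LEVELS.index(right)
    return LEVELS[(total + 1) // 2]  # integer round-half-up


print(avg("H", "L"))   # M, as on slide 21
print(avg("MH", "H"))  # H, as on slide 22
```

Rounding up is the conservative choice: the parent never sees a subtree as drawing less power than its true average.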

23 Fractal DPM Design - fractal invariant
Fractal design + inductive proof => invariant must also be fractal
- Invariant must apply at every scale of the system
- Not OK to specify, e.g., <75% of all CRs are in H state
Our fractal invariant: children of a DPM controller are not both in H
[Diagram: example controller states H:L and H:MH; H:H is marked ILLEGAL]

24 Translating Fractal Invariant to System-Wide Cap
- We must have a fractal invariant for a fractal design
- But most people are interested in system-wide invariants
- We prove (not shown) that our fractal invariant implies a system-wide power cap
- Max power for n CRs is (n-1)MH + H, i.e., (n-1) CRs in state MH and one CR in state H
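The proof is not shown in the talk. As a sanity check only, the Python sketch below brute-forces very small complete binary trees and compares the largest legal total against (n-1)MH + H. It rests on assumptions not stated on the slides: power is measured in rank units (L=0 through H=4), Avg rounds half up, and the tree is a complete binary tree. All function names are hypothetical.

```python
from itertools import product

LEVELS = ("L", "LM", "M", "MH", "H")
RANK = {name: i for i, name in enumerate(LEVELS)}  # assumed linear power units


def summarize(leaves):
    """Return (legal, averaged rank) for a complete binary tree over the leaves."""
    if len(leaves) == 1:
        return True, leaves[0]
    mid = len(leaves) // 2
    ok_left, left = summarize(leaves[:mid])
    ok_right, right = summarize(leaves[mid:])
    # Fractal invariant: the two children of a controller are not both in H.
    both_high = left == RANK["H"] and right == RANK["H"]
    return ok_left and ok_right and not both_high, (left + right + 1) // 2


def max_legal_power(n):
    """Largest total power over all leaf assignments that satisfy the invariant."""
    best = -1
    for leaves in product(range(len(LEVELS)), repeat=n):
        legal, _ = summarize(leaves)
        if legal:
            best = max(best, sum(leaves))
    return best


for n in (2, 4):  # n must be a power of two for this tree shape
    cap = (n - 1) * RANK["MH"] + RANK["H"]
    print(f"n={n}: brute-force max={max_legal_power(n)}, (n-1)MH+H={cap}")
```

Exhaustive enumeration only works for tiny n, which is exactly the scalability problem the inductive proof is meant to avoid.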

25 Fractal DPM Design - illustration
CR requests MH
[Diagram: tree states H, L, L, H:L, M:L; message: Req. MH]

26 Fractal DPM Design - illustration
CR requests MH
- Granting the request doesn't change the controller's Avg state: Avg(H:L) = Avg(MH:L) = M
- Request granted; it doesn't violate the invariant
- Controller blocks waiting for ack
[Diagram: tree states H, L, L, MH:L (block), M:L; message: Grant MH]

27 Fractal DPM Design - illustration
CR sends ack to Controller
- CR sets its state
[Diagram: tree states MH, L, L, MH:L (block), M:L; message: ack]

28 Fractal DPM Design - illustration
Controller unblocks
[Diagram: tree states H, L, L, H:L, M:L]

29 Fractal DPM Design - illustration
Computing resource requests H
[Diagram: tree states L, L, L, L:L; message: Req. H]

30 Fractal DPM Design - illustration
CR requests H from its Controller
Controller defers the request to its parent
- new request is M (not H) because Avg(H:L) = M
[Diagram: tree states L, L, L, L:L; messages: Req. H, Req. M]

31 [Diagram: states L, M, M, LM:M, L:M, L, M:L; messages: Req. H, Req. M, Grant M, Grant H; labels: Final Controller, Intermediary Controller]

32 Fractal DPM Design - illustration
Root grants request to Controller, blocks
[Diagram: tree states L, L, L, L:L, M:L (block); message: Grant M]

33 Fractal DPM Design - illustration
Controller grants request to CR, blocks
[Diagram: tree states L, L, L, H:L (block), M:L (block); messages: Grant M, Grant H]

34 Fractal DPM Design - illustration
Acks percolate up the tree from the CR
[Diagram: tree states H, L, L, H:L, M:L (block); message: ack]

35 Fractal DPM Design - illustration
Acks percolate up the tree from the CR
Controllers unblock upon receiving ack
[Diagram: tree states H, L, L, H:L, M:L (block); messages: ack, ack]

36 Fractal DPM Design - illustration
Acks percolate up the tree from the CR
Controllers unblock upon receiving ack
[Diagram: tree states H, L, L, H:L, M:L]

37 Verification Procedure
Use a model checker to verify the base case
- we use the well-known, automated Murphi model checker
Use the same model checker to verify the observational equivalences
- use prior aggregation method for equivalence check (Park, TCAD 2000)

38 Outline
- Background and Motivation
- Fractal DPM
- Experimental Evaluation
- Conclusions

39 Experimental Evaluation - characterizing performance
- Goal of DPM is to optimize performance
- Per-allocation perf = f(requested power, allocated power)?
- Might vary with implementation; we want an abstraction

40 Experimental Evaluation - characterizing performance
- Performance as a function of requested power and allocated power
- CR requests the power needed for max perf
[Figure: performance vs. allocated power (L, M, H), with levels marked "max perf for Req L", "max perf for Req M", "max perf for Req H"]

41 Experimental Evaluation - performance optimality
- DPM cannot grant all requests
- Some allocations are closer to optimal than others
[Diagram: CR1, CR2, CR3, CR4 send Req. M, Req. H, Req. MH, Req. L to the DPM]

42 Experimental Evaluation - performance optimality
- DPM cannot grant all allocations
- Some allocations are closer to optimal than others
[Diagram: CR1, CR2, CR3, CR4 send Req. M, Req. H, Req. MH, Req. L to the DPM; optimal allocation: L, MH, M, H; suboptimal allocation: H, MH, M, L]

43 Experimental Evaluation - performance optimality
- DPM cannot grant all allocations
- Some allocations are closer to optimal than others
- Oracle DPM always picks the best performing legal allocation, i.e., the allocation with [formula not preserved in transcript]
- Oracle DPM is constrained by the system-wide invariant, not the fractal one
- Oracle DPM is unimplementable
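A minimal Python sketch of what an oracle of this kind could look like: exhaustive search over allocations under the system-wide cap (n-1)MH + H. Everything specific here is an assumption for illustration: the power units (L=1 through H=5), the linear-then-saturating `perf` function, and the function names. The talk's own selection formula and performance function are not preserved in this transcript.

```python
from itertools import product

POWER = {"L": 1, "LM": 2, "M": 3, "MH": 4, "H": 5}  # hypothetical power units


def perf(requested: str, allocated: str) -> float:
    """Hypothetical performance: linear in allocated power, capped at the request."""
    return min(POWER[allocated], POWER[requested]) / POWER[requested]


def oracle_allocate(requests):
    """Exhaustively pick the best-performing allocation under the system-wide cap."""
    n = len(requests)
    cap = (n - 1) * POWER["MH"] + POWER["H"]  # system-wide cap, not the fractal invariant
    best_alloc, best_perf = None, -1.0
    for alloc in product(POWER, repeat=n):  # every assignment of a state to each CR
        if sum(POWER[a] for a in alloc) > cap:
            continue  # overshoots the cap, so not legal
        total = sum(perf(r, a) for r, a in zip(requests, alloc))
        if total > best_perf:
            best_alloc, best_perf = alloc, total
    return best_alloc, best_perf


alloc, total_perf = oracle_allocate(["H", "H", "H", "H"])
print(alloc, round(total_perf, 3))
```

The search visits 5^n allocations, which illustrates why such an oracle serves only as a simulation baseline.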

44 Fractal Inefficiency - cost of fractal behavior
- Our fractal invariant implies a system-wide cap > n*MH
- Violating the fractal invariant does not imply overshooting the system-wide power cap
- Some safe power requests are denied: fractal inefficiency
[Diagram: Legal: CRs MH, MH, MH, MH under controllers MH:MH and MH:MH, total power = 4MH. Illegal: CRs M, M, H, H under controllers M:M and H:H with root M:H, total power = 4MH; violates the fractal invariant]

45 Simulating Fractal DPM vs Oracle DPM
- Simulate Fractal and Oracle DPMs managing 8 resources (4 shown)
- At each time step, every CR makes a random power request to both DPMs
[Diagram: Fractal DPM and Oracle DPM each receive Req. LM, Req. M, Req. H, Req. MH]

46 Simulating Fractal DPM vs Oracle DPM
[Diagram: Fractal DPM and Oracle DPM each receive Req. LM, Req. M, Req. H, Req. MH; allocations shown: LM, M, H, MH for each DPM]

47 Simulating Fractal DPM vs Oracle DPM
[Diagram: Fractal DPM and Oracle DPM each receive Req. H, Req. M among the new requests]

48 Simulating Fractal DPM vs Oracle DPM
[Diagram: Fractal DPM and Oracle DPM each receive Req. H, Req. M; allocations shown: M, M, H, LM and M, M, H, H]

49 Simulating Fractal DPM vs Oracle DPM
- Can compare total system perf. of Fractal DPM vs Oracle DPM
- Perf = f(request, allocation)
[Diagram: Fractal DPM and Oracle DPM each receive Req. H, Req. M; allocations shown: M, M, H, M and M, H, LM, H]

50 Results
- Millions of time steps simulated
- For each time step: system perf = [formula not preserved in transcript]; % system perf loss = [formula not preserved in transcript] * 100%
[Figure: CDF (%) of time steps vs. % system performance loss]

51 Results
- On 72.6% of time steps, Fractal DPM ≡ Oracle DPM

52 Results
- On 99.9% of time steps, Fractal DPM is < 20% off from Oracle

53 Results
- Worst case, Fractal DPM is < 36.4% off from Oracle

54 Conclusions
- We show how a scalably verifiable DPM can be built
- Fractal behavior enables one-time verification for all scales
- Entire verification is done in a model checker
- Fractal invariants lead to some inefficient power allocations
  - very few, and they cause little loss in performance
- DPM protocol correct for all scales => system more energy secure

55 Thanks. Questions?

56 Quantifying Performance Cost
- Simulate Fractal DPM managing 8 computing resources
- All CRs make random power requests (L to H) every time step
- Compare performance with Oracle DPM (optimal allocator)
[Figure panels: linear perf % loss; asymptotic perf % loss]

57 Case for Scalably Verifiable DPM
- Naïve DPM cannot provide a strong power guarantee in a system with churn/failures

58 Case for Scalably Verifiable DPM
- Naïve DPM cannot provide a strong power guarantee in a system with churn/failures

59 Case for Scalably Verifiable DPM
- Naïve DPM cannot provide a strong power guarantee in a system with churn/failures
- DPM scheme correct for n nodes does not imply correct for n+x nodes
- Hence the need for DPM with a one-time proof of correctness for all scales of the system

60 Attaining Scalable Verification
System model:
- Abstract away the specific resources managed and the implementation mechanism
- Focus on verifying the DPM decision algorithm with model checkers
- Seek one-time DPM verification for any number of resources managed
  - inductive proof useful
Fractal behavior in DPM allocation decision:
- hierarchical structure enables inductive proof

61 Proof of Scalable Correctness of our DPM - Base Case
- Run exhaustive check of base case in Murphi
- Base case correct if no invariant violation
[Diagram labels: Root DPM Controller, DPM Controller, Leaf Node]

62 Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Down": request
- Q = Avg(X:Y)
[Diagram: Small System with state Q and Req. P; Large System with states X, Y, X:Y and Req. P, Req. X']

63 Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Down": grant
- P = Avg(X':Y)
[Diagram: Small System with state P and Grant P; Large System with states X', Y, X':Y and Grant X']

64 Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Up": request
- Z = Avg(X,Y)
[Diagram: Small System with states X, Y and Req. P; Large System with state Z and Req. P, Req. P']

65 Proof of Scalable Correctness of our DPM - Inductive Cases
"Looking-Up": grant
- Z = Avg(X,Y)
[Diagram: Small System with states X, Y and Grant P; Large System with state Z and Grant P, Grant P']

66 [Diagram: CRs MH, MH, MH, MH under controllers MH:MH and MH:MH; CRs M, M, H, H under controllers M:M and H:H with root M:H]

