1 EECS 262a Advanced Topics in Computer Systems Lecture 13 Resource allocation: Lithe/DRF March 7th, 2016 John Kubiatowicz Electrical Engineering and Computer Sciences, University of California, Berkeley

2 Today’s Papers
Composing Parallel Software Efficiently with Lithe. Heidi Pan, Benjamin Hindman, Krste Asanovic. Appears in Conference on Programming Language Design and Implementation (PLDI), 2010. Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica. Usenix NSDI 2011, Boston, MA, March 2011. Thoughts?

3 The Future is Parallel Software
Challenge: How to build many different large parallel apps that run well? Can’t rely solely on compiler/hardware: limited parallelism & energy efficiency. Can’t rely solely on hand-tuning: limited programmer productivity. If the future of software is parallel software, then in order to allow the software industry to thrive, we had better make sure that programmers can write parallel code productively.

4 Composability is Essential
[Diagram: modularity (the same app built on different library implementations, e.g. Goto BLAS vs. MKL BLAS) and code reuse (the same library implementation used by different apps, App 1 and App 2).] How do we enable programmer productivity? Well, we probably shouldn’t abandon software engineering practices that already work. Composability is key to building large, complex apps.

5 Motivational Example Sparse QR Factorization (Tim Davis, Univ. of Florida). [Diagram: the SPQR column elimination tree of frontal matrix factorizations, implemented on a software stack of TBB, MKL, and OpenMP on top of the OS and hardware.] So to begin, let’s take a real example of when parallel codes don’t compose well. Tim Davis, at the University of Florida, has written this sparse QR factorization code. The structure of the code is shown on the left. You can view it as a task graph, and each of the tasks is a dense matrix operation. On the right, we show how he actually used many existing libraries to implement his code. He used Intel’s TBB to express the elimination tree task graph, and he calls LAPACK for each of the dense matrix operations, which in turn calls Intel’s MKL BLAS, which in turn calls OpenMP.

6 TBB, MKL, OpenMP Intel’s Threading Building Blocks (TBB)
Library that allows programmers to express parallelism using a higher-level, task-based abstraction. Uses work-stealing internally (as in Cilk). Open-source. Intel’s Math Kernel Library (MKL): uses OpenMP for parallelism. OpenMP: allows programmers to express parallelism in the SPMD style using a combination of compiler directives and a runtime library. Creates SPMD teams internally (as in UPC). Open-source implementation of OpenMP from GNU (libgomp).

7 Suboptimal Performance
[Graph: performance of SPQR on a 16-core AMD Opteron system, as speedup over sequential, per matrix.] He ran his code using 4 different matrices. This graph shows the performance on a 16-core machine in terms of the speedup over the sequential version of the code, so higher is better. The dark blue bars on the right show the results he initially got, and the light blue bars on the left show the results he got after tuning the code. So what accounts for the performance disparity? And how come the out-of-the-box performance on a 16-core machine could be half of the sequential performance?

8 Out-of-the-Box Configurations
[Diagram: TBB and OpenMP each create virtualized kernel threads on top of the OS, which multiplexes them onto Core 0 through Core 3.] So the root of the problem occurs at the bottom of the system stack. Let’s simplify and say that the sparse QR code ran on a 4-core machine. The OS abstracts the machine and exports virtualized kernel threads for the app to use. The TBB runtime is going to use 4 kernel threads to model the 4 cores and map its tasks onto these threads. But the OpenMP runtime also uses 4 kernel threads to model the machine and map its parallel region onto those threads. Since there are now more threads than cores, the OS multiplexes these threads onto the machine. The two parts of the app interfere with each other and destroy the locality that was so carefully orchestrated by each of the runtimes.

9 Providing Performance Isolation
Using Intel MKL with Threaded Applications. Here’s another quick fix proposed by the MKL manual: if you’re calling MKL from a parallel app, just use the sequential version of the library.

10 “Tuning” the Code Performance of SPQR on 16-core AMD Opteron System
[Graph: speedup over sequential, per matrix.] Well, Tim Davis tried that too. In some cases it did better than the out-of-the-box version, and in some cases it actually did worse. But in all cases, it still didn’t do as well as the tuned version.

11 Partition Resources TBB OpenMP OS Hardware
[Diagram: the machine’s cores partitioned between TBB and OpenMP, on top of the OS and hardware.] So what did Tim Davis actually do for the “tuned” version? It turns out that he manually set the number of threads that TBB and OpenMP can use, which effectively partitioned the machine resources between the two libraries. We think that this is a step in the right direction, but we’d like to automate this process. It shouldn’t be the app writer’s responsibility to tune low-level libraries. Tim Davis “tuned” SPQR by manually partitioning the resources.

12 “Tuning” the Code (continued)
[Graph: performance of SPQR on a 16-core AMD Opteron system, as speedup over sequential, per matrix.] In some cases it did better than the out-of-the-box version, and in some cases it actually did worse. But in all cases, it still didn’t do as well as the tuned version.

13 Harts: Hardware Threads
[Diagram: the OS exporting virtualized kernel threads vs. the OS exporting harts for Core 0 through Core 3.] Expose true hardware resources. The application requests harts from the OS and “schedules” the harts itself (two-level scheduling). Can both space-multiplex and time-multiplex harts … but never time-multiplex harts of the same application. We argue that the existing abstraction for execution, virtualized and preemptive kernel threads, is inadequate. As Heidi has motivated, virtualization and preemption lead to oversubscription, and oversubscription impairs our ability to reason about optimizations like prefetching and cache blocking. So what is a better model for machine resources? (One might consider non-preemptive user-level threads, but even that is quite restricting.) We think that eliminating the abstraction created by virtualization and preemption is the right thing to do. Instead, the OS should expose raw hardware resources to applications. We call these resources harts, short for hardware thread contexts. There is a 1-to-1 mapping of a hart to a simple core. The OS gives the app a set of harts to use, which reflects some space partition of the machine. The app cannot arbitrarily create more harts. If it wants more, it can request more from the OS. But if the OS doesn’t give it more, it has to make do with what it has. The key property of harts is that they are always running, or, to provide a well-planned play on words, they are always beating. This is fundamentally different from how the OS has managed execution resources before. Before, application threads were multiplexed by the OS; now there is a finite number of harts that are always running, and the application has to explicitly map its own tasks onto its own harts. This differs from Tessellation in that it doesn’t actually need a modified OS in order to work; rather, a shim can be applied on top of OS-supplied threads to provide a hart environment.

14 Sharing Harts (Dynamically)
[Diagram: TBB and OpenMP dynamically sharing harts over time, on top of the OS and hardware.]

15 How to Share Harts?
[Diagram: an application call graph partitioned into regions managed by different schedulers (CLR, TBB, Cilk, OpenMP), forming a scheduler hierarchy.] To understand how harts will be used, it makes sense to remind ourselves how software tends to get written. Some code typically composes with other code (perhaps a library), and, as is good software engineering practice, we separate interface from implementation, such that some code might be implemented using TBB, other code might use OpenMP, yet other code might use Cilk, and so forth. To answer this, we need to see how these schedulers are related to each other. An app can be represented by a call graph. We believe that we should be able to schedule different pieces of that call graph independently, so we can split the call graph into regions that are managed by different schedulers. This is what SPQR would look like. And it might be called from a Cilk program, which also calls a routine written in Ct. So essentially we’ve formed a hierarchy of schedulers. We believe that when you call a function, the parent scheduler that corresponds to the caller should give its child scheduler, the callee, resources to actually execute that function, and the child scheduler will give back the resources when it’s done. So it’s a cooperative, hierarchical resource-sharing model. Hierarchically: the caller gives resources to the callee to execute. Cooperatively: the callee gives resources back to the caller when done.

16 A Day in the Life of a Hart
[Animation: a hart moving through the scheduler hierarchy; the TBB scheduler repeatedly asks “next?”, executes TBB tasks from its SchedQ, and when there is nothing left to do it gives the hart back to its parent scheduler (e.g. CLR or Cilk), which decides where it goes next. Non-preemptive scheduling.] So let’s see how a hart flows up and down this scheduler hierarchy. Let’s assume that a hart is currently executing TBB scheduler code, trying to figure out what to do next. The TBB scheduler has a task queue for that hart and pulls the next task off the head of the queue. After it’s done executing that task, it goes back to the scheduler, then pulls off the next task. At some point there are no more tasks in the queue, and there are no more tasks to steal from other queues, so the TBB scheduler decides to give the hart back to its parent. The Cilk scheduler is concerned with bounding space, so it decides not to start a new task but to finish an existing Ct one first, so it gives the hart to the Ct scheduler, which will then use it to execute its tasks. There are several observations to be made. First, this is a cooperative, not preemptive, model. The important thing is that the entire app runs well, not any individual piece, and preemption is costly, so we’d like to avoid it. Note that our work is not trying to address real-time or interactive GUI apps, where latency matters; we just want our apps to finish in the least amount of time. Second, at any scheduling decision point, the scheduler has three options: execute another one of its tasks, give the hart to an existing child, or give it back to its parent. Third, the hart management is very modular. Harts flow where they’re needed without any of the schedulers having to know the entire structure of the app. What ends up happening is that a scheduler higher in the hierarchy has a more global view of the app and directs harts toward one part of the call graph versus another, while a scheduler lower in the hierarchy is usually managing leaf tasks and figuring out how to map them efficiently onto the given harts.

17 Lithe (ABI)
[Diagram: a parent (caller) scheduler and a child (callee) scheduler; the function-call interface (call/return) exchanges values, while the Lithe interface (register, unregister, request, enter, yield) shares harts, e.g. between a Cilk parent and a TBB child, or a TBB parent and an OpenMP child.] For the harts to move up and down the hierarchy, we need a way for a parent and child scheduler to share harts. The interface needs to work regardless of whether it’s the Cilk scheduler sharing with the TBB scheduler, or TBB sharing with OpenMP. So we designed a standard interface, called Lithe, for any two schedulers to share harts. Note that we only provide the mechanism to share harts, not the actual policy of how to share them. Each scheduler can implement this interface however it wants to. This interface is analogous to the function-call ABI, which standardizes the way functions pass parameters and return values. It’s also analogous to the function-call ABI in that only the lowest level of the application software needs to worry about this interface: normal programmers never even have to think about how this works when it’s done right. Analogous to the function-call ABI for enabling interoperable codes.

18 A Few Details … A hart is only managed by one scheduler at a time
The Lithe runtime manages the hierarchy of schedulers and the interaction between schedulers. The Lithe ABI is only a mechanism to share harts, not a policy.
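To make the cooperative hart-sharing flow concrete, here is a minimal Python sketch of the idea. It is only a conceptual model: the real Lithe interface is a C ABI, and the names used here (LitheScheduler, yield_hart, waiting_children) are invented for illustration, not the actual API.

```python
# Conceptual model of Lithe-style cooperative hart sharing.
# Illustrative sketch only; not the real Lithe C ABI.

class LitheScheduler:
    """A node in the scheduler hierarchy. Harts flow down via enter()
    and back up via yield_hart(); there is no preemption."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # caller's scheduler (None at the root)
        self.tasks = []               # work this scheduler knows how to run
        self.waiting_children = []    # children that have requested harts

    def register(self):
        # Callee announces itself to its caller's scheduler.
        if self.parent:
            print(f"{self.name}: registered with {self.parent.name}")

    def request(self, nharts):
        # Ask the parent for harts; the parent grants them (if it chooses)
        # by calling child.enter() on a hart it is not using.
        if self.parent:
            self.parent.waiting_children.append((self, nharts))

    def enter(self, hart):
        # A hart arrives: run our own tasks, then lend the hart to a
        # requesting child, otherwise give it back to the parent.
        while self.tasks:
            self.tasks.pop()(hart)
        if self.waiting_children:
            child, _ = self.waiting_children.pop(0)
            child.enter(hart)
        else:
            self.yield_hart(hart)

    def yield_hart(self, hart):
        # Nothing left to do: return the hart to the parent scheduler.
        if self.parent:
            print(f"{self.name}: yielding hart {hart} to {self.parent.name}")
            self.parent.enter(hart)


# A TBB-like parent sharing a hart with an OpenMP-like child.
tbb = LitheScheduler("TBB")
omp = LitheScheduler("OpenMP", parent=tbb)
omp.register()
tbb.tasks = [lambda h: print(f"TBB task on hart {h}")]
omp.tasks = [lambda h: print(f"OpenMP task on hart {h}")]
omp.request(1)
tbb.enter(hart=0)     # the OS hands hart 0 to the root scheduler
```

At every decision point the scheduler has the same three choices the speaker notes describe: run another of its own tasks, pass the hart to a child that asked for it, or yield it back to its parent.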

19 Putting It All Together
[Animation: a function body annotated with Lithe calls: register(TBB); request(2); harts enter(TBB) and pull work from per-hart Lithe-TBB SchedQs; when their work is done they yield(), and the function ends with unregister(TBB).] The implementation of TBB’s enter function is analogous to how new threads are created within TBB: a new task queue is created, and since it’s empty, the hart will start stealing tasks from another task queue.

20 Synchronization Can’t block a hart on a synchronization object
Synchronization objects are implemented by saving the current “context” and having the hart re-enter the current scheduler. [Diagram: a hart reaching an OpenMP barrier (#pragma omp barrier) blocks its context and re-enters its scheduler; the TBB and OpenMP schedulers exchange harts via enter/yield/request; when the barrier completes, the context is unblocked (request(1)) and resumed.]

21 Lithe Contexts Includes notion of a stack
Includes context-local storage. There is a special transition context for each hart that allows it to transition between schedulers easily (e.g. on an enter or yield).

22 Lithe-compliant Schedulers
TBB Worker model ~180 lines added, ~5 removed, ~70 modified (~1,500 / ~8,000 total) OpenMP Team model ~220 lines added, ~35 removed, ~150 modified (~1,000 / ~6,000 total)

23 Overheads? TBB Example micro-benchmarks that Intel includes with releases OpenMP NAS benchmarks (conjugate gradient, LU solver, and multigrid)

24 Flickr Application Server
GraphicsMagick parallelized using OpenMP. Server component parallelized using threads (or libprocess processes). Spectrum of possible implementations: Process one image upload at a time, passing all resources to OpenMP (via GraphicsMagick): easy implementation, but can’t overlap communication with computation, some network links are slow, images are different sizes, and there are diminishing returns on resize operations. Process as many images as possible at a time, running GraphicsMagick sequentially: also an easy implementation, but really bad latency when the server is under low load, leaving the 32-core machine underutilized. All points in between: account for changing load, different image sizes, and different link bandwidth/latency, but hard to program.

25 Flickr-Like App Server
[Graph: a Flickr-like app server built from libprocess, GraphicsMagick, and OpenMP (with Lithe); tradeoff between throughput saturation point and latency.]

26 Case Study: Sparse QR Factorization
Different matrix sizes deltaX creates ~30,000 OpenMP schedulers Rucci creates ~180,000 OpenMP schedulers Platform: Dual-socket 2.66 GHz Intel Xeon (Clovertown) with 4 cores per socket (8 total cores)

27 Case Study: Sparse QR Factorization
ESOC: Sequential 172.1, Out-of-the-box 111.8, Tuned 70.8, Lithe 66.7
Rucci: Sequential 970.5, Out-of-the-box 576.9, Tuned 360.0, Lithe 354.7
(execution time; lower is better)

28 Case Study: Sparse QR Factorization
deltaX: Sequential 37.9, Out-of-the-box 26.8, Tuned 14.5, Lithe 13.6
landmark: Sequential 3.4, Out-of-the-box 4.1, Tuned 2.5, Lithe 2.3
(execution time; lower is better)

29 Is this a good paper? What were the authors’ goals?
What about the evaluation/metrics? Did they convince you that this was a good system/approach? Were there any red-flags? What mistakes did they make? Does the system/approach meet the “Test of Time” challenge? How would you review this paper today?

30 Break

31 What is Fair Sharing?
n users want to share a resource (e.g., CPU). Solution: allocate each user 1/n of the shared resource. Generalized by max-min fairness: handles the case where a user wants less than its fair share, e.g. user 1 wants no more than 20%, so the remaining users split what is left. Generalized further by weighted max-min fairness: give weights to users according to importance, e.g. user 1 gets weight 1, user 2 weight 2. [Charts: equal shares (33% each), max-min shares when user 1 caps at 20% (20%/40%/40%), and weighted shares (33%/66%).]
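The water-filling computation behind (weighted) max-min fairness is short enough to sketch. This is my own illustrative Python, with a hypothetical function name; it reproduces the percentages on this slide and the weighted split from the next one.

```python
def max_min_share(capacity, demands, weights=None):
    """Weighted max-min fair allocation of a single divisible resource.

    demands[i] is the most user i wants; weights default to 1.
    Classic water-filling: satisfy users whose demand is below their
    weighted fair share, then split the leftover among the rest.
    """
    n = len(demands)
    weights = weights or [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total_w = sum(weights[i] for i in active)
        share = {i: remaining * weights[i] / total_w for i in active}
        satisfied = [i for i in active if demands[i] <= share[i]]
        if not satisfied:
            for i in active:
                alloc[i] = share[i]
            return alloc
        for i in satisfied:
            alloc[i] = demands[i]
            remaining -= demands[i]
            active.remove(i)
    return alloc

# Slide 31's example: user 1 wants no more than 20% of the CPU, so the
# remaining 80% is split between users 2 and 3 (40% each).
print(max_min_share(100, [20, 100, 100]))        # [20.0, 40.0, 40.0]
# Weighted variant (slide 32): weights 2 and 1 give a ~66%/33% split.
print(max_min_share(100, [100, 100], [2, 1]))    # ~[66.7, 33.3]
```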

32 Why is Fair Sharing Useful?
Weighted Fair Sharing / Proportional Shares: user 1 gets weight 2, user 2 weight 1. Priorities: give user 1 weight 1000, user 2 weight 1. Reservations: ensure user 1 gets 10% of a resource by giving user 1 weight 10 and keeping the sum of weights ≤ 100. Isolation Policy: users cannot affect others beyond their fair share. [Charts: CPU split 66%/33% under weights 2:1, and a CPU allocation illustrating a 10% reservation.]

33 Properties of Max-Min Fairness
Share guarantee Each user can get at least 1/n of the resource But will get less if her demand is less Strategy-proof Users are not better off by asking for more than they need Users have no reason to lie Max-min fairness is the only “reasonable” mechanism with these two properties

34 Why Care about Fairness?
Desirable properties of max-min fairness: Isolation policy: a user gets her fair share irrespective of the demands of other users. Flexibility separates mechanism from policy: proportional sharing, priority, reservation, ... Many schedulers use max-min fairness. Datacenters: Hadoop’s fair scheduler, capacity scheduler, Quincy. OS: round-robin, proportional sharing, lottery scheduling, Linux CFS, ... Networking: WFQ, WF2Q, SFQ, DRR, CSFQ, ...

35 When is Max-Min Fairness not Enough?
Need to schedule multiple, heterogeneous resources Example: Task scheduling in datacenters Tasks consume more than just CPU – CPU, memory, disk, and I/O What are today’s datacenter task demands?

36 Heterogeneous Resource Demands
Some tasks are CPU-intensive. Most tasks need ~<2 CPU, 2 GB RAM>. Some tasks are memory-intensive. 2000-node Hadoop cluster at Facebook (Oct 2010).

37 Problem
Single-resource example: 1 resource (CPU). User 1 wants <1 CPU> per task, user 2 wants <3 CPU> per task. [Chart: CPU split 50%/50%.] Multi-resource example: 2 resources (CPU and memory). User 1 wants <1 CPU, 4 GB> per task, user 2 wants <3 CPU, 1 GB> per task. What is a fair allocation? [Chart: CPU and memory shares marked “?”.]

38 How to fairly share multiple resources when users have heterogeneous demands on them?
Problem definition

39 Model Users have tasks according to a demand vector
e.g. <2, 3, 1> user’s tasks need 2 R1, 3 R2, 1 R3 Not needed in practice, can simply measure actual consumption Resources given in multiples of demand vectors Assume divisible resources

40 What is Fair? Goal: define a fair allocation of multiple cluster resources between multiple users Example: suppose we have: 30 CPUs and 30 GB RAM Two users with equal shares User 1 needs <1 CPU, 1 GB RAM> per task User 2 needs <1 CPU, 3 GB RAM> per task What is a fair allocation?

41 First Try: Asset Fairness
Equalize each user’s sum of resource shares. Cluster with 70 CPUs, 70 GB RAM. U1 needs <2 CPU, 2 GB RAM> per task; U2 needs <1 CPU, 2 GB RAM> per task. Asset fairness yields: U1: 15 tasks = 30 CPUs, 30 GB (∑ = 60); U2: 20 tasks = 20 CPUs, 40 GB (∑ = 60). Problem: user 1 has less than 50% of both CPUs and RAM, so she would be better off in a separate cluster with 50% of the resources. [Chart: U1 gets 43% of CPU and 43% of RAM; U2 gets 28% of CPU and 57% of RAM.]
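As a sanity check on these numbers, here is a small fluid-model sketch (my own code, not the paper's) that equalizes the users' asset sums and recovers the 15-task / 20-task split.

```python
def asset_fair_tasks(capacity, demands):
    """Fluid-model asset fairness: give every user the same total 'asset'
    (sum of her resource shares) and scale it up until a resource runs out.
    Returns fractional task counts per user. Illustrative sketch only."""
    n_res = len(capacity)
    # Each user's per-task demand as a fraction of the cluster.
    shares = [[d[r] / capacity[r] for r in range(n_res)] for d in demands]
    asset_per_task = [sum(s) for s in shares]
    # At a common asset level a, user u runs a / asset_per_task[u] tasks,
    # so resource r is used at rate sum_u shares[u][r] / asset_per_task[u].
    usage_per_asset = [
        sum(s[r] / apt for s, apt in zip(shares, asset_per_task))
        for r in range(n_res)
    ]
    a = min(1.0 / u for u in usage_per_asset)   # keep every resource <= 100%
    return [a / apt for apt in asset_per_task]

# Slide 41: 70 CPUs and 70 GB; U1 needs <2 CPU, 2 GB>, U2 needs <1 CPU, 2 GB>.
print(asset_fair_tasks([70, 70], [[2, 2], [1, 2]]))
# ~[15.0, 20.0]: U1 gets 30 CPU + 30 GB, U2 gets 20 CPU + 40 GB
```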

42 Lessons from Asset Fairness
“You shouldn’t do worse than if you ran a smaller, private cluster equal in size to your fair share.” Thus, given N users, each user should get ≥ 1/N of her dominant resource (i.e., the resource of which she consumes the most). What to do with the rest of the resources?

43 Desirable Fair Sharing Properties
Many desirable properties: Share guarantee, Strategy-proofness, Envy-freeness, Pareto efficiency, Single-resource fairness, Bottleneck fairness, Population monotonicity, Resource monotonicity. DRF focuses on these properties.

44 Cheating the Scheduler
Some users will game the system to get more resources. Real-life examples: A cloud provider had quotas on map and reduce slots; some users found out that the map quota was low, so they implemented their maps in the reduce slots! A search company provided dedicated machines to users that could ensure a certain level of utilization (e.g. 80%); users used busy-loops to inflate utilization.

45 Two Important Properties
Strategy-proofness A user should not be able to increase her allocation by lying about her demand vector Intuition: Users are incentivized to make truthful resource requirements Envy-freeness No user would ever strictly prefer another user’s lot in an allocation Don’t want to trade places with any other user

46 Challenge A fair sharing policy that provides
Strategy-proofness Share guarantee Max-min fairness for a single resource had these properties Generalize max-min fairness to multiple resources

47 Dominant Resource Fairness
A user’s dominant resource is the resource she has the biggest share of Example: Total resources: <10 CPU, 4 GB> User 1’s allocation: <2 CPU, 1 GB> Dominant resource is memory as 1/4 > 2/10 (1/5) A user’s dominant share is the fraction of the dominant resource she is allocated User 1’s dominant share is 25% (1/4)
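In code, the dominant share is just the largest of a user's per-resource shares. A tiny sketch (hypothetical function name, my own code) that reproduces this slide's example:

```python
def dominant_share(capacity, allocation):
    # Share of the user's dominant resource: the largest per-resource share.
    return max(a / c for a, c in zip(allocation, capacity))

# Slide 47: total <10 CPU, 4 GB>, user 1 holds <2 CPU, 1 GB>.
print(dominant_share([10, 4], [2, 1]))   # 0.25, so memory is the dominant resource
```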

48 Dominant Resource Fairness (2)
Apply max-min fairness to dominant shares: equalize the dominant share of the users. Example: total resources: <9 CPU, 18 GB>. User 1 demand: <1 CPU, 4 GB>; dominant resource: memory. User 2 demand: <3 CPU, 1 GB>; dominant resource: CPU. [Chart: user 1 gets 3 CPUs and 12 GB, user 2 gets 6 CPUs and 2 GB; each ends up with a 66% dominant share.]
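A closed-form fluid sketch of this equalization (my own illustrative code, not the paper's): grow every user's task count so that all dominant shares stay equal until some resource is exhausted. It reproduces the 3-task / 2-task split above.

```python
def drf_fluid(capacity, demands):
    """Fluid-model DRF: all users get the same dominant share x, the largest
    x for which the resulting allocation still fits in the cluster.
    Returns fractional task counts. Sketch only."""
    n_res = len(capacity)
    # Dominant share contributed by one task of each user.
    dom = [max(d[r] / capacity[r] for r in range(n_res)) for d in demands]
    # At dominant share x, user u runs x / dom[u] tasks and uses
    # (x / dom[u]) * d[r] of resource r; pick the largest feasible x.
    usage_per_x = [
        sum(d[r] / du for d, du in zip(demands, dom)) for r in range(n_res)
    ]
    x = min(capacity[r] / usage_per_x[r] for r in range(n_res))
    return [x / du for du in dom]

# Slide 48: <9 CPU, 18 GB>; U1 wants <1 CPU, 4 GB>, U2 wants <3 CPU, 1 GB>.
print(drf_fluid([9, 18], [[1, 4], [3, 1]]))
# ~[3.0, 2.0]: U1 gets 3 CPU + 12 GB, U2 gets 6 CPU + 2 GB, both at a 2/3 dominant share
```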

49 DRF is Fair DRF is strategy-proof DRF satisfies the share guarantee
DRF allocations are envy-free See DRF paper for proofs

50 Online DRF Scheduler Whenever there are available resources and tasks to run: Schedule a task to the user with smallest dominant share O(log n) time per decision using binary heaps Need to determine demand vectors
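The discrete online loop sketched on this slide fits in a few lines. This is an illustrative Python rendering (not the authors' implementation) of "pick the user with the smallest dominant share", using a binary heap for the O(log n) decision; it assumes every task has a non-zero demand.

```python
import heapq

def drf_online(capacity, demands):
    """Online DRF sketch: repeatedly launch one task for the user with the
    smallest dominant share, as long as that task still fits."""
    n_res = len(capacity)
    free = list(capacity)
    launched = [0] * len(demands)
    # Dominant share added by one task of each user.
    dom = [max(d[r] / capacity[r] for r in range(n_res)) for d in demands]
    heap = [(0.0, u) for u in range(len(demands))]   # (dominant share, user)
    heapq.heapify(heap)
    while heap:
        share, u = heapq.heappop(heap)
        if all(demands[u][r] <= free[r] + 1e-9 for r in range(n_res)):
            for r in range(n_res):
                free[r] -= demands[u][r]
            launched[u] += 1
            heapq.heappush(heap, (share + dom[u], u))
        # If the user's next task no longer fits, she simply drops out.
    return launched

# Slide 48's example, now in whole tasks:
print(drf_online([9, 18], [[1, 4], [3, 1]]))   # [3, 2]
```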

51 Alternative: Use an Economic Model
Approach: set prices for each good and let users buy what they want. How do we determine the right prices for different goods? Let the market determine the prices. Competitive Equilibrium from Equal Incomes (CEEI): give each user 1/n of every resource, then let users trade in a perfectly competitive market. Not strategy-proof!

52 Determining Demand Vectors
They can be measured: look at the actual resource consumption of a user. They can be provided by the user: this is what is done today. In both cases, strategy-proofness incentivizes users to consume resources wisely.

53 DRF vs CEEI
User 1: <1 CPU, 4 GB>, User 2: <3 CPU, 1 GB>. DRF is more fair; CEEI gives better utilization. If User 2 instead reports <3 CPU, 2 GB>, she increases her share of both CPU and memory under CEEI. [Charts: CPU and memory shares under Dominant Resource Fairness (66% dominant share for each user) and under Competitive Equilibrium from Equal Incomes; annotated values include 55% and 91% for the truthful case and 60% and 80% after the demand change. Nash: u1: .5161, u2: …]

54 Example of DRF vs Asset vs CEEI
Resources: <1000 CPUs, 1000 GB>. Two users: A with demand <2 CPU, 3 GB> per task, B with demand <5 CPU, 1 GB> per task. Resulting <CPU, RAM> shares:
(a) DRF: A = <2/5, 3/5>, B = <3/5, 3/25>
(b) Asset fairness: A = <12/37, 18/37> ≈ <0.324, 0.486>, B = <25/37, 5/37> ≈ <0.675, 0.135>
(c) CEEI: A = <0.5, 0.75>, B = <0.5, 0.1>

55 Desirable Fairness Properties (1)
Recall max-min fairness from networking: maximize the bandwidth of the minimum flow [Bert92]. Progressive filling (PF) algorithm: (1) allocate ε to every flow until some link is saturated; (2) freeze the allocation of all flows on the saturated link and go to 1.
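Below is a small exact-step rendering of progressive filling for flow rates (my own sketch; link and flow names are made up). Instead of epsilon increments, each round jumps straight to the next link saturation and freezes the flows crossing that link.

```python
def progressive_filling(link_capacity, flow_links):
    """Max-min fair flow rates via progressive filling."""
    rate = {f: 0.0 for f in flow_links}
    remaining = dict(link_capacity)
    active = set(flow_links)
    while active:
        # How much every active flow could still grow before each link fills.
        headroom = {}
        for link, cap in remaining.items():
            crossing = [f for f in active if link in flow_links[f]]
            if crossing:
                headroom[link] = cap / len(crossing)
        bottleneck = min(headroom, key=headroom.get)
        delta = headroom[bottleneck]
        for f in active:
            rate[f] += delta
            for link in flow_links[f]:
                remaining[link] -= delta
        # Freeze every flow crossing the saturated link, then repeat.
        active -= {f for f in active if bottleneck in flow_links[f]}
        del remaining[bottleneck]
    return rate

# Link A has capacity 10, link B has capacity 20; flow f1 crosses both,
# f2 only A, f3 only B.
print(progressive_filling({"A": 10, "B": 20},
                          {"f1": {"A", "B"}, "f2": {"A"}, "f3": {"B"}}))
# {'f1': 5.0, 'f2': 5.0, 'f3': 15.0}
```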

56 Desirable Fairness Properties (2)
P1. Pareto Efficiency It should not be possible to allocate more resources to any user without hurting others P2. Single-resource fairness If there is only one resource, it should be allocated according to max/min fairness P3. Bottleneck fairness If all users want most of one resource(s), that resource should be shared according to max/min fairness

57 Desirable Fairness Properties (3)
Assume positive demands (Dij > 0 for all i and j). DRF will allocate the same dominant share to all users: as soon as progressive filling saturates a resource, every user’s allocation is frozen.

58 Desirable Fairness Properties (4)
P4. Population monotonicity: if a user leaves and relinquishes her resources, no other user’s allocation should be hurt. This can happen each time a job finishes. CEEI violates population monotonicity; DRF satisfies it (assuming positive demands). Intuitively, DRF gives the same dominant share to all users, so if some user were hurt when another left, every user would be hurt, contradicting Pareto efficiency.

59 Properties of Policies
[Table comparing the three policies (Asset fairness, CEEI, DRF) on each property: share guarantee, strategy-proofness, Pareto efficiency, envy-freeness, single-resource fairness, bottleneck resource fairness, population monotonicity, resource monotonicity.]

60 Evaluation Methodology
Micro-experiments on EC2 Evaluate DRF’s dynamic behavior when demands change Compare DRF with current Hadoop scheduler Macro-benchmark through simulations Simulate Facebook trace with DRF and current Hadoop scheduler

61 Dominant shares are equalized
DRF inside Mesos on EC2: 2 jobs on a 48-node EC2 cluster of extra-large instances (4 CPUs, 15 GB RAM each) running Mesos, over 6 minutes, with the jobs randomly changing their demands every 2 minutes. Minutes 0-2: Job 1 <1 CPU, 10 GB> (memory-dominant), Job 2 <1 CPU, 1 GB> (CPU-dominant). Minutes 2-4: Job 1 <2 CPU, 4 GB>, Job 2 <1 CPU, 3 GB>. Minutes 4-6: Job 1 <1 CPU, 7 GB>, Job 2 <1 CPU, 4 GB>. [Graphs: each job’s CPU, memory, and dominant shares over time; the dominant shares are equalized, with a share guarantee of ~70% dominant share in one phase and ~50% in another.]

62 Fairness in Today’s Datacenters
Hadoop Fair Scheduler / Capacity Scheduler / Quincy: each machine consists of k slots (e.g. k = 14), with at most one task per slot. Give jobs “equal” numbers of slots, i.e., apply max-min fairness to the slot count. This is what the DRF paper compares against.

63 Experiment: DRF vs Slots
[Graphs: number of Type 1 and Type 2 jobs finished under DRF vs. slot-based scheduling; the slot-based scheduler shows thrashing and low utilization. EC2 machines with 8 CPUs and 7 GB RAM; Type 1 jobs need <2 CPU, 2 GB>, Type 2 jobs need <1 CPU, 0.5 GB>.]

64 Experiment: DRF vs Slots
[Graphs: completion time of Type 1 and Type 2 jobs under DRF vs. slot-based scheduling; thrashing and low utilization hurt performance. Type 1 jobs need <2 CPU, 2 GB>, Type 2 jobs need <1 CPU, 0.5 GB>.]

65 Reduction in Job Completion Time DRF vs Slots
Simulation of a 1-week Facebook trace on a 2000-node cluster; the plot shows a 3000-second excerpt. Reduction in completion time is computed as 100 * delta / old_time.

66 Utilization of DRF vs Slots
Simulation of the Facebook workload on a 2000-node cluster for a week; the plot shows a 3000-second excerpt.

67 Summary DRF provides multiple-resource fairness in the presence of heterogeneous demand. It is the first generalization of max-min fairness to multiple resources. DRF’s properties: share guarantee (each user gets at least 1/n of one resource) and strategy-proofness (lying can only hurt you). DRF performs better than current approaches.

68 Is this a good paper? What were the authors’ goals?
What about the evaluation/metrics? Did they convince you that this was a good system/approach? Were there any red-flags? What mistakes did they make? Does the system/approach meet the “Test of Time” challenge? How would you review this paper today?

