
1 Operating Systems Should Manage Accelerators  Sankaralingam Panneerselvam, Michael M. Swift  Computer Sciences Department, University of Wisconsin-Madison  HotPar 2012

2 Accelerators  Specialized execution units for common tasks: data-parallel computation, encryption, video encoding, XML parsing, network processing, etc.  Tasks are offloaded to accelerators for high performance, energy efficiency, or both  Examples: GPU, wire-speed processor, APU, DySER, crypto accelerator

3 Why accelerators?  Moore's law and the end of Dennard scaling drove the shift from single core to multi-core  Dark silicon: more transistors, but a limited power budget [Esmaeilzadeh et al., ISCA '11]  Accelerators trade area for specialized logic, improving performance by up to 250x and energy efficiency by up to 500x [Hameed et al., ISCA '10]  The result is heterogeneous architectures

4 Heterogeneity everywhere  Focus on diverse heterogeneous systems: processing elements with different properties in the same system  Our work: how can the OS support this architecture? 1. Abstract heterogeneity 2. Enable flexible task execution 3. Multiplex shared accelerators among applications

5 Outline  Motivation  Classification of accelerators  Challenges  Our model - Rinnegan  Conclusion

6 IBM wire-speed processor  (Die diagram showing the accelerator complex, CPU cores, and memory controller)

7 Classification of accelerators  Accelerators can be classified by how they are accessed:  Acceleration devices  Co-processors  Asymmetric cores

8 Classification
Acceleration devices (GPU, APU, crypto accelerators in Sun Niagara chips): off-chip or on-chip devices, accessed through device drivers, with data moved by DMA or zero copy; contention arises when multiple applications want to use the device.
Co-processors (C-cores, DySER, IBM wire-speed processor): on-chip units accessed through ISA instructions, with data in system memory; contention arises when the resource is shared across multiple cores.
Asymmetric cores (NVIDIA's Kal-El processor, scalable cores like WiDGET): regular cores that can execute threads, accessed by thread migration or RPC, with data shared through cache coherence; contention arises when multiple threads want the special core.
Takeaways: 1) applications must deal with different classes of accelerators with different access mechanisms; 2) resources must be shared among multiple applications.

9 Challenges  Task invocation: execution on different processing elements  Virtualization: sharing across multiple applications  Scheduling: time-multiplexing the accelerators

10 Task invocation  Current systems decide statically where to execute a task  Problems the programmer faces where the system can help: data granularity, power budget limitations, presence/availability of an accelerator  (Figure: a task that could run as AES instructions on a CPU or on a crypto accelerator)
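To make the task-invocation decision concrete, here is a minimal sketch, not from the talk: a runtime chooses between a CPU implementation and a crypto accelerator based on input size and accelerator availability. The threshold and the functions accelerator_available(), encrypt_on_cpu(), and encrypt_on_accel() are hypothetical names used only for illustration.

    /* Hypothetical dispatch heuristic: pick where to run an encryption task. */
    #include <stdbool.h>
    #include <stddef.h>

    #define OFFLOAD_THRESHOLD (64 * 1024)   /* assumed break-even input size in bytes */

    bool accelerator_available(void);                            /* assumed availability query */
    void encrypt_on_cpu(const void *in, void *out, size_t n);    /* e.g., AES instructions */
    void encrypt_on_accel(const void *in, void *out, size_t n);  /* offload to crypto accelerator */

    void encrypt_task(const void *in, void *out, size_t n)
    {
        /* Small buffers are not worth the offload overhead; large buffers go to
         * the accelerator only if one is present and currently available. */
        if (n >= OFFLOAD_THRESHOLD && accelerator_available())
            encrypt_on_accel(in, out, n);
        else
            encrypt_on_cpu(in, out, n);
    }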

11 Virtualization  Data addressing for accelerators: accelerators need to understand virtual addresses, so the OS must provide virtual-address translations  Process preemption after launching a computation: the system must be aware of computation completion  (Figure: accelerator units and CPU cores addressing a process's virtual memory)
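As an illustration of these virtualization requirements only (the types and functions below are invented, not Rinnegan's or any real kernel's API), an OS could bind the process's page tables to the accelerator's translation hardware before a launch and record the outstanding task so the process can be preempted and still learn about completion.

    /* Hypothetical kernel-side launch path for an accelerator that shares the
     * process's virtual address space; every name here is invented. */
    struct accel;                                   /* opaque accelerator handle */
    struct process { unsigned long page_table_root; };

    struct accel_task {
        unsigned long in_vaddr;     /* virtual address of the input buffer */
        unsigned long out_vaddr;    /* virtual address of the output buffer */
        volatile int  done;         /* set by the completion interrupt handler */
    };

    /* Assumed primitives, not real kernel APIs. */
    void accel_bind_address_space(struct accel *a, unsigned long page_table_root);
    void register_outstanding_task(struct process *p, struct accel_task *t);
    int  accel_start(struct accel *a, struct accel_task *t);

    int accel_launch(struct process *p, struct accel *a, struct accel_task *t)
    {
        /* Let the accelerator resolve the process's virtual addresses itself,
         * e.g., by pointing an IOMMU at the process page tables. */
        accel_bind_address_space(a, p->page_table_root);

        /* Track the in-flight task so the process can be preempted after the
         * launch and still be notified when the computation completes. */
        register_outstanding_task(p, t);

        return accel_start(a, t);
    }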

12 Scheduling  Scheduling is needed to prioritize access to shared accelerators  User-mode access to accelerators complicates scheduling decisions: the OS cannot interpose on every request

13 Outline  Motivation  Classification of accelerators  Challenges  Our model - Rinnegan  Conclusion

14 Rinnegan  Goals: applications leverage the heterogeneity, while the OS enforces system-wide policies  Rinnegan supports different classes of accelerators through:  Flexible task execution via an accelerator stub  Safe multiplexing via an accelerator agent  System-wide policy enforcement via an accelerator monitor

15 Accelerator stub  Entry point for accelerator invocation and a dispatcher  Abstracts different processing elements  Binding: selecting the best implementation for the task  Static: during compilation  Early: at process start  Late: at call  (Figure: a process calls the stub, which either encrypts with AES instructions or offloads to the crypto accelerator)
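A minimal sketch of what such a stub might look like for encryption, with invented names (crypt_stub, choose_crypt_impl, and the two implementations are not from the talk): the stub exposes a single entry point and, with late binding, selects an implementation at each call.

    #include <stddef.h>

    /* One task, several implementations; the stub dispatches among them. */
    typedef void (*crypt_impl_t)(const void *in, void *out, size_t n);

    void crypt_aes_insns(const void *in, void *out, size_t n);      /* CPU AES instructions */
    void crypt_accel_offload(const void *in, void *out, size_t n);  /* crypto accelerator */

    /* Assumed policy hook. Static binding would hard-code one implementation at
     * compile time; early binding would set a pointer once at process start;
     * late binding, shown here, decides on every call. */
    crypt_impl_t choose_crypt_impl(size_t n);

    void crypt_stub(const void *in, void *out, size_t n)
    {
        crypt_impl_t impl = choose_crypt_impl(n);  /* late binding: pick per call */
        impl(in, out, n);
    }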

16 Accelerator agent  Manages an accelerator  Roles of an agent:  Virtualize the accelerator  Forward requests to the accelerator  Implement scheduling decisions  Expose accounting information to the OS  (Figure: user-space stubs submit work to kernel-space agents and a resource allocator; an agent may drive an asymmetric core by thread migration or a device through its driver, and a process may also communicate with the accelerator directly)
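One way to picture an agent, as a sketch with invented types (the talk does not prescribe this interface): a per-accelerator kernel object that queues requests from many processes, dispatches them according to scheduling decisions, and exposes usage to the rest of the OS.

    /* Hypothetical per-accelerator agent; all names are illustrative. */
    struct process;                     /* opaque process handle */

    struct accel_request {
        struct process       *owner;    /* process that submitted the work */
        void                 *args;     /* task description */
        struct accel_request *next;     /* run-queue link */
    };

    struct accel_agent {
        const char           *name;
        struct accel_request *run_queue;   /* requests waiting for the accelerator */
        unsigned long         busy_ns;     /* total time the accelerator was busy */

        /* Roles from the slide: forward requests, implement scheduling
         * decisions, and expose accounting information to the OS. */
        int  (*submit)(struct accel_agent *a, struct accel_request *r);
        struct accel_request *(*pick_next)(struct accel_agent *a);
        unsigned long (*usage_of)(struct accel_agent *a, struct process *p);
    };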

17 Accelerator monitor  Monitors information such as resource utilization  Responsible for system-wide energy and performance goals  Acts as an online modeling tool  Helps decide where to schedule a task  (Figure: the stub consults the accelerator monitor, which sits alongside the accelerator agents in the kernel)
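As an illustrative sketch (again with invented names), the monitor could aggregate the utilization reported by each agent and answer placement queries from stubs, recommending either an accelerator or the CPU.

    /* Hypothetical accelerator monitor; agent_supports() is an assumed capability query. */
    #define MAX_AGENTS 16

    struct accel_agent;
    int agent_supports(struct accel_agent *a, int task_kind);

    struct accel_monitor {
        struct accel_agent *agents[MAX_AGENTS];
        double              load[MAX_AGENTS];   /* utilization in [0, 1] per accelerator */
        int                 nagents;
    };

    /* Return the index of the least-loaded accelerator that can run the task,
     * or -1 to suggest running it on a CPU core instead. */
    int monitor_place_task(const struct accel_monitor *m, int task_kind)
    {
        int best = -1;
        double best_load = 1.0;   /* a saturated accelerator is no better than the CPU */

        for (int i = 0; i < m->nagents; i++) {
            if (!agent_supports(m->agents[i], task_kind))
                continue;
            if (m->load[i] < best_load) {
                best_load = m->load[i];
                best = i;
            }
        }
        return best;
    }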

18 Programming model  Based on task-level parallelism  Task creation and execution:  Write: different implementations of the task  Launch: similar to a function invocation  Fetching the result: supports synchronous and asynchronous invocation  Example frameworks: Intel TBB, Cilk  Slide pseudocode:
    application() {
        task = cryptStub(inputData, outputData);
        ...
        waitFor(task);
    }

    cryptStub(input, output) {
        if (factor1 == true)
            output = implmn_AES(input);    // encrypt with AES instructions
        else if (factor2 == true)
            output = implmn_CACC(input);   // offload to the crypto accelerator
    }

19 Summary  (Figure: a process invokes accelerators through user-space stubs; in the kernel, accelerator agents, the accelerator monitor, and other kernel components mediate access to CPU cores and accelerators, and direct communication with an accelerator is also possible)

20 Related work  Task invocation based on data granularity: Merge, Harmony  Generating device-specific implementations of the same task: GPUOcelot  Resource scheduling, i.e., sharing between multiple applications: PTask, Pegasus

21 Conclusions  Accelerators will be common in future systems  Rinnegan embraces accelerators through its stub mechanism, agents, and monitoring component  Other challenges: data movement between devices, power measurement

22 Thank You

