Published by Brenda Daniels. Modified over 5 years ago.
Iroko: A Data Center Emulator for Reinforcement Learning
Fabian Ruffy, Michael Przystupa, Ivan Beschastnikh

Introduction
- Reinforcement Learning (RL) is gaining traction in computer networking.
- Previous research calls into question current standards for RL algorithm performance.
- Iroko is an RL benchmark platform for computer networking problems.

Iroko: Design and Data Center Topologies
- Iroko uses Mininet for network emulation and Ray as the RL algorithm framework.
- Supported topologies: non-blocking Fat Tree and Dumbbell.
[Figures: Non-blocking Fat Tree topology; Dumbbell topology]

Reinforcement Learning Formulation
- State: congestion buildup can be signaled by a number of switch statistics.
- Action: Iroko allows agents to adjust hosts' sending rates.
- Reward: Iroko rewards fair bandwidth allocations while penalizing queue buildup.

Emulator Execution Flow
[Figure: Emulator execution flow]

Using Iroko to Compare Existing Controllers with RL
- Compared DDPG [1], PPO [2], and REINFORCE [3] to DCTCP [4] and TCP New Vegas [5].
- Experimented with the Dumbbell and Fat Tree topologies.
- Algorithms can learn within Iroko environments.
- RL can compete with existing controllers, but DCTCP outperforms learned policies.
[Figures: Fat Tree with UDP; Dumbbell with UDP; Dumbbell with TCP]

Conclusions
- Iroko is a viable platform for comparing RL algorithms on data center problems.
- Latency in data centers makes it difficult to correlate actions with the currently detected state (credit assignment).
- Emulating authentic data center scenarios remains challenging.
- It is hard to determine the best configuration for RL algorithms without more hyperparameter tuning.

References
[1] Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D. Continuous control with deep reinforcement learning. CoRR abs/ (2015).
[2] Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. Proximal policy optimization algorithms. CoRR abs/ (2017).
[3] Sutton, R. S., McAllester, D., Singh, S., and Mansour, Y. Policy gradient methods for reinforcement learning with function approximation. NIPS 1999.
[4] Alizadeh, M., Greenberg, A., Maltz, D. A., Padhye, J., Patel, P., Prabhakar, B., Sengupta, S., and Sridharan, M. Data center TCP (DCTCP). SIGCOMM 2010.
[5] Sing, J., and Soh, B. TCP New Vegas: improving the performance of TCP Vegas over high latency links. NCA 2005.
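The Reinforcement Learning Formulation part of the poster describes the action (adjusting hosts' sending rates) and the reward (fair bandwidth allocation with small queues) only in words. The Python sketch below is one possible reading of that formulation, not Iroko's actual implementation. The function names, the use of Jain's fairness index, the normalization, and the `queue_weight` parameter are all assumptions made for illustration.

```python
from typing import List, Sequence


def scale_actions(actions: Sequence[float], link_capacity_bps: float) -> List[float]:
    """Map per-host actions to sending-rate limits (hypothetical helper).

    Each action is clipped to [0, 1] and scaled by the link capacity.
    """
    return [min(max(a, 0.0), 1.0) * link_capacity_bps for a in actions]


def jain_fairness(rates: Sequence[float]) -> float:
    """Jain's fairness index: 1.0 for equal rates, 1/n if one host takes everything.

    Using this index is an assumption; the poster only says "fair".
    """
    total = sum(rates)
    squares = sum(r * r for r in rates)
    if squares == 0:
        return 0.0  # no traffic at all: treat as zero fairness
    return (total * total) / (len(rates) * squares)


def toy_reward(
    rates_bps: Sequence[float],
    queue_lengths: Sequence[float],
    link_capacity_bps: float,
    max_queue: float,
    queue_weight: float = 1.0,
) -> float:
    """Reward high, fairly shared bandwidth and penalize queue buildup.

    All terms are normalized to [0, 1]; the weighting is an assumption.
    """
    utilization = sum(rates_bps) / (len(rates_bps) * link_capacity_bps)
    fairness = jain_fairness(rates_bps)
    queue_penalty = sum(queue_lengths) / (len(queue_lengths) * max_queue)
    return utilization * fairness - queue_weight * queue_penalty


if __name__ == "__main__":
    # Two hosts on 10 Mbit/s links, each asked to send at half capacity.
    rates = scale_actions([0.5, 0.5], link_capacity_bps=10e6)
    print(toy_reward(rates, queue_lengths=[0, 0],
                     link_capacity_bps=10e6, max_queue=100))
```

In this sketch, equal rates with empty queues score highest for a given utilization, while full queues can cancel the bandwidth term entirely. A real environment would derive the rates and queue lengths from switch statistics rather than take them as arguments.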