Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Alexandru V Staicu 1, Jacek R. Radzikowski 1 Kris Gaj 1, Nikitas Alexandridis 2, Tarek El-Ghazawi 2 1 George Mason University 2 George Washington University.

Similar presentations


Presentation on theme: "1 Alexandru V Staicu 1, Jacek R. Radzikowski 1 Kris Gaj 1, Nikitas Alexandridis 2, Tarek El-Ghazawi 2 1 George Mason University 2 George Washington University."— Presentation transcript:

1 1 Alexandru V Staicu 1, Jacek R. Radzikowski 1 Kris Gaj 1, Nikitas Alexandridis 2, Tarek El-Ghazawi 2 1 George Mason University 2 George Washington University Effective Use of Networked Reconfigurable Resources http://ece.gmu.edu/lucite

2 2 Problem: Reconfigurable resources expensive and underutilized Many of these resources available over the network It is desirable to leverage networked reconfigurable resources to help other users within the same organization

3 3 Tasks 1, 2, 3 Task 3 Task 1 Execution Host 1 Execution Host 2 Execution Host 3 Master Host Submission Host Task 2 Approach: Adapt and use a Job Management System

4 4 Approach: Select the most suitable existing Job Management System (JMS) Extend this JMS to recognize and utilize reconfigurable resources - identify and define functional requirements - rank known systems according to these requirements - identify which JMS is the easiest to extend - add new dynamic resources - configure scheduling to be based on these new resources

5 5 Tasks 1, 2, 3 Task 3 Task 1 Execution Host 1 Execution Host 2 Execution Host 3 Master Host Submission Host Task 2 Networked Reconfigurable Resource Management System FPGA boards

6 6 Myrinet SAN/LAN Switch WILDFORCE Dell WILDSTAR Dell SLAAC Dell WILDSTAR Dell WILDFORCE Dell Sparc 10 SLAAC Research Reference Platform Ethernet Intelligent Hub 100 Mbps Heterogeneous network with FPGA-based accelerators Dell HP Sparc 20DellGateway SLAAC WILDSTAR WILDFORCE SLAAC Ethernet Intelligent Hub 100 Mbps

7 7 Functional units of a typical Job Management System jobs & their requirements User Server Job Scheduler Resource Monitor available resources resource requirements scheduling policies Job Dispatcher resource allocation and job execution Resource Manager

8 8 Classification of Investigated Systems (1) Centralized JMS Distributed JMS w/o a Central Scheduler Distributed Operating System LSF CODINE PBS Condor RES Globus Legion NetSolve MOSIX

9 9 Parameter Study Scheduler Resource Monitor and Forecaster Distributed Computing Interface Compaq DCE AppLES NWS Classification of Investigated Systems (2)

10 10 Operating system, flexibility, user interface LSF Codine PBS CONDOR RES Distribution Source code OS Support User Interface Solaris Linux Tru64 NT GUI & CLI com pub pub/compubgov GUI & CLI GUI & CLI GUI & CLI

11 11 Scheduling and Resource Management LSF Codine PBS CONDOR RES Batch jobs Interactive jobs Parallel jobs Accounting

12 12 Efficiency and Utilization LSF Codine PBS CONDOR RES Stage-in and stage-out Timesharing Process migration Dynamic load balancing Scalability

13 13 Fault Tolerance and Security LSF Codine PBS CONDOR RES Checkpointing Daemon fault recovery Authentication Authorization

14 14 Documentation and Technical Support LSF Codine PBS CONDOR RES Documentation Technical support

15 15 JMS features supporting extension to reconfigurable hardware capability to define new dynamic resources strong support for stage-in and stage-out - configuration bitstreams - executable code - input/output data support for Windows NT and Linux

16 16 Ranking of Centralized Job Management Systems (1) Capability to define new dynamic resources: Excellent:LSF, PBS, CODINE More difficult:CONDOR, RES Stage-in and stage-out: Excellent:LSF, PBS Limited:CONDOR No:CODINE, RES

17 17 Ranking of Centralized Job Management Systems (2) Overall suitability to extend to reconfigurable hardware: 1.LSF 2.CODINE 3.PBS 4.CONDOR 5.RES without changing the JMS source code requires changes to the JMS source code

18 18 Submission host LIM Batch API Master host MLIM MBD Execution host SBD Child SBD LIM RES User job Extension of LSF to reconfigurable hardware (1) Operation of LSF LIM – Load Information Manager MLIM – Master LIM MBD – Master Batch Daemon SBD – Slave Batch Daemon RES – Remote Execution Server queue 1 2 3 4 5 6 7 8 9 10 11 12 13 Load information other hosts other hosts bsub app

19 19 Extension of LSF to reconfigurable hardware(2) Submission host LIM Batch API Master host MLIM MBD Execution host SBD Child SBD LIM RES User job ELIM – External Load Information Manager ACS API – Adaptive Computing Systems API queue 1 2 3 4 5 6 7 8 9 10 11 12 13 Load information other hosts other hosts bsub app ELIM ACS API 14 FPGA board Status of the board

20 20 Conclusions (1) 12 systems evaluated using 25 functional requirements + the suitability of extension to support reconfigurable hardware LSF, CODINE, PBS, and Condor ranked the highest in the functional requirements LSF, CODINE, and PBSPro found easy to extend without changes in their source codes LSF most suitable to support reconfigurable hardware

21 21 General software architecture of the extended system developed Experimental developments, verification and performance evaluation of the extended system in progress Conclusions (2)


Download ppt "1 Alexandru V Staicu 1, Jacek R. Radzikowski 1 Kris Gaj 1, Nikitas Alexandridis 2, Tarek El-Ghazawi 2 1 George Mason University 2 George Washington University."

Similar presentations


Ads by Google