So, Jung-ki Distributed Computing System LAB School of Computer Science and Engineering Seoul National University Implementation of Package Management.


1 Implementation of Package Management in a Cluster Environment
So, Jung-ki
Distributed Computing System LAB, School of Computer Science and Engineering, Seoul National University

2 Introduction (1/2)
So, Jung-ki (SNU DCS Lab)
Outline: Introduction, Related Work, Design, Evaluation, Conclusion
- Supercomputer
  - High-performance processors and high network bandwidth
  - Expensive, whereas a Beowulf system is cost-effective
- Motivation
  - Focus on cluster systems
- Cluster management system
  - Manual method / add-on method / integrated method
- Registry
  - Central repository of information about all aspects of the computer

3 Introduction (2/2)
- Challenge
  - The integrated method has low availability and reliability
  - Computation nodes can't be managed separately
  - When a failure occurs, the system can't be rejuvenated
- Goal (using the Registry)
  - Improve the availability and reliability of the integrated method
  - Let the administrator manage a cluster system easily
  - Restore a cluster system from a backup snapshot

4 Supercomputer
Domestic supercomputers: 14
- Cluster: 4
- MPP: 4
- Constellation: 6
※ SNU: 2 (51/413)
60.8%

5 Cluster Management System
- Manual approach
  - The system administrator brings up the entire system manually
- Add-on method
  - Bring up a frontend node, then add cluster packages
  - OSCAR / Warewulf / OpenMosix
- Integrated method
  - Cluster packages are installed and configured during the initial installation
  - Rocks / Scyld

6 Cluster Management System
- Software stack (diagram labels)
  - Linux Kernel
  - Linux Environment
  - HPC Device Drivers
  - Job Scheduling and Launching
  - Cluster software management
  - Cluster state management / monitoring
  - Message passing / communication layer
  - Parallel code / Grid / computer lab …
  - Side labels: OS (Linux), SGE, Application, HPC

7 Rocks Overview
- Identity
  - System to build and manage a Linux cluster
  - Free: open source project
- Goal
  - Make clusters easy
- Philosophy
  - Computation nodes are 100% automatically installed
    - Roll: set of packages
    - Graph / Kickstart
  - Runs on heterogeneous system architectures
  - Doesn't attempt to incrementally update software

8 Rocks system
- Architecture (diagram labels): Front-end node, node, Local Network, eth1, eth0, internet

9 What is the Registry?
- Central repository of information about all aspects of the computer
  - Hardware, OS, application, and user information
- Function
  - Retrieve system information
  - Update / add / delete software
  - Back up and restore the system
- Advantage
  - Easier for applications to access system information
  - Stores large amounts of structured data (system info)

10 Registry Design
Relational schema (table: columns)
- Nodes: ID (primary key), Name, Membership, CPUs, Rack, Rank, Comment
- Network: ID (primary key), Node, MAC, IP, Gateway, Name, Device, Module
- Package: ID (primary key), Node, Name, Version, Release, Install
- Aliases: ID (primary key), Node, Name
- Memberships: ID (primary key), Name, Appliance, Distribution
- Appliances: ID (primary key), Name, Graph, Node
- Distribution: ID (primary key), Name, Release, Lang
Legend (diagram labels): Original Relational Schema, Appended Relation, H/W information, S/W information
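
The Nodes and Package relations on this slide can be sketched as follows. This is only an illustration, not the presenter's code: SQLite (Python's `sqlite3` module) stands in for the MySQL registry, the column types are guesses, the meaning of the `Install` column is not given on the slide, and treating Package as the appended relation is an assumption based on the later slides. The helper name `packages_on_node` is hypothetical.

```python
import sqlite3

# Column names follow the slide; the types are assumptions.
SCHEMA = """
CREATE TABLE nodes (
    id INTEGER PRIMARY KEY,
    name TEXT,
    membership INTEGER,
    cpus INTEGER,
    rack INTEGER,
    rank INTEGER,
    comment TEXT
);
CREATE TABLE package (
    id INTEGER PRIMARY KEY,
    node INTEGER REFERENCES nodes(id),  -- owning node (assumed foreign key)
    name TEXT,
    version TEXT,
    "release" TEXT,                     -- quoted because RELEASE is an SQL keyword
    install TEXT                        -- meaning not stated on the slide
);
"""


def open_registry() -> sqlite3.Connection:
    """Create an in-memory registry holding the two sketched tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def packages_on_node(conn: sqlite3.Connection, node_name: str) -> list:
    """Return 'name-version-release' strings recorded for one node."""
    rows = conn.execute(
        'SELECT p.name, p.version, p."release" '
        "FROM package AS p JOIN nodes AS n ON p.node = n.id "
        "WHERE n.name = ? ORDER BY p.name",
        (node_name,),
    ).fetchall()
    return ["-".join(row) for row in rows]
```

Keeping the node reference in the Package relation is what lets the frontend answer "which packages does this compute node have" with a single join.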

11 Strategy of management
- Rocks setup
  - Minimum modification
    - Take advantage of the original Rocks system
    - Deploy the cluster system easily
  - Modify related source code
    - insert-ethers, kickstart.cgi, Kpp, Kgen, Rgen
- Running system
  - Apply package modifications
    - Package management program: add / update / delete packages
    - DB consistency management program

12 Collection Method
Diagram labels: Rgen, Registry variables, Package variables, Appended component

13 Modification Method
- Insert command
- Packages table: package name / version / release
- Instruction: add / update / delete
    add -c=compute-0-0 -i=amanda-2.4.5-2.i386
    add -c=all -i=all
    del -c=compute-0-0 -i=amanda-2.4.5-2.i386
    del -c=all -i=all
- Diagram labels: Packages table, add / delete / update, Compute Nodes
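
The command lines on this slide could be parsed along the following lines. This is a hedged sketch, not the actual package management program: the names `PackageCommand`, `PackageId`, `parse_command`, and `split_package` are hypothetical, only the `add` and `del` spellings shown on the slide are accepted (the slide does not show how an update is written), and the package string is assumed to follow the usual RPM `name-version-release.arch` layout.

```python
from typing import NamedTuple


class PackageCommand(NamedTuple):
    action: str   # "add" or "del", as spelled on the slide
    node: str     # value of -c: a node name such as "compute-0-0", or "all"
    package: str  # value of -i: a package string, or "all"


class PackageId(NamedTuple):
    name: str
    version: str
    release: str
    arch: str


def parse_command(line: str) -> PackageCommand:
    """Parse 'add -c=<node> -i=<package>' or 'del -c=<node> -i=<package>'."""
    # Slide exports often turn '-' into an en dash; normalize before parsing.
    tokens = line.replace("\u2013", "-").split()
    if not tokens or tokens[0] not in ("add", "del"):
        raise ValueError("expected 'add' or 'del': %r" % line)
    options = {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or key not in ("-c", "-i") or not value:
            raise ValueError("bad option: %r" % token)
        options[key] = value
    if set(options) != {"-c", "-i"}:
        raise ValueError("both -c and -i are required: %r" % line)
    return PackageCommand(tokens[0], options["-c"], options["-i"])


def split_package(package: str) -> PackageId:
    """Split 'name-version-release.arch' into the fields the table stores."""
    rest, _, arch = package.rpartition(".")
    name, version, release = rest.rsplit("-", 2)
    return PackageId(name, version, release, arch)
```

Splitting from the right matters because package names may themselves contain hyphens, while the version and release fields do not.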

14 Registry consistency
- Setup time
  - When the frontend node removes / updates a computation node
    - Dependency: change node table → change package table
    - Modify Kickstart.cgi / kgen
    - Apply cascading table changes
    ※ MySQL does not support the transaction property
- Running system
  - Package install / delete / update
    - Compute node rpm information = frontend node's registry DB
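
One way to picture the two consistency rules above is sketched below, under stated assumptions: SQLite in autocommit mode stands in for a MySQL registry without transactions, the two tables are reduced to the columns needed here, and the function names `remove_node` and `registry_drift` are hypothetical. It is not the modified Kickstart.cgi / kgen code.

```python
import sqlite3


def open_registry() -> sqlite3.Connection:
    """In-memory stand-in for the registry; autocommit, so no multi-statement transactions."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
    conn.execute("CREATE TABLE packages (id INTEGER PRIMARY KEY, node INTEGER, name TEXT)")
    return conn


def remove_node(conn: sqlite3.Connection, node_name: str) -> int:
    """Cascade by hand: delete the node's package rows first, then the node row.

    Without transactions, this order means an interruption between the two
    statements cannot leave package rows pointing at a missing node.
    Returns the number of package rows removed.
    """
    row = conn.execute("SELECT id FROM nodes WHERE name = ?", (node_name,)).fetchone()
    if row is None:
        return 0
    node_id = row[0]
    removed = conn.execute("DELETE FROM packages WHERE node = ?", (node_id,)).rowcount
    conn.execute("DELETE FROM nodes WHERE id = ?", (node_id,))
    return removed


def registry_drift(conn: sqlite3.Connection, node_name: str, installed: list) -> tuple:
    """Compare the registry with the rpm list reported by a compute node.

    Returns (recorded but not installed, installed but not recorded).
    """
    rows = conn.execute(
        "SELECT p.name FROM packages AS p JOIN nodes AS n ON p.node = n.id "
        "WHERE n.name = ?",
        (node_name,),
    ).fetchall()
    recorded = {row[0] for row in rows}
    reported = set(installed)
    return sorted(recorded - reported), sorted(reported - recorded)
```

Both lists coming back empty corresponds to the slide's condition that the compute node's rpm information equals the frontend node's registry DB.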

15 Experiment Setup
Topology (diagram labels): Public Ethernet, Frontend node, Compute nodes (14)

Node             Host               CPU       RAM      HDD
Frontend node    Rocks.snu.ac.kr    800 MHz   768 MB   40 GB
Compute nodes    Compute-0-(1~14)   850 MHz   1 GB     10 GB

Experiment Data
Name         Volume   Capacity
amanda       3        468 KB
HPC          53       117 MB
Rocks roll   479      1.5 GB

16 Original Rocks Evaluation
- Average service time: 18 min 14 sec
- Average transmit time: 11 min 28 sec
Chart labels: Network card, DHCP request

17 Amanda Packages Evaluation
- Average install time: 6.62 sec
- Average delete time: 5.57 sec

18 HPC Roll Evaluation
- Average install time: 3 min 38 sec
- Average delete time: 1 min 18 sec

19 Conclusion
- The Registry takes advantage of the cluster system
  - Improve availability and reliability using the Registry
  - The administrator can manage cluster systems easily
  - Restore cluster systems from backup snapshots

20 Q & A
Questions or comments? Thank you!

