
MSc. Miriel Martín Mesa, DIC, UCLV. The idea Installing a High Performance Cluster in the UCLV, using professional servers with open source operating.




1 MSc. Miriel Martín Mesa, DIC, UCLV

2 The idea
Installing a High Performance Cluster at the UCLV, using professional servers with an open source operating system.

3 Why?
• Current research requires more computational resources than a single computer can provide.
• Experiments need several runs, and the next run should not have to wait for the current one to finish.
• Backup power makes it possible to run jobs that take several days to finish.

4 Current Hardware
• 7 Dell R410 nodes, each with:
  • 2 Intel processors with 6 cores per processor
  • 12 GB RAM
  • 250 GB hard drive
  • 2 Gigabit NICs
• 10 Dell 1955 blade nodes, each with:
  • 2 Intel processors with 2 cores per processor
  • 12 GB RAM
  • 36 GB HDD

5 Current Hardware
17 nodes with:
• 28 processors
• 132 cores
• 204 GB RAM
• 1.3 TFLOPS (theoretical)
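The totals on this slide can be sanity-checked with a little arithmetic. The sketch below uses only figures from the slides (17 nodes with 12 GB each, 1.3 TFLOPS over 132 cores); the function names are hypothetical, and the per-core value is just an average of the stated aggregate, not a measured number.

```shell
#!/bin/bash
# Total RAM in GB when every node carries the same amount of memory.
total_ram_gb() {
    local nodes=$1 gb_per_node=$2
    echo $(( nodes * gb_per_node ))
}

# Average theoretical peak per core in GFLOPS, given the aggregate in TFLOPS.
gflops_per_core() {
    local tflops=$1 cores=$2
    awk -v t="$tflops" -v c="$cores" 'BEGIN { printf "%.2f\n", t * 1000 / c }'
}

total_ram_gb 17 12        # RAM across all 17 nodes
gflops_per_core 1.3 132   # average theoretical peak per core
```

The first call reproduces the 204 GB quoted on the slide; the second gives a rough per-core figure to compare against the processors' data sheets.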

6 Cluster design Beowulf design

7 Basic Software
• OS: Debian 7
• Resource manager: TORQUE (PBS)
• Scheduler: Maui
• Central user authentication: NIS server

8 Cluster installation (master and nodes)
• PXE (Preboot eXecution Environment)
• DHCP
• TFTP
• HTTP server
• DNS server (BIND)
• Preseed script (answers to the installer's questions)
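The pieces listed above meet in the PXE boot menu: the node gets its address over DHCP, fetches the installer kernel over TFTP, and the kernel command line tells the installer where to download the preseed file over HTTP. A minimal sketch of generating such a menu entry follows. The amd64 architecture, the `/preseed.cfg` path, and the function name are assumptions for illustration; the kernel and initrd paths follow the layout of the standard Debian netboot archive.

```shell
#!/bin/bash
# Print a pxelinux.cfg/default entry that boots the Debian installer and
# points it at a preseed file served over HTTP.
# Arguments: $1 = HTTP server host, $2 = path of the preseed file (hypothetical).
pxe_entry() {
    local host=$1 preseed_path=$2
    cat <<EOF
DEFAULT install
LABEL install
  KERNEL debian-installer/amd64/linux
  APPEND initrd=debian-installer/amd64/initrd.gz auto=true priority=critical url=http://${host}${preseed_path}
EOF
}

pxe_entry master.cluster.uclv.edu.cu /preseed.cfg
```

The output would typically be written to `pxelinux.cfg/default` under the TFTP root, so that every node booting from the network starts an unattended installation.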

9 Preseed code
d-i mirror/protocol string http
d-i mirror/country string manual
d-i mirror/http/hostname string master.cluster.uclv.edu.cu
d-i mirror/http/directory string /debian
d-i mirror/http/proxy string
d-i mirror/suite string wheezy
d-i partman-auto/disk string /dev/sda
d-i partman-auto/method string regular
d-i partman-auto/choose_recipe select atomic
d-i partman-auto/purge_regular_from_device boolean true
d-i partman-regular/confirm boolean true
d-i partman/confirm_write_new_label boolean true
d-i partman/choose_partition select Finish partitioning and write changes to disk
d-i partman/confirm boolean true
# If the system has free space you can choose to only partition that space.
tasksel tasksel/first multiselect minimal
d-i pkgsel/include string openssh-server puppet
d-i preseed/late_command string sed -i 's/no/yes/g' /target/etc/default/puppet
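A preseed file with a malformed line can stall an unattended installation on a question, so it helps to check the file before serving it. The sketch below is a rough structural check only: it flags lines that lack the three mandatory fields (owner, question name, type), while an empty value such as the proxy setting is accepted. The function name and the sample file are hypothetical. On a Debian system, `debconf-set-selections -c FILE` performs a proper format check.

```shell
#!/bin/bash
# Print the line numbers of preseed lines that have fewer than three fields
# (owner, question name, question type). Comments and blank lines are skipped.
# Returns non-zero if any such line is found.
check_preseed() {
    awk '
        /^[ \t]*(#|$)/ { next }          # skip comments and blank lines
        NF < 3 { print NR; bad = 1 }     # owner, question or type is missing
        END { exit bad + 0 }
    ' "$1"
}

# Demo on a throwaway file: the last line is deliberately incomplete.
demo=$(mktemp)
cat > "$demo" <<'EOF'
# mirror settings
d-i mirror/suite string wheezy
d-i mirror/http/proxy string
d-i broken
EOF
check_preseed "$demo" || echo "incomplete lines found"
rm -f "$demo"
```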

10 Cluster Management
Puppet: package management and configuration of the server and the nodes.

11 Cluster Management
Module: commons
# Install the packages that every node needs.
class packages-commons {
  $packages_commons = ["csh", "flex", "byacc", "vim", "tcsh", "lsb", "lsb-core"]
  package { $packages_commons:
    ensure => installed,
  }
}

12 Cluster Management
Module: MPICH
class mpich ($mpich_version) {
  file { mpich:
    path   => "${mpich_path}",
    owner  => root,
    mode   => 775,
    ensure => directory,
  }
  exec { "mpich_configure":
    cwd     => "${mpich_source}-${mpich_version}/",
    command => "nice -19 sh configure ${mpich_prefix} ${mpich_with_torque}",
    onlyif  => "test ! -e ${mpich_source}-${mpich_version}/config.log",
  }
  …
}

13 Cluster Management
cron { update_ntpdate:
  command => "/usr/sbin/ntpdate ",
  user    => root,
  minute  => 0,
  hour    => '*/1',
}
service { cron:
  ensure => running,
  enable => true,
}

14 Monitoring tools
Ganglia: provides a real-time monitoring and execution environment.

15 Monitoring tools
Icinga: monitors any network resource, sends notifications on errors, generates performance data for reporting, and reports the status of resources.

16 System Access

17 • Secure shell (SSH): # ssh user@hpc.uclv.cu

18 System Access

19 • Web page

20 System Access • Web page

21 System Access • Web page

22 System Access • Web page

23 Cluster applications

24

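25 Example
#!/bin/bash
# Job name
#PBS -N example1
# 2 nodes with 4 processors each, for at most 1 hour 20 minutes
#PBS -l nodes=2:ppn=4
#PBS -l walltime=01:20:00
# Target queue
#PBS -q default
# Send mail when the job aborts or ends
#PBS -m ae
#PBS -M user_email@uclv.edu.cu
# Run from the directory the job was submitted from
cd "$PBS_O_WORKDIR"
module load mpich/3.0.4
mpirun ./application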
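The resource request in the example, `nodes=2:ppn=4`, determines how many MPI processes the job can start. A small sketch of that arithmetic follows; the function name is hypothetical and it assumes the simple `nodes=N:ppn=P` form used on the slide.

```shell
#!/bin/bash
# Compute the total number of cores requested by a Torque resource string
# of the form "nodes=N:ppn=P".
total_cores() {
    local spec=$1 nodes ppn
    nodes=${spec#nodes=}     # drop the leading "nodes="
    nodes=${nodes%%:*}       # keep what precedes the first ":"
    ppn=${spec##*ppn=}       # keep what follows "ppn="
    echo $(( nodes * ppn ))
}

total_cores "nodes=2:ppn=4"
```

Inside a running job, Torque lists the allocated slots in the file named by `$PBS_NODEFILE`, so the same number is commonly obtained with `wc -l < "$PBS_NODEFILE"` and passed to `mpirun -np`. The script itself is submitted with `qsub` and can be followed with `qstat`.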

26 Cluster queues
Queue   Nodes  Access       Cores  Memory (GB)  Jobs/user  Max time (hours)  Priority  Default
small   1      Blade nodes  4      8            4          12                10
medium  1-3    Any          24     12           3          36                20
long    1-4    Any          24     15           2          168               30
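Queue limits like these are normally set in TORQUE through `qmgr`. The sketch below only prints the commands and does not run them. It assumes the limits read as 12, 36, and 168 hours with 4, 3, and 2 jobs per user and priorities 10, 20, and 30, and it assumes that the jobs-per-user column corresponds to the `max_user_run` queue attribute; the function name is hypothetical and the slides do not show the actual configuration used.

```shell
#!/bin/bash
# Print the qmgr commands that would define one execution queue.
# Arguments: name, max walltime in hours, max running jobs per user, priority.
queue_commands() {
    local name=$1 hours=$2 jobs=$3 prio=$4
    printf 'qmgr -c "create queue %s queue_type=execution"\n' "$name"
    printf 'qmgr -c "set queue %s resources_max.walltime=%d:00:00"\n' "$name" "$hours"
    printf 'qmgr -c "set queue %s max_user_run=%d"\n' "$name" "$jobs"
    printf 'qmgr -c "set queue %s priority=%d"\n' "$name" "$prio"
    printf 'qmgr -c "set queue %s enabled=true"\n' "$name"
    printf 'qmgr -c "set queue %s started=true"\n' "$name"
}

queue_commands small 12 4 10
queue_commands medium 36 3 20
queue_commands long 168 2 30
```

Printing the commands first makes it easy to review them before running them on the master node.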

27 To do
• Implement a user quota system
• Add external storage
• Continue installing the applications requested by users
• We always need to do more

28 Thank you Muchas Gracias

