
1 By Curtis Michels

2 Where did the name come from? “I knew him of yore in his youthful days; his aged father was Ecgtheow named, to whom, at home, gave Hrethel the Geat his only daughter. Their offspring bold fares hither to seek the steadfast friend. And seamen, too, have said me this, — who carried my gifts to the Geatish court, thither for thanks, — he has thirty men's heft of grasp in the gripe of his hand, the bold-in-battle.” — Beowulf (Hrothgar, the Danish king, describing Beowulf) What is a Beowulf Cluster

3 What is a cluster? - A group of computers connected to each other that perform parallel computing together. - One type of cluster is a commodity cluster - A cluster built from commercially available networks and components. - There are four classes of commodity cluster: - Workstation cluster - A cluster made up of workstations connected by a LAN. - Cluster farm - A cluster that uses idle workstations within a local network to perform tasks. - Supercluster - A cluster of clusters within a local network. - Beowulf-class cluster - Described on the next slide. Computer Cluster

4 What is a Beowulf Cluster? - A class of commodity cluster. - A cluster made up of commodity off-the-shelf (COTS) computers or components. - A Beowulf cluster is made up of one host and multiple clients (nodes) connected by an Ethernet network. - Has become the most widely used parallel computer structure. - Most applications are written in either Fortran or C. Beowulf Cluster A picture of a Linux-based Beowulf cluster. - picture was retrieved from http://www.copyright-free-images.com

5 1.Scalability - All that is needed to increase the capabilities of the cluster is to add another computer. - The cluster can be built in phases. 2.Convergence architecture - Over time, the Beowulf cluster has become a de facto standard. - It is the most likely choice of architecture for parallel computing. - Previously, the HPC industry constantly changed parallel architecture types, requiring software to be reworked across generations of parallel computing. Advantages

6 3.Performance/price - Has a better performance-to-cost ratio than a single supercomputer. - All components are COTS with no custom parts, which makes it cheaper than a single supercomputer. - A single supercomputer can cost millions of dollars. - A Beowulf cluster with similar capability would cost in the thousands of dollars. - With Linux, the performance-to-cost advantage is even higher. 4.Flexibility of configuration and upgrade - The cluster can easily be configured for any application needing computing power. - There is a wide variety of components to choose from when building a Beowulf cluster. - This flexibility makes it easier to upgrade the cluster when new technology comes out. Advantages (continued)

7 5.Able to keep up with changes in technology - Because nodes can easily be added to the system, the cluster is able to keep up with changes in technology. - New technology can be integrated into the cluster as soon as it is available to consumers. 6.More reliable, with better fault tolerance - A Beowulf cluster is able to degrade gracefully in performance. - As components fail, the number of available processors decreases, but the system continues to run. - The reason for this is the redundant hardware in a Beowulf cluster. Advantages (continued)

8 7.Users have a higher level of control - Installation of the system is easy. - Administrators and users are able to control the structure, operation, and evolution of the system. 8.Maintenance is easier - No special training is needed to maintain the cluster, since it is made up of off-the-shelf components. - Only basic computer maintenance knowledge is needed. - No special tools are needed. Advantages (continued)

9 9.Development cost and time - There is no need to design components, since they are all off-the-shelf parts. - The components are cheaper than custom-made ones. - The time to implement is shorter, since all the system designer has to do is pick the parts that give the system the desired capabilities. - The system is quick to set up and configure: once the software is installed on one node, it can easily be copied over to the other nodes. Advantages (continued)

10 1.Programming is more difficult - Each processor has its own memory. - Programs have to be written to take advantage of parallel processing. - Bad code can make the system extremely slow, eliminating its advantages. 2.Fragmentation of program data - The program's data is split among the different systems. - Once the calculation or simulation is finished, the data has to be recombined. - When the data gets big enough, recombining it takes the host too long. 3.Limited by the speed of the network - Communication between nodes is only as fast as the network connecting them. Disadvantages
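The fragmentation point above can be sketched in MPI C. This is a hypothetical illustration, not part of the original slides: each node fills its own fragment of an array, and the host (rank 0) recombines all fragments over the network with MPI_Gather. The names CHUNK and `local` are invented for the example.

```c
/* Hypothetical sketch: data fragmented across nodes, recombined on the host. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

#define CHUNK 4  /* elements computed per node (illustrative size) */

int main(int argc, char *argv[])
{
    int rank, size;
    double local[CHUNK], *all = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each node works only on its own fragment of the problem. */
    for (int i = 0; i < CHUNK; i++)
        local[i] = rank * CHUNK + i;

    /* Only the host needs room for the recombined data. */
    if (rank == 0)
        all = malloc(size * CHUNK * sizeof(double));

    /* Recombine: every fragment travels back over the network to rank 0.
       This step is why large data sets can make the host the bottleneck. */
    MPI_Gather(local, CHUNK, MPI_DOUBLE,
               all, CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) {
        printf("Host reassembled %d values\n", size * CHUNK);
        free(all);
    }
    MPI_Finalize();
    return 0;
}
```

The recombination cost grows with both the data size and the number of nodes, which is exactly the limitation points 2 and 3 describe.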

11 - In the 1950s, under IBM contracts with the United States Air Force, the SAGE system was developed and used by NORAD. - This was the first computer cluster. - It was based upon MIT's Whirlwind computer architecture. History of Cluster Computers

12 1.Development of the microprocessor with VLSI (very large scale integration) technology - This made it easier for multiple computers to be integrated with each other. 2.Ethernet was developed - The first widely used local area network technology. - Created a standard for modestly priced interconnect devices and a data transport layer. (Sterling, p. 6) 3.Multitasking support was added to Unix. 1970's

13 1.160 interconnected Apollo workstations were configured as a cluster to perform computational tasks for the NSA. 2.Software task-management tools for running a workstation farm were developed - UW-Madison developed the Condor software package. 3.PVM (Parallel Virtual Machine) was developed - A library of linkable functions that enables message passing between computers on a network, to exchange data and coordinate efforts. 1980's

14 1.1992 – NASA Lewis Research Center - Used a small cluster of IBM workstations to simulate the steady-state behavior of jet aircraft engines. 2.1993 – NOW (Network of Workstations) project at UC Berkeley - The first of several clusters they developed in 1993. - One of those clusters was put on the Top500 list of the world's most powerful computers. 1990's

15 3.First Beowulf cluster - Developed at the NASA Goddard Space Flight Center by Thomas Sterling and Donald Becker. - It used an early release of Linux and PVM. - It was made up of 16 computers - Intel 100 MHz 80486-based - Connected by dual 10 Mbps Ethernet LANs - The necessary LAN drivers for Linux had to be developed. - Low-level cluster management tools were developed. - This project demonstrated the performance-to-cost advantage that a Beowulf cluster has for real-world scientific applications. 4.The first Message Passing Interface (MPI) standard was adopted by the parallel computing community. - This created a uniform set of message-passing semantics and syntax. 1994

16 5.DOE Los Alamos National Laboratory and the California Institute of Technology, with the NASA Jet Propulsion Lab - Demonstrated sustained performance of over 1 GFLOPS with a Beowulf system costing $50,000. - Awarded the Gordon Bell Prize for the price/performance of this accomplishment. 1996

17 6.Compaq created a Beowulf cluster capable of 30 TFLOPS. - Received awards from both the DOE and the NSF. - Fortran or C was used to write programs, using linkable libraries for message passing. 2000's

18 - Requirements - Software - Setup and configuration of the cluster How to Set Up a Beowulf Cluster

19 1.The host computer should be faster and have more memory than the clients. 2.Works best if all computers use the same processor architecture (AMD, Intel, etc.). 3.A version of Linux or Windows. 4.Each computer needs an Ethernet card. 5.A network switch or router able to handle all the computers being used in the cluster. Requirements

20 1.Condor (Red Hat or Debian) - Supports distributed job streaming. - Management emphasizes capacity or throughput computing. - Schedules independent jobs on cluster nodes to handle large user workloads. - Many scheduling policy options. 2.PBS (Linux and Windows) - Widely used system for distributing parallel user jobs across Beowulf cluster resources. - Administrative tools for professional systems supervision. Software

21 3.Maui (Linux) - Advanced scheduler. - Has policies and mechanisms for handling many user requests and resource states. - Sits on top of other low-level management software. 4.Cluster Controller (Windows) - Used at the Cornell Theory Center. - Designed for Windows; it takes full advantage of the Windows environment. Software (continued)

22 5.PVFS (Parallel Virtual File System) (Unix and Linux) - Manages the secondary storage of a Beowulf cluster. - Provides parallel file management shared among the distributed nodes of the system. - Delivers faster response and much higher effective disk bandwidth than NFS (Network File System). 6.Windows Server 2008 R2 HPC, Windows Server 2003 (Compute Cluster Edition) - Able to automatically install nodes. 7.OpenMPI (Linux) - An open-source implementation of MPI-2. - Can schedule processes. Software (continued)

23 Assumptions made for this setup - Two or more computers connected together via a TCP/IP network (I used two virtual machines). - Each machine uses the same processor architecture. - All machines have a common login name with the same password. For this example: mpiuser. - Each machine shares the same /home folder or has the important folders synchronized. (For this example, /home is shared.) - The computers will have Debian 6.0.0 installed on them. - OpenMPI will be the software used for the Beowulf cluster. Setup

24 1.Install Linux on all machines. (Using Debian for this example.) - The slave computers can have a minimal install. - The master can have a full install. 2.Make sure that all the computers can communicate with each other. Setup (continued)

25 Host 1.Install OpenMPI The packages needed are: openmpi-bin, openmpi-common, libopenmpi1, and libopenmpi-dev (not needed for clients) $ apt-get install openmpi-bin openmpi-common libopenmpi1 libopenmpi-dev Setup (continued)

26 Host 2.Set up SSH (used to control the clients) The package openssh-client is needed: $ apt-get install openssh-client Log in as mpiuser and create an SSH public/private key pair, protected with a password, using the file /home/mpiuser/.ssh/id_dsa: $ ssh-keygen -t dsa Make each computer know that the user mpiuser is authorized to log in: $ cp /home/mpiuser/.ssh/id_dsa.pub /home/mpiuser/.ssh/authorized_keys Fix file permissions: $ chmod 700 /home/mpiuser/.ssh $ chmod 600 /home/mpiuser/.ssh/authorized_keys To test that the connection works: $ ssh 192.168.137.215 Setup (continued)

27 Host 3.Configure Open MPI To tell Open MPI which machines to run programs on, a file to store the info has to be created. I created the file /home/mpiuser/.mpi_hostfile. These are the contents of .mpi_hostfile:
# The hostfile for Open MPI
# The master node; 'slots=1' is used because it is a single-processor machine.
localhost slots=1
# The following slave nodes are single-processor machines:
192.168.137.215
Setup (continued)

28 Host Machine 4.Make SSH not ask for a password Open MPI uses SSH to connect to the slaves, so a password should not have to be entered each time. $ eval `ssh-agent` $ ssh-add ~/.ssh/id_dsa (tells the ssh-agent the password for the SSH key) $ ssh 192.168.137.215 (to test) Setup (continued)

29 Clients 1.Install OpenMPI The same as the host, except don't install the package libopenmpi-dev. 2.Install the SSH server $ apt-get install openssh-server Setup (continued)

30 A simple program that just sends random numbers to each node. This sample was written to use the setup above. Sample Program for a Beowulf Cluster

31 Sample code (testprogram.c)

32 Sample code (continued)

33 This code example was taken from http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/ Sample code (continued)
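The code itself was embedded as slide images and is not in this transcript. Below is a minimal sketch reconstructed to match the behavior described on slide 30 and the output shown on slide 35; it is an approximation under those assumptions, not the original program from the linked article.

```c
/* testprogram.c - reconstructed sketch, not the original source.
   Rank 0 (the host) sends one random character to every other task;
   each client prints what it received. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size, tag = 1;
    char msg;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        /* Host: send one random character to each client node. */
        srand((unsigned)time(NULL));
        for (int dest = 1; dest < size; dest++) {
            msg = (char)('a' + rand() % 26);
            MPI_Send(&msg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
            printf("Task %d: Sent message %d to task %d with tag %d\n",
                   rank, msg, dest, tag);
        }
    } else {
        /* Client: wait for the character from the host. */
        printf("Waiting\n");
        MPI_Recv(&msg, 1, MPI_CHAR, 0, tag, MPI_COMM_WORLD, &status);
        printf("Task %d: Received 1 char(s) (%d) from task %d with tag %d\n",
               rank, msg, status.MPI_SOURCE, tag);
    }

    MPI_Finalize();
    return 0;
}
```

Compiled with mpicc and launched with mpirun as shown on the next slide; each MPI process ("task") runs this same program and branches on its rank.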

34 1.The master node has to be running before the slaves are started. 2.Running a program on this Beowulf cluster isn't hard. Using testprogram.c: First compile it: $ mpicc testprogram.c To run the program with 20 processes on the local machine: $ mpirun -np 20 ./a.out To run the program over the Beowulf cluster (assuming .mpi_hostfile is in the current directory): $ mpirun -np 5 --hostfile .mpi_hostfile ./a.out I used: $ mpirun -np 2 --hostfile .mpi_hostfile ./a.out Running the Program

35 Waiting
Task 1: Received 1 char(s) (103) from task 0 with tag 1
Task 0: Sent message 103 to task 1 with tag 1
Output of sample program

36 Los Alamos uses a computer simulation to determine how the aging stockpile of nuclear weapons would behave, since testing is banned. (2009) - The simulation revealed how individual atoms and molecules interact with each other. - It had to be at a much higher resolution than what was used in the past. - It was used to visualize a number of components within a data set: - Scalar fields, vector fields, cell-centered variables, vertex-centered variables, and polygon information. - Cost: $35,000 - Had one host and 15 clients. - Was compared to an SGI Onyx 2 supercomputer. Simulation of Nuclear Explosion

37 1.Host hardware - OS: Red Hat Linux 6.2 - 733-MHz processor - Nvidia GeForce 2 - 2 GB of RAM - 55-GB disk 2.Client hardware - OS: Red Hat Linux 6.2 - 733-MHz processor - Nvidia GeForce 2 - 1 GB of RAM - 40-GB disk 3.Network hardware - 100-Mbit Ethernet cards - HP ProCurve 4000 switch Beowulf Cluster Used

38 Singapore's first satellite, X-Sat, used a Beowulf cluster (1995). The satellite was equipped with: - A 10 m resolution multispectral (color) camera to obtain pictures of Singapore and the region around it. - A radio link for an Australian distributed sensor network. - A PPU (parallel processing unit). - First use of Linux in space. Beowulf Cluster in Space

39 The components in the PPU are: - 20 × SA-1110 processors - Peak performance: 4,000 MIPS (million instructions per second) - 1,280 MB of memory - Size: 3,125 cm³ - Power consumption: 25 W - Cost: $3,500 - Processing cost: 0.88 $/MIPS - Processing volume: 0.78 cm³/MIPS - Processing power: 6.25 mW/MIPS - OS: Linux PPU Components

40 1.Beowulf clusters are made up of COTS computers and components. 2.Programs for Beowulf clusters are written in either C or Fortran. 3.Beowulf clusters are cheaper than traditional single-system supercomputers. 4.The advantages over other supercomputers are scalability, convergence architecture, performance/price, flexibility of configuration and upgrade, a higher level of user control, easier maintenance, the ability to keep up with changes in technology, more reliability with better fault tolerance, and lower development cost and time. 5.Beowulf clusters can be used for many applications. Summary

41 Beowulf.org. n.d. Chiu, Steve. "Current Issues in High Performance Computing I/O Architectures and Systems." Journal of Supercomputing (2008): 105-107. Cluster Resources :: Products - Maui Cluster Scheduler. 2011. 23 October 2011. Condor High Throughput Computing. 12 October 2011. 23 October 2011. Heckendorn, Robert B. "Building a Beowulf: Leveraging Research and Department Needs for Student Enrichment via Project Based Learning." Computer Science Education December 2002: 255-273. "Keener Eyes for Beowulf." Mechanical Engineering June 2001: 78-79. "Linux Clusters Serve Low End." InfoWorld 26.45 (2004): 21. References

42 McLoughlin, Ian, Timo Bretschneider and Bharath Ramesh. "First Beowulf Cluster in Space." Linux Journal September 2005: 34-38. Open MPI: Open Source High Performance Computing. 3 October 2011. 23 October 2011. Parallel Virtual File System, Version 2. n.d. 23 October 2011. Roach, Ronald. "Ball State Creates Supercomputer from Old Desktop Computers." Black Issues in Higher Education 19.5 (2002): 29. Scyld Clusterware. n.d. 23 October 2011. References (continued)

43 Sterling, Thomas. Beowulf Cluster Computing with Linux. Massachusetts Institute of Technology, 2002. —. Beowulf Cluster Computing with Windows. Massachusetts Institute of Technology, 2002. Woodman, Lawrence. Setting up a Beowulf Cluster Using Open MPI on Linux. 2 December 2009. 15 October 2011. References (continued)

