VIRTUALISATION OF HADOOP CLUSTERS Dr G Sudha Sadasivam Assistant Professor Department of CSE PSGCT.

1 VIRTUALISATION OF HADOOP CLUSTERS Dr G Sudha Sadasivam Assistant Professor Department of CSE PSGCT

2 Introduction
A physical machine can host a number of smaller virtual machines (VMs), each running a separate operating system instance.
Challenges
– Partitioning of a machine
– Concurrent execution of multiple operating systems
– Isolation of virtual machines from one another
– Support for heterogeneous applications
– Low performance overhead
Xen is a virtual machine monitor for x86 that supports execution of multiple guest operating systems. Its layers are the hypervisor, the kernel and user-space applications.

3 Objective
– Automation of creation and deletion of a virtual cluster for hosting Hadoop using Xen
– A large physical cluster can be simulated on a few physical machines
Steps
1. The user supplies the configuration by editing configuration files.
2. The tool generates the user-specified number of VMs running Hadoop.
3. Users can manage the Hadoop file system.
4. Users can submit jobs for each physical machine.

4 Need for virtualisation
– Ability to recover quickly from software problems by saving a copy of the guest image.
– High availability by relocating guests when a server machine is inoperable.
– Dynamic load balancing by migrating guests between server machines.
– Consolidation of many services on one physical machine, each administered independently in its own VM.
– Use of the abundant computational power of the physical machine.
– Minimisation of cost.
– Switching between applications on different operating systems using hypervisors.

5 HADOOP CLUSTER CONFIGURATION
– Host node is configured as master (NameNode, NN) and also acts as slave (DataNode, DN)
– Guest node is configured as slave (DN)

6 Master is the host OS, which acts as JobTracker/NameNode. Slave is the guest OS, which acts as TaskTracker/DataNode.
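
The master/slave layout on slides 5 and 6 can be sketched as a small generator for the Hadoop configuration files. This is a minimal sketch, assuming a Hadoop 0.20/1.x-style layout (`conf/slaves`, `fs.default.name`, `mapred.job.tracker`). The host names `host-os`, `guest1`, `guest2` and the ports 9000/9001 are illustrative assumptions, not values from the presentation.

```python
from xml.etree import ElementTree as ET


def site_xml(properties):
    """Render a Hadoop *-site.xml document from a dict of properties."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


def cluster_files(master, guests):
    """Return {relative path: content} for a host-as-master layout."""
    return {
        # NameNode address, read by every node (port is an assumption).
        "conf/core-site.xml": site_xml(
            {"fs.default.name": f"hdfs://{master}:9000"}
        ),
        # JobTracker address (port is an assumption).
        "conf/mapred-site.xml": site_xml(
            {"mapred.job.tracker": f"{master}:9001"}
        ),
        # The host also acts as a slave, so it is listed with the guests.
        "conf/slaves": "\n".join([master] + list(guests)) + "\n",
    }


# Hypothetical host names, for illustration only.
files = cluster_files("host-os", ["guest1", "guest2"])
print(files["conf/slaves"])
```

Listing the master in `conf/slaves` is what makes the host run a DataNode/TaskTracker in addition to the NameNode/JobTracker, matching the slide's "master also acts as slave" arrangement.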

7 Steps in implementing
1. Installation of Xen kernel
2. Creation of Guest OS
3. Configuration of Guest OS
4. Installation of Java Development Kit
5. Extraction and configuration of Hadoop cluster
6. Creating OS image for new guest machines
7. Creation and removal of other virtual machines (copying the OS images)

8 Automated Creation of a Hadoop Virtual cluster
An XML file holds the configuration details of each new VM.
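
The slide does not show the XML file itself, so the following is only a sketch of how such a file might drive VM creation. The `<cluster>`/`<vm>` schema, the attribute names and the image paths are hypothetical. The generated text uses standard Xen guest configuration keys (`name`, `memory`, `vcpus`, `disk`, `vif`), but it is a fragment: a real guest would also need boot settings such as a kernel or bootloader line.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: one <vm> element per guest to create.
SAMPLE = """
<cluster>
  <vm name="guest1" memory="512" vcpus="1" image="/var/xen/guest1.img"/>
  <vm name="guest2" memory="512" vcpus="1" image="/var/xen/guest2.img"/>
</cluster>
"""


def parse_vms(xml_text):
    """Return one attribute dict per <vm> element."""
    root = ET.fromstring(xml_text)
    return [dict(vm.attrib) for vm in root.findall("vm")]


def xen_config(vm):
    """Render a partial Xen guest configuration for one VM description."""
    name = vm["name"]
    memory = int(vm["memory"])  # megabytes
    vcpus = int(vm["vcpus"])
    image = vm["image"]
    lines = [
        f'name = "{name}"',
        f"memory = {memory}",
        f"vcpus = {vcpus}",
        # File-backed disk image exposed to the guest as xvda, writable.
        f"disk = ['file:{image},xvda,w']",
        # One network interface with default settings.
        "vif = ['']",
    ]
    return "\n".join(lines) + "\n"


vms = parse_vms(SAMPLE)
for vm in vms:
    print(xen_config(vm))
```

Each rendered configuration could then be written to a file and passed to the Xen toolstack to start the guest from a copy of the prepared OS image.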

9 Automated Shut down of Hadoop Virtual cluster
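
The shutdown procedure is not detailed in the transcript. One plausible ordering, sketched below, stops the Hadoop daemons first and then shuts down each guest. The sketch assumes a Hadoop 0.20/1.x install with `bin/stop-all.sh` and the Xen `xm` toolstack; the install path and guest names are hypothetical. Commands are only printed unless `dry_run=False` is passed.

```python
import subprocess


def shutdown_plan(guests, hadoop_home="/usr/local/hadoop"):
    """Build the ordered command list: stop Hadoop, then each guest."""
    plan = [[f"{hadoop_home}/bin/stop-all.sh"]]
    plan += [["xm", "shutdown", guest] for guest in guests]
    return plan


def run(plan, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in plan:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical guest names, for illustration only.
plan = shutdown_plan(["guest1", "guest2"])
run(plan)
```

Stopping Hadoop before the guests gives the DataNodes and TaskTrackers a chance to exit cleanly instead of being cut off when their VM disappears.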

10 Advantages of automated virtualization in Hadoop
1. Effective isolation of the datanode from the load caused by other processes on the machine makes the datanode more responsive and reliable.
2. Running multiple virtual machines on each physical machine makes the scheduling units finer-grained, so multiple task trackers can be scheduled on the same machine and the overall utilization of the whole cluster improves.
3. A snapshot of a virtual cluster makes it possible to re-activate the same cluster in the future and resume work from the snapshot (rollback).

11 Enhancements
1. Providing a graphical console for monitoring and managing the virtual cluster.
2. Creation and migration of virtual machines for the purpose of load balancing.
3. Enabling snapshots of the virtual machine for checkpointing.
4. Providing an intelligent monitoring system that detects the failure of a virtual machine in the cluster and restarts that virtual machine, increasing reliability.
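
The monitoring system proposed in enhancement 4 could take roughly the following shape. This is a sketch only: the liveness check and the restart action are passed in as functions because the presentation does not specify how either would be done, and the guest names and in-memory state below are made up for the demonstration.

```python
def monitor_once(guests, is_alive, restart):
    """Check each guest once and restart those that fail the check.

    Returns the names of the guests that were restarted.
    """
    restarted = []
    for guest in guests:
        if not is_alive(guest):
            restart(guest)
            restarted.append(guest)
    return restarted


# Demonstration with fake state instead of real VMs.
state = {"guest1": True, "guest2": False, "guest3": True}


def fake_is_alive(guest):
    return state[guest]


def fake_restart(guest):
    state[guest] = True


restarted = monitor_once(sorted(state), fake_is_alive, fake_restart)
print(restarted)
```

A real monitor would call `monitor_once` periodically, with a liveness check such as a ping or a query to the hypervisor and a restart action that recreates the guest from its saved image.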

12 Performance of Physical vs Virtual clusters

13 Master as a Physical Node
7 nodes
– Data nodes: 6 virtual nodes
– Name node: 1 physical node

14 Master as a Virtual Node
7 nodes
– Data nodes: 1 physical node + 5 virtual nodes
– Name node: 1 virtual node

15 Performance with varying number of Virtual nodes

