A Survey on Parallel Computing in Heterogeneous Grid Environments Takeshi Sekiya Chikayama-Taura Laboratory M1 Nov 24, 2006.

1 A Survey on Parallel Computing in Heterogeneous Grid Environments Takeshi Sekiya Chikayama-Taura Laboratory M1 Nov 24, 2006

2 Parallel Computing in Grid Environments
Opportunities to use multi-cluster environments are increasing
–But schemes designed for stand-alone clusters cause problems in grid-like usage
New mechanisms are needed
–Handling heterogeneity
–Firewall/NAT traversal
–Adaptation to dynamic environments
–Monitoring
[Figure labels: heterogeneous hardware and software; failure; firewall/NAT; maintenance; dynamic change of CPU/network load; complex configuration; difficult to know what’s happening]

3 Heterogeneous Environments
Heterogeneous machines
–Binaries are different
–Complex configuration is required when hardware/software differs
Heterogeneous networks
–Synchronization overheads in parallel applications when latency/bandwidth differ
–Firewalls/NATs

4 Firewall/NAT
Firewalls/NATs hinder bi-directional connectivity
Bi-directional TCP/IP connectivity needs to be provided to support a wide spectrum of applications
[Figure label: firewall or NAT]

5 Solutions to the Internet Asymmetric-Connectivity Problem
MPI Environment on Grid with Virtual Machines [Tachibana et al. 2006]
–Xen for VMs and VPN for the virtual network
–Low-cost VM migration
ViNe [Tsugawa et al. 2006]
–A host called the Virtual Router
–Overlay-network based
WOW [Ganguly et al. 2006]

6 Outline
Introduction
WOW
–IPOP: IP over P2P
–Routing IP on the P2P Overlay
–Connection Setup
–Joining an Existing Network
–NAT Traversal
–Experiments
Summary

7 Objective and Approach
The system is architected to …
–Adapt to heterogeneous environments
  Present a cluster-like environment to end users
–Scale to a large number of nodes
–Facilitate the addition of nodes through self-organization of the virtual network
  Less manual configuration
Approach with virtualization
–Virtual machines
  Homogeneous software
–Self-organizing overlay network
  All-to-all connectivity

8 Virtual Machine
A homogeneous software environment
Offers opportunities for load balancing and fault tolerance
Users can use pre-configured systems
–Linux distribution
–Libraries and software

9 Virtual Network
P2P overlay network: IPOP (IP over P2P)
[Figure labels: NAT; firewall; Physical Infrastructure; P2P Network; Virtual Grid Cluster]

10 IPOP [Ganguly et al. 2006]
Characteristics
–A virtual IP address space
–Self-organizing
Architecture
–IP tunneling over P2P
–A virtualized network interface (tap) captures virtual IP packets
–Brunet P2P overlay network

11 Capturing Virtual IP Packets
The tap appears to applications as a network interface
IPOP translates virtual IP addresses to Brunet P2P network addresses
[Figure labels: application, tap, and IPOP on each host; Ethernet frame (IP packet), Brunet message (IP packet), Ethernet frame (IP packet); tunneling]
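The translation and tunneling step can be sketched as follows. This is only an illustration, not IPOP's code: it assumes the overlay address is the SHA-1 digest of the virtual IP string, and `BrunetMessage`, `encapsulate`, and `decapsulate` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass


def overlay_address(virtual_ip: str) -> int:
    """Map a virtual IP to a 160-bit overlay address (assumed: SHA-1 of the IP string)."""
    return int.from_bytes(hashlib.sha1(virtual_ip.encode("ascii")).digest(), "big")


@dataclass
class BrunetMessage:  # hypothetical container, not the real Brunet wire format
    dest: int         # 160-bit overlay address of the destination
    payload: bytes    # the unmodified virtual IP packet


def encapsulate(dest_virtual_ip: str, ip_packet: bytes) -> BrunetMessage:
    """Wrap an IP packet captured from the tap in an overlay message."""
    return BrunetMessage(overlay_address(dest_virtual_ip), ip_packet)


def decapsulate(msg: BrunetMessage) -> bytes:
    """Recover the IP packet at the destination before writing it to the tap."""
    return msg.payload


packet = b"\x45\x00 example IP packet bytes"   # placeholder bytes, not a valid packet
msg = encapsulate("172.16.0.2", packet)        # made-up virtual IP
```

The payload travels untouched; only the destination is rewritten into overlay terms, which is why applications see an ordinary network interface.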

12 Brunet P2P
Ring-structured overlay
Organized connections
–Near: with neighbors
–Far: across the ring
160-bit SHA-1 hash addresses
Greedy routing
Each node has a constant number of connections
–O(log^2(n)) overlay hops
[Figure: ring of nodes n1 to n12 with a multi-hop path from n1 to n7]
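Greedy routing on a ring can be sketched as below. The 12-node layout, the single far connection between n1 and n5, and the function names are assumptions chosen to mirror the figure, not Brunet's actual topology or API.

```python
RING = 2**160  # size of the 160-bit address space


def ring_distance(a, b):
    """Shortest distance between two addresses, going either way around the ring."""
    d = abs(a - b) % RING
    return min(d, RING - d)


def greedy_route(source, dest, connections):
    """Forward to the connected node closest to dest until no neighbor is closer."""
    path = [source]
    current = source
    while current != dest:
        best = min(connections[current], key=lambda n: ring_distance(n, dest))
        if ring_distance(best, dest) >= ring_distance(current, dest):
            break  # no neighbor is closer, so stop here
        current = best
        path.append(current)
    return path


# Toy ring: n1..n12 evenly spaced, each with "near" links to both neighbors.
step = RING // 12
addr = {i: (i - 1) * step for i in range(1, 13)}
connections = {
    addr[i]: [addr[(i - 2) % 12 + 1], addr[i % 12 + 1]] for i in range(1, 13)
}
# One "far" link across the ring (hypothetical).
connections[addr[1]].append(addr[5])
connections[addr[5]].append(addr[1])

path = greedy_route(addr[1], addr[7], connections)
hops = [i for a in path for i, v in addr.items() if v == a]
print(hops)  # [1, 5, 6, 7]
```

Without the far link the same route would take six hops around the ring, which is the motivation for keeping a few long-range connections per node.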

13 Connection Setup: Connection Protocol
Node A wishes to connect to node B
1. A sends a CTM (Connect To Me) request to B over the P2P network
   The CTM request contains A’s URI
2. When B receives the CTM request, B sends a CTM reply to A
   The CTM reply contains B’s URI
URI (Uniform Resource Identifier), e.g. brunet.tcp:192.0.0.1:1024
[Figure: CTM request from A to B, CTM reply from B to A]

14 Connection Setup: Linking Protocol
3. B sends a link request message to A over the physical network
4. When A receives the link request, A simply responds with a link reply message
5. Finally, a new connection is established between A and B
[Figure: link request and link reply between A and B, then a direct connection A to B]
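Steps 1 to 5 of connection setup can be traced with a small in-memory model. The `Node` class, the message dictionaries, and the second URI are hypothetical; real messages travel over the overlay (steps 1 and 2) and the physical network (steps 3 and 4) rather than being appended to lists.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    uri: str                                      # physical endpoint of this node
    connections: set = field(default_factory=set)
    inbox: list = field(default_factory=list)     # messages received, in order


def connect(a: Node, b: Node) -> None:
    """Walk through steps 1-5 for node A connecting to node B."""
    # 1. A -> B over the overlay: CTM request carrying A's URI
    ctm_request = {"type": "ctm_request", "uri": a.uri}
    b.inbox.append(ctm_request)
    # 2. B -> A over the overlay: CTM reply carrying B's URI
    a.inbox.append({"type": "ctm_reply", "uri": b.uri})
    # 3. B -> A over the physical network, using the URI learned in step 1
    a.inbox.append({"type": "link_request", "sent_to": ctm_request["uri"]})
    # 4. A -> B over the physical network: link reply
    b.inbox.append({"type": "link_reply", "sent_to": b.uri})
    # 5. Both sides record the new direct connection
    a.connections.add(b.name)
    b.connections.add(a.name)


a = Node("A", "brunet.tcp:192.0.0.1:1024")
b = Node("B", "brunet.tcp:192.0.0.2:1024")   # made-up URI for the example
connect(a, b)
```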

15 Linking Race Condition (1)
A race condition may occur because the linking protocol is initiated by both peers
[Figure: each peer sends a link request and receives a link reply; both attempts succeed]

16 Linking Race Condition (2)
When a node receives a link request, it checks that there is no existing connection or connection attempt
When a node receives a link error, it restarts the protocol after a random back-off
[Figure: link request answered with link error ("Active linking on?"); after a random back-off, link request answered with link reply]
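The check-and-back-off rule can be simulated as below. This is a simplified model under stated assumptions: the `Peer` class, the 0 to 1 second back-off range, and the rule that the peer with the shorter wait retries first are illustrative, not Brunet's implementation.

```python
import random


class Peer:
    def __init__(self, name):
        self.name = name
        self.connected = set()    # peers we hold a connection with
        self.attempting = set()   # peers we are actively linking to

    def on_link_request(self, sender):
        """Accept only if there is no connection or active attempt involving sender."""
        if sender in self.connected or sender in self.attempting:
            return "link_error"
        self.connected.add(sender)
        return "link_reply"


def simultaneous_link(a, b, rng):
    """Both peers start linking at once; returns (first replies, reply to the retry)."""
    a.attempting.add(b.name)
    b.attempting.add(a.name)
    # The requests cross: each is judged while the other attempt is still active.
    replies = (b.on_link_request(a.name), a.on_link_request(b.name))
    a.attempting.clear()
    b.attempting.clear()
    # Both saw link_error, so each waits a random back-off; the shorter wait retries first.
    waits = {a.name: rng.uniform(0, 1), b.name: rng.uniform(0, 1)}
    first, second = sorted((a, b), key=lambda p: waits[p.name])
    first.attempting.add(second.name)
    retry = second.on_link_request(first.name)
    first.attempting.discard(second.name)
    if retry == "link_reply":
        first.connected.add(second.name)
    # When the second timer fires, that peer already holds the connection and does not retry.
    return replies, retry


a, b = Peer("A"), Peer("B")
replies, retry = simultaneous_link(a, b, random.Random(0))
```

Exactly one connection results, instead of the two that the unchecked protocol would create.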

17 Joining an Existing Network: Leaf Connection
A new node N creates a leaf connection to an initial node I by using the linking protocol directly
I acts as a forwarding agent for N
[Figure labels: new node N; leaf connection; initial node I; correct position of new node]

18 Joining an Existing Network: Send CTM Request
N sends a CTM request addressed to itself over the P2P network
–The CTM request contains N’s URI
The CTM request is received by N’s right and left neighbors, since N is not yet in the ring
[Figure labels: CTM request; right neighbor R; left neighbor L; new node N; initial node I]
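Why a request addressed to N's own address ends up at L and R can be seen with a sorted list of ring addresses. The small integer addresses and the convention that "left" means the next lower address are assumptions made for readability.

```python
import bisect


def ring_neighbors(ring, new_addr):
    """Return (left, right) neighbors of new_addr in a sorted list of ring addresses."""
    i = bisect.bisect_left(ring, new_addr)
    left = ring[(i - 1) % len(ring)]   # wraps to the largest address when i == 0
    right = ring[i % len(ring)]        # wraps to the smallest address when i == len(ring)
    return left, right


def join(ring, new_addr):
    """Insert new_addr at its correct position and return the neighbors it links to."""
    left, right = ring_neighbors(ring, new_addr)
    bisect.insort(ring, new_addr)
    return left, right


ring = [10, 40, 70, 90]          # toy addresses; real ones are 160-bit
print(join(ring, 55))            # (40, 70)
print(join(ring, 5))             # (90, 10), wrapping around the ring
print(ring)                      # [5, 10, 40, 55, 70, 90]
```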

19 Joining an Existing Network: Send CTM Reply
L and R each send a CTM reply containing their URI to I
I forwards the CTM replies to N
[Figure labels: CTM reply; right neighbor R; left neighbor L; new node N; initial node I]

20 Joining an Existing Network: Linking Protocol
The linking protocol starts
L and R send link request messages to N over the physical network
[Figure labels: link request; right neighbor R; left neighbor L; new node N; initial node I]

21 Joining an Existing Network: Complete Joining
N forms connections with its neighbors and is now in the ring
N then acquires “far” connections
[Figure labels: right neighbor R; left neighbor L; new node N; initial node I]

22 Adaptive Shortcut Creation
High latencies were observed in experiments due to multi-hop overlay routing
Shortcut creation
–Count IPOP packets sent to each other node
–When the number of packets within an interval exceeds a threshold, initiate connection setup
–Because maintaining connections incurs overhead, drop connections that are no longer in use
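The counting rule can be sketched as follows. The `ShortcutManager` name, the threshold value, and the reading of "no longer in use" as "no packets during a whole interval" are assumptions; the slides do not give the actual parameters.

```python
from collections import Counter


class ShortcutManager:
    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()   # packets per destination in the current interval
        self.shortcuts = set()    # destinations with a direct connection

    def on_packet(self, dest):
        """Called for every IPOP packet sent to dest."""
        self.counts[dest] += 1

    def end_interval(self):
        """Close the interval; return (shortcuts to create, shortcuts to drop)."""
        busy = {d for d, n in self.counts.items() if n > self.threshold}
        created = busy - self.shortcuts
        dropped = {d for d in self.shortcuts if self.counts[d] == 0}
        self.shortcuts = (self.shortcuts | created) - dropped
        self.counts.clear()
        return created, dropped


manager = ShortcutManager(threshold=3)
for _ in range(5):
    manager.on_packet("B")
manager.on_packet("C")
first = manager.end_interval()    # B exceeded the threshold, C did not
second = manager.end_interval()   # idle interval: the shortcut to B is dropped
```

In a real node, "created" would trigger the connection setup protocol toward that destination rather than just updating a set.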

23 NAT
Host a (IP: 192.168.0.2, private network) | NAT (IP: 133.11.23.100) | Host b (IP: 157.82.13.244, global network)
NAT table: 192.168.0.2:5000 ⇔ 133.11.23.100:6000
Outbound, before NAT: Src: 192.168.0.2:5000 Dst: 157.82.13.244:80
Outbound, after NAT: Src: 133.11.23.100:6000 Dst: 157.82.13.244:80
Inbound, before NAT: Src: 157.82.13.244:80 Dst: 133.11.23.100:6000
Inbound, after NAT: Src: 157.82.13.244:80 Dst: 192.168.0.2:5000
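The address rewriting on this slide can be reproduced with a small table-driven model. The `Nat` class and its sequential port allocation are assumptions for illustration; the addresses and ports are the ones in the NAT table above.

```python
class Nat:
    def __init__(self, public_ip, first_port):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (private ip, port) -> (public ip, port)
        self.back = {}   # (public ip, port) -> (private ip, port)

    def outbound(self, src, dst):
        """Rewrite the source of a packet leaving the private network."""
        if src not in self.out:
            mapped = (self.public_ip, self.next_port)
            self.next_port += 1
            self.out[src] = mapped
            self.back[mapped] = src
        return self.out[src], dst

    def inbound(self, src, dst):
        """Rewrite the destination of a reply, or drop it if no mapping exists."""
        if dst not in self.back:
            return None  # unsolicited packet: no table entry to translate with
        return src, self.back[dst]


nat = Nat("133.11.23.100", 6000)
host_a = ("192.168.0.2", 5000)
host_b = ("157.82.13.244", 80)

sent = nat.outbound(host_a, host_b)
reply = nat.inbound(host_b, sent[0])
unsolicited = nat.inbound(host_b, ("133.11.23.100", 7000))
```

The last call returns `None`: a host outside cannot reach a private host unless an outbound packet has already created the table entry, which is the asymmetry the next slide works around.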

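24 NAT Traversal: UDP Hole Punching
Host A (IP: A) | NAT (IP: N) | NAT (IP: M) | Host B (IP: B)
NAT table at N: A:a ⇔ N:n
NAT table at M: M:m ⇔ B:b
Packet from A, before NAT N: Src: A:a Dst: M:m
Packet from A, after NAT N: Src: N:n Dst: M:m
Packet from B, before NAT M: Src: B:b Dst: N:n
Packet from B, after NAT M: Src: M:m Dst: N:n
Packet from B, delivered to A: Src: M:m Dst: A:a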
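The exchange can be simulated with a toy NAT that admits inbound packets only from endpoints it has already sent to. This is a sketch under simplifying assumptions: each NAT has one fixed public endpoint, both hosts already know the other's public endpoint (for example from the URIs exchanged in CTM messages), and `HolePunchNat` and `send` are hypothetical names.

```python
class HolePunchNat:
    """Toy NAT that admits inbound packets only from endpoints already contacted."""

    def __init__(self, public_ip, public_port):
        self.public = (public_ip, public_port)
        self.private = None
        self.contacted = set()   # remote endpoints this NAT has sent packets to

    def outbound(self, src, dst):
        self.private = src        # table entry: src <-> self.public
        self.contacted.add(dst)   # opens the "hole" for packets coming from dst
        return self.public, dst

    def inbound(self, src, dst):
        if dst != self.public or src not in self.contacted:
            return None           # dropped: no hole for this sender
        return src, self.private


def send(sender_nat, receiver_nat, src, dst):
    """Send one UDP packet through both NATs; return what the receiver sees, or None."""
    wire_src, wire_dst = sender_nat.outbound(src, dst)
    return receiver_nat.inbound(wire_src, wire_dst)


nat_n, nat_m = HolePunchNat("N", "n"), HolePunchNat("M", "m")
host_a, host_b = ("A", "a"), ("B", "b")

first = send(nat_n, nat_m, host_a, ("M", "m"))    # dropped at M, but opens a hole at N
second = send(nat_m, nat_n, host_b, ("N", "n"))   # passes N, and opens a hole at M
third = send(nat_n, nat_m, host_a, ("M", "m"))    # now passes M as well
```

After the first packet in each direction, traffic flows both ways even though neither host accepts unsolicited inbound packets.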

25 Experimental Setup
Hosts: 2.4 GHz Xeon, Linux 2.4.20, VMware GSX
Host: 1.3 GHz P-III, Linux 2.4.21, VMPlayer
Host: 1.7 GHz P4, Win XP SP2, VMPlayer
Hosts: 2.0 GHz Xeon, Linux 2.4.20, VMware GSX
34 compute nodes, 118 P2P router nodes on PlanetLab

26 Experiment 1: Joining and Shortcut Connections
Node A: IPOP node
Node B: newly joining node
–A and B are in different network domains with NATs
–B sends ICMP packets to A at 1-second intervals
Within period 1 (about 3 seconds), B establishes a route to other nodes
Within period 2 (about 28 seconds), B establishes a shortcut connection to A

27 Experiment 2: PVM Parallel Application FastDNAml (1)
Parallelized with a PVM-based master-workers model
FastDNAml has a high computation-to-communication ratio
Dynamic task assignment tolerates performance heterogeneity among computing nodes
[Figure labels: master; workers; task pool]
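Dynamic task assignment from a shared pool can be sketched with threads standing in for PVM workers. This is not FastDNAml or PVM code; the function name and the sleep that stands in for computation are assumptions.

```python
import queue
import threading
import time


def run_master_workers(tasks, worker_delays):
    """Workers pull tasks from a shared pool; returns {worker index: tasks completed}."""
    pool = queue.Queue()
    for task in tasks:
        pool.put(task)
    done = {i: [] for i in range(len(worker_delays))}

    def worker(i, delay):
        while True:
            try:
                task = pool.get_nowait()
            except queue.Empty:
                return            # pool drained: this worker is finished
            time.sleep(delay)     # stand-in for the real computation
            done[i].append(task)  # each worker writes only to its own list

    threads = [
        threading.Thread(target=worker, args=(i, d))
        for i, d in enumerate(worker_delays)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return done


# One fast and one slow worker; the fast one typically ends up with more tasks.
result = run_master_workers(range(20), [0.001, 0.01])
```

Because each worker asks for the next task only when it is free, a slow node delays only the tasks it holds rather than the whole run, which is what makes the model tolerant of heterogeneous nodes.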

28 Experiment 2: PVM Parallel Application FastDNAml (2)
Execution with shortcuts enabled is 24% faster than with shortcuts disabled
The parallel speedup is 13.6x
–23x is reported in previous work on a homogeneous cluster

                       Sequential execution   Parallel execution (30 nodes)
                       (node #2)              Shortcuts disabled   Shortcuts enabled
Execution time (sec)   22272                  2033                 1642
Parallel speedup       n/a                    11.0                 13.6
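The reported figures can be checked with a few lines of arithmetic, reading the execution times as 22272, 2033, and 1642 seconds. The 24% figure is reproduced here as the ratio of the two parallel times minus one, which is an assumption about how it was computed.

```python
sequential = 22272      # seconds, sequential execution
no_shortcuts = 2033     # seconds, 30 nodes, shortcuts disabled
with_shortcuts = 1642   # seconds, 30 nodes, shortcuts enabled

speedup_disabled = sequential / no_shortcuts
speedup_enabled = sequential / with_shortcuts
gain = no_shortcuts / with_shortcuts - 1   # how much faster shortcuts make the run

print(f"{speedup_disabled:.1f}x {speedup_enabled:.1f}x {gain:.0%}")
# prints: 11.0x 13.6x 24%
```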

29 Summary
Introduced WOW
–Scalable, fault-resilient, low-management infrastructure
Future work
–Research on easy-to-use middleware for heterogeneous, adaptive Grid environments

