
1 Advanced Computing and Information Systems laboratory
Self-configuring Condor Virtual Machine Appliances for Ad-Hoc Grids
Renato Figueiredo, Arijit Ganguly, David Wolinsky, J. Rhett Aultman, P. Oscar Boykin
ACIS Lab, University of Florida
http://wow.acis.ufl.edu

2 Outline
- Motivations
- Background
- Condor Virtual Appliance: features
- On-going and future work

3 Motivations
- Goal: plug-and-play deployment of Condor grids
  - High-throughput computing; LAN and WAN
  - Collaboration: file systems, messaging, ...
- Synergistic approach: VM + virtual network + Condor
  - “WOWs” are wide-area NOWs (networks of workstations), where:
    - Nodes are virtual machines
    - Network is virtual: IP-over-P2P (IPOP) overlay
  - VMs provide: sandboxing; software packaging; decoupling
  - Virtual network provides: virtual private LAN over WAN; self-configuring and capable of firewall/NAT traversal
  - Condor provides: match-making, reliable scheduling, … unmodified

4 Condor WOWs - outlook
1. Prime base VM image with O/S, Condor, virtual network; publish (Web/Torrent)
2. Download image; boot using free VM monitor (e.g. VMware Player or Server)
3. Create virtual IP namespace for pool: MyGrid: 10.0.0.0/255.0.0.0; prime custom image with virtual namespace, desired tools; bootstrap manager(s)
4. Download base and custom VM images; boot up
5. VMs obtain IP addresses from MyGrid virtual DHCP server, join virtual IP network, discover available manager(s), and join pool
5b. VMs obtain IP addresses from OtherGrid virtual DHCP server, join virtual IP network, discover available manager(s), and join pool
[Diagram labels, shown for both MyGrid and OtherGrid: 10.0.0.1, 10.0.0.2, 10.0.0.3, 10.0.0.4]
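
A minimal Python sketch of the address assignment in steps 3 and 5, assuming a lowest-free-address policy. The `allocate` helper, the lease table, and the node names are hypothetical; the slides describe the IPOP DHCP service as decentralized and do not specify how it picks addresses.

```python
import ipaddress

# Virtual namespace from the slide: MyGrid uses 10.0.0.0/255.0.0.0.
MYGRID = ipaddress.ip_network("10.0.0.0/255.0.0.0")


def allocate(network, leases, node_id):
    """Return the node's lease, handing out the lowest free host address.

    Hypothetical stand-in for the virtual DHCP server; a single in-memory
    dict replaces the decentralized lease store.
    """
    if node_id in leases:
        return leases[node_id]
    taken = set(leases.values())
    for host in network.hosts():  # lazily iterates 10.0.0.1, 10.0.0.2, ...
        if host not in taken:
            leases[node_id] = host
            return host
    raise RuntimeError("address space exhausted")


leases = {}
manager = allocate(MYGRID, leases, "manager")
workers = [allocate(MYGRID, leases, f"worker-{i}") for i in range(3)]
```

Each pool (MyGrid, OtherGrid) would keep its own lease table, which is why both diagrams can reuse the same 10.0.0.x addresses.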

5 Condor WOW snapshot
[Figure labels: Zurich, Gainesville, Long Beach]

6 Roadmap
- The basics:
  - 1.1 VMs and appliances
  - 1.2 IPOP: IP-over-P2P virtual network
  - 1.3 Grid Appliance and Condor
- The details:
  - 2.1 Customization, updates
  - 2.2 User interface
  - 2.3 Security
  - 2.4 Performance
- Usage experience

7 1.1: VMs and appliances
- System VMs:
  - VMware, KVM, Xen
  - Homogeneous system
  - Sandboxing
  - Co-exist with unmodified hosts
- Virtual appliances:
  - Hardware/software configuration packaged in easy-to-deploy VM images
  - Only dependencies: ISA (x86), VMM

8 1.2: IPOP virtual networking
- Key technique: IP-over-P2P tunneling
  - Interconnect VM appliances
  - WAN VMs perceive a virtual LAN environment
- IPOP is self-configuring
  - Avoid administrative overhead of VPNs
  - NAT and firewall traversal
- IPOP is scalable and robust
  - P2P routing deals with node joins and leaves
- IPOP networks are isolated
  - One or more private IP address spaces
  - Decentralized DHCP serves addresses for each space

9 1.2: IPOP virtual networking
[Diagram labels: App; IPOP Node A, eth0 (128.227.136.244); IPOP Node B, eth0 (139.70.24.100); tap0 (10.0.0.2); tap0 (10.0.0.3); P2P Overlay]
- Structured overlay network topology
  - Bootstrap 1-hop IP tunnels on demand
  - Discover NAT mappings; decentralized hole punching
- VM keeps IPOP address even if it migrates on WAN
[Ganguly et al, IPDPS 2006, HPDC 2006]
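
A minimal Python sketch of how a structured overlay can forward a packet toward the node responsible for a virtual IP. Everything here is an assumption for illustration: the 160-bit SHA-1 identifier space, the greedy ring routing rule, and the function names are common choices in structured P2P systems, not details taken from the slides.

```python
import hashlib

RING_BITS = 160  # assumed identifier size (SHA-1 digest length)


def overlay_id(virtual_ip):
    """Map a virtual IP string to a point on the identifier ring (assumed scheme)."""
    return int.from_bytes(hashlib.sha1(virtual_ip.encode()).digest(), "big")


def ring_distance(a, b):
    """Shortest distance between two identifiers on the ring."""
    d = abs(a - b)
    return min(d, (1 << RING_BITS) - d)


def next_hop(self_id, neighbors, dest_id):
    """Greedy routing: forward to the neighbor closest to the destination.

    Returns None when no neighbor is closer than this node, meaning this
    node is the closest known point and should deliver the packet locally.
    """
    best = min(neighbors, key=lambda n: ring_distance(n, dest_id))
    if ring_distance(best, dest_id) < ring_distance(self_id, dest_id):
        return best
    return None
```

Under this model, a 1-hop tunnel created on demand simply adds the destination to `neighbors`, so later packets reach it in a single forwarding step.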

10 1.3: Grid appliance and Condor
- Base: Debian Linux; Condor; IPOP
  - Works on x86 Linux/Windows/MacOS; VMware, KVM/QEMU
  - 157 MB zipped
- Uses NAT and host-only NICs
  - No need to get IP address on host network
- Managed negotiator/collector VMs
- Easy to deploy schedd/startd VMs
- Flocking is easy: virtual network is a LAN

11 2.1: Customization and updates
- VM image: virtual disks
  - Portable medium for data
  - Growable after distribution
- Disks are logically stacked
  - Leverage UnionFS file system
  - Three stacks:
    - Base: O/S, Condor, IPOP
    - Module: site-specific configuration (e.g. nanoHUB)
    - Home: user persistent data
- Major updates: replace base/module
- Minor updates: automatic, apt-based
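
A minimal Python sketch of the stacked-disk lookup, using `collections.ChainMap` as a stand-in for a union file system. The file paths and layer contents are hypothetical; only the three layer names (Base, Module, Home) come from the slide.

```python
from collections import ChainMap

# Each layer maps a path to the layer that provides it (hypothetical contents).
base = {
    "/usr/bin/condor_status": "base",
    "/etc/condor/condor_config": "base",
}
module = {"/etc/condor/condor_config": "module"}  # site-specific override
home = {"/home/user/job.submit": "home"}

# ChainMap searches left to right, so the topmost layer is listed first.
stack = ChainMap(home, module, base)


def resolve(path):
    """Return which layer serves `path`, as a union mount would."""
    return stack[path]


# Writes land in the first mapping only, so user data stays in the Home layer.
stack["/home/user/results.txt"] = "home"
```

In this model a major update corresponds to swapping the `base` or `module` mapping for a new one while `home` is left untouched.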

12 2.2: User interface (Windows host)
- VM console: X11 GUI
- Host-mounted loop-back Samba folder
- Loopback SSH

13 2.2: User interface (Mac host)
- VM console: X11 GUI
- Host-mounted loop-back Samba folder
- Loopback SSH

14 2.2: User interface (Linux host)
- VM console: X11 GUI
- Host-mounted loop-back Samba folder
- Loopback SSH

15 2.3: Security
- Appliance firewall
  - eth0: block all outgoing Internet packets
    - Except DHCP, DNS, IPOP’s UDP port
    - Only traffic within WOW allowed
  - eth1 (host-only): allow ssh, Samba
- IPsec
  - X.509 host certificates
  - Authentication and end-to-end encryption
  - VM joins WOW only with signed certificate bound to its virtual IP
  - Private net/netmask: ~10 lines of IPsec configuration for an entire class A network!
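
A minimal Python sketch of the firewall policy as a default-deny decision function. It is a simplified reading of the slide, not the appliance's actual rule set: the IPOP port number is a placeholder (the slides do not give it), the `tap0` name for the virtual interface is borrowed from the IPOP diagram, only new flows are modeled (no reply tracking), and the DNS, DHCP, SSH, and Samba port numbers are the standard well-known ones.

```python
from dataclasses import dataclass

# Hypothetical value: the slides do not state which UDP port IPOP uses.
IPOP_UDP_PORT = 15000


@dataclass(frozen=True)
class Packet:
    iface: str      # "eth0", "eth1" (host-only) or "tap0" (virtual network)
    direction: str  # "in" or "out"
    proto: str      # "udp" or "tcp"
    dport: int      # destination port


def allowed(pkt: Packet) -> bool:
    """Return True if the packet starts a flow the policy permits."""
    if pkt.iface == "tap0":
        # Traffic within the WOW virtual network is allowed.
        return True
    if pkt.iface == "eth0" and pkt.direction == "out":
        # Only DNS (53), DHCP (67) and the IPOP tunnel may leave on eth0.
        return pkt.proto == "udp" and pkt.dport in {53, 67, IPOP_UDP_PORT}
    if pkt.iface == "eth1" and pkt.direction == "in":
        # Host-only interface: SSH (22) and Samba (139, 445) from the host.
        return pkt.proto == "tcp" and pkt.dport in {22, 139, 445}
    return False
```

For example, `allowed(Packet("eth0", "out", "tcp", 80))` is rejected, so a job inside the appliance cannot open arbitrary Internet connections.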

16 2.4: Performance
- User-level C# IPOP implementation (UDP):
  - Link bandwidth: 25-30 Mbit/s
  - Latency overhead: ~4 ms
- Connection times:
  - ~5-10 s to join P2P ring and obtain DHCP address
  - ~10 s to create shortcuts, UDP hole-punching
- SimpleScalar 3.0 (cycle-accurate CPU simulator)

17 Experiences
- Bootstrap WOW with VMs at UF and partners
  - Currently ~300 VMs, IPOP overlay routers (PlanetLab)
- Exercised with tens of thousands of Condor jobs from real users
  - nanoHUB: 3-week-long, 9,000-job batch (BioMoca) submitted via a Condor-G gateway
  - P2Psim, CH3D, SimpleScalar
- Pursuing interactions with users and the Condor community for broader dissemination

18 Time scales and expertise
- Development of baseline VM image:
  - VM/Condor/IPOP expertise; weeks/months
- Development of custom module:
  - Domain-specific expertise; hours/days/weeks
- Deployment of VM appliance:
  - No previous experience with VMs or Condor
  - 15-30 minutes to download and install VMM
  - 15-30 minutes to download and unzip appliance
  - 15-30 minutes to boot appliance, automatically connect to a Condor pool, run condor_status and a demo condor_submit job

19 On-going and future work
- Enhancing self-organization at the Condor level:
  - Structured P2P for manager publish/discovery
    - Distributed hash table (DHT); primary and flocking
    - Condor integration via configuration files, DHT scripts
  - Unstructured P2P for matchmaking
    - Publish/replicate/cache classads on P2P overlay
    - Support for arbitrary queries
    - Condor integration: proxies for collector/negotiator
- Decentralized storage, cooperative caching
  - Virtual file systems (NFS proxies)
  - Distribution of updates, read-only code repositories
  - Caching and COW for diskless, net-boot appliances
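
A minimal Python sketch of DHT-based manager publish/discovery feeding a Condor configuration file. The `ToyDHT` class, the `<pool>:manager` and `<pool>:flock` key names, and the example addresses are hypothetical; `CONDOR_HOST` and `FLOCK_TO` are standard Condor configuration macros. The slides list this as future work and do not describe an implementation.

```python
import hashlib


class ToyDHT:
    """In-memory stand-in for a distributed hash table (hypothetical)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(name):
        # Hash the name, as a real DHT would to pick the responsible node.
        return hashlib.sha1(name.encode()).hexdigest()

    def put(self, name, value):
        self._store.setdefault(self._key(name), []).append(value)

    def get(self, name):
        return list(self._store.get(self._key(name), []))


def condor_config(dht, pool):
    """Build config lines from the managers published for `pool`."""
    managers = dht.get(f"{pool}:manager")
    if not managers:
        raise LookupError(f"no manager published for {pool}")
    lines = [f"CONDOR_HOST = {managers[0]}"]  # primary manager
    flock = dht.get(f"{pool}:flock")          # optional flocking targets
    if flock:
        lines.append("FLOCK_TO = " + ", ".join(flock))
    return "\n".join(lines)


dht = ToyDHT()
dht.put("MyGrid:manager", "10.0.0.1")
dht.put("MyGrid:flock", "10.0.0.5")
dht.put("MyGrid:flock", "10.0.0.9")
config = condor_config(dht, "MyGrid")
```

A newly booted appliance would run a lookup like this at startup and write the result into its local Condor configuration before starting the daemons.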

20 Acknowledgments
- National Science Foundation: NMI, CI-TEAM
- SURA SCOOP (Coastal Ocean Observing and Prediction)
- http://wow.acis.ufl.edu
  - Publications, Brunet/IPOP code (GPL’ed C#), Condor Grid appliance

21 Questions?

22 Self-organizing NAT traversal, shortcuts
- A starts exchanging IP packets with B
- Traffic inspection triggers request to create shortcut: Connect-to-me (CTM)
- “A” tells “B” its known address(es):
  - “A” had learned NATed public IP/port when it joined overlay
[Diagram: Node A sends CTM request to Node B: connect to me at my NAT IP:port]

23 Self-organizing NAT traversal, shortcuts
- “B” sends CTM reply, routed through overlay
  - “B” tells “A” its address(es)
- “B” initiates linking protocol by attempting to connect to “A” directly
[Diagram: Node B sends CTM reply through overlay: send NAT (IP:port) B; Node B sends link request to NAT endpoint (IP:port) A]

24 Self-organizing NAT traversal, shortcuts
- B’s linking protocol message to A pokes hole on B’s NAT
- A’s linking protocol message to B pokes hole on A’s NAT
- CTM protocol establishes direct shortcut
[Diagram: Node A gets CTM reply and initiates linking with Node B]
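
A minimal Python sketch of the hole-punching sequence in the three slides above. The `Nat` class is a toy model that admits inbound traffic only from endpoints the inside host has already sent to; the endpoint addresses are documentation-range placeholders, and the CTM request/reply exchange over the overlay is assumed to have already told each side the other's public endpoint.

```python
class Nat:
    """Toy NAT: inbound packets pass only from endpoints already contacted."""

    def __init__(self, public_endpoint):
        self.public_endpoint = public_endpoint
        self.holes = set()

    def send(self, remote_endpoint):
        # An outbound packet opens a hole for replies from that endpoint.
        self.holes.add(remote_endpoint)

    def accepts(self, remote_endpoint):
        return remote_endpoint in self.holes


def link(sender, receiver):
    """Send one linking message; return True if the receiver's NAT admits it."""
    sender.send(receiver.public_endpoint)
    return receiver.accepts(sender.public_endpoint)


# Placeholder public endpoints learned when each node joined the overlay.
nat_a = Nat(("192.0.2.10", 40001))
nat_b = Nat(("198.51.100.20", 40002))

# B starts linking right after sending its CTM reply: this opens B's NAT,
# but A's NAT has no hole yet, so the message is dropped.
first = link(nat_b, nat_a)
# A gets the CTM reply and links to B: this opens A's NAT and passes B's.
second = link(nat_a, nat_b)
# Both holes now exist, so B's retry reaches A and the shortcut is up.
third = link(nat_b, nat_a)
```

The sketch shows why both sides must send: each linking message opens only the sender's own NAT, and the shortcut works once both holes exist.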

25 Performance considerations
- CPU-intensive application, Condor
- SimpleScalar 3.0d execution-driven computer architecture simulator

26 Performance considerations
- I/O: PostMark
  - Version 1.51
  - Parameters:
    - Minimum file size: 500 bytes
    - Maximum file size: 4.77 MB
    - Transactions: 5,000

27 Performance considerations
- User-level C# IPOP implementation (UDP):
  - Link bandwidth: 25-30 Mbit/s (LAN)
  - Latency overhead: ~4 ms
- Connection times:
  - (Fine-tuning has reduced mean acquire time to ~6-10 s, with degree of redundancy n=8)

28 Condor Appliance on a desktop
[Diagram labels: Linux, Condor, IPOP; Domain-specific tools; User files; Swap; VM; Hardware configuration]

29 Related Work
- Virtual networking
  - VIOLIN
  - VNET; topology adaptation
  - ViNe
- Internet Indirection Infrastructure (i3)
  - Support for mobility, multicast, anycast
  - Decouples packet sending from receiving
  - Based on Chord p2p protocol
- IPv6 tunneling
  - IPv6 over UDP (Teredo protocol)
  - IPv6 over P2P (P6P)

