
1 Global Overlay Network: PlanetLab Claudio E. Righetti, October 2006 (some slides taken from Larry Peterson)

2 “PlanetLab: An Overlay Testbed for Broad-Coverage Services”, Bavier, Bowman, Chun, Culler, Peterson, Roscoe, Wawrzoniak. ACM SIGCOMM Computer Communication Review, Volume 33, Number 3, July 2003. “Overcoming the Internet Impasse through Virtualization”, Anderson, Peterson, Shenker, Turner. IEEE Computer, April 2005. “Towards a Comprehensive PlanetLab Architecture”, Larry Peterson, Andy Bavier, Marc Fiuczynski, Steve Muir, and Timothy Roscoe, June 2005. http://www.planet-lab.org/PDN/PDN-05-030

3 Overview 1. What is PlanetLab? 2. Architecture (Local: Nodes; Global: Network) 3. Details (Virtual Machines; Maintenance)

4 What Is PlanetLab? Geographically distributed overlay network Testbed for broad-coverage network services

5 PlanetLab Goal “…to support seamless migration of an application from an early prototype, through multiple design iterations, to a popular service that continues to evolve.”

6 PlanetLab Goal “…support distributed virtualization – allocating a widely distributed set of virtual machines to a user or application, with the goal of supporting broad-coverage services that benefit from having multiple points-of-presence on the network. This is exactly the purpose of the PlanetLab slice abstraction.”

7 Slices

8

9

10 User Opt-in Server NAT Client

11 Challenge of PlanetLab “The central challenge of PlanetLab is to provide decentralized control of distributed virtualization.”

12 Long-Running Services Content Distribution –CoDeeN: Princeton –Coral: NYU –Cobweb: Cornell Storage & Large File Transfer –LOCI: Tennessee –CoBlitz: Princeton Anomaly Detection & Fault Diagnosis –PIER: Berkeley, Intel –PlanetSeer: Princeton DHT –Bamboo (OpenDHT): Berkeley, Intel –Chord (DHash): MIT

13 Services (cont) Routing / Mobile Access –i3: Berkeley –DHARMA: UIUC –VINI: Princeton DNS –CoDNS: Princeton –CoDoNs: Cornell Multicast –End System Multicast: CMU –Tmesh: Michigan Anycast / Location Service –Meridian: Cornell –Oasis: NYU

14 Services (cont) Internet Measurement –ScriptRoute: Washington, Maryland Pub-Sub –Corona: Cornell Email –ePost: Rice Management Services –Stork (environment service): Arizona –Emulab (provisioning service): Utah –Sirius (brokerage service): Georgia –CoMon (monitoring service): Princeton –PlanetFlow (auditing service): Princeton –SWORD (discovery service): Berkeley, UCSD

15 PlanetLab Today www.planet-lab.org

16 PlanetLab Today Global distributed systems infrastructure –platform for long-running services –testbed for network experiments 583 nodes around the world –30 countries –250+ institutions (universities, research labs, gov’t) Standard PC servers –150–200 users per server –30–40 active per hour, 5–10 at any given time –memory, CPU both heavily over-utilised

17 Usage Stats Slices: 600+ Users: 2500+ Bytes-per-day: 3 - 4 TB IP-flows-per-day: 190M Unique IP-addrs-per-day: 1M

18 Priorities Diversity of Network –Geographic –Links Edge-sites, co-location and routing centers, homes (DSL, cable-modem) Flexibility –Allow experimenters maximal control over PlanetLab nodes –Securely and fairly

19 Key Architectural Ideas Distributed virtualization –slice = set of virtual machines Unbundled management –infrastructure services run in their own slice Chain of responsibility –account for behavior of third-party software –manage trust relationships

20 Architecture Overview Slice: horizontal cut of global PlanetLab resources Service: set of distributed and cooperating programs delivering some higher-level functionality Each service runs in a slice of PlanetLab’s global resources Multiple slices run concurrently “…slices act as network-wide containers that isolate services from each other.”

21 Architecture Overview (main principals) Owner: an organization that hosts (owns) one or more PlanetLab nodes User: a researcher that deploys a service on a set of PL nodes PlanetLab Consortium (PLC): trusted intermediary that manages nodes on behalf of a set of owners, and creates slices on those nodes on behalf of a set of users

22 Trust Relationships Princeton Berkeley Washington MIT Brown CMU NYU EPFL Harvard HP Labs Intel NEC Labs Purdue UCSD SICS Cambridge Cornell … princeton_codeen nyu_d cornell_beehive att_mcash cmu_esm harvard_ice hplabs_donutlab idsl_psepr irb_phi paris6_landmarks mit_dht mcgill_card huji_ender arizona_stork ucb_bamboo ucsd_share umd_scriptroute … N x N Trusted Intermediary (PLC)

23 Trust Relationships (cont) Principals: Node Owner, PLC, Service Developer (User). 1) PLC expresses trust in a user by issuing it credentials to access a slice 2) Users trust PLC to create slices on their behalf and inspect credentials 3) Owner trusts PLC to vet users and map network activity to the right user 4) PLC trusts owner to keep nodes physically secure

24 Principals (PLC = MA + SA) Node Owners –host one or more nodes (retain ultimate control) –select an MA and approve of one or more SAs Service Providers (Developers) –implement and deploy network services –are responsible for the service’s behavior Management Authority (MA) –installs and maintains software on nodes –creates VMs and monitors their behavior Slice Authority (SA) –registers service providers –creates slices and binds them to the responsible provider

25 Trust Relationships (PLC decoupling) (1) Owner trusts MA to map network activity to the responsible slice (2) Owner trusts SA to map a slice to the responsible providers (3) Provider trusts SA to create VMs on its behalf (4) Provider trusts MA to provide working VMs and not falsely accuse it (5) SA trusts provider to deploy responsible services (6) MA trusts owner to keep nodes physically secure The SA is analogous to a virtual organization

26 Architectural Elements (diagram): the MA with its node database, the SA with its slice database, the Node Owner, the Service Provider, and on each node the NM + VMM hosting VMs and the SCS

27 Services Run in Slices PlanetLab Nodes

28 Services Run in Slices PlanetLab Nodes Virtual Machines Service / Slice A

29 Services Run in Slices PlanetLab Nodes Virtual Machines Service / Slice A Service / Slice B

30 Services Run in Slices PlanetLab Nodes Virtual Machines Service / Slice A Service / Slice B Service / Slice C

31 “…to view a slice as a network of Virtual Machines, with a set of local resources bound to each VM.”

32 Architectural Components Node Virtual Machine (VM) Node Manager (NM) Slice Slice Creation Service (SCS) Auditing Service (AS) Slice Authority (SA) Management Authority (MA) Owner Script Resource Specification (Rspec)

33 Per-Node View Virtual Machine Monitor (VMM) Node Mgr Local Admin VM 1 VM 2 VM n …

34 Node Architecture Goals Provide a virtual machine for each service running on a node Isolate virtual machines Allow maximal control over virtual machines Fair allocation of resources –Network, CPU, memory, disk

35 Global View ……… PLC

36 Node A machine capable of hosting one or more VMs Has a unique node_id (bound to a set of attributes) Must have at least one non-shared IP address

37 Virtual Machine (VM) Execution environment in which a slice runs on a particular node. VMs are typically implemented by a Virtual Machine Monitor (VMM). Multiple VMs run on each PlanetLab node The VMM arbitrates the node’s resources among them A VM is specified by a set of attributes (resource specification, RSpec) The RSpec defines how much of the node’s resources are allocated to the VM; it also specifies the VM’s type

38 Virtual Machine (cont) PlanetLab currently supports a single Linux-based VMM, which defines a single VM type (linux-vserver-x86) The most important property today is that VMs are homogeneous

39 Node Manager (NM) Program running on each node that creates VMs on that node, and controls the resources allocated to those VMs All operations that manipulate VMs on a node are made through the NM Provides an interface by which infrastructure services running on the node create VMs and bind resources to them
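The NM's exact interface isn't shown on this slide; as a rough illustration only, here is a minimal Python sketch of a per-node manager exposing VM creation and resource binding. The class and method names (NodeManager, create_vm, bind_resources) are hypothetical, not PlanetLab's actual API.

```python
# Hypothetical sketch of a per-node manager interface (not PlanetLab's real API).
# It only illustrates the idea that all VM manipulation goes through one local broker.
from dataclasses import dataclass, field


@dataclass
class VM:
    slice_name: str
    rspec: dict            # resources bound to this VM
    state: str = "created"


@dataclass
class NodeManager:
    vms: dict = field(default_factory=dict)   # slice_name -> VM

    def create_vm(self, slice_name: str, rspec: dict) -> VM:
        """Create a VM for a slice; called by infrastructure services such as the SCS."""
        if slice_name in self.vms:
            raise ValueError(f"slice {slice_name} already instantiated on this node")
        vm = VM(slice_name, dict(rspec))
        self.vms[slice_name] = vm
        return vm

    def bind_resources(self, slice_name: str, extra: dict) -> None:
        """Bind additional resources (e.g., more CPU share) to an existing VM."""
        self.vms[slice_name].rspec.update(extra)


if __name__ == "__main__":
    nm = NodeManager()
    nm.create_vm("princeton_codeen", {"cpu_share": 32, "mem_limit_mb": 128})
    nm.bind_resources("princeton_codeen", {"disk_quota_gb": 5})
    print(nm.vms["princeton_codeen"].rspec)
```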

40 Slice Set of VMs, with each element of the set running on a unique node The individual VMs that make up a slice contain no information about the other VMs in the set, except as managed by the service running in the slice Slices are uniquely identified by name; the interpretation of a name depends on the context (there is no single name-resolution service) Slice names are hierarchical, with each level denoting a slice authority
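As a small illustration of hierarchical slice names, each prefix can be read as a slice authority delegating to the next; the dotted form used below (e.g. plc.princeton.codeen) is assumed for the sketch and is not necessarily PlanetLab's exact syntax.

```python
# Sketch: interpret a hierarchical slice name, where each prefix names a slice
# authority and the last component names the slice itself. The dotted syntax
# is assumed for illustration only.
def parse_slice_name(name: str) -> dict:
    parts = name.split(".")
    return {
        "authorities": parts[:-1],   # e.g. ["plc", "princeton"]
        "slice": parts[-1],          # e.g. "codeen"
    }


if __name__ == "__main__":
    print(parse_slice_name("plc.princeton.codeen"))
    # {'authorities': ['plc', 'princeton'], 'slice': 'codeen'}
```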

41 Slice Creation Service (SCS) The SCS is an infrastructure service running on each node Typically responsible, on behalf of PLC, for creation of the local instantiation of a slice, which it accomplishes by calling the local NM to create a VM on the node Users may also contact the SCS directly if they wish to synchronously create a slice on a particular node To do so the user presents a cryptographically signed ticket (essentially a signed RSpec)
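A hedged sketch of the check an SCS might perform before asking the NM for a VM: the ticket is modeled as a signed RSpec, and an HMAC over a shared key stands in for whatever signature scheme PlanetLab actually uses. The names below are invented for the example.

```python
# Sketch of synchronous, ticket-driven slice creation on one node.
# A "ticket" is modeled as a signed RSpec; HMAC over a shared key stands in
# for the real cryptographic signature scheme, which is not specified here.
import hashlib
import hmac
import json

PLC_KEY = b"demo-key-known-to-plc-and-node"   # placeholder secret


def sign_ticket(rspec: dict, slice_name: str) -> dict:
    body = json.dumps({"slice": slice_name, "rspec": rspec}, sort_keys=True)
    sig = hmac.new(PLC_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def scs_create_slice(ticket: dict, node_manager) -> bool:
    """Verify the ticket, then ask the local NM to create the VM."""
    expected = hmac.new(PLC_KEY, ticket["body"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, ticket["sig"]):
        return False                      # reject forged or altered tickets
    payload = json.loads(ticket["body"])
    node_manager.create_vm(payload["slice"], payload["rspec"])
    return True


if __name__ == "__main__":
    class _StubNM:                         # stands in for the node's real NM
        def create_vm(self, name, rspec):
            print("creating VM for", name, "with", rspec)

    t = sign_ticket({"cpu_share": 32, "mem_limit_mb": 128}, "princeton_codeen")
    print("accepted:", scs_create_slice(t, _StubNM()))
```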

42 Auditing Service (AS) PLC audits the behavior of slices, and to aid in this process, each node runs an AS. The AS records information about packets transmitted from the node, and is responsible for mapping network activity to the slice that generates it. Trustworthy audit chain: packet signature → slice name → users, where a packet signature is (source, destination, time) The AS offers a public, web-based interface on each node
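The audit chain (packet signature → slice name → users) amounts to two lookups; the sketch below uses invented table contents, keyed on (source, destination, time), purely to illustrate the mapping.

```python
# Sketch of the audit chain: packet signature -> slice name -> users.
# The table contents are invented; a real AS would build them from logged
# traffic and from the slice authority's database.
from datetime import datetime

# packet signature: (source, destination, timestamp) -> slice name
packet_to_slice = {
    ("128.112.0.10", "203.0.113.5", datetime(2006, 10, 1, 12, 0)): "princeton_codeen",
}

# slice name -> responsible users (registered with the slice authority)
slice_to_users = {
    "princeton_codeen": ["alice@princeton.edu", "bob@princeton.edu"],
}


def who_sent(src, dst, when):
    """Map a packet signature back to the users responsible for it."""
    slice_name = packet_to_slice.get((src, dst, when))
    return slice_to_users.get(slice_name, [])


if __name__ == "__main__":
    print(who_sent("128.112.0.10", "203.0.113.5", datetime(2006, 10, 1, 12, 0)))
```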

43 Slice Authority (SA) PLC, acting as an SA, maintains state for the set of system-wide slices for which it is responsible There may be multiple SAs, but this section focuses on the one managed by PLC “SA” refers to both the principal and the server that implements it

44 Management Authority (MA)

45 Owner Script

46 Resource Specification (Rspec)

47 PlanetLab’s design philosophy Application Programming Interface: used by typical services Protection Interface: implemented by the VMM PlanetLab’s node virtualization mechanisms are characterized by where these two interfaces are drawn

48 PlanetLab Architecture Node-level –Several virtual machines on each node, each running a different service Resources distributed fairly Services are isolated from each other Network-level –Node managers, agents, brokers, and service managers provide interface and maintain PlanetLab

49 One Extreme: Software Runtimes (e.g., Java Virtual Machine, MS CLR) Very High level API Depend on OS to provide protection and resource allocation Not flexible

50 Other Extreme: Complete Virtual Machine (e.g., VMware) Very low-level API (hardware) –Maximum flexibility Excellent protection High CPU/memory overhead –Cannot share common resources (OS, common filesystem) among virtual machines A high-end commercial server hosts only tens of VMs

51 Mainstream Operating System API and protection at same level (system calls) Simple implementation (e.g., Slice = process group) Efficient use of resources (shared memory, common OS) Bad protection and isolation Maximum Control and Security?

52 PlanetLab Virtualization: VServers Kernel patch to mainstream OS (Linux) Gives appearance of separate kernel for each virtual machine –Root privileges restricted to activities that do not affect other vservers Some modification: resource control (e.g., File handles, port numbers) and protection facilities added

53

54 Node Software Linux Fedora Core 2 –kernel being upgraded to FC4 –always up-to-date with security-related patches VServer patches provide security –each user gets own VM (‘slice’) –limited root capabilities CKRM/VServer patches provide resource mgmt –proportional share CPU scheduling –hierarchical token bucket controls network Tx bandwidth –physical memory limits –disk quotas
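To make the token-bucket idea concrete, here is a minimal single-bucket sketch in Python. The real node uses the Linux hierarchical token bucket configured per slice; the class below is only a conceptual illustration of the core mechanism, not the kernel implementation.

```python
# Minimal token-bucket sketch: tokens accumulate at `rate_bytes_per_s` up to
# `burst_bytes`; a packet may be sent only if enough tokens are available.
# The kernel's hierarchical token bucket generalizes this with per-slice classes.
import time


class TokenBucket:
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s   # sustained rate, bytes per second
        self.burst = burst_bytes       # bucket depth, bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes: int) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True
        return False                   # caller should queue or drop the packet


if __name__ == "__main__":
    tb = TokenBucket(rate_bytes_per_s=1.5e6 / 8, burst_bytes=100_000)  # ~1.5 Mbps sustained
    print([tb.allow(1500) for _ in range(5)])   # first packets pass within the burst
```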

55 Issues Multiple VM Types –Linux vservers, Xen domains Federation –EU, Japan, China Resource Allocation –Policy, markets Infrastructure Services –Delegation Need to define the PlanetLab Architecture

56 Narrow Waist Name space for slices Node Manager Interface rspec = < vm_type = linux_vserver, cpu_share = 32, mem_limit = 128MB, disk_quota = 5GB, base_rate = 1Kbps, burst_rate = 100Mbps, sustained_rate = 1.5Mbps >
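Read as a record, the example rspec above might be represented as follows; the field names come from the slide, while the plain dictionary encoding (rather than PlanetLab's actual ticket/XML format) is an assumption of this sketch.

```python
# The slide's example rspec expressed as a plain record (illustrative encoding only;
# the real rspec travels inside signed tickets, not as a Python dict).
example_rspec = {
    "vm_type": "linux_vserver",
    "cpu_share": 32,            # proportional CPU share
    "mem_limit": "128MB",
    "disk_quota": "5GB",
    "base_rate": "1Kbps",       # network token-bucket parameters
    "burst_rate": "100Mbps",
    "sustained_rate": "1.5Mbps",
}


def over_limit(requested_mb: int, rspec: dict) -> bool:
    """Tiny example check: does a memory request exceed the rspec's limit?"""
    limit_mb = int(rspec["mem_limit"].rstrip("MB"))
    return requested_mb > limit_mb


if __name__ == "__main__":
    print(over_limit(256, example_rspec))   # True: 256MB exceeds the 128MB limit
```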

57 Node Boot/Install Process (Node ↔ PLC Boot Server) 1. Boots from BootCD (Linux loaded) 2. Hardware initialized 3. Read network config from floppy 4. Contact PLC (MA) 5. Send boot manager 6. Execute boot manager 7. Node key read into memory from floppy 8. Invoke Boot API 9. Verify node key, send current node state 10. State = “install”, run installer 11. Update node state via Boot API 12. Verify node key, change state to “boot” 13. Chain-boot node (no restart) 14. Node booted
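A rough sketch of the node side of this flow follows; the function and API names are invented for illustration and are not the real Boot Manager code. The steps reduce to: invoke the boot API with the node key, act on the state PLC returns, then chain-boot.

```python
# Illustrative sketch of the node-side boot flow described on the slide.
# The states ("install", "boot") follow the slide; the function names and the
# boot-API object are invented so the sequence can be run end to end.
def boot_node(boot_api, node_key: str) -> str:
    """Return the final state the node ends up in."""
    state = boot_api.get_node_state(node_key)       # steps 8-9: invoke Boot API, key verified server-side
    if state == "install":
        run_installer()                             # step 10: write a fresh filesystem
        boot_api.set_node_state(node_key, "boot")   # steps 11-12: update state at PLC
    chain_boot()                                    # step 13: boot installed system without restart
    return "booted"                                 # step 14


def run_installer():
    print("installing node software ...")


def chain_boot():
    print("chain-booting into installed kernel ...")


if __name__ == "__main__":
    class _FakeBootAPI:                             # stand-in for the PLC boot server
        def __init__(self):
            self.state = "install"

        def get_node_state(self, key):
            return self.state

        def set_node_state(self, key, new_state):
            self.state = new_state

    print(boot_node(_FakeBootAPI(), node_key="example-key"))
```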

58 PlanetFlow Logs every outbound IP flow on every node –accesses ulogd via Proper –retrieves packet headers, timestamps, context ids (batched) –used to audit traffic Aggregated and archived at PLC
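Conceptually, PlanetFlow turns batches of logged packet headers (with their context ids) into per-flow records that can later be mapped to slices; the grouping below is a simplified, illustrative sketch with made-up field names, not the actual ulogd/Proper pipeline.

```python
# Simplified sketch of flow aggregation: group logged packet headers into flows
# keyed by (context id, src, dst, protocol). Field names and data are illustrative.
from collections import defaultdict

packets = [  # what a batch pulled from the logger might conceptually contain
    {"ctx": 1012, "src": "128.112.0.10", "dst": "203.0.113.5", "proto": "tcp", "bytes": 1500},
    {"ctx": 1012, "src": "128.112.0.10", "dst": "203.0.113.5", "proto": "tcp", "bytes": 900},
    {"ctx": 2044, "src": "128.112.0.10", "dst": "198.51.100.7", "proto": "udp", "bytes": 200},
]


def aggregate_flows(pkts):
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for p in pkts:
        key = (p["ctx"], p["src"], p["dst"], p["proto"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += p["bytes"]
    return dict(flows)


if __name__ == "__main__":
    for key, stats in aggregate_flows(packets).items():
        print(key, stats)
```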

59 Chain of Responsibility Join Request: PI submits Consortium paperwork and requests to join PI Activated: PLC verifies PI, activates account, enables site (logged) User Activated: Users create accounts with keys, PI activates accounts (logged) Slice Created: PI creates slice and assigns users to it (logged) Nodes Added to Slices: Users add nodes to their slice (logged) Slice Traffic Logged: Experiments run on nodes and generate traffic (logged by Netflow) Traffic Logs Centrally Stored: PLC periodically pulls traffic logs from nodes Network activity → slice → responsible users & PI

60 Slice Creation (PLC-mediated) PI calls SliceCreate() and SliceUsersAdd() at PLC (SA); User calls SliceNodesAdd(), SliceAttributeSet(), SliceInstantiate(); each node retrieves the slice state with SliceGetAll() (slices.xml) and its NM creates the VM on the VMM
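Read as a script, the call sequence above might be driven as follows; the toy in-memory PLC below uses the method names from the slide, but their signatures and behavior are assumptions made only so the sequence can be run end to end.

```python
# Toy in-memory "PLC" that accepts the calls named on the slide, in order.
# Method names come from the slide; arguments and behavior are assumptions.
class ToyPLC:
    def __init__(self):
        self.slices = {}

    def SliceCreate(self, name):
        self.slices[name] = {"users": [], "nodes": [], "attrs": {}, "instantiated": False}

    def SliceUsersAdd(self, name, users):
        self.slices[name]["users"] += users

    def SliceNodesAdd(self, name, nodes):
        self.slices[name]["nodes"] += nodes

    def SliceAttributeSet(self, name, key, value):
        self.slices[name]["attrs"][key] = value

    def SliceInstantiate(self, name):
        self.slices[name]["instantiated"] = True

    def SliceGetAll(self):
        # what each node's slice creation service would periodically fetch (cf. slices.xml)
        return self.slices


if __name__ == "__main__":
    plc = ToyPLC()
    plc.SliceCreate("princeton_codeen")                           # PI
    plc.SliceUsersAdd("princeton_codeen", ["alice"])              # PI
    plc.SliceNodesAdd("princeton_codeen", ["node1.example.org"])  # user
    plc.SliceAttributeSet("princeton_codeen", "cpu_share", 32)    # user
    plc.SliceInstantiate("princeton_codeen")                      # user
    print(plc.SliceGetAll())
```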

61 Slice Creation (ticket-based) PI calls SliceCreate() and SliceUsersAdd(); User calls SliceAttributeSet(), then SliceGetTicket() and distributes the ticket to the slice creation service on each node, which calls SliverCreate(ticket) on the NM to create the VM

62 Brokerage Service PI calls SliceCreate() and SliceUsersAdd(); the Broker calls SliceAttributeSet() and SliceGetTicket() at PLC (SA), then distributes the ticket to the brokerage service on each node, which calls rcap = PoolCreate(ticket) on the NM

63 Brokerage Service (cont) The User calls BuyResources() on the Broker; the broker contacts the relevant nodes and calls PoolSplit(rcap, slice, rspec) on each NM to assign part of its resource pool to the slice’s VM

64 VIRTUAL MACHINES

65 PlanetLab Virtual Machines: VServers Extend the idea of chroot(2) –New vserver created by system call –Descendant processes inherit vserver –Unique filesystem, SYSV IPC, UID/GID space –Limited root privilege Can’t control host node –Irreversible

66 Scalability Reduce disk footprint using copy-on-write –Immutable flag provides file-level CoW –Vservers share 508MB basic filesystem Each additional vserver takes 29MB Increase limits on kernel resources (e.g., file descriptors) –Is the kernel designed to handle this? (inefficient data structures?)

67 Protected Raw Sockets Services may need low-level network access –Cannot allow them access to other services’ packets Provide “protected” raw sockets –TCP/UDP bound to local port –Incoming packets delivered only to service with corresponding port registered –Outgoing packets scanned to prevent spoofing ICMP also supported –16-bit identifier placed in ICMP header
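The protection amounts to two checks wrapped around an otherwise ordinary raw socket: demultiplex incoming packets by registered port, and refuse outgoing packets whose source fields don't belong to the sending slice. Below is a simplified model of those two checks in plain Python; the function names and data structures are invented for illustration, not the kernel mechanism itself.

```python
# Simplified model of "protected" raw sockets: the node tracks which slice has
# registered which local TCP/UDP port, delivers incoming packets only to that
# slice, and drops outgoing packets that spoof another slice's port.
registrations = {}          # (proto, local_port) -> slice_name


def register(slice_name: str, proto: str, port: int) -> None:
    registrations[(proto, port)] = slice_name


def deliver_incoming(proto: str, dst_port: int):
    """Return the slice that should receive this packet, or None to drop it."""
    return registrations.get((proto, dst_port))


def allow_outgoing(slice_name: str, proto: str, src_port: int) -> bool:
    """Outgoing packets are scanned: the source port must be registered to the sender."""
    return registrations.get((proto, src_port)) == slice_name


if __name__ == "__main__":
    register("princeton_codeen", "tcp", 3124)
    print(deliver_incoming("tcp", 3124))             # -> princeton_codeen
    print(allow_outgoing("nyu_coral", "tcp", 3124))  # -> False (spoof attempt blocked)
```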

68 Resource Limits Node-wide cap on outgoing network bandwidth –Protect the world from PlanetLab services Isolation between vservers: two approaches –Fairness: each of N vservers gets 1/N of the resources during contention –Guarantees: each slice reserves certain amount of resources (e.g., 1Mbps bandwidth, 10Mcps CPU) Left-over resources distributed fairly
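The two isolation approaches can be combined by honoring guarantees first and then splitting what is left evenly; a small sketch of that arithmetic follows (slice names and numbers are made up).

```python
# Sketch of "guarantees first, then fair share of the leftovers" for one resource,
# e.g. outgoing bandwidth in kbps. Inputs are invented for illustration.
def allocate(total: float, guarantees: dict) -> dict:
    """guarantees maps slice -> reserved amount; leftovers are split evenly."""
    reserved = sum(guarantees.values())
    if reserved > total:
        raise ValueError("guarantees exceed node capacity")
    leftover_each = (total - reserved) / len(guarantees)
    return {s: g + leftover_each for s, g in guarantees.items()}


if __name__ == "__main__":
    # 10 Mbps node cap; one slice reserved 1 Mbps, the others reserved nothing.
    print(allocate(10_000, {"slice_a": 1_000, "slice_b": 0, "slice_c": 0}))
    # {'slice_a': 4000.0, 'slice_b': 3000.0, 'slice_c': 3000.0}
```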

69 Linux and CPU Resource Management The scheduler in Linux provides fairness by process, not by vserver –Vserver with many processes hogs CPU No current way for scheduler to provide guaranteed slices of CPU time

70 MANAGEMENT SERVICES

71 PlanetLab Network Management 1. PlanetLab nodes boot a small Linux OS from CD and run on a RAM disk 2. The node contacts a bootserver 3. The bootserver sends a (signed) startup script: boot normally, write a new filesystem, or start sshd for remote PlanetLab admin login Nodes can be remotely power-cycled

72 Dynamic Slice Creation 1.Node Manager verifies tickets from service manager 2.Creates a new vserver 3.Creates an account on the node and on the vserver

73 User Logs in to PlanetLab Node /bin/vsh immediately: 1. Switches to the account’s associated vserver 2. Chroot()s to the associated root directory 3. Relinquishes true root privileges 4. Switches UID/GID to the account on the vserver –The transition to the vserver is transparent: it appears the user just logged into the PlanetLab node directly
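The privilege-drop portion of this sequence can be sketched with standard POSIX calls from Python; the chroot/setgid/setuid ordering below is the usual Unix pattern (an assumption here, since the slide shows no code), and entering the vserver context itself is a VServer-specific kernel operation with no portable equivalent, so it appears only as a placeholder comment.

```python
# Sketch of a /bin/vsh-style privilege drop, using standard POSIX calls from Python.
# Must run as root to work for real; group and user IDs are changed before root
# is fully given up, since they cannot be changed afterwards.
import os


def enter_slice(root_dir: str, uid: int, gid: int) -> None:
    # 1. (VServer-specific) switch into the account's vserver context -- placeholder only.
    # 2. Confine the process to the vserver's root directory.
    os.chroot(root_dir)
    os.chdir("/")
    # 3./4. Relinquish root: set the group first, then the user.
    os.setgid(gid)
    os.setuid(uid)


if __name__ == "__main__":
    # Example invocation (requires root privileges and an existing directory):
    # enter_slice("/vservers/princeton_codeen", uid=10001, gid=10001)
    print("see enter_slice() for the ordering of the calls")
```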

74 PlanetLab - Globus

