
1 From Theory to Action: A Blueprint for Experimentation on Large Scale Research Platforms
Abhimanyu Gosain Technical Program Director October 10, 2017

2 Public Private Partnership
PAWR Industry Consortium: cash + in-kind = $50M. National Science Foundation: $50M. What is being contributed is just the beginning; working together has a greater impact than if the parts were kept in silos.

3 PAWR Key Figures
$22.5M in cash plus in-kind equipment and services. 4 city-scale platforms (up to 2 in the first year). 5 years of funding, after which platforms are expected to sustain their own operation.

4 Key RFP Dates
Expression of Intent to Submit Proposal (optional): May 8, 2017. Please submit your team's intent to submit.
Preliminary Proposal (required): June 1, 2017, 6pm Eastern Time.
Full Proposal (required): July 31, 2017.
Finalists announced: no later than October 2017.
Site visits to be completed by the end of 2017.
Winner(s) announced during the first quarter of 2018.

5 DISTINGUISHED PANELISTS
Kapil Dandekar, Professor, Drexel University
KK Ramakrishnan, Professor, UC Riverside
Paul Ruth, Senior Distributed Systems Researcher, RENCI, UNC Chapel Hill

6 Panel Format
Introduction and motivation of the topic (5 minutes); individual panelist remarks from Kapil (Wireless), KK (NFV/Edge), and Paul (Cloud); group discussion; audience Q/A.

7 Application Driven Experimentation
PAWR platforms serve experimenters across application domains: health, public safety, manufacturing, transportation, and energy. Once deployed, platform owners tend to declare victory, but the real work begins when operationalizing and supporting experimentation. The shift is from technology-driven to application-driven platforms, grounded in tangible use cases.

8 Optimize Experimenter Experience
A clear end-to-end user journey: Design, Plan, Execute, Analyze, Publish. Needs alignment with the platform owner. Shared and easy access; open software, tools, and hardware; common APIs.

9 Architectural Convergence
Eliminate technology (*G) lock-in: abstractions of hardware, software-defined "everything", technology-agnostic white boxes, future proof.
Multi-tenant solution: virtualization and network slicing to customize resources and share them among many users; manage multi-technology diversity and hosting support.

10 OpenNetVM – NFV Open Source Platform
Network functions run in Docker containers. DPDK-based design to achieve zero-copy, high-speed I/O. Key: shared memory across the NFs and the NF Manager. Open source; multiple industrial partners are evaluating the use of OpenNetVM. Of course, there are many competitors (e.g., the Fast Data Project, fd.io).

11 OpenNetVM Architecture
The NF Manager (with DPDK) runs in the host's user space; NFs run inside Docker containers. NUMA-aware processing. Zero-copy data transfer to and between NFs. No interrupts, using the DPDK poll-mode driver. Scalable RX and TX threads in the manager and NFs. Each NF has its own ring for receiving and transmitting packet descriptors. NFs start in 0.5 seconds; throughput of 68 Gbps with 6 cores.
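To make the shared-memory design concrete, below is a purely illustrative Python sketch of the descriptor-passing idea (OpenNetVM itself is written in C on DPDK; the pool geometry, function names, and the use of a multiprocessing Queue as the per-NF ring are assumptions for illustration only): the packet is written once into a shared buffer pool, and only a small (slot, length) descriptor travels between the manager and the NF.

# Conceptual sketch only: OpenNetVM is C/DPDK; this just illustrates
# "share the packet buffers, pass only descriptors" with standard Python.
from multiprocessing import Process, Queue, shared_memory

POOL_SLOTS, SLOT_SIZE = 1024, 2048          # hypothetical buffer-pool geometry

def manager(rx_ring: Queue, pool_name: str) -> None:
    """Stand-in for the NF Manager's RX thread: copy the packet into the
    shared pool once, then hand the NF only a (slot, length) descriptor."""
    pool = shared_memory.SharedMemory(name=pool_name)
    pkt = bytes(range(64))                  # pretend this arrived from the NIC
    slot = 0
    pool.buf[slot * SLOT_SIZE:slot * SLOT_SIZE + len(pkt)] = pkt
    rx_ring.put((slot, len(pkt)))           # descriptor, not the packet bytes
    pool.close()

def network_function(rx_ring: Queue, pool_name: str) -> None:
    """Stand-in for an NF: take a descriptor from its ring and read the packet
    in place (a real NF spin-polls its ring instead of blocking)."""
    pool = shared_memory.SharedMemory(name=pool_name)
    slot, length = rx_ring.get()
    data = bytes(pool.buf[slot * SLOT_SIZE:slot * SLOT_SIZE + length])
    print(f"NF handled a {length}-byte packet from slot {slot}: {data[:4].hex()}")
    pool.close()

if __name__ == "__main__":
    pool = shared_memory.SharedMemory(create=True, size=POOL_SLOTS * SLOT_SIZE)
    ring = Queue()                          # stands in for a per-NF descriptor ring
    procs = [Process(target=manager, args=(ring, pool.name)),
             Process(target=network_function, args=(ring, pool.name))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    pool.close()
    pool.unlink()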

12 Performance w/ Real Traffic
Send HTTP traffic through OpenNetVM:
1 RX thread, 1 TX thread, 1 NF = 48 Gbps.
2 RX threads, 2 TX threads, 2 NFs = 68 Gbps (NIC bottleneck?).
2 RX threads, 5 TX threads, chain of 5 NFs = 38 Gbps.
Fast enough to run a software-based core router, or middleboxes that function as a 'bump in the wire'.

13 Cellular Network Architecture (3G & LTE)
[Architecture diagram: the UTRAN/eUTRAN radio access network (NodeB and RNC for 3G over the Uu interface; eNodeB for LTE) connects through the packet core (SGSN/GGSN for 3G; MME for control, SGW/PGW for data, PCRF for management and policy in 4G) over an IP-based backbone, through firewalls and caches, to the Internet, IoT cloud, and content providers.]
3G flow (user to Internet): UE <-> NodeB <-> RNC <-> SGSN <-> GGSN <-> Internet
4G flow (user to Internet): UE <-> eNodeB <-> SGW <-> PGW <-> Internet

14 Cellular EPC: Distributed Hardware
The cellular EPC uses separate, purpose-built hardware appliances for each function, with a number of distinct components. The architecture evolved from a traditional circuit/virtual-circuit-switched network design. Data plane and control plane components are separate, and state must be kept consistent across all components. The protocol is complex, with many messages, and the use of GTP tunnels adds overhead and latency.
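As a hedged illustration of where the tunnel overhead comes from, the snippet below builds the standard 8-byte mandatory GTP-U header that wraps every user packet between the gateways; the TEID and payload values are made up, and the surrounding outer IPv4 (20 bytes) + UDP (8 bytes, port 2152) headers are what bring the per-packet encapsulation cost to roughly 36 bytes over IPv4, in addition to the per-bearer tunnel state kept at both endpoints.

# Sketch of the mandatory GTP-U header (flags, message type, length, TEID).
import struct

GTPU_UDP_PORT = 2152        # registered UDP port for the GTP user plane

def gtpu_header(teid: int, payload_len: int) -> bytes:
    """Pack the 8-byte mandatory GTP-U header."""
    flags = 0x30            # version 1, protocol type GTP, no optional fields
    msg_type = 0xFF         # G-PDU: the payload is an encapsulated user packet
    return struct.pack("!BBHI", flags, msg_type, payload_len, teid)

user_packet = bytes(60)     # stand-in for a 60-byte user IP packet
tunneled = gtpu_header(teid=0x1234ABCD, payload_len=len(user_packet)) + user_packet
# 8 bytes of GTP-U on top of the outer IPv4 (20) + UDP (8) headers per packet:
print(len(tunneled) - len(user_packet), "bytes of GTP-U overhead alone")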

15 Message Exchanges in the Traditional 3GPP Approach
[Message sequence charts: Initial Attach; Service Request (Idle to Active).]

16 Mobility Management: Handoff without change of S-GW or P-GW (S1 handoff)
Results in a substantial number of control messages in total across the S-GW, MME, and eNBs; a handoff that changes the S-GW or MME has even more overhead. With a large number of mobile IoT devices, this overhead becomes a concern.
[Message sequence chart among the UE, eNBs, MME, S-GW, and P-GW.]
Primary synchronization sequences: 3; secondary synchronization sequences: 168; a total of 504 cell identities.
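For context, the 504 figure follows from standard LTE numerology (not specific to this deck): the physical cell identity is composed from the two synchronization sequences,

\[
N_{\mathrm{ID}}^{\mathrm{cell}} = 3\,N_{\mathrm{ID}}^{(1)} + N_{\mathrm{ID}}^{(2)},
\qquad N_{\mathrm{ID}}^{(1)} \in \{0,\dots,167\}\ \text{(SSS)},
\qquad N_{\mathrm{ID}}^{(2)} \in \{0,1,2\}\ \text{(PSS)},
\]

so there are \(168 \times 3 = 504\) distinct physical cell identities.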

17 Opportunity: NFV-based Arch. & Protocol
There is tight coupling between the data and control planes in cellular networks. Industry proposals put all control state for a session in a separate entity, distinct from the data plane, but separating this state comes at a price. The opportunity is to rethink the control plane protocol to leverage what the new architecture makes possible.

18 Rethinking the Design for Cellular EPC
CleanG protocol: reduce the number of control plane transactions and simplify the control plane protocol. CleanG architecture: a more scalable architecture that consolidates the EPC components; NFV-based, scaling out to meet demand on both the data plane and the control plane. Optimize the protocol and architecture for the common case, lowering overhead for the common cases.

19 CleanG architecture: a single system implementing the EPC components
The 'CORE' data and control functions run on NFV platforms such as OpenNetVM, minimizing delay between the control and data planes by leveraging shared memory. Dynamic resource allocation: control plane and data plane resources can be scaled separately.

20 CleanG protocol - Attach
Compare this to the 30 messages in the current protocol.

21 CleanG protocol – Handover
No X2 (delay-tolerant traffic). Compare this to the much larger number of messages in the current protocol.

22 Performance with CleanG
Representative workload of users attaching to the network, transitioning between active and idle, and performing handovers.

23 Summary Networks are changing – moving to a software base
SDN brings centralized control; NFV brings software-based implementations. The NetVM/OpenNetVM efforts enhance this industry direction: the NFV platform provides a significant performance improvement and a more coherent and effective software network architecture. CleanG rethinks the partitioning of the cellular control plane, consolidating control and data plane components, with an SDN controller responsible for high-level policy and configuration. CleanG simplifies the protocol and architecture, yielding a scalable solution for next-generation cellular networks.

24 Getting OpenNetVM Source code and NSF CloudLab images at

25 Paul Ruth RENCI - UNC Chapel Hill
From RENCI: operating federated SDN infrastructure and using it as a virtualized SDX. Work on ExoGENI and Chameleon, using both for science and research.

26 ExoGENI
About 20 sites; each is a small OpenStack cloud. Dynamic provisioning of L2 paths between them (sometimes from a pool of existing VLANs).

27 Chameleon: SDN Experiments
Internet2 AL2S, GENI, and future partners. Chameleon Phase 2: RENCI added to the team.
Hardware network isolation: Corsa DP2000 series switches (OpenFlow v1.3), sliceable network hardware, tenant-controlled Virtual Forwarding Contexts (VFCs), isolated tenant networks, and BYOC (bring your own controller).
Wide-area stitching between Chameleon sites (100 Gbps), to ExoGENI, and to campus networks (Science DMZs).
[Diagram: the Austin and Chicago sites are joined by the 100 Gbps Chameleon core network with an uplink to the public network; each Standard Cloud Unit has a Corsa DP2400 switch hosting per-tenant VFCs, compute nodes for Tenants A and B, and per-tenant OpenFlow controllers (e.g., Ryu for Tenant A).]
Goals: a production testbed for SDN experimentation (production?); high-bandwidth wide-area SDN experiments.
Gaps: a testbed for users to experiment with wide-area SDN; isolation below the VLAN level; looking forward to creating "network appliances".
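Since the slide shows per-tenant Ryu controllers and BYOC, here is a minimal, hypothetical sketch of the kind of Ryu OpenFlow 1.3 app a tenant could point a Virtual Forwarding Context at; it only installs a lowest-priority flood rule, and everything beyond the Ryu API itself (the class name and the chosen policy) is illustrative.

# Hypothetical tenant controller: installs a table-miss flood rule so the slice
# has basic connectivity before any tenant-specific forwarding rules are added.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

class TenantController(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        # Called once the tenant's virtual forwarding context connects.
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        match = parser.OFPMatch()                       # match everything
        actions = [parser.OFPActionOutput(ofp.OFPP_FLOOD)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=0,
                                      match=match, instructions=inst))

A tenant would run this with ryu-manager and point the VFC's controller address at the host where it runs.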

28 Superfacilities the SAFE way
Superfacilities are combinations of resources from many different organizations, assembled for a single purpose or community; thus far they are mostly built by hand. The goal is to automate their creation: basically, an SDX that connects multiple facilities. This requires multi-domain SDN with friction-free L2 paths, but naked L2 paths are not secure and the handshake model of trust is not going to work; the automation needs to retain the trust traditionally provided by human interaction, with user-specified connectivity and forwarding policies.
SAFE (Secure Authorization for Federated Environments) provides a trust model for reasoned policies for connectivity and for forwarding traffic. The virtual SDX is distributed and enforces the client's forwarding policy.
Gaps: a formal trust model for automating multi-domain, low-level network paths; a formal trust model for tenant-specified traffic forwarding policy, which extends network control to the end user; defining the desired policies.
Open questions: What functionality should a vSDX provide? How should dynamic, distributed vSDX policy be designed, and how should policy changes be automated (maybe just policy that includes variables that change over time)? These are northbound issues extended to clients/tenants (the specific underlying SDN implementation is not important). How can the SDX run the client's "code", i.e., the client's policy? NSF CICI Award #

29 Questions: Is there strong demand for networking emulation environments as well as large-scale testbeds in your topic area? If yes, what are the technical challenges and pilot experiments?
The demand for emulation seems to be greater: it is easier (Mininet, Open vSwitch, network namespaces; a short Mininet sketch follows below), and many experiments are about scalability of the control plane.
End-to-end data plane experiments need large-scale testbeds: opt-in users bringing in real traffic, and real equipment with real flaws. Realistic control plane experiments also need large-scale testbeds, for example peering with production networks (SDX, BGP, etc.).
Technical challenges in moving to large-scale testbeds: GENI is an edge network testbed, and we need a core network testbed. Reliability: many experimenters view GENI as fragile, and the more aggregates in an experiment, the greater the chance of something failing. Access to diverse, modern hardware: the original GENI SDN hardware mostly offers under-implemented OpenFlow 1.0, real hardware ages quickly and needs refresh, and the diversity of available hardware matters.
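As a concrete example of why emulation is the easier path, here is a hedged Mininet sketch of a small control-plane experiment; the tree depth, fanout, and controller address are arbitrary illustrative choices, not taken from the slides.

# A few lines give you an Open vSwitch topology on one machine, pointed at an
# external controller (e.g., a Ryu app) listening on localhost.
from mininet.log import setLogLevel
from mininet.net import Mininet
from mininet.node import OVSSwitch, RemoteController
from mininet.topolib import TreeTopo

if __name__ == "__main__":
    setLogLevel("info")
    net = Mininet(topo=TreeTopo(depth=2, fanout=2),
                  switch=OVSSwitch,
                  controller=lambda name: RemoteController(
                      name, ip="127.0.0.1", port=6653))
    net.start()
    net.pingAll()      # quick control-plane sanity check across the emulation
    net.stop()

Run as root with a controller already listening on 127.0.0.1:6653; pingAll succeeds only once that controller installs forwarding state.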

30 Questions: What are the critical impediments to using existing experimental testbeds to port ideas into experiments on real infrastructure?
Most SDN GENI experiments are emulated: Open vSwitch works better than deployed hardware SDN switches, so a testbed providing realistic experimental SDN hardware is needed. Deploying large experiments is difficult: the more aggregates in an experiment, the greater the chance of something failing.

31 Questions: What is your wishlist of infrastructure/software services or tools that you have used or would like to see for your topic area?
GENI as a core experimental network testbed that connects all the other testbeds, clouds, compute facilities, your campus, Science DMZs, etc.: peering with the Internet, campuses, etc. (SDX/BGP); connections to research computing facilities (TACC, NERSC, etc.); connections to all the clouds (Amazon, Azure, Google Cloud Platform, Chameleon, CloudLab, Bridges, Jetstream, etc.).
Modern SDN hardware allocated to tenants (stable and unstable): legacy, OpenFlow, P4, whatever is next…
Compute and storage in the networking core: Chip Elliott has said that switches and routers use legacy designs and should include compute, storage, and even more (GPUs? machine learning informing network decisions).

32 Key Questions: What are the critical impediments to using existing experimental testbeds to port ideas into experiments on real infrastructure? What is your wishlist of infrastructure/software services or tools that you have used or would like to see for your topic area?

