
1 Scalable Platform: A Next Generation Ecosystem with SDN for LHC Run2 and Beyond
+ Industry, Joint Genome Institute. Harvey Newman, Caltech. S2I2 Workshop, May 2, 2017. Slides: NGenIAES_S2I2PrincetonWorkshop_hbn pptx?dl=0

2 A New Era of Challenges and Opportunity
Meeting the Needs for LHC Run2 and Beyond; Paving the Way for Other Science Programs

3 New Levels of Challenge
Challenges: global exabyte data distribution, processing, access and analysis
Exascale data: 0.4 EB now, 1 EB by the end of LHC Run2, growing to ~100 EB during the HL LHC
Network: total flow of >1 EB this year; 850 PBytes flowed over WLCG in 2016
Projected shortfalls by the HL LHC: CPU ~4-12X, storage ~3X, networks
Network dilemma: per technology generation (~8 years), capacity per unit cost grows 4X, while bandwidth growth is 40X (Internet2); X (ESnet)
During Run3 (~2022) we will likely reach a network limit; this is unlike the past
Technology outlook: optical and switch advances are evolutionary
HEP will face increasing competition from other data-intensive programs: sky surveys (LSST, SKA), next-generation light sources, Earth observation, genomics
New levels of challenge: global data distribution, processing, access and analysis; coordinated use of massive but still limited and diverse compute, storage and network resources; coordinated operation and collaboration within and among global scientific enterprises

4 Energy Sciences Network
Updates and Outlook: Gbps typical, with peaks to 300+ Gbps
Long-term traffic growth of 72%/year (10X per 4 years) continues: traffic doubled to 64 PB/month in 2016, with 100+ PB/month estimated by the end of 2017 (a quick arithmetic check follows below)
MyESnet (my.es.net) traffic portal: for both users and IT experts
LHCONE rapid growth: now the largest class of ESnet traffic
ESnet6: the next SDN-enabled generation, now being planned
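As a rough sanity check of the growth figures quoted above, the short Python sketch below compounds the stated 72%/year rate; the 4-year factor and the end-2017 projection are derived from that rate and the 64 PB/month 2016 figure, not taken from the slide.

```python
# Sketch: sanity-checking the ESnet growth figures quoted on this slide.
# Assumes sustained 72%/year compound growth (slide figure); everything
# else is derived arithmetic, not data from the slide.

ANNUAL_GROWTH = 0.72          # 72% per year
PB_PER_MONTH_2016 = 64        # PB/month in 2016

# 72%/year compounded over 4 years:
four_year_factor = (1 + ANNUAL_GROWTH) ** 4
print(f"4-year growth factor: {four_year_factor:.1f}x")   # ~8.8x, i.e. roughly 10x per 4 years

# Projection one year ahead (end of 2017):
proj_2017 = PB_PER_MONTH_2016 * (1 + ANNUAL_GROWTH)
print(f"Projected end-2017 traffic: ~{proj_2017:.0f} PB/month")  # ~110 PB/month, consistent with 100+ PB/mo
```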

5 Responding to the Challenges: A New Overarching “Consistent Operations” Paradigm
Scientific workflow management systems that are agile, deeply network-aware and proactive, responding to moment-to-moment feedback on state changes in the networks and end systems, and on actual versus estimated transfer progress and access I/O (sketched below)
Prerequisites for effective co-scheduling of computing, storage and network resource allocation: end systems, data transfers and access methods capable of high throughput; a holistic view of workflows with diverse characteristics; real-time end-to-end monitoring systems
Technical elements for efficient operation within the limits: SDN-driven bandwidth allocation, load balancing and flow moderation at the network edges and in the core, plus on-the-fly topology reconfiguration where needed
Result: moving towards the best use of the available network, computing and storage infrastructures while avoiding saturation and blocking of other network traffic
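As a rough illustration of the feedback loop described above (not the actual middleware), here is a minimal Python sketch that compares a flow's actual progress against its schedule and nudges an edge rate cap; FlowState, the thresholds and the 20% step sizes are all hypothetical.

```python
# Minimal sketch of the feedback idea: compare actual vs. estimated transfer
# progress and adjust an edge rate cap accordingly. FlowState and the tuning
# constants are hypothetical placeholders for real monitoring and SDN edge hooks.

from dataclasses import dataclass

@dataclass
class FlowState:
    flow_id: str
    bytes_done: int        # bytes transferred so far
    bytes_expected: int    # bytes expected by now, from the transfer schedule
    rate_cap_gbps: float   # current cap enforced at the site edge

def adjust_flow(flow: FlowState, headroom_gbps: float) -> float:
    """Return a new edge rate cap for this flow based on its progress."""
    progress_ratio = flow.bytes_done / max(flow.bytes_expected, 1)
    if progress_ratio < 0.9 and headroom_gbps > 0:
        # Falling behind schedule and the path has spare capacity: raise the cap.
        return min(flow.rate_cap_gbps * 1.2, flow.rate_cap_gbps + headroom_gbps)
    if progress_ratio > 1.1:
        # Ahead of schedule: release bandwidth for other prioritized flows.
        return flow.rate_cap_gbps * 0.8
    return flow.rate_cap_gbps
```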

6 Consistent Operations Paradigm: Technical Implementation Directions
METHOD: Construct autonomous network-resident services that dynamically interact with site-resident services, and with the experiments’ principal data distribution and management tools, to coordinate the use of network, storage and compute resources, using:
Smart middleware to interface to SDN-orchestrated data flows over network paths with allocated bandwidth levels, all the way to a set of high-performance end-host data transfer nodes (DTNs) (see the sketch below)
Protocol-agnostic traffic shaping services at the site edges and in the network core, coupled to high-throughput data transfer applications that provide stable, predictable transfer rates
Machine learning and system modeling, plus pervasive end-to-end monitoring, to track, diagnose and optimize system operations on the fly
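One way the "smart middleware" interface might look, purely as an illustration: a site-resident service asks a network-resident orchestrator for a bandwidth-allocated path between two DTNs before launching a transfer. The orchestrator URL, endpoint and JSON fields below are hypothetical and do not correspond to any specific controller's real API.

```python
# Hypothetical sketch of site middleware requesting an SDN-orchestrated,
# bandwidth-allocated path between two DTNs before launching a transfer.
# The orchestrator URL and JSON schema are illustrative only.

import json
import urllib.request

ORCHESTRATOR = "http://orchestrator.example.org/api/paths"   # placeholder

def request_path(src_dtn: str, dst_dtn: str, gbps: float, duration_s: int) -> dict:
    payload = json.dumps({
        "src": src_dtn,
        "dst": dst_dtn,
        "bandwidth_gbps": gbps,
        "duration_s": duration_s,
    }).encode()
    req = urllib.request.Request(ORCHESTRATOR, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)   # e.g. {"path_id": ..., "granted_gbps": ...}

# The returned grant would then be handed to the transfer application
# (e.g. an FDT-class tool) so it paces itself to the allocated rate.
```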

7 Next Generation “Consistent Operations”:
Site-Core Interactions for Efficient, Predictable Workflow
Key components: (1) Open vSwitch (OVS) at the edges to stably limit flows; (2) Application-Layer Traffic Optimization (ALTO) in OpenDaylight for end-to-end optimal path creation, with flow metering and high watermarks set in the core
Real-time flow adjustments triggered as above
Optimization using “Min-Max Fair Resource Allocation” (MFRA) algorithms on prioritized flows (illustrated below)
Flow metering in the network is fed back to the OVS edge instances to ensure smooth end-to-end progress of flows
Consistent Ops with ALTO, OVS and MonALISA FDT schedulers
Demos: Internet2 Global Summit in May; SC16 in November
With the Yale CS team: Y. Yang, Q. Xiang et al.
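The per-link step of a min-max fair allocation can be illustrated with the classic water-filling computation. The sketch below assumes a single bottleneck link and unweighted flows, which is much simpler than the prioritized, multi-link MFRA optimization used in the demos.

```python
# Illustrative water-filling computation for max-min fair sharing of a single
# bottleneck link, the kind of per-link step an MFRA-style optimizer performs.
# Flow priorities/weights and multi-link coupling are omitted for brevity.

def max_min_fair(capacity: float, demands: list[float]) -> list[float]:
    """Allocate `capacity` among flows with the given demands, max-min fairly."""
    alloc = [0.0] * len(demands)
    remaining = capacity
    unsatisfied = list(range(len(demands)))
    while unsatisfied and remaining > 1e-9:
        share = remaining / len(unsatisfied)
        still_unsatisfied = []
        for i in unsatisfied:
            need = demands[i] - alloc[i]
            if need <= share:
                alloc[i] = demands[i]      # fully satisfied; leftover is redistributed
                remaining -= need
            else:
                still_unsatisfied.append(i)
        if len(still_unsatisfied) == len(unsatisfied):
            # No flow was satisfied this round: split what's left equally and stop.
            for i in still_unsatisfied:
                alloc[i] += share
            break
        unsatisfied = still_unsatisfied
    return alloc

# Example: a 100 Gbps link shared by flows asking for 10, 40 and 80 Gbps
# yields allocations of 10, 40 and 50 Gbps respectively.
print(max_min_fair(100, [10, 40, 80]))
```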

8 SC15-16: SDN Driven Next Generation Terabit/sec Integrated Network for Exascale Science
supercomputing.caltech.edu
SDN-driven flow steering, load balancing and site orchestration over Terabit/sec global networks
Preview of PetaByte transfers to/from the site edges of exascale facilities, with 100G-1000G DTNs
SC16: Consistent Operations with agile feedback; major science flow classes up to high water marks
LHC at SC15: Asynchronous Stageout (ASO) with Caltech’s SDN controller
Tbps ring designed for SC16: Caltech, Ciena, SciNet, StarLight + many HEP, network and vendor partners at SC16

9 Consistent Operations Paradigm: R&D Areas Where Development is Required
Deep site orchestration among virtualized clusters, storage subsystems and subnets, to successfully co-schedule CPU, storage and network resources (a simple co-scheduling check is sketched below)
Science-program-designed site architectures, operational modes, priorities and metrics of success, adjudicated across multiple network domains and among multiple virtual organizations
Seamlessly extending end-to-end operation across both extra-site and intra-site boundaries through the use of next-generation Science DMZs and edge control tools such as Open vSwitch
Funneling massive sets of streams to DTNs at the site edge hosting petascale buffer pools configured for flows of 40, 100 Gbps and up, exploiting state-of-the-art data transfer applications where possible
Unsupervised and supervised machine learning and modeling methods to drive the optimization of end-to-end workflows involving terabyte to multi-petabyte datasets; develop effective metrics and methods
NOTE: Some of these will exploit existing and ongoing developments of state-of-the-art servers, interfaces, transfer tools, SDN and emerging network “operating systems”. A great deal of additional work is needed.
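A minimal sketch of the co-scheduling check mentioned in the first item, assuming hypothetical Site and Task records; a real workflow system would draw these from its information and monitoring services and apply far richer policies.

```python
# Hedged sketch of the co-scheduling idea: a task is only dispatched to a
# site whose CPU, storage and network headroom all fit. Site and Task are
# illustrative stand-ins for real workflow-system records.

from dataclasses import dataclass
from typing import Optional

@dataclass
class Site:
    name: str
    free_cores: int
    free_storage_tb: float
    free_wan_gbps: float

@dataclass
class Task:
    cores: int
    input_tb: float
    ingest_gbps: float   # rate needed to stage the input in time

def pick_site(task: Task, sites: list[Site]) -> Optional[Site]:
    """Return the first site that can co-schedule CPU, storage and network for the task."""
    for site in sites:
        if (site.free_cores >= task.cores
                and site.free_storage_tb >= task.input_tb
                and site.free_wan_gbps >= task.ingest_gbps):
            return site
    return None   # no site can currently host the task; defer or re-plan
```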

10 Exascale Ecosystems with Petabyte Transactions for Next-Generation Data Intensive Sciences
Opportunity for HEP (CMS example): CPU needs will grow 65 to 200X by the HL LHC; the dedicated CPU that can be afforded will be an order of magnitude less, even after code improvements, on the present trajectory
Short-term goal: making such systems a grid resource for CPU, using data resident at the Tier1s and US Tier2s
Method: petabyte transactions over 100G, then G (2018), then Terabit/sec networks (~ ), with secure proxies at the site edge (rough transfer timings are sketched below)
Important long-term benefits: folding LCFs into a global ecosystem for HEP and data-intensive sciences; building a modern coding workforce; helping to shape the future architecture and operational modes of exascale computing facilities
Pilot programs with Argonne and ORNL: HPC systems as grid resources; DTN and process design for 100G+ data transfers; precise NLO generators with new, more efficient methods
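Rough timing for the petabyte transactions mentioned above, assuming an ideal, fully sustained link with no protocol or storage overheads; the link speeds used are just the 100G and Terabit/sec figures from this slide.

```python
# Back-of-envelope timing for petabyte transactions, assuming an ideal,
# fully sustained link (no protocol or storage overheads).

def transfer_hours(petabytes: float, gbps: float) -> float:
    bits = petabytes * 1e15 * 8          # PB -> bits (decimal units)
    return bits / (gbps * 1e9) / 3600    # seconds -> hours

for link_gbps in (100, 1000):
    print(f"1 PB over {link_gbps} Gbps: ~{transfer_hours(1, link_gbps):.1f} hours")
# ~22.2 hours at 100 Gbps, ~2.2 hours at 1 Tbps
```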

11 Bringing Diverse HPC Systems and Clouds Into the HEP Ecosystem: R&D Areas
Recasting HEP’s generation, reconstruction and simulation codes to adapt to the HPC architectures
Identifying and matching units of work in HEP’s workflow to specific HPC resources or sub-facilities well adapted to the task
Developing algorithms that effectively co-schedule resources: CPU, memory, storage, IO ports, and local and wide area network resources
Developing a security infrastructure and a corresponding hardware/software system architecture that meets the sites’ security needs
Building dynamic and adaptive “just in time” systems that respond, as rapidly as required, to offered resources as they occur (a simple matching sketch follows)
Applying machine learning, modeling and game-theoretic methods to optimize the workflow within the facility’s conditions and constraints
Exploiting the intense ongoing development of virtualized computing systems, networks and services (SDN, NFV, NV) in the data center, campus and wide area network space, for coherent distributed system operations
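A toy sketch of the “just in time” matching idea above, assuming a hypothetical ResourceOffer (e.g. a backfill window announced by an HPC facility) and a queue of WorkUnit records; real systems would also weigh data locality, priorities and deadlines.

```python
# Sketch of "just in time" matching: when an HPC facility offers a resource
# window (e.g. a backfill slot), pick queued HEP work units that fit inside it.
# WorkUnit and ResourceOffer are hypothetical illustrations.

from dataclasses import dataclass

@dataclass
class ResourceOffer:
    nodes: int
    wall_seconds: int

@dataclass
class WorkUnit:
    name: str
    nodes: int
    est_seconds: int

def fill_offer(offer: ResourceOffer, queue: list[WorkUnit]) -> list[WorkUnit]:
    """Greedily select queued work units that fit inside the offered window."""
    chosen, nodes_left = [], offer.nodes
    # Prefer larger units first so the window is not fragmented by small jobs.
    for unit in sorted(queue, key=lambda u: u.nodes, reverse=True):
        if unit.nodes <= nodes_left and unit.est_seconds <= offer.wall_seconds:
            chosen.append(unit)
            nodes_left -= unit.nodes
    return chosen
```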

12 Service Diagram: LHC Pilot

13 Backup Slides Follow

14 CMS at SC16: ExaO - Software Defined Data Transfer Orchestrator with PhEDEx and ASO
Leverage emerging SDN techniques to realize end-to-end orchestration of data flows involving multiple host groups in different domains
Maximal link utilization with ExaO
PhEDEx: CMS data placement tool for datasets; ASO: stageout of output files from CMS analysis jobs
Tests across the SC16 floor (Caltech, UMich and Dell booths) and out over the wide area: FIU, Caltech, CERN, UMich
Dynamic scheduling of PetaByte transfers to multiple destinations (a simplified split-by-capacity sketch follows)
Partners: UMich, StarLight, PRP, UNESP, Vanderbilt, NERSC/LBL, Stanford, CERN; ESnet, Internet2, CENIC, MiLR, AmLight, RNP, ANSP
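A much-simplified stand-in for ExaO's multi-destination scheduling: split a dataset across destinations in proportion to the currently available capacity toward each one. The function, destination names and numbers below are illustrative only; real ExaO logic (multi-domain paths, host groups, re-planning) is not modeled.

```python
# Simplified multi-destination split: assign a dataset's terabytes to each
# destination in proportion to its available capacity. Illustrative only.

def split_by_capacity(total_tb: float, dest_gbps: dict[str, float]) -> dict[str, float]:
    """Assign terabytes to each destination proportionally to its available Gbps."""
    total_gbps = sum(dest_gbps.values())
    return {dest: total_tb * gbps / total_gbps for dest, gbps in dest_gbps.items()}

# Example: 1000 TB spread across three destinations with different headroom.
print(split_by_capacity(1000, {"UMich": 80, "FIU": 40, "CERN": 80}))
# -> {'UMich': 400.0, 'FIU': 200.0, 'CERN': 400.0}
```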

