What I really want from networks NOW, and in 5-10 years time: The Researcher's View. Ed Seidel, Max-Planck-Institut für Gravitationsphysik (Albert Einstein Institut) + GridLab Project (www.gridlab.org). PHYSICS TODAY / TOMORROW
NSF e-Science Panel, Dec 2001. What a scientist wants: "When I hit Enter on my laptop, I want the solution to this very complex calculation to appear on my screen as a volumetric rendering, within a fraction of a second." "When I hit Enter on my laptop, I want a table of requested data to be transferred from an unknown TB database somewhere to my laptop and displayed, within a fraction of a second." "Whatever it is, I want it to act like a wire that connects me to resources I need, as if I were the only one on the circuit and using those resources."
Quiz: What does a Researcher Care about? Theoretical bandwidth, latency, topology, switches, lambdas, etc.? Or application-level features as they experience them: guaranteed reliable data transport performance; remote control of instrumentation and experimental apparatus; information searching performance; delivered parallel/distributed computational performance; functional multicast video/audio for collaboration.
FAQ: Why isn't my network bandwidth used? Answers: You don't provide enough (end-to-end) bandwidth! You don't provide enough QoS. The last mile problem… NSF Grand Challenges 1992: develop high-performance, demanding applications for science/engineering; create distributed teams of Apps, CS, etc. Maximum bandwidth between centers was 45 Mbit/s, barely used at that time!
How did we do this EU Calculation? Needed largest academic machines (in US) for simulation: LBL/NERSC (US DOE), NCSA Platinum cluster. High-speed backbone for data transfer. Flew students from Berlin to Illinois: 3 weeks of analysis and visualization; special facilities and experts available to our EU project there. Brought 1 TB of data back on 6 disks purchased in US. Remote access QoS very poor. Airplanes have better bandwidth than networks. Discovery Channel Movie for EU Network: 3000 frames, volume rendering, TB of simulation data. EU Project simulation had to be computed and visualized in US!
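The "airplanes have better bandwidth" remark can be checked with back-of-envelope arithmetic. The sketch below is only an illustration under stated assumptions: the 24-hour door-to-door trip time is a guess, not a figure from the slides, and the 45 Mbit/s link rate is borrowed from the 1992 inter-center figure quoted earlier purely as a comparison point. Only the 1 TB data volume comes from this slide.

```python
TB = 10**12  # bytes in one terabyte (decimal convention)


def transfer_hours(size_bytes: float, rate_bits_per_s: float) -> float:
    """Hours needed to push size_bytes through a link at a sustained rate."""
    return size_bytes * 8 / rate_bits_per_s / 3600


def effective_rate_mbit(size_bytes: float, elapsed_hours: float) -> float:
    """Effective bandwidth in Mbit/s of moving size_bytes in elapsed_hours."""
    return size_bytes * 8 / (elapsed_hours * 3600) / 1e6


# Assumption (not from the slides): 24 hours door to door for the disks.
TRIP_HOURS = 24
# Assumption: a fully sustained 45 Mbit/s link, used only for comparison.
LINK_BITS_PER_S = 45e6

disk_rate = effective_rate_mbit(TB, TRIP_HOURS)
link_hours = transfer_hours(TB, LINK_BITS_PER_S)

print(f"1 TB carried in {TRIP_HOURS} h is about {disk_rate:.1f} Mbit/s")
print(f"1 TB over a 45 Mbit/s link takes about {link_hours:.1f} h")
```

Under these assumptions the disks in a suitcase deliver roughly 93 Mbit/s, about twice what the link could manage even if it were never shared. Changing `TRIP_HOURS` or `LINK_BITS_PER_S` shows how fast a network would have to be, end to end, before flying stops winning.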
Network Taxonomy. Production Networks: high-performance networks, 24/7 dependability (e.g. ESnet, Abilene), for everyone. Experimental Networks: high-performance trials of cutting-edge networks, based on advanced application needs. They MUST be robust; support application-dictated software toolkits, middleware, computing and networking; provide delivered services on a persistent basis, yet encourage experimentation with innovative/novel concepts. Research Networks: small-scale prototypes; basic research on components, protocols, architecture. Not persistent, don't support applications. Scientists need/want a new generation of Experimental Networks for e-Science Apps, and Grand Challenge teams to develop them.
Conclusions of NSF Panel. Participants overwhelmingly agreed: networks for e-Science must have known and knowable characteristics. These are not features of today's Production Networks. These are needed in Experimental Networks for next-generation e-Science. High-performance users: require networks that allow access to information about their operational characteristics; expect deterministic and repeatable behavior from networks; demand end-to-end service. …Or else they will never depend on them for persistent e-Science applications.
Current Grid Application Types. Community Driven: serving the needs of distributed communities; video conferencing; virtual collaborative environments; from code sharing to "experiencing each other" at a distance… Data Driven (will grow exponentially in next decade!): remote access of huge data, data mining; weather information systems; particle physics. Process/Simulation Driven: demanding simulations of science and engineering; get less attention in the Grid world, yet drive HPC!
What we can't quite do now. We have the technology, but not the bandwidth (SC90 - SC01). Typical scenario: find remote resource (Where? Portal!), launch job, visualize results, steer job. Metacomputing the Einstein Equations: connecting T3Es in Berlin, Garching, San Diego. Remote viz: streaming HDF5, GridFTP, autodownsample. Any viz client: LCA Vision, OpenDX. John Shalf (LBL) won SC2001 Bandwidth Challenge: ~3.5 Gbit/sec.
Spawning across ARG Testbed. Main BH simulation starts here. All analysis tasks spawned automatically to free resources worldwide. These task-farmed jobs may feed back, steer main job.
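The idea of spawning analysis tasks to whichever resources happen to be free can be sketched as a simple greedy placement. This is a toy illustration only, not the testbed's actual scheduler: the `Site` class, the `farm_tasks` function, the task names, and all CPU counts are hypothetical, and a real grid scheduler would also have to account for queues, data location, and network quality.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    free_cpus: int  # hypothetical snapshot of idle CPUs at this site


def farm_tasks(tasks, sites):
    """Greedily place each (task_name, cpus_needed) on the site that
    currently has the most free CPUs; None means nothing was free."""
    placement = {}
    for task_name, cpus_needed in tasks:
        best = max(sites, key=lambda s: s.free_cpus)
        if best.free_cpus < cpus_needed:
            placement[task_name] = None  # stays with the main job for now
            continue
        best.free_cpus -= cpus_needed  # reserve the CPUs on that site
        placement[task_name] = best.name
    return placement


# Hypothetical free-CPU counts; the numbers are invented.
sites = [Site("NCSA", 64), Site("SDSC", 48), Site("RZG", 16)]
# Hypothetical analysis tasks spawned by the main simulation.
tasks = [
    ("horizon_finder", 32),
    ("wave_extraction", 32),
    ("invariants", 32),
    ("extra_analysis", 32),
]

placement = farm_tasks(tasks, sites)
for task_name, site_name in placement.items():
    print(task_name, "->", site_name or "no free resource")
```

The greedy choice keeps the sketch short; it makes no attempt to weigh how much data each task must pull back over the network, which is exactly the cost the talk argues current networks make unpredictable.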
What we want in 5-10 years. Many disciplines require common infrastructure; common needs driven by the science/engineering. Large number of sensors/instruments: data to community in real time! Daily generation of large data sets: growth in computing power from TB ---> PB machines; experimental data. Data is on multiple length and time scales. Automatic archiving in distributed repositories. Large community of end users. Multi-megapixel and immersive visualization. Collaborative analysis from multiple sites. Complex simulations needed to interpret data. Some will need optical networks. Communications: dedicated lambdas. Data: large peer-to-peer lambda-attached storage. Source: Smarr
NSF's EarthScope--USArray: Explosions of Data! Typical of many projects. Rollout over 14 years, starting with existing broadband stations. 70 km spacing; data can be coupled to simulations. Source: Smarr
Dynamic Grid Computing. [Diagram labels: S, S1, S2, P1, P2.] Physicist has new idea! Brill Wave. Found a black hole, load new component. Look for horizon. Calculate/output grav. waves. Calculate/output invariants. Find best resources. Free CPUs!! NCSA, SDSC, RZG, LRZ. Archive data. SDSC. Add more resources. Clone job with steered parameter. Queue time over, find new machine. Further calculations. AEI. "We see something, but too weak. Please simulate to enhance signal!" Archive to LIGO experiment.
Summary. Researchers need much higher bandwidth networks now, but simply won't use them unless they provide better end-to-end QoS. Data needs will grow exponentially in coming decade: experimental data; simulation data from petascale computing; virtual presence, video, etc. eScience Grand Challenge teams (networking experts, CS and Science) & Experimental Networks to blaze path. Future Grid Apps will be very innovative if networks are there, and middleware frameworks support them: complex live interaction between users, data, simulation. Instantaneous bandwidth on demand: Give me a lambda!