TeraGrid and Web 2.0 Technologies Daniel S. Katz Director of Science, TeraGrid GIG Senior Computational Researcher, Computation Institute,


1 TeraGrid and Web 2.0 Technologies Daniel S. Katz d.katz@ieee.org Director of Science, TeraGrid GIG Senior Computational Researcher, Computation Institute, University of Chicago & Argonne National Laboratory Affiliate Faculty, Center for Computation & Technology, LSU Adjunct Associate Professor, Electrical and Computer Engineering Department, LSU

2 Outline Introduction to TeraGrid Web 2.0 in TeraGrid in general Web 2.0 in Cactus Web 2.0 in Science Gateways

3 What is the TeraGrid World’s largest distributed cyberinfrastructure for open scientific research, supported by US NSF Integrated high performance computers (>2 PF HPC & >27000 HTC CPUs), data resources (>3 PB disk, >60 PB tape, data collections), visualization, experimental facilities (VMs, GPUs, FPGAs), network at 11 Resource Provider sites Allocated to US researchers and their collaborators through national peer-review process DEEP: provide powerful computational resources to enable research that can’t otherwise be accomplished WIDE: grow the community of computational science and make the resources easily accessible OPEN: connect with new resources and institutions Integration: Single: portal, sign-on, help desk, allocations process, advanced user support, EOT, campus champions

4 Who Uses TeraGrid (2008)

5 How TeraGrid Is Used
Use modality and community size (rough est., number of users), 2006 data:
–Batch Computing on Individual Resources: 850
–Exploratory and Application Porting: 650
–Workflow, Ensemble, and Parameter Sweep: 250
–Science Gateway Access: 500
–Remote Interactive Steering and Visualization: 35
–Tightly-Coupled Distributed Computation: 10

6 How One Uses TeraGrid [Diagram labels: Command Line, User Portal, Science Gateways, POPS (for now); TeraGrid Infrastructure (Accounting, Network, Authorization, …); RP 1, RP 2, RP 3; Compute Service, Viz Service, Data Service; Network, Accounting, …] Slide courtesy of Dane Skow and Craig Stewart

7 User Portal: portal.teragrid.org

8 Access to resources Terminal: ssh, gsissh Portal: TeraGrid user portal, Gateways –Once logged in to portal, click on “Login” Also, SSO from command-line

9 Science Gateways A natural extension of Internet & Web 2.0 Idea resonates with scientists –Researchers can imagine scientific capabilities provided through a familiar interface Mostly a web portal, or a web or client-server program Designed by communities; provide interfaces understood by those communities –Also provide access to greater capabilities (back end) –Without the user needing to understand the details of those capabilities –Scientists know they can undertake more complex analyses and that’s all they want to focus on –TeraGrid provides tools to help developers Seamless access doesn’t come for free –Hinges on a very capable developer Slide courtesy of Nancy Wilkins-Diehr

10 Current Science Gateways Biology and Biomedicine Science Gateway Open Life Sciences Gateway The Telescience Project Grid Analysis Environment (GAE) Neutron Science Instrument Gateway TeraGrid Visualization Gateway, ANL BIRN Open Science Grid (OSG) Special PRiority and Urgent Computing Environment (SPRUCE) National Virtual Observatory (NVO) Linked Environments for Atmospheric Discovery (LEAD) Computational Chemistry Grid (GridChem) Computational Science and Engineering Online (CSE-Online) GEON(GEOsciences Network) Network for Earthquake Engineering Simulation (NEES) SCEC Earthworks Project Network for Computational Nanotechnology and nanoHUB GIScience Gateway (GISolve) Gridblast Bioinformatics Gateway Earth Systems Grid Astrophysical Data Repository (Cornell) Slide courtesy of Nancy Wilkins-Diehr

11 TG App: Predicting storms Hurricanes and tornadoes cause massive loss of life and damage to property TeraGrid supported spring 2007 NOAA and University of Oklahoma Hazardous Weather Testbed –Major Goal: assess how well ensemble forecasting predicts thunderstorms, including the supercells that spawn tornadoes –Nightly reservation at PSC, spawning jobs at NCSA as needed for details –Input, output, and intermediate data transfers –Delivers “better than real time” prediction –Used 675,000 CPU hours for the season –Used 312 TB on HPSS storage at PSC Slide courtesy of Dennis Gannon, ex-IU, and LEAD Collaboration

12 TG App: SCEC-PSHA Part of SCEC (Tom Jordan, USC) Using the large-scale simulation data, estimate probabilistic seismic hazard (PSHA) curves for sites in southern California (probability that ground motion will exceed some threshold over a given time period) Used by hospitals, power plants, schools, etc. as part of their risk assessment For each location, need a CyberShake run followed by roughly 840,000 parallel short jobs (420,000 rupture forecasts, 420,000 extractions of peak ground motion) –Parallelize across locations, not individual workflows Completed 40 locations to date, targeting 200 in 2009, and 2000 in 2010 Managing these requires effective grid workflow tools for job submission, data management and error recovery, using Pegasus (ISI) and DAGMan (Wisconsin) Information/image courtesy of Phil Maechling
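The job counts on this slide can be checked with simple arithmetic, and the hazard-curve definition (probability that ground motion exceeds a threshold) can be illustrated with a toy calculation. The sketch below is not CyberShake code: the rupture probabilities and peak ground motions are made-up values, the function names are hypothetical, and treating ruptures as independent is a simplifying assumption made only for illustration.

```python
from math import prod

# From the slide: 420,000 rupture forecasts + 420,000 peak-motion extractions.
JOBS_PER_SITE = 420_000 + 420_000


def total_short_jobs(num_sites: int) -> int:
    """Number of short jobs needed to cover `num_sites` locations."""
    return num_sites * JOBS_PER_SITE


def exceedance_probability(ruptures, threshold):
    """Probability that at least one rupture exceeds `threshold`.

    `ruptures` is a list of (probability, peak_ground_motion) pairs for
    one site and one time period. Ruptures are assumed independent,
    which is an illustrative simplification.
    """
    p_none = prod(1.0 - p for p, motion in ruptures if motion > threshold)
    return 1.0 - p_none


def hazard_curve(ruptures, thresholds):
    """Pair each threshold with its exceedance probability."""
    return [(t, exceedance_probability(ruptures, t)) for t in thresholds]


if __name__ == "__main__":
    for sites in (40, 200, 2000):
        print(sites, "sites ->", total_short_jobs(sites), "short jobs")

    # Made-up (probability, peak ground motion in g) pairs.
    toy_ruptures = [(0.01, 0.2), (0.002, 0.5), (0.0005, 0.9)]
    for t, p in hazard_curve(toy_ruptures, [0.1, 0.4, 0.8]):
        print(f"P(motion > {t} g) = {p:.6f}")
```

At 840,000 short jobs per location, the 2010 target of 2000 locations implies well over a billion short jobs, which is why the slide stresses workflow tools and parallelizing across locations.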

13 App: GridChem Slide courtesy of Joohyun Kim Different licensed applications with different queues Will be scheduled for workflows

14 TG Apps: Genius and Materials HemeLB on LONI: modeling blood flow before (during?) surgery LAMMPS on TeraGrid: fully-atomistic simulations of clay-polymer nanocomposites Why cross-site / distributed runs? 1. Rapid turnaround, conglomeration of idle processors to run a single large job 2. Run big compute & big memory jobs not possible on a single machine Slide courtesy of Steven Manos and Peter Coveney

15 TeraGrid Future Current RP agreements end in March 2011 –Except track 2 centers (current and future) TeraGrid XD (eXtreme Digital) starts in April 2011 –Potential interoperation with OSG and others Current TG GIG continues through July 2011 –Allows four months of overlap in coordination –Probable overlap between GIG and XD members Blue Waters (track 1) production in 2011

16 TeraGrid: Both Operations and Research Operations –Facilities/services on which users rely –Infrastructure on which other providers build AND R&D –Learning how to do distributed, collaborative science on a global, federated infrastructure –Learning how to run multi-institution shared infrastructure

17 Outline Introduction to TeraGrid Web 2.0 in TeraGrid in general Web 2.0 in Cactus Web 2.0 in Science Gateways

18 Web 2.0 in TeraGrid in general EOT/ER –TGCommunity: Use social networking tool(s) to provide a collaboration environment in support of users of on-line training materials –Student Engagement: use of social networks to engage more students in Computational Science Problem of the week; general student engagement –ER communications: Facilitate storage of and access to science images and stories RDAV –Remote Data Analysis and Visualization (RDAV) system to be added to TeraGrid in 2010 –Will include DoE’s Scientific Data Management (SDM) Dashboard, which includes methods for disseminating (sharing) results among defined groups TG staff makes heavy use of wiki –Almost all working group communication/collaboration –Including drafting and assembling quarterly and annual report, project plans, budgets –Most old presentations –Area for campus champions –Working on user wiki, but not there yet

19 Outline Introduction to TeraGrid Web 2.0 in TeraGrid in general Web 2.0 in Cactus Web 2.0 in Science Gateways

20 Web 2.0 in Cactus Adaptive mesh refinement, parallel I/O, interaction, … Flesh: APIs, information, orchestration Domain specific shared infrastructure Individual research groups Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009) Cactus: Community toolkit or framework or environment, http://www.cactuscode.org/

21 Cactus Project Info Historically 1995 - Material put in http for Mosaic –Mosaic encouraged content to make WWW useful mid to late 90’s – collaborative cork board (CoCoBoard) –Web-based project pages – could attach images (1-D result plots) Up to present – wiki –Project-based private wiki –Cons: network needed to access/edit wiki, editing slow Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009)

22 Cactus Simulation Info Historically 1999 – httpd thorn (in main dist. in 2000) –First collaborative tool integrated into Cactus –Published simulation status, variables, timing, viewport, output files, etc. to web page –Allowed parameter steering through web page –Issues: Authorization to web pages (username/password in parameter file) is insecure and awkward (newer version uses https and can also use X.509) Browsers can display images only in certain formats; a Visualization thorn uses gnuplot to include e.g. performance with time, physical parameters Problem deploying on compute nodes where web server cannot be directly accessed (port forwarding, firewalls) How to find and track the simulations, publicize existence to a collaboration? Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009)
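The idea of a simulation publishing its status and accepting parameter steering over HTTP can be sketched with Python's standard library. This is not the Cactus httpd thorn: the `Simulation` class, the `output_every` parameter, and the `/steer` path are all hypothetical, and the sketch deliberately omits authentication, which the slide lists as one of the hard parts.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


class Simulation:
    """Hypothetical stand-in for a running simulation."""

    def __init__(self):
        self.iteration = 0
        self.params = {"output_every": 10}
        self.steerable = {"output_every"}  # names that may change at run time

    def status(self) -> dict:
        """Snapshot of the state that would be published on the web page."""
        return {"iteration": self.iteration, "params": dict(self.params)}

    def steer(self, name: str, value: str) -> bool:
        """Change a steerable integer parameter; return False if rejected."""
        if name not in self.steerable:
            return False
        try:
            self.params[name] = int(value)
        except ValueError:
            return False
        return True


def make_handler(sim: Simulation):
    """Build a request handler class bound to one simulation object."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            url = urlparse(self.path)
            code = 200
            if url.path == "/steer":
                # e.g. /steer?output_every=5
                query = parse_qs(url.query)
                if not all(sim.steer(k, v[0]) for k, v in query.items()):
                    code = 403
            body = json.dumps(sim.status()).encode()
            self.send_response(code)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)

    return Handler


def serve(sim: Simulation, port: int = 8080) -> None:
    """Serve status and steering requests until interrupted."""
    HTTPServer(("localhost", port), make_handler(sim)).serve_forever()
```

Calling `serve(Simulation())` and opening `http://localhost:8080/` would return the status as JSON, and `/steer?output_every=5` would change the parameter. On a compute node behind a firewall the port would not be reachable from outside, which is the deployment problem the slide describes.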

23 Cactus Simulation Info Historically 2001 – prototype of readable report automatically generated for each simulation (computation and physics) How to collect reports in one place? Mail Thorn (sendmail) –Email reliable and fault tolerant (spool) –Supercomputers do not allow mail to be sent from compute nodes Notification was also done by SMS All had to be custom written for Cactus, then maintained and ported to various machines Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009)

24 New Web 2.0 technologies in Cactus Twitter thorn –Uses libcurl –Includes parameters for twitter name/passwd –Uses twitter API for status/updates Flickr thorn –Uses flickcurl, libcurl, libxml2, openssl –Authentication more complex (API key, shared secret) –Sends images from running simulation, generated by other Cactus thorns –Each simulation gets its own Flickr set of images Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009)
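A status-posting thorn of this kind amounts to formatting a short message and sending it as an authenticated HTTP POST. The sketch below shows that shape in Python rather than C with libcurl. The endpoint URL, the `status` form field, and the function names are hypothetical; it does not reproduce the Twitter API, and it only builds the request without sending it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

MAX_STATUS_LEN = 140  # Twitter's message length limit at the time of the talk


def format_status(sim_name: str, iteration: int, note: str = "") -> str:
    """Build a short status line, truncated to the length limit."""
    text = f"{sim_name}: iteration {iteration}"
    if note:
        text += f" - {note}"
    return text[:MAX_STATUS_LEN]


def build_update_request(url: str, user: str, password: str, status: str) -> Request:
    """Build (but do not send) an HTTP POST carrying the status text.

    Uses name/password basic authentication, mirroring the thorn's
    name/passwd parameters; the form field name is an assumption.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        url,
        data=urlencode({"status": status}).encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_update_request(
        "https://status.example.org/update",  # hypothetical endpoint
        "sim_user",
        "secret",
        format_status("example_run", 1200, "checkpoint written"),
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode())
```

Keeping the credentials in a parameter file has the same weakness the earlier httpd slide notes for username/password authorization, which is part of why the Flickr thorn's API key and shared secret are described as more complex.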

25 Future Web 2.0 technologies in Cactus Video thorn –Sends animations of simulations to Flickr, YouTube, Vimeo Common authentication mechanism for multiple services? Social networking model – making it easier for groups to form and collaborate Other possibilities - Dropbox to publish files across a collaboration, SlideShare for presentations, WordPress for simulation reports/blogs, Facebook to replace grid portals and aggregate services, Cloud computing APIs for “grid” scenarios, … Credit: Gabrielle Allen (Integrating Web 2.0 Technologies..., IEEE Cluster Comp. 2009)

26 Outline Introduction to TeraGrid Web 2.0 in TeraGrid in general Web 2.0 in Cactus Web 2.0 in Science Gateways

27 Science Portals 2.0 Workspace customized by user for specific projects Pluggable into iGoogle or other compliant (and open source) containers Integrates with user workspace to provide a complete and dynamic view of the user’s science, alongside other aspects of their lives (e.g. weather, news) Integrates with social networks (e.g. FriendConnect, MySpace) to support collaborative science TG User Portal provides this same information, but this view is more dynamic and more reusable, and can be more flexibly integrated into the user’s workspace Gadgets suitable for use on mobile devices Technology Detail: Gadgets are HTML/JavaScript embedded in XML Gadgets conform to specs that are supported by many containers (iGoogle, Shindig, Orkut, MySpace) Resource Load gadget shows current view of load on available resources Job Status gadget shows current view of job queues by site, user, status File Transfer gadget for lightweight access to data stores, simple global file searching, and reliable transfers Domain science gadgets complement general purpose gadgets to encompass the full range of scientists’ interests Slide courtesy of Wenjun Wu, Thomas Uram, Michael Papka
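"HTML/JavaScript embedded in XML" means a gadget is a small XML module whose content section carries an HTML fragment. The sketch below generates such a module in Python and parses it back to confirm it is well-formed. The gadget title and HTML body are made-up placeholders, the function name is hypothetical, and this is not one of the TeraGrid gadgets.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def make_gadget(title: str, html: str) -> str:
    """Wrap an HTML/JavaScript fragment in a gadget XML module."""
    if "]]>" in html:
        # The fragment sits inside a CDATA section, which this would end early.
        raise ValueError("HTML body must not contain ']]>'")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<Module>\n"
        f"  <ModulePrefs title={quoteattr(title)} />\n"
        '  <Content type="html"><![CDATA[\n'
        f"{html}\n"
        "  ]]></Content>\n"
        "</Module>\n"
    )


if __name__ == "__main__":
    # Placeholder body: a real gadget would fetch live data with JavaScript.
    body = (
        '<div id="load">loading...</div>\n'
        "<script>\n"
        '  document.getElementById("load").textContent = "placeholder load value";\n'
        "</script>"
    )
    spec = make_gadget("Resource Load (example)", body)
    print(spec)

    # Parse the result back to check that the module is well-formed XML.
    root = ET.fromstring(spec.encode("utf-8"))
    print(root.tag, root.find("ModulePrefs").get("title"))
```

Because the module is a static XML file interpreted in the browser, the same file can be hosted once and loaded by any compliant container, which is the reuse argument the slide makes.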

28 OLSGW Gadgets OLSGW integrates bio-informatics applications: BLAST, InterProScan, CLUSTALW, MUSCLE, PSIPRED, ACCPRO, VSL2. 454 Pyrosequencing service under development. Four OLSGW gadgets have been published in the iGoogle gadget directory. Search for “TeraGrid Life Science”. Slide courtesy of Wenjun Wu, Thomas Uram, Michael Papka

29 Run Social and Behavior Science Tools as SIDGrid Gadgets SIDGrid Experiment browsing page: listing project files and available analysis tools; providing browser-side gadget execution environment Three steps to launch SIDGrid application gadgets: 1. Select data files to analyze 2. Select an analysis application 3. Launch SIDGrid gadgets (Praat and workflow history gadget) to run analysis and monitor the progress Slide courtesy of Wenjun Wu, Thomas Uram, Michael Papka

30 PolarGrid Goal: Work with Center for Remote Sensing of Ice Sheets (CReSIS) Requirements: –View CReSIS data sets, run filters, and view results through Web map interfaces –See/share user’s events in a calendar –Update results to a common repository with appropriate access controls –Post the status of computational experiments –Support collaboration and information exchange by interfacing to blogs and discussion areas Solution: Web 2.0-enabled PolarGrid Portal Login screen: interface to create new users and log in using existing accounts; integrated with OpenID API for authentication Slide courtesy of Raminder Singh, Gerald Guo, Marlon Pierce

31 PolarGrid Home Page with a set of gadgets like Google Calendar, Picasa, Facebook, Blog, Twitter Slide courtesy of Raminder Singh, Gerald Guo, Marlon Pierce

32 Google Gadgets & Satellite Data Purdue: disseminating remote sensing products –Flash- and HTML-based gadgets, programmed by undergraduate students Bring traffic/usage to a broad set of satellite products at Purdue Terrestrial Observatory Backend system including commercial satellite data receivers, a smaller cluster, TG data collection, etc. User control includes zoom, pause/resume, frame step through, etc. MODIS satellite viewer GOES-12 satellite viewer Slide courtesy of Carol Song

33 Future App: Real-time High Resolution Radar Data Delivering 3D visualization of radar data via a Google gadget LiveRadar3D –Super high res, real-time NEXRAD data –Continuously updated as new data comes in –3D rendering that includes multiple stations in the US –Significant processing (high throughput) and rendering supported by TG systems –To be released next spring Slide courtesy of Carol Song

34 GISolve 2.0 TG GIScience Gateway (for high performance, distributed, collaborative GIS) uses the GISolve Toolkit middleware to synthesize cyberinfrastructure, GIS, and spatial analysis and modeling capabilities, including Web 2.0: –AJAX for highly-interactive user interface –GeoServer & OpenLayers & Google Maps for online mapping & visualization, interactive data browsing/selection/editing –Twitter for status updates of TG analysis jobs for online collaboration –REST for TeraGrid/application/gateway integration Slide courtesy of Shaowen Wang and Yan Liu

35 Comparison of Gadgets and Portlets
Reusability
–OpenSocial Gadget (Web 2.0): Reusable browser-side web module (XML, HTML, CSS, JavaScript). Advantage: wider applicability
–Java Portlet (Web 1.0): Reusable server-side portal module (Web Form, Portlet/JSP markup, Portlet code)
Application Logic
–OpenSocial Gadget (Web 2.0): Defined in the JavaScript code of the gadget. Advantage: greater interactivity and scalability
–Java Portlet (Web 1.0): Defined in the server-side portlet
Communication with Server
–OpenSocial Gadget (Web 2.0): AJAX. Advantage: greater interactivity
–Java Portlet (Web 1.0): Web Form, Portlet/Servlet
Container/Language Dependence
–OpenSocial Gadget (Web 2.0): OpenSocial-compliant container, language independent (PHP, Java, …). Advantage: larger community of users and developers == faster advancement
–Java Portlet (Web 1.0): JSR 168 Portal; Java
Deployment
–OpenSocial Gadget (Web 2.0): OpenSocial container: iGoogle, MySpace, Orkut, … Advantage: larger community, more reuse between sites
–Java Portlet (Web 1.0): Portlet container: Gridsphere, Websphere
Slide courtesy of Wenjun Wu, Thomas Uram, Michael Papka

36 Thanks & an advert GCE2009 Workshop: Fifth workshop on Grid Computing Environments –Supercomputing 2009 Workshop, November 20th Topics: Science Gateways, Social Networking, Web Security, Gateway Toolkits, Mobile Applications, and Information Services –See http://www.collab-ogce.org/gce09/index.php/Program –Proceedings to be published by ACM Digital Library GCE08 Proceedings: –http://ieeexplore.ieee.org/xpl/tocresult.jsp?isnumber=4738437&isYear=2008

