1 Grid computing using OSG — Alina Bejan, University of Chicago

2 Open Science Grid (OSG) takes High Throughput Computing to the next level, aiming to transform data-intensive science through a cross-domain, self-managed, nationally distributed cyberinfrastructure. It brings together campuses and communities and serves the needs of Virtual Organizations at all scales. The OSG Consortium includes universities, national laboratories, scientific collaborations, and software developers working together to meet these goals.

3 What is a grid? A grid is a system that: –coordinates resources that are not subject to centralized control, –using standard, open, general-purpose protocols and interfaces, –to deliver nontrivial qualities of service. (Based on Ian Foster's definition in http://www.gridtoday.com/02/0722/100136.html)

4 Grids consist of distributed clusters (Diagram: a grid client — application & user interface, grid client middleware, and resource/workflow/data catalogs — talks over grid protocols to grid sites such as Fermilab, São Paulo, and UWisconsin, each running grid service middleware in front of a compute cluster and grid storage.)

5 Do you have a project that takes too long when running on a single processor? Do you deal with large amounts of data from simulations or experiments?

6 Scaling up Science: Citation Network Analysis in Sociology Work of James Evans, University of Chicago, Department of Sociology

7 Scaling up the analysis Query and analysis of 25+ million citations. Work started on desktop workstations; queries grew to month-long duration. With data distributed across the U of Chicago TeraPort cluster: –50 (faster) CPUs gave a 100X speedup –Many more methods and hypotheses can be tested! Higher throughput and capacity enable deeper analysis and broader community access.

8 Mining seismic data for hazard analysis (Southern California Earthquake Center). (Diagram: a Seismic Hazard Model draws on seismicity, paleoseismology, local site effects, geologic structure, faults, stress transfer, crustal motion, crustal deformation, seismic velocity structure, and rupture dynamics.)

9 Grids work like a CHARMM for molecular dynamics Understanding the mathematics of molecular movement helps researchers simulate slices of the atomic world. But when accurate nanosecond simulations already pose a serious challenge, how can you simulate full microseconds of complex molecular dynamics?

10 Designing Proteins from Scratch Scientists use OSG to design proteins that adopt specific 3D structures and, more ambitiously, bind and regulate target proteins important in cell biology and pathogenesis.

11 Genetics Grid computing is helping microbiologists solve the mysteries of mapping new genomes using GADU (Genome Analysis and Database Update)

12 Genome Analysis and Database Update (GADU) Runs across OSG and TeraGrid. Uses the Virtual Data System (VDS) for workflow & provenance. Scans public DNA and protein databases for new and newly updated genomes of different organisms and runs BLAST, Blocks, and Chisel on them. 1200 users of the resulting DB. Request: 1000 CPUs for 1-2 weeks, once a month, every month. On OSG at the moment: >600 CPUs and 17,000 jobs a week.

13 Stormy weather: grid computing powers fine-scale climate modeling Why run individual models when you can run models in combination? When it comes to climate modeling, meteorologists are showing that 16 forecasts are better than one.

14 Which sciences can benefit? Particle and nuclear physics, astrophysics, bioinformatics, gravitational-wave science, computer science, mathematics, medical imaging, nanotechnology, and potentially any other science…

15 Research Participation Majority from physics: Tevatron, LHC, STAR, LIGO. Used by 10 other (small) research groups. 90 members, 30 VOs. Contributors: 80 sites / 50 organizations; 5 DOE Labs (BNL, Fermilab, NERSC, ORNL, SLAC); 65 universities; 5 partner campus/regional grids. Accessible resources: 43,000+ cores, 6 Petabytes disk cache, 10 Petabytes tape store, 14 internetwork partnerships. Usage: 15,000 CPU wall-clock days/day, 1 Petabyte of data distributed/month, 100,000 application jobs/day, 20% of cycles through resource sharing and opportunistic usage.

16 OSG We are successfully operating a large, shared, highly distributed system for many users. Its size & capabilities continue to increase: –software functionality –users –sites –partners –cycles used –data storage capabilities

17 The OSG project Has two components: –To enable scientific discovery by providing a state-of-the-art production distributed infrastructure for science –To advance the state of the art in distributed computing through experimental computer science on a large-scale, production-quality distributed system

18 OSG's grids Grids can be campus, community, regional, national, or international. OSG's scope includes bridging and interfacing between them.

19 Grid Resources in the US
OSG: Research participation: majority from physics (Tevatron, LHC, STAR, LIGO); used by 10 other (small) research groups; 90 members, 30 VOs. Contributors: 5 DOE Labs (BNL, Fermilab, NERSC, ORNL, SLAC), 65 universities, 5 partner campus/regional grids. Accessible resources: 43,000+ cores, 6 Petabytes disk cache, 10 Petabytes tape store, 14 internetwork partnerships. Usage: 15,000 CPU wall-clock days/day, 1 Petabyte of data distributed/month, 100,000 application jobs/day, 20% of cycles through resource sharing and opportunistic use.
TeraGrid: Research participation: support for Science Gateways; over 100 scientific data collections (discipline-specific databases). Contributors: 11 supercomputing centers (Indiana, LONI, NCAR, NCSA, NICS, ORNL, PSC, Purdue, SDSC, TACC and UC/ANL). Computational resources: >1 Petaflop computing capability, 30 Petabytes of storage (disk and tape), dedicated high-performance internet connections (10G), 750 TFLOPS (161K cores) in parallel computing systems and growing.

20 OSG vs TG
Computational resources: OSG: 43K cores across 80 institutions. TG: 161K cores across 11 institutions and 22 systems.
Storage: OSG: a shared file system is not mandatory, so applications need to be aware of this. TG: a shared file system (NFS, PVFS, GPFS, Lustre) on each system, and even a WAN GPFS mounted across most systems.
Accessibility: OSG: private IP space for compute nodes, no interactive sessions, supports Condor throughout, supports GT2 (and a few GT4), the firewall is locked down. TG: more compute nodes with public IP space, support for interactive sessions on login and compute nodes, supports GT2 and GT4 for remote access and mostly PBS/SGE plus some Condor for local access, 10K ports open in the TG firewall on login and compute nodes.

21 Before FermiGrid (e.g. Fermilab) (Diagram: separate resources for Astrophysics, Common, Particle Physics, and Theory, each with its own users, head node and workers; a Common Gateway & Central Services layer with guest users, forming a local grid with an adaptor to the national grid.) Central campus-wide grid services enable efficiencies and sharing across internal farms and storage while maintaining the autonomy of individual resources. Next step: Campus Infrastructure Days, a new activity of OSG, Internet2 and TeraGrid.

22 The Open Science Grid Consortium brings grid service providers (middleware developers; cluster, network and storage administrators; local-grid communities) and grid consumers (global collaborations, single researchers, campus communities, under-served science domains) into a cooperative infrastructure to share and sustain a common heterogeneous distributed facility in the US and beyond.

23 OSG sites

24 OSG Snapshot 96 resources across production & integration infrastructures. 30 Virtual Organizations + 6 operations VOs; includes 25% non-physics. ~30,000 CPUs (sites ranging from 30 to 4000), ~6 PB tape, ~4 PB shared disk. Sustaining 3,000-4,000 simultaneous jobs through OSG submissions: ~100K jobs/day, ~50K CPU-hours/day, peak test loads of 15K jobs a day. Using production & research networks.

25 The Grid Middleware Stack (and course modules) (Layered, top to bottom:) Grid Application (M2), often including a Portal; Workflow system, explicit or ad-hoc (M2); Job Management (M4), Data Management (M5), Grid Information Services (M4); Grid Security Infrastructure (M4); Core Globus Services (M4); Standard Network Protocols and Web Services.

26 Globus and Condor play key roles Globus Toolkit provides the base middleware –Client tools which you can use from a command line –APIs (scripting languages, C, C++, Java, …) to build your own tools, or use directly from applications –Web service interfaces –Higher-level tools built from these basic components, e.g. Reliable File Transfer (RFT) Condor provides both client & server scheduling –In grids, Condor provides an agent to queue, schedule and manage work submission
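
As an illustration of the Condor side, here is a minimal sketch (Python) that writes a vanilla-universe submit description and hands it to condor_submit. The executable and file names are placeholders, not part of the original slides; on OSG you would typically submit through the grid interface rather than straight to a local pool.

```python
# Minimal sketch: build a Condor submit description and submit it.
# Assumes condor_submit is on PATH and "analyze.sh" exists; both are
# placeholders for illustration only.
import subprocess
import textwrap

submit_text = textwrap.dedent("""\
    universe   = vanilla
    executable = analyze.sh
    arguments  = input_000.dat
    output     = job.out
    error      = job.err
    log        = job.log
    queue
""")

with open("job.submit", "w") as f:
    f.write(submit_text)

# Hand the description to the local Condor scheduler.
subprocess.run(["condor_submit", "job.submit"], check=True)
```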

27 Virtual Organization (VO) Concept A VO is formed for each application or workload; resources are carved out and configured for a particular use and set of users.

28 OSG - a Community Consortium DOE laboratories and DOE, NSF, and other university facilities contributing computing farms and storage resources, infrastructure and user services, and user and research communities. Grid technology groups: Condor, Globus, Storage Resource Management, NSF Middleware Initiative. Global research collaborations: High Energy Physics (including the Large Hadron Collider), Gravitational Wave Physics (LIGO), Nuclear and Astro Physics, Bioinformatics, Nanotechnology, CS research…. Partnerships with peers, development and research groups: Enabling Grids for E-sciencE (EGEE), TeraGrid, regional & campus grids (NYSGrid, NWICG, TIGRE, GLOW…). Education: I2U2/QuarkNet sharing cosmic ray data, grid schools… (Timeline, 1999-2009: PPDG, GriPhyN, iVDGL, Trillium/Grid3, OSG; funded variously by DOE, NSF, and DOE+NSF.)

29 To use a Grid efficiently, you must locate and monitor its resources: check the availability of different grid sites, discover different grid services, check the status of "jobs", and make better scheduling decisions using information maintained on the "health" of sites.

30 Virtual Organization Resource Selector - VORS http://vors.grid.iu.edu/ Custom web interface to a grid scanner that checks services and resources on: –Each Compute Element –Each Storage Element Very handy for checking: –Paths of installed tools on Worker Nodes –Location & amount of disk space for planning a workflow –Troubleshooting when an error occurs

31 Open Science Grid

32 VORS entry for OSG_LIGO_PSU

33 Gratia -- job accounting system http://gratia-osg.fnal.gov:8880/gratia-reporting/

34 Grid School Syllabus Intro to distributed computing and the Grid; Grid security and basic Grid access; Grid resource and job management; Grid data management; Building, monitoring, maintaining & using Grids; Grid applications and frameworks; Workflow and related issues.

35 Conclusion: Why Grids? New approaches to inquiry based on –Deep analysis of huge quantities of data –Interdisciplinary collaboration –Large-scale simulation and analysis –Smart instrumentation –Dynamically assembled resources to tackle a new scale of problem Enabled by access to resources & services without regard for location & other barriers

36 Grids: Because Science needs community … Teams organized around common goals –People, resources, software, data, instruments… With diverse membership & capabilities –Expertise in multiple areas required And geographic and political distribution –No location/organization possesses all required skills and resources Must adapt as a function of the situation –Adjust membership, reallocate responsibilities, renegotiate resources

37 Getting Started with OSG I want to use OSG resources I want to get my application running on OSG I want information about adapting my campus IT facility to form a campus grid I want to federate or partner my grid with OSG I want to make resources available to OSG I want to help build OSG

38 I want to use OSG resources Must join a VO: –Individual/small independent research: join the OSG VO –Join an existing member VO (see the list of current OSG VOs) –Form a new VO for your research community Run VO-specific applications

39 Want to get my app running on OSG Engagement team –Dedicated effort –Genetics, library science, earthquake simulation, video processing, physics. Examples: Production running, using 20,000+ CPU-hours of opportunistically available resources across 10+ OSG sites, of the CHARMM molecular dynamics simulation applied to the problem of water penetration in staphylococcal nuclease (see "Grids work like a CHARMM for molecular dynamics"). Improvement of the performance of the nanoWire application from the nanoHub project on OSG/TeraGrid, such that stable running of batches of 500 jobs across more than 5 sites is routine (see "Keeping up with Moore's Law"). Adaptation and production running, opportunistically using 100,000+ CPU-hours across more than 13 OSG sites, of the Rosetta application from the Kuhlman Laboratory in North Carolina (see "Designing proteins from scratch"). Production runs of the Weather Research and Forecasting (WRF) application using more than 150,000 CPU-hours on the NERSC OSG site at Lawrence Berkeley National Laboratory (LBNL).

40 Want to form a campus grid You have a campus IT facility –Want to make it a campus grid –And federate it with OSG OSG is committed to including US universities in the national cyberinfrastructure. –The OSG middleware and operational framework enable any site to participate as an OSG resource, provided it is a well-maintained resource that users can count on. Technically there are no hurdles to having every US university and college contribute resources to OSG and use OSG resources in return. –See module on Friday Several campuses have done so very well: Purdue University, University of Wisconsin-Madison, and Clemson University. Several other universities participate in OSG through individual research groups.

41 I want to federate or partner my grid with OSG The OSG Consortium envisions a world-wide grid formed of a number of different federations of grids (analogous to the Internet as a network of networks). Federation, therefore, is a natural concept within OSG. We are interested in partnerships with other grids trying to develop richer methods and tools for federation.

42 I want to make resources accessible to OSG It is recommended that you join a VO. Minimal requirements: –sufficient to assure interoperability and stability –a set of "standard" services which define the requirements on interfaces and capabilities Register the resource with the GOC (Grid Operations Center) –See module on Friday

43 OSG - Education, Training and Outreach OpenScienceGrid.org/Education OpenScienceGrid.org/About/Outreach eot@OpenScienceGrid.org

44 OSG EOT Mission Organize and deliver training for OSG –OSG end users –Site administrators –Support new communities / VOs joining OSG Engage young people in (e)Science and CS –Primary focus: graduate students and faculty –Promote and train in interdisciplinary collaboration –Reach high schools through I2U2 (QuarkNet follow-on) Reach out –To under-represented communities: engage and assist minority students and minority-serving institutions by providing resources and opportunities –Internationally: strengthen and assist emerging, underserved regions of strategic importance to form bonds to US science and Grid communities; the outreach focus is on Latin America and Africa, with an OISE focus on engagement in Europe and Asia

45 OSG EOT Program Overview End User Education –In-person workshops –Online training –EOT VO for student engagement, access and support Community Outreach –International student/faculty exchange via OISE –Supporting under-represented and under-resourced communities in US, Latin America and Africa through workshops, technical assistance and grid access –High School Education – I2U2 support - http://ed.fnal.gov/uueo/i2u2.html Site Admin Training –Training grid administrators in setup and support of OSG sites using the OSG/VDT software stack

46 2007-08 Workshop Program www.opensciencegrid.org/workshops Georgetown University Grid School 2008, April 15-17, DC Tuskegee University Grid School 2008, Feb 6-8 - Tuskegee AL Florida International Grid School 2008, Jan 23-25, at Florida International University, Miami, Florida Supercomputing ’07 tutorials, Nov 11 & 13, at Reno, Nevada Great Plains Grid School (GPGS’07), Aug 8-10, at the U. of Nebraska- Lincoln Rio Grande Grid School (RGGS’07), Jun 8-10, at the U. of Texas at Brownsville, coordinated with UT-Pan American TeraGrid Conference tutorials, Jun 4-8, at the U. of Wisconsin-Madison South Africa Workshop, Mar 26-30, at the IFIP School on Software (ISS’07), Gordon's Bay, South Africa Midwest Grid Workshop (MGW’07), Mar 24-25 at the U. of Illinois at Chicago Argentine Grid Workshop, Mar 12-14 at Santa Fe, Argentina

47 Self-paced / online instruction opensciencegrid.org/OnlineGridCourse Flexible roadmaps for navigating the material Lectures and labs Access to online community to provide support Online office hours

48 I2U2 Interactions In Understanding the Universe The Grid for Secondary Science Education This "educational virtual organization" creates an infrastructure to develop –hands-on laboratory course content and –an interactive learning experience that brings tangible aspects of each experiment into a "virtual laboratory." These labs use the Grid for education in the same way that science uses the Grid. www.i2u2.org

49 I2U2 "e-Labs" –delivered as Web-based portals accessible in the classroom and at home –implemented with Web-based media capabilities "i-Labs" –delivered as interactive interfaces typically located within science museums and similar public venues –leverage the latest advances in display technology and human-computer interaction –and bring the experience and appreciation of scientific investigation and inquiry to the wide audience of informal education

50 List of e-Labs –Cosmic Ray e-Lab: High school students investigate data from a cosmic ray detector array (it is not necessary to have a detector to participate). Possible investigations: muon lifetime, diurnal changes in flux, effects of shielding, high-energy showers, altitude effects. –CMS Test Beam e-Lab (beta version): High school students analyze CMS test beam data in an online graphical ROOT environment. Shower depth, lateral shower size, beam purity, detector resolution. –LIGO e-Lab (beta version): High school and middle school students investigate seismic behavior with data from LIGO (Laser Interferometer Gravitational-wave Observatory). Earthquake studies, frequency band studies, microseismic studies, studies of human-induced seismic activity. –ATLAS e-Lab –STAR e-Lab

51 i-Labs To engage the general public in science, we envision using appealing museum exhibits to attract visitors' attention and engage them in a short taste of exploration: they will use virtual data tools and techniques to access, process and publish data, report their results as online posters, have online discussions about their work with peers, and then present posters and meet scientists at museums. Example: –Adler Planetarium is developing a cosmic ray i-Lab with support from QuarkNet and the Compact Muon Solenoid (CMS) experiment, an effort to research an informal-education model.

52 Cooperation with EGEE International Schools on Grid Computing –OSG is a co-organizer of ISSGC'07/08 and ISSGC'09, the International Summer School on Grid Computing (www.issgc.org), and sponsors alumni of US Grid Schools to attend the International Summer School –Joint lectureships and material sharing / development efforts –Content sharing

53 Cooperation with TeraGrid Another major national cyberinfrastructure Use of TG and OSG resources Contribute content Joint training

54 Education VO Interested in getting started with OSG? Join OSGEDU VO –Use OSG resources –Contribute resources Wiki, email lists, follow-up discussions –Support, engagement –Postings of opportunities for students

55 Students 2004-2008 facts: International participation: –Argentina, Brazil, Canada, Colombia, India, Mexico, New Zealand, Russia, South Africa, Uruguay Women –Approx. 15% Minorities –Approx. 15% We are trying to improve these statistics.

56 Participants’ domains Computer Science Image processing Communications Networking Physics Astrophysics High Energy Nuclear Physics Optical Networks Theoretical solid state physics Atomic Physics Computational Physics Chemistry Computational Chemistry Molecular Dynamics & Simulation Applied Mathematics Geosciences Computational Multibody Dynamics for Distributed computing Judicial Administration Engineering Materials Science Quantum theory …and others …

57 How do you join the OSG? A software perspective; see module on Friday.

58 Summary of OSG Provides core services, software and a distributed facility for an increasing set of research communities. Helps VOs access resources on many different infrastructures. Interested in collaborating and contributing its experience and efforts.

59 it's the people…that make the grid a community! http://www.opensciencegrid.org

60 Acknowledgments Various OSG members and contributors (Alain Roy, Mike Wilde, Ruth Pordes, Gabrielle Allen and many others …)

61 Joining OSG Assumption: – You have a campus grid Question: –What changes do you need to make to join OSG?

62 Your Campus Grid Assuming that you have a cluster with a batch system: –Condor –Sun Grid Engine –PBS/Torque –LSF

63 Administrative Work You need a security contact –Who will respond to security concerns You need to register your site You should have a web page about your site –This will be published so people can learn about your site

64 Big Picture Compute Element (CE) –OSG jobs are submitted to the CE, which hands them to the batch system –Also hosts information services and lots of support software Shared file system –OSG requires a couple of directories to be mounted on all worker nodes Storage Element (SE) –How you manage the storage at your site
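
To make the CE's role concrete, here is a minimal sketch (Python) of a remote job arriving at a site the GT2 way: grid-proxy-init creates a proxy credential, then globus-job-run asks the gatekeeper's jobmanager to run a command through the local batch system. The host name and jobmanager label are placeholders, not a real OSG endpoint.

```python
# Minimal sketch of a GT2-style submission to a Compute Element.
# "ce.example.edu" and "jobmanager-condor" are hypothetical; a real site
# publishes its own gatekeeper contact string.
import subprocess

# Create a short-lived proxy certificate from your grid credentials.
subprocess.run(["grid-proxy-init"], check=True)

# Ask the CE's gatekeeper to run a trivial command via its batch system.
contact = "ce.example.edu/jobmanager-condor"
result = subprocess.run(
    ["globus-job-run", contact, "/bin/hostname"],
    capture_output=True, text=True, check=True,
)
print("Worker node that ran the job:", result.stdout.strip())
```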

65 Installing Software The OSG software stack –Based on the VDT, which is the majority of the software you'll install; the VDT itself is grid-independent –OSG software stack = VDT + OSG-specific configuration Installed via Pacman
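
As a rough illustration of a Pacman-driven install, the sketch below (Python) runs the pacman command from a fresh install directory. The install path and the package name "OSG:ce" are assumptions for illustration; the exact cache and package names depend on the OSG release you are installing.

```python
# Rough sketch of installing the OSG CE software with Pacman.
# The install directory and package name are assumptions; consult the
# OSG release documentation for the names actually in use.
import os
import subprocess

install_dir = "/opt/osg-ce"          # hypothetical install location
os.makedirs(install_dir, exist_ok=True)

# Pacman installs into the current working directory.
subprocess.run(
    ["pacman", "-get", "OSG:ce"],    # package name assumed for illustration
    cwd=install_dir,
    check=True,
)
```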

66 What is installed? GRAM: –Allows job submissions GridFTP: –Allows file transfers CEMon/GIP: –Publishes site information Some authorization mechanism –grid-mapfile: a file that lists authorized users, or –GUMS (grid identity mapping service) And a few other things…
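
For the grid-mapfile route, each line pairs a certificate's distinguished name (in quotes) with a local account. The sketch below (Python) is a toy lookup over such a file; the DN and account names are made up, and in practice Globus or GUMS performs this mapping for you.

```python
# Toy lookup in a grid-mapfile: '"<certificate DN>" <local account>' per line.
# The DN below is a fabricated example.
import shlex

def load_grid_mapfile(path):
    """Return a dict mapping certificate DNs to local usernames."""
    mapping = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            # shlex honors the quotes around the DN.
            parts = shlex.split(line)
            if len(parts) >= 2:
                mapping[parts[0]] = parts[1]
    return mapping

if __name__ == "__main__":
    grid_map = load_grid_mapfile("/etc/grid-security/grid-mapfile")
    dn = "/DC=org/DC=example/OU=People/CN=Jane Doe 12345"  # hypothetical DN
    print(grid_map.get(dn, "no local account mapped"))
```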

67 OSG Middleware (Layered view, top to bottom:) Applications: user science codes and interfaces. VO middleware: HEP data and workflow management, Biology portals and databases, Astrophysics data replication, etc. Infrastructure: OSG Release Cache (OSG-specific configurations, utilities, etc.); Virtual Data Toolkit (VDT: core technologies + software needed by stakeholders, many components shared with EGEE); core grid technology distributions (Condor, Globus, MyProxy: shared with TeraGrid and others); existing operating systems, batch systems and utilities.

68 Picture of a basic site

69 Shared file system OSG_APP –For users to store applications OSG_DATA –A place to store data –Highly recommended, not required OSG_GRID –Software needed on worker nodes –Not required –May not exist on non-Linux clusters Home directories for users –Not required, but often very convenient
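
Inside a job, these locations are normally picked up from the environment rather than hard-coded. Below is a minimal sketch (Python) of a worker-node script doing that; the VO, application and data file names are placeholders.

```python
# Minimal sketch of a worker-node job script using the OSG directories.
# "myvo", "myapp" and "reference.db" are placeholder names.
import os
import subprocess

app_dir  = os.environ.get("OSG_APP")   # pre-staged application software
data_dir = os.environ.get("OSG_DATA")  # shared data area (may be absent)

if app_dir is None:
    raise SystemExit("OSG_APP is not set; is this an OSG worker node?")

app = os.path.join(app_dir, "myvo", "myapp", "bin", "myapp")
ref = os.path.join(data_dir, "myvo", "reference.db") if data_dir else "reference.db"

subprocess.run([app, "--reference", ref, "input_000.dat"], check=True)
```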

70 Storage Element Some sites require more sophisticated storage management –How do worker nodes access data? –How do you handle terabytes (petabytes?) of data? Storage Elements are more complicated –More planning needed –Some are complex to install and configure Two OSG-supported SRM implementations: –dCache –BeStMan
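
Clients usually talk to either SRM implementation through the same srm:// interface. The sketch below (Python) wraps an srmcp transfer from a local file to an SE; the endpoint, port and storage path are invented for illustration, and the exact URL form (and whether srmcp or another SRM client is installed) depends on the site.

```python
# Rough sketch of copying a local file to a Storage Element via SRM.
# The SE host, port and path are hypothetical; check your VO's documentation
# for the real endpoint and the SRM client available to you.
import subprocess

local_file = "file:////tmp/results.tar.gz"
remote_url = "srm://se.example.edu:8443/srm/managerv2?SFN=/data/myvo/results.tar.gz"

subprocess.run(["srmcp", local_file, remote_url], check=True)
```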

