
Slide 1: P-GRADE Portal Family for e-Science Communities
Peter Kacsuk, MTA SZTAKI / Univ. of Westminster
www.lpds.sztaki.hu/pgportal | pgportal@lpds.sztaki.hu

Slide 2: The community aspects of e-science
Web 2.0 is about creating and supporting web communities.
Grid is about creating virtual organizations in which e-science communities
– can share resources and
– can collaborate.
A portal should support e-science communities in their collaboration and resource sharing.
Even more, it should provide simultaneous access to any accessible
– resources,
– databases,
– legacy applications,
– workflows, etc.,
no matter which grid they are operated in.

Slide 3: Who are the members of an e-science community?
Grid portal developers
– Develop the portal core services (job submission, etc.)
– Develop higher-level portal services (workflow management, etc.)
– Develop specialized/customized portal services (grid testing, rendering, etc.)
– Write technical, user and installation manuals
End-users (e-scientists)
– Execute the published applications with custom input parameters, creating application instances that use the published applications as templates
Grid application developers
– Develop grid applications with the portal
– Publish the completed applications for end-users

Slide 4: What does an individual e-scientist need?
– Access to a large set of ready-to-run scientific applications (services) in an application repository
– A portal to parameterize and run these applications,
– transparently accessing a large set of IT resources from the e-science infrastructure: clouds, local clusters, supercomputers, and grid systems (desktop grids, DGs, such as BOINC and Condor; cluster-based service grids, SGs, such as EGEE and OSG; supercomputer-based SGs such as DEISA and TeraGrid).

Slide 5: What does an e-science community need?
The same as an individual scientist, but in collaboration with the other members of the community: application developers and e-scientists share the application repository, the portal, and the underlying grid systems (clouds, local clusters, supercomputers, desktop grids, cluster-based and supercomputer-based service grids).

Slide 6: Collaboration between e-scientists and application developers
End-users (e-scientists)
– Specify the problem/application needs
– Execute the published applications via the portal with custom input parameters, creating application instances
Application developers
– Develop e-science applications via the portal, in collaboration with e-scientists
– Publish the completed applications for end-users via an application repository

Slide 7: Collaboration between application developers
– Application developers use the portal to develop complex applications (e.g. parameter sweep workflows) for the e-science infrastructure.
– They publish templates, legacy code applications and half-finished applications in the repository, to be continued by other application developers.

Slide 8: Collaboration between e-scientists
– Sharing parameterized applications via the repository
– Jointly running applications via the portal on the e-science infrastructure
– Jointly observing and controlling application execution via the portal

Slide 9: Requirements for an e-science portal, from the e-scientists' point of view
It should
– support a large number of e-scientists (~100) with good response time;
– enable storing and sharing ready-to-run applications;
– enable parameterizing and running applications;
– enable observing and controlling application execution;
– provide a reliable application execution service even on top of unreliable infrastructures (such as grids);
– provide specific user-community views;
– enable access to the various components of an e-science infrastructure (grids, databases, clouds, local clusters, etc.);
– support user collaboration via sharing of applications (legacy, workflow, etc.) and databases.

Slide 10: Requirements for an e-science portal, from the application developers' point of view
It should
– support a large number of application developers (~100) with good response time;
– enable storing and sharing half-finished applications and application templates;
– provide graphical application development tools (e.g. a workflow editor) for developing new applications;
– enable parameterizing and running applications;
– enable observing and controlling application execution;
– provide methods and an API to customize the portal interface towards specific user-community needs by creating user-specific portlets;
– enable access to the various components of an e-science infrastructure (grids, databases, clouds, local clusters, etc.);
– support application developer collaboration via sharing of applications (legacy, workflow, etc.) and databases;
– enable the integration/calling of other services.

Slide 11: Choice of an e-science portal
The basic question for a community:
– Buy a commercial portal? (Usually expensive.)
– Download an open-source (OSS) portal? (A good choice, but will the OSS project survive in the long term?)
– Develop its own portal? (Takes a long time and can become very costly.)
The best choice: download an OSS portal that has an active development community behind it.

Slide 12: The role of the grid portal developer community
Grid portal developers should
– jointly develop the portal core services (e.g. GridSphere, OGCE, Jetspeed-2, etc.);
– jointly develop higher-level portal services (workflow management, data management, etc.);
– jointly develop specialized/customized portal services (grid testing, rendering, etc.);
– never build a new portal from scratch: use the power of the community to create really good portals.
Unfortunately, we are not quite there:
– Hundreds of e-science portals have been developed.
– Some of them are really good: Genius, Lead, etc.
– However, not many of them are OSS (see the SourceForge list on the next slide).
– Even fewer are actively maintained.
– Even fewer satisfy the generic requirements of a good e-science portal.

Slide 13: Downloadable grid portals on SourceForge

Portal                      Generic?       Since        Downloads   Activity
P-GRADE                     yes            2008-01-04   1468        Active
SDSC GridPort               yes            2003-10-01   1266        Finished 2004-01-15
Lunarc Application Portal   yes            2006-10-05   783         Active
GRID Portal for NorduGrid   yes            2006-07-07   231         Finished 2006-08-09
NCHC                        yes            2007-11-07   161         Active
Telemed                     app-specific   2007-11-15   283         Active

Slide 14: P-GRADE portal family
The goals of the P-GRADE portal family:
– to meet all the requirements of end-users and application developers listed above;
– to provide a generic portal that can be used by a large set of e-science communities;
– to provide a community code base from which the portal developer community can start to develop specialized and customized portals.

Slide 15: P-GRADE portal family (timeline diagram, 2008–2010)
– P-GRADE portal: 2.4 → 2.5 (parameter sweep) → 2.8 (current release) → 2.9 (under development); open source since Jan. 2008.
– NGS P-GRADE portal: branches from P-GRADE 2.4, adding GEMLCA (Grid Legacy Code Architecture).
– WS-PGRADE portal: beta release 3.3 → release 3.4; inherits the basic concept from P-GRADE and the GEMLCA/repository concept from NGS P-GRADE.

Slide 16: P-GRADE Portal in a nutshell
– General-purpose, workflow-oriented grid portal: supports the development and execution of workflow-based grid applications; a tool for grid orchestration.
– Based on GridSphere-2: easy to expand with new portlets (e.g. application-specific portlets) and easy to tailor to end-user needs.
Basic grid services supported by the portal:

Service                        EGEE grids (LCG-2/gLite)           Globus 2 grids
Job submission                 Computing Element                  GRAM
File storage                   Storage Element, LFC               GridFTP server
Certificate management         MyProxy/VOMS (both grid types)
Information system             BDII                               MDS-2, MDS-4
Brokering                      WMS (Workload Management System)   GTbroker
Job monitoring                 Mercury (both grid types)
Workflow & job visualization   PROVE (both grid types)

Slide 17: The typical user scenario, part 1: development phase
(Actors: client, portal server, certificate servers, grid services.)
1. Start the editor.
2. Open and edit, or develop, a workflow.
3. Save the workflow and upload local files to the portal server.

Slide 18: The typical user scenario, part 2: execution phase
1. Download proxy certificates from the certificate servers.
2. Submit the workflow.
3. The portal server transfers files and submits jobs to the grid services.
4. Monitor the jobs; visualize job and workflow progress.
5. Download the (small) results.

Slide 19: P-GRADE Portal architecture
– Client: web browser, Java Web Start workflow editor.
– P-GRADE Portal server, frontend layer: P-GRADE Portal portlets (JSR-168 GridSphere-2 portlets) running in Tomcat.
– P-GRADE Portal server, backend layer: DAGMan workflow manager and grid middleware clients (CoG API & scripts, shell scripts, information system clients).
– Grid: middleware services (gLite WMS, LFC, …; Globus GRAM, …), gLite and Globus information systems, MyProxy server & VOMS.

Slide 20: P-GRADE portal in a nutshell
– Certificate and proxy management
– Grid and grid resource management
– Graphical editor to define workflows and parametric studies
– Access to resources in multiple VOs
– Built-in workflow manager and execution visualization
– GUI customizable to specific applications

Slide 21: What is a P-GRADE Portal workflow?
A directed acyclic graph in which
– nodes represent jobs (batch programs to be executed on a computing element),
– ports represent the input/output files the jobs expect/produce,
– arcs represent file transfer operations and job dependencies.
Semantics of the workflow: a job can be executed as soon as all of its input files are available (see the sketch below).
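These semantics can be stated directly in code. The following is a minimal sketch, assuming a simplified model in which ports are just named files; the class and method names are invented for illustration and this is not the P-GRADE workflow engine itself.

```java
import java.util.*;

// Minimal DAG model: jobs consume and produce named files; an arc is
// implied whenever one job's output file is another job's input file.
class Job {
    final String name;
    final Set<String> inputs;   // files this job expects (input ports)
    final Set<String> outputs;  // files this job produces (output ports)
    Job(String name, Set<String> inputs, Set<String> outputs) {
        this.name = name; this.inputs = inputs; this.outputs = outputs;
    }
}

class Workflow {
    private final List<Job> jobs = new ArrayList<>();
    private final Set<String> availableFiles = new HashSet<>();
    private final Set<Job> finished = new HashSet<>();

    void add(Job j) { jobs.add(j); }
    void provide(String file) { availableFiles.add(file); } // initial inputs

    // The workflow semantics: a job can be executed iff all of its
    // input files are available and it has not run yet.
    List<Job> runnableJobs() {
        List<Job> ready = new ArrayList<>();
        for (Job j : jobs)
            if (!finished.contains(j) && availableFiles.containsAll(j.inputs))
                ready.add(j);
        return ready;
    }

    // Called when a job completes: its output files become available,
    // which may make downstream jobs runnable.
    void markFinished(Job j) {
        finished.add(j);
        availableFiles.addAll(j.outputs);
    }
}
```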

Slide 22: Introducing three levels of parallelism
1. Parallel execution inside a workflow node: each job can be a parallel program.
2. Parallel execution among workflow nodes: multiple jobs run in parallel.
3. Parameter study execution of the workflow: multiple instances of the same workflow run with different data files.

Slide 23: Parameter sweep (PS) workflow execution, based on the black-box concept
– One PS port carries 4 instances of its input file; another carries 3 instances.
– 1 PS workflow execution = 4 x 3 = 12 normal executable workflows (e-workflows).
This provides the third level of parallelism, resulting in a very large demand for grid resources.
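A hedged sketch of the black-box expansion: the portal effectively takes the cross product of the PS ports' file instances and materializes one e-workflow per combination. File names and structure below are invented for illustration.

```java
import java.util.List;

// Sketch: expanding one PS workflow into e-workflows.
// Two PS ports with 4 and 3 input-file instances yield 4 x 3 = 12
// ordinary executable workflows (e-workflows).
public class PsExpansion {
    public static void main(String[] args) {
        List<String> portA = List.of("a0.in", "a1.in", "a2.in", "a3.in"); // 4 instances
        List<String> portB = List.of("b0.in", "b1.in", "b2.in");          // 3 instances
        int count = 0;
        for (String a : portA)
            for (String b : portB)
                System.out.printf("e-workflow %d: inputs (%s, %s)%n", ++count, a, b);
        System.out.println("total = " + count); // prints 12
    }
}
```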

Slide 24: Workflow parameter studies in P-GRADE Portal
– Generator component(s): generate the input files, or cut the initial input data into smaller pieces.
– Core workflow: executed once per input piece, as a set of e-workflows.
– Collector component(s): aggregate the results.
– The input files live in the same LFC catalog (e.g. /grid/gilda/sipos/myinputs); the results are produced in the same catalog.

Slide 25: Generic structure of PS workflows and their execution
Generator jobs generate the set of input files; the core workflow is executed as the PS; collector jobs collect and process the set of output files. Execution proceeds in three phases (a sketch of the two helper roles follows below):
1st phase: execute all generators in parallel.
2nd phase: execute all generated e-workflows in parallel.
3rd phase: execute all collectors in parallel.
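To make the generator and collector roles concrete, here is a minimal sketch assuming the pieces are plain files in one shared directory; a real P-GRADE PS run stores them in an LFC catalog instead, and all names here are invented for illustration.

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch of the two PS helper roles around the core workflow.
public class PsHelpers {
    // Generator: cut the initial input into smaller pieces,
    // one piece (here: one line) per e-workflow.
    static void generate(Path input, Path dir) throws IOException {
        var lines = Files.readAllLines(input);
        for (int i = 0; i < lines.size(); i++)
            Files.writeString(dir.resolve("piece_" + i + ".in"), lines.get(i));
    }

    // Collector: aggregate the outputs of all e-workflows into one result.
    // (Iteration order of the directory stream is unspecified.)
    static void collect(Path dir, Path result) throws IOException {
        try (var outputs = Files.newDirectoryStream(dir, "piece_*.out")) {
            for (Path p : outputs)
                Files.writeString(result, Files.readString(p) + "\n",
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }
}
```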

Slide 26: Integrating the P-GRADE portal with the DSpace repository
Goal: to make workflow applications available to the whole P-GRADE portal user community.
Solution: integrate the P-GRADE portal with a DSpace repository.
Functions:
– Application developers can publish their ready-to-use and half-finished applications in the repository.
– End-users can download, parameterize and execute the applications stored in the repository.
Advantages:
– Application developers can collaborate with end-users.
– Members of a portal user community can share their workflows.
– Different portal user communities can share their workflows.

Slide 27: Integrating the P-GRADE portal with the DSpace repository (screenshots: uploading a workflow to DSpace; downloading a workflow from DSpace)

Slide 28: Creating application-specific portals from the generic P-GRADE portal
Creating an application-specific portal does not mean developing it from scratch: P-GRADE is a generic portal that can be quickly and easily customized to any application type.
Advantages:
– You do not have to develop the generic parts (workflow editor, workflow manager, job submission, monitoring, etc.).
– You can concentrate on the application-specific part.
– Much shorter development time.

Slide 29: Concept of creating application-specific portals
– Client: web browser (used by the end user).
– P-GRADE Portal server: a custom user interface (written in Java, JSP, JSTL; built by the application developer) sits on the Application Specific Module (ASM), which connects it to the services of P-GRADE Portal (workflow management, parameter study management, fault tolerance, …) maintained by the P-GRADE portal developer.
– Grid: EGEE and Globus grid services (gLite WMS, LFC, …; Globus GRAM, …).

Slide 30: Roles in creating and using customized P-GRADE portals
Grid application developer
– Develops a grid application with P-GRADE Portal.
– Sends the application to the grid portal developer.
Grid portal developer (may be the same group)
– Creates new classes from the ASM for P-GRADE by renaming the classes.
– Develops one or more GridSphere portlets that fit the application's I/O pattern and the end users' needs.
– Connects the GUI to P-GRADE Portal using the programming API of the P-GRADE ASM (a hypothetical sketch of such an integration follows below).
– Using the ASM, publishes the grid application and its GUI for end users.
End user
– Executes the published application with custom input parameters, creating application instances that use the published application as a template.
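Here is that hypothetical sketch of wiring a custom portlet GUI to a published application through an ASM-style API. Every type and method name below is invented for illustration; the real P-GRADE ASM API will differ, so consult its documentation.

```java
// HYPOTHETICAL ASM-style API: all names are invented for illustration.
interface AsmService {
    // Create a private application instance from a published template.
    String importApplication(String user, String publishedAppId);
    void setInputFile(String instanceId, String portName, byte[] content);
    void submit(String instanceId);
    String getStatus(String instanceId);
}

// Logic behind a JSR-168 portlet action: each end-user run becomes a
// fresh instance of the published application (the template).
class RenderingPortletLogic {
    private final AsmService asm;
    RenderingPortletLogic(AsmService asm) { this.asm = asm; }

    String runForUser(String user, byte[] sceneFile) {
        String instance = asm.importApplication(user, "rendering-workflow-v1");
        asm.setInputFile(instance, "scene", sceneFile);
        asm.submit(instance);
        return instance; // the portlet later polls asm.getStatus(instance)
    }
}
```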

Slide 31: Application-specific P-GRADE portals
– Rendering portal (Univ. of Westminster)
– OMNeT++ portal (SZTAKI)
– Traffic simulation portal (Univ. of Westminster)

Slide 32: Grid interoperation by the P-GRADE portal
P-GRADE Portal enables the simultaneous use of several production grids at the workflow level.
Currently connectable grids:
– LCG-2 and gLite: EGEE, SEE-GRID, BalticGrid
– GT-2: UK NGS, US OSG, US TeraGrid
In progress:
– campus grids with PBS or LSF
– BOINC desktop grids
– ARC: NorduGrid
– UNICORE: D-Grid

Slide 33: Simultaneous use of production grids at the workflow level
(Diagram: a user's workflow on the SZTAKI portal server submits jobs directly to the UK NGS (GT2: Manchester, Leeds) and, via the WMS broker, to EGEE-VOCE (gLite: Budapest, Athens, Brno).)
The portal supports both direct and brokered job submission.

Slide 34: P-GRADE Portal references
P-GRADE Portal services:
– SEE-GRID, BalticGrid
– Central European VO of EGEE
– GILDA, the training VO of EGEE
– many national grids (UK, Ireland, Croatia, Turkey, Spain, Belgium, Malaysia, Kazakhstan, Switzerland, Australia, etc.)
– US Open Science Grid, TeraGrid
– Economy-Grid, Swiss BioGrid, the Bio and Biomed EGEE VOs, MathGrid, etc.
Portal services and account requests: portal.p-grade.hu/index.php?m=5&s=0

Slide 35: Community-based business model for the sustainability of the P-GRADE portal
Some of the developments are related to EU projects, for example:
– PS feature: SEE-GRID-2
– integration with DSpace: SEE-GRID-SCI
– integration with BOINC: EDGeS, CancerGrid
There is an open Portal Developer Alliance; its current active members:
– Middle East Technical Univ. (Ankara, Turkey): gLite file catalog management portlet
– Univ. of Westminster (London, UK): GEMLCA legacy code service extension; SRB integration (workflow and portlet); OGSA-DAI integration (workflow and portlet); embedding Taverna, Kepler and Triana workflows into the P-GRADE workflow
All these features are available in the UK NGS P-GRADE portal.

Slide 36: Business model for the sustainability of the P-GRADE portal
Some of the developments are commissioned by customer academic institutes:
– collaborative workflow editor: Univ. of Reading (UK)
– accounting portlet: MIMOS (Malaysia)
– separation of front-end and back-end: MIMOS
– Shibboleth integration: ETH Zurich
– ARC integration: ETH Zurich
Benefits for the customer academic institutes:
– They basically like the portal but have special needs that require extra development.
– Instead of developing a new portal from scratch (many person-months), they pay only for the small extension/modification of the portal they require.
– Solving their problem gets priority.
– They become experts in the internal structure of the portal and will be able to develop it further according to their needs.
– Joint publications.

Slide 37: Main features of the NGS P-GRADE portal
Extends the P-GRADE portal with
– the GEMLCA legacy code architecture and repository,
– SRB file management,
– OGSA-DAI database access,
– workflow-level interoperation of grid data resources,
– workflow interoperability support.
All these features are provided as a production service for the UK NGS.

Slide 38: Interoperation of grid data resources
(Diagram: jobs J1–J5, orchestrated by the workflow engine, span Grid 1 and Grid 2, each grid offering a file storage system (FS1, FS2) and a database (DB1, DB2).)
J: job; FS: file storage system, e.g. SRB or SRM; DB: database management system (based on OGSA-DAI).

Slide 39: Workflow-level interoperation of local, SRB, SRM and GridFTP file systems
(Diagram: jobs running at OSG, EGEE and NGS read inputs from NGS SRB, NGS GridFTP and local files, and write outputs to NGS SRB and EGEE SRM.)
Jobs can run in various grids and can read and write files stored in different grid systems under different file management systems, as the sketch below illustrates.
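One way to picture the mechanism: a workflow-level file reference names the managing file system alongside the grid and path, so the engine can stage data between systems before and after each job. A minimal sketch with invented types and paths, mirroring one job from the diagram:

```java
import java.util.List;

// Sketch: file references that carry their management system, so a job
// in one grid can read SRB-managed data and write SRM-managed data.
enum FileSystem { LOCAL, SRB, SRM, GRIDFTP }

record FileRef(FileSystem fs, String grid, String path) {}

record JobSpec(String name, String targetGrid,
               List<FileRef> inputs, List<FileRef> outputs) {}

public class DataInterop {
    public static void main(String[] args) {
        // A job running at EGEE that reads from NGS SRB and writes to EGEE SRM.
        JobSpec job = new JobSpec("J2", "EGEE",
                List.of(new FileRef(FileSystem.SRB, "NGS", "/srb/home/in.dat")),
                List.of(new FileRef(FileSystem.SRM, "EGEE", "/srm/out.dat")));
        System.out.println(job);
    }
}
```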

Slide 40: Workflow interoperability: a P-GRADE workflow embedding Triana, Taverna and Kepler workflows
(Diagram: a P-GRADE workflow hosts a Taverna workflow, a Kepler workflow and a Triana workflow as nodes.)
Available to UK NGS users as a production service.

Slide 41: WS-PGRADE and gUSE
A new product in the P-GRADE portal family:
– WS-PGRADE (Web Services Parallel Grid Runtime and Developer Environment), which uses the high-level services of the gUSE (Grid User Support Environment) architecture.
It integrates and generalizes the features of the P-GRADE portal and the NGS P-GRADE portal:
– advanced data flows (PS features),
– GEMLCA,
– a workflow repository.
gUSE features:
– scalable architecture (can be installed on one or more servers);
– various grid submission services (GT2, GT4, LCG-2, gLite, BOINC, local);
– a built-in inter-grid broker (seamless access to various types of resources).
Comfort features:
– different, separated user views supported by the gUSE application repository.

Slide 42: gUSE: service-oriented architecture
– Graphical user interface: WS-PGRADE (GridSphere portlets).
– Autonomous services (high-level middleware service layer): workflow engine, workflow storage, file storage, application repository, logging, gUSE information system, gUSE meta-broker, submitters.
– Resources (middleware service layer): local resources, service grid resources, desktop grid resources, web services, databases.

Slide 43: Ergonomics
Users can be grid application developers or end-users.
Application developers design sophisticated dataflow graphs:
– embedding to any depth, recursive invocations, conditional structures, generators and collectors at any position;
– publishing applications in the repository at various stages of the work: applications, projects, concrete workflows, templates, graphs.
End-users see the WS-PGRADE portal as a science gateway:
– a list of ready-to-use applications in the gUSE repository;
– importing and executing an application without knowledge of programming, dataflow or grids.

Slide 44: Dataflow programming concept for application developers
– Cross and dot product data pairing (a concept similar to Taverna's): all-to-all vs. one-to-one pairing of data items (see the sketch below).
– Any component can be a generator, PS node or collector; there is no ordering restriction.
– Conditional execution based on equality of data.
– Nesting and recursion.
(Diagram: an example dataflow graph whose node instance counts combine into 7042 tasks.)
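The cross product was sketched at slide 23; for contrast, here is the dot product pairing together with the equality-based condition, again as an illustrative sketch with invented names.

```java
import java.util.ArrayList;
import java.util.List;

public class DotPairing {
    // Dot product: one-to-one pairing of data items -> min(m, n) tasks,
    // in contrast with the all-to-all cross product (m * n tasks).
    static List<String[]> dot(List<String> xs, List<String> ys) {
        List<String[]> tasks = new ArrayList<>();
        int n = Math.min(xs.size(), ys.size());
        for (int i = 0; i < n; i++)
            tasks.add(new String[]{xs.get(i), ys.get(i)});
        return tasks;
    }

    // Conditional execution based on equality of data: the component
    // runs only for pairs whose items are equal.
    static boolean shouldRun(String[] pair) {
        return pair[0].equals(pair[1]);
    }

    public static void main(String[] args) {
        for (String[] pair : dot(List.of("a", "b", "c"), List.of("a", "x")))
            System.out.println(pair[0] + "," + pair[1] + " run=" + shouldRun(pair));
        // prints: a,a run=true   then   b,x run=false
    }
}
```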

Slide 45: Current users of the gUSE beta release
– CancerGrid project: predicting various properties of molecules to find anti-cancer leads; creating a science gateway for chemists.
– EDGeS project (Enabling Desktop Grids for e-Science): integrating EGEE with the BOINC and XtremWeb technologies; user interfaces and tools.
– ProSim project: in silico simulation of intermolecular recognition; JISC ENGAGE program (UK).

Slide 46: The CancerGrid infrastructure
(Diagram: the portal server runs gUSE with portal storage; local jobs (Job 1 … Job N) go to local resources, while DG jobs pass through the 3G Bridge to a BOINC server as work units (WU 1 … WU N); DG clients from all partners run the BOINC client with GenWrapper for batch execution of the legacy application; a molecule database server supports browsing molecules and executing workflows.)

Slide 47: CancerGrid workflow
(Diagram: a generator job fans the work out into N and NxM parallel task instances.)
With N = 30K and M = 100, NxM = 3 million tasks: about 0.5 year of execution time, executed on the local desktop grid.
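The task count follows directly from the fan-out, and the quoted runtime is consistent with a simple throughput model, where t_task is the average task time and c the number of concurrently active desktop-grid clients; the values of t_task and c below are illustrative assumptions, not figures from the project:

```latex
N \cdot M = 30{,}000 \times 100 = 3 \times 10^{6} \ \text{tasks}, \qquad
T_{\mathrm{wall}} \approx \frac{N M \, t_{\mathrm{task}}}{c},
\quad\text{e.g. } t_{\mathrm{task}} = 1.5\,\mathrm{h},\ c = 1000
\;\Rightarrow\;
T_{\mathrm{wall}} \approx \frac{3 \times 10^{6} \cdot 1.5\,\mathrm{h}}{1000}
= 4500\,\mathrm{h} \approx 0.5\,\mathrm{yr}.
```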

Slide 48: Protein molecule simulation on the grid: gUSE in the ProSim project
Grid Computing team, Univ. of Westminster

Slide 49: The user scenario
(Workflow diagram.) Inputs: PDB file 1 (receptor) and PDB file 2 (ligand). Steps: energy minimization (Gromacs), validate (MolProbity), check (MolProbity), perform docking (AutoDock), molecular dynamics (Gromacs).

Slide 50: The workflow in gUSE
– Parameter sweeps in phases 3 and 4.
– Executed on 5 different sites of the UK NGS.

Slide 51: The ProSim visualiser (screenshot)

Slide 52: P-GRADE portal family summary

Feature                     P-GRADE             NGS P-GRADE                  WS-PGRADE
Scalability                 +                   +                            ++++
Repository                  DSpace / WF         Job & legacy code services   WF (own development)
Graphical workflow editor   +                   +                            +
Parameter sweep support     +                   -                            ++
Access to various grids     GT2, LCG-2, gLite   GT2, LCG-2, gLite, GT4       GT2, LCG-2, gLite, GT4, BOINC, campus
Access to clouds            In progress         -
Access to databases         -                   via OGSA-DAI                 SQL
Support for WF interop.     -                   +                            In progress

Slide 53: Further information
– Take a look at www.lpds.sztaki.hu/pgportal (manuals, slide shows, installation procedure, etc.).
– Visit or request a training event (the list of events is on the P-GRADE Portal homepage): lectures, demos, hands-on tutorials, application development support.
– Get an account for the GILDA P-GRADE Portal: www.portal.p-grade.hu/gilda
– Get an account for one of its production installations:
  – the multi-grid portal (SZTAKI) for the VOCE, SEEGRID, HUNGrid, Biomed, Compchem and ASTRO VOs;
  – the NGS P-GRADE portal (Univ. of Westminster) for the UK NGS.
– Install a portal for your community: if you are the administrator of a grid/VO, download the portal from SourceForge (http://sourceforge.net/projects/pgportal/). SZTAKI is pleased to help you install a portal for your community!

Slide 54: Thank you for your attention! Any questions?
www.portal.p-grade.hu | www.wspgrade.hu

