
1 Seed Project Working Group Report
Seed Project Working Group, 07.05.2007
seed-wg@swiss-grid.org
Wibke Sudholt, University of Zurich (lead), wibke@oci.uzh.ch

2 Seed Project Working Group
Nabil Abdennadher, EIG/HES-SO
Peter Engel, UniBE
Derek Feichtinger, PSI
Dean Flanders, FMI
Placi Flury, SWITCH
Pascal Jermini, EPFL
Sergio Maffioletti, CSCS
Cesare Pautasso, IBM
Heinz Stockinger, SIB
Wibke Sudholt, UZH (lead)
Michela Thiemard, EPFL
Nadya Williams, UZH/CSCS
Christoph Witzig, SWITCH

3 Goals
Followup Workshop November 2006
–What is there already? Low hanging fruit.
–Come up with a plan.
–Interoperability among sites.
–Security infrastructure.
Report February 2007
–Identify which resources (people, hardware, middleware, applications, ideas) are readily available and represent strong interest among the current partners of the SGI.
–Based on available resources, propose one or more seed project(s) that will help to initialize, test, and demonstrate the SGI collaboration. The seed project(s) should be realizable in a fast, easy, and inexpensive manner (“low hanging fruit”).
–Provide help with the coordination and the realization of the defined seed project(s).

4 Collaboration and Activities
Founded at the Swiss Grid Initiative Followup Workshop in Bern on November 23, 2006
Wiki page: https://twiki.cscs.ch/bin/view/SwissGridInitiative/SeedProjectWorkingGroup
Mailing list: seed-wg@swiss-grid.org
Subversion source repository: http://svn.cscs.ch/SGI-seed/
In-person meetings: 07.12.2006 in Fribourg (Grid Crunching Day), 07.05.2007 in Bern (Swiss Grid Day)
Phone conferences: 18.01.2007, 01.02.2007, 20.02.2007, 20.03.2007, 04.04.2007, 24.04.2007 (summaries available)
Seed Project Survey
Intermediate report on February 28, 2007
Work on realization of seed project

5 Seed Project Survey
Informal inventory of resources available for seed project
Form requesting information about
–Member groups
–Available personnel
–Computer hardware
–Lower-level grid middleware
–Higher-level grid middleware
–Scientific application software and data
–Seed project ideas
Submitted to Swiss Grid Initiative mailing list on December 12, 2006, deadline for responses on December 22, 2006
Received 12 completed forms and two supportive emails as responses

6 Survey Results
There is high interest in the Swiss Grid Initiative and the seed project.
A lot of grid computing expertise already exists from other grid projects and middleware.
Some groups are willing to invest more than spare time, but manpower is still a scarce resource.
No shortage of computer hardware, mostly Linux clusters or desktop PCs. Actual availability for the seed project remains to be seen.
Different lower-level grid middleware is employed or developed at member sites, but there is enough common ground.
Higher-level grid middleware and applications are diverse and often coupled.
Typical high-performance computing domain areas (biology, chemistry, physics).
Two main themes in the project ideas
–Specific middleware or application projects
–Work towards grid interoperability
These themes relate to the opinion on whether there should be one or several seed projects.

7 Seed Project Proposal
Cover the two different aspects from the survey
–Involve as many existing infrastructure sites and suggested scientific applications as possible
–Serve as testbed for grid interoperability
Tackle the seed project in a two-fold way
–Build a cross-product infrastructure of selected grid middleware and applications by gridifying each application on each middleware pool in a non-intrusive manner
–Record experiences and derive a list of requirements for selection or creation of a meta-middleware
Handling of practical aspects
–Seed Project Working Group manages the seed project in collaboration with the other Swiss Grid Initiative members
–Rely on the help of SWITCH for security aspects (authentication and authorization, virtual organization, grid certificates, etc.)
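The cross-product idea can be sketched as a small status matrix. This is only an illustration, not the working group's tooling: the middleware and application names are the initial-focus selections from the selection slides, the three "done" pairings are the ones this report mentions as deployed, and the function name `gridification_matrix` is hypothetical.

```python
from itertools import product

# Initial-focus selections named in this report.
MIDDLEWARE = ["gLite", "NorduGrid ARC", "XtremWeb-CH", "Condor"]
APPLICATIONS = ["Cones", "GAMESS", "Huygens", "PHYLIP"]


def gridification_matrix(middleware, applications, done=()):
    """Return {(application, middleware): status} for every pairing."""
    done = set(done)
    return {
        (app, mw): "done" if (app, mw) in done else "to do"
        for app, mw in product(applications, middleware)
    }


# Pairings the report lists as deployed (NorduGrid and PHYLIP slides).
reported = [
    ("Cones", "NorduGrid ARC"),
    ("GAMESS", "NorduGrid ARC"),
    ("PHYLIP", "XtremWeb-CH"),
]
matrix = gridification_matrix(MIDDLEWARE, APPLICATIONS, reported)
open_pairs = [pair for pair, status in matrix.items() if status == "to do"]
print(len(matrix), "pairings,", len(open_pairs), "still open")
```

With four middleware pools and four applications the matrix has 16 cells, which makes the effort implied by "each application on each middleware pool" easy to see.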

8 Seed Project Definition

9 Selection of Middleware
Criteria
–Already deployed at partner sites
–Sufficient expertise and manpower
–Representative of existing larger grid efforts
–Not too complex requirements
–Must be diverse and provide sufficient set of capabilities
Initial focus
–EGEE gLite / Globus Toolkit 2 (deployed at CHIPP, CSCS, SWITCH, UniBas, SIB) - responsible: Heinz Stockinger, SIB
–NorduGrid ARC (deployed at CSCS, SIB/Vital-IT, UniBas, UZH) - responsible: Sergio Maffioletti, CSCS
–XtremWeb-CH (developed and deployed at EIG/HES-SO) - responsible: Nabil Abdennadher, EIG/HES-SO
–Condor (deployed at EPFL) - responsible: Pascal Jermini, EPFL
Later focus
–United Devices (deployed at UniBas and others)
–Globus Toolkit 4 / WSRF (pre-WS components deployed at CSCS, UZH)
–UNICORE (deployed at UZH and others)

10 Selection of Applications
Criteria
–Need for application from the Swiss scientific user community
–Sufficient expertise and manpower
–Not too complex requirements
–Gridification on basis of individual executions or embarrassingly parallel parameter scans, without changing the source code if possible
–Should be diverse and cover sufficient set of requirements
–Reuse of existing grid-enabled applications
Initial focus
–Cones (mathematical crystallography, individual code) - responsible: Peter Engel, UniBE
–GAMESS (quantum chemistry, standard free open source code) - responsible: Wibke Sudholt, UZH
–Huygens (remote deconvolution for imaging, standard commercial code) - responsible: Dean Flanders, FMI
–PHYLIP (bioinformatics, standard free open source code) - responsible: Nabil Abdennadher, EIG/HES-SO
Later focus
–Mascot (proteomics analysis, standard commercial code) - responsible: Dean Flanders, FMI
–Monte Carlo simulation (high-energy physics) - responsible: Derek Feichtinger, PSI
–Swiss Bio Grid applications

11 Work towards Realization
Wiki pages and documents
–Middleware and application requirements lists - done
–Middleware and application information - in progress
–Project plan and status - in progress
Building of middleware pools
–Test infrastructure - in progress
–Production infrastructure - to do
Preparation of applications
–Functional and “real-life” test cases - in progress
–Grid-enabling of codes - in progress
–Small program library for input and output processing - in progress
Focus on collaboration, infrastructure and knowledge building, not on achieving scientific results

12 Middleware: EGEE gLite
Responsible person: Heinz Stockinger, SIB
The EGEE middleware provides software tools for secure job submission, data management, etc.
–Deployed in most European countries
–Biggest grid infrastructure world-wide
gLite homepage: http://www.glite.org/
Deployment status in Switzerland
–SWITCH: Resource Broker, VO management services, etc.; locally (behind firewall): Computing Element, Worker Node, gLite clients
–CSCS: Computing Element, Storage Element, gLite client software
–SIB Lausanne: client software (LCG version)

13 gLite (cont.)
A Virtual Organisation called SGA has been created
–Needs to be made available at CSCS and SIB
Currently, the three sites have different versions of middleware due to different activities in EGEE
–Versions will be adapted soon
–SIB is planning to provide an additional gLite client machine
Job submission to gLite is already possible using existing VOs such as CMS, biomed
–Gives access to ~50-100 sites per VO
–User certificate registration with EGEE is required

14 Middleware: NorduGrid ARC
Responsible person: Sergio Maffioletti, CSCS
Grid middleware development and testbed deployment project in the Nordic countries
NorduGrid middleware is ARC (Advanced Resource Connector)
–Enables production-quality computational and data grids
–Open source under GPL license
–Uses replacements and extensions of Globus Toolkit pre-WS services
NorduGrid homepage: http://www.nordugrid.org/

15 NorduGrid (cont.)
NorduGrid ARC middleware deployed in Switzerland as part of the Swiss Bio Grid project
Status of seed project:
–Installed and set up at CSCS and UZH
–Cones and GAMESS applications deployed and tested
To do:
–Deploy and test other applications
–Integrate other NorduGrid sites

16 Middleware: XtremWeb-CH
Responsible person: Nabil Abdennadher, HES-SO
XtremWeb-CH is a desktop grid middleware
–Public (non-dedicated) platform
–Supports communicating “jobs” and direct communications between “providers” (workers)
–Can fix the “granularity” of the application according to the “state” of the platform
XtremWeb-CH homepage: http://www.xtremwebch.net/
XtremWeb-CH Wiki page: http://www.xtremwebch.net/mediawiki/index.php/Main_Page
Deployed applications
–PHYLIP: PHYLogeny Inference Package
XtremWeb-CH today
–~200 workers (mainly Windows platforms)
–2 sites: EIG (Geneva) and HEIG-VD (Yverdon)

17 XtremWeb-CH: Architecture
[Figure: XtremWeb-CH architecture diagram. Labeled components: User, Web Service, XtremWeb-CH Coordinator (Scheduler, Worker’s manager, Task’s manager), Warehouse, Worker, Brokers, XWCH DB. Labeled flows: Work request, Work, Work Result, Work Alive, Data. Labeled inputs: Application binaries per OS, XML file describing the XWCH application structure (modules A, B, C).]

18 Middleware: Condor
Responsible person: Pascal Jermini, EPFL
Condor homepage: http://www.cs.wisc.edu/condor/
Greedy@EPFL: http://greedy.epfl.ch/
Condor provides the infrastructure for desktop grids
–Job queue management
–Resource management
–Data and binaries transfer to the compute nodes
–Promotes fair sharing of computing resources
–Multi-platform (Linux, Windows, Mac OS X, some other UNIX variants)
Can be interfaced with other middleware such as UNICORE or Globus
Middleware still in active development (i.e., project not dead)

19 Condor (cont.)
Deployment status at EPFL
–In production with approximately 200 desktop CPUs available: 60% Windows machines, 40% Linux or OS X machines; computing power available only during the night and on weekends; machine owner has priority over any running job; size of the pool is still growing
–One submit server and one Central Manager (running Linux)
–All nodes and servers behind EPFL firewall
–Condor managers generally have no access to the compute nodes for third-party software installation (Condor is installed by node owners, not by Grid managers!)
–Smaller grid (4 very old nodes) also available, but only for tests; restricted to Grid managers, but with full access to them
Due to the “desktop” nature of the EPFL grid, relatively short jobs are advised (6h max.; not enforced)

20 Application: Cones
Responsible person: Peter Engel, UniBE (with help of Nadya Williams, UZH/CSCS)
Cones is a crystallography program.
For a given representative quadratic form it calculates its subcone of equivalent combinatorial parallelohedra.
For dimension d = 6, the number is expected to be greater than 100’000’000
Cones Wiki page: https://twiki.cscs.ch/bin/view/SwissGridInitiative/CONES

21 Cones (cont.)
C program developed by an individual
Several text input files, one execution command, several text output files
Can be executed in parallel in two ways
–Running of several jobs off the same input file (best suited for cluster infrastructure, less than 24 nodes)
–Cutting of input file into pieces (best suited for grid infrastructure, 500-2000 cones per input)
Status of Cones deployment and testing
–Wiki page - in progress
–Adaptation and generalization of source code - done
–Configuration and makefile creation - done
–Test installation at CSCS and UZH - done
–Test runs and comparison with known input - done
–Remote job submission using NorduGrid testbed at CSCS and UZH - done
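The second parallelization route, cutting the input into pieces, might look roughly like the sketch below. The slides do not describe the Cones input format, so the code assumes the input has already been parsed into a list of independent records; `split_records` and the placeholder record names are hypothetical, and the default piece size of 1000 is simply a value inside the 500-2000 range given above.

```python
def split_records(records, chunk_size=1000):
    """Split a list of input records into consecutive pieces of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


# Hypothetical input: 4500 placeholder records standing in for cones.
records = [f"cone-{n}" for n in range(4500)]
pieces = split_records(records, chunk_size=1000)
print([len(piece) for piece in pieces])  # [1000, 1000, 1000, 1000, 500]
```

Each piece would then be written to its own input file and submitted as an independent grid job.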

22 Application: GAMESS
Responsible person: Wibke Sudholt, UZH (with help of Nadya Williams, UZH/CSCS)
General Atomic and Molecular Electronic Structure System
Program package for ab initio molecular quantum chemistry
Standard free open source code developed and used by many groups
Available for large variety of operating systems and hardware
Mainly Fortran 77 and C code and shell scripts
GAMESS homepage: http://www.msg.ameslab.gov/GAMESS/
GAMESS Wiki page: https://twiki.cscs.ch/bin/view/SwissGridInitiative/GAMESS
Usually one keyword-driven text input file, one execution command, several text output files
Well parallelized by its own implementation, called Distributed Data Interface (DDI)
Comes with lots of functional test cases

23 GAMESS (cont.)
Two possible layers of grid distribution
–External: Embarrassingly parallel parameter scans in input file (already previously implemented with Nimrod and BOINC)
–Internal: Component distribution based on current DDI parallelization implementation (probably needs considerable programming efforts)
Status of seed project
–Wiki page - ongoing
–Configuration and makefile creation - in progress
–Test installation at CSCS and UZH - done
–Test runs and comparison with known input - done
–Remote job submission using NorduGrid testbed at CSCS and UZH - done
–Collection of “real life” scientific test cases - ongoing (available on SVN)
–Development of small Java program and library for creating parameter scan input files and commands - ongoing (available on SVN)
–Integration with XtremWeb-CH - starting
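The external layer, a parameter scan over one input file, can be illustrated with a short sketch. It is written in Python for brevity and is not the working group's Java library; the template text is a placeholder rather than real GAMESS keyword syntax, and the names `scan_inputs`, `bond_length`, and `angle` are hypothetical.

```python
from itertools import product


def scan_inputs(template, parameters):
    """Yield (file name, input text) for every combination of scanned values."""
    names = sorted(parameters)
    combos = product(*(parameters[name] for name in names))
    for index, values in enumerate(combos):
        settings = dict(zip(names, values))
        yield f"scan_{index:03d}.inp", template.format(**settings)


# Placeholder template: NOT real GAMESS input, only a stand-in with two slots.
TEMPLATE = "! placeholder input\nbond_length={bond_length}\nangle={angle}\n"

jobs = list(scan_inputs(TEMPLATE, {"bond_length": [0.9, 1.0, 1.1], "angle": [100, 105]}))
print(len(jobs), jobs[0][0])
```

Every generated input is independent of the others, which is what makes this layer embarrassingly parallel and a natural fit for any of the middleware pools.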

24 Application: Huygens
Responsible: Dean Flanders, FMI
Huygens is an image deconvolution software developed and distributed by Scientific Volume Imaging
It can be used for the restoration, visualization, and analysis of microscopy images
Standard commercial software, but parts available as freeware
Scientific Volume Imaging homepage: http://www.svi.nl/
Five node-locked licenses available at Friedrich Miescher Institute, not always fully used
Status of seed project
–No known progress up to now
To do
–Probably agreement with Scientific Volume Imaging needed for grid use, but company usually very collaborative

25 Application: PHYLIP
Responsible person: Nabil Abdennadher, HES-SO
PHYLogeny Inference Package
Used to generate “life” trees (evolutionary trees)
The most widely-distributed phylogeny package
In distribution since 1980, 15’000 users
PHYLIP homepage: http://evolution.genetics.washington.edu/phylip.html
PHYLIP Wiki page: https://twiki.cscs.ch/bin/view/SwissGridInitiative/PHYLIP
[Figure: “Life” tree]

26 PHYLIP (cont.)
A package of programs (~34)
–Source code (C) and executables (Win, Mac OS, Linux) are available
–Input data are read into the program from a text file
–Output data are written onto text files
Data types
–DNA sequences
–Protein sequences
–Etc.
Methods available
–Parsimony
–Distance matrix
–Likelihood methods
–Bootstrapping and consensus trees
Already deployed on XtremWeb-CH

27 Some Lessons Learned
There is considerable interest and expertise in grid collaboration in Switzerland
Consequences of diverse seed project middleware and application selection
–Setup and testing more complex and heterogeneous
–More knowledge gain, participation, and collaboration of people
–Requirements, procedures, and results can be better abstracted and generalized
Middleware architecture and security differ considerably between computational grid tools (gLite, NorduGrid) and desktop grid tools (XtremWeb-CH, Condor)
Applications can be distributed onto a grid infrastructure at two different levels
–External: Parameter scans, input splitting or other “wrapper” tasks, usually embarrassingly parallel, often corresponding to how users apply a code (focus of seed project)
–Internal: Directed towards tightly-coupled parallel computer systems, requiring implementation on the source code level and balance of parallel tasks, often performing inter-process communication, usually transparent to the application user
Application gridification usually needs direct cooperation between scientific developers and grid experts
Suggestions for grid project management
–Dedicated partners, regular team meetings, reaching of consensus, and conscious project steering are important
–Selecting a responsible person for each middleware and application tool is a good way to bundle knowledge and ease communication
–Considerable investment of people's time, and thus money, is expected to reach production state

28 Further Plans
Finishing of the seed project
–Continue with current approach, potentially after revising middleware and application lists
–Each application should run on each middleware pool
–Each middleware pool should consist of at least two partner sites
–Completion planned by summer 2007
Documentation and communication
–Regular meetings and reports about status and results
–Documentation about middleware and application setup
–Recording of the lessons learned
–Requirements lists and recommendations for meta-middleware and production infrastructure
–Publication at a conference and/or in an article
Continuation of work
–Transfer into sustainable production infrastructure for grid computing in Switzerland
–Extension to further middleware and applications (e.g., UNICORE)
–Inclusion of data grid features
–Selection or development of meta-middleware to integrate different middleware pools (e.g., ISS)
–Collaboration with other national and international grid projects
–Transfer of Seed Project Working Group into other working groups
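The "at least two partner sites" target can be checked mechanically. The sketch below uses the deployment sites named on the middleware slides as a stand-in for pool membership, which is an assumption: being deployed at a site does not necessarily mean the site takes part in the seed project pool. The function name `pools_below_minimum` is hypothetical.

```python
# Deployment sites as named on the middleware slides of this report.
DEPLOYED_AT = {
    "gLite": ["CHIPP", "CSCS", "SWITCH", "UniBas", "SIB"],
    "NorduGrid ARC": ["CSCS", "SIB/Vital-IT", "UniBas", "UZH"],
    "XtremWeb-CH": ["EIG", "HEIG-VD"],
    "Condor": ["EPFL"],
}


def pools_below_minimum(deployments, minimum_sites=2):
    """Return the middleware pools with fewer than minimum_sites distinct sites."""
    return sorted(
        name for name, sites in deployments.items() if len(set(sites)) < minimum_sites
    )


print(pools_below_minimum(DEPLOYED_AT))  # ['Condor']
```

Under this assumption only the Condor pool, deployed at EPFL alone, would still need a second partner site.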

29 Questions to the Audience
Have we taken the right approach?
Do you agree with the seed project scope?
Is there other grid middleware we should consider?
Are there other scientific applications we should consider?
Would anybody else like to participate and contribute?
How should we document and communicate our status and results?
Do you agree with the seed project timeline?
What should happen after the end of the seed project?
How to transfer the seed project into a sustainable production infrastructure?
How to fund these efforts?
What are the requirements for data grid features?
How to select or develop a meta-middleware?
How to transfer the Seed Project Working Group into other working groups?
What should happen with the survey data?
Do you have any other ideas?

30 Thanks
Discussion

31 Tentative Agenda for Working Group Session
15:20-16:00: Individual and informal 5 min presentations about each of the middleware and application tools by the corresponding responsible people. Mainly so that we all better understand each other's tools
–gLite, NorduGrid, XtremWeb-CH, Condor
–Cones, GAMESS, Huygens, PHYLIP
16:00-17:00: Time for informal discussion. Some ideas
–Responses to feedback received from the audience in the early afternoon
–Technical discussions within the middleware pools and with the application drivers about setup and testing plans and problems
–Potential changes/additions/deletions on the middleware and application lists
–Potential setup of an additional UNICORE/ISS-based middleware pool
–Further timeline of the Seed Project
–Documentation and publication of results
–Anything else you would like to discuss in person

