UK e-Science Program  Core Centres 2001 (EPSRC)  Research Council Pilot projects Godiva Ocean grid (NERC) Genie Earth System (NERC) e-Minerals (NERC)


1 UK e-Science Program  Core Centres 2001 (EPSRC)  Research Council Pilot projects Godiva Ocean grid (NERC) Genie Earth System (NERC) e-Minerals (NERC) e-Biodiversity (BBSRC)  Open EPSRC call for new e-Science Centres  Reading e-Science Centre (Nov. 2003)  Resources: Access Grid Node Technical Director: Jon Blower

2 The Reading e-Science Centre (ReSC) Jon Blower Technical Director

3 Aims of the ReSC  Promote e-Science methods in the environmental science community –CGAM, DARC, ESSC, JCMM, NCAS all at Reading  Act as a focus for all e-Science activities in Reading  Provide expertise, help and support for these activities  Reach out into government agencies and industry –esp. Met Office, Environment Agency –British Maritime Technology

4 What is e-Science?  “science increasingly done through distributed global collaborations enabled by the Internet, using very large data collections, terascale computing resources and high performance visualization”

5 What is e-Science? (2)  Easier definition: “Collaborative science using distributed computing”  Who can benefit? –Users of lots of computing power –Users of large datasets –Users of very distributed datasets –scientists who work across geographical and institutional boundaries  Easier to explain with some concrete examples

6 Case Studies

7 Case 1: Ensemble modelling  The Problem: –Climate is sensitive to very many factors. How do we work out which factors are most important in determining our future climate?  The Solution: –Run (fairly simple) simulations many, many times over with different parameters (an ensemble run) – participants all over the world run the model on their home PCs
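
A minimal sketch of the ensemble idea on this slide, assuming a toy stand-in for the climate model. The function `toy_model`, its parameters `feedback` and `forcing`, and the values swept are all hypothetical and chosen only for illustration; they are not the model or the parameters used in the project.

```python
import itertools


def toy_model(feedback, forcing):
    """Hypothetical stand-in for a climate model: equilibrium response
    of a linear system, i.e. forcing divided by feedback strength."""
    return forcing / feedback


def run_ensemble(feedbacks, forcings):
    """Run the toy model once for every combination of parameter values."""
    results = {}
    for feedback, forcing in itertools.product(feedbacks, forcings):
        results[(feedback, forcing)] = toy_model(feedback, forcing)
    return results


# Illustrative parameter values only (assumptions, not project settings).
ensemble = run_ensemble(feedbacks=[0.5, 1.0, 2.0], forcings=[2.0, 4.0])

# The spread across members shows how sensitive the result is to the parameters.
spread = max(ensemble.values()) - min(ensemble.values())
```

Each ensemble member is independent of the others, which is why this kind of run can be farmed out to many separate machines.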

8 results  Already largest climate model ensemble ever (by factor of >200)  >45,000 users, >15,000 complete model runs, >1,000,000 model years in ~3 months (this is equivalent to 1.5 Earth Simulators)  Large range of sensitivities found [figure; surviving labels: 10K, 2K]  Global outreach (participants in all 7 continents, inc. Antarctica!)  Generated much interest in schools

9 Case 2: Sharing large datasets  The Problem: –There are many different models of ocean circulation and we would like to compare and visualize the results. But there are lots of different data formats, and there’s lots of data!  The Solution: –Create an Internet-based service that allows users to cut out just the data they want, and get it in the format they want (this is called Grid Access Data Service, GADS) –Developed under the GODIVA project
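
The "cut out just the data they want" step can be sketched as a bounding-box subset of a gridded field. This is an illustration only, not the GADS implementation: the function `subset_grid`, the grid, and the temperature values are all hypothetical.

```python
def subset_grid(lats, lons, values, lat_range, lon_range):
    """Return only the grid rows and columns that fall inside the bounding box.

    values[i][j] is the value at latitude lats[i] and longitude lons[j].
    """
    lat_idx = [i for i, lat in enumerate(lats) if lat_range[0] <= lat <= lat_range[1]]
    lon_idx = [j for j, lon in enumerate(lons) if lon_range[0] <= lon <= lon_range[1]]
    return {
        "lats": [lats[i] for i in lat_idx],
        "lons": [lons[j] for j in lon_idx],
        "values": [[values[i][j] for j in lon_idx] for i in lat_idx],
    }


# Hypothetical 3 x 4 grid of sea surface temperatures (degrees C).
lats = [40.0, 50.0, 60.0]
lons = [-30.0, -20.0, -10.0, 0.0]
sst = [
    [15.0, 15.5, 16.0, 16.5],
    [11.0, 11.5, 12.0, 12.5],
    [7.0, 7.5, 8.0, 8.5],
]

# Request only the region of interest; the rest of the grid is never returned.
region = subset_grid(lats, lons, sst, lat_range=(45.0, 65.0), lon_range=(-25.0, -5.0))
```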

10 GODIVA Web Portal  Allows users to interactively select data for download using a GUI  Users can create movies on the fly  cf. Live Access Server

11 Case 3: Highly distributed data  The Problem: –In order to study the genetic origins of a disease it is necessary to interrogate many data sources to perform in silico experiments to test hypotheses  The Solution: –Provide Web Services to access these data sources and a means for combining these Services into workflows. –These workflows can be shared between scientists, and experiments can be easily repeated –myGrid project is doing just this

12 The Taverna workbench  Each blob on the diagram is a Web Service  Flexible way of creating a distributed application

13 e-Science concepts

14 e-Science buzzwords  The GRID –highly heterogeneous network of supercomputers, clusters and commodity machines (and one PS2!) –cf. power grids (long way off!) –not all e-Science is done on The GRID (in fact, most isn’t at the moment)  Interoperability / standards –absolutely necessary for working together and avoiding duplication of effort  Metadata and Semantics (“The Semantic Web”) –Metadata = “data about data”, vital for discovering data resources –Meaning of data (semantics) must be precisely specified

15 The tools of the trade  Middleware –software that “glues together” existing systems and connects people with distant resources  Condor –Manages task of running jobs over several computers  Globus (Toolkit) –Most popular middleware, handles authentication, job submission, etc –version 3 very different from previous versions; it’s based on…  Web Services

16 Web Services  “Black box” subroutine that can be accessed over the Internet  Platform and language neutral –for example, code can run on Solaris, but be called from Mac, Windows, Linux etc, any language  Huge industry backing –IBM, Microsoft, Sun, etc  Grid Services extend WS for long-lived jobs –notification of progress, persistence of data etc
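
The "black box subroutine over the Internet" idea can be illustrated with Python's standard-library XML-RPC modules. Note the substitution: XML-RPC is a simpler protocol than the SOAP-style Web Services the slide refers to, and is used here only because it needs no extra software. The function `mean` and the wrapper `call_mean_remotely` are hypothetical names.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def mean(values):
    """The 'black box' subroutine that the service exposes."""
    return sum(values) / len(values)


def call_mean_remotely(values):
    """Start a local service, call it over HTTP, then shut it down."""
    # Port 0 asks the operating system for any free port.
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(mean, "mean")
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # The client only knows the URL and the function name, not the code.
        with ServerProxy(f"http://127.0.0.1:{port}") as client:
            return client.mean(values)
    finally:
        server.shutdown()
        server.server_close()


result = call_mean_remotely([1.0, 2.0, 3.0])
```

The request and the reply travel as XML over HTTP, so the caller could equally be written in another language or run on another platform.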

17 Workflows  Web Services can be composed into “workflows” to create a distributed application –hot topic of research and debate in e-Science  Lots of standards and tools to do this, but no one clear “winner” yet  BPEL is popular, but really designed for business-to-business (B2B) interaction

18 Example workflow [diagram boxes: Extract dataset 1, Extract dataset 2, Convert format, Compare datasets, Perform diagnostics, Visualize results]
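
A minimal sketch of how the steps named on this slide might be chained. The ordering of the steps is an assumption (the slide's diagram did not survive), and every function and data value here is a hypothetical local stand-in for what would be a remote service.

```python
def extract_dataset_1():
    """Hypothetical service returning values as a list of floats."""
    return [1.0, 2.0, 3.0]


def extract_dataset_2():
    """Hypothetical service returning the same kind of data as CSV text."""
    return "1.5,2.5,3.5"


def convert_format(csv_text):
    """Convert comma-separated text into a list of floats."""
    return [float(item) for item in csv_text.split(",")]


def compare_datasets(a, b):
    """Point-by-point difference between two datasets."""
    return [y - x for x, y in zip(a, b)]


def perform_diagnostics(differences):
    """Summarise the differences."""
    return {
        "mean_difference": sum(differences) / len(differences),
        "max_abs_difference": max(abs(d) for d in differences),
    }


def visualize_results(diagnostics):
    """Stand-in for a visualization step: a plain-text report."""
    return "\n".join(f"{name}: {value:.2f}" for name, value in sorted(diagnostics.items()))


# The workflow is just the wiring between the steps.
first = extract_dataset_1()
second = convert_format(extract_dataset_2())
diagnostics = perform_diagnostics(compare_datasets(first, second))
report = visualize_results(diagnostics)
```

Because each step only depends on the output of the previous one, any step could be swapped for a different service without touching the others.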

19 Visualization  Key component of many e-Science projects  Vital for validating models and finding features of interest –not just “pretty pictures”  Can do collaborative visualization –several groups can look at the same thing at the same time –e.g. mammography in hospitals  Real-time visualization of model results permits computational steering –RealityGrid –explore parameter space much more quickly

20 GODIVA visualization  Adaptive meshing gives data compression with little visible degradation  60 x 60 x 66 data points ~ ¼ million reduced by factor of ~10
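
The figures on this slide can be checked with a few lines of arithmetic. The grid dimensions and the reduction factor come from the slide; the reduction factor is approximate, so the reduced count is only indicative.

```python
nx, ny, nz = 60, 60, 66          # grid dimensions quoted on the slide
full_points = nx * ny * nz       # 237,600, i.e. roughly a quarter of a million
reduction_factor = 10            # approximate factor quoted on the slide
reduced_points = full_points // reduction_factor
```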

21 Back to the ReSC...

22 Why ReSC?  Centre of Excellence in Environmental e-Science  Reading Uni has strong links with Met Office, and Environment Agency  Support existing Reading e-Science activities –in ESSC, Comp Sci, Plant Sciences, etc –acts as focus and central point of contact –not just environmental e-Science  Complements NIEeS –National Institute for Environmental e-Science in Cambridge

23 Who are we?  Two co-Directors –Keith Haines (ESSC) –Rachel Harrison (Computer Science)  Technical Director (first point of contact) –Jon Blower (ESSC)  Many Associates –Mike Evans, Lizzie Froude, Kevin Hodges, Chunlei Liu, Kecheng Liu, Adit Santokhee –join us!

24 What are we doing?  Building Reading e-Science community –Comp Sci, Met Dept, CGAM, DARC, Plant Sciences  Building infrastructure –Building Condor pool between ESSC and Comp Sci, further in future –Bidding for dedicated compute cluster  Building software –Web Services for environmental data access and manipulation  Outreach into govt agencies and industry –BMT, ECMWF, MCA, SEEDA –using Reading Enterprise Hub

25 ReSC projects  Flexible Online Environmental Data Systems (EDAS) –SEEDA project –delivery of live Met Office data to end users –e.g. BMT for search and rescue / oil spill mitigation  GODIVA –Grid for Ocean Diagnostics, Interactive Visualization and Analysis  GADS –Grid Access Data Service  Lizzie Froude’s PhD studentship –storm tracking diagnostics on large, distributed data sets  Lots more going on in Reading –e.g. BiodiversityWorld –Computer Science

26 How you can get involved  Talk to us!  Join the Reading University e-Science mailing list  Read our website:  Use the Wiki site to share ideas –Register expertise and interests –Share documents that might be of general use

27 What we can do for you  Provide technical expertise –e.g. on Web Services, workflow, etc  Provide advice on getting funding  Help find collaborators, resources etc  Provide computational resources  Provide live data  Provide Access Grid for use

28 The Access Grid

29 What is the Access Grid?  (not to be confused with The GRID!)  State-of-the-art videoconferencing suite  Can hold meetings with many sites at once –everyone can see and hear everyone else  Reduces travel costs and saves lots of time  Uses high-speed internet –no running costs!  Easy to operate –don’t need dedicated technician

30 In conclusion…  ReSC is here to support all Reading e-Science activity  We specialise in environmental e-Science  We’re always looking for new projects to be involved in  Many potential future projects –especially in area of delivery of real-time Met Office or Environment Agency data –engage GIS community  Let us know what you would like us to do!

31 Other environmental e-Science projects

32 GENIE  Grid-Enabled Integrated Earth System model  Aims to create a distributed, component-based model of the earth system  Will study long-term climate change and palaeoclimate  Will incorporate components representing atmosphere, ocean, land surface, ice, ocean and land biogeochemistry, ocean sediments  Developing novel computing techniques for model framework, integration, data management, visualization

33 GENIE (contd.) Response of Atlantic circulation to freshwater forcing  New ways of working: –Web Portal for composing + executing simulations, retrieving results –Use of flocked Condor pools (London, Soton) and Beowulf clusters –Data client for post-processing

34 GENIE (contd.)  3 international collaborators (Japan, US, Switzerland)  Involvement in international projects: PRISM, EMIC, GAIM  4 Oral, 2 poster presentations at EUG/AGU (Nice), IUGG (Japan), AHM 03  4 refereed journal papers (1 in press, 3 submitted)  Engagement with industry (50K each from Intel, Compusys for meetings)  ~20 people at present using shared code repository –Tyndall Centre will use code in integrated assessment model

35 GODIVA  Grid for Ocean Diagnostics, Interactive Visualisation and Analysis  Aims to quantify the thermohaline circulation via analysis of model results and observational data  Developing Web Services for performing common tasks on oceanographic data: –Data extraction, processing, analysis, visualisation  These Services will be composed into “workflows” to create flexible, distributed applications –collaborating with other e-Science projects (e.g. myGrid) in this matter

36 GODIVA progress  Talks/demonstrations at All Hands meeting and SCGlobal 2003  Created prototype client application: –extracts live data and performs 3-D rendering  Also created data portal providing global access to data (next slide)  Will engage GIS community (e.g. MarineGIS project in Ireland)  MENTION irregular mesh

37 GODIVA Data Portal  Web-based, similar to Live Access Server  Users select area of interest and can download data or create movies in matter of seconds or minutes  Uses distributed computing for visualisation

38 NERC Data Grid  Objective is to build a grid that makes data discovery, delivery and use much easier than it is now  Standards compliant (ISO 19115, 19118), semantic data model for maximum interoperability  Data can be stored in many different ways (flat files, databases…)  Clear separation between discovery and use of data.  1 PI, 2 co-Investigators, 4 FTE staff, 3 registered US collaborators

39 NERC Data Grid progress  Involved in many UK events (All Hands, Met Soc, NIEeS workshops etc)  Generated much international interest (US, France, Netherlands, Australia…)  Major challenges: –Influencing OGC and ISO to support the complex requirements of the climate simulation community –Developing a “feature-registry” to allow semantics of data types to be well understood by different communities

40  Have created extremely powerful and distributed climate modelling facility by running model simulation on home computers (cf.  Launch ensemble of coupled simulations of and compare with observations.  Run on to 2050 under a range of natural and anthropogenic forcing scenarios.  Investigates sensitivity of climate system to increasing CO2 with range of parameter values  Have collaborated with other universities and industry to build system

41 e-Minerals  Models the atomistic processes involved in environmental issues (radioactive waste disposal, pollution, weathering) –Simulation of radiation damage (Daresbury) –Order-N quantum mechanical model of fluids (Cambridge) –Complex fluid-mineral interfaces – crystal growth and dissolution (Bath)  Developing new methods –embedded clusters: links simulations of various sophistication to cover greater ranges of scales –first use of quantum Monte Carlo techniques in mineral sciences

42 e-Minerals (contd.)  Have constructed minigrid across institutions to run code –~30 scientists in 8 institutions  Users submit jobs using a Web Portal –This integrates the CCLRC Data Portal with the HPC Portal  Developing tools for collaborative visualisation across the virtual organisation  Collaborating with Peter Murray-Rust to extend the Chemical Markup Language (CML) for computational chemistry

43 NIEeS  National Institute for Environmental e-Science  Promotes and supports the use of e-science and grid technologies within the UK environmental science community  Holds workshops, courses, training events, visitor programmes, demonstration projects  Industry event forthcoming (Feb 12th) –generating much interest

44 NIEeS (contd.)  Up to end of 2003 (since launch in July 2002): –14 events held –901 participants  e.g. Earth Systems Modelling workshop (Oct 03) received coverage in national press and engaged Earth Simulator community in Japan  Event sponsorship from BNFL, LaserScan  In-kind support from EDINA, ICE, IEMA, MIRO  Additional help from Hi Consulting

45 Illustration of an e-Science problem  SOC’s latest OCCAM model runs at 1/12 degree resolution, covering the entire globe  Every model day, model outputs 8GB of data –Hence whole data set will be several TB in size  How do we work with this data set? –Might want to do analysis, visualisation etc –Extract just the data you want and work with it –OR move the programs (code) to the data, not vice-versa  These are two key principles of e-Science
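
A back-of-envelope check of the data volume described on this slide. The 8 GB per model day comes from the slide; the run length of one model year and the convention 1 TB = 1000 GB are assumptions made only for this illustration.

```python
GB_PER_MODEL_DAY = 8  # figure quoted on the slide


def dataset_size_tb(model_days, gb_per_day=GB_PER_MODEL_DAY):
    """Total output size in terabytes, taking 1 TB = 1000 GB."""
    return model_days * gb_per_day / 1000


# Assumed run length: one model year of daily output.
one_year_tb = dataset_size_tb(365)
```

At this rate a single model year already comes to about 2.9 TB, which is consistent with the slide's "several TB" and shows why extracting subsets, or moving the code to the data, is preferable to copying the whole data set.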

46 Working with large data sets [diagram boxes: Subset / resample; Transform / regrid / rotate; Analyse; Compare]

47 UK e-Science Centres  National e-Science Centre (NeSC)  National Institute for Environmental e-Science (NIEeS)

48 GADS: Background  Climate scientists have a need to access large datasets: –Model data and satellite observations –Data in a variety of formats (netCDF, HDF, GRIB, more), grids, naming conventions –Model intercomparisons (MERSEA)  Existing standards (DODS/OPeNDAP) are limited

49 Advantages of GADS  Data are abstracted from storage  Data can be exposed with standard variable names, even if data files do not conform to standards  Data can be delivered in many formats, irrespective of internal storage format  Deployed as Web Service –Platform-independent –Compatible with current e-Science advances
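
Two of the points on this slide, standard variable names and delivery in a requested format, can be sketched as follows. This is not the GADS code: the mapping `STANDARD_NAMES`, the functions `expose` and `deliver`, the local names `temp` and `sal`, and the data values are all hypothetical.

```python
import csv
import io
import json

# Hypothetical mapping from names used inside a data file to standard names.
STANDARD_NAMES = {
    "temp": "sea_water_potential_temperature",
    "sal": "sea_water_salinity",
}


def expose(raw_record):
    """Rename variables stored under local names to their standard names."""
    return {STANDARD_NAMES.get(name, name): values for name, values in raw_record.items()}


def deliver(record, fmt):
    """Serialise a record in the requested format, whatever the storage format."""
    if fmt == "json":
        return json.dumps(record, sort_keys=True)
    if fmt == "csv":
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        for name in sorted(record):
            writer.writerow([name] + record[name])
        return buffer.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


# Data as it might sit in a file that does not follow naming standards.
stored = {"temp": [10.5, 11.0], "sal": [35.1, 35.2]}
exposed = expose(stored)
as_csv = deliver(exposed, "csv")
as_json = deliver(exposed, "json")
```

The caller sees only standard names and the format it asked for; how the values are stored stays hidden behind the service.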
