DataGrid is a project funded by the European Union. ICT KennisCongres 2003. Grids: Background and Practice in the EU Data Grid (Dutch title: "Grids – Achtergronden en praktijk in het EU Data Grid"). David Groep, NIKHEF


1 DataGrid is a project funded by the European Union
ICT KennisCongres 2003
Grids: Background and Practice in the EU Data Grid (Dutch title: "Grids – Achtergronden en praktijk in het EU Data Grid")
David Groep, NIKHEF
davidg@nikhef.nl
http://www.dutchgrid.nl/
http://www.eu-datagrid.org/
http://www.edg.org/

2 Talk Outline
- The vision
- What makes a Grid
- How was it created?
- Building a production Grid in Europe
- Will it become commonplace…?

3 Grid: a vision
The GRID: networked data processing centres, with "middleware" software as the "glue" between resources.
Researchers perform their activities regardless of geographical location, interact with colleagues, and share and access data.
Scientific instruments and experiments provide huge amounts of data.
Federico.Carminati@cern.ch
Next: beyond distributed computing

4 Beyond distributed computing
A grid integrates resources that
- are not owned or administered by one single organisation
- speak a common, open protocol … that is generic
- work as a coordinated, transparent system
And …
- can be used by many people from multiple organisations
- that work together in one Virtual Organisation
Checklist items based on: Ian Foster, "What is the Grid?", July 2002
Next: virtual organisations

5 Virtual Organisations
- A VO is a temporary alliance of stakeholders
  - Users
  - Service providers
  - Information providers
A set of individuals or organisations, not under single hierarchical control, temporarily joining forces to solve a particular problem at hand, bringing to the collaboration a subset of their resources, sharing those at their discretion and each under their own conditions.
Viewgraph: Foster, Kesselman, Tuecke, the Globus Project
Next: common and open protocols

6 Enhanced collaboration
- owners of resources and data stay in control
- sharing conditions are explicit, …
- … and can vary for every resource or service
- each VO, and each user, has its own view of the Grid
- a user's "own" grid is transparent and gives easy access
- results can again be shared under specified conditions
- the Grid a user sees is flexible and resilient to failure

7 Common and open protocols
[Diagram: layered Grid architecture, top to bottom]
- Applications
- Application Toolkits: DUROC, MPICH-G2, Condor-G, VLAM-G
- Grid Services: GRAM, GridFTP, Information, Replica
- Grid Security Infrastructure (GSI)
- Grid Fabric: farms, supers, desktops, TCP/IP, apparatus, DBs
Resources must talk standard protocols … for interoperability of application toolkits
Next: protocol standards

8 Common protocol example: Data Access Protocol
- The GridFTP protocol can be used to access different types of systems
- Single Sign-On
- Security enabled
- Performance enhancements
- Generic, usable for many applications
[Diagram labels: user, job, GridFTP server, tape robot with CXFS, disk-based storage, CASTOR]

9 Standard protocols
- New Grid protocols based on popular Web Services
Open Grid Services Architecture
- service discovery
- many different bindings
- easily integrated in hosting environments (Java, WebSphere, .NET)
- is entirely generic
- adds: transient services, stateful services
Global Grid Forum (GGF) promotes the open standards process
Next: access in a coordinated way

10 Access in a coordinated way
- New "qualities of service"
- Transparent crossing of domain boundaries, satisfying constraints of
  - site autonomy
  - authenticity, integrity, confidentiality
- single sign-on to all services
- ways to address services collectively
- preferably via portals and visual programming
Next: example GOME analysis

11 Example: GOME analysis
Task: ozone is the component in the atmosphere that protects us from harmful UV radiation. Its concentration varies widely. What is happening?
- the EnviSat satellite is orbiting the earth and measuring light absorption in the atmosphere
- the absorption is related to the ozone concentration, but needs instrument corrections
- ground-based observations give absolute concentrations
- linking both datasets can give us the concentration everywhere
- terabytes of data come in at several ground stations, and various labs need the final products
=> The Grid can provide a good solution to this problem
Next: GOME analysis on the Grid, domains

12 Example: Ozone Analysis on the Grid
[Diagram labels: raw binary data stream, NOPREGO, OPERA, LIDAR database, validation, visualize, resource broker]
Next: DataGrid overview

13 A Working Grid: the EU DataGrid
Objective: build the next-generation computing infrastructure providing intensive computation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities
- official start in 2001
- 21 partners
- in the Netherlands: NIKHEF, SARA, KNMI
- pilot applications: earth observation, bio-medicine, high-energy physics
- aim for production and stability
Next: history of grids

14 Physics @ CERN
- LHC particle accelerator operational in 2007
- ~10 petabytes per year
- 150 countries
- > 10000 users
- lifetime ~20 years
Trigger and data flow:
- 40 MHz (40 TB/sec) into level 1 (special hardware)
- 75 kHz (75 GB/sec) into level 2 (embedded)
- 5 kHz (5 GB/sec) into level 3 (PCs)
- 100 Hz (100 MB/sec) to data recording & offline analysis
http://www.cern.ch/
Other applications in the EU DataGrid
Next: BioMedical
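
The rate figures on this slide can be checked with a little arithmetic. The sketch below is only an illustration: it assumes decimal units (1 MB = 10^6 bytes) and that each event rate pairs with the data rate given next to it in parentheses. The names `STAGES`, `bytes_per_event` and `reduction_factor` are made up for this example.

```python
# (event rate label, event rate in Hz, data rate in bytes per second),
# copied from the slide; decimal units are an assumption.
STAGES = [
    ("40 MHz", 40 * 10**6, 40 * 10**12),  # 40 TB/sec
    ("75 kHz", 75 * 10**3, 75 * 10**9),   # 75 GB/sec
    ("5 kHz", 5 * 10**3, 5 * 10**9),      # 5 GB/sec
    ("100 Hz", 100, 100 * 10**6),         # 100 MB/sec
]


def bytes_per_event(event_rate_hz, data_rate_bytes_per_s):
    """Average event size implied by an event rate and a data rate."""
    return data_rate_bytes_per_s // event_rate_hz


def reduction_factor(stages):
    """How much the event rate drops between the first and last stage."""
    return stages[0][1] // stages[-1][1]


for label, rate_hz, data_rate in STAGES:
    print(f"{label}: {bytes_per_event(rate_hz, data_rate)} bytes per event")
print(f"event rate reduction: {reduction_factor(STAGES)}x")
```

Under these assumptions every stage implies the same event size of about 1 MB, so the reduction in data volume comes entirely from discarding events, not from shrinking them.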

15 Bioinformatics in EU DataGrid
For access to data
- Large network bandwidth to access computing centres
- Support for data bank replicas (easier and faster mirroring)
- Distributed data banks
For interpretation of data
- Grid-enabled algorithms: BLAST on distributed data banks, distributed data mining
Next: GSI and VOMS

16 Realising the Grid Vision
The Grid was the logical next step at the end of the 1990s:
- Harnessing desktop power became commonplace
  - 1988: Condor, later: SETI@Home, Entropia, Distributed.NET
- Peer-to-peer data access protocols emerged
  - 1999: Napster, later: Gnutella, KaZaA, BitTorrent
- Network access became extremely fast
  - 1997: wide-area bandwidth starts to double every 9 months!
- 1997: Globus starts developing basic middleware
  - 1996: middleware by Legion, 2000: Unicore
- Massive take-up of the Grid vision in 1999
  - led in Europe by the EU DataGrid
  - others include: NASA-IPG, CrossGrid, GridLab, PPDG, Alliance, …
Next: the EU DataGrid project

17 Grid Security Infrastructure
Crucial in Grid computing: it gives Single Sign-On
GSI uses a Public Key Infrastructure with proxying and delegation
Multiple VOs per user, group and role support
[Diagram labels: user proxy with subject C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy, VOMS, authentication request, query, Auth DB, VOMS pseudo-cert, connect to providers, Grid, Service 1, Service 2, contracts]
VOMS overview: Luca dell'Agnello and Roberto Cecchini, INFN and EDG WP6
Next: information services overview
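
The subject string in the diagram shows how a proxy is named: the user's own subject with an extra `CN=proxy` component appended. As a minimal sketch (string handling only, with no certificate verification), the snippet below recovers the user's subject from such a string. The function names are hypothetical, and the handling of `CN=limited proxy` follows the legacy Globus naming convention, which the slide itself does not mention.

```python
# Components that legacy-style proxies append to the user's subject
# (assumption: Globus naming convention, not stated on the slide).
PROXY_COMPONENTS = {("CN", "proxy"), ("CN", "limited proxy")}


def split_subject(subject):
    """Split 'C=IT/O=INFN/...' into a list of (attribute, value) pairs."""
    parts = [part.strip() for part in subject.split("/") if part.strip()]
    return [tuple(part.split("=", 1)) for part in parts]


def end_entity_subject(subject):
    """Drop trailing proxy components to get the user's own subject."""
    pairs = split_subject(subject)
    while pairs and pairs[-1] in PROXY_COMPONENTS:
        pairs.pop()
    return "/".join(f"{attr}={value}" for attr, value in pairs)


# Subject string as it appears on the slide, stray spaces included.
print(end_entity_subject("C=IT/O=INFN /L=CNAF /CN=Pinco Palla /CN=proxy"))
```

A real service would validate the whole certificate chain before trusting any of these names; this only illustrates the naming.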

18 VO Membership Service features
- User can exploit membership of multiple VOs
- User can pick selected Roles for specific tasks
- Site authorization based on VO membership …
- … but the site has all the means to act on per-user characteristics!
- Fine-grained authorization for database and replica access
- All connections are two-way authenticated
  - no spoofing
  - no data corruption
  - no spying
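
The combination of VO-based authorization with per-user control might look roughly like the sketch below. This is an illustration, not the VOMS or EDG implementation: the `Credential` type, the `authorize` function and all example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """Hypothetical view of what a site learns about a user."""
    subject: str      # the user's certificate subject
    vo: str           # the VO the user presents for this task
    roles: frozenset  # roles the user selected within that VO


def authorize(cred, allowed_vos, banned_subjects, required_role=None):
    """Site-side decision: per-user bans first, then VO, then role."""
    if cred.subject in banned_subjects:
        return False  # the site can still act on individual users
    if cred.vo not in allowed_vos:
        return False  # coarse-grained check on VO membership
    if required_role is not None and required_role not in cred.roles:
        return False  # fine-grained check on the selected role
    return True


user = Credential("C=IT/O=INFN/L=CNAF/CN=Pinco Palla", "atlas",
                  frozenset({"production"}))
print(authorize(user, allowed_vos={"atlas"}, banned_subjects=set()))
```

Checking the ban list before the VO keeps the final say with the site, which matches the point that resource owners stay in control.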

19 What is needed to get the work done
- Fabric information
  - what are the resources (computers, disk, tape) available to my VO?
  - how do I access these resources (the "contact information")?
- "Physical" meta-data
  - when was this dataset written?
  - where can I find copies of it "close" to me?
- Contextual meta-data or "information"
  - Which datasets contain feature "X"?
  - Which DNA sequence corresponds to this protein?
- Actual storage, processing power, network connectivity
Next: spitfire

20 Spitfire: Access to Databases
- based on common EDG Trust and Authorization Manager
- VO and Role mapping to database views
Access via
- Browser
- Web Service
- Commands
Screenshots: Gavin McCance, Glasgow University and EDG WP2
Next: R-GMA

21 Grid information: R-GMA
Relational Grid Monitoring Architecture
- a Global Grid Forum standard
- implemented by a relational model
- used by grid brokers
Screenshots: R-GMA Browser, Steve Fisher et al., RAL and EDG WP3
Next: RLS and RMC

22 Replica Location Service
- Search on file attributes (date, name, …)
- Find replicas on (close) Storage Elements
ATLAS Replica Service:
- higgs1.dat, ... : sara:atlas/data/higgs1.dat, cern:lhc/atlas/higgses/1.dat
- higgs2.dat, ... : cern:lhc/atlas/higgses/2.dat
[Diagram labels: SE1 SARA, SE2 CERN, cache UvA DAS2, CE DAS-2, CE CERN]
Next: CE and RB, brokering and LCAS
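
The lookup shown on this slide, from a file name to its replicas and then to a "close" copy, could be sketched as follows. The catalog entries are the ones on the slide; the function name, the site-preference list and the notion of closeness as a simple ranking are assumptions made for illustration, not the EDG replica service interface.

```python
# Logical file name -> known replicas, taken from the slide.
REPLICA_CATALOG = {
    "higgs1.dat": ["sara:atlas/data/higgs1.dat",
                   "cern:lhc/atlas/higgses/1.dat"],
    "higgs2.dat": ["cern:lhc/atlas/higgses/2.dat"],
}


def closest_replica(logical_name, site_preference, catalog=REPLICA_CATALOG):
    """Return the replica at the most preferred site, or None if unknown.

    site_preference lists site prefixes from closest to farthest
    (hypothetical stand-in for a real network-distance measure).
    """
    replicas = catalog.get(logical_name, [])
    if not replicas:
        return None

    def rank(replica):
        site = replica.split(":", 1)[0]
        if site in site_preference:
            return site_preference.index(site)
        return len(site_preference)  # unlisted sites rank last

    return min(replicas, key=rank)


# A job running near SARA prefers the SARA copy when there is one.
print(closest_replica("higgs1.dat", ["sara", "cern"]))
print(closest_replica("higgs2.dat", ["sara", "cern"]))
```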

23 Compute Brokering: reliable execution
- User can delegate all job actions to the Resource Broker … and go away
- Reliable scheduling of jobs over the entire grid (as seen from the R-GMA information system)
- Users are roaming, and can retrieve their results anywhere, anytime
Next: EDG test bed overview
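
Brokering of this kind comes down to matching a job's requirements against what the information system reports about each resource. The sketch below shows one simple matchmaking rule under stated assumptions: the resource records, their field names, the example figures and the "most free CPUs wins" policy are all hypothetical and are not taken from the EDG Resource Broker.

```python
def choose_resource(job, resources):
    """Pick a compute resource for a job, or None if nothing matches.

    job:       dict with 'vo' and 'cpus' (hypothetical fields)
    resources: list of dicts with 'name', 'vos' and 'free_cpus',
               standing in for records from the information system
    """
    candidates = [
        r for r in resources
        if job["vo"] in r["vos"] and r["free_cpus"] >= job["cpus"]
    ]
    if not candidates:
        return None
    # Illustrative policy: prefer the resource with the most free CPUs.
    return max(candidates, key=lambda r: r["free_cpus"])["name"]


# Made-up snapshot of the grid as a broker might see it.
snapshot = [
    {"name": "ce-a", "vos": {"atlas", "biomed"}, "free_cpus": 12},
    {"name": "ce-b", "vos": {"atlas"}, "free_cpus": 40},
    {"name": "ce-c", "vos": {"earthobs"}, "free_cpus": 100},
]
print(choose_resource({"vo": "atlas", "cpus": 8}, snapshot))
```

A production broker would also weigh data location, queue lengths and site policy; the point here is only that the user hands over the requirements and the broker does the selection.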

24 Current EU DataGrid Facilities
[Map labels: CERN, Lyon, RAL, NIKHEF, CNAF, Tokyo, Taipei, BNL; legend: EDG and LCG sites, core site]
- ~1000 CPUs
- ~100 TByte storage
- several key databases
- ~60 sites, ~600 users in ~7 VOs
Next: using EDG, VisualJob

25 Using the DataGrid for Real
Screenshots: Krista Joosten and David Groep, NIKHEF
Next: Portals

26 Portals
Screenshots: ICES/KIS and WTCW: VLAM-G; INFN-GRID and EDG: Genius; NPACI: Rocks
Next: conclusions and outlook

27 What more is there to see and do?
The current Grids are only the beginning!
- portals will get more users on the Grid
- more functionality, better resilience, strong reliability
- joining the Grid will be as simple as joining a file-sharing network
- EGEE: a pan-European Grid Infrastructure being created today
The EU DataGrid project web: www.edg.org
DutchGrid Platform: www.dutchgrid.nl
For other grid projects, see www.gridstart.org and www.enterthegrid.com

