Enrico Fattibene INFN-CNAF


1 Enrico Fattibene INFN-CNAF
Grid introduction
Enrico Fattibene, INFN-CNAF
26 September 2011, Calcolo Parallelo su Grid e CSN4cluster (Parallel Computing on Grid and CSN4cluster)

2 Outline
- The scientific "demand"
- What is a Grid?
- Primary Grid components
- European Grid Infrastructure (EGI)
- Italian Grid Infrastructure (IGI)
- IGI Grid management and support

3 eScience
- Science is becoming increasingly digital and must handle growing volumes of data and growing computational needs
- Simulations get ever more detailed
  - Nanotechnology: design of new materials from the molecular scale
  - Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
  - Decoding the human genome
- Experimental science uses ever more sophisticated sensors to make precise measurements
  - Need for high statistics
  - Huge amounts of data
  - Serves user communities around the world
- Science is getting more digital worldwide; the LHC is an example

4 LHC at CERN
- Four detectors: CMS, LHCb, ATLAS, ALICE
- 40,000,000 collisions/sec in each of the four detectors
- 100,000 of today's fastest processors
- 15 petabytes of new data each year
- 150 times the total content of the Web each year
- 1 petabyte (1 PB) = 1000 TB = 10 times the text content of the World Wide Web**
** Urs Hölzle, VP Operations at Google
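
The "150 times" figure follows directly from the two numbers quoted on the slide. A minimal sketch of that arithmetic, using only the slide's own figures (the variable names are illustrative):

```python
# Figures quoted on the slide
NEW_DATA_PB_PER_YEAR = 15   # new LHC data per year, in petabytes
TB_PER_PB = 1000            # the slide uses decimal units: 1 PB = 1000 TB
WEB_TEXT_PER_PB = 10        # 1 PB is quoted as 10x the text content of the Web

data_tb_per_year = NEW_DATA_PB_PER_YEAR * TB_PER_PB
web_equivalents_per_year = NEW_DATA_PB_PER_YEAR * WEB_TEXT_PER_PB

print(f"{data_tb_per_year} TB of new data per year")
print(f"{web_equivalents_per_year} times the text content of the Web per year")
```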

5 What is a Grid?
A computational Grid is a collection of distributed, possibly heterogeneous resources that can be used as an ensemble to execute large-scale applications.
The three fundamental properties of Grid computing:
- coordinating resources that are not subject to centralized control
- using standard, open, general-purpose protocols and interfaces
- delivering nontrivial qualities of service
In more detail:
- Large-scale coordinated management of resources belonging to different administrative domains (multi-domain vs single domain)
- Standard, open, multi-purpose protocols and interfaces that provide a range of services (standard vs proprietary)
- Delivery of complex Quality of Service (QoS): Grid computing allows its constituent resources to be used in a coordinated fashion to deliver various types of QoS, such as response time, throughput, availability, reliability, security, etc.

6 Primary components
The primary components of a production Grid are:
- Computing resources
- Storage resources
- Access points to the Grid
- Core services
Other elements are just as fundamental for operating, managing and monitoring the Grid:
- Monitoring tools
- Accounting tools
- Management and control infrastructure

7 Computing resources
- Provide the possibility to execute a computation
- But also:
  - Get the status of the computation
  - Cancel the computation
- Computing resources are typically provided by possibly large farms of computers, the Worker Nodes (WNs)
- Usually managed by a batch system (e.g. LSF, PBS, Condor)
- The corresponding Grid abstraction is called a Computing Element (CE)
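
The three operations listed above (execute, get status, cancel) can be pictured as a small interface. The sketch below is a toy in-memory stand-in, not the API of any real CE or batch system; the class, method and state names are all hypothetical.

```python
from enum import Enum
from itertools import count


class JobState(Enum):
    QUEUED = "queued"
    CANCELLED = "cancelled"


class ToyComputingElement:
    """Hypothetical in-memory stand-in for a CE in front of a batch system."""

    def __init__(self, name):
        self.name = name
        self._ids = count(1)   # simple job-id counter
        self._jobs = {}        # job id -> JobState

    def submit(self, executable):
        """Accept a computation and return an identifier for it."""
        job_id = f"{self.name}-{next(self._ids)}"
        self._jobs[job_id] = JobState.QUEUED
        return job_id

    def status(self, job_id):
        """Report the current state of a previously submitted computation."""
        return self._jobs[job_id]

    def cancel(self, job_id):
        """Cancel a previously submitted computation."""
        self._jobs[job_id] = JobState.CANCELLED


ce = ToyComputingElement("ce01")
job = ce.submit("/bin/hostname")
state_after_submit = ce.status(job)
ce.cancel(job)
state_after_cancel = ce.status(job)
```

A real CE would forward these calls to the underlying batch system; the toy only records the state so that the shape of the interface is visible.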

8 Storage resources
- Provide the possibility to manage the storage of data
- Data are typically in the form of files
  - Create, read, write, delete files/directories
- Storage may be provided using different technologies
  - DPM, Castor, dCache, StoRM for management
  - GridFTP for transfer
  - rfio, gsidcap, posix, ... for access
- The corresponding Grid abstraction is called a Storage Element (SE)

9 Authentication
- A Grid may have hundreds of CEs and SEs. Do I need an account on each of them? No
- A Grid identity is managed with an X.509 certificate, which represents the user's credentials
  /C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Enrico Fattibene
- A Grid identity is transparently mapped to a local identity/account, provided that authorization is granted

10 Authorization
- Grid users belong to an experiment and, within that experiment, to different groups
- On the Grid an experiment is a Virtual Organization (VO)
- VO, groups and roles can be associated with an identity by a VO Membership Service (VOMS)
- VO, groups and roles are included in the user's credentials and used, for example, in the local mapping
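
The mapping from a Grid identity plus VO membership to a local account can be sketched as follows. This is an illustration only: the VO name, the account name and the policy table are hypothetical, and a real site would rely on its middleware's mapping services rather than code like this.

```python
def parse_dn(dn):
    """Split a subject such as /C=IT/O=INFN/CN=Name into (key, value) pairs."""
    return [tuple(part.split("=", 1)) for part in dn.strip("/").split("/")]


def map_to_local_account(dn, vo, policy):
    """Return the local account for this identity, or None if not authorized."""
    common_name = dict(parse_dn(dn)).get("CN")
    if common_name is None or vo not in policy:
        return None
    return policy[vo]


# Subject taken from the Authentication slide.
DN = "/C=IT/O=INFN/OU=Personal Certificate/L=CNAF/CN=Enrico Fattibene"

# Hypothetical site policy: VO name -> local account.
POLICY = {"examplevo": "examplevo001"}

granted = map_to_local_account(DN, "examplevo", POLICY)
denied = map_to_local_account(DN, "othervo", POLICY)
```

The user never handles the local account directly: the certificate identifies the person, the VO attributes decide whether access is granted, and the site chooses the account.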

11 Information System
- How do I know which resources are available? How do I know which ones I can use?
- Services publish their existence, characteristics and status in the Information Service
- The information is published according to an agreed-upon schema, called the GLUE schema
- The most common implementation is based on LDAP and is called BDII (Berkeley Database Information Index)
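
The two questions on this slide amount to filtering the published records. The sketch below works on an in-memory list rather than a live LDAP query; the attribute names are modelled on GLUE-style naming and should be treated as illustrative, and the hosts, VO names and values are invented.

```python
# Hypothetical records, as a site might publish them (values are invented).
published = [
    {"GlueCEUniqueID": "ce.site-a.example",
     "GlueCEStateFreeCPUs": 12,
     "GlueCEAccessControlBaseRule": ["VO:examplevo", "VO:othervo"]},
    {"GlueCEUniqueID": "ce.site-b.example",
     "GlueCEStateFreeCPUs": 0,
     "GlueCEAccessControlBaseRule": ["VO:examplevo"]},
    {"GlueCEUniqueID": "ce.site-c.example",
     "GlueCEStateFreeCPUs": 40,
     "GlueCEAccessControlBaseRule": ["VO:othervo"]},
]


def usable_ces(records, vo):
    """Return the CEs that admit this VO and currently report free CPUs."""
    rule = f"VO:{vo}"
    return sorted(
        r["GlueCEUniqueID"]
        for r in records
        if rule in r["GlueCEAccessControlBaseRule"]
        and r["GlueCEStateFreeCPUs"] > 0
    )


available_for_examplevo = usable_ces(published, "examplevo")
available_for_othervo = usable_ces(published, "othervo")
```

Because every service publishes against the same schema, one filter works across all sites.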

12 Job Management
- The Workload Management System (WMS) is responsible for distributing and managing tasks across Grid resources, in particular Computing Elements, so that applications are executed conveniently, efficiently and effectively
- It is complemented by the Logging & Bookkeeping (LB) service, which
  - keeps track of events generated by the different components involved in job management
  - provides the status of a job
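
The division of labour between the broker and the bookkeeping service can be sketched in a few lines. This is a toy model: the ranking rule (most free CPUs), the event names and all class names are assumptions made for illustration, not the behaviour of the actual WMS or LB.

```python
from dataclasses import dataclass, field


@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    supported_vos: frozenset


@dataclass
class LoggingAndBookkeeping:
    """Keeps the list of events per job; the status is the latest event."""
    events: dict = field(default_factory=dict)

    def log(self, job_id, event):
        self.events.setdefault(job_id, []).append(event)

    def status(self, job_id):
        return self.events[job_id][-1]


def dispatch(job_id, vo, ces, lb):
    """Pick a CE for the job (assumed rule: most free CPUs) and log each step."""
    lb.log(job_id, "SUBMITTED")
    candidates = [ce for ce in ces if vo in ce.supported_vos and ce.free_cpus > 0]
    if not candidates:
        lb.log(job_id, "NO_MATCH")
        return None
    best = max(candidates, key=lambda ce: ce.free_cpus)
    lb.log(job_id, "SCHEDULED")
    return best.name


ces = [
    ComputingElement("ce-a", 4, frozenset({"examplevo"})),
    ComputingElement("ce-b", 16, frozenset({"examplevo"})),
    ComputingElement("ce-c", 64, frozenset({"othervo"})),
]
lb = LoggingAndBookkeeping()
chosen = dispatch("job-1", "examplevo", ces, lb)
unmatched = dispatch("job-2", "unknownvo", ces, lb)
```

Keeping the event log separate from the broker is what lets a user ask for a job's status without knowing which component last handled it.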

13 Monitoring
- Observing the composition, state and features of available resources
- Analyzing their behavior and performance
- Detecting and preventing fault situations

14 Accounting
- How many resources have I used? How many resources has a certain VO used?
- An accounting system provides support to give precise answers to such questions
  - Collect information at the resource level
  - Propagate the information to higher levels, where it can be aggregated according to different views
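
Collecting at the resource level and aggregating by different views can be sketched as a group-and-sum over usage records. The record fields and all values below are invented for illustration; they are not the format of any real accounting system.

```python
from collections import defaultdict

# Hypothetical per-job usage records collected at the resource level.
usage_records = [
    {"site": "Site A", "vo": "examplevo", "user": "alice", "cpu_hours": 10.0},
    {"site": "Site A", "vo": "othervo", "user": "bob", "cpu_hours": 5.0},
    {"site": "Site B", "vo": "examplevo", "user": "alice", "cpu_hours": 2.5},
    {"site": "Site B", "vo": "examplevo", "user": "carol", "cpu_hours": 7.5},
]


def aggregate(records, key):
    """Sum CPU hours grouped by one field of the record (one 'view')."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cpu_hours"]
    return dict(totals)


per_vo = aggregate(usage_records, "vo")      # "How much has a certain VO used?"
per_user = aggregate(usage_records, "user")  # "How much have I used?"
per_site = aggregate(usage_records, "site")
```

The same records answer both questions on the slide; only the grouping key changes.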

15 Grid schema
[Diagram; labels: Core Services, Information System, Data Catalogs, User Support, VO Management, Job Broker (WMS), File Transfer Service; Sites A, B and C, each with a CE and an SE]

16 Grid advantages: site
- Make better use of existing resources
- Monitoring tools
- Accounting tools
- Support for site managers
  - Installation, upgrading, problems, ticketing system
- Coordination of security aspects

17 Grid advantages: user
- Can solve larger, more complex problems in a shorter time
- Easier to collaborate with other organizations
- Support for users
- Application porting

18 European e-Infrastructures
- European Data Grid (EDG)
  - Middleware development and testbed deployment
- Enabling Grids for E-sciencE (EGEE) I-II-III
  - From the prototype to the production infrastructure
- European Grid Infrastructure (EGI)
  - Towards a sustainable Grid infrastructure
  - Key role of the National Grid Initiatives (NGIs)
  - Based on the gLite/UMD (Unified Middleware Distribution) middleware release

19 EGI in numbers
- Logical CPUs (cores): 248,424 (EGI), 337,608 (all)
- Storage resources: 106.7 PB disk, 112.8 PB tape
- Resource Centres: 329 (EGI), 346 (all)
  - 35% of logical cores are provided by the 9 largest Resource Centres
- Countries: 50 (EGI), 57 (all)
  - 38 National Grid Infrastructures (NGIs) providing resources
  - 1 European International Research Organisation (EIRO) providing resources (CERN)
  - 19 countries in 4 non-European Operations Centres

20 IGI
- The Italian Grid Infrastructure (IGI) is part of EGI, together with many other European National Grid Initiatives (NGIs)
- It is one of the largest NGIs

21 IGI in numbers
- Logical CPUs (cores): 26,087
- Storage resources: 24.5 PB disk, 5 PB tape
- Resource Centres: 58
- Partners: 19 institutes/universities
- Users: 1100
- Jobs per year: 30 million
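
To put these figures in context, the short calculation below relates them to the EGI core count from the "EGI in numbers" slide. It assumes the two slides refer to comparable snapshots, which the slides do not state.

```python
# Figures quoted on the "IGI in numbers" and "EGI in numbers" slides
IGI_CORES = 26_087
EGI_CORES = 248_424        # the "EGI" figure, not the "all" figure
IGI_USERS = 1_100
IGI_JOBS_PER_YEAR = 30_000_000

igi_share_percent = 100 * IGI_CORES / EGI_CORES
jobs_per_user_per_year = IGI_JOBS_PER_YEAR / IGI_USERS

print(f"IGI provides about {igi_share_percent:.1f}% of EGI logical cores")
print(f"About {jobs_per_user_per_year:.0f} jobs per user per year on average")
```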

22 IGI Grid management
IGI Grid management is performed by the Operation Center. The main activities are:
- Production and testing of the Infngrid middleware release (a customization of the gLite/UMD release)
- Deployment of the release to the sites, support for local administrators and site certification
- Periodic checks of the status of resources and services
- Support at the Italian level
- Support at the European level
- Introduction of new Italian sites into the infrastructure
- Introduction of new regional VOs

23 IGI support
- About 10 supporters carry out checking shifts: 1 shift per week, with 2 people per shift
  - They provide first-line support to sites and users (1st line supporters team)
- Specialists of Grid services (2nd line supporters) step in for more complex problems
- The main activities are:
  - Checking the Grid status and raising problem warnings, following problems through to their solution where possible
  - Checking tickets that are still open and pressing the experts or site managers to answer and solve them
- In case of problems with the IGI infrastructure:
  - Register and submit tickets through CMT (Central Management Team), which is the generic department
- Speaker note: highlight the change in the shift system

24 Monitoring tools
- Nagios
  - Simplifies Grid resource operations
  - Visualization and management interface for the status of Grid resources
  - Provides site admin-centric monitoring
  - Issues notifications as soon as a problem appears
- GStat
  - Queries the Information System every 5 minutes
  - The sites and nodes checked are those registered in the GOC DB
  - Inconsistencies in the published information, and any missing service that a site should publish, are reported as errors
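
The GStat check described above compares what a site is registered to provide with what it actually publishes. The sketch below shows that comparison on plain Python sets; the function and field names are hypothetical, and the inputs stand in for data that would come from the GOC DB and the Information System.

```python
QUERY_INTERVAL_SECONDS = 5 * 60  # the slide states a 5-minute query period


def check_site(registered_services, published_services):
    """Compare registered services with published ones and list the problems."""
    registered = set(registered_services)
    published = set(published_services)
    return {
        "missing": sorted(registered - published),       # should publish, does not
        "unregistered": sorted(published - registered),  # published, not registered
    }


# Hypothetical example: the site is registered with a CE, an SE and a BDII,
# but only the CE and the BDII show up in the Information System.
report = check_site(
    registered_services=["CE", "SE", "BDII"],
    published_services=["CE", "BDII"],
)
site_has_errors = bool(report["missing"] or report["unregistered"])
```

A monitoring loop would rerun this check once per interval and raise an error for every non-empty entry in the report.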

25 HLRmon accounting portal
- Open section with general accounting data
- The personal certificate installed in the browser is required
- Data aggregated per:
  - Grid site
  - VO, groups and roles
  - CA, RA
  - Job type (Grid or local)
- Restricted section providing per-user information, visible only to registered and authorized users

26 Thank you
Questions?

