
1 Dagstuhl Seminar 02341: Performance Analysis and Distributed Computing, August 18 – 23, 2002, Germany
Monitoring of Interactive Grid Applications
Marian Bubak
with Bartosz Baliś, Włodek Funika, Tomasz Szepieniec, Roland Wismüller
Institute of Computer Science and ACC CYFRONET AGH, Cracow, Poland
LRR-TUM, Munich, Germany
Institute for Software Science, University of Vienna, Austria
EU CrossGrid Project – www.eu-crossgrid.org

2 Outline
1. Motivation – CrossGrid in a nutshell: applications and their requirements, architecture, tools for application development, monitoring system
2. Concept of Grid application monitoring
3. Grid extensions for OMIS
4. Design of the OCM-G
5. Security
6. Status

3 EU Funded Grid Project Space (Kyriakos Baxevanidis)
Diagram of the EU grid project space: applications-oriented projects (GRIDLAB, GRIA, EGSO, DATATAG, CROSSGRID, DATAGRID) and middleware & tools projects (GRIP, EUROGRID, DAMIEN) built on underlying infrastructures for science and industry/business, with links to European national efforts and to US projects (GriPhyN, PPDG, iVDGL, ...).

4 CrossGrid Collaboration
Poland: Cyfronet & INP Cracow, PSNC Poznan, ICM & IPJ Warsaw
Portugal: LIP Lisbon
Spain: CSIC Santander, Valencia & RedIris, UAB Barcelona, USC Santiago & CESGA
Ireland: TCD Dublin
Italy: DATAMAT
Netherlands: UvA Amsterdam
Germany: FZK Karlsruhe, TUM Munich, USTU Stuttgart
Slovakia: II SAS Bratislava
Greece: Algosystems, Demo Athens, AuTh Thessaloniki
Cyprus: UCY Nicosia
Austria: U. Linz

5 Biomedical Application
Pipeline (diagram): CT / MRI scan → medical DB → segmentation → medical DB → LB flow simulation → visualization and interaction (VE, WD, PC, PDA), backed by an HDB. Indicative figures: 10 simulations/day, 60 GB, 20 MB/s.

6 VR-Interaction

7 Cascade of Flood Simulations
Data sources → meteorological simulations → hydrological simulations → hydraulic simulations → output visualization → users

8 Example of the Flood Simulation - Flow and Water Depth

9 Distributed Data Analysis in High Energy Physics
– Objectives
  Distributed data access
  Distributed data-mining techniques based on neural networks
– Issues
  Typical interactive requests will run on O(TB) of distributed data
  Transfer/replication time for the whole data set is about one hour
  Data must therefore be transferred once, in advance of the interactive session
  Allocation, installation and set-up of the corresponding database servers before the interactive session

10 Weather Forecast and Air Pollution Modeling
– Distributed/parallel codes on the Grid
  Coupled Ocean/Atmosphere Mesoscale Prediction System
  STEM-II air pollution code
– Integration of distributed databases
– Data mining applied to downscaling weather forecasts

11 Key Features of CrossGrid Applications
– Data: data sources and databases geographically distributed; to be selected on demand
– Processing: large processing capacity required, both HPC and HTC; interactive
– Presentation: complex data requires versatile 3D visualisation; support for interaction and feedback to other components

12 Overview of the CrossGrid Architecture
Layered architecture diagram (duplicated box labels removed):
– Applications: 1.1 BioMed, 1.2 Flooding, 1.4 Meteo Pollution, accessed through the 3.1 Portal & Migrating Desktop
– Supporting tools / applications development support: 2.2 MPI Verification, 2.3 Metrics and Benchmarks, 2.4 Performance Analysis
– Application-specific and generic services: 1.1 Grid Visualisation Kernel, 1.1 User Interaction Services, 1.1/1.2 HLA and others, 1.3 Data Mining on Grid (NN), 1.3 Interactive Distributed Data Access, 3.1 Roaming Access, 3.2 Scheduling Agents, 3.3 Grid Monitoring, 3.4 Optimization of Grid Data Access, MPICH-G, GRAM, GSI, Replica Catalog, GIS / MDS, GridFTP, Globus-IO, Globus Replica Manager, DataGrid Replica Manager, DataGrid Job Submission Service
– Fabric: Resource Manager (CE) with CPUs, Resource Manager (SE) with secondary storage, resource managers for instruments (satellites, radars), tertiary storage, Replica Catalog, 3.4 Optimization of Local Data Access

13 Tool Environment
Diagram of the tool environment: Grid Monitoring (Task 3.3) delivers raw monitoring data (RMD) from applications (WP1) executing on the Grid testbed to the G-PM performance measurement component; performance measurement data (PMD) flow to the high level analysis, performance prediction and user interface / visualization components; benchmarks (Task 2.3) and the application source code are brought in by manual information transfer. Legend: RMD – raw monitoring data, PMD – performance measurement data.

14 Tools Environment and Grid Monitoring
Diagram: applications and Portals (3.1), the G-PM performance measurement tools (2.4), MPI debugging and verification (2.2), and metrics and benchmarks (2.4) all rely on Grid Monitoring (3.3) (OCM-G, R-GMA).
The application programming environment needs information from the Grid about the current status of applications and must be able to manipulate them.

15 Monitoring of Grid Applications
– To monitor = to obtain information on or to manipulate the target application, e.g. read the status of the application's processes, suspend the application, read / write its memory, etc.
– A monitoring module is needed by tools such as debuggers, performance analyzers, visualizers, ...

16 CrossGrid Monitoring System

17 Concept of Grid Application Monitoring
– OCM-G = Grid-enabled OMIS-Compliant Monitor
– OMIS = On-line Monitoring Interface Specification
– Application-oriented: provides information about running applications
– On-line: information is collected at runtime and immediately delivered to consumers
– Information is collected via instrumentation that is activated / deactivated on demand; the information of interest is defined at runtime, which keeps the monitoring overhead low (see the sketch below)
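The on-demand instrumentation mentioned in the last bullet can be illustrated with the standard MPI profiling interface: a wrapper intercepts the library call and records it only while monitoring is switched on, so inactive instrumentation costs almost nothing. This is a minimal sketch of the principle only, not OCM-G source code; the enable flag and the counter below are hypothetical.

/* Minimal sketch of on-demand instrumentation via the MPI profiling
 * interface (PMPI). Not OCM-G code: the flag and counter are hypothetical
 * and would be toggled by monitoring requests at runtime. */
#include <mpi.h>

static volatile int monitoring_enabled = 0;  /* switched on/off on demand */
static long mpi_send_count = 0;              /* e.g. backing a counter object */

int MPI_Send(void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    if (monitoring_enabled)
        mpi_send_count++;                    /* record the event only when active */
    return PMPI_Send(buf, count, datatype, dest, tag, comm);  /* real send */
}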

18 Monitoring – Autonomous System
– Separate monitoring system
– Tool / monitor interface: OMIS

19 Why OMIS?
– Universal, generic interface supporting different tools
– Can be extended with new grid-oriented functionality
– Fits the GGF's Grid Monitoring Architecture (GMA); e.g. the event–action paradigm enables a data-subscription scenario

20 Very Short Overview of OMIS
– Target system view: a hierarchical set of objects – nodes, processes, threads; for the Grid, a new object type is added – sites; objects are identified by tokens, e.g. n_1, p_1, etc. (see the example below)
– Three types of services: information services, manipulation services, event services
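For example (the tokens below are only illustrative), the extended hierarchy can be pictured as:

s_1 (site)
  n_1 (node)
    p_1 (process)
      t_1, t_2 (threads)
    p_2 (process)
  n_2 (node)

Every object is addressed in monitoring requests through its token.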

21 OMIS Services
– Information services obtain information on the target system, e.g. node_get_info = obtain information on nodes in the target system
– Manipulation services perform manipulations on the target system, e.g. thread_stop = stop specified threads
– Event services detect events in the target system, e.g. thread_started_libcall = detect invocations of specified functions
– Information + manipulation services = actions

22 OMIS Requests
Services are combined into two types of monitoring requests:
– Unconditional requests: executed immediately, and only once
– Conditional requests: execute actions whenever the specified event occurs, so the actions can be executed multiple times

23 OMIS Unconditional Requests
Example: :thread_stop(t_1) – an action (thread_stop) with its operand (t_1) = stop thread t_1

24 OMIS Conditional Requests
Example: thread_started_libcall(t_1, "MPI_Send"): counter_inc(c_1) – an event (with operands) followed by an action = whenever thread t_1 invokes MPI_Send, increment counter c_1
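From the tool's side, both request types are plain strings handed to the monitor over the OMIS interface. The sketch below only illustrates this; ocmg_send_request() is a hypothetical stand-in for the real client interface, not part of the published OMIS/OCM-G API.

/* Hedged sketch of a tool issuing the two request types shown above. */
#include <stdio.h>

/* Hypothetical client call: sends one OMIS request string to the monitor
 * and returns a reply string (here only echoed). */
static const char *ocmg_send_request(const char *request)
{
    printf("-> %s\n", request);
    return "ok";
}

int main(void)
{
    /* Unconditional request: executed immediately and only once. */
    ocmg_send_request(":thread_stop(t_1)");

    /* Conditional request: registered once; its action runs on every
     * occurrence of the event. */
    ocmg_send_request(
        "thread_started_libcall(t_1, \"MPI_Send\"): counter_inc(c_1)");
    return 0;
}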

25 New OMIS Services for the Grid (1/3)
1. Services related to the new object "site":
   site_attach – attach to a site
   site_get_info – return information on a site
   site_get_nodelist – return the list of nodes of a site
2. Services for application-related metrics:
   hardware_read_counter – return the value of a hardware performance counter

26 New OMIS Services for the Grid (2/3)
3. Services for infrastructure-related metrics:
   network_get_info – return information on a network connection
4. Benchmark-related services:
   benchmark_get_result – return the result of a benchmark
   benchmark_execute – execute a benchmark

27 New OMIS Services for the Grid (3/3)
5. Services for application handling:
   app_attach – attach to an application
   app_attach2 – attach to an application
   app_get_list – get the list of running applications
   app_get_proclist – return the process list of an application
6. Services related to probes:
   thread_executes_probe – event: a probe has been executed
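Purely as illustration, the new services plug into the same request notation as before; the parameter lists below are guessed for the sake of the example, not taken from the specification:

:site_get_nodelist(s_1) = return the list of nodes of site s_1
:app_get_proclist(a_1) = return the process list of application a_1
thread_executes_probe(t_1, "probe_A"): counter_inc(c_1) = increment counter c_1 each time thread t_1 passes probe probe_A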

28 Grid-enabled OMIS-Compliant Monitor
– Features: a permanent Grid service; external interface: OMIS
– Architecture: two types of components – Service Managers and Local Monitors

29 Components of the OCM-G
– Service Managers: one per site in the system; permanent; request distribution and reply collection (sketched below)
– Local Monitors: one per [node, user] pair; transient (created and destroyed when needed); handle local objects; actual execution of requests
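The distribution role of a Service Manager can be sketched as follows. This is not OCM-G source code: the token-to-node table, the schematic request syntax and forward_to_lm() are placeholders for the real localization and communication modules.

/* Hedged sketch: a Service Manager groups the target tokens of one request
 * by the node that owns them and forwards a partial request to the Local
 * Monitor on each node. All names and data below are illustrative. */
#include <stdio.h>
#include <string.h>

struct target { const char *token; const char *node; };

/* Hypothetical localization data (in the OCM-G this would come from the
 * internal name service). */
static const struct target targets[] = {
    { "p_1", "n_1" }, { "p_2", "n_1" }, { "p_3", "n_2" },
};
static const size_t ntargets = sizeof targets / sizeof targets[0];

static void forward_to_lm(const char *node, const char *partial_request)
{
    /* Stub for the communication module: send one partial request to the
     * Local Monitor responsible for this node. */
    printf("LM on %s <- %s\n", node, partial_request);
}

int main(void)
{
    const char *seen[16];
    size_t nseen = 0, i, j, k;

    for (i = 0; i < ntargets; i++) {
        for (j = 0; j < nseen; j++)              /* node already handled? */
            if (strcmp(seen[j], targets[i].node) == 0) break;
        if (j < nseen) continue;
        seen[nseen++] = targets[i].node;

        char partial[128] = ":proc_get_info(";   /* illustrative action */
        for (k = 0; k < ntargets; k++)
            if (strcmp(targets[k].node, targets[i].node) == 0) {
                if (partial[strlen(partial) - 1] != '(')
                    strcat(partial, " ");
                strcat(partial, targets[k].token);
            }
        strcat(partial, ")");
        forward_to_lm(targets[i].node, partial);
    }
    return 0;
}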

30 Monitoring Environment
– OCM-G components: Service Managers, Local Monitors
– Application processes
– Tool(s)
– External name service for component discovery

31 OCM-G – Unconditional Requests
– Immediate response from the OCM-G

32 OCM-G – Conditional Requests
Two stages:
1. Request registration (messages 1 – 1.2.2 in the diagram)
2. Request execution whenever the event occurs (messages 2 – 2.3.1 in the diagram)

33 OCM-G – SM and LM Modules
– Core: initialization of the OCM-G components; initial preprocessing of all messages

34 OCM-G – SM and LM Modules
– Communication: uniform interface for component-to-component communication

35 OCM-G – SM and LM Modules
– Internal localization: internal name service; tokens

36 OCM-G – SM and LM Modules
– External localization: uniform access to external information services

37 OCM-G – SM and LM Modules
– Services: implementation of the OMIS services

38 OCM-G – SM and LM Modules
– Request management: analysis and distribution of OMIS requests; reply handling

39 OCM-G – SM and LM Modules
– Application context: represents information about applications

40 OCM-G – SM and LM Modules
– User: user management; authentication and authorization

41 OCM-G – SM and LM Modules
– Application module: the part of the OCM-G linked into the application

42 Security Issues
– The OCM-G components handle multiple users, tools and applications, so it is possible to issue a fake request (e.g. posing as a different user); authentication and authorization are needed
– LMs are allowed to perform manipulations, so an unauthorized user could do anything

43 Security – Solutions
– LMs are user-bound: they run as user processes, so security is ensured by OS mechanisms
– Service Managers are permanent: they run as unprivileged processes (nobody); the user's Grid identity is checked internally (partial security); Grid certificates for users, tools and SMs will be incorporated (full security)

44 Status
– OCM implementation for clusters
– Software requirements specification
– OMIS extensions for the Grid
– OCM-G concept and object-oriented design
– First prototype due in December 2002
– Will be available under a public software licence
– More: www.eu-crossgrid.org

