Development of test suites for the certification of EGEE-II Grid middleware Task 2: The development of testing procedures focused on special details of.


1 Development of test suites for the certification of EGEE-II Grid middleware
Task 2: Development of testing procedures focused on special details of various software features
Task 4: Creation of the specialized testbed for developing test suites
Task 5: Preparation of intermediate and final reports
PNPI – Yu. Ryabov, N. Klopov

2 Plans for the second year
1. Development of stress and performance tests for WMS and CE in accordance with requests from developers and/or the certification team
2. Installation of the new gLite 3.1 middleware on the testbed

3 Requirements for the test
1. Submit a large number of jobs simultaneously.
2. Submit jobs from one or many users.
3. Monitor the load on the CE and WMS during testing.
4. Monitor job status (passage through the system's components) on the CE and WMS during testing.
5. Store status information for all submitted jobs.
6. Allow express visual analysis of the results.

4 Functional schema of the test
[Diagram. Labels: Monitoring CE, Monitoring WMS, Job submitter, UI, Parametric job ... Parametric job, Data collector, Jobs logging info, Monitoring data.]

5 Jobs submission
The job submission program (several scripts) has the following input parameters:
-u  the number of users
-x  path to the directory with the users' proxy certificates (x1: path to the user proxy certificate)
-n  the number of subjobs from each user
-s  time interval between job status requests
-t  maximum time of the test execution
-a  the time each subjob will execute on the WN
-l  path to the logfile
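The parameter list above could be parsed as in the following minimal Python sketch. It is not the original submission scripts: the destination names (`users`, `proxy_dir`, and so on), the assumption that all times are in seconds, and the example paths are hypothetical and only illustrate the shape of the interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the flags listed on the slide (names are illustrative)."""
    parser = argparse.ArgumentParser(description="Stress-test job submitter (sketch)")
    parser.add_argument("-u", dest="users", type=int, required=True,
                        help="number of users")
    parser.add_argument("-x", dest="proxy_dir", required=True,
                        help="directory with the users' proxy certificates")
    parser.add_argument("-n", dest="subjobs", type=int, required=True,
                        help="number of subjobs from each user")
    # Units are not stated in the slides; seconds are assumed here.
    parser.add_argument("-s", dest="status_interval", type=float, required=True,
                        help="interval between job status requests")
    parser.add_argument("-t", dest="max_time", type=float, required=True,
                        help="maximum time of the test execution")
    parser.add_argument("-a", dest="subjob_time", type=float, required=True,
                        help="time each subjob executes on the WN")
    parser.add_argument("-l", dest="logfile", required=True,
                        help="path to the logfile")
    return parser


# Example invocation with hypothetical values: 3 users, 1000 subjobs each.
args = build_parser().parse_args(
    ["-u", "3", "-x", "/tmp/proxies", "-n", "1000",
     "-s", "60", "-t", "7200", "-a", "30", "-l", "/tmp/test.log"]
)
total_subjobs = args.users * args.subjobs
```

With these example values the test would submit `total_subjobs` = 3000 subjobs in total.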

6 Monitoring
These scripts run on the CE and WMS; they collect and save information about the load average and the names of system processes. The script runs with the following parameters:
-t  poll time
-l  request for load average
-p  request for process names
Load average is approximately the number of active processes (as defined in UNIX).
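A monitoring loop of this kind can be sketched in Python as follows. This is an illustration, not the original scripts: the function names and the record layout are hypothetical, `os.getloadavg()` is available on Unix-like systems only, and reading process names from `/proc/<pid>/comm` is a Linux-specific assumption.

```python
import os
import time


def read_load_average() -> float:
    """Return the 1-minute load average reported by the OS (Unix only)."""
    return os.getloadavg()[0]


def read_process_names() -> list[str]:
    """Return the names of current processes, read from /proc (Linux only)."""
    names = []
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open(f"/proc/{entry}/comm") as fh:
                    names.append(fh.read().strip())
            except OSError:
                continue  # the process exited between listdir and open
    return names


def monitor(poll_seconds: float, samples: int,
            want_load: bool = True, want_procs: bool = False) -> list[dict]:
    """Collect `samples` records, sleeping `poll_seconds` between them."""
    records = []
    for i in range(samples):
        record = {"time": time.time()}
        if want_load:
            record["load"] = read_load_average()
        if want_procs:
            record["procs"] = read_process_names()
        records.append(record)
        if i < samples - 1:
            time.sleep(poll_seconds)
    return records


# Two quick samples of the load average only.
records = monitor(poll_seconds=0.0, samples=2)
```

The `want_load` and `want_procs` switches play the role of the -l and -p requests, and `poll_seconds` that of -t.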

7 Data collector
The Data collector script is executed after all jobs have finished and does the following:
- copies monitoring data from the WMS and CE;
- requests the event time information for each subjob, using the glite-wms-job-logging-info command;
- performs preliminary data processing (formatting).

8 Parametric job
Parametric job functionality was used to solve the problem of simultaneously submitting a large number of jobs to the CE. A parametric job is a set of jobs (subjobs) with the same description apart from the values of the parametric attributes.
    JobType = "Parametric";
    Executable = "tst.sh";
    InputSandbox = {"tst.sh", "input_PARAM_.txt"};
    StdOutput = "out_PARAM_.txt";
    StdError = "err_PARAM_.txt";
    OutputSandbox = {"out_PARAM_.txt", "err_PARAM_.txt"};
    Parameters = 1000;
    ParameterStart = 0;
    ParameterStep = 1;
Parametric attributes take values from 0 to 999. The WMS creates an individual subjob for each parameter value, so N = (Parameters - ParameterStart) / ParameterStep subjobs are created. Both the main parametric job and its subjobs have unique IDs.

9 Testbed
gLite 3.1 middleware was installed on the testbed:
CE
WMS + LB + BDII
UI
WN

10 Test usage
Measurement of “load average” as a function of time under the following condition: N jobs from each of K users.
Test usage at PNPI: 1000 jobs from each user (1 user, 3 users) for the “old” and “new” versions of gLite:
Old: in use until January 2008
New (with marshal patches): in use since January 2008
The new version with marshal patches was released to production on 10 April 2008 (gLite update 23).
The marshal patches were developed by A. Kiryanov (PNPI).

11 Marshal patches for LCG CE
The aim is to improve the behavior of LCG CEs under load by regulating requests from job managers (hence the term ‘marshal’):
1. Eliminate the need to recompile heavy Perl code on every job manager invocation
   Memory-persistent daemons handle the requests
2. Control the number of simultaneously running job manager queries
   Decreases load on the file system and batch system
3. Prevent CE overload by WMSes
4. Reduce system overhead (losses)
   Jobs complete faster, especially visible with a large number of short jobs
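The idea of limiting the number of simultaneously running queries can be illustrated with a generic semaphore-based gate in Python. This is not the marshal patches themselves (which the slide describes as memory-persistent daemons in front of Perl job managers); the class `QueryGate`, the limit of 3, and the `fake_query` stand-in are all hypothetical.

```python
import threading
import time


class QueryGate:
    """Allow at most `max_concurrent` queries to run at the same time."""

    def __init__(self, max_concurrent: int) -> None:
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of queries observed running at once

    def run(self, query):
        with self._slots:  # blocks while all slots are taken
            with self._lock:
                self._active += 1
                self.peak = max(self.peak, self._active)
            try:
                return query()
            finally:
                with self._lock:
                    self._active -= 1


def fake_query() -> str:
    """Stand-in for a batch-system query; it just sleeps briefly."""
    time.sleep(0.01)
    return "ok"


gate = QueryGate(max_concurrent=3)
threads = [threading.Thread(target=gate.run, args=(fake_query,)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Twenty callers arrive at once, but `gate.peak` never exceeds the configured limit, which is the effect item 2 aims for on the file system and batch system.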

12 CE monitoring

13

14 WMS monitoring

15 Express visual analysis (WEB viewer)
Each job passes through the different WMS components; the corresponding events are generated and stored in the LB. Examples of these events: “RegJob, NetworkServer”, “Match, WorkloadManager”, ..., “Done, LogMonitor”. This makes it possible to evaluate the performance of the WMS components. The WEB viewer was developed to give a visual representation of event timestamps for jobs running through the different components. The viewer provides the following functions:
- choose the event type to be sorted by timestamp value;
- choose the data file with logging info data;
- plot, for each job, the event time since job registration in the WMS;
- choose an additional event type (shown on the same graph);
- get and store the graph data as a text file for future analysis;
- get the ID and logging info data for the subjobs that lack the chosen events;
- view monitoring data.
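The core computation behind such a graph, the time of a chosen event relative to job registration, can be sketched in Python as below. The record layout `(job_id, event, source, timestamp)`, the function name `event_offsets`, the sample data, and the choice of “RegJob, NetworkServer” as the registration event are assumptions made for illustration; the actual viewer and its data format are not described in the slides.

```python
def event_offsets(events, chosen, registration=("RegJob", "NetworkServer")):
    """Return (sorted offsets, jobs lacking the chosen event).

    `events` is an iterable of (job_id, event, source, timestamp) tuples.
    Offsets are timestamps of `chosen` minus the job's registration time.
    """
    reg_time = {}
    chosen_time = {}
    for job_id, name, source, ts in events:
        key = (name, source)
        if key == registration:
            reg_time.setdefault(job_id, ts)
        if key == chosen:
            chosen_time.setdefault(job_id, ts)
    offsets = sorted(
        (chosen_time[job] - reg_time[job], job)
        for job in reg_time if job in chosen_time
    )
    missing = sorted(job for job in reg_time if job not in chosen_time)
    return offsets, missing


# Hypothetical logging data for three subjobs (timestamps in seconds).
events = [
    ("job-a", "RegJob", "NetworkServer", 100.0),
    ("job-b", "RegJob", "NetworkServer", 101.0),
    ("job-c", "RegJob", "NetworkServer", 102.0),
    ("job-a", "Match", "WorkloadManager", 112.0),
    ("job-b", "Match", "WorkloadManager", 106.0),
]
offsets, missing = event_offsets(events, chosen=("Match", "WorkloadManager"))
```

In this example `offsets` lists job-b (5 s) before job-a (12 s), and `missing` reports job-c, which never produced the chosen event.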

16 Express visual analysis
[Graph. Events shown: Transfer (source: LogMonitor, destination: LRMS) and Accepted (source: LogMonitor).]

17 Express visual analysis We can view the monitoring data

18 Summary
- The testbed was created with gLite 3.1.
- A complex test was developed which provides the following:
  - submission of a large number of jobs from many users
  - load average monitoring on the WMS and CE
  - data acquisition of the test results
- The developed test has been used on concrete sets of input parameters.
- An HTML viewer was developed for the presentation of test results.

19 Summary (first year of the grant)
A set of WMS tests (control of functionality) was developed at the request of the gLite certification team for the following types of jobs: parametric, interactive, checkpointable, partitionable.
A long and complex JDL stress test was developed (to estimate the critical file size).
Some of the tests were included in the certification SAM framework.
5 bugs were found and submitted to Savannah.

20 Conclusion
Task 2 (PNPI): done
Task 4 (PNPI): done
Task 5: in preparation (together with collaborating teams)

