Discussions on group meeting


1 Discussions on group meeting
2013.5

2 Site Monitoring
Two kinds of monitoring are proposed:
“SAM test” monitoring
  Works just like the SAM tests
  Sends regular tests, collects and filters the results, and publishes them
  Makes it easy to see the status of critical services, e.g. CVMFS, PBS, SE
Ganglia-based monitoring
  Similar to ATLAS T3 monitoring
  Sets up local and global Ganglia monitoring, collects information and publishes it
  Makes it easy to see server status and total job numbers
Site information from both kinds of monitoring will be collected into one database and summarized on one dashboard-like web page
We need to decide which information is necessary:
  Service status: CE, SE, CVMFS
  Transfer status: channel, FTS
  Job numbers: production, analysis, tests
  CPU consumption, CPU efficiency
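The "collect and filter results" step and the CPU efficiency figure listed on this slide could be summarized along the following lines. This is only a minimal sketch: the record layout `(site, service, passed)`, the status labels, and the function names are assumptions made for illustration, not part of any existing SAM or DIRAC interface.

```python
from collections import defaultdict


def summarize(results):
    """Reduce raw test records to one status per (site, service).

    results: iterable of (site, service, passed) tuples (hypothetical layout).
    A service is reported OK only if every recorded test for it passed.
    """
    all_passed = defaultdict(dict)
    for site, service, passed in results:
        previous = all_passed[site].get(service, True)
        all_passed[site][service] = previous and bool(passed)
    return {
        site: {svc: "OK" if ok else "CRITICAL" for svc, ok in services.items()}
        for site, services in all_passed.items()
    }


def cpu_efficiency(cpu_seconds, wall_seconds):
    """CPU efficiency as CPU time divided by wall-clock time."""
    return cpu_seconds / wall_seconds if wall_seconds > 0 else 0.0
```

A dashboard page would then only need to render the dictionary returned by `summarize` next to the per-site job numbers.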

3 Similar to the LCG one

4 Further thoughts about “SAM tests” monitoring
DIRAC resource status system
  Similar functions, but not exactly what we want
  Does not send tests; only collects information from existing jobs
  Still in development and planning
Propose to establish our own system
  It does not seem too difficult, if someone can spend time on it

5 Preliminary designs of site monitoring
Develop based on the DIRAC framework
Diagram components: Monitor Agent, Configuration Service, Resources, MonitorDB, Command Line, Web Page
Diagram flows: get site info, send tests to sites, record test results to DB

6 Preliminary designs of site monitoring
Tests design
  Targets: CE, SE, CVMFS…
  CE and CVMFS tests are run as jobs
  SE tests are run by issuing gLite commands
Agents
  The Monitor Agent is responsible for getting site info from the DIRAC configuration service, sending tests, retrieving and filtering results, and updating the DB
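One cycle of the Monitor Agent described above (get the site list, send tests, filter results, update the DB) might look roughly like this. The sketch deliberately avoids the real DIRAC agent API: the site list stands in for a configuration-service lookup, the tests are plain callables, and a Python list stands in for MonitorDB. All names here are hypothetical.

```python
import time


def run_cycle(sites, tests, store, now=None):
    """Run every test against every site once and record the outcome.

    sites: iterable of site names (stand-in for a configuration-service query)
    tests: dict mapping a test name to a callable(site) returning True or False
    store: list receiving (timestamp, site, test, status) rows (stand-in for MonitorDB)
    """
    now = time.time() if now is None else now
    for site in sites:
        for name, test in tests.items():
            try:
                status = "OK" if test(site) else "FAILED"
            except Exception:
                # A test that crashes says nothing reliable about the site,
                # so it is filtered into a separate state.
                status = "UNKNOWN"
            store.append((now, site, name, status))
    return store
```

In a real agent, the CE and CVMFS callables would submit test jobs and the SE callable would wrap the storage commands; only the bookkeeping is shown here.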

7 Preliminary designs of site monitoring
DB
  MonitorDB, with a table SiteStatus to record site status
Commands
  bes-dirac-site-monitor --sitename --timerange
  By default, prints the latest site status
  Interacts with the DB interface to get the status
Web
  DiracWeb is in a migration period to Tornado; better to consider this later
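To make the table and command above more concrete, here is a small sketch using SQLite and `argparse`. The slide names only the table (`SiteStatus`) and the two options; the column names, the use of SQLite, and the interpretation of `--timerange` as a number of seconds to look back are assumptions for illustration.

```python
import argparse
import sqlite3

# Assumed columns; checked_at is the Unix time of the test.
SCHEMA = """CREATE TABLE IF NOT EXISTS SiteStatus (
    site TEXT NOT NULL,
    service TEXT NOT NULL,
    status TEXT NOT NULL,
    checked_at INTEGER NOT NULL
)"""


def query_status(conn, sitename=None, since=None):
    """Return status rows; without a time range, only the latest row per service."""
    sql = "SELECT site, service, status, checked_at FROM SiteStatus AS s"
    conditions, params = [], []
    if sitename is not None:
        conditions.append("s.site = ?")
        params.append(sitename)
    if since is not None:
        conditions.append("s.checked_at >= ?")
        params.append(since)
    else:
        # Default behaviour: keep only the most recent record of each service.
        conditions.append(
            "s.checked_at = (SELECT MAX(checked_at) FROM SiteStatus "
            "WHERE site = s.site AND service = s.service)"
        )
    sql += " WHERE " + " AND ".join(conditions)
    sql += " ORDER BY s.site, s.service, s.checked_at"
    return conn.execute(sql, params).fetchall()


def parse_args(argv=None):
    """Parse the two options named on the slide."""
    parser = argparse.ArgumentParser(prog="bes-dirac-site-monitor")
    parser.add_argument("--sitename", help="restrict output to one site")
    parser.add_argument("--timerange", type=int,
                        help="look back this many seconds (assumed unit)")
    return parser.parse_args(argv)
```

Running with no options would correspond to `query_status(conn)`, which returns the latest status of every service at every site.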

8 BESIII data transfer
Two transfer protocols are added
  DIRACFTS (dirac-dms-fts-submit)
    dirac-dms-fts-submit is not well coded and not easy to debug; it needs to be fixed if we have time
  DIRACDMS (dirac-dms-replicate-lfn)
Testing
  Preliminary tests are successful with the two modes
  Dataset created -> transfer request created -> transfer status can be followed -> transfer errors are shown
  Error logs still need to be improved

9 BESIII data transfer
Currently no good channels are available
  The Dubna SE is in downtime; the USTC and IHEP SEs need to be tuned
  Going to use IHEP and IHEPD for testing; a certain volume of transfer tests needs to be done
Accounting
  Update transfer info to the central DIRAC accounting system
  DIRACFTS accounting is available but not correct; it needs to be fixed
  DIRACDMS accounting is not available; it needs to be added
  Do we do it ourselves, or ask DIRAC to fix it?
  More and more small fixes need to be made inside DIRAC; we need to find a regular procedure for that

10 BESIII data transfer
More functions are needed
  An option needs to be introduced to switch between the two transfer types
  The transfer type used needs to be recorded in the DB for each request
  Functions such as cancelling requests need to be introduced
  Need to consider using datasets defined in badger
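The three functions requested on this slide (a switch between the two transfer types, recording the type with each request, and cancelling a request) could be modelled as below. This is an illustrative sketch only: the class, field, and status names are hypothetical and do not reflect the existing BESDIRAC transfer tables.

```python
from dataclasses import dataclass
from enum import Enum


class TransferType(Enum):
    DIRACFTS = "DIRACFTS"
    DIRACDMS = "DIRACDMS"


@dataclass
class TransferRequest:
    dataset: str
    src_se: str
    dst_se: str
    transfer_type: TransferType  # stored with the request so it can be audited later
    status: str = "new"

    def cancel(self):
        """Cancel the request unless it has already reached a final state."""
        if self.status in ("finished", "failed", "cancelled"):
            return False
        self.status = "cancelled"
        return True


def make_request(dataset, src_se, dst_se, protocol="DIRACDMS"):
    """Create a request; an unknown protocol name raises ValueError."""
    return TransferRequest(dataset, src_se, dst_se, TransferType(protocol))
```

Keeping the type on the request itself means the accounting and status tools can tell afterwards which protocol moved a given dataset.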

11 BESDIRAC
An extension to DIRAC
  More and more BESIII-specific extensions are coming
  We definitely need an extension
How to manage and maintain the extension
  Need a new release for server and client
  Need someone to look into it
  If we simply add extra packages locally, there would be problems with pilot jobs during software download
  We have set up a development environment on bager01
  Dubna needs one too
  Use Git for code management? This would be consistent with the DIRAC development environment

12 UMN site
The UMN site is going to have an SE for BES
  Good news!
  Currently they are working on joining the SE to the BES VO
  Their SE type is BeStMan
  We do not seem to have tried adding a BeStMan SE to the BES VO before
  Documentation for that is not available
  Someone needs to look into it if they need help

13 Virtual sites
PBS clusters are set up over virtual resources
  WHU is using VirtualBox
  NSCCSZ is using KVM
  It is easy to add new nodes and extend the cluster
    Use images generated from an existing node
    Light configuration and checks can be applied to a VM after booting to bring all the necessary services up and running
  Virtual sites are working well as normal DIRAC cluster sites

14 Virtual sites(2)

15 Virtual sites
Advantages:
  Sites don’t need to change their base OS
  Clusters are easy to set up and extend with virtual images
Expected improvements:
  Virtual sites are expected to provide a cloud resource management platform (e.g. OpenStack) and an API for creating and deleting VMs
  DIRAC has good support for some well-known resource management platforms such as OpenStack, CloudStack and OpenNebula
  The size of the virtual resources can then vary with the number of jobs in real time
  In this way resource usage is more flexible and efficient
  Currently resources are relatively static and VM set-up is done by hand
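The idea that the size of the virtual resources should follow the number of jobs can be illustrated with a simple sizing rule. This is a sketch under stated assumptions: the function names, the fixed number of job slots per VM, and the limits are all hypothetical, and the actual create/delete calls to a cloud platform are left out.

```python
import math


def target_vm_count(active_jobs, slots_per_vm, min_vms=0, max_vms=10):
    """Number of VMs needed for the current jobs, clamped to the site limits."""
    if slots_per_vm <= 0:
        raise ValueError("slots_per_vm must be positive")
    needed = math.ceil(active_jobs / slots_per_vm)
    return max(min_vms, min(max_vms, needed))


def scaling_action(current_vms, target_vms):
    """Translate the difference into a create or delete request."""
    delta = target_vms - current_vms
    if delta > 0:
        return ("create", delta)
    if delta < 0:
        return ("delete", -delta)
    return ("none", 0)
```

For instance, 9 active jobs with 4 slots per VM call for 3 VMs; a site currently running 2 would create one more.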

