1 dCache.ORG: dCache best practice. Patrick Fuhrmann, Wuppertal, 30.7.2009, Atlas GridKa Cloud. Additional funding, support or contributions by d-grid DGI II.

2 Content
➢ Data distribution issue: data distribution on data arrival; allowing automatic data replication; using manual redistribution tools (Migration Module)
➢ Best practice (components, queues, etc.)
➢ Examples by Andreas and Kay (DESY Zeuthen)
➢ Other business: Chimera; Postgres (optimizing, backup); Virtualization

3 Data replication and distribution
Confucius says: Get your data distributed properly.

4 Data replication and distribution
There are at least three ways of influencing the distribution and replication of data:
➢ Allow as many pools as possible when writing.
➢ Enable data replication (p2p transfers).
➢ Use the 'migration' module to get data redistributed.

5 Data replication and distribution: distribution on data arrival
Pool Manager parameter:
set cost decision -spacecostfactor=SC -cpucostfactor=CC
SC = 0 & CC = 0 : random selection of pools
SC > 0 & CC = 0 : empty pools will be filled first
SC = 0 & CC > 0 : the number of movers will be evened out across pools
SC > 0 & CC > 0 : something in between, but difficult to predict
With the Partition Manager:
pm set [default] -spacecostfactor=SC -cpucostfactor=CC
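The four SC/CC cases on slide 5 can be illustrated with a small sketch. It assumes that the Pool Manager combines the two costs linearly (factor times cost, summed) and picks the cheapest pool, choosing randomly among ties. The function name `select_pool` and the input layout are hypothetical and not part of dCache.

```python
import random


def select_pool(pools, space_cost_factor, cpu_cost_factor, rng=random):
    """Pick a pool name from `pools`, a dict of name -> (space_cost, cpu_cost).

    Assumption: total cost = SC * space_cost + CC * cpu_cost, lowest wins,
    ties are broken randomly. This is an illustration, not dCache code.
    """
    def total(costs):
        space_cost, cpu_cost = costs
        return space_cost_factor * space_cost + cpu_cost_factor * cpu_cost

    best = min(total(costs) for costs in pools.values())
    candidates = sorted(name for name, costs in pools.items()
                        if total(costs) == best)
    return rng.choice(candidates)


example_pools = {
    "pool-a": (5.0, 0.1),  # nearly full, but idle
    "pool-b": (1.0, 0.9),  # nearly empty, but busy
}
```

With SC > 0 and CC = 0 the emptier pool (`pool-b`) wins, with SC = 0 and CC > 0 the idle pool (`pool-a`) wins, and with both factors at zero every pool has the same total cost, so the choice is random.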

6 Data replication and distribution: automatic data replication (p2p)

7 Data replication and distribution
You can only replicate (p2p) files into a Link which is not a member of a LinkGroup.
[Diagram: a Link in a LinkGroup (Space Token) and a Link not in any LinkGroup, which receives the cached copy]

8 Data replication and distribution: BUT

9 Data replication and distribution
The same pool can be in both Links (the one inside the LinkGroup and the one outside it). This is not really supported, as it confuses the space calculation, but it works fine if you make sure the replicated files are 'cached'.
[Diagram: a Link in a LinkGroup (Space Token) and a Link not in any LinkGroup, which receives the cached copy]

10 Data replication and distribution: configuring p2p (PoolManager)
➢ Allow 'p2p on cost'.
➢ Set the link preference for p2p > 0 (normally this is the same as for read).
➢ Find the cost threshold above which p2p should be triggered.
In order to find a reasonable threshold you need to know the current pool costs.

11 The dCache cost magic
How do I get the pool cost (CPU): do some calculation
Each mover queue has three values:
A : active movers
W : waiting movers
m : maximum number of allowed movers
The cost of each queue is $(A + W)/m$ if $m > 0$, zero otherwise.
For the total cost of the pool you sum up all queues with $m > 0$ (queues: store, restore, p2p, p2pclient and movers):
$$\mathrm{Cost}_{\mathrm{pool}} = \sum_{\text{queues}\ (m > 0)} \left( \frac{A + W}{m} \right)_{\text{queue}}$$
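As a worked example of the calculation on slide 11, the following sketch computes the per-queue cost and the pool cost exactly as stated there. The function names and the sample queue values are made up for illustration.

```python
def queue_cost(active, waiting, max_movers):
    """Cost of one mover queue: (A + W) / m if m > 0, zero otherwise."""
    if max_movers > 0:
        return (active + waiting) / max_movers
    return 0.0


def pool_cost(queues):
    """Sum the queue costs; `queues` maps queue name -> (A, W, m)."""
    return sum(queue_cost(a, w, m) for (a, w, m) in queues.values())


# Hypothetical snapshot of one pool: (active, waiting, max) per queue.
sample_queues = {
    "store": (2, 0, 4),      # 0.5
    "restore": (1, 1, 2),    # 1.0
    "p2p": (0, 0, 0),        # m = 0, contributes nothing
    "movers": (5, 5, 100),   # 0.1
}
```

For this snapshot the pool cost is 0.5 + 1.0 + 0.1 = 1.6. The p2p queue is skipped because its maximum is zero.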

12 The dCache cost magic
How do I get the pool cost (CPU): command line interface
Log into the dCache command line interface:
ssh -l admin -c blowfish -p 22223 my-dcache-headnode
admin@my-dcache-headnode's password:
dCache Admin (VII) (user=admin)
[my-dcache-headnode] (local) admin > cd PoolManager
[my-dcache-headnode] (PoolManager) admin > cm ls -r
You get two lines per pool. A long line with all the queue information:
dcache-atlas06-09={R={a=0;m=0;q=0};S={a=0;m=0;q=0};M bla bla
And a short one with the calculation done:
dcache-atlas06-09={Tag={{hostname=dcache-atlas06}};size=0;SC=5.5;CC=0.55;}
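If you want to collect the costs in a script rather than read them by eye, the short line from `cm ls -r` can be parsed with a regular expression. The sketch below is based only on the single sample line shown on slide 12. The exact output format may differ between dCache versions, and `parse_cost_line` is a hypothetical helper.

```python
import re

# Matches the short per-pool line, e.g.
#   dcache-atlas06-09={Tag={{hostname=dcache-atlas06}};size=0;SC=5.5;CC=0.55;}
# Assumption: SC and CC are the last two fields before the closing brace.
_COST_LINE = re.compile(
    r"^(?P<pool>[^=\s]+)=\{.*?SC=(?P<sc>[-+0-9.eE]+);CC=(?P<cc>[-+0-9.eE]+);\}$"
)


def parse_cost_line(line):
    """Return (pool, space_cost, cpu_cost), or None for any other line."""
    match = _COST_LINE.match(line.strip())
    if match is None:
        return None
    return (match.group("pool"),
            float(match.group("sc")),
            float(match.group("cc")))
```

The long queue line for the same pool carries no `SC=`/`CC=` fields, so the function returns `None` for it and a caller can simply skip such lines.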

13 The dCache cost magic
How do I get the pool cost (CPU): get the latest GUI (1.6). Not supported yet.
Click on the bar to get the pool name(s).

14 The dCache cost magic
How do I set the pool to pool threshold?
Log into the dCache command line interface:
ssh -l admin -c blowfish -p 22223 my-dcache-headnode
admin@my-dcache-headnode's password:
dCache Admin (VII) (user=admin)
[my-dcache-headnode] (local) admin > cd PoolManager
[my-dcache-headnode] (PoolManager) admin > cm ls -r
(get the cost table here)
[my-dcache-headnode] (PoolManager) admin > set costcuts -p2p=0.55
or, if you are using the Partition Manager:
[my-dcache-headnode] (PoolManager) admin > pm set default -p2p=0.55

15 The dCache cost magic
● The optimal p2p values change over time.
● Therefore, you need to either:
✗ watch and re-tune the system frequently, or
✗ run a script doing this for you (Jon B. approach).
In one of the upcoming 1.9.4 releases we will offer a mechanism which does this tuning automatically. You would only have to specify a relative parameter: do p2p for all pools which report a cost above xx percent of the cost of the pool with the highest cost.
[Diagram: pool cost per pool, with the highest cost at 100 % and the threshold at nn %]
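The relative rule announced on slide 15 can be sketched in a few lines. This only illustrates the rule as worded on the slide, not the mechanism that later shipped in dCache. The function name and the sample costs are hypothetical.

```python
def pools_above_relative_cost(pool_costs, percent):
    """Return the pools whose cost exceeds `percent` of the highest cost.

    `pool_costs` maps pool name -> cost. These pools are the candidates
    for triggering p2p under the relative rule.
    """
    if not pool_costs:
        return []
    highest = max(pool_costs.values())
    threshold = highest * percent / 100.0
    return sorted(name for name, cost in pool_costs.items()
                  if cost > threshold)


sample_costs = {"pool-a": 1.0, "pool-b": 0.5, "pool-c": 0.2}
```

With a relative parameter of 40 percent, `pool-a` and `pool-b` qualify and `pool-c` does not. Because the threshold follows the highest cost, it does not need re-tuning when the overall load level changes.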

16 Data replication and distribution: manual redistribution of data, or the Migration Module

17 Data replication and distribution: the Migration Module (contributed by NDGF, Gerd)
Starting with 1.9.1:
➢ The Migration Module is a pool component.
➢ Steering is done by the command line interface only.
➢ It supports tasks, which copy a set of files between pools.
➢ The number of concurrent transfers can be configured.
➢ The selection of files to be copied is currently based on the Storage-Unit (x:y@osm) and the 'last access time'. Selection by total number of files and amount of data will follow in a future release.
➢ The destination of the copy can be an entire dCache link. The task tries to balance the transfers.
➢ There are one-shot tasks and permanent tasks.
➢ E.g. NDGF is using tasks to permanently replicate files to a second country.
➢ Please get more information from dCache, The Book.

18 Data replication and distribution: the Migration Module
A possible task could be:
➢ Permanently replicate all files
➢ belonging to x:y@osm
➢ and having been accessed within the last week
➢ from this pool
➢ to pools of a particular Link
➢ and make them cached!
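The file selection in the task on slide 18 (storage unit plus last access time) can be modelled with a short sketch. It illustrates the selection criteria only and does not use the Migration Module's command syntax or API. The `PoolFile` type and all field names are hypothetical.

```python
from dataclasses import dataclass

ONE_WEEK_SECONDS = 7 * 24 * 3600


@dataclass
class PoolFile:
    """Hypothetical record for one file on a pool."""
    file_id: str
    storage_unit: str   # e.g. "x:y@osm"
    last_access: float  # seconds since the epoch


def select_for_replication(files, storage_unit, now,
                           max_age=ONE_WEEK_SECONDS):
    """Keep files of the given storage unit accessed within `max_age`."""
    return [f for f in files
            if f.storage_unit == storage_unit
            and now - f.last_access <= max_age]


now = 1_000_000.0
sample_files = [
    PoolFile("A", "x:y@osm", now - 3600),               # recent, matching
    PoolFile("B", "x:y@osm", now - 8 * 24 * 3600),      # too old
    PoolFile("C", "other:unit@osm", now - 3600),        # wrong storage unit
]
```

Only file `A` passes both criteria. A permanent task would re-evaluate this selection as files arrive or are accessed, whereas a one-shot task would run it once.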

19 Other business: about Chimera
You should migrate to Chimera. Besides all the advantages mentioned plenty of times, ACL inheritance is only possible with Chimera. Without ACL inheritance, automatic directory creation (e.g. with SRM) can become painful.

20 Other business: about Postgres
Get the latest Postgres version and make yourself a database expert (up to a point).

21 About virtualization
We don't support dCache services on virtual machines.

22 Further reading
www.dCache.ORG

