Status of the LHCb MC production system. Andrei Tsaregorodtsev, CPPM, Marseille. DataGRID France workshop, Marseille, 24 September 2002.
Contents: Introduction; Production status; Data management; Job submission; Data production monitoring; Bookkeeping; DataGrid involvement status and plans; Conclusions.
Introduction: LHCb is one of the 4 LHC experiments, dedicated to studies of CP violation phenomena in the Beauty system; The experiment is currently reconsidering its general setup: The revised TDR is due in September 2003; It will be based on large-volume MC data; An adequate MC production system is being set up now.
Summer 2002 production = Data Challenge 1: In 48 days we produced 3.3 M events (6 TB of data); We have shown we can produce ~70 K events per day; We can expect to produce ~100 K events per day when all centres are operational; Usual job (sim+reco) of 500 events: 390 s/evt (i.e. 55 hours of CPU! More than the T class limit in CC).
            SICBMC simulation (s/evt)   Brunel reconstruction (s/evt)   RAWH Geant (KB/evt)   OODST Reco. (KB/evt)
Min bias    39                          28                              400                   94
BB incl.    188                         125                             900                   232
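The rates quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures from the slide; the function names are illustrative, and the per-event size assumes decimal units (1 TB = 10^6 MB), which the slide does not specify.

```python
# Figures quoted on the slide.
EVENTS_PRODUCED = 3.3e6        # events produced in Data Challenge 1
PRODUCTION_DAYS = 48           # duration of the production
DATA_VOLUME_TB = 6.0           # total data volume
EVENTS_PER_JOB = 500           # usual job size (sim + reco)
CPU_SECONDS_PER_EVENT = 390    # sim + reco cost per event


def events_per_day(events, days):
    """Average production rate over the whole period."""
    return events / days


def job_cpu_hours(events_per_job, seconds_per_event):
    """CPU time of one job, in hours."""
    return events_per_job * seconds_per_event / 3600.0


def mb_per_event(volume_tb, events):
    """Average event size; assumes 1 TB = 1e6 MB (an assumption)."""
    return volume_tb * 1e6 / events


rate = events_per_day(EVENTS_PRODUCED, PRODUCTION_DAYS)
hours = job_cpu_hours(EVENTS_PER_JOB, CPU_SECONDS_PER_EVENT)
size = mb_per_event(DATA_VOLUME_TB, EVENTS_PRODUCED)
print("%.0f events/day, %.1f CPU hours/job, %.2f MB/event" % (rate, hours, size))
```

The average rate comes out just under the ~70 K events per day quoted, and a 500-event job needs a little over 54 CPU hours, which the slide rounds to 55.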
Some Data Challenge 1 lessons: Thorough data quality checks on each step: Formal, based on log file analysis; Informal, based on a small analysis job run on the produced data; Crash traceback; Flexible workflows should be possible; Production centre dependencies should be as limited as possible; Bookkeeping: Maintaining integrity; Managing distributed replicas.
Next Data Challenge 2: MC production = Physics Data Challenge; Volume: DC2 = 10 x DC1; Available capacity seems to match requirements: ~1000 CPUs worldwide for 5 months; Planning for DC2: Production software ready by mid Nov 2002; Preproduction: mid Dec 2002 – mid Jan 2003; Production: Feb – May 2003.
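The statement that capacity matches requirements can be illustrated with a rough estimate. This is a sketch under several assumptions that are not on the slide: "10 x DC1" is read as ten times the 3.3 M events of DC1, every event costs the 390 s quoted for a usual DC1 sim+reco job, a month is 30 days, and the CPUs are used at 100% efficiency.

```python
DC1_EVENTS = 3.3e6             # from the DC1 summary
DC2_SCALE = 10                 # "DC2 = 10 x DC1", read as event count (assumption)
CPU_SECONDS_PER_EVENT = 390    # DC1 sim+reco figure, assumed unchanged for DC2
N_CPUS = 1000                  # "~1000 CPUs worldwide"
MONTHS = 5
DAYS_PER_MONTH = 30            # assumption
SECONDS_PER_DAY = 86400


def required_cpu_seconds(events, seconds_per_event):
    """Total CPU time needed to simulate and reconstruct all events."""
    return events * seconds_per_event


def available_cpu_seconds(n_cpus, days, efficiency=1.0):
    """Total CPU time offered by n_cpus over the given number of days."""
    return n_cpus * days * SECONDS_PER_DAY * efficiency


required = required_cpu_seconds(DC1_EVENTS * DC2_SCALE, CPU_SECONDS_PER_EVENT)
available = available_cpu_seconds(N_CPUS, MONTHS * DAYS_PER_MONTH)
load_fraction = required / available
print("required / available = %.3f" % load_fraction)
```

Under these assumptions the request fills almost exactly the available capacity, so any loss of efficiency would have to be absorbed by extra CPUs or a longer production period.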
Job submission [Diagram. Labels: Storage; Local production daemon; Production center; Bookkeeping DB; Data Production DB; CERN; Monitoring service; Castor; Job scripts; Production service; Bookkeeping service]
Local production daemon (at a production center): Customized for the particular center; Checks availability of the local resources; Gets job scripts from the Production service; Installs the necessary software if needed; Submits jobs; Updates job status in the Production DB; Checks the job output; Initiates data transfer to CERN/Castor; Updates the Bookkeeping database; Technology: Python; XML-RPC, can easily migrate to SOAP.
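The daemon's cycle listed above can be sketched in Python with the standard `xmlrpc.client` module. This is an illustration only, not the LHCb code: the remote method names (`get_job`, `set_status`, `add_replica`), the `site` interface and the job fields are all hypothetical, and the in-memory stand-ins exist only so that the sketch runs without a server.

```python
import xmlrpc.client


def connect(url):
    """Return an XML-RPC proxy. The URL and remote method names are assumptions."""
    return xmlrpc.client.ServerProxy(url, allow_none=True)


def run_cycle(production, bookkeeping, site):
    """Run one daemon pass; return ids of jobs whose output passed the checks."""
    done = []
    while site.free_slots() > 0:              # check local resources
        job = production.get_job(site.name)   # fetch the next job script
        if not job:                           # empty struct: nothing to run
            break
        site.install_software(job["software"])
        production.set_status(job["id"], "running")
        site.run(job["script"])               # blocking; a real daemon would poll the batch system
        if site.output_ok(job["id"]):
            location = site.transfer_to_castor(job["id"])
            bookkeeping.add_replica(job["id"], location)
            production.set_status(job["id"], "done")
            done.append(job["id"])
        else:
            production.set_status(job["id"], "failed")
    return done


# Dry run with in-memory stand-ins, so no server is needed.
class FakeProduction:
    def __init__(self, jobs):
        self.jobs, self.status = list(jobs), {}

    def get_job(self, site_name):
        return self.jobs.pop(0) if self.jobs else {}

    def set_status(self, job_id, status):
        self.status[job_id] = status
        return True


class FakeBookkeeping:
    def __init__(self):
        self.replicas = {}

    def add_replica(self, job_id, location):
        self.replicas[job_id] = location
        return True


class FakeSite:
    name = "test-centre"

    def free_slots(self):
        return 1

    def install_software(self, version):
        pass

    def run(self, script):
        pass

    def output_ok(self, job_id):
        return job_id != 2                    # pretend job 2 produced bad output

    def transfer_to_castor(self, job_id):
        return "/castor/example/job%d" % job_id


production = FakeProduction([
    {"id": 1, "script": "run1.sh", "software": "v1"},
    {"id": 2, "script": "run2.sh", "software": "v1"},
])
bookkeeping = FakeBookkeeping()
completed = run_cycle(production, bookkeeping, FakeSite())
print(completed, production.status)
```

Because `run_cycle` only relies on method calls, the stand-ins could be replaced by proxies returned from `connect(...)` without changing the loop, and the same structure would carry over to a SOAP client.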
Bookkeeping: Bookkeeping DB available via a Web service interface: XML-RPC server; ODBC-mediated persistent back-end (ORACLE, MySQL); Flexible schema: Allows easy addition of new data types and parameters; Handles distributed dataset replicas; Web-based user GUI is in the works.
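One common way to obtain a flexible schema of this kind is to store parameters as name/value rows, so that new parameters need no table changes. The sketch below shows that idea behind an XML-RPC interface; it is not the actual LHCb schema. It uses `sqlite3` as a stand-in for the ODBC back-end (ORACLE, MySQL) named on the slide, and the table layout, method names and port are assumptions.

```python
import sqlite3
from xmlrpc.server import SimpleXMLRPCServer


class Bookkeeping:
    """Dataset catalogue with free-form parameters and multiple replicas."""

    def __init__(self, path=":memory:"):
        # sqlite3 is a stand-in here; the slide names ORACLE and MySQL via ODBC.
        self.db = sqlite3.connect(path)
        self.db.executescript("""
            CREATE TABLE IF NOT EXISTS parameters (
                dataset TEXT, name TEXT, value TEXT,
                PRIMARY KEY (dataset, name));
            CREATE TABLE IF NOT EXISTS replicas (
                dataset TEXT, location TEXT,
                PRIMARY KEY (dataset, location));
        """)

    def set_parameter(self, dataset, name, value):
        """Attach or overwrite a parameter; values are stored as text."""
        self.db.execute("INSERT OR REPLACE INTO parameters VALUES (?, ?, ?)",
                        (dataset, name, str(value)))
        self.db.commit()
        return True

    def get_parameters(self, dataset):
        rows = self.db.execute(
            "SELECT name, value FROM parameters WHERE dataset = ?", (dataset,))
        return dict(rows.fetchall())

    def add_replica(self, dataset, location):
        """Register a copy of the dataset; duplicates are ignored."""
        self.db.execute("INSERT OR IGNORE INTO replicas VALUES (?, ?)",
                        (dataset, location))
        self.db.commit()
        return True

    def get_replicas(self, dataset):
        rows = self.db.execute(
            "SELECT location FROM replicas WHERE dataset = ? ORDER BY location",
            (dataset,))
        return [row[0] for row in rows]


def serve(bookkeeping, host="localhost", port=8000):
    """Expose the catalogue over XML-RPC (host and port are assumptions)."""
    server = SimpleXMLRPCServer((host, port), allow_none=True)
    server.register_instance(bookkeeping)
    server.serve_forever()


# Local usage without starting the server.
bk = Bookkeeping()
bk.set_parameter("dataset-1", "events", 500)
bk.set_parameter("dataset-1", "type", "min bias")
bk.add_replica("dataset-1", "castor:/example/dataset-1")
bk.add_replica("dataset-1", "lyon:/example/dataset-1")
print(bk.get_parameters("dataset-1"), bk.get_replicas("dataset-1"))
```

Storing every value as text keeps the sketch short; a production schema would more likely record a type alongside each parameter.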
DataGrid status and plans: Installation 1.2.2 operational; Long job problem fixed; Long file transfer problem (~1 GB); New production tools being installed; Test: (1) Run 500-event MC generation; (2) Store on SE; (3) Recover logs and histograms to CERN; (4) Run reconstruction; (5) Output to SE; (6) Recover log files and histograms; (7) Write reconstruction output to mass store (Castor); (8) Read Castor data with an analysis job outside the Grid.
Conclusions: Data Challenge 1 in summer 2002 showed the need to upgrade the production tools; Deployment of the new Data Management tools: Oct – Dec 2002, to support Data Challenge 2 production; DC2 in Dec 2002 – May 2003; DataGRID facilities will be used in DC2.