Data Acquisition Graham Heyes May 20th 2011

Outline
Online organization - who is responsible for what.
Resources - staffing, support group budget and resources.
Requirements - scope of 12 GeV data acquisition.
Plan - technology and concepts.
How the plan meets the requirements.
Online farms.
Experiment control.
Status - front end and electronics.
Status - back end.
The path forward.
Summary.

Organization
Hall online groups
– A - Alexandre Camsonne, B - Sergey Boyarinov, C - Brad Sawatzky, D - Elliott Wolin.
– Responsible for the trigger, hall-specific hardware and software, and the management and operation of hall online systems.
DAQ support group
– Staff:
  Manager - me.
  Front end systems and software - David Abbott and Bryan Moffit.
  Custom electronics - Ed Jastrzembski and William Gu.
  Data transport, experiment control and back end software - Carl Timmer and Vardan Gyurjyan.
– Responsibilities:
  Support resource for new development and implementation.
  Support resource for hall operation.
  Develop generic custom software and hardware.
  Develop experiment-specific custom software and hardware.

Resources
DAQ group
– $120k per year non-labor funding for R&D and operations support.
– R&D lab on the 1st floor of F-wing, "cluster" in the F-wing data center.
– Five DAQ group members have 15+ years of experience at JLab and a proven record.
– With the addition of two group members in 2010, the group is two deep in the three core areas. Prior to that hire we were only two deep in the back end, but most of the 12 GeV work is front end and electronics.
– The group is now sized for both 12 GeV implementation and operation.
– The dip in 6 GeV operation load matches the bump in 12 GeV construction.
Hall online groups
– The three 6 GeV halls have existing mature online groups and mature DAQ systems.
– Hall D has an online group that has existed for a few years and is led by a former DAQ group member.
– The hall online groups are appropriately sized for operation; support from the DAQ group and collaborators will provide manpower for construction.
– DAQ systems are funded by the halls.

12 GeV Requirements
A - Various short experiments with varying rates. Highest: 100 kByte events at 1 kHz = 100 MByte/s. This high rate running is two 60 day periods.
– The SOLID detector, planned for later running, has 1 kByte events at 500 kHz in the front end, but a level 3 software trigger is planned before storage.
B - 10 kHz and 10 kByte events = 100 MByte/s, ~2015.
C - Various rates, but the highest is 10 kHz and 4 kByte events = 40 MByte/s.
D - The detector and hardware trigger are capable of producing rates one to two orders of magnitude larger than those seen in 6 GeV Hall B running. A level 3 trigger is planned but is outside the 12 GeV scope. Initial running will be at low luminosity, tuned to maintain a manageable storage rate.
– Data spread over 50+ front end systems.
– Front end rate (requires level 3): 300 kHz, 20 kByte events = 6 GByte/s.
– Rate to storage: 10 kHz, 20 kByte events = 200 MByte/s - assumes an online farm.
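
The bandwidth figures above follow directly from event size times trigger rate. A minimal sketch (plain Java, not CODA code) that reproduces the quoted numbers:

```java
// Back-of-envelope check of the data rates quoted above; all inputs are the
// slide's own numbers, nothing here is a measurement or CODA code.
public class RateCheck {
    // kByte per event times triggers per second, converted to MByte/s
    static double rateMBps(double eventKB, double triggerHz) {
        return eventKB * triggerHz / 1000.0;
    }

    public static void main(String[] args) {
        System.out.printf("Hall A highest:    %5.0f MByte/s%n", rateMBps(100, 1_000));
        System.out.printf("Hall B:            %5.0f MByte/s%n", rateMBps(10, 10_000));
        System.out.printf("Hall C highest:    %5.0f MByte/s%n", rateMBps(4, 10_000));
        System.out.printf("Hall D front end:  %5.0f MByte/s%n", rateMBps(20, 300_000)); // = 6 GByte/s
        System.out.printf("Hall D to storage: %5.0f MByte/s%n", rateMBps(20, 10_000));
    }
}
```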

Plan to meet requirements
The DAQ system anatomy will be similar in each hall.
– High trigger rates: pipelined ADC and TDC electronics; new Trigger Supervisor (TS), Trigger Distribution (TD) and Trigger Interface (TI) boards; an updated data format to support pipelined readout; rewritten front-end software (ROC) running under Linux on Intel hardware.
– More complex systems of 50+ ROCs: new experiment control to manage more complex systems; a new inter-process messaging system, cMsg.
– High data rates: rewritten event transport to move data more efficiently in complex webs of components; distributed event building.
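
The point of the pipelined electronics and the rewritten ROC, expanded on in the next slide, is that triggered events buffer in the front-end modules and are read out in blocks, so readout no longer has to keep pace trigger by trigger. A conceptual sketch of that decoupling, using a plain Java queue as a stand-in for the module pipeline (the block size, queue depth and event count are arbitrary; none of this is CODA code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustration of how pipelined electronics decouple the trigger from readout:
// triggered events accumulate in the module pipeline (modeled as a bounded
// queue) and the ROC drains them in blocks, so readout no longer has to keep
// up trigger by trigger in real time.
public class PipelineSketch {
    static final int EVENTS = 100_000;   // arbitrary run length for the demo

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<int[]> pipeline = new ArrayBlockingQueue<>(4096);

        Thread trigger = new Thread(() -> {            // trigger + digitizers
            try {
                for (int evt = 0; evt < EVENTS; evt++) {
                    pipeline.put(new int[]{evt});      // raw data lands in the pipeline
                }
            } catch (InterruptedException ignored) { }
        });

        Thread readout = new Thread(() -> {            // ROC: block transfers over VME
            int read = 0;
            try {
                while (read < EVENTS) {
                    List<int[]> block = new ArrayList<>();
                    block.add(pipeline.take());        // wait for at least one event
                    pipeline.drainTo(block, 39);       // then take the rest of the block
                    read += block.size();              // a real ROC would format and ship it
                }
            } catch (InterruptedException ignored) { }
        });

        trigger.start(); readout.start();
        trigger.join();  readout.join();
    }
}
```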

How the plan meets the requirements
Pipelined electronics and trigger meet the trigger rate requirements in the front end by decoupling the trigger from the readout. Data rate requirements for individual readout controllers are still modest. The most exacting requirement is to read out the modules over VME fast enough not to introduce dead time. Since the readout is decoupled from the trigger it no longer has any real-time OS requirement, so we use Linux instead of VxWorks.
A data rate of 300 MByte/s, coupled with the need to reformat data and build events at that rate, is close to the "single box" limit. We use a "divide and conquer" method (see the staging sketch below):
– Staged event building, where one physical EB deals with data from only a few readout controllers. Each EB outputs to a second stage where each EB handles a few first-stage EBs.
– Parallel event building, where each stage of EBs can output to one or more EBs in the next stage.
– The EB becomes a network of Event Management Units (EMUs) that take data from a number of sources, process the data and send it to a number of destinations.
– The EMU can be customized and reused in various parts of the system.
– In the unlikely case where the output data rate is too large to store on a single device, CODA will manage multiple data streams to different disks.
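
A sketch of the staging arithmetic behind "divide and conquer": with roughly 50 readout controllers and an assumed fan-in of 8 inputs per event builder (both numbers illustrative), two stages suffice. Names are mine, not the CODA/EMU API:

```java
// Sketch of the staging arithmetic: with ~50 readout controllers and a fan-in
// of 8 inputs per event builder (both numbers assumed for illustration), two
// stages are enough.
public class EbStaging {
    public static void main(String[] args) {
        int sources = 52;   // readout controllers feeding the first stage (assumed)
        int fanIn   = 8;    // inputs one EB/EMU can merge at full rate (assumed)
        int stage   = 1;

        while (sources > 1) {
            int builders = (sources + fanIn - 1) / fanIn;   // ceil(sources / fanIn)
            System.out.printf("stage %d: %d builder(s), each merging up to %d inputs%n",
                              stage, builders, fanIn);
            sources = builders;   // this stage's outputs feed the next stage
            stage++;
        }
    }
}
```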

Online trigger and/or monitoring farm
Each of Halls A, B and D has a requirement for an online filter, if not on day one then at some later date:
– Monitor data quality in real time.
– Reject unwanted data.
– Keep the rate to storage down to reasonable levels.
We are working on a generic solution for all three halls (see the sketch below).
– A modified EMU will manage starting, stopping and monitoring online processes for monitoring or filtering data.
– An ET (Event Transport) system will distribute events between processes based on a user-definable algorithm and gather results or filtered events for storage.
Hall D will start with a prototype farm and implement a larger farm in stages. The infrastructure (power, cooling etc.) is part of the 12 GeV project scope in the counting house civil construction. Hall B was originally to have an online farm in the 6 GeV program that was never implemented; a farm for 12 GeV is in the early stage of planning. Halls A and C may also implement online farms, but they would be similar in size to the systems already used for monitoring in 6 GeV.
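
A conceptual sketch of the farm data flow: events fan out to worker processes, a user-definable predicate decides what is kept, and accepted events are gathered for storage. Plain queues stand in for the ET system here; this is not the ET API, and the pool size and filter are placeholders:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Predicate;

// Conceptual data flow for the online farm: events fan out to workers, a
// user-definable predicate decides what is kept, and accepted events are
// gathered for storage. Queues stand in for the ET system; not the ET API.
public class FilterFarmSketch {
    public static void main(String[] args) {
        BlockingQueue<byte[]> fromDaq   = new LinkedBlockingQueue<>();
        BlockingQueue<byte[]> toStorage = new LinkedBlockingQueue<>();
        Predicate<byte[]> keep = evt -> evt.length > 0;   // placeholder level-3 decision

        ExecutorService workers = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            workers.submit(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        byte[] event = fromDaq.take();    // distributed by the transport layer
                        if (keep.test(event)) {
                            toStorage.put(event);         // monitor and/or keep; else reject
                        }
                    }
                } catch (InterruptedException ignored) {
                    // shutdown requested
                }
            });
        }
        // A writer process would drain toStorage to disk; in the real system a
        // modified EMU would start, stop and monitor these workers.
        workers.shutdownNow();
    }
}
```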

Experiment control
A completely rewritten online control system will be used, with the following features:
– It can control more complex systems than the ROC-EB-ER chain of CODA 2.
– It can implement user-definable state machines, allowing more or less complex control of the system. For example, Hall A likes one big button marked "Start Run" while Hall B likes to perform extra steps and require operator input during startup (see the state machine sketch below).
– The system is distributed, which removes speed and complexity restrictions.
– The system can talk to other control systems, for example EPICS, to allow feedback loops and semi-autonomous control of test data runs.
– As a "spin off", the underlying framework AFECS, along with the cMsg message passing system used in CODA, has been used to develop CLARA, a distributed offline analysis framework for CLAS.
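
A minimal sketch of a user-definable run-control state machine. The state names are illustrative of CODA-style transitions rather than the actual AFECS configuration; the point is that one hall can chain every transition behind a single "Start Run" action while another inserts operator prompts between individual steps:

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Minimal user-definable state machine: the allowed-transition table is data,
// so each hall can wire the same states into a different operator workflow.
// State names are illustrative, not the AFECS API.
public class RunControlSketch {
    enum State { CONFIGURED, DOWNLOADED, PRESTARTED, ACTIVE, ENDED }

    static final Map<State, EnumSet<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.CONFIGURED, EnumSet.of(State.DOWNLOADED));
        ALLOWED.put(State.DOWNLOADED, EnumSet.of(State.PRESTARTED));
        ALLOWED.put(State.PRESTARTED, EnumSet.of(State.ACTIVE));
        ALLOWED.put(State.ACTIVE,     EnumSet.of(State.ENDED));
        ALLOWED.put(State.ENDED,      EnumSet.of(State.DOWNLOADED));
    }

    State state = State.CONFIGURED;

    void transition(State next) {
        if (!ALLOWED.getOrDefault(state, EnumSet.noneOf(State.class)).contains(next))
            throw new IllegalStateException(state + " -> " + next + " not allowed");
        state = next;
    }

    // "One big button": chain every step with no operator interaction.
    void startRun() {
        transition(State.DOWNLOADED);
        transition(State.PRESTARTED);
        transition(State.ACTIVE);
    }
}
```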

Put it all together

Other online requirements
System administration support.
Network bandwidth from the halls to the data center to support at least 600 MB/s: 300 from Hall D and 300 from the existing counting houses.
Disk buffer storage for one or two days of running (TB scale; see the sizing sketch below).
Support for offline databases of comparable scope to those already managed for the existing 6 GeV halls, scaled up by the addition of Hall D (4/3).
Support for remote access to control systems and data.
Control of magnets, beam-line components, targets and other hardware:
– EPICS for most systems - help from the Accelerator controls group.
– CODA experiment control (AFECS) for other systems.
– Some LabVIEW and Windows software for commercial systems - may be layered under AFECS.
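
A quick sizing of the disk buffer from the 600 MB/s aggregate figure quoted above; this shows the arithmetic only, not the capacity actually planned:

```java
// Disk buffer sizing from the 600 MB/s aggregate bandwidth figure above.
public class BufferSizing {
    public static void main(String[] args) {
        double aggregateMBps = 600;                     // all halls combined, from the slide
        double secondsPerDay = 24 * 3600;
        double tbPerDay = aggregateMBps * secondsPerDay / 1.0e6;   // MB -> TB
        System.out.printf("~%.0f TB per day, ~%.0f TB for two days%n",
                          tbPerDay, 2 * tbPerDay);      // roughly 52 and 104 TB
    }
}
```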

Status - Front end
Linux ROC code and module drivers for most DAQ hardware modules have been written and tested in the 6 GeV program.
Custom hardware - ADCs, TDCs, TD (trigger distribution) and TI (trigger interface) - has been designed and prototypes tested.
– A production run of 12 TI modules in summer 2011 will allow full-crate tests of pipelined readout and detector testbeds.
The TS is being designed and is expected to be ready in 6 months to 1 year. Each TD can fan the trigger out to up to 8 TI modules. The TS can fan the trigger out to up to 16 TDs, for 127 readout controllers in total (the 128th is reserved for the global trigger processor).
Self-configuring readout list software is being designed in conjunction with all halls (see the sketch below).
– Use an online database to store crate configuration. Can the database be populated automatically? Under discussion.
– The database will be used by the disentangler at a later stage of readout.
The event format is stabilizing with feedback from all four halls.
– It needs to handle entangled and disentangled data.
– It needs to handle the trigger and status information from up to 127 crates.
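
A hypothetical illustration of the self-configuring readout list idea: the ROC looks up its own crate in the configuration database (stubbed here as a map) and builds its readout sequence from the modules it finds. The schema, crate name and module list are invented for the example, not the database design under discussion:

```java
import java.util.List;
import java.util.Map;

// Hypothetical self-configuring readout list: the ROC queries a crate
// configuration database (stubbed as a Map) and builds its readout sequence
// from the modules registered for its crate.
public class SelfConfigSketch {
    record Module(int slot, String type) {}

    static final Map<String, List<Module>> CRATE_DB = Map.of(
        "halld-roc1", List.of(new Module(3, "FADC250"), new Module(4, "FADC250"),
                              new Module(10, "F1TDC"),  new Module(20, "TI")));

    public static void main(String[] args) {
        String myCrate = "halld-roc1";   // a real ROC would use its own hostname/ID
        for (Module m : CRATE_DB.getOrDefault(myCrate, List.of())) {
            System.out.printf("slot %2d: initialize %s and add it to the readout list%n",
                              m.slot(), m.type());
        }
    }
}
```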

Status - Back end
The EMU framework is written and is being integrated with the Linux ROC. Multi-stage EB and parallel EB modules for the EMU have been written and tested.
Libraries to read and write EVIO format data have been implemented in C, C++ and Java.
– Working with Hall D to generate files of simulated data that can be fed through the DAQ components. Hopefully what comes out = what goes in (see the check sketched below)!
Aim to test a parallel EB at Hall D rates on the DAQ group cluster by the end of 2011.
The ET system is complete and in use in the 6 GeV program. The cMsg messaging system is complete and in use in 6 GeV.
The event disentangler is at the design stage; a prototype is planned for FY12.
Experiment control:
– The main framework is complete and in use in the 6 GeV program.
– The same underlying technology is used in the CLARA offline framework, in use by Hall B.
– An SPR from Hall D covers the design and implementation of custom GUIs for Hall D.
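
A sketch of the "what comes out = what goes in" check, reduced to comparing digests of two files. The file names are placeholders, and a real check would compare event by event with the EVIO libraries, since the output is reformatted and built into physics events rather than byte-identical:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

// Pass-through check for simulated data fed through the DAQ chain, reduced to
// whole-file digests. File names are placeholders; a real comparison would
// iterate EVIO events, since built output is not byte-identical to the input.
public class PassthroughCheck {
    static String digest(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        return HexFormat.of().formatHex(md.digest(Files.readAllBytes(file)));
    }

    public static void main(String[] args) throws Exception {
        String in  = digest(Path.of("simulated_input.evio"));   // hypothetical file names
        String out = digest(Path.of("daq_output.evio"));
        System.out.println(in.equals(out) ? "PASS: output matches input"
                                          : "FAIL: data changed in transit");
    }
}
```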

The path forward
Much of the underlying software and hardware framework and technology has been designed, implemented and tested in the 6 GeV program. We have plans for testbeds and technology tests over the next two years, before online hardware needs to be procured and technology choices are locked down.
The responsibility for implementing the online systems, and the budget for materials, falls to the halls. R&D for custom hardware and software and non-generic support tasks are funded by the hall online groups, with work done by hall staff or via SPRs to the DAQ support group. Generic tasks that provide services shared by all four halls are funded from general physics operations.
– During FY12 through FY15, DAQ group funding is 50% Ops and 50% 12 GeV SPRs.
– The hall online groups are lean, but DAQ work in Halls B and D is staged so that the DAQ group can assist Hall D first and then Hall B.
Hall D has generated a task list in the Mantis tracking system for DAQ group staff, and there is funding beginning in FY11 for that work. Hall B is not at the same advanced stage as Hall D, but their DAQ work comes later.

Summary
I think that we are in good shape!
– Mature online groups in each hall.
– Much of the software and hardware components tested in 6 GeV running.
– A clear series of technology tests planned before final hardware procurements for online components such as processors and networking.
– Infrastructure already in place in the three existing halls and part of the 12 GeV scope for Hall D.
– Staffing and funding are adequate to meet the milestones between now and turn-on. This depends on the current timeline, with Hall B DAQ tasks falling after Hall D; a careful ballet is needed to avoid oversubscribing key people.
Six staff in the DAQ group gives us two people in each key area of expertise. That is certainly better than what we had two years ago but is not ideal. We recognize that the lab is running a lean staffing model.

A Blank Slide This slide is left deliberately blank to provide minimum entertainment while questions are asked.

Offline summary
Organization
– Offline data processing for each hall is "owned" by the hall collaboration as an integral part of each experiment approved by the PAC.
– At some level it is the responsibility of the experiment to assure that the resources required for data analysis exist. Converting what needs to be done into what is needed to do it is difficult. Knowing what is available as part of a shared resource requires coordination.
– Each hall has an offline coordinator.
– I am available as a technical resource to help translate requirements into hardware specifications. Requirements are given to SCI, who return a proposal with a cost and schedule. I confirm that the proposal meets the requirements and communicate that to the hall groups. I work with the division to set a budget that funds the proposal at some level. Given the proposal and funding, I communicate to the groups what can or can't be done. I manage the budget, which includes staging the work done by SCI so that we can react to changing requirements.

12 GeV requirement summary
The numbers presented are for 2015 and beyond; pre-2015 levels are down by a factor of 10 and covered by the scale of the existing analysis farm. The total represents the "worst case" with all four halls running at once, which probably won't happen.
The rough cost is based on a few assumptions:
– The halls are benchmarked against 2011-vintage 64-bit processors. In four years we assume a performance per dollar three times better than today's $160 per core.
– Assume LTO6 tapes at 3 TB and $60 per tape.
– Assume $500 per terabyte of disk.
The 9,000 core figure for Hall D is conservative (maybe high). It includes multiple passes through the data and neglects any off-site simulation.

[Table: 2011 vintage cores; raw data, PByte/yr; all data including raw, PByte/yr; and disk, TB - by hall (A, B, C, D), with a total and a rough cost in k$. The numeric values did not survive the transcript.]
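
The unit costs behind the estimate, using only the assumptions stated above (a 3x improvement in performance per dollar on today's $160 per core, $60 LTO6 tapes holding 3 TB, $500/TB disk) plus the 9,000-core Hall D figure; the 1 PByte and 100 TB lines are purely illustrative quantities:

```java
// Unit-cost arithmetic from the stated assumptions. The 9,000-core number is
// the Hall D figure quoted in the text; the tape and disk quantities below are
// illustrative units, not the (lost) table values.
public class CostSketch {
    public static void main(String[] args) {
        double corePrice2015 = 160.0 / 3.0;   // ~$53 per 2011-equivalent core in 2015
        double tapePerTB     = 60.0 / 3.0;    // $20 per TB on LTO6
        double diskPerTB     = 500.0;

        System.out.printf("9,000 Hall D cores: ~$%.0fk%n", 9_000 * corePrice2015 / 1000);
        System.out.printf("1 PByte/yr to tape: ~$%.0fk/yr%n", 1000 * tapePerTB / 1000);
        System.out.printf("100 TB of disk:     ~$%.0fk%n", 100 * diskPerTB / 1000);
    }
}
```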

Concluding remarks
The total requirement for disk storage and compute nodes is a $1.25M+ investment; this does not include racks, cables, network hardware, labor for installation, etc.
– This is a larger number than originally estimated, but there are still uncertainties in the generation of the requirements. Time to process is based on existing code, and we expect optimization. The calculations contain integer multipliers, the "number of passes through the data", that are currently 2 or 3 but could well be lower. There is still uncertainty about what off-site resources collaborating institutions will provide.
– Each group does have a plan to refine these numbers. Effort in this direction is extremely important!
The current NP analysis cluster budget is approximately $150k, with $50k spent on each of cluster nodes, disk and "other". If we are to meet the current requirements for a cluster capable of handling the sustained load from the 12 GeV experiments, either this budget needs to be at least doubled for the next four years or a single investment made prior to FY15 running. Given the relative uncertainty in the requirements, it is likely that the best approach is to plan for a late one-time investment while still making an increased annual investment.
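
A quick check of the arithmetic implied by the comparison above (my reading, not stated on the slide): a doubled $150k annual budget over four years totals roughly the same as the quoted one-time investment:

```java
// Rough comparison: doubling the ~$150k/yr cluster budget for four years
// versus the one-time $1.25M+ figure quoted above.
public class BudgetSketch {
    public static void main(String[] args) {
        double annualNow = 150_000;
        double doubledOverFourYears = 2 * annualNow * 4;   // $1.2M total spend
        System.out.printf("Doubled budget over four years: $%.2fM vs one-time $1.25M+%n",
                          doubledOverFourYears / 1e6);
    }
}
```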