
1 DZERO Data Acquisition Monitoring and History Gathering. G. Watts, U. of Washington, Seattle. CHEP04 #449

2 Monitor Data Archiver
- Save selected data at 15-second intervals.
- Produce plots on the web, in real time, of any quantity vs. run, store, or date.
- Started as a one-person project! Not a great deal of data: raw data is approximately 20 GB/year.
- Built on the work of the DØ L3 Trigger/DAQ Group (a mature project at this point).
- The hope was a project written once and then forgotten (except in use!).
- The project is more interesting for its failures than for its successes…
Outline: introduction to the DØ L3 Trigger/DAQ and monitoring; design of the l3xHistoryViewer; the multiple incarnations (ROOT, Oracle, ROOT).

3 The DØ L3 Trigger/DAQ System
[Diagram: a typical collider multilevel trigger system. Level 1, Level 2, and Level 3/DAQ readout feed the tape archive; read-out crates feed a switch to farm nodes, coordinated by the Routing Master and Supervisor; a Monitor Server and Display attach to the system. See #477 (Chapin) for details.]
This project would not be possible without all the work of the DØ DAQ Group!

4 The Monitor Server
[Diagram: Monitor Server internals: Monitor Data Cache, Client Handler, Display Handler, Reply Builder, l3mq Web Application.]
- Typically > 150 clients and > 25 displays.
- A hole in the online firewall allows external monitor displays.
- All monitor data returned from clients is cached; displays may request data no older than a given age (see the sketch below).
- Heavily multithreaded C++ program; uses ACE as the communication library.
- The l3mq Web Application caches complex requests; web pages are used to alter the requests.
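
A minimal sketch of the max-age cache idea in plain C++ (the real server is multithreaded and uses ACE for communication; all names here are hypothetical, and the client request is a stub):

    #include <chrono>
    #include <map>
    #include <string>

    using Clock = std::chrono::steady_clock;

    struct CachedItem {
        std::string xml;            // last monitor data returned by the client
        Clock::time_point fetched;  // when it was cached
    };

    std::map<std::string, CachedItem> cache;

    // Stub: in the real server this would go back out to the client.
    std::string requestFromClient(const std::string& itemName) {
        return "<item name='" + itemName + "'>...</item>";
    }

    // A display asks for an item no older than maxAge; serve it from the
    // cache if the entry is fresh enough, otherwise re-query the client.
    std::string getItem(const std::string& itemName, std::chrono::seconds maxAge) {
        auto it = cache.find(itemName);
        if (it != cache.end() && Clock::now() - it->second.fetched < maxAge)
            return it->second.xml;  // fresh enough: no extra load on the client
        CachedItem fresh{requestFromClient(itemName), Clock::now()};
        cache[itemName] = fresh;
        return fresh.xml;
    }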

5 Monitor Server Communication
- All Monitor Server communication is done in XML.
- The name of a monitor item has three parts: data source type, machine name, and item name.
- The item data format is extensible; items are defined by the clients.
- Adding a new item is a matter of adding a new client and requesting the item (see the sketch below).
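
The talk does not show the actual XML schema; purely as an illustration, a three-part item name might be serialized like this (the element names and the example values are invented):

    #include <iostream>
    #include <string>

    // Build the XML for one monitor item name from its three parts.
    // The talk only says a name consists of a data source type, a
    // machine name, and an item name; the tags here are hypothetical.
    std::string itemNameXml(const std::string& sourceType,
                            const std::string& machine,
                            const std::string& item) {
        return "<monitor-item>"
               "<source-type>" + sourceType + "</source-type>"
               "<machine>" + machine + "</machine>"
               "<item>" + item + "</item>"
               "</monitor-item>";
    }

    int main() {
        // e.g. the event rate on one (hypothetical) farm node
        std::cout << itemNameXml("farm-node", "d0l3node42", "event-rate") << "\n";
    }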

6 The l3xHistoryViewer Data Flow
[Diagram: Monitor Server → l3mq cached monitor request → Data Collector → Data Store → Web Pages (Plots).]
- Archive monitor data 4 times per minute, ~4000 values each time.
- Generate plots on the fly for viewing on the web; must be quick (< 5 seconds).
- View by run number, store number, date…
- Little or no load on the DAQ/online system: allow stale monitor data to prevent re-queries.
- Easily change the archived monitor items; the l3mq web pages are used for this.

7 Data Collector
[Diagram: the Data Collector stage of the data flow above.]
- Parse the Monitor Server XML, generate the monitor item names, and save to the Data Store.
- Robust against Monitor Server failures and against various L3/DAQ machines crashing (see the sketch below).
- Written in C++ and/or C#, depending upon the version.
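
A plain-C++ sketch of the collection loop, assuming the 15-second cadence from the slides; the fetch, parse, and store functions are trivial stubs, and the real collector was C++ or C# depending on the version:

    #include <chrono>
    #include <iostream>
    #include <string>
    #include <thread>
    #include <utility>
    #include <vector>

    // Stubs for the real work: talk to the Monitor Server, parse its XML
    // reply into (item name, value) pairs, and append them to the store.
    std::string fetchMonitorXml() { return "<monitor/>"; }
    std::vector<std::pair<std::string, double>> parseItems(const std::string&) { return {}; }
    void saveToStore(const std::vector<std::pair<std::string, double>>&) {}

    int main() {
        // 4 archives per minute = one pass every 15 seconds.
        for (;;) {
            try {
                saveToStore(parseItems(fetchMonitorXml()));
            } catch (const std::exception& e) {
                // Robustness: a Monitor Server outage or a crashed L3/DAQ
                // machine must not kill the collector; log and try again.
                std::cerr << "collection pass failed: " << e.what() << "\n";
            }
            std::this_thread::sleep_for(std::chrono::seconds(15));
        }
    }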

8 Web Pages
[Diagram: the Web Pages (Plots) stage of the data flow above.]
- Discover the data location for a query, extract the data, then plot and send a JPEG to the web.
- Fast enough to work in real time: ~5 seconds.
- Robust against web failures.
- Written in C++ and/or C# and ASP.NET, depending upon the version.
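
Since plots are produced on the fly, here is a ROOT-based sketch of the plotting step (the talk does not show this code; the data points and output path are invented):

    #include "TCanvas.h"
    #include "TGraph.h"

    // Render one archived quantity vs. run number and write a JPEG that
    // the web front end can return to the browser.
    void makePlot() {
        const int n = 5;
        double run[n]  = {190001, 190002, 190003, 190004, 190005};  // invented data
        double rate[n] = {48.1, 50.3, 49.7, 51.0, 50.2};

        TCanvas canvas("c", "history plot", 800, 600);
        TGraph graph(n, run, rate);
        graph.SetTitle("Event rate vs. run;Run number;Rate (Hz)");
        graph.Draw("ALP");
        canvas.SaveAs("plot.jpg");  // served back as the web image
    }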

9 The History Data Store
[Diagram: the Data Store stage of the data flow above: the trouble maker!]
- The choice of a relational database (RDB) vs. ROOT drove the design of the other components.
- All choices have worked to various degrees; speed and size were always the issue.
- Insertion speed: can all the data be inserted in less than 15 seconds?
- Extraction speed: produce a plot before the user hits the reload button.

10 The 3 Implementations
1. The Prototype
- Monitor items appear and disappear.
- Homegrown XML parsing isn't robust.
- ROOT doesn't perform well on TTrees with 4000 branches!
2. The Oracle Implementation
- DØ Computing Support offered backup and Oracle maintenance, including an RDB expert (consultant).
- Offered the ability to select data on many criteria.
- Speed and size issues were never satisfactorily addressed.
3. The ROOT Implementation
- Use ROOT to store the data and an RDB to store the lookup indices: use each for what it is good at.
- Speed is still an issue, mostly in understanding ROOT I/O.
- Back to a standalone project, with an ad-hoc backup mechanism.

11 Oracle: Database Design Was Difficult! The tension between speed and space was never fully resolved.

12 Oracle: the event table contains one entry per 15 seconds. Its ID is used as a reference into the EVENT_TO_VALUE table.

13 Oracle: the EVENT_TO_VALUE table contains one entry per monitor item per 15 seconds, i.e. ~4000 entries per 15 seconds. It relates time to monitor data; it is compact and heavily indexed for lookups. Stored procedures and array inserts are used for speed and database consistency (see the sketch below).
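
The talk does not show the insert code; as an illustration of the batching idea behind array inserts, a plain-C++ sketch that accumulates one 15-second tick of rows and hands the whole batch to the database in a single call (the column names and the insert function are hypothetical):

    #include <string>
    #include <utility>
    #include <vector>

    // One EVENT_TO_VALUE row: which 15-second event, which item, which value.
    struct EventToValueRow {
        long eventId;
        long itemId;
        long valueId;
    };

    // Stub standing in for an Oracle array insert / stored-procedure call:
    // the whole batch crosses to the server in one round trip.
    void arrayInsert(const std::string&, const std::vector<EventToValueRow>&) {}

    void flushTick(long eventId, const std::vector<std::pair<long, long>>& itemValues) {
        std::vector<EventToValueRow> batch;
        batch.reserve(itemValues.size());  // ~4000 rows per 15 s
        for (const auto& [itemId, valueId] : itemValues)
            batch.push_back({eventId, itemId, valueId});
        // Inserting ~4000 rows one statement at a time would not fit in
        // the 15-second budget; a single array insert can.
        arrayInsert("INSERT INTO EVENT_TO_VALUE (EVENT_ID, ITEM_ID, VALUE_ID) "
                    "VALUES (:1, :2, :3)", batch);
    }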

14 Oracle: the value table contains one entry per distinct value. A monitor item that is constant will not generate extra space in this table! Each value is attached to its actual name in the ITEM_NAME table.
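
A small sketch of why a constant item costs nothing extra: if the value is looked up before insertion, repeated values reuse the same row ID (the names and the insert stub are hypothetical):

    #include <map>
    #include <string>

    // Stub: insert a new row into the value table and return its new ID.
    long insertValueRow(const std::string&) {
        static long nextId = 1;
        return nextId++;
    }

    std::map<std::string, long> valueIdCache;  // value text -> row ID

    // Return the row ID for this value, inserting only if it is new.
    // An item that reports the same value every 15 seconds therefore
    // adds no new rows here, only new EVENT_TO_VALUE references.
    long valueId(const std::string& valueText) {
        auto it = valueIdCache.find(valueText);
        if (it != valueIdCache.end())
            return it->second;
        long id = insertValueRow(valueText);
        valueIdCache[valueText] = id;
        return id;
    }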

15 Performance for Oracle
- Version 8.1.7, running on a dual 2.4 GHz Xeon, 2 GB RAM, SCSI disk (not RAID). The database was filled with 3-4 weeks of data for these tests.
- Without tuning, a single SQL query for a non-existent run took 31 seconds! A request that returned 147 values took 247 seconds!
- Tuned by adding indices and studying the Oracle SQL plan (Toadsoft has an excellent freeware tool for this): the empty query dropped to 0.3 seconds, but the 147-item query still took 150 seconds!
- We had started with a monolithic SQL statement to return the data. Instead, perform the JOINs on the local computer and extract only the significant portions of the data, and reduce the number of EVENT_TO_VALUE entries by putting 20 values on a single entry (see the sketch below). Empty query: 0.3 seconds; the 147-item query: 31 seconds!
- The Oracle-on-Linux learning curve for DØ also meant downtime (1 month at one point, due to disk problems).
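
As an illustration of that last optimization, a hypothetical packed row holding 20 consecutive values, which cuts the EVENT_TO_VALUE row count by a factor of 20 (the field names are invented):

    #include <array>
    #include <vector>

    // Instead of one EVENT_TO_VALUE row per value, pack 20 values into a
    // single row: ~4000 values per tick become ~200 rows instead of 4000.
    struct PackedRow {
        long eventId;                   // which 15-second event
        long firstItemIndex;            // index of the first packed item
        std::array<long, 20> valueIds;  // 20 consecutive value references
    };

    std::vector<PackedRow> pack(long eventId, const std::vector<long>& valueIds) {
        std::vector<PackedRow> rows;
        for (std::size_t i = 0; i < valueIds.size(); i += 20) {
            PackedRow row{eventId, static_cast<long>(i), {}};
            for (std::size_t j = 0; j < 20 && i + j < valueIds.size(); ++j)
                row.valueIds[j] = valueIds[i + j];
            rows.push_back(row);
        }
        return rows;
    }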

16 Data Store Disk Space
For N GB of data, an RDB needs roughly 8 x N GB once you count: data, index, data mirror, index mirror, redo, rollback, backup, and replication.
Good rule of thumb: you need 10x the disk to hold a given amount of data in an RDB. Unexpected? (From Jack Cranshaw's talk on CDF experiences with databases, given at the ATLAS Software Workshop last week.)
The history DB was heading towards 100 GB/year.

17 ROOT, Take II
Use ROOT to hold the data and an RDB to hold the index.
[Diagram: one ROOT file holding a "history names" branch (an array of monitor item names and their indices) and "history data" branches (one branch per data source, each an array of values).]
- One ROOT file keeps the number of branches small (~100); see the sketch below.
- Close/open the file on a new day and whenever the monitor data changes (5 files per day is typical).
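
A minimal ROOT sketch of this layout, with one array-of-values branch per data source (the branch and file names are invented; the real archiver also records the item-name-to-index map and rotates files when the data layout changes):

    #include "TFile.h"
    #include "TTree.h"

    void writeHistory() {
        TFile file("history_day.root", "RECREATE");
        TTree tree("history", "Archived monitor data");

        // One branch per data source, each holding an array of values for
        // that source; this keeps the branch count ~100 instead of ~4000.
        const int kMaxValues = 64;
        Int_t   nFarm = 0;
        Float_t farmValues[kMaxValues];
        Int_t   nCrate = 0;
        Float_t crateValues[kMaxValues];
        tree.Branch("nFarm", &nFarm, "nFarm/I");
        tree.Branch("farmValues", farmValues, "farmValues[nFarm]/F");
        tree.Branch("nCrate", &nCrate, "nCrate/I");
        tree.Branch("crateValues", crateValues, "crateValues[nCrate]/F");

        // One entry per 15-second tick (dummy values here).
        nFarm = 2;  farmValues[0] = 50.1f; farmValues[1] = 49.8f;
        nCrate = 1; crateValues[0] = 12.0f;
        tree.Fill();

        file.Write();
        file.Close();
    }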

18 ROOT Performance
- A query involving 20 files and 900 values takes 7.5 seconds on a 1.3 GHz PIII (M)!
- The index database is currently Access; it will change when speed becomes an issue…
- Tricky to achieve: reading the strings back (see the read-side sketch below).
- On track to write ~15 GB/year, and there is evidence that this can be made even faster.
Web front end:
- Undergoing a major rewrite from the prototype version.
- Allows caching of common plot requests.
- Allows authorized users to upload ROOT code to generate custom plots.
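
A sketch of the read side: consult the index for which file and branches hold the requested item, then read only those branches (the index lookup is a stub, and the branch names follow the invented layout sketched above):

    #include <string>
    #include "TFile.h"
    #include "TTree.h"

    // Stub: the RDB index maps a monitor item to the file, branches, and
    // array slot that hold it for the requested time range.
    struct ItemLocation {
        std::string file, countBranch, valueBranch;
        int slot;
    };
    ItemLocation lookUpInIndex(const std::string&) {
        return {"history_day.root", "nFarm", "farmValues", 0};
    }

    // Read every archived value of one item from one day's file.
    void readItem(const std::string& itemName) {
        ItemLocation loc = lookUpInIndex(itemName);
        TFile file(loc.file.c_str(), "READ");
        TTree* tree = static_cast<TTree*>(file.Get("history"));

        const int kMaxValues = 64;
        Int_t   n = 0;
        Float_t values[kMaxValues];
        tree->SetBranchAddress(loc.countBranch.c_str(), &n);
        tree->SetBranchAddress(loc.valueBranch.c_str(), values);

        // Reading only the needed branches is what keeps a 20-file,
        // 900-value query within the 7.5-second figure on the slide.
        tree->SetBranchStatus("*", 0);
        tree->SetBranchStatus(loc.countBranch.c_str(), 1);
        tree->SetBranchStatus(loc.valueBranch.c_str(), 1);
        for (Long64_t i = 0; i < tree->GetEntries(); ++i)
            tree->GetEntry(i);  // values[loc.slot] is this item's value
    }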

19 Conclusions
- Iterations of this took 2 years! A seemingly simple task…
- Don't store data in a database you won't index on. Store it in a separate file or a BLOB; if you need to index on something later, regenerate the database from the binary data (see the sketch below)!
- Oracle and its ilk aren't one-person shows: very hard to develop on a portable, for example, and they require a team to manage and run.
- Tools matter! A good debugger and development environment help, and you should understand databases. RAD meant that 1 hour after starting with Oracle I was writing to it, which allowed me to concentrate mostly on the DB design rather than on coding.
- Used Windows, but there is no reason any of this couldn't have been done on Linux with PHP, etc.
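
As an illustration of that "regenerate the index" advice, a sketch that rebuilds the lookup index by scanning the archived ROOT files (file discovery and the index insert are stubs; this mirrors the layout sketched earlier):

    #include <string>
    #include <vector>
    #include "TCollection.h"
    #include "TFile.h"
    #include "TTree.h"

    // Stubs: list the archived ROOT files, and write one index row
    // (file, branch) into whatever RDB is holding the lookup index.
    std::vector<std::string> listHistoryFiles() { return {"history_day.root"}; }
    void insertIndexRow(const std::string&, const std::string&) {}

    // Because the binary data files are the master copy, the index
    // database is disposable: it can always be rebuilt from them.
    void rebuildIndex() {
        for (const std::string& fileName : listHistoryFiles()) {
            TFile file(fileName.c_str(), "READ");
            TTree* tree = static_cast<TTree*>(file.Get("history"));
            TIter next(tree->GetListOfBranches());
            while (TObject* branch = next())
                insertIndexRow(fileName, branch->GetName());
        }
    }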


