D0 Run IIb Review 15-Jul-2004 Run IIb DAQ / Online status Stu Fuess Fermilab.

1 D0 Run IIb Review 15-Jul-2004 Run IIb DAQ / Online status Stu Fuess Fermilab

2 Introduction
To meet the DAQ and Online computing requirements for Run IIb, we plan:
- Level 3 farm node increase
  - Brown, Univ. of Washington, Fermilab
- Host system replacements / upgrades
  - Hardware: Fermilab; Software: various
- Control system node upgrade
  - Fermilab
The requirements, plans, status, and future activities will be discussed.

3 Level 3 1.3.1

4 Level 3 farm nodes
Need: greater L3 processing capabilities for higher luminosities
Dual nodes | “GHz” | Plan
48 | 1.0 | to be removed
34 | 1.6 | existing
32 | 2.0 | existing
96 | 2.2 | to be added
- 332 GHz equiv CPUs now
- 659 GHz equiv CPUs for start of Run IIb
For example: 1 kHz @ 500 ms-GHz requires 500 GHz of CPUs
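The capacity figures on this slide can be reproduced with a short calculation. The sketch below assumes that "GHz equiv" means node count times two CPUs per dual node times clock speed, and that the slide truncates the totals to whole GHz; the function and variable names are illustrative only.

```python
# (dual nodes, clock in GHz, plan) as listed on the slide
FARM = [
    (48, 1.0, "to be removed"),
    (34, 1.6, "existing"),
    (32, 2.0, "existing"),
    (96, 2.2, "to be added"),
]
CPUS_PER_NODE = 2  # assumption: each "dual node" contributes two CPUs


def ghz_equivalent(rows):
    """Sum nodes * CPUs per node * clock speed over the given rows."""
    return sum(n * CPUS_PER_NODE * ghz for n, ghz, _ in rows)


def required_ghz(rate_hz, ms_ghz_per_event):
    """CPU capacity needed to sustain rate_hz at a given cost per event."""
    return rate_hz * ms_ghz_per_event / 1000.0


# Farm today: everything except the nodes still to be added.
now = ghz_equivalent(r for r in FARM if r[2] != "to be added")
# Farm at the start of Run IIb: everything except the nodes to be removed.
run2b = ghz_equivalent(r for r in FARM if r[2] != "to be removed")
# The slide's example: 1 kHz input at 500 ms-GHz per event.
needed = required_ghz(1000, 500)

print(f"now: {now:.1f} GHz, Run IIb: {run2b:.1f} GHz, needed: {needed:.0f} GHz")
```

Under these assumptions the totals come out at about 332.8 and 659.2 GHz, matching the slide's 332 and 659, and the upgraded farm leaves roughly 30% headroom over the 500 GHz example.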

5 Level 3 farm nodes, cont’d.
Plan: single purchase, summer 2005, of $210K* of nodes
- 3 racks of 32 = 96 nodes plus infrastructure
Strategy:
- This is an “off the shelf” purchase, but a major one
- Similar to CompDiv farms purchases
- Used a Run IIa purchase to refine the procedure:
  - 32 node addition
  - History
    - Req preparation begun 1/04/04
    - Req submitted 1/29/04
    - PO created 3/23/04 ($51.5K)
    - Prototype system delivery 4/21/04
    - Full order delivery 6/21/04
    - Operational in Level 3 on 6/23/04
A 5 month process! Thanks to Computing Division for help!
* Unburdened FY02 $
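The "5 month process" can be checked against the purchase history. This is a small sketch that assumes the slide's dates are in US month/day/year order; the milestone labels are copied from the slide and the helper name is illustrative.

```python
from datetime import date

# Milestones from the Run IIa 32-node purchase (dates read as month/day/year).
MILESTONES = [
    ("Req preparation begun", date(2004, 1, 4)),
    ("Req submitted", date(2004, 1, 29)),
    ("PO created", date(2004, 3, 23)),
    ("Prototype system delivery", date(2004, 4, 21)),
    ("Full order delivery", date(2004, 6, 21)),
    ("Operational in Level 3", date(2004, 6, 23)),
]


def step_durations(milestones):
    """Days elapsed between each milestone and the one before it."""
    return [
        (later[0], (later[1] - earlier[1]).days)
        for earlier, later in zip(milestones, milestones[1:])
    ]


steps = step_durations(MILESTONES)
total_days = (MILESTONES[-1][1] - MILESTONES[0][1]).days

for name, days in steps:
    print(f"{days:3d} days to: {name}")
print(f"total: {total_days} days")
```

With these dates the whole procedure spans 171 days, a little over five and a half months, with the longest waits between requisition and PO and between prototype and full delivery.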

6 Level 3 farm nodes, cont’d.
Other preparations
- Will replace 3 racks / 48 nodes of older processors with 3 racks / 96 nodes
- Existing electrical circuits and cooling are sufficient for the new racks
- Will need an additional 48 network ports on Level 3 and Online switches
Impact
- Installation will be somewhat disruptive, as 3 racks (48) of older nodes will be removed to make room for the new ones
  - Remaining 66 nodes operational during installation
Schedule
- Plan for arrival of nodes at start of 2005 shutdown
  - Start purchase process ~ 3/05
- Continued replacement with upgraded nodes will be necessary over the duration of Run IIb (Operating funds)

7 Host systems 1.3.2

8 Host systems
Need
- Replace 3-node Alpha cluster, which has the functions:
  - Event data logger, buffer disk, transfer to FCC
  - Oracle database
  - NFS file server
  - User database
Plan
- Replace with Linux servers
  - Install a number (~4) of clusters which supply “services”
  - Shared Fibre Channel (FC) storage and failover software to provide flexibility and high availability
- $247K* for processor and storage upgrades
* Unburdened FY02 $

9 DØ Online Linux clusters
[Diagram; labels: RAID Array, Legacy RAID Array, JBOD Array, Legacy JBOD Array, Fibre Channel Switch, SAN Clients, Network Switch, DAQ Services Cluster, File Server Cluster, Online Services Cluster, Database Cluster]

10 Cluster Configuration
[Diagram of cluster configuration elements and their fields]
- Cluster
- Service: Name, Domain, Check interval, Script
- Cluster Member: Name, Power Controller
- IP Address
- Device: Device special file, Mount point, File system, Mount options
- NFS Export: Export directory
- NFS Clients: Client names / addresses, Export options
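The configuration elements named on this slide can be pictured as a nested data structure. The sketch below is only an illustration: the nesting (service owns an IP address and devices, a device owns NFS exports, an export owns clients) is an assumption, since the transcript lists the elements without the diagram's connections, and every value in the example is hypothetical. It is not the Red Hat Cluster Suite configuration format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NfsClient:
    name_or_address: str
    export_options: str


@dataclass
class NfsExport:
    directory: str
    clients: List[NfsClient] = field(default_factory=list)


@dataclass
class Device:
    special_file: str
    mount_point: str
    file_system: str
    mount_options: str
    exports: List[NfsExport] = field(default_factory=list)


@dataclass
class Service:
    name: str
    domain: str
    check_interval_s: int
    script: str
    ip_address: str
    devices: List[Device] = field(default_factory=list)


@dataclass
class Member:
    name: str
    power_controller: str


@dataclass
class Cluster:
    members: List[Member]
    services: List[Service]


def exported_directories(cluster: Cluster):
    """List (service name, export directory) pairs across the cluster."""
    return [
        (svc.name, exp.directory)
        for svc in cluster.services
        for dev in svc.devices
        for exp in dev.exports
    ]


# Hypothetical example values, not taken from the D0 configuration.
example = Cluster(
    members=[Member("node-a", "pdu-1"), Member("node-b", "pdu-2")],
    services=[
        Service(
            name="nfs-home",
            domain="file-servers",
            check_interval_s=30,
            script="/path/to/service-script",
            ip_address="192.0.2.10",
            devices=[
                Device(
                    special_file="/dev/example-lv",
                    mount_point="/export/home",
                    file_system="ext3",
                    mount_options="rw",
                    exports=[
                        NfsExport(
                            "/export/home",
                            [NfsClient("client-1", "rw")],
                        )
                    ],
                )
            ],
        )
    ],
)

print(exported_directories(example))
```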

11 Cluster Services
Details of configuration of cluster services … Using experience of Run IIa in how things actually work!

12 Host systems, cont’d.
System tests
- Performed tests of Fibre Channel, network, storage rates
  - Network: capable of wire rate (1 Gb/sec)
  - Storage:
Disk | Configuration | Write (MB/s) | Read (MB/s)
Local disk | JBOD | 18 | 53
Local disk | SW RAID 0 | 88 | 70
FC disk | JBOD | 52 | 41
FC disk | HW RAID 0 | 94 | 30
FC disk | HW RAID 1 | |
FC disk | HW RAID 5 | 31 | 34
FC disk | SW RAID 0 | 75 | 83
Target is 25 MB/s for event path
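The measured rates can be compared against the 25 MB/s event-path target with a few lines of code. This sketch makes several assumptions: the fused digits in the transcript are read as two-digit write and read rates, rows are assigned to "Local disk" or "FC disk" by their order on the slide, and a configuration is counted as meeting the target only if both its write and read rates reach it. HW RAID 1 is left out because the transcript gives no numbers for it.

```python
TARGET_MB_S = 25  # event-path target from the slide

# (disk, configuration, write MB/s, read MB/s) as read from the slide.
RATES = [
    ("Local disk", "JBOD", 18, 53),
    ("Local disk", "SW RAID 0", 88, 70),
    ("FC disk", "JBOD", 52, 41),
    ("FC disk", "HW RAID 0", 94, 30),
    ("FC disk", "HW RAID 5", 31, 34),
    ("FC disk", "SW RAID 0", 75, 83),
]


def meets_target(write, read, target=TARGET_MB_S):
    """Assumed criterion: the slower of write and read must reach the target."""
    return min(write, read) >= target


below_target = [(d, c) for d, c, w, r in RATES if not meets_target(w, r)]

# Gigabit wire rate expressed in MB/s, for comparison with the disk rates.
wire_rate_mb_s = 1e9 / 8 / 1e6

print("below target:", below_target)
print("network wire rate:", wire_rate_mb_s, "MB/s")
```

Under these assumptions only the local JBOD disk falls short (on write), and the network wire rate of 125 MB/s is well above the target.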

13 Host systems, cont’d.
System tests, cont’d.
- Checked relative performance of dual vs quad processor systems
  - Conclusion: dual processor nodes, at 20% of the cost, are sufficient for all but possibly the highest I/O DAQ data logging nodes
- Potential issues/concerns
  - Linux 2.4 kernel has problems with multiple high-rate buffered I/O streams; much better in 2.6 kernel; alleviated somewhat with use of direct I/O
    - Expect to see 2.6 next Spring/Summer in Fermi Linux
    - The design avoids this situation
  - Fibre Channel redundant paths somewhat complicated
    - Expect to use a “manual” solution, but it is solvable ($$) with commercial Secure Path software

14 Host systems, cont’d.
Cluster implementation
- Red Hat Cluster Suite
  - Available open source, distributed in Fermi Linux
    - But also a supported ($) Red Hat Application Suite product
  - No kernel modifications required
    - Can use non-homogeneous distributions
  - Can be made to work with non-homogeneous hardware
    - Use LVM as virtual storage layer
Cluster tests
- Storage device access
- NFS failover
  - File reads/writes transparently complete when active node turned off and service transitioned to backup node
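The failover behavior tested here (a service moving to a backup node when the active node is turned off) can be illustrated with a toy model. This is not Red Hat Cluster Suite code or its API; the class, the ordered failover domain, and the "no automatic failback" choice are all assumptions made for the illustration.

```python
from typing import Dict, List, Optional


def pick_owner(domain: List[str], alive: Dict[str, bool]) -> Optional[str]:
    """Return the first live member in the failover domain, or None."""
    for member in domain:
        if alive.get(member, False):
            return member
    return None


class ToyService:
    """A service that runs on one member of an ordered failover domain."""

    def __init__(self, name: str, domain: List[str]):
        self.name = name
        self.domain = domain
        self.owner: Optional[str] = None

    def check(self, alive: Dict[str, bool]) -> Optional[str]:
        """One check interval: keep a live owner, otherwise relocate."""
        if self.owner is None or not alive.get(self.owner, False):
            self.owner = pick_owner(self.domain, alive)
        return self.owner


svc = ToyService("nfs", ["node-a", "node-b"])
first = svc.check({"node-a": True, "node-b": True})    # starts on node-a
second = svc.check({"node-a": False, "node-b": True})  # node-a turned off
third = svc.check({"node-a": True, "node-b": True})    # node-a back; no failback
print(first, second, third)
```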

15 Host systems, cont’d.
Status
- A 2-node cluster has been created
  - Single-path FC SAN
  - Service failover demonstrated
- 6 new servers delivered 6/21/04
  - Will construct 4 clusters during summer/fall 2004 shutdown
Schedule
- Fall 04: attempt to move everything!
  - DAQ, Oracle, NFS, etc.
  - Need involvement of software system experts
  - Dual-path SAN still a challenge
  - DAB2 rack space juggling a challenge
  - Disruptive (possibly a day or two)! Essential functions will have to be relocated and debugged
- Summer 05: enhance with best processors

16 Control system 1.3.3

17 Control System
Need:
- The current control system processors (~100 of them)
  - are becoming obsolete and not maintainable
    - Lost 2 nodes, repaired 5 during Run IIa
  - are limiting functionality in some areas
    - Tracker readout crates are CPU limited
Plan:
- Upgrade ~1/3 of the control system processors
  - either with the latest generation of processors (PowerPC), which run current software (VxWorks), or by transitioning to a different architecture (e.g. Intel) with a new OS (e.g. Linux)
- Inclination is to just purchase an appropriate number of the current processor family and minimize software changes
Strategy:
- $140K* to upgrade processors
- Scheme for replacement on next slide
* Unburdened FY02 $

18 Control System Processors
Detector subsystem | # of Processors | Processor types | Replacement plan
Control and Monitoring | 18 | (11) 16MB PowerPC, (6) 64MB PowerPC, (1) 128MB PowerPC | Replace, use old processors for HV or spares; need 12 additional for CAL
High Voltage | 30 | (30) 16MB PowerPC | Retain, with new and spare needs met from other replacements
Muon | ~40 | (23) 4MB 68K, (16) 128MB PowerPC | OK as is, 16 in readout crates are recent replacements
Tracker readout | 26 | (10) 32MB PowerPC, (11) 64MB PowerPC, (5) 128MB PowerPC | Replace, use old processors for HV or spares
Test stands | ~13 | mixed low end | Use available
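The table can be tallied to see how the replacement plan relates to the stated goal of upgrading about one third of the processors. The sketch assumes the approximate counts (~40, ~13) can be treated as exact and that only the rows marked "Replace" count toward the upgrade; the 12 additional CAL processors are tracked separately.

```python
# (subsystem, processor count, plan) from the table; "~" counts taken as exact.
INVENTORY = [
    ("Control and Monitoring", 18, "replace"),
    ("High Voltage", 30, "retain"),
    ("Muon", 40, "ok as is"),
    ("Tracker readout", 26, "replace"),
    ("Test stands", 13, "use available"),
]
ADDITIONAL_FOR_CAL = 12  # extra processors noted in the replacement plan

total = sum(n for _, n, _ in INVENTORY)
to_replace = sum(n for _, n, plan in INVENTORY if plan == "replace")
fraction = to_replace / total
new_processors = to_replace + ADDITIONAL_FOR_CAL

print(f"{to_replace} of {total} processors replaced ({fraction:.0%})")
print(f"{new_processors} new processors including the CAL additions")
```

On these assumptions 44 of 127 processors are replaced, about 35%, which is in line with the "~1/3" goal.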

19 Control System, cont’d.
Impact:
- Potential short disruptions in control system functions as processors are replaced
Schedule:
- Recently purchased latest PowerPC processor for testing
  - Testing EPICS and D0 controls software
- Follow evolutionary developments of OS (VxWorks) and Control System Framework (EPICS)
- Purchases in advance of Summer 05, then incremental installation of nodes

20 Conclusion
Three activities:
- Level 3
- Host systems
- Control system
Level 3 is an “addition of nodes”
Host system changes are most revolutionary
- Attempting to perform upgrade this summer/fall
- Improvements in functionality
Control system is a “replacement of nodes”
- With evolutionary progress of VxWorks, EPICS software
Expect nearly seamless transition, ready for IIb

