ROD General Requirements and Present Hardware Solution for the ATLAS Tile Calorimeter
E. Fullana, J. Torres, J. Castelo (speaker: Jose Castelo), IFIC - Universidad de Valencia
LECC 2002, 8th Workshop on Electronics for LHC Experiments, Colmar, 11 Sept. 2002
Introduction: Basic functionalities
DATA PROCESSING: raw data gathering from the front-end at the L1A event rate (100 kHz) to the ROBs, with intermediate processing and formatting.
TRIGGER: TTC signals will be present (latency ~2 μs after L1A) at each module, providing the ROD L1ID, ROD BCID and Ttype (trigger type).
ERROR DETECTION: synchronism trigger tasks. The ROD must check that the BCID and L1ID numbers received from the CTP match the ones received from the FEB. If a mismatch is detected, an error flag must be reported (see the sketch after this list).
DATA LINKS: input data must be received with optical input receivers. Processed event data must be sent to the ROBs through the standard ATLAS readout links, in the standard DAQ-1 data format, at the L1A event rate (100 kHz).
BUSY GENERATION: provide a ROD busy signal in order to stop L1A generation. This signal is an OR of the ROD crate modules and is managed by the ROD TTC crate as an interface with the CTP.
LOCAL MONITORING: VME access to the data during a run, without introducing dead time or additional latency in the main DAQ data path. Each ROD motherboard is a VME slave commanded by the ROD Controller (VME SBC).
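As an illustration of the error-detection task above, here is a minimal C sketch of the BCID/L1ID consistency check. The type, function, and flag names are hypothetical, not the actual ROD firmware interface.

```c
#include <stdint.h>

/* Illustrative sketch of the synchronism check: the BCID and L1ID
 * received from the TTC (originating at the CTP) must match the ones
 * carried in the front-end (FEB) data fragment. Names and the
 * error-flag encoding are hypothetical. */

#define ERR_BCID_MISMATCH  (1u << 0)
#define ERR_L1ID_MISMATCH  (1u << 1)

typedef struct {
    uint16_t bcid;   /* bunch-crossing ID */
    uint32_t l1id;   /* level-1 event ID  */
} event_tag_t;

/* Returns a bitmask of detected synchronization errors (0 = in sync). */
uint32_t check_sync(const event_tag_t *ttc, const event_tag_t *feb)
{
    uint32_t err = 0;
    if (ttc->bcid != feb->bcid) err |= ERR_BCID_MISMATCH;
    if (ttc->l1id != feb->l1id) err |= ERR_L1ID_MISMATCH;
    return err;  /* reported downstream as an error flag */
}
```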
ATLAS TDAQ & Read-Out System
The LVL1 decision is taken with calorimeter data (coarse granularity) and muon trigger chamber data; buffering is done in the detector electronics.
LVL2 uses Region-of-Interest data (up to 4% of the whole event) with full granularity and combines information from all detectors; buffering is implemented in the ROBs.
The EF refines the selection and can perform event reconstruction at full granularity using the latest alignment and calibration data; buffering is done in the EB & EF.
TileCal ROD Dataflow and TTC partitions
9856 calorimeter channels (two fibers per drawer; 19712 channels with redundant information).
ROD Crate
Crate hardware:
ROD Controller.
TTC interface (TBM): Trigger and Busy Module.
FE and calibration cards: for data injection, calibration, etc.
M RODs per N crates: motherboards with Processing Units, plus transition modules for the output optical links. The present design has M=8 and N=4, i.e. 32 ROD modules in total for the acquisition of ~10,000 calorimeter cells.
Interface with the environment:
ATLAS DAQ and TileCal Run Control: Online Software.
CTP: reception of TTC information and management of a per-crate ROD BUSY.
DATAFLOW: process FEB events, detect synchronization errors, and send data to the ROBs for the LVL2 RoI decision.
[Diagram: TTC signals from the CTP; crate OR BUSY to the CTP; Online Software; detector data in, event data out.]
TTC scheme
Number of TTC partitions: 4. Each partition covers the full φ range [0, 2π]; in η they are organized as EB(η<0), CB(η<0), CB(η>0), EB(η>0).
This distribution will allow us to work with only the central barrels if there are not enough RODs in the early runs.
The scheme shown is for 4 partitions in two crates with 2 TBMs (in practice this is only possible with 4 crates because of the TBM architecture design).
Tiles ROD racks and interconnections between crates
USA15 room (radiation free), 2 racks (52U), standard 9U ATLAS crates.
9U (slot) with transition module (probably 160 mm). External size for CERN crates: 10U.
Cooling: air/water (not decided; it depends on the amount of power dissipated by the G-LINK receivers).
Readout fibers: input fibers (front, on the motherboard); output fibers (rear, on the transition module).
Using the new LArg ROD Motherboard for Tiles
Using the new LArg motherboard: hardware performance
LArg needs more processing power per link: 128 channels/link for LArg versus 45 (CB) and 32 (EB) channels/link for TileCal; so only 2 Processing Units and 2 Output Controllers, plus SDRAM data storage, are enough for the Tiles dataflow needs. Some estimations:
Input bandwidth. The maximum input bandwidth of each link for a TileCal physics event is 467.2 Mbit/s, so 4 links (4 drawers) amount to 1.825 Gbit/s. The input bandwidth of the Processing Unit is 2.5 Gbit/s (64 bits @ 40 MHz), so one PU has enough input bandwidth for 4 links.
The Processing Unit. We need to process 154 channels (four drawers) in two TMS320C6414 @ 600 MHz DSPs (4800 MIPS each). This DSP has the same core as the one we have actually tested, the TMS320C6202 @ 250 MHz (2000 MIPS), with more registers and an enhanced DMA unit. Our current lab routines process 45 channels in around 5.5 μs (assembler) or 15.5 μs (C code). Scaling linearly, we could process 154 channels with the new PU (two TMS320C6414 @ 600 MHz, 9600 MIPS total) in 3.92 μs (assembler) or around 11 μs (C), as the sketch below reproduces. Since our limit is 10 μs at the 100 kHz LVL1 rate, if the C compiler from Texas Instruments improves we could probably program the final system in more maintainable C code, with only 2 Processing Unit mezzanines installed on the motherboard.
Output bandwidth. The typical bandwidth for 154 channels (four drawers) is 656 Mbit/s. An Output Controller FPGA at 1.28 Gbit/s (32 bits @ 40 MHz) therefore has enough bandwidth for the output of each Processing Unit (154 channels each).
Transition module: 2 mezzanine links are enough for this configuration.
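For reference, the timing projection above is a linear scaling of the measured C6202 figures, first by channels per DSP and then by the MIPS ratio. The short C program below just reproduces that arithmetic.

```c
#include <stdio.h>

/* Reproduces the timing extrapolation quoted above: measured per-event
 * processing time for 45 channels on one TMS320C6202 (2000 MIPS),
 * scaled linearly to 77 channels per DSP (154 channels shared by the
 * two TMS320C6414 DSPs of one PU, 4800 MIPS each). */
int main(void)
{
    const double t45_asm = 5.5;    /* us, assembler, C6202 */
    const double t45_c   = 15.5;   /* us, C code,    C6202 */
    const double ch_per_dsp = 154.0 / 2.0;      /* 77 channels each  */
    const double mips_ratio = 2000.0 / 4800.0;  /* C6202 / C6414     */

    printf("asm: %.2f us\n", t45_asm * (ch_per_dsp / 45.0) * mips_ratio); /* ~3.92 us */
    printf("C  : %.2f us\n", t45_c   * (ch_per_dsp / 45.0) * mips_ratio); /* ~11.0 us */
    return 0;
}
```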
Compatibility issues with the LArg motherboard
From the Tiles FEB electronics we only need a simple VHDL change in the FEB FPGAs (G-link HDMP1032 pin ESMPXENB=0).
From the new LArg motherboard design we need:
To connect the CAV line of the HDMP1024 to the Staging FPGA in order to receive control words (not only data words). In the figure this line is highlighted in red.
To connect the FLAGSEL line of the HDMP1024 to the Staging FPGA. LArg uses the FLAG bit to mark the even and odd 16-bit fragments, whereas TileCal uses CAV (the control bit) to mark the start of transmission and to count the even and odd 16-bit fragments. Tiles uses the FLAG bit to mark the global CRC word, so Tiles needs FLAGSEL set high while LArg sets it low; this pin must therefore be connected to the Staging FPGA to maintain compatibility. In the figure this line is highlighted in red.
The connections of FDIS, ACTIVE, LOOPEN and STAT1 seem to be correct, since they are standard. The configuration is "Simplex method III".
To replace the LArg 80 MHz clock with the Tiles 40 MHz clock (pinout-compatible oscillators needed) to get the right reference clock for the G-LINK receivers (Tiles uses the HDMP1032 @ 40 MHz in the interface links).
ROD SOFTWARE: OPTIMAL FILTERING ALGORITHM (I)
A multisampled method first developed for liquid-ionization calorimeters (see W.E. Cleland and E.G. Stern, Nucl. Instr. and Meth. A 338 (1994) 467).
It allows the reconstruction of energy and time information. Additionally, it minimizes the noise coming from thermal sources (electronics) and also from minimum-bias events (see plots).
[Plot: noise reduction of the optimal filtering algorithm (OF) versus other algorithms (FF). Data: 2001 test beam.]
ROD SOFTWARE: OPTIMAL FILTERING ALGORITHM (II)
Samples coming from the front-end electronics: $S_1, S_2, S_3, \ldots, S_n$.
First set of weights: $a_1, a_2, a_3, \ldots, a_n$. Second set of weights: $b_1, b_2, b_3, \ldots, b_n$.
Energy information: $E = \sum_{i=1}^{n} a_i S_i$. Time information: $E\tau = \sum_{i=1}^{n} b_i S_i$.
The weights are obtained from the noise autocorrelation matrix and the ideal PMT shape waveform g(t). The mathematical procedure uses Lagrange multipliers in order to reduce the electronic noise factor (see E. Fullana's talk @feb_tilecal_analysis_meeting).
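In code, the reconstruction amounts to two weighted sums over the samples. Below is a minimal floating-point sketch, illustrative only: the weights are assumed to be precomputed offline, and the function name is hypothetical.

```c
/* Optimal filtering reconstruction as two weighted sums over the
 * n digitized samples. Weights a[i], b[i] are assumed to be derived
 * offline from the noise autocorrelation matrix and the pulse shape
 * g(t); this sketch is illustrative, not the actual ROD code. */
void of_reconstruct(const double *s, const double *a, const double *b,
                    int n, double *E, double *tau)
{
    double sum_a = 0.0, sum_b = 0.0;
    for (int i = 0; i < n; ++i) {
        sum_a += a[i] * s[i];   /* E       = sum a_i * S_i */
        sum_b += b[i] * s[i];   /* E * tau = sum b_i * S_i */
    }
    *E = sum_a;
    *tau = (sum_a != 0.0) ? sum_b / sum_a : 0.0;  /* tau = (E*tau) / E */
}
```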
DSP Software Implementation: DSP Core Architecture
Based on the Texas Instruments C6202 DSP.
Harvard architecture: program and data memory can be accessed at the same time.
F_CLK = 250 MHz, cycle time = 4 ns, 2000 MIPS.
Data/program memory: 1 Mbit (128 kbyte) / 2 Mbit (64k x 32 bits).
DMA channels: 4. EMIF & HPI: 32 bits. McBSP: 3. Timers: 2 (32-bit).
V_CORE: 1.8 V / V_I/O: 3.3 V.
3-phase pipeline; 8 independent ALUs; load-store architecture with 32 32-bit general-purpose registers (two banks of 16); all instructions are conditional.
DSP Software Implementation: OF Algorithm
Optimal filtering is used to obtain energy, tau, and χ².
The implementation considers 7 samples of 10 bits.
Current studies show that the resolution would not be improved by using different weights for each cell, so the same calibration-constants table is used for all channels (this could be changed).
The calculations are 32-bit integer, except for the multiplications (16 bits) due to the DSP architecture, always trying to get the maximum resolution of the integer ALU operations; a fixed-point sketch of these constraints follows below.
C and assembler code were developed, to compare the two languages for coding the algorithm.
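A fixed-point sketch of those constraints, for illustration: 7 samples of 10 bits, 16-bit multiplier operands, 32-bit accumulation. The weight scale factor is a hypothetical choice, not the actual calibration-constant format used on the ROD.

```c
#include <stdint.h>

#define NSAMPLES 7
#define WSCALE   (1 << 12)   /* hypothetical fixed-point weight scale */

/* Energy sum under the stated DSP constraints: each product is a
 * 16x16 -> 32-bit multiply, accumulated in a 32-bit integer, and the
 * weight scaling is undone at the end. Illustrative only. */
int32_t of_energy_fixed(const int16_t s[NSAMPLES],  /* 10-bit samples */
                        const int16_t a[NSAMPLES])  /* scaled weights */
{
    int32_t acc = 0;
    for (int i = 0; i < NSAMPLES; ++i)
        acc += (int32_t)a[i] * s[i];   /* 16-bit multiply, 32-bit acc */
    return acc / WSCALE;               /* undo the weight scaling    */
}
```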
C vs. Assembler
Energy/tau/χ² algorithm for 45 channels and 7 samples of 10 bits.
[Table: compilation characteristics compared for the best speed-performance profiling option; not transcribed here. The headline figures quoted elsewhere in this talk are ~5.5 μs (assembler) versus ~15.5 μs (C) for 45 channels on the C6202.]
Current developments: Staging FPGA (I)
Online dataflow: routes the incoming data from the different FEB inputs and sends it to the connector of the PU concerned, depending on whether the board is staged or not. This feature makes it possible to use only two Processing Units instead of four by routing the data to the desired PU (the staging is configured through the VMEbus); a behavioural sketch of this routing follows below.
G-link chip configuration will be performed by the staging chips.
It reads the temperature of the G-link and transmits it to the VME chip (generating an IRQ), because these chips usually have a high power consumption.
It transmits the G-link errors (parity and ready) to the PUs and to the VME.
During the tests it will: read the G-link data and transfer it to the VME; transfer data from the VME to the PU (a similar function to the 'data distributor block' in the demonstrator board); transfer pattern data at high rate to the PU.
The code must be written in VHDL for maintainability reasons.
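As a behavioural illustration of the staging decision (the real implementation is VHDL, as noted above), here is a C model of the link-to-PU routing. The exact mapping of 8 input links onto 4 or 2 PUs is an assumption consistent with the bandwidth estimate earlier (one PU can take 4 links), not the actual Staging FPGA design.

```c
/* Behavioural model of the staging routing, illustration only.
 * Non-staged mode: 8 input links spread over 4 PUs (2 links each).
 * Staged mode:     8 input links concentrated on 2 PUs (4 links each),
 * so the board can be populated with only two Processing Units. */
int pu_for_link(int link /* 0..7 */, int staging /* 0 or 1 */)
{
    return staging ? (link / 4)   /* links 0-3 -> PU0, 4-7 -> PU1 */
                   : (link / 2);  /* two links per PU             */
}
```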
Current developments: Staging FPGA (II)
Summary
Work with the ROD demonstrator board was very useful and instructive.
We are waiting for the first prototypes of the new motherboard and studying LArg and TileCal compatibilities for a common ROD board for the ATLAS calorimeters.
We are programming a versatile Staging FPGA design compliant with both TileCal and LArg.
The optimal filtering studies and analysis were successful on 1998 test-beam data. This work is currently focused on the 2001 and 2002 TileCal test-beam data; better results are expected with the final FEB electronics mounted in these test beams (real electronic noise).
The DSP implementation of optimal filtering runs under 10 μs (the budget at the 100 kHz ATLAS LVL1 rate) in assembler, but not in C, using the 250 MHz C6202 DSP. We expect to be below this threshold with the new-generation PU based on the 600 MHz C6414.
DAQ (Online Software) integration of the demonstrator board.