28/03/2003, Julie PRAST, LAPP CNRS, FRANCE. The ATLAS Liquid Argon Calorimeters ReadOut Drivers: a design based on 600 MHz TMS320C6414 DSPs.


1 The ATLAS Liquid Argon Calorimeters ReadOut Drivers: a design based on 600 MHz TMS320C6414 DSPs

2 The LHC. The LHC (Large Hadron Collider) is an accelerator ring, 27 km in circumference, in which proton beams are accelerated to an energy of 7 TeV. The goal is to make the protons from one beam collide with the protons from the other. 4 experiments.

3 The ATLAS experiment. Goal: explore the fundamental nature of matter and the basic forces that shape our universe. About the size of a five-story building. Collaboration of 2000 physicists. 150 universities and laboratories in 34 countries.

4 The electromagnetic calorimeter. ATLAS: several sub-detectors. Electromagnetic calorimeter: –Identifies electrons and photons. –Measures the energy carried by these particles. –200 000 cells to be read at 40 MHz.

5 The calorimeter electronic chain. [Diagram labels: detector; front-end electronics (FEB: amplifier, shaping, analog memory (SCA), 12-bit ADC); 1600 Glink optical links; back-end electronics (ROD); 800 Slink optical links; ROB; Timing Trigger Control (TTC).]

6 The ROD modules. Calculate the precise energy and timing of calorimeter signals from discrete time samples (Δt = 25 ns). Perform monitoring. Format data for the next element in the electronics chain.

7 The ROD modules goals
• 200 modules, each receiving data from 1024 calorimeter cells.
• Calculate the energy for these data using optimal filtering weights: $E = \sum_i a_i (S_i - \mathrm{PED})$
• If E > threshold (< 10 % of cells), calculate the timing and the pulse quality factor: $E\tau = \sum_i b_i (S_i - \mathrm{PED})$ and $\chi^2 = \sum_i (S_i - \mathrm{PED} - E g_i)^2$
• Build histograms of E, τ, χ², ...
• During calibration runs, perform signal averaging to calculate calibration constants for each channel.
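The three formulas on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `optimal_filter`, the weights `A` and `B`, the pulse shape `G`, the pedestal and the threshold are made-up values chosen so that the arithmetic is easy to follow. They are not ATLAS calibration constants, and the real computation runs in fixed point on the DSP.

```python
def optimal_filter(samples, a, b, g, ped, threshold):
    """Return (E, tau, chi2); tau and chi2 are None when E <= threshold."""
    diffs = [s - ped for s in samples]                # S_i - PED
    energy = sum(ai * d for ai, d in zip(a, diffs))   # E = sum a_i (S_i - PED)
    if energy <= threshold:
        return energy, None, None                     # low-energy cell: E only
    e_tau = sum(bi * d for bi, d in zip(b, diffs))    # E*tau = sum b_i (S_i - PED)
    tau = e_tau / energy
    # chi2 = sum (S_i - PED - E g_i)^2
    chi2 = sum((d - energy * gi) ** 2 for d, gi in zip(diffs, g))
    return energy, tau, chi2


# Hypothetical five-sample pulse shape and weights (assumptions, for illustration).
G = [0.0, 0.5, 1.0, 0.5, 0.0]
A = [gi / 1.5 for gi in G]          # normalised so that sum(a_i * g_i) == 1
B = [-0.5, -0.5, 0.0, 0.5, 0.5]     # antisymmetric: gives tau = 0 for an on-time pulse

# A noiseless pulse of amplitude 100 on top of a pedestal of 1000 ADC counts.
pulse = [1000 + 100 * gi for gi in G]
print(optimal_filter(pulse, A, B, G, ped=1000, threshold=10.0))
```

For a cell that contains only pedestal, `optimal_filter` returns the energy alone, which mirrors the slide's statement that τ and χ² are computed for fewer than 10 % of the cells.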

8 Requirements. The ROD module must be able to process an event in less than 10 µs, including histograms. Use of a commercial programmable processor: a natural choice is a Digital Signal Processor, which offers efficient computing power for this kind of algorithm and a high I/O bandwidth. Modular design: basic components should be easily changed or upgraded. Low power consumption.

9 The ROD: a 9U VME board

10 The ROD Motherboard

11 The Staging Mode. At the beginning of LHC operation, the ROD is equipped with half of the PUs. Level 1 trigger rate < 50 kHz. Data from 4 FEBs are routed to one PU. One DSP processes 256 channels instead of 128.
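The channel counts of the staging mode follow from figures quoted elsewhere in this talk (1024 cells per ROD, 4 PUs per ROD, 2 DSPs per PU). The short check below only restates that arithmetic; the variable and function names are illustrative, and the number of channels per FEB is inferred from these figures rather than stated on the slides.

```python
CELLS_PER_ROD = 1024   # "each receiving data from 1024 calorimeter cells"
PUS_PER_ROD = 4        # "1 motherboard and 4 Processing Units"
DSPS_PER_PU = 2        # "1 PU = two 600 MHz TMS320C6414 DSP"
FEBS_PER_PU_STAGED = 4 # "Data from 4 FEB are routed to one PU"


def channels_per_dsp(n_pus):
    """Channels each DSP must process when the ROD carries n_pus PUs."""
    return CELLS_PER_ROD // (n_pus * DSPS_PER_PU)


full = channels_per_dsp(PUS_PER_ROD)          # fully equipped ROD
staged = channels_per_dsp(PUS_PER_ROD // 2)   # staging mode: half of the PUs

# Inferred, not stated on the slides: channels delivered by one FEB.
channels_per_feb = staged * DSPS_PER_PU // FEBS_PER_PU_STAGED

print(full, staged, channels_per_feb)
```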

12 The DSP Processing Unit. [Block diagram labels: config; FEB1 to FEB4 inputs; two input FPGAs (Apex 20k160); TMS320C6414 DSPs (EMIFA, EMIF B, EXT_INT, McBSP0, McBSP1, McBSP2, HPI); two FIFOs (4k*16); Acex 1k30; data stream; TTC interface (BCID, TType); VME interface; JTAG; 16-bit and 64-bit buses.]

13 The DSP Processing Unit. [Photo labels: input FPGA, DSP, output FPGA, FIFO.]

14 PU Software Summary
Input data: serial data in FEB format.
Input FPGA: parallelized data in DSP format.
DSP: for 128 channels per event, E calculation or E, t, χ² calculation.
Output FPGA: TTC data.
Output data: 16-bit integer E; or 16-bit integer E with 32-bit t, χ² and gain; or 32-bit E with 32-bit t, χ² and gain.
[Diagram labels: « Programmable » part; fixed part; in; out; ROD histograms; VME interface.]

15 The TMS320C6414: a latest-generation DSP from TI. [Block diagram labels: C64x CPU core (instruction decoding, 8 calculation units, 64 registers); two cache memory blocks labeled "16 kB data"; 1 MB central memory; DMA controller; peripherals; external memory interface.]

16 The DSP code structure

17 DSP Software. Developed with Code Composer Studio (CCS). The whole code is written in C, except the physics loops, which are written in linear assembly and then optimized using CCS. This keeps the code complexity limited and gives good legibility and maintainability.

18 Example of Linear Assembly
Calculation of the cell energy: $E = \sum_i a_i (s_i - p)$, evaluated as $E = \sum_i a_i s_i - \sum_i a_i p$. Let the compiler do all the laborious work of parallelizing, pipelining and register allocation.

    mpy    s1, a1, sa1        ; a1 s1
    dotp2  a23, s23, sa23     ; a2 s2 + a3 s3
    dotp2  s45, a45, sa45     ; a4 s4 + a5 s5
    add    sa23, sa45, sa25   ; sum of a_i s_i, i = 2..5
    add    sa1, sa25, sa15    ; sum of a_i s_i, i = 1..5
    sub    sa15, px, e        ; E = sum a_i s_i - sum a_i p

19 DSP software results. Physics calculation of 128 channels: 3.5 µs. –Includes all the necessary histograms. –Includes τ, χ² for a fraction of 10 % of high-energy cells. 30 to 40 % of the time is due to stall cycles. –Cycles lost because data are not in the cache.

20 The Cache Memory. When a data item or an instruction is not in the cache memory, the CPU stalls for 6 cycles until it has been copied from the central memory to the cache. For the E calculation, 6 data items have to be read, which gives 36 wait cycles. The behaviour of the cache memory must be understood in order to improve these numbers. [Block diagram labels: C64x CPU core; two cache memory blocks labeled "16 kB data"; 1 MB central memory; DMA controller; peripherals.]

21 Which improvements? L1D mapping: take care of which data is loaded, from which address and in what order. L1D pipelining: use consecutive loads (1 miss: 6 wait cycles; 2 misses: 8 wait cycles; 4 misses: 12 wait cycles). L1D access optimization. Samples preloading. Interleaved histograms. [Block diagram labels: C64x CPU core; two cache memory blocks labeled "16 kB data"; 1 MB central memory; DMA controller; peripherals.]
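The benefit of issuing consecutive loads can be estimated with a small stall-cycle model. The isolated-miss cost of 6 cycles and the three pipelined figures come from the slides; the linear formula `4 + 2 * n` is only an assumption that happens to pass through those three points, so its value for other miss counts is an extrapolation, not a measured number.

```python
MISS_PENALTY = 6  # stall cycles for one isolated L1D miss (from the slides)


def isolated_stalls(n_misses):
    """Stall cycles when every miss is served on its own."""
    return MISS_PENALTY * n_misses


def pipelined_stalls(n_misses):
    """Assumed linear fit to the quoted points: 1 -> 6, 2 -> 8, 4 -> 12."""
    return 0 if n_misses == 0 else 4 + 2 * n_misses


for n in (1, 2, 4, 6):
    print(n, isolated_stalls(n), pipelined_stalls(n))
```

Under this assumed model, the 6 reads of the E calculation would cost fewer stall cycles when their misses overlap than the 36 cycles counted for isolated misses.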

22 DSP software results. Physics calculation of 128 channels: 3.5 µs. –Includes all the necessary histograms. –Includes τ, χ² for a fraction of 10 % of high-energy cells. 30 to 40 % of the time is due to stall cycles. –Cycles lost because data are not in the cache. The complete code takes about 7 µs (600 MHz DSP). –Includes the RTX kernel, synchronization and send tasks, … This leaves a 30 % margin for further improvements.
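These timings translate directly into cycle counts at 600 MHz. The sketch below only redoes that arithmetic with the numbers quoted in the talk (10 µs budget, 7 µs complete code, 3.5 µs physics calculation, 128 channels, 30 to 40 % stall cycles); the variable names are illustrative.

```python
CLOCK_MHZ = 600          # one cycle lasts 1/600 µs
CHANNELS = 128


def us_to_cycles(microseconds):
    """Convert a duration in µs to CPU cycles at CLOCK_MHZ."""
    return round(microseconds * CLOCK_MHZ)


budget_cycles = us_to_cycles(10)      # requirement: less than 10 µs per event
total_cycles = us_to_cycles(7)        # complete code
physics_cycles = us_to_cycles(3.5)    # physics calculation only

cycles_per_channel = physics_cycles / CHANNELS
stall_low = physics_cycles * 30 // 100    # 30 % of the physics time
stall_high = physics_cycles * 40 // 100   # 40 % of the physics time
margin_percent = 100 * (budget_cycles - total_cycles) // budget_cycles

print(budget_cycles, total_cycles, physics_cycles)
print(cycles_per_channel, stall_low, stall_high, margin_percent)
```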

23 Agenda. Mid March: motherboard + PU assembled. May 2003: validation in standalone mode. Fall 2003: system test in the experiment environment. Spring 2004: production launch. Summer 2004: board installation at the LHC.

24 Conclusion: the ROD. Calculates the precise energy and timing of the calorimeter signals. 1 motherboard and 4 Processing Units. 1 PU = two 600 MHz TMS320C6414 DSPs. 30 % margin for future improvements. 200 RODs to be produced in 2004.

25 Thank You

