1
Moving NN Triggers to Level-1 at LHC Rates
Jean-Christophe Prévotet
Laboratoire des Instruments et Systèmes d'Ile de France
Outline: Triggering Problem in HEP; Adopted neural solutions; Specifications for Level 1 Triggering; Proposed architecture; Hardware Implementation; Results; Conclusion
2
Triggering problem in High Energy Physics
Trigger chain: Detector -> Level 1 Trigger (~1µs) -> Level 2 Trigger (~20µs) -> Level 3 Trigger -> Level 4 Trigger -> Offline event reconstruction
Each trigger level can reject an event.
Implementation labels in the diagram: Dedicated Hardware Implementation, Conventional Microprocessors.
Neural trigger: incoming data from sub-detectors; output Y~0 for background, Y~1 for physics.
3
Hardware Adopted Solutions
Current solutions, Level 1 Trigger: latency of 500ns => no digital circuits possible, or only straightforward circuits made of RAMs (lack of precision, small networks).
Current solutions, Level 2 Trigger: latency of 10µs => digital circuits can be used. Example: CNAPS in the H1 experiment => 8µs to execute a 64x64x1 net; DSPs.
Future solutions: the technology trend makes it possible to transpose the L2 complexity of neural computations into L1.
4
Level 1 Trigger Scheme
Timing specifications of the ATLAS experiment at LHC.
Data path: analog signals from the calorimeter -> preprocessor (digitization, pre-sums, …) -> demultiplex unit -> FPGAs (neural processing, 500ns) -> multiplex unit -> output data to Level 2 (every 25ns).
Main control module.
Data arrive each BC (25ns) and are processed in a time-multiplexed way.
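The two timing figures on this slide (new data every 25 ns, 500 ns of neural processing) suggest how many processing units must work in parallel. The following is a minimal sketch of that arithmetic and of a round-robin demultiplexer, assuming each FPGA unit handles one event at a time. The slide does not state the number of units, and the function names are illustrative only.

```python
BC_NS = 25          # bunch-crossing period, from the slide
LATENCY_NS = 500    # neural processing time per event, from the slide

def units_needed(latency_ns, period_ns):
    """Smallest number of units such that one is free whenever data arrive."""
    return -(-latency_ns // period_ns)  # ceiling division on integers

def dispatch(n_events, n_units):
    """Round-robin demultiplexing: event i is sent to unit i mod n_units."""
    return [i % n_units for i in range(n_events)]

n = units_needed(LATENCY_NS, BC_NS)   # assumed: one event per unit at a time
schedule = dispatch(40, n)            # unit index for the first 40 events
```

Under these assumptions a unit receives a new event only after `n` bunch crossings, which is exactly the 500 ns it needs to finish the previous one.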
5
Specifications
Network: 128 inputs, 64 hidden neurons, 4 outputs (electrons, tau, hadrons, jets)
Execution time: 500 ns, with data arriving every BC=25ns
Weights coded in 16 bits
States coded in 8 bits
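To get a feel for what these specifications demand, the sketch below counts the multiply-accumulate (MAC) operations of a 128x64x4 network and divides by the 500 ns budget. It ignores biases and activation functions, which the slide does not detail, so the figures are rough estimates rather than numbers from the presentation.

```python
LAYERS = (128, 64, 4)   # inputs, hidden neurons, outputs (from the slides)
BUDGET_NS = 500         # execution time from the slide
WEIGHT_BITS = 16        # weight coding from the slide

def mac_count(layers):
    """Number of multiply-accumulates in a fully connected network."""
    return sum(a * b for a, b in zip(layers, layers[1:]))

macs = mac_count(LAYERS)            # 128*64 + 64*4
macs_per_ns = macs / BUDGET_NS      # required throughput (MAC per ns = GMAC/s)
weight_bits = macs * WEIGHT_BITS    # weight storage, biases ignored
```

The required throughput is far beyond a single sequential multiplier at FPGA clock rates, which is one way to read the need for a large matrix of parallel processing elements.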
6
Neural processor Architecture
Blocks in the diagram: I/O module, control unit, matrix of n*m Processing Elements (PEs), row accumulators (ACC), TanH units.
256 PEs for a 128x64x4 network.
1 matrix row computes a neuron.
The result is back-propagated to calculate the output layer.
TanH values are stored in a LUT.
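The row-per-neuron idea can be sketched in software. With 256 PEs and 64 hidden neurons there are 4 PEs per row; the sketch assumes each PE handles a contiguous block of 32 of the 128 inputs, which the slide does not specify. The tanh look-up table addressing and scaling (`a / 32`, 8-bit signed output) are also assumptions chosen for illustration.

```python
import math

N_INPUTS, PES_PER_ROW = 128, 4       # 256 PEs / 64 hidden neurons = 4 per row
CHUNK = N_INPUTS // PES_PER_ROW      # 32 inputs per PE (assumed split)

def pe_partial(x, w):
    """One PE: multiply-accumulate over its share of the inputs."""
    acc = 0
    for xi, wi in zip(x, w):
        acc += xi * wi
    return acc

def row_neuron(x, w):
    """One matrix row: the row accumulator adds the PE partial sums."""
    return sum(pe_partial(x[i:i + CHUNK], w[i:i + CHUNK])
               for i in range(0, N_INPUTS, CHUNK))

# tanh look-up table: signed 8-bit address -> signed 8-bit value (assumed scaling)
TANH_LUT = [round(127 * math.tanh(a / 32)) for a in range(-128, 128)]

def tanh_lut(a):
    """Read the activation for a signed 8-bit argument a in [-128, 127]."""
    return TANH_LUT[a + 128]
```

Splitting the sum across PEs does not change the result; it only shortens the time per neuron, since the four partial sums can be computed in parallel in hardware.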
7
PE architecture
[Block diagram. Labels: input data (8 bits), weights memory (Weights mem) with address generator (Addr gen), 16-bit bus, multiplier (X), accumulator (+), control module, cmd bus, Data in, Data out.]
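The bus widths in this diagram can be related to the accumulator size with a short calculation. The sketch assumes the 16-bit bus carries the weights (as in the specifications), that each PE accumulates 32 products (128 inputs shared by 4 PEs), and that operands are signed. None of this is stated on the slide itself.

```python
STATE_BITS, WEIGHT_BITS = 8, 16   # from the specifications slide
MACS_PER_PE = 32                  # assumed: 128 inputs shared by 4 PEs

def accumulator_bits(a_bits, b_bits, n_terms):
    """Bits needed to sum n_terms signed products without overflow."""
    product_bits = a_bits + b_bits
    growth = (n_terms - 1).bit_length()   # ceil(log2(n_terms))
    return product_bits + growth

def mac(states, weights):
    """Multiply-accumulate as one PE would perform it, one pair per cycle."""
    acc = 0
    for s, w in zip(states, weights):
        acc += s * w
    return acc

width = accumulator_bits(STATE_BITS, WEIGHT_BITS, MACS_PER_PE)
worst = mac([-128] * MACS_PER_PE, [-32768] * MACS_PER_PE)  # largest magnitude
```

Under these assumptions the accumulator needs 29 bits, which would be consistent with the 29-bit figure on the row accumulator slide if that figure is the PE output width.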
8
Row Accumulator
[Block diagram. Labels: Din, input bus (data coming from other rows), output bus (data going to other rows), multiplexers / demultiplexers, adder, register bank (Registers), truncation unit (Trunc). Bus widths shown: 29, 8, 32, 8, 8.]
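A possible reading of the widths on this slide is that 29-bit partial sums are added in a 32-bit accumulator and then reduced to an 8-bit state. The sketch below illustrates that reading. The shift amount and the saturation step are assumptions; the slide only names a truncation unit without saying which bits are kept.

```python
def truncate_to_8(acc, shift):
    """Drop `shift` low-order bits, then clamp to the signed 8-bit range."""
    v = acc >> shift                 # arithmetic shift on Python integers
    return max(-128, min(127, v))    # saturation is an assumption

def row_accumulate(partials, shift=20):
    """Add the PE partial sums of one row and reduce the total to 8 bits."""
    total = sum(partials)            # four 29-bit values need at most 31 bits
    if not -(2 ** 31) <= total < 2 ** 31:
        raise OverflowError("total exceeds the 32-bit accumulator")
    return truncate_to_8(total, shift)

small = row_accumulate([2 ** 20] * 4)      # 4 * 2**20 >> 20
large = row_accumulate([2 ** 27] * 4)      # clamps at the 8-bit maximum
negative = row_accumulate([-(2 ** 20)] * 4)
```

In a real design the choice of kept bits would depend on the fixed-point scaling of weights and states, which the presentation does not give.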
9
Hardware Implementation in an FPGA
What is an FPGA…
[Diagram of the FPGA fabric. Labels: I/O ports, block RAMs, programmable connections, programmable logic blocks, DLL.]
[Diagram of a Xilinx Virtex slice. Labels: LUT (inputs F1 to F4 and G1 to G4), Carry & Control (x2), D/Q flip-flops (x2); signals bx, cin, cout, x, xb, xq, y, yq.]
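The F1 to F4 and G1 to G4 pins in the slice diagram are the inputs of 4-input look-up tables, each of which can realize any Boolean function of four variables through a 16-bit truth table. The following is a small software model of one such LUT; the helper names are hypothetical and the bit ordering (F1 as least significant address bit) is an assumption.

```python
def make_lut4(func):
    """Build the 16-bit truth table of a 4-input Boolean function."""
    table = 0
    for addr in range(16):
        bits = [(addr >> i) & 1 for i in range(4)]   # F1, F2, F3, F4
        if func(*bits):
            table |= 1 << addr
    return table

def lut4(table, f1, f2, f3, f4):
    """Evaluate the LUT: the four inputs select one bit of the table."""
    addr = f1 | (f2 << 1) | (f3 << 2) | (f4 << 3)
    return (table >> addr) & 1

# Example: 4-input parity (XOR of all inputs)
xor4 = make_lut4(lambda a, b, c, d: a ^ b ^ c ^ d)
```

Configuring the FPGA amounts to loading such truth tables and setting the programmable connections between blocks.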
10
Results
Timing
- Time in clock cycles for the whole neural net: around 60 cycles.
- Target clock frequency: 120MHz => VIRTEX2 compatible
- Processing time: 8.33ns (one clock period at 120MHz)
What is done today…
- Description of the whole design in VHDL
- Functional simulations of the different modules (multipliers, acc, control, PE, ...)
- Individual module synthesis (translated into logic blocks)
What has to be done…
- Global synthesis and implementation on the FPGA
- Timing and resources optimization
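The timing figures can be cross-checked against the 500 ns Level 1 budget with a few lines of arithmetic. This is only a consistency check on the numbers quoted in the slides, assuming the "around 60 cycles" figure is exactly 60.

```python
CYCLES = 60          # approximate cycle count from the slide
F_CLK_MHZ = 120      # target clock frequency from the slide
BUDGET_NS = 500      # Level 1 execution time from the specifications

period_ns = 1000 / F_CLK_MHZ             # one clock period in ns
total_ns = CYCLES * 1000 / F_CLK_MHZ     # whole-network latency in ns

def cycles_in_budget(budget_ns, f_clk_mhz):
    """Largest whole number of clock cycles that fits in the budget."""
    return budget_ns * f_clk_mhz // 1000

max_cycles = cycles_in_budget(BUDGET_NS, F_CLK_MHZ)
```

At 120 MHz the budget is used up entirely by 60 cycles, so any extra cycle in the design would require a faster clock.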
11
Proposed architecture
Advantages
- Flexibility: implementation in an FPGA => easily re-configurable
- Coding precision easily changeable: weight precision, activation functions, etc.
- Processing time: doesn't really depend on the number of neurons in the hidden layer
Disadvantages
- 1 neuron = 4 added PEs
- Resource consuming => many FPGAs required
- Lower performance than custom circuits
Summary
- Implementation of a digital neural network is feasible in real time
- Transposition of Level 2 concepts into Level 1