Pattern recognition methods for data handling of the CBM experiment
Gennady Ososkov (LIT JINR, Dubna, Russia)
Andrey and Semeon Lebedevs (GSI, Darmstadt, Germany and LIT JINR, Dubna, Russia)
email: ososkov@jinr.ru
Talk at NEC-2009, Varna, 07-14 September 2009 (presented 08.09.09)

CBM at FAIR
FAIR (Facility for Antiproton and Ion Research), GSI, Darmstadt, Germany:
- accelerator complex serving several experiments at a time (up to 5) from a broad community
- SIS100 and SIS300 synchrotrons
- highest beam intensities (e.g. 2x10^13/s 90 GeV protons and 10^9 Au ions/s at 45 AGeV beam energy)
- rare isotope beams
- first experiments ~2012, fully operational ~2016
CBM: Compressed Baryonic Matter

The CBM setup
Two configurations: for electron identification and for muon identification.
- STS (Silicon Tracking System): track, vertex and momentum reconstruction
- MVD (Micro-Vertex Detector): determination of secondary vertices
- RICH (Ring Imaging CHerenkov): electron identification (two designs)
- MuCH (Muon CHambers): muon identification
- TRD (Transition Radiation Detector): global tracking and identification of electrons
- TOF (Time Of Flight): time-of-flight measurement for hadron identification
- ECAL (Electromagnetic CALorimeter): measurement of photons and neutral particles

MuCH and TRD challenges
Common to both types of detectors: massive absorbers.
MuCH:
- Alternating detector-absorber layout for continuous tracking of the muons through the iron absorber
- Two MuCH designs, to measure:
  1. Low-mass vector mesons: 5 Fe absorbers (125 cm), p > 1.5 GeV/c
  2. Charmonium: 6 Fe absorbers (225 cm), p > 2.8 GeV/c
- 2(3) detector stations between the absorbers
- High hit density (up to 1 hit per cm^2 per event), high event rates (10^7 events/s), position resolution < 300 μm
TRD:
- Tracking and electron identification via energy losses
- 12 identical layers grouped in 3 stations
- Layer = absorber-radiator and MWPC readout
- Stations at 5 m, 7 m and 9 m
- Pad size: 0.03-0.05 cm across and 0.27-3.3 cm along the pad

CBM RICH detector
RICH in CBM will serve for electron identification at momenta up to 10 GeV/c, in order to study vector mesons and J/ψ.
[Figure: sketch of the STS and the RICH detector (mirrors, photodetectors), track extrapolation and track projection onto the photodetector plane]
Two RICH options are under discussion:
- Large: length 2.5 m, mirror size 22 m^2, 200k channels
- Compact: length 1.5 m, mirror size 11.8 m^2, 55k channels
Main problems of ring recognition:
- high ring density (~200 rings per event, many secondary electrons);
- many overlapping rings;
- distortions and elliptic shape of the rings;
- measurement errors (the sensitive pad is 0.6x0.6 cm and the mean ring radius is ~6 cm);
- ring-track matching (high density of projected tracks).
[Figure: a fragment of the photodetector plane]
On average there are 2-3k points per event, forming 200-300 rings.

CBM data peculiarities
Particular features of the data from the CBM detector:
- data arrive at an extremely high rate: about 800 particles per central event at reaction rates up to 10 MHz;
- signal particles are very rare (~10^-6; ratio of low-mass vector mesons ~10^-5);
- recognized patterns are discrete and have a quite complex structure, due to the very sophisticated designs of the detectors used (strip and straw-tube chambers etc.);
- noise counts are numerous and correlated.
Requirements for processing such data:
- maximum speed of data analysis: no triggers, only on-line!
- satisfactory accuracy and efficiency of the methods for estimating the physical parameters of interest to experimentalists
- all software must be integrated into the CBM framework, available to every collaboration member

Global tracking
The algorithm combines local track following with subsequent global track joining within the event.
- Initial seeds are tracks reconstructed in the STS (L1 tracking by a fast cellular automaton)
- Tracking is based on:
  - track following
  - Kalman Filter (KF)
  - validation gate
  - different hit-to-track association techniques
- Two main steps:
  - tracking
  - global track selection
[Figure: STS, MUCH, TRD, TOF; L1 STS tracking, global LIT tracking, hit-to-track merging]

Track following and Kalman Filter
Track propagation through absorber material.
- We regard a track in space as a dynamic system described by a state vector x_k.
- Its dynamics are governed by a transport matrix; measurements relate to x_k via a projection matrix.
- e_k and w_k are the corresponding errors: independent random values with zero mean.
- Prediction: predicted state, residuals and the residuals covariance matrix.
- Filtration: after adding the next hit, the Kalman gain matrix K_k and the new x_k, R_k and C_k are calculated; the total track χ^2 is increased accordingly.
- A validation gate is applied at station k-1.
[The formulas shown on the slide are missing from this transcript.]
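The prediction, filtration and validation-gate steps listed above can be sketched for the simplest possible case. This is a minimal illustration under stated assumptions, not the CBM implementation: the state is a straight-line pair [x, t_x], the measurement is a single coordinate, and the names `q_slope`, `meas_var` and the gate value 9.0 are hypothetical.

```python
def kalman_step(x, C, dz, q_slope, meas, meas_var, gate=9.0):
    """One predict + filter step for a straight-line state x = [pos, slope].

    Assumed model: transport matrix F = [[1, dz], [0, 1]], projection
    matrix H = [1, 0], process noise added to the slope variance only.
    Returns (state, covariance, chi2 increment or None if hit rejected).
    """
    # Prediction: x_pred = F x, C_pred = F C F^T + Q
    xp = [x[0] + dz * x[1], x[1]]
    c00 = C[0][0] + dz * (C[0][1] + C[1][0]) + dz * dz * C[1][1]
    c01 = C[0][1] + dz * C[1][1]
    c10 = C[1][0] + dz * C[1][1]
    c11 = C[1][1] + q_slope
    # Residual of the measurement and its variance R = H C_pred H^T + V
    r = meas - xp[0]
    R = c00 + meas_var
    chi2 = r * r / R
    # Validation gate: reject hits that are too far from the prediction
    if chi2 > gate:
        return xp, [[c00, c01], [c10, c11]], None
    # Filtration: Kalman gain K = C_pred H^T / R, then update state and covariance
    k0, k1 = c00 / R, c10 / R
    xf = [xp[0] + k0 * r, xp[1] + k1 * r]
    Cf = [[(1 - k0) * c00, (1 - k0) * c01],
          [c10 - k1 * c00, c11 - k1 * c01]]
    return xf, Cf, chi2
```

The returned χ^2 increment would be added to the total track χ^2; a `None` increment means the hit fell outside the gate and the predicted state is passed on unchanged.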

Track propagation
Track propagation components:
- Extrapolation. Two models:
  - straight line in the absence of a magnetic field;
  - solution of the equation of motion in a magnetic field with the 4th-order Runge-Kutta method, with a parallel integration of the derivatives.
- Material effects:
  - energy loss (ionization: Bethe-Bloch; bremsstrahlung: Bethe-Heitler; pair production);
  - multiple scattering (Gaussian approximation).
- Navigation: based on the ROOT TGeoManager.
The algorithm: the trajectory is divided into steps. For each step:
- straight-line approximation for finding intersections with different materials (geometry navigator);
- geometrical extrapolation of the trajectory;
- material effects are added at each intersection point.
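The Runge-Kutta extrapolation can be illustrated with a short sketch. It assumes a state (x, y, t_x, t_y) propagated along z, a field passed in as a constant tuple (a real field map depends on position), and units of GeV/c, tesla and metres; the function names are hypothetical, and the parallel integration of the derivatives and the material effects mentioned above are not included.

```python
import math

KAPPA = 0.299792458  # (GeV/c) per (T * m)

def derivs(state, qp, field):
    """d(state)/dz for state = (x, y, tx, ty); qp is charge over momentum."""
    _, _, tx, ty = state
    bx, by, bz = field
    h = math.sqrt(1.0 + tx * tx + ty * ty)
    dtx = KAPPA * qp * h * (tx * ty * bx - (1.0 + tx * tx) * by + ty * bz)
    dty = KAPPA * qp * h * ((1.0 + ty * ty) * bx - tx * ty * by - tx * bz)
    return (tx, ty, dtx, dty)

def rk4_step(state, qp, field, dz):
    """Classical 4th-order Runge-Kutta step of length dz along z."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))
    k1 = derivs(state, qp, field)
    k2 = derivs(shifted(state, k1, dz / 2), qp, field)
    k3 = derivs(shifted(state, k2, dz / 2), qp, field)
    k4 = derivs(shifted(state, k3, dz), qp, field)
    return tuple(s + dz / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))
```

For a uniform field along y and t_y = 0 the quantity t_x / sqrt(1 + t_x^2) changes linearly with z, which gives a simple cross-check of the integrator.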

KF details
1. Different hit-to-track assignments:
- Branching: a branch is created for each hit in the validation gate. Track recognition efficiency 0.949
- Nearest neighbor: the hit in the validation gate that is closest in Euclidean statistical distance is assigned to the track. Track recognition efficiency 0.927
- Weighting: no track splitting; all the hits in the validation gate are collected. Track recognition efficiency 0.941
2. Track selection:
- Aim: remove clone and ghost tracks
- Tracks are sorted by their quality, obtained from the chi-square and the track length
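The difference between the nearest-neighbor and branching assignments can be sketched in one dimension. The statistical distance used here (squared residual divided by the sum of the predicted and hit variances), the gate value and all names are assumptions made for illustration only.

```python
def gate_candidates(x_pred, var_pred, hit_xs, hit_var, gate=9.0, max_branches=None):
    """Indices of hits inside the validation gate, nearest first.

    With max_branches=None every hit in the gate starts a branch;
    a finite max_branches keeps only the nearest ones.
    """
    scored = []
    for idx, x in enumerate(hit_xs):
        d2 = (x - x_pred) ** 2 / (var_pred + hit_var)  # normalized distance
        if d2 <= gate:
            scored.append((d2, idx))
    scored.sort()
    if max_branches is not None:
        scored = scored[:max_branches]
    return [idx for _, idx in scored]

def nearest_neighbor(x_pred, var_pred, hit_xs, hit_var, gate=9.0):
    """Nearest-neighbor assignment: the single closest hit in the gate, or None."""
    best = gate_candidates(x_pred, var_pred, hit_xs, hit_var, gate, max_branches=1)
    return best[0] if best else None
```

Calling `gate_candidates` without a limit corresponds to branching, while `nearest_neighbor` keeps only one hit per station.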

Data processing for CBM RICH
[Figure: RICH hits (blue), found rings (red), track projections (green)]
Data processing stages:
- Ring recognition and evaluation of the ring parameters
- Compensating the optical distortions that lead to elliptic shapes of the rings
  [Figure: results of radius restoration for momentum < 25 GeV/c, Large RICH and Compact RICH]
- Matching the found rings to the tracks of particles of interest to physicists
- Eliminating fake rings, which could lead to wrong physical conclusions
- Accomplishing particle identification at a fixed level of ring recognition efficiency

Ring recognition algorithm
Two steps:
- Standalone ring finder: local search of ring candidates, based on local selection of hits and the Hough Transform.
- Global search (filter): the algorithm compares all ring candidates and keeps only good rings, rejecting clone and fake rings.
Time consumption: 99% / 1%

Ring recognition algorithm, local search
- Preliminary selection of hits
- Histogram of ring centers (fast Hough Transform)
- Ellipse fitter
- Ring quality calculation
- Remove hits of the found ring (only best matched hits)
- Output: ring array
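The idea of histogramming ring centers can be sketched with a plain (not fast) Hough Transform in which every triplet of hits votes for the circle passing through it. This is only an illustration under assumptions: the binning, the radius window and the function names are hypothetical, and the preliminary hit selection that keeps the number of triplets manageable is omitted.

```python
import math
from collections import Counter
from itertools import combinations

def circle_through(p1, p2, p3):
    """Center and radius of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)

def hough_ring(hits, r_min, r_max, bin_size):
    """Most voted (cx, cy, r, votes) among circles defined by hit triplets."""
    votes = Counter()
    for triplet in combinations(hits, 3):
        circle = circle_through(*triplet)
        if circle is None:
            continue
        cx, cy, r = circle
        if r_min <= r <= r_max:
            key = (round(cx / bin_size), round(cy / bin_size), round(r / bin_size))
            votes[key] += 1
    if not votes:
        return None
    (ix, iy, ir), n = votes.most_common(1)[0]
    return ix * bin_size, iy * bin_size, ir * bin_size, n
```

The peak of the accumulator gives a ring candidate whose parameters would then be refined by the fitter.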

Rejection of fake ring candidates, ring quality calculation
Nine ring parameters are selected for the ring quality calculation:
- number of hits in the ring;
- chi-squared;
- biggest angle between neighboring hits;
- number of hits in a small corridor around the ring;
- position of the ring on the photodetector plane;
- major and minor half axes of the ellipse;
- rotation angle of the ellipse vs. azimuthal angle.
An ANN derives the ring quality from these parameters. The ANN output provides a ring quality parameter, i.e. the probability that the ring candidate was found correctly.
[Figure: ANN output value for correctly found (solid line) and fake (dashed line) rings]

Ring recognition algorithm, global search
- Reject the candidate with worse quality if it shares more than N_max hits with a better-quality ring candidate.
- N_max is set to 30% of the total number of hits in the ring.
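The shared-hit rejection rule can be sketched as a greedy filter. Two points are assumptions here, since the slide does not settle them: N_max is taken as 30% of the hits of the candidate being tested, and a candidate is compared only with better candidates that were themselves accepted. The data layout (quality, set of hit ids) is hypothetical.

```python
def filter_candidates(candidates, max_shared_fraction=0.3):
    """Keep ring candidates that do not share too many hits with a better one.

    candidates: list of (quality, set_of_hit_ids); higher quality is better.
    """
    accepted = []
    for quality, hits in sorted(candidates, key=lambda c: c[0], reverse=True):
        n_max = max_shared_fraction * len(hits)
        # Compare with every better candidate that has already been accepted
        if all(len(hits & kept) <= n_max for _, kept in accepted):
            accepted.append((quality, hits))
    return accepted
```

Sorting by quality first guarantees that each candidate is only ever tested against better ones.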

RICH ring fitting methods
Circle fitting (used in the ring finding algorithm):
- program realization of the COP (Chernov-Ososkov-Pratt) method, based on the minimization of a functional
- the Newton method for nonlinear equations with one variable is used
- 3-4 iterations
- the algorithm is very robust to the initial parameters
- Ref: Comp. Phys. Commun., Volume 33, Issue 4, 1984, 329-333
Ellipse fitting (general, as a conic section):
- rings in the photodetector plane have a slightly elliptic shape; mean B/A for CBM RICH rings = 0.9
- the Taubin method is used: minimize P(x) over A, B, C, D, E, F, but measuring deviations along normals to the curve
- non-linearity is avoided by Taylor expansion
- non-iterative, very fast direct algorithm; no need for starting parameter values
- Ref: N. Chernov, J. Math. Imaging Vision, 27 (2007), 231-239
Thanks to A. Ayriyan (JINR, Dubna) and N. Chernov (USA).
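As a simple point of comparison, a direct algebraic circle fit can be written in a few lines. Note that this is the plain least-squares (Kasa-type) fit of x^2 + y^2 + Dx + Ey + F = 0, swapped in for illustration; it is neither the COP functional nor the Taubin ellipse fit named above, whose exact forms are not given in the transcript.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_circle_algebraic(points):
    """Least-squares fit of x^2 + y^2 + D x + E y + F = 0; returns (cx, cy, r)."""
    sx = sy = sxx = syy = sxy = sxz = syz = sz = 0.0
    for x, y in points:
        z = x * x + y * y
        sx += x; sy += y
        sxx += x * x; syy += y * y; sxy += x * y
        sxz += x * z; syz += y * z; sz += z
    # Normal equations M (D, E, F)^T = rhs
    M = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [-sxz, -syz, -sz]
    d = det3(M)
    if abs(d) < 1e-12:
        raise ValueError("degenerate configuration (e.g. collinear points)")
    sol = []
    for col in range(3):  # Cramer's rule, one unknown per column
        Mc = [row[:] for row in M]
        for r in range(3):
            Mc[r][col] = rhs[r]
        sol.append(det3(Mc) / d)
    D, E, F = sol
    cx, cy = -D / 2.0, -E / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - F)
```

Like the direct ellipse fit described above, this fit is non-iterative and needs no starting values, but it is known to be biased towards smaller radii when the hits cover only a short arc.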

Ring parameter correction
Why? How?
Example for the minor half axis of the ellipse (B):
- before correction: resolution 3.3%
- after correction: resolution 1.7%
[Figures: B distribution over the photodetector plane; B correction map; B distribution (cm) before and after correction]

Ring finding efficiency
Typical reaction for CBM: central Au+Au collisions at 25 AGeV beam energy (UrQMD).
Accepted rings = rings with >= 5 hits.

                                        Large            Compact
Radiator gas and length                 N2, 2.5 m        CO2, 1.5 m
Photodetector size (No. of channels)    9 m^2 (200k)     2.4 m^2 (55k)

[Figure: efficiency for e+ and e- embedded in central Au+Au collisions at 25 AGeV beam energy, Compact RICH and Large RICH]

Why do we need wavelets for handling invariant mass spectra?
We need them when the S/B ratio is << 1:
1. Smoothing after background subtraction without losing any essential information
2. Indicating resonances even in the presence of background
3. Evaluating peak parameters

Math background
The wavelet transform is a convolution of the invariant mass spectrum f(x) with a special biparametric wavelet function.
[Figure: noisy signal transformed into a wavelet surface]
The most famous example of a wavelet is the "Mexican hat", ψ(x) = (1 - x^2) exp(-x^2/2), which is in fact (up to sign) the second derivative (n=2) of a Gaussian with σ=1, x_0=0, A=1.
Wavelet analysis is commonly used as a means of smoothing signals and filtering out noise.

Specifics to keep in mind
1. Remarkable robustness to noise.
   [Figure panels: noisy signals and the corresponding wavelets; variation of granulation; bin missing]
2. Gibbs effect on edges.
3. Continuous wavelets are non-orthogonal and are therefore quite rarely used.
4. The wavelet G_2 transforms a Gaussian g(x; A, x_0, σ) into a wavelet of the same order, but with the parameters of that Gaussian. This is true for any order n and leads to the idea of looking for the peak parameters directly in the G_2 domain, without inverting the transform.
   [The formula shown on the slide is missing from this transcript.]

How to estimate peak parameters in the G_2 wavelet domain
Suppose we have a noisy invariant mass spectrum whose peak has a bell-shaped form.
1. Transform it by G_2 into the wavelet domain.
2. Look for the maximum of the wavelet surface, (b_max, a_max).
From the formula for the G_2 transform of a Gaussian, W_G2(a, b; x_0, σ), one can derive analytical expressions for the position of its maximum in terms of x_0 and σ, which should correspond to the found b_max and a_max. Thus we can use the coordinates of the maximum as estimates of the wanted peak parameters. From them we can obtain the halfwidth, the amplitude and even the integral.
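The procedure can be checked numerically on a noise-free Gaussian. The sketch below assumes the wavelet ψ(t) = (1 - t^2) exp(-t^2/2) and the normalization W(a, b) = a^(-1/2) ∫ f(x) ψ((x - b)/a) dx. Under these assumptions the transform of a Gaussian peaks at b = x_0 and a = sqrt(5) σ, and the peak height gives the amplitude; other normalizations change these relations, so the constants should not be read as the ones used on the slide. All function names are hypothetical.

```python
import math

def mexican_hat(t):
    """G_2 wavelet: minus the second derivative of exp(-t^2 / 2)."""
    return (1.0 - t * t) * math.exp(-0.5 * t * t)

def g2_transform(xs, fs, a, b):
    """W(a, b) on an equidistant grid xs with spectrum values fs."""
    dx = xs[1] - xs[0]
    total = sum(f * mexican_hat((x - b) / a) for x, f in zip(xs, fs))
    return total * dx / math.sqrt(a)

def estimate_peak(xs, fs, scales, shifts):
    """Estimate (x0, sigma, amplitude) from the maximum of the wavelet surface."""
    w_max, a_max, b_max = max((g2_transform(xs, fs, a, b), a, b)
                              for a in scales for b in shifts)
    sigma = a_max / math.sqrt(5.0)  # valid for the assumed 1/sqrt(a) normalization
    amplitude = w_max * 6.0 ** 1.5 / (
        math.sqrt(2.0 * math.pi) * 5.0 ** 1.25 * math.sqrt(sigma))
    return b_max, sigma, amplitude
```

The integral of the peak then follows as A σ sqrt(2π); the accuracy of all three estimates is limited by the step of the scale and shift grids.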

Real problems
1. CBM spectra: low-mass dileptons (muon channel)
- ω meson, Gauss fit of the reconstructed signal: M = 0.7785, σ = 0.0125, A = 1.8166, I_g = 0.0569
- ω meson, wavelets: M = 0.7700, σ = 0.0143, A = 1.8430, I_w = 0.0598
[Figure: wavelet spectrum with the ω-meson and φ-meson peaks marked]
Even the φ- and [symbol missing] mesons are visible in the wavelet space, so their parameters could be extracted.
Thanks to Anna Kiseleva.

2. Wavelet application for resonance structure study
[Figure: invariant mass distributions of γγ pairs (dC) without (upper panel) and with (bottom panel) the background subtraction]
Thanks to V. Toneev.

2. Wavelet application for resonance structure study - II
[Figure: the invariant mass distribution of γγ pairs and its Gaussian wavelets of the 8th order for dC interactions; a strip of the abscissa is shown in logarithmic scale]

Back to CBM tracking and RICH ring recognition
Speeding up is the main problem now.
Status of speeding up, on the CBM tracking example. Two stages:
1. Optimizing all tracking algorithms to make them as fast as possible
2. Introducing parallelization

Optimizing step 1: number of branches
Motivation: the main problem with the branching algorithm is that its computational and memory requirements can grow with time and saturate the computing system.
Solution: only a limited number of the nearest hits in the validation gate can start a new branch.

Max. number of hits in the validation gate   30    20    15    10    7     5     4     3     2     1
Time, s/event                                1.91  1.63  1.37  1.05  0.8   0.62  0.52  0.44  0.34  0.20

Tracking efficiency, % (column positions not preserved): 93.1, 92.9, 91.8
4 times faster!
MUCH tracking, UrQMD 25 AGeV Au-Au + 10 muons. CPU: Intel C2D P8400

Optimizing step 2: fast search of hits
- Hits are sorted by x position
- Binary search is used to find the MIN and MAX indices of the hits compatible with the track, taking into account the maximum measurement error on the station
- The loop then runs only over the hits between the MIN and MAX indices
2 times increase in speed: 1.05 s/event -> 0.52 s/event (MUCH tracking, UrQMD + 10 muons). CPU: Intel C2D P8400
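The fast search can be sketched with a standard binary search over the sorted x positions. The window half-width `dev` stands for whatever combination of prediction error and maximum measurement error is used on the station; that choice and the function name are assumptions.

```python
from bisect import bisect_left, bisect_right

def hit_index_window(sorted_x, x_pred, dev):
    """Return (min_index, max_index) such that sorted_x[min_index:max_index]
    holds exactly the hits with x in [x_pred - dev, x_pred + dev]."""
    lo = bisect_left(sorted_x, x_pred - dev)
    hi = bisect_right(sorted_x, x_pred + dev)
    return lo, hi
```

Only the hits in `sorted_x[lo:hi]` then need to be tested against the validation gate, so the cost per track drops from linear to logarithmic in the number of hits on the station, plus the size of the window.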

Optimizing step 3: simple geometry
Monte Carlo geometry:
- very detailed (800k nodes)
- navigation is based on the ROOT TGeo
Simplified geometry:
- there is no need for a detailed geometry for tracking purposes
- stations and passive materials are approximated as planes perpendicular to the beam pipe
- the number of nodes is reduced from 800k to 100
- navigation in such a geometry is much simpler

Efficiency [%] / time [s/event]. CPU: Intel C2D P8400

Method       ROOT TGeo      Simplified     Speed-up factor
Branching    94.9 / 0.88    94.0 / 0.20    4.4
NN           92.6 / 0.20    92.2 / 0.06    3.3
Weighting    94.0 / 0.31    92.6 / 0.11    2.8

The total optimization speed-up factor is at the level of 8-10.

Parallelism on the level of a dual-core CPU
- The easiest solution: applying the Intel Threading Building Blocks library
- CPU: Intel C2D P8400 with 2 cores
- Serial vs. parallel nearest neighbor method: speed-up factor 1.59

Summary and outlook
- Track and ring recognition methods for the CBM experiment were discussed; the vital need to speed up all recognition algorithms was stressed.
- The reconstruction time depends on the parameters selected in the track and ring finders, which trade efficiency against speed.
- The corresponding algorithms were significantly optimized in terms of calculation speed without losing any efficiency (the gain is ~10 times!).
- All developed algorithms were tested on large statistics of simulated events, then included into the CBM framework, and are now intensively used by CBM collaboration members.
- Possibilities to elaborate parallel versions of these algorithms look promising, but need more advanced computing systems.
- Particle identification methods and wavelet approaches for handling invariant mass spectra were also implemented in the corresponding software, tested and applied to simulated and real data.
- New approaches like Boosted Decision Trees and fuzzy inference are being studied to improve the recognition efficiency.

Thanks for your attention!

Back-up slides

Outline
1. Artificial Intelligence (AI) and pattern recognition (PR)
2. The CBM experiment
3. CBM data peculiarities and corresponding PR problems
4. Global tracking for MuCH and TRD
5. RICH ring recognition
6. Particle identification
7. Invariant mass spectrum handling by wavelets
8. Speeding up approaches
9. Summary and outlook

What is Artificial Intelligence (AI)
Turing's "polite" definition: if a machine acts as intelligently as a human being, then it is as intelligent as a human being.
AI is a hierarchical system for information processing and for coordinating physical actions according to a prescribed goal.
An AI system can input data automatically and has such features of the human intellect as:
- information perception and processing for feature extraction and decision making (this AI feature is usually interpreted as Pattern Recognition);
- ability for learning and self-learning, for generalizing the learned knowledge and the methods of already solved problems;
- ability to present results in a form suitable for human understanding.

Pattern Recognition
Pattern recognition (PR) takes in raw data (patterns) and classifies them on the basis of either a priori knowledge or statistical information extracted from the patterns. The patterns to be classified are usually groups of measurements or observations, defining points in an appropriate multidimensional space.
The best-known PR methods applied in HEP are:
- Hough transform
- Kalman filter
- Cellular automata
- Artificial Neural Networks
- Boosted Decision Trees
- Fuzzy (soft) approaches
- Wavelet analysis
Applications: tracking and ring recognition; particle identification; image enhancement and data compression.

Recognition methods in HEP
Local: track following is a typical example.
- the algorithm starts by selecting a few initial points,
- then predicts the position of the next point, using some of the already found points to extrapolate with a known track model,
- found points are added to the current track candidate,
- the candidate is rejected after a certain number of bad attempts.
Global:
- The method is called global when all points enter the event recognition algorithm in a similar manner.
- Such an algorithm may be considered a general handling of the total set of measured coordinates of an event.
- Examples of the global approach for track chambers are:
  - template selection
  - Hough transform (variable slope histogramming)
  - neural networks

Solution of the initial value problem: Radon-Hough transform (RHT)
The RHT maps the measurement space to the parameter space.
Point-to-line correspondence: a track x = x_0 + t_x z gives, for each measured point (x, z), the line x_0 = x - t_x z in the parameter space.
Variable Slope Histograms:
- The Method of Variable Slope Histograms is a particular case of the Hough transform for straight-line recognition, whose implementation is much faster than the classic RHT.
- A similar algorithm was also developed for circle recognition, but it operates in 3-D space.
- However, the RHT is too time-consuming and needs a drastic speed-up to be used for CBM.
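The point-to-line correspondence x_0 = x - t_x z can be turned into a small accumulator: for every trial slope, each point votes for the intercept bin it would imply. This is a minimal sketch of the general idea with hypothetical names and binning, not the optimized variable slope histogram implementation referred to above.

```python
from collections import Counter

def hough_lines(points, slopes, x0_bin):
    """Find the (slope, x0) cell with the most votes.

    points: iterable of (x, z) measurements; slopes: trial values of t_x.
    Returns (t_x, x0 at the bin center, number of votes).
    """
    acc = Counter()
    for x, z in points:
        for i, tx in enumerate(slopes):
            x0 = x - tx * z  # intercept implied by this point and slope
            acc[(i, round(x0 / x0_bin))] += 1
    (i, j), votes = acc.most_common(1)[0]
    return slopes[i], j * x0_bin, votes
```

Points lying on one straight track fall into the same intercept bin only for the correct slope, which is what produces the peak.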
