Presentation on theme: "Pattern Recognition Random Thoughts"— Presentation transcript:
1 Pattern Recognition Random Thoughts. N.A. Graf, UTeV, March 2, 2000
2 Pattern Recognition. There are many kinds of patterns: visual, auditory, temporal, logical, … Using a broad enough interpretation, we can find pattern recognition in every intelligent activity. No single theory of pattern recognition can possibly cope with such a broad range of problems.
3 Overview. Restrict our attention to the following three classes of pattern recognition techniques: Template Matching (global, fixed patterns); Hough Transform (global, parameterized patterns); Kalman Filter (local, dynamic state following).
4 Suppose that we are working with visual patterns, and we know that the patterns of interest represent the 26 letters of the Roman alphabet. Then we can say that the pattern recognition problem is one of assigning the input to one of 26 classes. In general, we will limit ourselves to the problem of deciding if the input belongs to Class 1 or Class 2 or ... or Class c.
5 An obvious approach is to compare the input with a standard pattern for each class, and to choose the class that matches best. The obvious problem with this approach is that it doesn't say what to compare or how to measure the degree of match.
6 Template Matching. Once digitized, images can be compared bit by bit with a stored template for each class in order to classify them. Works very well in specific cases, but not in general (fonts, shearing, rotation, etc.).
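A minimal sketch of bit-by-bit matching, assuming toy 5×5 binary images and two hand-made templates. The data and the names `TEMPLATES`, `mismatches`, and `classify` are illustrative and not from the talk. The class whose template differs from the input in the fewest pixels wins:

```python
# Toy 5x5 binary templates for two classes (illustrative data, not from the talk).
TEMPLATES = {
    "I": ["..#..",
          "..#..",
          "..#..",
          "..#..",
          "..#.."],
    "L": ["#....",
          "#....",
          "#....",
          "#....",
          "#####"],
}

def mismatches(image, template):
    """Count pixels where image and template differ (Hamming distance)."""
    return sum(a != b
               for row_i, row_t in zip(image, template)
               for a, b in zip(row_i, row_t))

def classify(image, templates=TEMPLATES):
    """Return (best_class, mismatch_count) for the closest template."""
    return min(((name, mismatches(image, t)) for name, t in templates.items()),
               key=lambda pair: pair[1])

noisy_L = ["#....",
           "#....",
           "#.#..",   # one spurious pixel
           "#....",
           "####."]   # one missing pixel
print(classify(noisy_L))
```

Here the noisy input differs from the "L" template in 2 pixels and from the "I" template in 10, so it is assigned to "L". A sheared or rotated input would defeat this pixel-wise comparison, which is the limitation noted above.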
7 Template Matching in HEP. In high energy physics experiments, the detectors are fixed, so template matching is a good solution for fast characterization of events. Commonly used to trigger on charged particle tracks. Use Monte Carlo (MC) simulation to build up a library of the most probable patterns.
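As a rough illustration of a pattern library, the sketch below assumes a hypothetical detector of four layers with fixed-width strips and straight tracks from the origin. The geometry (`LAYERS`, `PITCH`), the slope range, and the function names are all invented for the example. Random tracks are thrown, the strip pattern of each is recorded, and the trigger decision becomes a simple lookup:

```python
import random
from collections import Counter

LAYERS = [1.0, 2.0, 3.0, 4.0]   # hypothetical layer positions along x
PITCH = 0.5                      # hypothetical strip width in y

def hit_pattern(slope):
    """Strip index hit in each layer by a straight track y = slope * x."""
    return tuple(int((slope * x) // PITCH) for x in LAYERS)

def build_bank(n_tracks=10000, seed=1):
    """Toy 'Monte Carlo': throw random tracks and count the patterns they make."""
    rng = random.Random(seed)
    return Counter(hit_pattern(rng.uniform(-1.0, 1.0)) for _ in range(n_tracks))

bank = build_bank()

def fires(hits, bank=bank):
    """Trigger decision: do the observed strips match any stored pattern?"""
    return tuple(hits) in bank

print(len(bank), "distinct patterns stored")
print(fires(hit_pattern(0.3)))      # a genuine track pattern
print(fires((0, 5, -3, 2)))         # strips no straight track could produce
```

The counts stored in the `Counter` would also allow keeping only the most probable patterns, at the cost of some efficiency for rare ones.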
8 Parametric Feature Extraction. Often, one is interested in extracting topological information from “images”: finding “edges” in pictures, finding “tracks” in events. For patterns which can be parameterized, such as curves, features can be identified using conformal mapping techniques.
9 The Hough Transform. Patented by Paul Hough in 1962 as a technique for detecting curves in binary image data. Determines whether edge-detected points are components of a specific type of parametric curve. Maps “image space” points into “parameter space” curves by incrementing elements of an accumulator whose array indices are the curve parameters.
10 The Hough Transform. Developed to detect straight lines using the slope-intercept form $y = mx + b$. Every point in the image gives rise to a line in the accumulator. Curve parameters are identified as array maxima: the location of a maximum gives the parameters, and its number of entries gives the number of points contributing.
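A minimal sketch of the slope-intercept accumulator, assuming a small set of trial slopes and a fixed intercept bin width (both choices, and the function name, are illustrative). Each image point $(x, y)$ votes along the line $b = y - mx$ in $(m, b)$ space:

```python
from collections import Counter

def hough_slope_intercept(points, slopes, b_step=0.5):
    """Each point (x, y) votes along the line b = y - m*x in (m, b) space."""
    acc = Counter()
    for x, y in points:
        for m in slopes:
            b_bin = round((y - m * x) / b_step)
            acc[(m, b_bin * b_step)] += 1
    return acc

points = [(x, 2 * x + 1) for x in range(6)]   # six points on y = 2x + 1
points += [(1, 7), (4, 0)]                     # two noise points
slopes = [k * 0.5 for k in range(-8, 9)]       # m from -4 to 4 in steps of 0.5

acc = hough_slope_intercept(points, slopes)
(m_best, b_best), votes = acc.most_common(1)[0]
print(m_best, b_best, votes)
```

For this toy input the fullest cell is $(m, b) = (2, 1)$ with 6 votes, one per point on the line. A vertical line has no finite slope, so it cannot be represented in this accumulator.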
11 The Hough Transform. Richard Duda and Peter Hart in 1972 introduced the $\rho$-$\theta$ parameterization. [Figure: the line $y = mx + b$ in the $x$-$y$ plane.]
12 The Hough Transform. The $\rho$-$\theta$ accumulator is incremented using values for the angle $\theta$ and radius $\rho$ that satisfy $\rho = x\cos\theta + y\sin\theta$. Sinusoidal curves are produced. Intersection of the curves indicates the likely location of lines in the image. The normal form is periodic, limiting the range of values for the angle and eliminating the difficulties encountered with large slopes.
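A minimal sketch of the normal-form accumulator, assuming 1-degree angle bins over $[0, \pi)$ and a radius bin width of 0.25 (both illustrative choices). The input is a vertical line, which the slope-intercept form cannot represent:

```python
import math
from collections import Counter

def hough_rho_theta(points, n_theta=180, rho_step=0.25):
    """Each point votes along the sinusoid rho = x cos(theta) + y sin(theta)."""
    acc = Counter()
    for x, y in points:
        for t in range(n_theta):                  # theta = pi * t / n_theta in [0, pi)
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            acc[(t, round(rho / rho_step))] += 1
    return acc

# Ten points on the vertical line x = 3, plus one noise point.
points = [(3, y) for y in range(10)] + [(7, 2)]

acc = hough_rho_theta(points)
(t_best, rho_bin), votes = acc.most_common(1)[0]
print(t_best, rho_bin * 0.25, votes)
```

The fullest cell sits at angle bin 0 ($\theta = 0$) and $\rho = 3$ with 10 votes, i.e. the line $x = 3$.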
13 Finding Straight Lines in Images. Start with a digitized image; apply the Hough transform; find the “edges”; extract the features.
14 Other Curves. The technique can be generalized to include arbitrary parametric curves. Finding charged tracks in a solenoidal field motivates the circle algorithm: $F(x,y,a,b) = (x-a)^2 + (y-b)^2 - r^2 = 0$. Simplify by fixing one of the parameters, e.g. the radius.
15 Charged Tracks in HEP. Want to find tracks that come from the origin. Construct the line connecting each measured point to the origin. The orthogonal bisector of this line passes through the circle’s center. Fill the accumulator with each point’s line: one-to-many mapping. Fill the accumulator with the intersection of the lines coming from two points: many-to-one mapping.
16 [Figure: points $p_1$ and $p_2$ in the $x$-$y$ plane.] The intersection gives the center of the circle, and hence $p_T$ and $\phi_0$.
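The many-to-one variant can be sketched as follows, assuming hits on a circle that passes through the origin. The hit coordinates, the bin size, and the function names are invented for the example. A center $(a, b)$ on the perpendicular bisector of the chord from the origin to a hit $(p_x, p_y)$ satisfies $a\,p_x + b\,p_y = (p_x^2 + p_y^2)/2$, so each pair of hits gives a 2×2 linear system for the center:

```python
import math
from collections import Counter
from itertools import combinations

def center_from_pair(p1, p2):
    """Intersect the perpendicular bisectors of origin-p1 and origin-p2.

    Returns None if the two hits are collinear with the origin.
    """
    (x1, y1), (x2, y2) = p1, p2
    det = x1 * y2 - x2 * y1
    if abs(det) < 1e-12:
        return None
    k1 = (x1 * x1 + y1 * y1) / 2.0
    k2 = (x2 * x2 + y2 * y2) / 2.0
    # Cramer's rule for a*x1 + b*y1 = k1, a*x2 + b*y2 = k2.
    return ((k1 * y2 - k2 * y1) / det, (x1 * k2 - x2 * k1) / det)

def find_circle(points, bin_size=0.5):
    """Every pair of hits votes for one circle center; return the fullest bin."""
    acc = Counter()
    for p1, p2 in combinations(points, 2):
        c = center_from_pair(p1, p2)
        if c is not None:
            acc[(round(c[0] / bin_size), round(c[1] / bin_size))] += 1
    (ia, ib), votes = acc.most_common(1)[0]
    a, b = ia * bin_size, ib * bin_size
    return a, b, math.hypot(a, b), votes   # radius = distance from center to origin

# Five hits on the circle with center (3, 4) and radius 5, plus two noise hits.
hits = [(3 + 5 * math.cos(t), 4 + 5 * math.sin(t))
        for t in (-1.8, -1.5, -1.2, -0.9, -0.6)]
hits += [(2.0, -3.0), (6.0, 1.0)]

a, b, r, votes = find_circle(hits)
print(a, b, r, votes)
```

The ten pairs formed from the five genuine hits all vote for the center (3, 4), while pairs involving a noise hit scatter over other bins. Pairing costs a number of operations quadratic in the number of hits, in exchange for a single accumulator entry per pair.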
17 The resolution with which one can determine the curve parameters using the Hough transform is determined by the accumulator size: a larger size gives better resolution but needs more resources. Use the HT for pattern recognition, then fit the points which contributed to the functional form. Use an Adaptive HT (AHT): use a coarse array to find regions of interest, then backmap the points to a finer-binned accumulator.
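The coarse-to-fine idea can be sketched in one dimension, assuming each hit from a track through the origin contributes a single slope value $y/x$. This is a simplification of a full accumulator, and the data and function names are invented. The fullest coarse bin is found first, and only the points that voted for it are re-binned more finely:

```python
from collections import defaultdict

def fullest_bin(values, lo, hi, n_bins):
    """Histogram values over [lo, hi); return (bin_lo, bin_hi, members) of the fullest bin."""
    width = (hi - lo) / n_bins
    members = defaultdict(list)
    for v in values:
        if lo <= v < hi:
            members[int((v - lo) / width)].append(v)
    k = max(members, key=lambda i: len(members[i]))
    return lo + k * width, lo + (k + 1) * width, members[k]

def adaptive_peak(values, lo, hi, n_bins=10, passes=2):
    """Zoom in on the fullest bin, re-binning only the points that voted for it."""
    for _ in range(passes):
        lo, hi, values = fullest_bin(values, lo, hi, n_bins)
    return lo, hi, values

# Slopes from one track (true slope about 0.3137) plus noise hits.
slopes = [0.311, 0.313, 0.314, 0.316, 0.312, -0.7, 0.05, 0.33, 0.62, -0.2, 0.38]
lo, hi, members = adaptive_peak(slopes, -1.0, 1.0)
print(f"slope window of width {hi - lo:.3f} containing {len(members)} hits")
```

Two passes of 10 bins use 20 cells to reach the bin width (0.02) that a single pass over the same range would need 100 cells for.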
18 HT Summary. Works very well for well-defined problems. Ideally suited to modern, digital devices. Global, “democratic” method: individual points “vote” independently. Very robust against noise and inefficiency. Can be generalized to find arbitrary parameterized curves. The AHT offers a solution to the trade-off between speed and resolution.
19 The Kalman Filter. In 1960 Rudolf Kalman published “A New Approach to Linear Filtering and Prediction Problems” in the ASME Journal of Basic Engineering. The best estimate for the state of a system and its covariance can be obtained recursively from the previous best estimate and its covariance matrix. Essential for real-time applications with noisy data, e.g. moon landing, stock market predictions, military targeting.
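20 Running Average. Discrete measurements $a_n$ of a constant $A$. Compare starting over with each new measurement via $\bar{A}_n = \frac{1}{n}\sum_{k=1}^{n} a_k$ with the recursive formula $\bar{A}_n = \bar{A}_{n-1} + \frac{1}{n}\left(a_n - \bar{A}_{n-1}\right)$.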
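The two ways of computing the average can be compared directly. This is a small sketch with invented measurements and function names. The recursive version needs only the previous estimate and the new measurement, which is the pattern the Kalman filter generalizes:

```python
def batch_mean(values):
    """Start over each time: sum all measurements and divide."""
    return sum(values) / len(values)

def recursive_mean(values):
    """Update the previous estimate: A_n = A_{n-1} + (a_n - A_{n-1}) / n."""
    estimate = 0.0
    history = []
    for n, a in enumerate(values, start=1):
        estimate += (a - estimate) / n
        history.append(estimate)
    return history

measurements = [1.0, 2.0, 3.0, 4.0]
history = recursive_mean(measurements)
print(history)                    # running estimates after each measurement
print(batch_mean(measurements))   # agrees with the last running estimate
```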
21 Filtering. Another class of pattern recognition involves systems for which an existing state is known and one wishes to add additional information. How does one reconcile new, perhaps noisy, information with an existing “best” estimate?
22 Dynamic System Description. A discrete dynamic system is characterized at each time $t_k$ by a state vector $x_k$, the evolution of which is characterized by a time-dependent transformation: $x_k = f_k(x_{k-1}) + w_k$, where $f_k$ is a deterministic function and $w_k$ is a random disturbance of the system (process noise).
23 Normally one only observes a function of the state vector, corrupted by some measurement noise: $m_k = h_k(x_k) + \epsilon_k$, where $m_k$ is the vector of observations at time $t_k$ and $\epsilon_k$ is the measurement noise.
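A toy simulation of such a system, assuming a scalar state, Gaussian process and measurement noise, and time-independent functions. The names `f` and `h` follow the role of the deterministic and observation functions; everything else is invented for the example:

```python
import random

def simulate(x0, f, h, process_sigma, meas_sigma, n_steps, seed=0):
    """Generate true states x_k = f(x_{k-1}) + w_k and observations m_k = h(x_k) + noise."""
    rng = random.Random(seed)
    x, states, measurements = x0, [], []
    for _ in range(n_steps):
        x = f(x) + rng.gauss(0.0, process_sigma)                  # evolution plus process noise
        states.append(x)
        measurements.append(h(x) + rng.gauss(0.0, meas_sigma))   # noisy observation
    return states, measurements

# Noise-free check: the state halves at each step and is observed with a factor 2.
states, meas = simulate(8.0, lambda x: 0.5 * x, lambda x: 2.0 * x, 0.0, 0.0, 3)
print(states, meas)

# Same system with noise switched on.
noisy_states, noisy_meas = simulate(8.0, lambda x: 0.5 * x, lambda x: 2.0 * x, 0.1, 0.5, 3)
print(noisy_states, noisy_meas)
```

Only the measurement list would be available to an observer; the filter's task is to estimate the hidden state list from it.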
25 Progressive Fitting. There are three basic operations in the analysis of a dynamic system. Filtering: estimation of the present state vector, based upon all the past measurements. Prediction: estimation of the state vector at a future time. Smoothing: improved estimation of the state vector at some time in the past, based upon all measurements taken up to the present time.
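26 Prediction. One assumes that at a given initial point the state vector parameters $x$ and their covariance matrix $C$ are known. The parameter vector and covariance matrix are then propagated to the position $i+1$ using the transformation of the dynamic system, with the process noise added to the covariance.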
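Written out in one common notation for a linear system, the propagation might read as follows. The symbols are assumptions rather than the slide's own: $F_i$ is the transformation matrix from position $i$ to $i+1$, $Q_i$ is the process-noise covariance, and the superscript $i$ marks a prediction that uses only the information available at position $i$.

```latex
\begin{aligned}
x_{i+1}^{\,i} &= F_i \, x_i \\
C_{i+1}^{\,i} &= F_i \, C_i \, F_i^{T} + Q_i
\end{aligned}
```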
27 Filtering. At position $i+1$ one has a measurement $m_{i+1}$, which can contain measurements of an arbitrary number of the state vector parameters. The question is how to reconcile this measurement with the existing prediction for the state vector at this position.
28 The Kalman Filter. The Kalman Filter is the optimum solution in the sense that it minimizes the mean square estimation error. If the system is linear and the noise is Gaussian, then the Kalman Filter is the optimal filter; no nonlinear filter can do better.
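29 Combine the noisy measurement with the prior estimate: the updated state is the prediction plus a correction proportional to the difference between the measurement and the predicted measurement, where the weight $K_{i+1}$ is the Kalman gain matrix.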
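In one common notation for a linear measurement model, the update might be written as below. Apart from $K_{i+1}$, $m_{i+1}$, $x$ and $C$, the symbols are assumptions: $x_{i+1}^{\,i}$ and $C_{i+1}^{\,i}$ are the predicted state and covariance at position $i+1$, $H$ is the matrix that maps the state onto the measured quantities, and $V$ is the measurement-noise covariance.

```latex
\begin{aligned}
K_{i+1} &= C_{i+1}^{\,i} H^{T} \left( H \, C_{i+1}^{\,i} H^{T} + V \right)^{-1} \\
x_{i+1} &= x_{i+1}^{\,i} + K_{i+1} \left( m_{i+1} - H \, x_{i+1}^{\,i} \right) \\
C_{i+1} &= \left( I - K_{i+1} H \right) C_{i+1}^{\,i}
\end{aligned}
```

A large predicted covariance drives the gain toward trusting the measurement, while a large $V$ drives it toward keeping the prediction.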
30 Kalman Filter Flow. Begin with a prior estimate and covariance matrix; compute the Kalman gain; update the estimate with the measurement; predict ahead.
31 Kalman Filter Advantages. Combines pattern recognition with parameter estimation. The number of computations increases only linearly with the number of detectors. Estimated parameters closely follow the real path. Matrix size is limited to the number of state parameters.
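32 Relationship to Least Squares Fitting. To solve an overdetermined matrix equation in the least-squares sense, we solve the normal equations, whose solution minimizes the sum of squared residuals.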
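33 Consider the generalized weighted sum of squared residuals, in which the residuals are weighted by the inverse of the measurement covariance. To minimize, take the derivative with respect to the state vector and set it to 0.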
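34 Consider the Kalman Filter solution. For no a priori knowledge about $x$, the prior covariance is taken to be arbitrarily large, so that its inverse vanishes. The Kalman gain then reduces to the weighted least-squares operator, and the estimate for the state vector is this gain applied to the measurements.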
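The comparison can be written out under assumed notation, none of which is taken from the slides: $m$ is the measurement vector, $H$ the matrix relating the constant state $x$ to the measurements, $V$ the measurement covariance, $C$ the prior covariance, and $x_0$ the prior estimate. Weighted least squares gives:

```latex
\begin{aligned}
\chi^2 &= (m - Hx)^{T} V^{-1} (m - Hx) \\
\frac{\partial \chi^2}{\partial x} &= -2 H^{T} V^{-1} (m - Hx) = 0
\;\;\Rightarrow\;\;
\hat{x} = \left( H^{T} V^{-1} H \right)^{-1} H^{T} V^{-1} m
\end{aligned}
```

The Kalman gain, written in its information form, and its limit for a vanishing inverse prior covariance are:

```latex
\begin{aligned}
K &= \left( C^{-1} + H^{T} V^{-1} H \right)^{-1} H^{T} V^{-1}
\;\xrightarrow{\;C^{-1} \to 0\;}\;
\left( H^{T} V^{-1} H \right)^{-1} H^{T} V^{-1} \\
\hat{x} &= x_0 + K \left( m - H x_0 \right) = K m
\quad \text{since } K H = I \text{ in this limit}
\end{aligned}
```

The prior estimate $x_0$ drops out, and the result coincides with the weighted least-squares solution.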
35 For a constant system state vector with an overdetermined system of linear equations and no a priori information, the Kalman filter reproduces the deterministic least squares result. In most cases, however, one does have prior knowledge, and the Kalman filter’s advantage is the convenient way in which it accounts for this prior knowledge via the initial conditions. It is basically a least squares best fit problem done sequentially rather than in batch mode.
36 Estimating a Constant. [Figures: Voltage versus Iteration (marker: +), three panels: measurement variance > true variance; measurement variance < true variance; measurement variance = true variance.]
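An experiment of this kind can be sketched with a scalar Kalman filter, assuming a constant state observed directly. The true voltage, noise level, process variance, prior, and the three assumed measurement variances below are hypothetical values, not the ones behind the slide's plots:

```python
import random

def estimate_constant(measurements, meas_var, x0=0.0, p0=1.0, process_var=1e-5):
    """Scalar Kalman filter for a constant state observed directly."""
    x, p = x0, p0
    estimates = []
    for m in measurements:
        p = p + process_var        # predict: state unchanged, covariance grows slightly
        k = p / (p + meas_var)     # Kalman gain
        x = x + k * (m - x)        # update estimate with the measurement
        p = (1.0 - k) * p          # update covariance
        estimates.append(x)
    return estimates

def roughness(estimates, start=20):
    """Sum of squared step-to-step changes after the initial transient."""
    tail = estimates[start:]
    return sum((b - a) ** 2 for a, b in zip(tail, tail[1:]))

TRUE_VOLTAGE, TRUE_SIGMA = -0.377, 0.1   # hypothetical values
rng = random.Random(42)
data = [rng.gauss(TRUE_VOLTAGE, TRUE_SIGMA) for _ in range(50)]

cases = [("too large", 1.0), ("true", TRUE_SIGMA ** 2), ("too small", 1e-4)]
results = {label: estimate_constant(data, r) for label, r in cases}
for label, est in results.items():
    print(f"assumed variance {label:9s}: final {est[-1]:+.3f}, roughness {roughness(est):.5f}")
```

Assuming too small a measurement variance makes the filter trust each measurement too much, so the estimate keeps jumping with the noise. Assuming too large a variance makes it respond slowly to the data.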
37 Track Fitting. In the ’80s, Billoir and Frühwirth adapted the KF to track finding and fitting in HEP. Combined pattern recognition with parameter fitting. Use the track state prediction to discriminate between multiple hits in detector elements. The dynamic system accommodates physics: multiple scattering, energy loss, magnetic field stepping.
38 Fitting a Straight Line in 2D. State vector: position $y$ and slope $a$. Track model: straight line.
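39 The next position is simply the old plus the slope times the interval: $y_{i+1} = y_i + a_i\,\Delta x$. The slope remains the same: $a_{i+1} = a_i$. Therefore the transformation matrix is: $\begin{pmatrix} 1 & \Delta x \\ 0 & 1 \end{pmatrix}$.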
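40 Ansatz for the initial state: take the first measured position with its measurement error, and a slope $a_1$ with variance $M$, with $a_1$ arbitrary and $M \gg 1$. Then predict the next state.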
42 The estimated position equals the measured position at the next surface, since we took the error on the predicted slope to be very large, i.e. we did not trust the prediction; the optimal solution is to use the measurement. The estimated slope is $\Delta y/\Delta x$, as we would expect for no a priori knowledge. The initial guess for the slope does not appear in the final result, since we had assigned the prediction a large uncertainty. We now have a good estimate for the slope and its uncertainty and will now iterate.
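The iteration can be sketched as below, assuming a state of position $y$ and slope $a$, a measurement of $y$ only with variance `meas_var` at each surface, no process noise, and an initial slope variance `big` standing in for $M \gg 1$. The function names and the data are invented. A batch least-squares fit is included for comparison:

```python
def fit_line(xs, ms, meas_var=1.0, a1=0.0, big=1e6):
    """Sequential (Kalman) straight-line fit; returns (y, a) at the last surface."""
    y, a = ms[0], a1                          # initial state: first measurement, arbitrary slope
    c_yy, c_ya, c_aa = meas_var, 0.0, big     # initial covariance: slope essentially unknown
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        # Predict with F = [[1, dx], [0, 1]]: y <- y + a*dx, C <- F C F^T.
        y = y + a * dx
        c_yy = c_yy + 2.0 * dx * c_ya + dx * dx * c_aa
        c_ya = c_ya + dx * c_aa
        # Filter with H = [1, 0]: only the position is measured.
        s = c_yy + meas_var
        k_y, k_a = c_yy / s, c_ya / s
        r = ms[i] - y
        y, a = y + k_y * r, a + k_a * r
        c_yy, c_ya, c_aa = (1 - k_y) * c_yy, (1 - k_y) * c_ya, c_aa - k_a * c_ya
    return y, a

def least_squares_line(xs, ms):
    """Batch unweighted least-squares fit; returns (intercept, slope)."""
    n = len(xs)
    mx, mm = sum(xs) / n, sum(ms) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (m - mm) for x, m in zip(xs, ms))
    slope = sxy / sxx
    return mm - slope * mx, slope

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ms = [0.1, 0.9, 2.2, 2.9, 4.1]
y_kf, a_kf = fit_line(xs, ms)
b_ls, a_ls = least_squares_line(xs, ms)
print(y_kf, a_kf)                    # sequential result at the last surface
print(b_ls + a_ls * xs[-1], a_ls)    # batch fit evaluated at the same surface
```

With only two points the filtered slope is $\Delta y/\Delta x$ up to terms of order $1/M$, and the arbitrary initial slope drops out. With all five points the sequential and batch results agree to the same order.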
44 Kalman Filter Summary. The Kalman Filter provides an elegant formalism for reconciling measurements with an existing hypothesis. Its progressive, or iterative, nature allows the algorithm to be cleanly implemented in software. The “Extended” KF relaxes the restriction to linear systems by linearizing about the current estimate.
45 Summary. I have only barely touched the surface in presenting these three techniques here this evening. There exists a broad spectrum of pattern recognition techniques, but these are fairly representative of the most-used ones. Go out and implement them!