Presentation on theme: "Pattern Recognition Random Thoughts N.A. Graf UTeV March 2, 2000."— Presentation transcript:
Pattern Recognition Random Thoughts N.A. Graf UTeV March 2, 2000
Pattern Recognition There are many kinds of patterns. –Visual, auditory, temporal, logical, … Using a broad enough interpretation, we can find pattern recognition in every intelligent activity. No single theory of pattern recognition can possibly cope with such a broad range of problems.
Overview Restrict our attention to the following 3 classes of pattern recognition techniques: Template Matching –Global, fixed patterns Hough Transform –Global, parameterized patterns Kalman Filter –Local, dynamic state following
Suppose that we are working with visual patterns, and we know that the patterns of interest represent the 26 letters of the Roman alphabet. Then we can say that the pattern recognition problem is one of assigning the input to one of 26 classes. In general, we will limit ourselves to the problem of deciding if the input belongs to Class 1 or Class 2 or... or Class c.
An obvious approach is to compare the input with a standard pattern for each class, and to choose the class that matches best. The obvious problem with this approach is that it doesn't say what to compare or how to measure the degree of match.
Template Matching Once digitized, one can compare images bit by bit with a matching template to classify. Works very well in specific cases, but not in general (fonts, shearing, rotation, etc.)
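As a minimal sketch of bit-by-bit comparison, the Python below counts mismatched pixels between a digitized input and each stored template and picks the closest class. The 3x3 templates and the names `TEMPLATES`, `mismatch` and `classify` are hypothetical illustrations, not part of the original talk.

```python
# Hypothetical 3x3 binary templates; real ones would come from digitized patterns.
TEMPLATES = {
    "I": ["010",
          "010",
          "010"],
    "L": ["100",
          "100",
          "111"],
    "T": ["111",
          "010",
          "010"],
}

def mismatch(image, template):
    """Count the pixels where image and template differ (Hamming distance)."""
    return sum(
        p != q
        for img_row, tpl_row in zip(image, template)
        for p, q in zip(img_row, tpl_row)
    )

def classify(image, templates=TEMPLATES):
    """Return the label of the template with the fewest mismatched pixels."""
    return min(templates, key=lambda label: mismatch(image, templates[label]))

# An "L" with one pixel flipped still matches the "L" template best.
noisy_L = ["100",
           "100",
           "110"]
print(classify(noisy_L))
```

A shifted or rotated input would defeat this comparison, which is the limitation the slide points out for fonts, shearing and rotation.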
Template Matching in HEP In high energy physics experiments, the detectors are fixed, so template matching is a good solution for fast characterization of events. Commonly used to trigger on charged particle tracks. Use MC to build up a library of most probable patterns.
Parametric Feature Extraction Often, one is interested in extracting topological information from images. Finding edges in pictures. Finding tracks in events. For patterns which can be parameterized, such as curves, features can be identified using conformal mapping techniques.
The Hough Transform Patented by Paul Hough in 1962 as a technique for detecting curves in binary image data. Determines whether edge-detected points are components of a specific type of parametric curve. Maps image space points into parameter space curves by incrementing elements of an accumulator whose array indices are the curve parameters.
The Hough Transform Developed to detect straight lines using the slope-intercept form y=mx+b. Every point in the image gives rise to a line in the accumulator. Curve parameters are identified as array maxima –location gives the parameters –number of entries gives the number of points contributing
The Hough Transform Richard Duda and Peter Hart in 1972 introduced the ρ-θ parameterization. [Figure: the line y=mx+b drawn in the x-y plane.]
The Hough Transform The ρ-θ accumulator is incremented using values for the angle θ and radius ρ that satisfy ρ = x cos θ + y sin θ. Sinusoidal curves are produced. Intersection of the curves indicates likely location of lines in the image. Normal form is periodic, limiting the range of values for the angle and eliminating the difficulties encountered with large slopes.
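The voting procedure in normal form can be sketched as follows. This is an illustration under assumed choices (180 angle bins, unit radius bins, a `Counter` as a sparse accumulator); the function names are hypothetical and the edge points are invented for the example.

```python
import math
from collections import Counter

def hough_lines(points, n_theta=180, rho_step=1.0):
    """Fill a (theta_bin, rho_bin) accumulator; each point votes once per theta bin."""
    accumulator = Counter()
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta            # theta covers [0, pi)
            rho = x * math.cos(theta) + y * math.sin(theta)
            accumulator[(t, round(rho / rho_step))] += 1
    return accumulator

def strongest_line(accumulator, n_theta=180, rho_step=1.0):
    """Return (theta in degrees, rho, votes) for the fullest accumulator bin."""
    (t, r), votes = accumulator.most_common(1)[0]
    return 180.0 * t / n_theta, r * rho_step, votes

# Ten edge points on the horizontal line y = 5, plus two unrelated noise points.
points = [(10 * i, 5) for i in range(10)] + [(3, 17), (42, 60)]
theta_deg, rho, votes = strongest_line(hough_lines(points))
print(theta_deg, rho, votes)
```

Each point traces a sinusoid through the accumulator, and the ten collinear points meet in the single bin whose angle and radius describe their common line. The bin location gives the parameters and the bin content gives the number of contributing points.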
Finding Straight Lines in Images Start with a digitized image. Find the edges. Apply the Hough transform. Extract the features.
Other Curves The technique can be generalized to include arbitrary parametric curves. Finding charged tracks in a solenoidal field motivates the circle algorithm. $F(x,y,a,b) = (x-a)^2 + (y-b)^2 - r^2 = 0$. Simplify by fixing one of the parameters, e.g. radius.
Charged Tracks in HEP Want to find tracks that come from the origin. Construct the line connecting each measured point to the origin. The orthogonal bisector of this line passes through the circle's center. Fill accumulator with each point's bisector line –one-to-many mapping Fill accumulator with the intersection of the bisector lines coming from two points –many-to-one mapping
[Figure: two measured points p1 and p2 in the x-y plane with their bisector lines.] Intersection gives the center of the circle, and hence $p_T$ and $\phi_0$.
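The many-to-one mapping can be sketched in a few lines: every pair of hits yields one candidate circle center, and the centers are histogrammed. This is an illustrative sketch only; the function names, the bin size and the hit coordinates are assumptions made for the example. Converting the radius to transverse momentum additionally needs the magnetic field, which is not modelled here.

```python
import math
from collections import Counter
from itertools import combinations

def circle_center_through_origin(p1, p2):
    """Intersect the perpendicular bisectors of origin-p1 and origin-p2.

    A center c = (a, b) on the bisector of the segment from the origin to p
    satisfies c . p = |p|^2 / 2, so two points give a 2x2 linear system.
    """
    (x1, y1), (x2, y2) = p1, p2
    det = x1 * y2 - x2 * y1
    if abs(det) < 1e-12:
        raise ValueError("points are collinear with the origin")
    h1 = 0.5 * (x1 * x1 + y1 * y1)
    h2 = 0.5 * (x2 * x2 + y2 * y2)
    return (h1 * y2 - h2 * y1) / det, (x1 * h2 - x2 * h1) / det

def vote_for_centers(points, bin_size=0.5):
    """Histogram the center from every pair of points; return the fullest bin."""
    accumulator = Counter()
    for p1, p2 in combinations(points, 2):
        try:
            a, b = circle_center_through_origin(p1, p2)
        except ValueError:
            continue                      # this pair carries no curvature information
        accumulator[(round(a / bin_size), round(b / bin_size))] += 1
    (ia, ib), votes = accumulator.most_common(1)[0]
    return ia * bin_size, ib * bin_size, votes

# Five hits on a circle of radius 5 centered at (0, 5), plus one noise hit.
hits = [(5, 5), (0, 10), (-5, 5), (3, 9), (4, 8), (7, 1)]
a, b, votes = vote_for_centers(hits)
radius = math.hypot(a, b)                 # the circle passes through the origin
print(a, b, votes, radius)
```

The five true hits form ten pairs that all vote for the same center, while pairs involving the noise hit scatter across other bins.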
Resolution The resolution with which one can determine the curve parameters using the Hough transform is determined by the accumulator size: a larger size gives better resolution but needs more resources. Use HT for pattern recognition, then fit the points which contributed to the functional form. Use Adaptive HT –Use coarse array to find regions of interest –Backmap points to finer-binned accumulator
HT Summary Works very well for well-defined problems. Ideally suited to modern, digital devices. Global, democratic method –individual points vote independently Very robust against noise and inefficiency. Can be generalized to find arbitrary parameterized curves. AHT offers solution to trade-off between speed and resolution
The Kalman Filter In 1960 Rudolf Kalman published "A New Approach to Linear Filtering and Prediction Problems" in the ASME Journal of Basic Engineering. The best estimate for the state of a system and its covariance can be obtained recursively from the previous best estimate and its covariance matrix. Essential for real-time applications with noisy data, e.g. moon landing, stock market predictions, military targeting.
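Running Average Discrete measurements $a_n$ of a constant A. Compare starting over with each new measurement via $\bar{A}_n = \frac{1}{n}\sum_{i=1}^{n} a_i$ with the recursive formula $\bar{A}_n = \bar{A}_{n-1} + \frac{1}{n}\,(a_n - \bar{A}_{n-1})$.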
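The comparison between recomputing the average and updating it recursively can be sketched as below. The function names and the sample values are hypothetical; the recursive update uses only the previous estimate and the newest measurement.

```python
def batch_mean(samples):
    """Start over each time: sum every measurement seen so far."""
    return sum(samples) / len(samples)

def running_mean(samples):
    """Recursive form: correct the previous mean by a fraction 1/n of the residual."""
    mean = 0.0
    history = []
    for n, a_n in enumerate(samples, start=1):
        mean += (a_n - mean) / n
        history.append(mean)
    return history

measurements = [9.8, 10.3, 10.1, 9.9, 10.4]
print(running_mean(measurements)[-1], batch_mean(measurements))
```

The factor 1/n plays the role of a gain: it weights the difference between the new measurement and the current estimate, which is the same structure the Kalman filter update has.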
Filtering Another class of pattern recognition involves systems for which an existing state is known and one wishes to add additional information. How does one reconcile new, perhaps noisy, information with an existing best estimate?
Dynamic System Description A discrete dynamic system is characterized at each time $t_k$ by a state vector $x_k$, the evolution of which is characterized by a time dependent transformation: $x_{k+1} = f_k(x_k) + w_k$, with $f_k$ a deterministic function and $w_k$ a random disturbance of the system (process noise).
Normally one only observes a function of the state vector, corrupted by some measurement noise: $m_k = h_k(x_k) + \epsilon_k$, with $m_k$ the vector of observations at time $t_k$ and $\epsilon_k$ the measurement noise.
The simplest case has both f and h linear: $x_{k+1} = F_k x_k + w_k$ and $m_k = H_k x_k + \epsilon_k$.
Progressive Fitting There are three basic operations in the analysis of a dynamic system: Filtering: –estimation of the present state vector, based upon all the past measurements Prediction: –estimation of the state vector at a future time Smoothing: –improved estimation of the state vector at some time in the past, based upon all measurements taken up to the present time.
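Prediction One assumes that at a given initial point the state vector parameters x and their covariance matrix C are known. Parameter vector and covariance matrix are propagated to the position i+1 via: $x_{i+1}^{i} = F_i x_i$ and $C_{i+1}^{i} = F_i C_i F_i^T + Q_i$, where the superscript marks a prediction based on the measurements up to position i and $Q_i$ is the covariance matrix of the process noise.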
Filtering At position i+1 one has a measurement $m_{i+1}$ which can contain measurements of an arbitrary number of the state vector parameters. The question is how to reconcile this measurement with the existing prediction for the state vector at this position.
The Kalman Filter For a linear system, the Kalman Filter is the optimum linear filter in the sense that it minimizes the mean square estimation error. If, in addition, the noise is Gaussian, then the Kalman Filter is the optimal filter overall; no nonlinear filter can do better.
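Combine the noisy measurement with the prior estimate: $x_{i+1} = x_{i+1}^{i} + K_{i+1}\,(m_{i+1} - H_{i+1} x_{i+1}^{i})$, where $K_{i+1}$ is the Kalman Gain matrix: $K_{i+1} = C_{i+1}^{i} H_{i+1}^T \,(V_{i+1} + H_{i+1} C_{i+1}^{i} H_{i+1}^T)^{-1}$. Here $x_{i+1}^{i}$ and $C_{i+1}^{i}$ are the predicted state and its covariance, and $V_{i+1}$ is the covariance matrix of the measurement noise.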
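One predict-and-update cycle can be sketched in plain Python as follows. To keep the example free of a matrix-inversion routine it assumes a single scalar measurement per step, so the measurement matrix is one row `h` and the measurement covariance is one number `v`. The helper names and these symbols are choices made for this sketch, not notation taken from the slides.

```python
def mat_vec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def predict(x, C, F, Q):
    """Propagate the state and covariance: x' = F x, C' = F C F^T + Q."""
    x_pred = mat_vec(F, x)
    FCFt = mat_mul(mat_mul(F, C), transpose(F))
    C_pred = [[f + q for f, q in zip(f_row, q_row)] for f_row, q_row in zip(FCFt, Q)]
    return x_pred, C_pred

def update(x_pred, C_pred, m, h, v):
    """Fold in one scalar measurement m = h . x + noise, with noise variance v."""
    Ch = mat_vec(C_pred, h)                          # C H^T (C is symmetric)
    s = v + sum(hi * ci for hi, ci in zip(h, Ch))    # variance of the residual
    K = [ci / s for ci in Ch]                        # Kalman gain
    residual = m - sum(hi * xi for hi, xi in zip(h, x_pred))
    x = [xi + ki * residual for xi, ki in zip(x_pred, K)]
    n = len(h)
    C = [[C_pred[i][j] - K[i] * Ch[j] for j in range(n)] for i in range(n)]  # (I - K H) C
    return x, C

# Example: position-and-slope state, unit step, measuring the position only.
F = [[1.0, 1.0], [0.0, 1.0]]
Q = [[0.0, 0.0], [0.0, 0.0]]
x_pred, C_pred = predict([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], F, Q)
x_new, C_new = update(x_pred, C_pred, m=4.0, h=[1.0, 0.0], v=2.0)
print(x_new, C_new)
```

Because only the residual variance `s` has to be inverted, the cost per step depends on the number of state parameters and not on how many measurements have already been processed.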
Kalman Filter Flow Begin with a prior estimate and covariance matrix. Predict ahead. Compute the Kalman Gain. Update estimate with measurement. Repeat from the predict step.
Kalman Filter Advantages Combines pattern recognition with parameter estimation Number of computations increases only linearly with the number of detectors. Estimated parameters closely follow real path. Matrix size limited to number of state parameters
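Relationship to Least Squares Fitting To solve the overdetermined matrix equation $Hx = m$, we solve the normal equations $H^T H \hat{x} = H^T m$, where $\hat{x}$ minimizes $(m - Hx)^T (m - Hx)$.
Consider the generalized weighted sum of squared residuals, with the inverse of the measurement covariance matrix $V$ as weight: $\chi^2 = (m - Hx)^T V^{-1} (m - Hx)$. To minimize, take the derivative with respect to $x$ and set it to 0: $-2 H^T V^{-1} (m - H\hat{x}) = 0$, so $\hat{x} = (H^T V^{-1} H)^{-1} H^T V^{-1} m$.
Consider the Kalman Filter solution for a prior estimate $x_0$ with covariance $C_0$: $\hat{x} = x_0 + K\,(m - H x_0)$, with $K = C H^T V^{-1}$ and $C^{-1} = C_0^{-1} + H^T V^{-1} H$. For no a priori knowledge about x: $C_0^{-1} = 0$. Giving the Kalman Gain $K = (H^T V^{-1} H)^{-1} H^T V^{-1}$. The estimate for the state vector is: $\hat{x} = x_0 + K m - K H x_0 = (H^T V^{-1} H)^{-1} H^T V^{-1} m$, since $KH = I$.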
For a constant system state vector with an overdetermined system of linear equations and no a priori information, the Kalman filter reproduces the deterministic least squares result. In most cases, however, one does have prior knowledge, and the Kalman filter's advantage is the convenient way in which it accounts for this prior knowledge via the initial conditions. The Kalman filter is basically a least squares best fit done sequentially rather than in batch mode.
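The equivalence can be checked numerically with a small sketch: a constant two-parameter state (intercept and slope) is updated one point at a time from a very vague prior and compared with the closed-form batch fit. The data values, the function names and the prior variance of 1e6 are assumptions chosen for illustration; a finite prior variance stands in for "no a priori information", so agreement is only approximate.

```python
def kalman_line_fit(xs, ys, variance=1.0, prior_variance=1e6):
    """Sequentially update a constant state (intercept, slope), one point at a time."""
    state = [0.0, 0.0]                                   # arbitrary starting guess
    C = [[prior_variance, 0.0], [0.0, prior_variance]]   # very vague prior
    for x, y in zip(xs, ys):
        h = [1.0, x]                                     # model: y = intercept + slope * x
        Ch = [C[0][0] * h[0] + C[0][1] * h[1],
              C[1][0] * h[0] + C[1][1] * h[1]]
        s = variance + h[0] * Ch[0] + h[1] * Ch[1]       # variance of the residual
        K = [Ch[0] / s, Ch[1] / s]                       # Kalman gain
        residual = y - (h[0] * state[0] + h[1] * state[1])
        state = [state[0] + K[0] * residual, state[1] + K[1] * residual]
        C = [[C[i][j] - K[i] * Ch[j] for j in range(2)] for i in range(2)]
    return state

def batch_line_fit(xs, ys):
    """Closed-form unweighted least squares for a straight line."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return [(sy - slope * sx) / n, slope]

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]
kf = kalman_line_fit(xs, ys)
ls = batch_line_fit(xs, ys)
print(kf, ls)
```

Replacing the vague prior with a real one, for example a slope known from an earlier detector layer, needs no change to the loop; only the initial `state` and `C` differ.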
Estimating a Constant [Figure: three plots of estimated voltage versus iteration, with panel titles "Measurement variance = true variance", "Measurement variance > true variance" and "Measurement variance < true variance".]
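The behaviour shown in the three panels can be explored with a scalar filter. The sketch below is an assumption-laden illustration: the true voltage, the noise level, the small process variance `q`, the starting values and the random seed are all invented for the example, and `r` is the measurement variance the filter assumes.

```python
import random

TRUE_VOLTAGE = -0.377      # hypothetical constant to be estimated
NOISE_SIGMA = 0.1          # true measurement noise (standard deviation)

def estimate_constant(measurements, r, q=1e-5, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a constant; r is the assumed measurement variance."""
    x, p = x0, p0
    estimates, gains = [], []
    for z in measurements:
        p = p + q                  # predict: the state is constant, its variance grows by q
        k = p / (p + r)            # Kalman gain
        x = x + k * (z - x)        # update the estimate with the measurement
        p = (1.0 - k) * p          # update the variance
        estimates.append(x)
        gains.append(k)
    return estimates, gains

rng = random.Random(0)
measurements = [rng.gauss(TRUE_VOLTAGE, NOISE_SIGMA) for _ in range(50)]
# Assumed variance below, equal to, and above the true variance of 0.01.
results = {r: estimate_constant(measurements, r) for r in (1e-4, 1e-2, 1.0)}
for r, (estimates, gains) in results.items():
    print(f"assumed variance {r:g}: final estimate {estimates[-1]:.3f}, final gain {gains[-1]:.3f}")
```

An assumed variance that is too small keeps the gain high, so the estimate chases each noisy measurement; one that is too large keeps the gain low, so the estimate is smooth but slow to respond.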
Track Fitting In the 80s, Billoir and Frühwirth adapted the KF to track finding and fitting in HEP. Combined pattern recognition with parameter fitting. –Use track state prediction to discriminate between multiple hits in detector elements. Dynamic system accommodates physics: –multiple scattering –energy loss –magnetic field stepping
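Fitting a Straight Line in 2D State Vector: $(y, a)^T$, the position $y$ and the slope $a$ of the line. Track Model: Straight Line
The next position is simply the old plus the slope times the interval: $y_{i+1} = y_i + a_i \Delta x$. The slope remains the same: $a_{i+1} = a_i$. Therefore the transformation matrix is: $F = \begin{pmatrix} 1 & \Delta x \\ 0 & 1 \end{pmatrix}$.
Ansatz for the initial state: $(y_1, a_1)^T$ with covariance $C_1 = \begin{pmatrix} \sigma^2 & 0 \\ 0 & M \end{pmatrix}$, where $y_1$ is the first measured position, $\sigma^2$ its variance, and with $a_1$ arbitrary and $M \gg 1$. Predict next state: $(y_1 + a_1 \Delta x,\ a_1)^T$ with covariance $F C_1 F^T$.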
After the update, the estimated position equals the measured position at the next surface, since we took the error on the predicted slope to be very large, i.e. we did not trust the prediction; the optimal solution is to use the measurement. The estimated slope is Δy/Δx, as we would expect for no a priori knowledge. The initial guess for the slope does not appear in the final result, since we had assigned the prediction a large uncertainty. We now have a good estimate for the slope and its uncertainty and will now iterate.
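These statements can be verified with a short numerical sketch of the first predict-and-update step for the straight-line model. The function name `first_step`, the numbers, and the choice of a first-hit variance `sigma2` with a very large slope variance `M` are assumptions made for this illustration.

```python
def first_step(y1, a1, m2, dx, sigma2=0.01, M=1e8):
    """One predict + update step for the state (position, slope)."""
    # Initial state: measured position with variance sigma2, slope essentially unknown.
    x = [y1, a1]
    C = [[sigma2, 0.0], [0.0, M]]
    # Predict with F = [[1, dx], [0, 1]] and no process noise: C_pred = F C F^T.
    x_pred = [x[0] + dx * x[1], x[1]]
    C_pred = [
        [C[0][0] + 2.0 * dx * C[0][1] + dx * dx * C[1][1], C[0][1] + dx * C[1][1]],
        [C[1][0] + dx * C[1][1], C[1][1]],
    ]
    # Update with a position measurement m2 of variance sigma2 (measurement row (1, 0)).
    Ch = [C_pred[0][0], C_pred[1][0]]
    s = sigma2 + Ch[0]
    K = [Ch[0] / s, Ch[1] / s]
    residual = m2 - x_pred[0]
    x_new = [x_pred[0] + K[0] * residual, x_pred[1] + K[1] * residual]
    C_new = [[C_pred[i][j] - K[i] * Ch[j] for j in range(2)] for i in range(2)]
    return x_new, C_new

# Two hits: y = 1.0 at the first surface and y = 2.0 at a surface 2.0 further along.
for guess in (7.0, -3.0):                  # two very different initial slope guesses
    state, cov = first_step(y1=1.0, a1=guess, m2=2.0, dx=2.0)
    print(guess, state, cov[1][1])
```

Both initial guesses lead to a position of about 2.0 and a slope of about 0.5, which is Δy/Δx, and the slope variance comes out close to 2σ²/Δx², the value expected from two independent position measurements.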
Kalman Filter Summary The Kalman Filter provides an elegant formalism for reconciling measurements with an existing hypothesis. Its progressive, or iterative, nature allows the algorithm to be cleanly implemented in software. The Extended KF relaxes the restriction to linear systems by linearizing f and h about the current estimate.
Summary I have only barely touched the surface in presenting these three techniques here this evening. There exists a broad spectrum of pattern recognition techniques, but these are fairly representative of the most-used ones. Go out and implement them!