Opinionated Lessons in Statistics, by Bill Press. Lesson #46: Interpolation on Scattered Data. Professor William H. Press, Department of Computer Science, The University of Texas at Austin.

Interpolation on Scattered Data in Multidimensions

In dimensions > 2, the explosion of volume makes things difficult. There is rarely enough data for any kind of mesh, and there are lots of near-ties for the nearest-neighbor points (none of them very near). The problem is more like a machine learning problem: given a training set x_i with “responses” y_i, i = 1…N, predict y(x) for some new x. Example: in a symmetrical multivariate normal distribution in large dimension D, everything is at almost the same distance from everything else.
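The expression quantifying that last claim was lost in the transcript. The standard result (not necessarily the exact form shown on the slide): if x and y are drawn independently from N(0, I_D), then

    |x - y|^2 \sim 2\,\chi^2_D, \qquad \mathrm{E}\,|x-y|^2 = 2D, \qquad \mathrm{sd}\,|x-y|^2 = 2\sqrt{2D},

so the fractional spread of the distance |x - y| is about 1/\sqrt{2D}, which shrinks as D grows: all pairwise distances crowd toward the same value.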

Let's try some different methods on 500 data points in 4 dimensions. 500^(1/4) ≈ 4.7, so the data is sparse, but not ridiculously so. The test function is an exponential, off-center in the unit cube, and tapered to zero at the cube's edges:

testfun = @(x) exp(-2.0*norm(x - [.3 .3 .3 .3])) ...   % center vector (and a possible leading constant) lost in the transcript; [.3 .3 .3 .3] is a guess
    *x(1)*(1-x(1))*x(2)*(1-x(2))*x(3)*(1-x(3))*x(4)*(1-x(4));

[x1 x2] = meshgrid(0:.01:1, 0:.01:1);
z = arrayfun(@(s1,s2) testfun([s1 s2 .3 .3]), x1, x2); contour(z)   % contour a 2-D slice through the 4-D function
z = arrayfun(@(s1,s2) testfun([s1 s2 .7 .7]), x1, x2); contour(z)   % a second slice

Generate training and testing sets of data. The points are chosen randomly in the unit cube.

npts = 500;
pts = cell(npts,1);
for j=1:npts, pts{j} = rand(1,4); end;
vals = cellfun(testfun,pts);
hist(vals,50)

tpts = cell(npts,1);
for j=1:npts, tpts{j} = rand(1,4); end;
tvals = cellfun(testfun,tpts);
hist(tvals,50)

Most data points land where the function value is small, but a few find the peak of the Gaussian. If you have only one sample of real data, you can test by leave-one-out, but that is a lot more expensive, since you have to repeat the whole interpolation, including the one-time work, each time.
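The slides describe leave-one-out only in words; here is a minimal sketch of what it means in this setting (the loop is mine, not from the slides), using the Shepard routine shepinterp defined below. For Shepard there is no one-time work, so the loop is cheap; for RBF interpolation you would have to re-solve the N x N system on every pass, which is exactly the expense the slide warns about.

loovals = zeros(npts,1);
for k = 1:npts
    keep = [1:k-1, k+1:npts];                        % every point except the k-th
    loovals(k) = shepinterp(pts{k}, 6, vals(keep), pts(keep));
end
hist(loovals - vals, 50)                             % leave-one-out residuals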

Shepard Interpolation

The prediction is a weighted average of all the observed values, giving (much?) larger weights to those that are closest to the point of interest. It's a smoother version of “value of nearest neighbor” or “mean of a few nearest neighbors”. The power-law form has the advantage of being scale-free, so you don't have to know a scale in the problem. In D dimensions you'd better choose p ≥ D+1, otherwise you're dominated by distant, not close, points: volume ~ no. of points ~ r^D. Shepard interpolation is relatively fast, O(N) per interpolation. The problem is that it's usually not very accurate.
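The formula itself did not survive the transcript; the standard power-law Shepard interpolant being described is

    \hat y(x) = \frac{\sum_{i=1}^{N} w_i(x)\, y_i}{\sum_{i=1}^{N} w_i(x)}, \qquad w_i(x) = |x - x_i|^{-p},

which is exactly what the shepinterp code below computes (with a tiny constant added to each distance so that a query exactly on a data point doesn't divide by zero).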

Shepard performance on our training/testing set (note the value of p, and note the biases in the scatter plot):

function val = shepinterp(x,p,vals,pts)
phi = cellfun(@(y) (norm(x-y)+1.e-40).^(-p), pts);   % power-law weights; 1.e-40 guards against a zero distance
val = (vals' * phi)./sum(phi);

shepvals = cellfun(@(x) shepinterp(x,6,vals,pts), tpts);
plot(tvals,shepvals,'.')
hist(shepvals-tvals,50)

Radial Basis Function Interpolation

This looks superficially like Shepard, but it is typically much more accurate. However, it is also much more expensive: O(N^3) one-time work + O(N) per interpolation. Like Shepard, the interpolator is a linear combination of identical kernels, centered on the known points. But now we solve N linear equations to get the weights, by requiring the interpolator to go exactly through the data; in matrix form, Φ w = y. There is now no requirement that the kernel φ(r) fall off rapidly, or at all, with r.
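The displayed equations were lost in the transcript; the standard RBF form being described (and implemented in the code below) is

    \hat y(x) = \sum_{i=1}^{N} w_i\, \phi(|x - x_i|), \qquad \sum_{i=1}^{N} \phi(|x_j - x_i|)\, w_i = y_j, \quad j = 1,\dots,N,

i.e. the weights w solve the N x N linear system Φ w = y with Φ_{ji} = φ(|x_j - x_i|).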

Commonly used Radial Basis Functions (RBFs):

“multiquadric”: φ(r) = sqrt(r^2 + r0^2) (you have to pick a scale factor r0)
“inverse multiquadric”: φ(r) = 1/sqrt(r^2 + r0^2)
“thin plate spline”: φ(r) = r^2 log(r/r0)
“Gaussian”: φ(r) = exp(-r^2 / (2 r0^2)), typically very sensitive to the choice of r0, and therefore less often used. (Remember the problems we had getting Gaussians to fit outliers!)

The choice of scale factor is a trade-off between over- and under-smoothing. (Bigger r0 gives more smoothing.) The optimal r0 is usually on the order of the typical nearest-neighbor distances.
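As a quick sketch (mine, not from the slides), the four kernels can be written as MATLAB anonymous functions of the scalar distance r; the code below instead folds the norm into φ itself, which amounts to the same thing:

r0 = 0.1;                                        % scale factor, to be tuned as on the following slides
multiquadric    = @(r) sqrt(r.^2 + r0^2);
invmultiquadric = @(r) 1./sqrt(r.^2 + r0^2);
thinplate       = @(r) r.^2 .* log(max(r,realmin)/r0);   % max(...) avoids log(0) at r = 0
gauss           = @(r) exp(-0.5*r.^2/r0^2);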

Let's try a multiquadric with r0 = 0.1:

r0 = 0.1;
phi = @(x) sqrt(norm(x)^2 + r0^2);
phimat = zeros(npts,npts);
for i=1:npts, for j=1:npts, phimat(i,j) = phi(pts{i}-pts{j}); end; end;
wgts = phimat \ vals;                            % "\" is Matlab's "solve linear equations" operator!
valinterp = @(x) wgts' * cellfun(@(y) phi(norm(x-y)), pts);
ivals = cellfun(valinterp,tpts);
hist(ivals-tvals,50)
stdev = std(ivals-tvals)

(The slide shows the resulting error histogram and the numerical stdev.)

(Error histograms for r0 = 0.2, r0 = 0.6, and r0 = 1.0, each labeled with its standard deviation σ.) So r0 ~ 0.6 is the optimal choice, and the result is not too sensitive to it.

Try an inverse multiquadric with r0 = 0.6:

r0 = 0.6;
phi = @(x) 1./sqrt(norm(x)^2 + r0^2);
[stdev ivals] = TestAnRBF(phi,pts,vals,tpts,tvals);
stdev
plot(tvals,ivals,'.')

(Performance is virtually identical to the multiquadric on this example.)

RBF interpolation is for interpolation of a smooth function, not for fitting a noisy data set. By construction it exactly “honors the data” (meaning that it goes through the data points; it doesn't smooth them). If the data is in fact noisy, RBF will produce an interpolating function with spurious oscillations.
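TestAnRBF is a helper that the slide calls but never shows. A minimal sketch consistent with the multiquadric code above (only the name and signature come from the slide; the body is an assumption):

function [stdev, ivals] = TestAnRBF(phi, pts, vals, tpts, tvals)
% Fit RBF weights on (pts, vals), interpolate at the test points tpts,
% and report the standard deviation of the errors against tvals.
npts = numel(pts);
phimat = zeros(npts,npts);
for i=1:npts, for j=1:npts, phimat(i,j) = phi(pts{i}-pts{j}); end; end;
wgts = phimat \ vals;
valinterp = @(x) wgts' * cellfun(@(y) phi(norm(x-y)), pts);
ivals = cellfun(valinterp, tpts);
stdev = std(ivals - tvals);
hist(ivals - tvals, 50)
end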