ECE 7251: Signal Detection and Estimation Spring 2004 Prof. Aaron Lanterman Georgia Institute of Technology Lecture 1, 1/5/04: Introduction.

What is ECE7251 About?

Goal: Suck useful information out of messy data
Strategy: Formulate probabilistic model of data y, which depends on underlying parameter(s) θ
Terminology depends on parameter space:
– Detection (simple hypothesis testing): θ ∈ {0,1}, i.e., 0 = target absent, 1 = target present
– Classification (multihypothesis testing): θ ∈ {0,1,…,M}, e.g., θ ∈ {DC-9, 747, F-15, MiG-31}
– Estimation: θ ∈ R^n, C^n, etc. (not too hard); θ ∈ L^2(R) (square-integrable functions), etc. (harder)
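
To make the detection case concrete, here is a minimal sketch (mine, not from the lecture; the signal level, noise level, and sample count are made-up toy values) of a likelihood ratio test for θ ∈ {0,1} with a known signal in Gaussian noise:

```python
# Hypothetical sketch: binary detection as a likelihood ratio test,
# theta in {0, 1} = {target absent, target present}.
# Assumed model: y = theta*A + Gaussian noise, with A and sigma known.
import numpy as np

rng = np.random.default_rng(0)
A, sigma, n = 1.0, 1.0, 25           # toy signal level, noise std, samples

theta_true = 1                       # target present in this simulation
y = theta_true * A + sigma * rng.standard_normal(n)

# Log-likelihood ratio log p(y|theta=1) - log p(y|theta=0); for this
# Gaussian model it reduces to a threshold test on the sum of the data.
llr = (A / sigma**2) * np.sum(y) - n * A**2 / (2 * sigma**2)
decision = int(llr > 0)              # threshold 0 <-> equal priors/costs
print(f"LLR = {llr:.2f}, decide theta = {decision}")
```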

A Bit More Terminology

Suppose θ = (θ₁, θ₂)
If we are only interested in θ₁, then θ₂ are called nuisance parameters
If θ₁ ∈ {0,1} and θ₂ are nuisance parameters, we call it a composite hypothesis testing problem
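
As a toy illustration of composite hypothesis testing (my example, not the lecture's): deciding whether a signal is present when its amplitude is an unknown nuisance parameter. The generalized likelihood ratio test (GLRT) handles the nuisance parameter by plugging in its maximum likelihood estimate:

```python
# Hypothetical sketch: composite hypothesis test with a nuisance parameter.
# theta1 in {0,1} selects signal absent/present; the amplitude theta2 is
# unknown, so the GLRT replaces it with its MLE (the sample mean here).
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 25, 1.0
y = 0.8 + sigma * rng.standard_normal(n)   # toy data, unknown amplitude 0.8

amp_mle = np.mean(y)                       # MLE of theta2 under theta1 = 1
# GLRT statistic: 2*[max log-likelihood under H1 - log-likelihood under H0];
# for this Gaussian model it simplifies to n*ybar^2/sigma^2.
glrt = n * amp_mle**2 / sigma**2
print(f"GLRT statistic = {glrt:.2f}")      # compare to a chi-square threshold
```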

Ex: Positron Emission Tomography

Simple, traditional linear DSP-based approach
– Filtered Back Projection (FBP)
Advanced, estimation-theoretic approach
– Model Poisson “likelihood” of collected data
– Markov Random Field (MRF) “prior” on image
– Find estimate using expectation-maximization algorithm (or similar technique)
[Images: raw data, FBP reconstruction, and statistical estimate, from the webpage of Prof. Jeffrey Fessler, U. Michigan, Ann Arbor]
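
A minimal sketch of the estimation-theoretic route, assuming a made-up toy system matrix and omitting the MRF prior (this is the classic ML-EM update for a Poisson likelihood, not Fessler's actual code):

```python
# Hypothetical sketch: ML-EM iterations for a Poisson measurement model
# y ~ Poisson(A @ x). A, y, and the iteration count are toy values.
import numpy as np

rng = np.random.default_rng(2)
A = rng.uniform(0.1, 1.0, size=(40, 10))   # toy system matrix (40 bins, 10 pixels)
x_true = rng.uniform(1.0, 5.0, size=10)    # toy emission image
y = rng.poisson(A @ x_true)                # simulated counts

x = np.ones(10)                            # nonnegative starting image
sens = A.sum(axis=0)                       # sensitivity image: sum_i a_ij
for _ in range(100):
    ratio = y / np.maximum(A @ x, 1e-12)   # measured / predicted counts
    x *= (A.T @ ratio) / sens              # multiplicative ML-EM update
print(np.round(x, 2))                      # estimate approaches x_true
```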

Tasks of Statistical Signal Processing

1. Create statistical model for measured data
2. Find fundamental limitations on our ability to perform inference on the data
   – Cramér-Rao bounds, Chernoff bounds, etc.
3. Develop an optimal (or suboptimal) estimator
4. Asymptotic analysis (i.e., assume we have lots and lots of data) of estimator performance to see if it approaches bounds derived in (2)
5. Hop on the computer: Do simulations and experiments comparing algorithm performance to lower bounds and competing algorithms
(from Hero)
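
As a toy run-through of steps 2 and 5 (my example, not from Hero): for n i.i.d. N(θ, σ²) samples, the Cramér-Rao bound on the variance of unbiased estimates of θ is σ²/n, and a Monte Carlo simulation shows the sample mean attaining it:

```python
# Hypothetical sketch: compare an estimator's Monte Carlo variance to the
# Cramer-Rao bound. Assumed model: n iid N(theta, sigma^2) samples, for
# which the CRB is sigma^2 / n and the sample mean is efficient.
import numpy as np

rng = np.random.default_rng(3)
theta, sigma, n, trials = 2.0, 1.0, 50, 10_000

estimates = np.array([
    np.mean(theta + sigma * rng.standard_normal(n)) for _ in range(trials)
])
print(f"Monte Carlo variance: {estimates.var():.5f}")
print(f"Cramer-Rao bound:     {sigma**2 / n:.5f}")
```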

To be B or not to be B: That is the Bayesian question

[Photos of Sir R.A. Fisher and Rev. Thomas Bayes, from the MacTutor History of Mathematics Archive]

To be B

A Bayesian analysis treats θ as a random variable with a “prior” density p(θ)
Data generating machinery is specified by a conditional density p(y|θ)
– Gives the “likelihood” that the data y resulted from the parameters θ
Inference usually revolves around the posterior density, derived from Bayes’ theorem:

p(θ|y) = p(y|θ) p(θ) / p(y) = p(y|θ) p(θ) / ∫ p(y|θ) p(θ) dθ
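
Since the integral in the denominator is rarely available in closed form, one concrete way to see Bayes’ theorem in action is to evaluate everything on a grid. A minimal sketch (mine, assuming a made-up Gaussian likelihood and Gaussian prior for a scalar θ):

```python
# Hypothetical sketch: posterior density of a scalar theta computed on a
# grid by direct application of Bayes' theorem. Toy model: y_i ~ N(theta, 1)
# with an N(0, 1) prior on theta.
import numpy as np

rng = np.random.default_rng(4)
y = 1.5 + rng.standard_normal(20)               # toy data, true theta = 1.5

theta = np.linspace(-4, 4, 2001)                # grid over the parameter
dtheta = theta[1] - theta[0]
prior = np.exp(-theta**2 / 2)                   # N(0,1) prior, unnormalized
loglik = np.array([-0.5 * np.sum((y - t)**2) for t in theta])
post = np.exp(loglik - loglik.max()) * prior    # p(y|theta) p(theta)
post /= post.sum() * dtheta                     # divide by p(y) = integral
print(f"posterior mean = {np.sum(theta * post) * dtheta:.3f}")
```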

A Quick Side Note

We use the word “density” in a generalized sense, encapsulating probability mass functions as well
A notation like p(y|θ) could indicate discrete or continuous data y, and discrete or continuous parameters θ
What is intended will be clear from context
Just replace integrals with sums when appropriate; no harm done!
See p. 4 of Poor for a rigorous discussion

Not to be B

Trouble with Bayesian analysis: it is not always obvious what a good prior density might be
– Often chosen for computational convenience!
In a non-Bayesian analysis, the likelihood density p(y;θ) is a density in y parameterized by a nonrandom parameter θ
– It is not “conditioned” on θ in a strict probabilistic sense
Inference usually revolves around the likelihood p(y;θ)
Debates between Bayesians and non-Bayesians have been violent at times!
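
For contrast with the Bayesian sketch above, here is a minimal non-Bayesian sketch (same made-up Gaussian-mean model): treat θ as nonrandom and simply maximize p(y;θ) over a grid, with no prior anywhere in sight:

```python
# Hypothetical sketch: non-Bayesian inference treats theta as a fixed
# unknown and maximizes the likelihood p(y; theta). Toy model: y_i ~ N(theta, 1).
import numpy as np

rng = np.random.default_rng(5)
y = 1.5 + rng.standard_normal(20)              # toy data, true theta = 1.5

theta = np.linspace(-4, 4, 2001)
loglik = np.array([-0.5 * np.sum((y - t)**2) for t in theta])
theta_ml = theta[np.argmax(loglik)]            # grid-search MLE
print(f"grid MLE = {theta_ml:.3f}, closed form (sample mean) = {y.mean():.3f}")
```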

The Big Picture

θ (parameter) → random mapping p(y|θ) or p(y;θ) → y (data) → θ̂(·) (estimator, a function) → θ̂(y) (estimate, a specific value)

In detection and classification, the “estimator” is instead called a “decision rule”
Note the parameter θ we want to estimate could itself be a function, in which case θ̂ is a functional
Much of the class will be about ways of designing θ̂

“Under the Hood” of ECE7251, Spring 2004 offering

Aaron’s quest for a textbook
– Literature review
Different ways of teaching/learning the material
– What to cover?
– What order to cover it in?

“Detection, Estimation, and Modulation Theory, Part I,” Harry L. Van Trees, 1968

First “engineering” textbook on the topic
Will be republished periodically for all eternity
Some homework problems are masters thesis topics!!!
Every EE should have a copy
Unfortunately, showing its age
– All emphasis on continuous-time problems, as might be implemented with analog computers
– Ex: Discrete-time Kalman filtering relegated to a homework problem!!!
Really wish Van Trees would write a 2nd edition

Rest of Van Trees’ Masterpiece

“DEM Theory, Part II: Nonlinear Modulation Theory,” 1971
– Not as popular as the others in the series
– Recently republished
“DEM Theory, Part III: Radar-Sonar Processing and Gaussian Signals in Noise,” 1971
– Much material not readily available elsewhere
– Often republished
“DEM Theory, Part IV: Optimum Array Processing,” 2003
– 30 years in the making
– Worth the wait!

“An Introduction to Signal Detection and Estimation,” 2nd Edition, H. Vincent Poor (1988, 1994)

Excellent emphasis on general structure of detection & estimation problems
Most mathematically rigorous of all the engineering texts
Alas, writing style is often dense and difficult to penetrate
“If you have some familiarity with the topic you will undoubtedly enjoy this book, but if you are a student tackling with this for the first time it will be demanding reading. You will need considerable fluency in random variable calculus to get the most out of the book, as the author presents many results and derivations as ‘straightforward.’” – Amazon review

“Fundamentals of Statistical Signal Processing, Volume 1: Estimation Theory,” Steven Kay, 1993

“… excellent tutorial and research reference book on estimation theory. The theory is illustrated with very concrete examples; the examples give an ‘under-the-hood’ insight into the solution of some common estimation problems in signal processing. If you're a statistician, you might not like this book. If you're an engineer, you will like it.” – Amazon review
“The theory is explained well and motivated, but what makes the book great are the examples. There are many worked examples and they are chosen to make things very clear.” – Amazon review

“Fundamentals of Statistical Signal Processing, Volume 2: Detection Theory,” Steven Kay, 1998

– “This is an excellent book and is very easy to follow, unlike Poor's which is too mathematical and hard to read. However, this book does make many references to Vol. 1, the Estimation Theory, so you almost have to get both books to get a full understanding.” – Amazon review
– Friendly and easy to read, like J.R.R. Tolkien
– Sprawling and epic, like J.R.R. Tolkien (two-book sequence is a bit unwieldy for a single course)
– Seems to view the world as a series of individual problems that come up one at a time and are addressed one at a time, like J.R.R. Tolkien (less emphasis on the abstract, general structure found in Poor’s book)

“Introduction to Statistical Signal Processing,” M.D. Srinath, P.K. Rajasekaran, and R. Viswanathan, 1996

“The book is well-organized in terms of content and layout, but the mathematics is very poorly presented. Some things just fall out of mid-air. It could use some rigorous presentation.” – Amazon review
“For a mathematical text this book lacks a lot of details. Things often fall out of nowhere and it is very hard for students to follow how equations are derived.” – Amazon review
Special topics: Digital Communications, Radar, Target Tracking, Pattern Classification, System Identification

Some Other Books

“Elements of Signal Detection and Estimation,” Carl Helstrom, 1995
– Out of print, unfortunately
– A little bit wordy
– Covers some interesting (although somewhat obscure) techniques not covered elsewhere: optical communications, saddle-point integration, etc.
“Statistical Signal Processing: Detection, Estimation, and Time Series Analysis,” Louis Scharf, 1991
– Heavy emphasis on matrix analysis; unique viewpoint
– No continuous-time results
– Not recommended for ECE7251 in particular, but a good reference on the topics it covers; a good book to have close by when doing research

“Statistical Methods for Signal Processing,” Al Hero

Book-notes; a work in progress
Prof. Hero noted most recent books downplay statistical theory, and instead emphasize specific engineering applications
Applications are useful for motivation, but we must be careful to avoid obscuring common underlying structures
Manuscript strives to cover the theory of mathematical statistics more thoroughly than other books, permitting unification
The Spring 2002 and 2003 versions of ECE7251 used Poor’s book in conjunction with Hero’s notes; students disliked Poor’s book, so this semester we’ll try just using Hero’s notes
Received permission from Hero to use these notes for ECE7251, Spring 2004 – Feedback welcome!!!

Notation in Different Books

             Param.  Data   Non-B. LL   B. LL     B. Prior   B. Post.
Van Trees    a       r      p(r|a)                p(a)       p(a|r)
Kay          θ       x      p(x;θ)      p(x|θ)    p(θ)       p(θ|x)
Srinath      θ       z      p(z|θ)                p(θ)       p(θ|z)
Poor         θ       y      pθ(y)                 w(θ)       w(θ|y)
Hero         θ       x      f(x;θ)      f(x|θ)    f(θ)       f(θ|x)
Lanterman    θ       y      p(y;θ)      p(y|θ)    p(θ)       p(θ|y)

Van Trees’ and Srinath’s use of p(·|·) for a non-Bayesian likelihood is potentially misleading
Poor’s use of pθ(y) for a Bayesian likelihood is nonstandard
When discussing Kalman and Wiener filtering, Poor uses X for parameters and Y for data (common notation in the filtering literature)

Approaches of Different Authors

Srinath et al.: Det. with finite/DT data → Det. with CT data → Est. with finite/DT data → Est. with CT data/params.
Poor / Van Trees: Det. with finite/DT data → Est. with finite/DT data → Det. with CT data → Est. with CT data/params.
Hero: Est. with finite data → Det. with finite data (little emphasis on CT)
Kay: Est. with finite/DT data → Det. with finite/DT data → Det. with CT data → Est. with CT data/params. (less emphasis on CT)

Continuous-time analysis needs more mathematical machinery: Karhunen-Loève expansions

Pedagogical Question: What to do first?

View 1: Detection first, then Estimation (Van Trees / Poor / Srinath et al.)
– Detection Theory is easier; introduces concepts used in Estimation Theory in a simple context
– Detection Theory is more fun
View 2: Estimation first, then Detection (Kay / Hero)
– Detection Theory is just a special case of Estimation Theory
– Detection problems with “unknown parameters” are easier to think about if you’ve already seen Estimation Theory
– Estimation Theory is more fun
View 1 seems more common, but we’ll take View 2

Pedagogical Question: When to do Karhunen-Loève?

View 1: Don’t bother with it at all! (Scharf, Kay)
– “Most modern implementations done with discrete-time processing anyway”
– Response from View 2: “continuous-time analysis provides insight into the behavior of discrete-time implementations”
View 2a: Do after finite-data detection problems, but before estimation (Srinath et al.)
– Do all detection material at once
View 2b: Put at end of course (Poor / Van Trees / Hero)
– Material quite challenging; integral equations are scary!
We’ll take view 2b.
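
As a closing illustration (mine, not from the lecture): the discrete-time analogue of a Karhunen-Loève expansion needs no integral equations at all; the eigenvectors of the covariance matrix play the role of the KL eigenfunctions. A minimal sketch, assuming a made-up AR(1)-style covariance:

```python
# Hypothetical sketch: discrete-time Karhunen-Loeve expansion via the
# eigen-decomposition of a toy covariance matrix K_ij = rho^|i-j|.
import numpy as np

n = 100
i, j = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
K = 0.95 ** np.abs(i - j)                 # toy covariance matrix

evals, evecs = np.linalg.eigh(K)          # KL basis = eigenvectors of K
evals, evecs = evals[::-1], evecs[:, ::-1]  # sort largest-first

rng = np.random.default_rng(6)
x = np.linalg.cholesky(K) @ rng.standard_normal(n)  # one sample path
coeffs = evecs.T @ x                      # uncorrelated KL coefficients
x10 = evecs[:, :10] @ coeffs[:10]         # truncated 10-term reconstruction
print(f"relative error of 10-term KL expansion: "
      f"{np.linalg.norm(x - x10) / np.linalg.norm(x):.3f}")
```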