Maximum Likelihood Estimate Jyh-Shing Roger Jang ( 張智星 ) CSIE Dept, National Taiwan University.

2 Intro. to Maximum Likelihood Estimate
- MLE: maximum likelihood estimate
- Goal: Given a dataset with no labels, how can we find the best model, with the optimum parameters, to describe the data?
- Applications
  - Prediction
  - Analysis

3 What Are Models?
- Models are used to describe the probabilities of random variables
  - Discrete variables → Probabilities
  - Continuous variables → Probability density functions (PDF)
- Examples
  - Discrete variables
    - The outcome of tossing a coin or a die
  - Continuous variables
    - The distance to the bull's eye when throwing a dart
    - The time needed to run a 100-m dash
    - The heights of second-grade students
- Personalized PDF!

4 More about Models
- Discrete variables
  - Outcome of tossing a coin → Pr{head}=1/2, Pr{tail}=1/2
- Continuous variables
  - Distance to the bull's eye when throwing a dart → A PDF of a Gaussian (normal) distribution
- Quiz! Probability of x in [4, 6]
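For the quiz above, the probability that x falls in [4, 6] is the area under the PDF between 4 and 6. The slide does not preserve the parameters of the Gaussian, so the sketch below assumes a hypothetical mean of 5 and standard deviation of 1; the function names are illustrative only.

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(a, b, mu, sigma):
    """Probability that x lies in [a, b]: the area under the PDF from a to b."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Hypothetical parameters: mu and sigma are assumptions, not from the slides.
prob = interval_probability(4.0, 6.0, mu=5.0, sigma=1.0)
print(f"Pr(4 <= x <= 6) = {prob:.4f}")
```

With these assumed parameters the interval is one standard deviation either side of the mean, so the result is about 0.68. A different mean or spread gives a different answer.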

5 Basic Steps in MLE
- Steps
  1. Perform a certain experiment to collect the data.
  2. Choose a parametric model of the data, with certain modifiable parameters.
  3. Formulate the likelihood as an objective function to be maximized.
  4. Maximize the objective function and derive the parameters of the model.
- Examples
  - Toss a coin → To find the probabilities of head and tail
  - Throw a dart → To find your PDF of distance to the bull's eye
  - Sample a group of animals → To find the quantity of animals
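The four steps above can be sketched for the coin-toss example as follows. This is a minimal illustration, not the slides' own procedure: the toss data are made up, and the maximization uses a simple grid search rather than a closed-form solution.

```python
import math

# Step 1: collect data (hypothetical tosses; 1 = head, 0 = tail).
tosses = [1, 0, 1, 1, 0, 1, 0, 1]

# Step 2: choose a parametric model: Pr{head} = p, Pr{tail} = 1 - p.
# Step 3: formulate the (log) likelihood as the objective function.
def log_likelihood(p, data):
    heads = sum(data)
    tails = len(data) - heads
    return heads * math.log(p) + tails * math.log(1.0 - p)

# Step 4: maximize the objective over a grid of candidate values of p.
grid = [i / 1000 for i in range(1, 1000)]  # excludes 0 and 1, where log is undefined
p_hat = max(grid, key=lambda p: log_likelihood(p, tosses))

print(f"Estimated Pr{{head}} = {p_hat}")
```

The grid search is used only because it makes step 4 explicit. For this model the maximizer is also available in closed form as the fraction of heads.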

6 Probability Model for Discrete Variables
- Toss an unfair coin 5 times to get 3 heads and 2 tails
- Our intuition: Pr{head}=3/5, Pr{tail}=2/5
- By MLE
  - Assume these 5 tosses are independent events, so the overall probability is the product of the per-toss probabilities: p^3 (1-p)^2, where p = Pr{head}
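The formula for the overall probability is not preserved in this transcript. One way to carry out the MLE step, assuming the symbol p for Pr{head} (a notation introduced here), is to maximize the log of the likelihood by setting its derivative to zero:

```latex
\begin{align*}
L(p) &= p^{3}(1-p)^{2}, \\
\ln L(p) &= 3\ln p + 2\ln(1-p), \\
\frac{d}{dp}\ln L(p) &= \frac{3}{p} - \frac{2}{1-p} = 0
  \;\Longrightarrow\; 3(1-p) = 2p
  \;\Longrightarrow\; \hat{p} = \frac{3}{5}.
\end{align*}
```

This agrees with the intuitive answer Pr{head}=3/5, Pr{tail}=2/5.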

7 Inequality of Arithmetic and Geometric Means
- AM-GM inequality
- Proof of this inequality: Wikipedia
- How to use the inequality to solve the MLE problem? Quiz!
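A possible answer to the quiz, worked for the coin example of 3 heads and 2 tails. The symbol p for Pr{head} and the particular choice of terms are assumptions made here for illustration. The AM-GM inequality states that for non-negative numbers the arithmetic mean is at least the geometric mean, with equality exactly when all the numbers are equal. Applying it to three copies of p/3 and two copies of (1-p)/2, whose sum is always 1:

```latex
\begin{align*}
\frac{a_1 + a_2 + \cdots + a_n}{n} &\ge \sqrt[n]{a_1 a_2 \cdots a_n}
  \quad (a_i \ge 0), \\
\frac{3\cdot\frac{p}{3} + 2\cdot\frac{1-p}{2}}{5} = \frac{1}{5}
  &\ge \sqrt[5]{\left(\frac{p}{3}\right)^{3}\left(\frac{1-p}{2}\right)^{2}}, \\
p^{3}(1-p)^{2} &\le \frac{3^{3}\,2^{2}}{5^{5}} = \frac{108}{3125}.
\end{align*}
```

Equality holds when p/3 = (1-p)/2, that is, when p = 3/5, so the likelihood p^3 (1-p)^2 reaches its maximum there without any calculus.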

8 Probability Model for Discrete Variables
- Toss a 3-sided die many times and obtain n1 of side 1, n2 of side 2, and n3 of side 3. What are the most likely probabilities for sides 1, 2, and 3, respectively?
- Our intuition…
- By MLE…
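The slide leaves both answers elided. A standard result for this kind of model is that the likelihood p1^n1 p2^n2 p3^n3, subject to p1 + p2 + p3 = 1, is maximized by the relative frequencies ni / (n1 + n2 + n3). The sketch below checks that numerically; the counts are hypothetical and the function names are illustrative only.

```python
import math

def mle_categorical(counts):
    """Relative frequencies: the MLE of the side probabilities."""
    total = sum(counts)
    return [c / total for c in counts]

def log_likelihood(probs, counts):
    """Log of p1^n1 * p2^n2 * p3^n3 (sides with zero count are skipped)."""
    return sum(c * math.log(p) for c, p in zip(counts, probs) if c > 0)

counts = [20, 30, 50]            # hypothetical n1, n2, n3
p_hat = mle_categorical(counts)  # relative frequencies
uniform = [1 / 3, 1 / 3, 1 / 3]  # a competing guess for comparison

print("MLE:", p_hat)
print("log-likelihood at MLE:    ", log_likelihood(p_hat, counts))
print("log-likelihood at uniform:", log_likelihood(uniform, counts))
```

The log-likelihood at the relative frequencies is higher than at the uniform guess, as expected for the maximizer.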

9 MLE for PDF of Continuous Variables of 1D
- Detailed coverage
- PDF
- Overall PDF, or likelihood
- Log likelihood
- MLE
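The formulas for this slide are not preserved in the transcript. Assuming the model is a 1D Gaussian (as the Q & A slide suggests), the standard MLE results are the sample mean and the sample variance with divisor n. The sketch below computes both on made-up data and checks that a perturbed mean gives a lower log likelihood; the names are illustrative only.

```python
import math

def gaussian_mle_1d(samples):
    """MLE of a 1D Gaussian: sample mean and variance with divisor n (not n - 1)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def gaussian_log_likelihood(samples, mu, var):
    """Sum of log PDFs of independent samples under N(mu, var)."""
    n = len(samples)
    sq = sum((x - mu) ** 2 for x in samples)
    return -0.5 * n * math.log(2.0 * math.pi * var) - sq / (2.0 * var)

samples = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical measurements
mu_hat, var_hat = gaussian_mle_1d(samples)

print("mean:", mu_hat, "variance:", var_hat)
print("log-likelihood at MLE:      ", gaussian_log_likelihood(samples, mu_hat, var_hat))
print("log-likelihood at mean + 1: ", gaussian_log_likelihood(samples, mu_hat + 1.0, var_hat))
```

Note that the MLE variance divides by n, so it differs from the unbiased estimator that divides by n - 1.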

10 MLE for PDF of Continuous Variables of ND
- Detailed coverage
- PDF
- Overall PDF, or likelihood
- Log likelihood
- MLE
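As with the 1D case, the formulas are missing from the transcript. Assuming an N-dimensional Gaussian model, the standard MLE results are the sample mean vector and the sample covariance matrix with divisor n. A minimal pure-Python sketch on made-up 2D points follows; the function name is illustrative only.

```python
def gaussian_mle_nd(samples):
    """MLE of an N-D Gaussian: mean vector and covariance matrix (divisor n)."""
    n = len(samples)
    d = len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [
        [sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / n
         for j in range(d)]
        for i in range(d)
    ]
    return mean, cov

points = [[1.0, 2.0], [3.0, 6.0], [5.0, 4.0], [7.0, 8.0]]  # hypothetical 2D data
mean_hat, cov_hat = gaussian_mle_nd(points)

print("mean vector:", mean_hat)
print("covariance matrix:", cov_hat)
```

Plain lists are used here to keep the example dependency-free. An array library would express the same computation more compactly.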

11 Q & A
- Questions
  - Can we choose other probability models instead of Gaussian/normal distributions? → Yes!
  - What other PDFs are available?
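As one illustration of a non-Gaussian choice (an example picked here, not one named in the slides), the exponential distribution with PDF λ·exp(-λx) for x ≥ 0 has a closed-form MLE: λ equals the number of samples divided by their sum. The data below are made up, and the function names are illustrative only.

```python
import math

def exponential_mle(samples):
    """MLE of the rate of an exponential distribution: n / sum(x)."""
    return len(samples) / sum(samples)

def exponential_log_likelihood(samples, lam):
    """Sum of log PDFs, log(lam) - lam * x, over independent samples."""
    return len(samples) * math.log(lam) - lam * sum(samples)

waits = [0.5, 1.5, 1.0, 3.0]  # hypothetical waiting times
lam_hat = exponential_mle(waits)

print("estimated rate:", lam_hat)
print("log-likelihood at MLE:     ", exponential_log_likelihood(waits, lam_hat))
print("log-likelihood at rate 1.0:", exponential_log_likelihood(waits, 1.0))
```

The same four MLE steps apply. Only the parametric model in step 2 changes.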