Lecture 5 More loops Introduction to maximum likelihood estimation Trevor A. Branch FISH 553 Advanced R School of Aquatic and Fishery Sciences University of Washington


More on loops
Very often, we need to loop through a series of values, calculate something for each, and store the result. General strategy:
– Find out how many iterations there will be (niter)
– Make a vector of the values to loop over
– Create a results vector of length niter
– Loop from i in 1:niter
– Conduct the analysis based on the ith element of values
– Store the result in the ith element of results

Storing for-loop results, vector

mean.norm <- function(n = c(5, 10, 15, 30, 50, 100)) {
  niter <- length(n)                # number of iterations
  values <- vector(length = niter)  # create a vector to store the results
  for (i in 1:niter) {              # loop from 1 to niter, not over the set in n
    # calculate using the ith element of n; store in the ith element of values
    values[i] <- sd(rnorm(n = n[i], mean = 0, sd = 1))
  }
  return(values)                    # return the vector of answers
}
mean.norm()

[Plot: SD of samples against number of samples drawn from rnorm(mean=0, sd=1)]

Storing for-loop results, matrix

mean.norm.mat <- function(n = c(5, 10, 15, 30, 50, 100), nrep = 100) {
  niter <- length(n)
  values <- matrix(nrow = nrep, ncol = niter)  # create a matrix to store the results
  for (i in 1:nrep) {      # add another loop for nrep
    for (j in 1:niter) {
      values[i, j] <- sd(rnorm(n = n[j], mean = 0, sd = 1))
    }
  }
  return(values)           # return the matrix of answers
}
x <- mean.norm.mat(nrep = 1000)

Side note: this is not necessarily the fastest way to program this.
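As the side note says, the double loop is not necessarily the fastest way. One alternative sketch (the function name mean.norm.mat2 is mine, not from the lecture) replaces both loops with replicate() and sapply(); it fills a matrix of the same shape, though the random values will of course differ.

```r
# Vectorized alternative: for each sample size nj, replicate() draws
# nrep standard deviations; sapply() binds the results into a matrix
# with nrep rows and one column per sample size.
mean.norm.mat2 <- function(n = c(5, 10, 15, 30, 50, 100), nrep = 100) {
  sapply(n, function(nj) replicate(nrep, sd(rnorm(n = nj, mean = 0, sd = 1))))
}
x <- mean.norm.mat2(nrep = 1000)
dim(x)   # 1000 replicates (rows) by 6 sample sizes (columns)
```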

[Histograms of the standard deviations for n = 5, 10, 15, 30, 50, 100]

Fitting models to data
Scenario: the length (L) of a fish is related to its age (a) according to the von Bertalanffy growth curve
  L = L∞[1 − e^(−K(a − t0))]
where L∞ is the asymptotic maximum length in cm, K is the growth rate, and t0 is the age at zero length.
[Scatterplots of length vs. age for males and females]
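The growth curve translates directly into R. A minimal sketch (the function name and the example parameter values are illustrative, not estimates from the class data):

```r
# von Bertalanffy growth curve: predicted length at a given age
vb.length <- function(age, Linfinity, K, t0) {
  Linfinity * (1 - exp(-K * (age - t0)))
}
vb.length(age = 10, Linfinity = 100, K = 0.2, t0 = 0)  # about 86.5 cm
```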

[Plot: one possible model fit (males), length (cm) against age (years)]

Questions to answer
– Estimate the parameters (L∞, K, t0, σ) for this equation separately for males and females
– Calculate 95% confidence intervals for asymptotic length
– Can we conclude that asymptotic length differs for males and females?

What is maximum likelihood?
To answer these questions, we will use maximum likelihood estimation. Maximum likelihood is a method (in fact the optimal method) for fitting a mathematical model to some data. “Fitting” means estimating the values of the model parameters that will ensure the model is closest to the data points.

Simplifying the problem
The full von Bertalanffy model is:
  L = L∞[1 − e^(−K(a − t0))]
However, the t0 parameter is usually close to zero. To simplify the model, we will assume that t0 = 0, and therefore
  L = L∞(1 − e^(−Ka))
where L is length (cm), L∞ is the asymptotic maximum length (cm), K is the growth rate (yr⁻¹), and a is age (yr).
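To see why dropping t0 is usually harmless, compare the full and simplified curves for a t0 near zero (a sketch with made-up parameter values, not estimates):

```r
vb.full   <- function(a, Linf, K, t0) Linf * (1 - exp(-K * (a - t0)))
vb.simple <- function(a, Linf, K) Linf * (1 - exp(-K * a))   # assumes t0 = 0
ages <- 0:20
# With t0 = -0.1 the two curves never differ by more than about 2 cm,
# and the difference shrinks as age increases
max(abs(vb.full(ages, 100, 0.2, -0.1) - vb.simple(ages, 100, 0.2)))
```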

Data, parameters, and error
[Slide annotating the model equation: which quantities are data, which are parameters, and which is the error term]

In-class exercise 1
Create a function VB.fit() with arguments Linfinity, K, gender ("male" or "female"), and filename. Tasks of the function (next slide):
– Read in the data from file "LengthAge.csv" (Canvas)
– Extract a vector of lengths, and a vector of ages for the specified gender (e.g. gender = "male")
– Plot the lengths as a function of ages
– Create a vector of xvalues for the model fit from age 0 to the maximum age in the data
– Apply the von Bertalanffy model to the xvalues to get a vector of yvalues
– Use lines() to plot xvalues against yvalues
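A sketch of one possible implementation (not the official solution; the column names Gender, Ages, and Lengths match the extraction code shown later in this lecture):

```r
# One way to write the exercise function: scatterplot of the data for
# one gender, with the simplified (t0 = 0) von Bertalanffy curve on top.
VB.fit <- function(Linfinity, K, gender, filename = "LengthAge.csv") {
  LA <- read.csv(filename)
  ages <- LA[LA$Gender == gender, ]$Ages        # ages for this gender
  lengths <- LA[LA$Gender == gender, ]$Lengths  # matching lengths
  plot(ages, lengths, xlab = "Age (years)", ylab = "Length (cm)")
  xvalues <- seq(0, max(ages), length.out = 100)
  yvalues <- Linfinity * (1 - exp(-K * xvalues))  # simplified VB model
  lines(xvalues, yvalues)
}
```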

[Simple plot (no beautification): gender = "male", Linf = 100, K = 0.2]

Different values for Linf and K lead to different model fits (lines); some model fits are “better” than others.
[Beautified plots for gender = "male": Linf = 110, K = 0.5; Linf = 110, K = 0.2; Linf = 110, K = 0.1; Linf = 90, K = 0.3]

How do we define “best fit”?
– The “best” model fit should go through the middle of the data points
– There should be about as many points above the model fit as below the curve
– There should be few points that are far from the model fit (“outliers”)

Residuals (vertical lines)
The differences between the observed lengths Lobs and the predicted lengths Lpred are the residuals. Candidates for “best fit” might minimize the absolute values of the residuals, or the squares of the residuals (also called the “sum of squares”).
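In R, the residuals and their sum of squares are one line each (illustrative numbers, not the class data):

```r
ages   <- c(2, 5, 8, 11)
obs.L  <- c(25, 58, 75, 93)                 # observed lengths (made up)
pred.L <- 100 * (1 - exp(-0.2 * ages))      # model-predicted lengths
residuals <- obs.L - pred.L                 # vertical distances to the curve
ssq <- sum(residuals^2)                     # the "sum of squares"
```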

Normal likelihoods
[Plot: length (cm) against age (years), with a normal curve drawn at each age]
Each curve is a normal distribution with the mean at the model prediction. The height of the curve at each red data point is the likelihood. The likelihood depends on the curve (normal), the model prediction at that age (the mean), the data points, and the standard deviation chosen (σ = 10 here).

Zoom in...
This data point has a high likelihood: the curve here is high. This data point has a lower likelihood. This data point has a very low likelihood; it is so far in the tails of the curve that I didn’t even plot the curve out here. The highest likelihood is when the data point equals the model-predicted length.

Normal likelihood: SD = 10
[Plot: length (cm) against age (years)]

Normal likelihood: SD = 7
[Plot: length (cm) against age (years)]
When the standard deviation (sigma) is smaller, the likelihood is higher near the model prediction, but lower far from the model prediction.

Normal likelihood: SD = 15
[Plot: length (cm) against age (years)]
When the standard deviation (sigma) is bigger, the likelihood is lower near the model prediction, but higher far from the model prediction.
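The effect of σ on the likelihood can be checked directly with dnorm() (made-up numbers: a model prediction of 50 cm, one observation near it and one far from it):

```r
# Likelihood of a "near" point (obs = 52) and a "far" point (obs = 80)
# given a model prediction of 50, under a small and a large sigma
near.small <- dnorm(52, mean = 50, sd = 7)
near.big   <- dnorm(52, mean = 50, sd = 15)
far.small  <- dnorm(80, mean = 50, sd = 7)
far.big    <- dnorm(80, mean = 50, sd = 15)
near.small > near.big   # TRUE: smaller sigma -> higher likelihood near the prediction
far.small < far.big     # TRUE: smaller sigma -> lower likelihood far from it
```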

Maximum likelihood estimate (MLE)
For every data point Li,obs, calculate the likelihood from the normal distribution, given a VB-predicted length Li,pred and a standard deviation (σ). For one data point,
  Li = 1/(σ√(2π)) × exp[−(Li,obs − Li,pred)² / (2σ²)]
(this is the equation behind each data point’s curve in the previous plots). The total likelihood of the model fit to the data is the product of the individual Li values. The maximum likelihood estimates are the values of K, L∞, and σ that result in von Bertalanffy-predicted lengths Li,pred that maximize the total likelihood.

Negative log-likelihood
Products of likelihoods produce tiny numbers, such as 10⁻⁸⁰, that computers don’t handle very well. However, the maximum likelihood estimates occur at the same values of K, L∞, and σ as the minimum negative log-likelihood. Start with the normal likelihood
  Li = 1/(σ√(2π)) × exp[−(Li,obs − Li,pred)² / (2σ²)]
and take the negative log-likelihood
  −ln Li = ln(σ) + ½ln(2π) + (Li,obs − Li,pred)² / (2σ²)
The negative flips the curve so we look for a minimum instead of a maximum. The logarithm turns tiny numbers into usable numbers: −ln(10⁻⁸⁰) ≈ 184.
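A quick illustration of the underflow problem and of the log fix:

```r
x <- rep(1e-10, 50)   # fifty tiny likelihoods
prod(x)               # 1e-500 underflows to exactly 0 in double precision
sum(log(x))           # about -1151: an entirely ordinary number
-log(1e-80)           # the slide's example: about 184
```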

Product of likelihoods, sum of −lnL
The MLE is where the product of the likelihoods is highest. Now log(x₁ × x₂ × ...) = log(x₁) + log(x₂) + ..., therefore
  −ln Ltot = Σᵢ [ln(σ) + ½ln(2π) + (Li,obs − Li,pred)² / (2σ²)]
Simplifying produces the normal negative log-likelihood:
  −ln Ltot = n·ln(σ) + (n/2)·ln(2π) + 1/(2σ²) × Σᵢ (Li,obs − Li,pred)²
Note the residuals Lobs − Lpred here: the final term is the sum of squares!
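As a sanity check (with made-up numbers), the simplified formula agrees with R's built-in normal log-density summed over the data:

```r
obs   <- c(30, 52, 64, 70)    # observed lengths (made up)
pred  <- c(33, 48, 61, 72)    # model-predicted lengths (made up)
sigma <- 7
n <- length(obs)
# Closed-form normal negative log-likelihood
NLL.formula <- n*log(sigma) + 0.5*n*log(2*pi) +
  1/(2*sigma^2) * sum((obs - pred)^2)
# The same quantity from dnorm() with log = TRUE
NLL.dnorm <- -sum(dnorm(obs, mean = pred, sd = sigma, log = TRUE))
all.equal(NLL.formula, NLL.dnorm)   # TRUE
```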

Calculating −lnLtot in R code

LA <- read.csv(filename)                      # read in the data
ages <- LA[LA$Gender == gender, ]$Ages        # extract ages for one gender
lengths <- LA[LA$Gender == gender, ]$Lengths  # and the matching lengths
model.predL <- Linfinity*(1 - exp(-K*ages))   # VB-predicted lengths (ages is a vector)
ndata <- length(ages)                         # n is the number of data points
NLL <- 0.5*ndata*log(2*pi) + ndata*log(sigma) +
  1/(2*sigma*sigma) * sum((lengths - model.predL)^2)  # the negative log-likelihood

Here filename, gender, Linfinity, K, and sigma are function parameters.
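The same calculation can be wrapped in a function and tried on simulated data (the ages and lengths below are synthetic, generated from known parameters; the real data file LengthAge.csv is on Canvas):

```r
# Negative log-likelihood of the simplified VB model, as a function
VB.NLL <- function(Linfinity, K, sigma, ages, lengths) {
  model.predL <- Linfinity * (1 - exp(-K * ages))
  ndata <- length(ages)
  0.5*ndata*log(2*pi) + ndata*log(sigma) +
    1/(2*sigma*sigma) * sum((lengths - model.predL)^2)
}
set.seed(1)
ages <- rep(1:15, each = 2)
lengths <- 100*(1 - exp(-0.2*ages)) + rnorm(length(ages), sd = 7)
# The NLL is smaller (better) at the true parameters than far from them
VB.NLL(100, 0.2, 7, ages, lengths) < VB.NLL(60, 0.5, 7, ages, lengths)   # TRUE
```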

Using manipulate()
Package "manipulate" has a function manipulate() which allows the user to create interactive plots in RStudio (and sadly only RStudio). Each argument in the function must be called in exactly the same order in manipulate(). You can use sliders, pickers, and checkboxes to allow the user to choose different values for the function arguments. The function is then rerun and the results plotted with the new values.

How I used manipulate()

VB.like.sliders <- function(gender, Linfinity, K, sigma,
                            add.curve, add.lines) {
  # make plot, calculate NLL, report NLL
}
manipulate(VB.like.sliders(gender, Linfinity, K, sigma,
                           add.curve, add.lines),
           gender = picker("Male", "Female"),
           Linfinity = slider(min=40, max=120, initial=90, step=0.001),
           K = slider(min=0.05, max=0.7, initial=0.3, step=0.001),
           sigma = slider(min=2, max=30, initial=7, step=0.001),
           add.curve = checkbox(TRUE, "Add curve"),
           add.lines = checkbox(TRUE, "Add lines"))

In-class exercise 2
Group class activity to find the lowest possible total negative log-likelihood (males):
– Use RStudio and package manipulate to run the code in "Lecture 5 manipulate.r"
– Play with the sliders for K, Linf, and σ to find values that produce the smallest possible negative log-likelihood, −lnLtot
– If you find a better value, go to the board and write down the values you found for K, Linf, σ, and −lnLtot
– At home: go through the code and figure it out