

1
**Maximum Likelihood Estimates and the EM Algorithms II**

Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University

2
**Part 1 Computation Tools**

3
**Include Functions in R**

Use source("file path") to load function definitions from a file.

Example, in MME.R:

    MME = function(y1, y2, y3, y4) {
        n = y1+y2+y3+y4;
        phi1 = 4.0*(y1/n-0.5);
        phi2 = 1-4*y2/n;
        phi3 = 1-4*y3/n;
        phi4 = 4.0*y4/n;
        phi = (phi1+phi2+phi3+phi4)/4.0;
        print("By MME method");
        return(phi);
        # print(phi);
    }

In R:

    > source("C:/MME.R")
    > MME(125, 18, 20, 24)
    [1] "By MME method"
    [1]

4
**Part 2 Motivation Examples**

5
**Example 1 in Genetics (1)**

Two linked loci with alleles A and a, and B and b.

A, B: dominant. a, b: recessive.

A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.

[Figure: two chromosome-pair diagrams for the loci A/a and B/b, each labeled with probability 1/2.]

6
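**Example 1 in Genetics (2)**

Probabilities for genotypes in gametes:

| | No Recombination | Recombination |
|---|---|---|
| Male | 1-r | r |
| Female | 1-r' | r' |

[Figure: two chromosome-pair diagrams for the loci A/a and B/b, each labeled with probability 1/2.]

| | AB | ab | aB | Ab |
|---|---|---|---|---|
| Male | (1-r)/2 | (1-r)/2 | r/2 | r/2 |
| Female | (1-r')/2 | (1-r')/2 | r'/2 | r'/2 |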

7
**Example 1 in Genetics (3)**

Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92.

8
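**Example 1 in Genetics (4)**

Columns are male gametes, rows are female gametes. Each cell gives the offspring genotype and its probability, which is the product of the two gamete probabilities.

| FEMALE \ MALE | AB, (1-r)/2 | ab, (1-r)/2 | aB, r/2 | Ab, r/2 |
|---|---|---|---|---|
| AB, (1-r')/2 | AABB, (1-r)(1-r')/4 | aABb, (1-r)(1-r')/4 | aABB, r(1-r')/4 | AABb, r(1-r')/4 |
| ab, (1-r')/2 | AaBb, (1-r)(1-r')/4 | aabb, (1-r)(1-r')/4 | aaBb, r(1-r')/4 | Aabb, r(1-r')/4 |
| aB, r'/2 | AaBB, (1-r)r'/4 | aabB, (1-r)r'/4 | aaBB, rr'/4 | AabB, rr'/4 |
| Ab, r'/2 | AABb, (1-r)r'/4 | aAbb, (1-r)r'/4 | aABb, rr'/4 | AAbb, rr'/4 |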

9
**Example 1 in Genetics (5)**

Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.

A*: the dominant phenotype from (Aa, AA, aA).
a*: the recessive phenotype from aa.
B*: the dominant phenotype from (Bb, BB, bB).
b*: the recessive phenotype from bb.

A*B*: 9 gametic combinations.
A*b*: 3 gametic combinations.
a*B*: 3 gametic combinations.
a*b*: 1 gametic combination.
Total: 16 combinations.
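The 9:3:3:1 split can be checked by brute-force enumeration. The R sketch below is an illustration only (the variable names are not from the slides): it pairs each of the four male gametes with each of the four female gametes and classifies the resulting phenotype.

```r
# Enumerate the 4 x 4 pairings of male and female gametes
gametes <- c("AB", "Ab", "aB", "ab")
pairs <- expand.grid(male = gametes, female = gametes, stringsAsFactors = FALSE)

# A locus shows the dominant phenotype if either gamete carries the dominant allele
# (fixed = TRUE makes the match literal and case-sensitive)
dominantA <- grepl("A", pairs$male, fixed = TRUE) | grepl("A", pairs$female, fixed = TRUE)
dominantB <- grepl("B", pairs$male, fixed = TRUE) | grepl("B", pairs$female, fixed = TRUE)

phenotype <- paste0(ifelse(dominantA, "A*", "a*"), ifelse(dominantB, "B*", "b*"))
print(table(phenotype))
```

Note that this counts combinations only; the probabilities of the four phenotypes depend on the recombination fractions r and r'.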

10
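**Example 1 in Genetics (6)**

Let $\varphi = (1-r)(1-r')$, then

$$P(A^*B^*) = \frac{2+\varphi}{4}, \quad P(A^*b^*) = P(a^*B^*) = \frac{1-\varphi}{4}, \quad P(a^*b^*) = \frac{\varphi}{4}.$$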

11
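**Example 1 in Genetics (7)**

Hence, a random sample of size $n$ from the offspring of selfed heterozygotes follows a multinomial distribution:

$$(y_1, y_2, y_3, y_4) \sim \text{Multinomial}\left(n;\ \frac{2+\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{\varphi}{4}\right).$$

We know that $\varphi = (1-r)(1-r')$, $0 \le r \le 1/2$, and $0 \le r' \le 1/2$. So $1/4 \le \varphi \le 1$.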

12
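**Example 1 in Genetics (8)**

Suppose that we observe the data $y = (y_1, y_2, y_3, y_4) = (125, 18, 20, 24)$, which is a random sample from

$$\text{Multinomial}\left(n;\ \frac{2+\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{1-\varphi}{4},\ \frac{\varphi}{4}\right).$$

Then the probability mass function is

$$g(y, \varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}.$$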
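As a quick numerical companion, the log of this probability mass function can be evaluated with base R's `dmultinom`. The sketch assumes the cell probabilities ((2+phi)/4, (1-phi)/4, (1-phi)/4, phi/4), which are the ones consistent with the score function `f1` used in the R code later in these slides; the function and variable names are illustrative.

```r
# Observed counts used throughout the slides
y <- c(125, 18, 20, 24)

# Assumed cell probabilities as a function of phi
cellProb <- function(phi) c(2 + phi, 1 - phi, 1 - phi, phi) / 4

# Multinomial log-likelihood at a given phi
loglikPhi <- function(phi) dmultinom(y, prob = cellProb(phi), log = TRUE)

# Crude grid search over part of the admissible range [1/4, 1)
phiGrid <- seq(0.25, 0.99, by = 0.01)
ll <- sapply(phiGrid, loglikPhi)
phiGrid[which.max(ll)]
```

A grid search like this only locates the maximizer to the grid spacing; the root-finding methods in Part 3 refine it.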

13
**Maximum Likelihood Estimate (MLE)**

Maximize the likelihood: solve the score equations, which are obtained by setting the first derivatives of the log-likelihood to zero. Under regularity conditions, the MLE is consistent, asymptotically efficient, and asymptotically normal.

14
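**MLE for Example 1 (1)**

Likelihood:

$$L(\varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}.$$

MLE: $\hat\varphi$ maximizes $\ell(\varphi) = \log L(\varphi)$ and solves the score equation

$$\ell'(\varphi) = \frac{y_1}{2+\varphi} - \frac{y_2+y_3}{1-\varphi} + \frac{y_4}{\varphi} = 0.$$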

15
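**MLE for Example 1 (2)**

Multiplying the score equation $\frac{y_1}{2+\varphi} - \frac{y_2+y_3}{1-\varphi} + \frac{y_4}{\varphi} = 0$ by $(2+\varphi)(1-\varphi)\varphi$ gives a quadratic $A\varphi^2 + B\varphi + C = 0$ with

$$A = y_1+y_2+y_3+y_4 = n, \quad B = -(y_1 - 2y_2 - 2y_3 - y_4), \quad C = -2y_4.$$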
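For this example the MLE has a closed form, which makes a handy reference value for the iterative methods that follow. The sketch below assumes the score function y1/(2+phi) - (y2+y3)/(1-phi) + y4/phi (the `f1` of the later R code); clearing denominators turns it into the quadratic n\*phi^2 - (y1 - 2\*y2 - 2\*y3 - y4)\*phi - 2\*y4 = 0. The names `qa`, `qb`, `qc`, and `phiHat` are illustrative.

```r
y1 <- 125; y2 <- 18; y3 <- 20; y4 <- 24
n <- y1 + y2 + y3 + y4

# Coefficients of the quadratic obtained by clearing denominators in the score
qa <- n
qb <- -(y1 - 2 * y2 - 2 * y3 - y4)
qc <- -2 * y4

# Both roots of the quadratic; only one lies in (0, 1)
roots <- (-qb + c(-1, 1) * sqrt(qb^2 - 4 * qa * qc)) / (2 * qa)
phiHat <- roots[roots > 0 & roots < 1]

# The score should vanish (up to rounding) at the admissible root
score <- function(phi) y1 / (2 + phi) - (y2 + y3) / (1 - phi) + y4 / phi
c(phiHat = phiHat, scoreAtRoot = score(phiHat))
```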

16
MLE for Example 1 (3) Checking: 1. 2. 3. Compare ?

17
**Part 3 Numerical Solutions for the Score Equations of MLEs**

18
**A Banach Space**

A Banach space $B$ is a normed vector space over the field $K$ (the real or complex numbers) such that every Cauchy sequence of $B$ converges in $B$ (i.e., $B$ is complete).

19
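**Lipschitz Continuous**

Let $S$ be a closed subset of a Banach space and $F: S \to S$ a mapping.

$F$ is Lipschitz continuous on $S$ with constant $L$ if $\|F(x) - F(y)\| \le L\,\|x - y\|$ for all $x, y \in S$.

$F$ is a contraction mapping on $S$ if $F$ is Lipschitz continuous and $L < 1$.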

20
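**Fixed Point Theorem (1)**

If $F$ is a contraction mapping on a closed subset $S$ of a Banach space, i.e. $F$ is Lipschitz continuous on $S$ with $L < 1$, then $F$ has a unique fixed point $x^* \in S$ such that $F(x^*) = x^*$, and for any initial $x_0 \in S$ the iterates $x_{k+1} = F(x_k)$ converge to $x^*$.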
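A minimal R sketch of the iteration in the theorem, using an example that is not from the slides: g(x) = cos(x) maps [0, 1] into itself and |g'(x)| = |sin(x)| <= sin(1) < 1 there, so it is a contraction. The function name `fixedPoint` and its arguments are illustrative.

```r
# Generic fixed-point iteration x_{k+1} = g(x_k)
fixedPoint <- function(g, x0, tol = 1e-10, maxIter = 1000) {
  x <- x0
  for (k in seq_len(maxIter)) {
    xNew <- g(x)
    if (abs(xNew - x) < tol) {
      return(list(root = xNew, iterations = k))
    }
    x <- xNew
  }
  stop("no convergence within maxIter iterations")
}

res <- fixedPoint(cos, x0 = 0.5)
res$root          # approximate fixed point of cos
res$iterations    # number of iterations used
```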

21
**Fixed Point Theorem (2)**

22
**Applications for MLE (1)**

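Numerical solution: find $\hat\varphi$ such that $\ell'(\hat\varphi) = 0$. Suppose $F(\varphi) = \varphi + \alpha\,\ell'(\varphi)$ is a contraction mapping and $\varphi_0$ is an initial value. Then the iterates $\varphi_{k+1} = F(\varphi_k)$ converge to $\hat\varphi$, since $F(\hat\varphi) = \hat\varphi$.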

23
**Applications for MLE (2)**

How to choose $\alpha$ such that $F(\varphi) = \varphi + \alpha\,\ell'(\varphi)$ is a contraction mapping? What is the optimal $\alpha$?
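One way to explore this question numerically: for F(phi) = phi + alpha \* l'(phi), the local contraction factor at the root is |F'(phi_hat)| = |1 + alpha \* l''(phi_hat)|. The sketch below assumes the score and its derivative from Example 1 (the `f1` and `f2` of the later R code) and the choice alpha = -1 / l''(start), which is what the `SimpleIteration` function in these slides uses; `uniroot` serves only as a reference root finder, and all names are illustrative.

```r
y1 <- 125; y2 <- 18; y3 <- 20; y4 <- 24
score  <- function(phi) y1 / (2 + phi) - (y2 + y3) / (1 - phi) + y4 / phi
dscore <- function(phi) -y1 / (2 + phi)^2 - (y2 + y3) / (1 - phi)^2 - y4 / phi^2

# Reference root of the score (base R root finder)
phiHat <- uniroot(score, interval = c(0.3, 0.9), tol = 1e-12)$root

# |F'(phiHat)| when alpha = -1 / dscore(start)
contraction <- function(start) abs(1 - dscore(phiHat) / dscore(start))

rates <- c(start0.9 = contraction(0.9),
           start0.25 = contraction(0.25),
           startAtRoot = contraction(phiHat))
print(rates)
```

A factor closer to 0 means faster local convergence. Taking alpha = -1 / l''(phi_hat) drives the factor to 0, which is the idea behind updating alpha at every step as in the Newton-Raphson method.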

24
**Parallel Chord Method (1)**

Parallel chord method is also called simple iteration.

25
**Parallel Chord Method (2)**

26
**Plot Parallel Chord Method by R (1)**

    ### Simple iteration ###
    y1 = 125; y2 = 18; y3 = 20; y4 = 24
    # First and second derivatives of log likelihood #
    f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
    f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
    x = c(10:80)*0.01
    y = f1(x)
    plot(x, y, type = 'l', main = "Parallel chord method", xlab = expression(varphi),
         ylab = "First derivative of log likelihood function")
    abline(h = 0)

27
**Plot Parallel Chord Method by R (2)**

    phi0 = 0.25  # Given the initial value 0.25 #
    segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
    segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)
    # Use the tangent line to find the intercept b0 #
    b0 = f1(phi0)-f2(phi0)*phi0
    curve(f2(phi0)*x+b0, add = T, col = "red")
    phi1 = -b0/f2(phi0)  # Find the closer phi #
    segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
    segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)
    # Use the parallel line to find the intercept b1 #
    b1 = f1(phi1)-f2(phi0)*phi1
    curve(f2(phi0)*x+b1, add = T, col = "red")

28
**Define Functions for Example 1 in R**

We define the functions and variables used to find the MLE in Example 1 with R.

    # First, second and third derivatives of log likelihood #
    f1 = function(y1, y2, y3, y4, phi){y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
    f2 = function(y1, y2, y3, y4, phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
    f3 = function(y1, y2, y3, y4, phi) {2*y1*(2+phi)^(-3)-2*(y2+y3)*(1-phi)^(-3)+2*y4*(phi)^(-3)}
    # Expected second derivative of log likelihood, i.e. the negative of the Fisher information #
    I = function(y1, y2, y3, y4, phi) {(-1)*(y1+y2+y3+y4)*(1/(4*(2+phi))+1/(2*(1-phi))+1/(4*phi))}
    y1 = 125; y2 = 18; y3 = 20; y4 = 24; initial = 0.9

29
**Parallel Chord Method by R (1)**

    > fix(SimpleIteration)
    function(y1, y2, y3, y4, initial){
        phi = NULL;
        i = 0;
        # Fixed step size from the slope at the initial value #
        alpha = -1.0/f2(y1, y2, y3, y4, initial);
        phi2 = initial;
        phi1 = initial+1;
        while(abs(phi1-phi2) >= 1.0E-5){
            i = i+1;
            phi1 = phi2;
            phi2 = alpha*f1(y1, y2, y3, y4, phi1)+phi1;
            phi[i] = phi2;
        }
        print("By parallel chord method(simple iteration)");
        return(list(phi = phi2, iteration = phi));
    }

30
**Parallel Chord Method by R (2)**

> SimpleIteration(y1, y2, y3, y4, initial)

31
**Parallel Chord Method by C/C++**

32
**Newton-Raphson Method (1)**

33
**Newton-Raphson Method (2)**

34
**Plot Newton-Raphson Method by R (1)**

    ### Newton-Raphson Method ###
    y1 = 125; y2 = 18; y3 = 20; y4 = 24
    # First and second derivatives of log likelihood #
    f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
    f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
    x = c(10:80)*0.01
    y = f1(x)
    plot(x, y, type = 'l', main = "Newton-Raphson method", xlab = expression(varphi),
         ylab = "First derivative of log likelihood function")
    abline(h = 0)

35
**Plot Newton-Raphson Method by R (2)**

    # Given the initial value 0.25 #
    phi0 = 0.25
    segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
    segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)
    # Use the tangent line to find the intercept b0 #
    b0 = f1(phi0)-f2(phi0)*phi0
    curve(f2(phi0)*x+b0, add = T, col = "purple", lwd = 2)
    # Find the closer phi #
    phi1 = -b0/f2(phi0)
    segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
    segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)

36
**Plot Newton-Raphson Method by R (3)**

    # Use the parallel line to find the intercept b1 #
    b1 = f1(phi1)-f2(phi0)*phi1
    curve(f2(phi0)*x+b1, add = T, col = "red")
    # Tangent line at phi1 (the Newton-Raphson step) #
    curve(f2(phi1)*x-f2(phi1)*phi1+f1(phi1), add = T, col = "blue", lwd = 2)
    legend(0.45, 250, c("Newton-Raphson", "Parallel chord method"),
           col = c("blue", "red"), lty = c(1, 1))

37
**Newton-Raphson Method by R (1)**

    > fix(Newton)
    function(y1, y2, y3, y4, initial){
        i = 0;
        phi = NULL;
        phi2 = initial;
        phi1 = initial+1;
        while(abs(phi1-phi2) >= 1.0E-6){
            i = i+1;
            phi1 = phi2;
            alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
            # Newton-Raphson step: slope recomputed at the current iterate #
            phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1);
            phi[i] = phi2;
        }
        print("By Newton-Raphson method");
        return (list(phi = phi2, iteration = phi));
    }

38
**Newton-Raphson Method by R (2)**

    > Newton(125, 18, 20, 24, 0.9)
    [1] "By Newton-Raphson method"
    $phi
    [1]
    $iteration
    [1]

39
**Newton-Raphson Method by C/C++**

40
**Halley’s Method**

The Newton-Raphson iteration function is

$$N(x) = x - \frac{f(x)}{f'(x)}.$$

When the objective function is very smooth, convergence can be sped up by using more expansion terms than the Newton-Raphson method does, as in the method of Edmond Halley:

$$H(x) = x - \frac{f(x)}{f'(x)} \left[1 - \frac{f(x)\,f''(x)}{2\,(f'(x))^2}\right]^{-1}.$$

(http://math.fullerton.edu/mathews/n2003/Halley'sMethodMod.html)

41
**Halley’s Method by R (1)**

    > fix(Halley)
    function( y1, y2, y3, y4, initial){
        i = 0;
        phi = NULL;
        phi2 = initial;
        phi1 = initial+1;
        while(abs(phi1-phi2) >= 1.0E-6){
            i = i+1;
            phi1 = phi2;
            alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
            # Newton-Raphson step divided by the Halley correction factor #
            phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1)*1.0/
                (1.0-f1(y1, y2, y3, y4, phi1)*f3(y1, y2, y3, y4, phi1)/
                (f2(y1, y2, y3, y4, phi1)*f2(y1, y2, y3, y4, phi1)*2.0));
            phi[i] = phi2;
        }
        print("By Halley method");
        return (list(phi = phi2, iteration = phi));
    }

42
**Halley’s Method by R (2)**

    > Halley(125, 18, 20, 24, 0.9)
    [1] "By Halley method"
    $phi
    [1]
    $iteration
    [1]

43
**Halley’s Method by C/C++**

44
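**Bisection Method (1)**

Assume that $f \in C[a, b]$ and that there exists a number $r \in [a, b]$ such that $f(r) = 0$. If $f(a)$ and $f(b)$ have opposite signs, and $\{c_n\}$ represents the sequence of midpoints generated by the bisection process, then

$$|r - c_n| \le \frac{b-a}{2^{n+1}}, \quad n = 0, 1, \ldots,$$

and the sequence $\{c_n\}$ converges to $r$. That is, $\lim_{n\to\infty} c_n = r$.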
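The standard bisection error bound |r - c_n| <= (b - a) / 2^(n+1) can be checked numerically on the score function of Example 1. This sketch makes a few assumptions that are not in the slides: the bracket [0.25, 0.99] (chosen to avoid the pole at phi = 1), the reference root r computed from the quadratic form of the score equation for these data (187\*phi^2 - 25\*phi - 48 = 0), and all variable names.

```r
# Score function of Example 1 for the data (125, 18, 20, 24)
f <- function(phi) 125 / (2 + phi) - 38 / (1 - phi) + 24 / phi

a <- 0.25; b <- 0.99             # bracket with f(a) > 0 > f(b)
r <- (25 + sqrt(36529)) / 374    # reference root in (0, 1)

lo <- a; hi <- b
boundHolds <- logical(20)
for (n in 0:19) {
  mid <- (lo + hi) / 2
  # Error of the n-th midpoint against the theoretical bound
  boundHolds[n + 1] <- abs(r - mid) <= (b - a) / 2^(n + 1)
  # Keep the half-interval on which f changes sign
  if (f(lo) * f(mid) <= 0) hi <- mid else lo <- mid
}
all(boundHolds)
```

The bound also explains the iteration cap used in the bisection code of these slides: reaching a width of Delta takes about log2((B - A) / Delta) halvings.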

45
**Bisection Method (2)**

46
**Plot the Bisection Method by R**

    ### Bisection method ###
    y1 = 125; y2 = 18; y3 = 20; y4 = 24
    f <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
    x = c(1:100)*0.01
    y = f(x)
    plot(x, y, type = 'l', main = "Bisection method", xlab = expression(varphi),
         ylab = "First derivative of log likelihood function")
    abline(h = 0)
    abline(v = 0.5, col = "red")
    abline(v = 0.75, col = "red")
    text(0.49, 2200, labels = "1")
    text(0.74, 2200, labels = "2")

47
**Bisection Method by R (1)**

    > fix(Bisection)
    function(y1, y2, y3, y4, A, B)  # A, B is the boundary of parameter #
    {
        Delta = 1.0E-6;  # Tolerance for width of interval #
        Satisfied = 0;   # Condition for loop termination #
        phi = NULL;
        YA = f1(y1, y2, y3, y4, A);  # Compute function values #
        YB = f1(y1, y2, y3, y4, B);
        # Calculation of the maximum number of iterations #
        Max = as.integer(1+floor((log(B-A)-log(Delta))/log(2)));
        # Check to see if the bisection method applies #
        if(((YA >= 0) & (YB >=0)) || ((YA < 0) & (YB < 0))){
            print("The values of function in boundary point do not differ in sign.");

48
**Bisection Method by R (2)**

            print("Therefore, this method is not appropriate here.");
            quit();  # Exit program #
        }
        for(K in 1:Max){
            if(Satisfied == 1)
                break;
            C = (A+B)/2;  # Midpoint of interval #
            YC = f1(y1, y2, y3, y4, C);  # Function value at midpoint #
            if(K <= 100) phi[K] = C;  # Record the K-th midpoint #
            if(YC == 0){
                A = C;  # Exact root is found #
                B = C;
            }else{
                if((YB*YC) >= 0 ){

49
**Bisection Method by R (3)**

                    B = C;  # Squeeze from the right #
                    YB = YC;
                }else{
                    A = C;  # Squeeze from the left #
                    YA = YC;
                }
            }
            if((B-A) < Delta) Satisfied = 1;  # Check for early convergence #
        }  # End of 'for'-loop #
        print("By Bisection Method");
        return(list(phi = C, iteration = phi));
    }

50
**Bisection Method by R (4)**

    > Bisection(125, 18, 20, 24, 0.25, 1)
    [1] "By Bisection Method"
    $phi
    [1]
    $iteration
    [1]
    [8]
    [15]

51
**Bisection Method by C/C++ (1)**

52
**Bisection Method by C/C++ (2)**

53
**Secant Method**
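For reference, the textbook secant update replaces the derivative in the Newton-Raphson step with the slope of the line through the two most recent iterates. In the notation below, $f$ stands for the score function (`f1` in the R code) and $\varphi_{k-1}$, $\varphi_k$ are the last two iterates; this is the standard form, given here as an assumption about what the slide displayed.

```latex
\varphi_{k+1} = \varphi_k - f(\varphi_k)\,
  \frac{\varphi_k - \varphi_{k-1}}{f(\varphi_k) - f(\varphi_{k-1})}
```

Unlike the Newton-Raphson method, this needs two starting values but no second derivative of the log-likelihood.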

54
**Secant Method by R (1)**

    > fix(Secant)
    function(y1, y2, y3, y4, initial1, initial2){
        phi = NULL;
        phi2 = initial1;  # Most recent iterate #
        phi1 = initial2;  # Previous iterate #
        i = 0;
        while(abs(phi1-phi2) >= 1.0E-6){
            i = i+1;
            # Reciprocal of the secant slope through the last two iterates #
            alpha = (phi2-phi1)/(f1(y1, y2, y3, y4, phi2)-f1(y1, y2, y3, y4, phi1));

55
**Secant Method by R (2)**

            # Secant step from the most recent iterate #
            phi3 = phi2-alpha*f1(y1, y2, y3, y4, phi2);
            phi1 = phi2;
            phi2 = phi3;
            phi[i] = phi2;
        }
        print("By Secant method");
        return (list(phi=phi2,iteration=phi));
    }

56
**Secant Method by R (3)**

    > Secant(125, 18, 20, 24, 0.9, 0.05)
    [1] "By Secant method"
    $phi
    [1]
    $iteration
    [1]

57
Secant Method by C/C++

58
**Secant-Bracket Method**

The secant-bracket method is also called the regula falsi (false position) method.

[Figure: secant line S through the bracket endpoints A and B, crossing the axis at C.]

59
**Secant-Bracket Method by R (1)**

    > fix(RegularFalsi)
    function(y1, y2, y3, y4, A, B){
        phi = NULL;
        i = 0;
        Delta = 1.0E-6;  # Tolerance for the function value #
        Satisfied = 1;   # Condition for loop termination #
        # Endpoints of the interval [A,B] #
        YA = f1(y1, y2, y3, y4, A);  # Compute function values #
        YB = f1(y1, y2, y3, y4, B);
        # Check to see if the bracketing method applies #
        if(((YA >= 0) & (YB >=0)) || ((YA < 0) & (YB < 0))){
            print("The values of function in boundary point do not differ in sign");
            print("Therefore, this method is not appropriate here");
            q();  # Exit program #
        }

60
**Secant-Bracket Method by R (2)**

        while(Satisfied){
            i = i+1;
            # Point where the secant line through (A, f1(A)) and (B, f1(B)) crosses zero #
            C = (B*f1(y1, y2, y3, y4, A)-A*f1(y1, y2, y3, y4, B))/
                (f1(y1, y2, y3, y4, A)-f1(y1, y2, y3, y4, B));
            YC = f1(y1, y2, y3, y4, C);  # Function value at C #
            phi[i] = C;
            if(YC == 0){  # First 'if' #
                A = C;  # Exact root is found #
                B = C;
            }else{
                if((YB*YC) >= 0 ){
                    B = C;  # Squeeze from the right #
                    YB = YC;

61
**Secant-Bracket Method by R (3)**

                }else{
                    A = C;  # Squeeze from the left #
                    YA = YC;
                }
            }
            if(abs(YC) < Delta) Satisfied = 0;  # Check for convergence #
        }
        print("By Regular Falsi Method")
        return(list(phi = C, iteration = phi));
    }

62
**Secant-Bracket Method by R (4)**

    > RegularFalsi(y1, y2, y3, y4, 0.9, 0.05)
    [1] "By Regular Falsi Method"
    $phi
    [1]
    $iteration
    [1]
    [8]
    [15]
    [22]
    [29]
    [36]

63
**Secant-Bracket Method by C/C++ (1)**

64
**Secant-Bracket Method by C/C++ (2)**

65
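**Fisher Scoring Method**

The Fisher scoring method replaces $-\ell''(\varphi)$ in the Newton-Raphson update by $I(\varphi) = E[-\ell''(\varphi)]$, where $I(\varphi)$ is the Fisher information (a matrix when the parameter is multivariate). The update becomes

$$\varphi_{k+1} = \varphi_k + I(\varphi_k)^{-1}\,\ell'(\varphi_k).$$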
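To make the difference concrete, the sketch below compares a single Newton-Raphson step with a single Fisher scoring step for Example 1, starting from 0.9. It assumes the multinomial model with cell probabilities ((2+phi)/4, (1-phi)/4, (1-phi)/4, phi/4), under which the expected information is n \* (1/(4(2+phi)) + 1/(2(1-phi)) + 1/(4 phi)); the function names are illustrative and both information functions are written as positive quantities.

```r
y <- c(125, 18, 20, 24)
n <- sum(y)

score <- function(phi) y[1] / (2 + phi) - (y[2] + y[3]) / (1 - phi) + y[4] / phi

# Observed information: minus the second derivative of the log-likelihood
observedInfo <- function(phi) {
  y[1] / (2 + phi)^2 + (y[2] + y[3]) / (1 - phi)^2 + y[4] / phi^2
}

# Expected (Fisher) information under the assumed multinomial model
expectedInfo <- function(phi) {
  n * (1 / (4 * (2 + phi)) + 1 / (2 * (1 - phi)) + 1 / (4 * phi))
}

phi0 <- 0.9
newtonStep <- phi0 + score(phi0) / observedInfo(phi0)
fisherStep <- phi0 + score(phi0) / expectedInfo(phi0)
c(newton = newtonStep, fisher = fisherStep)
```

From this starting value the Fisher scoring step lands closer to the root than the Newton-Raphson step, because the observed information near the pole at phi = 1 is much larger than the expected information. This is a property of this starting point, not a general ranking of the two methods.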

66
**Fisher Scoring Method by R (1)**

    > fix(Fisher)
    function(y1, y2, y3, y4, initial){
        i = 0;
        phi = NULL;
        phi2 = initial;
        phi1 = initial+1;
        while(abs(phi1-phi2) >= 1.0E-6){
            i = i+1;
            phi1 = phi2;
            alpha = 1.0/I(y1, y2, y3, y4, phi1);
            # Newton-type step with the expected second derivative I() in place of f2() #
            phi2 = phi1-f1(y1, y2, y3, y4, phi1)/I(y1, y2, y3, y4, phi1);
            phi[i] = phi2;
        }
        print("By Fisher method");
        return(list(phi = phi2, iteration = phi));
    }

67
**Fisher Scoring Method by R (2)**

    > Fisher(125, 18, 20, 24, 0.9)
    [1] "By Fisher method"
    $phi
    [1]
    $iteration
    [1]

68
**Fisher Scoring Method by C/C++**

69
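**Order of Convergence**

The order of convergence is $p$ if

$$\lim_{k\to\infty} \frac{|x_{k+1} - x^*|}{|x_k - x^*|^p} = c$$

and $c \neq 0$ for $p \ge 1$.

Note: $\log|x_{k+1} - x^*| \approx \log c + p\,\log|x_k - x^*|$ as $k \to \infty$. Hence, we can use regression to estimate $p$.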
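The regression idea can be tried on a problem whose limit is known exactly. The sketch below is an illustration that is not from the slides: it runs the Newton-Raphson method on f(x) = x^2 - 2 from x = 3, so the limit sqrt(2) is available in closed form, then regresses log|x_{k+1} - x\*| on log|x_k - x\*|. All names are illustrative.

```r
# Newton-Raphson iterates for f(x) = x^2 - 2, starting from 3
x <- numeric(5)
x[1] <- 3
for (k in 1:4) {
  x[k + 1] <- x[k] - (x[k]^2 - 2) / (2 * x[k])
}

# Absolute errors against the known root
err <- abs(x - sqrt(2))

# Regress log error at step k+1 on log error at step k
ly <- log(err[-1])
lx <- log(err[-length(err)])
fit <- lm(ly ~ lx)
coef(fit)[["lx"]]   # estimated order of convergence
```

With only a handful of iterates, and with the early ones still far from the root, the estimated slope falls somewhat below the theoretical order 2. Only four steps are used so that the errors stay well above machine precision.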

70
**Theorem for Newton-Raphson Method**

If , is a contraction mapping then and If exists, has a simple zero, then such that of the Newton-Raphson method is a contraction mapping and

71
**Find Convergence Order by R (1)**

    > # Convergence order #
    > # Newton can be replaced by any of the other methods #
    > R = Newton(y1, y2, y3, y4, initial)
    [1] "By Newton-Raphson method"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x

72
**Find Convergence Order by R (2)**

    > R = SimpleIteration(y1, y2, y3, y4, initial)
    [1] "By parallel chord method(simple iteration)"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x
    > R = Secant(y1, y2, y3, y4, initial, 0.05)
    [1] "By Secant method"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x
    > R = Fisher(y1, y2, y3, y4, initial)
    [1] "By Fisher method"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x
    > R = Bisection(y1, y2, y3, y4, 0.25, 1)
    [1] "By Bisection Method"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x

73
**Find Convergence Order by R (3)**

    > R = RegularFalsi(y1, y2, y3, y4, initial, 0.05)
    [1] "By Regular Falsi Method"
    > temp = log(abs(R$iteration-R$phi))
    > y = temp[2:(length(temp)-1)]
    > x = temp[1:(length(temp)-2)]
    > lm(y~x)
    Call:
    lm(formula = y ~ x)
    Coefficients:
    (Intercept)            x

74
**Find Convergence Order by C/C++**

75
**Exercises**

Write your own programs for the examples presented in this talk.

Write programs for the examples mentioned at the following web page:

Write programs for the other examples that you know.

76
**More Exercises (1) Example 3 in genetics: The observed data are**

where , , and fall in such that Find the MLEs for , , and .

77
More Exercises (2) Example 4 in the positron emission tomography (PET): The observed data are and The values of are known and the unknown parameters are Find the MLEs for

78
**More Exercises (3) Example 5 in the normal mixture:**

The observed data are random samples from the following probability density function: Find the MLEs for the following parameters:
