1 Maximum Likelihood Estimates and the EM Algorithms II Henry Horng-Shing Lu Institute of Statistics National Chiao Tung University



2 Part 1: Computation Tools

3 Include Functions in R
- source("file path")
- Example, in MME.R:

MME = function(y1, y2, y3, y4)
{
  n = y1+y2+y3+y4;
  phi1 = 4.0*(y1/n-0.5);
  phi2 = 1-4*y2/n;
  phi3 = 1-4*y3/n;
  phi4 = 4.0*y4/n;
  phi = (phi1+phi2+phi3+phi4)/4.0;
  print("By MME method");
  return(phi); # print(phi);
}

- In R:

> source("C:/MME.R")
> MME(125, 18, 20, 24)
[1] "By MME method"
[1]
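Each of the four terms averaged in MME() inverts one cell probability, assuming the cell probabilities (2+phi)/4, (1-phi)/4, (1-phi)/4 and phi/4 that the score function later in this talk implies. Under that assumption the average simplifies algebraically to (y1 - y2 - y3 + y4)/n. The sketch below checks this; the names mme_closed_form and mme_average are hypothetical and not part of MME.R.

```r
# Hypothetical helper: closed form of the averaged moment estimate.
mme_closed_form <- function(y1, y2, y3, y4) {
  n <- y1 + y2 + y3 + y4
  (y1 - y2 - y3 + y4) / n
}

# Same four-term average as in MME.R, without the print() side effect.
mme_average <- function(y1, y2, y3, y4) {
  n <- y1 + y2 + y3 + y4
  mean(c(4 * (y1 / n - 0.5), 1 - 4 * y2 / n, 1 - 4 * y3 / n, 4 * y4 / n))
}

mme_value <- mme_closed_form(125, 18, 20, 24)
# The two versions should agree up to floating-point rounding.
print(all.equal(mme_value, mme_average(125, 18, 20, 24)))
```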

4 Part 2: Motivation Examples

5 Example 1 in Genetics (1)
- Two linked loci with alleles A and a, and B and b.
  A, B: dominant; a, b: recessive.
- A double heterozygote AaBb will produce gametes of four types: AB, Ab, aB, ab.
[Figure: chromosome diagram of the double heterozygote AaBb and its gametes]

6 Example 1 in Genetics (2)
- Probabilities for genotypes in gametes

          No Recombination   Recombination
Male      1-r                r
Female    1-r'               r'

          AB          ab          aB       Ab
Male      (1-r)/2     (1-r)/2     r/2      r/2
Female    (1-r')/2    (1-r')/2    r'/2     r'/2

[Figure: chromosome diagram of the double heterozygote AaBb and its gametes]

7 Example 1 in Genetics (3)
- Fisher, R. A. and Balmukand, B. (1928). The estimation of linkage from the offspring of selfed heterozygotes. Journal of Genetics, 20, 79–92.
- More: yes/bank/handout12.pdf

8 Example 1 in Genetics (4)
                                        MALE
FEMALE          AB (1-r)/2            ab (1-r)/2            aB r/2             Ab r/2
AB (1-r')/2     AABB (1-r)(1-r')/4    aABb (1-r)(1-r')/4    aABB r(1-r')/4     AABb r(1-r')/4
ab (1-r')/2     AaBb (1-r)(1-r')/4    aabb (1-r)(1-r')/4    aaBb r(1-r')/4     Aabb r(1-r')/4
aB r'/2         AaBB (1-r)r'/4        aabB (1-r)r'/4        aaBB r r'/4        AabB r r'/4
Ab r'/2         AABb (1-r)r'/4        aAbb (1-r)r'/4        aABb r r'/4        AAbb r r'/4

9 Example 1 in Genetics (5)
- Four distinct phenotypes: A*B*, A*b*, a*B* and a*b*.
- A*: the dominant phenotype from (Aa, AA, aA).
- a*: the recessive phenotype from aa.
- B*: the dominant phenotype from (Bb, BB, bB).
- b*: the recessive phenotype from bb.
- A*B*: 9 gametic combinations.
- A*b*: 3 gametic combinations.
- a*B*: 3 gametic combinations.
- a*b*: 1 gametic combination.
- Total: 16 combinations.
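The 9 : 3 : 3 : 1 split can be checked by brute force. The sketch below enumerates all 16 male/female gamete pairs and classifies each by phenotype; the helper name phenotype and the data-frame layout are illustrative choices, not part of the original slides.

```r
# Enumerate the 4 x 4 = 16 gametic combinations.
gametes <- c("AB", "Ab", "aB", "ab")
combos <- expand.grid(male = gametes, female = gametes,
                      stringsAsFactors = FALSE)

# A locus shows the dominant phenotype if either gamete carries the
# upper-case (dominant) allele at that position.
phenotype <- function(m, f) {
  first  <- paste0(substr(m, 1, 1), substr(f, 1, 1))
  second <- paste0(substr(m, 2, 2), substr(f, 2, 2))
  paste0(if (grepl("A", first, fixed = TRUE)) "A*" else "a*",
         if (grepl("B", second, fixed = TRUE)) "B*" else "b*")
}

combos$phenotype <- unname(mapply(phenotype, combos$male, combos$female))
counts <- table(combos$phenotype)
print(counts)
```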

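10 Example 1 in Genetics (6)
- Let $\varphi = (1-r)(1-r')$, then
  $P(A^*B^*) = \frac{2+\varphi}{4}$, $P(A^*b^*) = P(a^*B^*) = \frac{1-\varphi}{4}$, $P(a^*b^*) = \frac{\varphi}{4}$.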

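11 Example 1 in Genetics (7)
- Hence, a random sample of size $n$ from the offspring of selfed heterozygotes will follow a multinomial distribution:
  $\text{Multinomial}\left(n;\ \frac{2+\varphi}{4}, \frac{1-\varphi}{4}, \frac{1-\varphi}{4}, \frac{\varphi}{4}\right)$.
- We know that $\varphi = (1-r)(1-r')$, $0 \le r \le 1/2$ and $0 \le r' \le 1/2$. So $1/4 \le \varphi \le 1$.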

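12 Example 1 in Genetics (8)
- Suppose that we observe the data $y = (y_1, y_2, y_3, y_4) = (125, 18, 20, 24)$, which is a random sample from
  $\text{Multinomial}\left(n;\ \frac{2+\varphi}{4}, \frac{1-\varphi}{4}, \frac{1-\varphi}{4}, \frac{\varphi}{4}\right)$.
- Then the probability mass function is
  $g(y, \varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}$.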
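The multinomial model can be evaluated directly with R's dmultinom(). The sketch below assumes the cell probabilities (2+phi)/4, (1-phi)/4, (1-phi)/4, phi/4, which match the score function coded later in this talk, and scans a grid of phi values; cell_probs, loglik and the grid spacing are illustrative choices.

```r
# Assumed cell probabilities for (A*B*, A*b*, a*B*, a*b*).
cell_probs <- function(phi) c(2 + phi, 1 - phi, 1 - phi, phi) / 4

y <- c(125, 18, 20, 24)  # counts used in the R examples of this talk

# Multinomial log-likelihood as a function of phi.
loglik <- function(phi) dmultinom(y, prob = cell_probs(phi), log = TRUE)

# Evaluate on a grid inside (0, 1) and pick the grid point with the
# largest log-likelihood; this is only a coarse approximation of the MLE.
grid <- seq(0.01, 0.99, by = 0.01)
values <- sapply(grid, loglik)
best <- grid[which.max(values)]
print(best)
```

A grid scan is crude, but it gives a sanity check for the iterative solvers introduced in Part 3.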

13 Maximum Likelihood Estimate (MLE)
- Likelihood:
- Maximize likelihood: solve the score equations, which set the first derivatives of the log likelihood to zero.
- Under regularity conditions, the MLE is consistent, asymptotically efficient and asymptotically normal!
- More: ihood

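14 MLE for Example 1 (1)
- Likelihood:
  $L(\varphi) = \frac{n!}{y_1!\,y_2!\,y_3!\,y_4!} \left(\frac{2+\varphi}{4}\right)^{y_1} \left(\frac{1-\varphi}{4}\right)^{y_2+y_3} \left(\frac{\varphi}{4}\right)^{y_4}$
- MLE: solve the score equation
  $\frac{d}{d\varphi} \log L(\varphi) = \frac{y_1}{2+\varphi} - \frac{y_2+y_3}{1-\varphi} + \frac{y_4}{\varphi} = 0$.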

15 MLE for Example 1 (2) 15 A B C
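Multiplying the score equation y1/(2+phi) - (y2+y3)/(1-phi) + y4/phi = 0 by phi(1-phi)(2+phi) gives the quadratic n*phi^2 - (y1 - 2*y2 - 2*y3 - y4)*phi - 2*y4 = 0, so the MLE has a closed form. The sketch below solves it; mapping the slide's labels A, B, C to the three quadratic coefficients is an assumption, and mle_closed_form and score are hypothetical names.

```r
# Assumed reading of the slide: A*phi^2 + B*phi + C = 0 with
# A = n, B = -(y1 - 2*y2 - 2*y3 - y4), C = -2*y4.
mle_closed_form <- function(y1, y2, y3, y4) {
  A <- y1 + y2 + y3 + y4
  B <- -(y1 - 2 * y2 - 2 * y3 - y4)
  C <- -2 * y4
  # C < 0 < A, so exactly one root is positive; take that one.
  (-B + sqrt(B^2 - 4 * A * C)) / (2 * A)
}

# Score function (first derivative of the log likelihood).
score <- function(phi, y1, y2, y3, y4) {
  y1 / (2 + phi) - (y2 + y3) / (1 - phi) + y4 / phi
}

phi_hat <- mle_closed_form(125, 18, 20, 24)
print(phi_hat)
print(score(phi_hat, 125, 18, 20, 24))  # should be numerically zero
```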

16 MLE for Example 1 (3)  Checking: Compare ? 16

17 Part 3: Numerical Solutions for the Score Equations of MLEs

18 A Banach Space
- A Banach space $B$ is a normed vector space over the field $K$ ($\mathbb{R}$ or $\mathbb{C}$) such that every Cauchy sequence of $B$ converges in $B$ (i.e., $B$ is complete).

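19 Lipschitz Continuous
- A closed subset $S \subseteq B$ and a mapping $F: S \to S$.
1. $F$ is Lipschitz continuous on $S$ with constant $L$ if $\|F(x) - F(y)\| \le L \|x - y\|$ for all $x, y \in S$.
2. $F$ is a contraction mapping on $S$ if $F$ is Lipschitz continuous and $L < 1$.
- More: tinuous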

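20 Fixed Point Theorem (1)
- If $F$ is a contraction mapping on $S$ (that is, $F$ is Lipschitz continuous with $L < 1$), then
1. $F$ has a unique fixed point $x^* \in S$ such that $F(x^*) = x^*$;
2. for any initial $x_0 \in S$, the iteration $x_{k+1} = F(x_k)$ converges to $x^*$;
3. $\|x_k - x^*\| \le \frac{L^k}{1-L} \|x_1 - x_0\|$.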
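The iteration in the theorem is short to code. The following is a minimal sketch, not taken from the slides: fixed_point, tol and max_iter are hypothetical names, and cos on [0, 1] is used only as a convenient contraction (its derivative is bounded by sin(1) < 1 there).

```r
# Generic fixed-point iteration x_{k+1} = g(x_k).
fixed_point <- function(g, x0, tol = 1e-10, max_iter = 1000) {
  x <- x0
  for (k in seq_len(max_iter)) {
    x_new <- g(x)
    # Stop when two successive iterates are closer than tol.
    if (abs(x_new - x) < tol) {
      return(list(root = x_new, iterations = k))
    }
    x <- x_new
  }
  stop("no convergence within max_iter iterations")
}

res <- fixed_point(cos, x0 = 1)
print(res$root)
```

The stopping rule compares successive iterates, the same criterion the R functions later in this talk use.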

21 Fixed Point Theorem (2)  More: point_theorem linux.com/spip.php?article60 21

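22 Applications for MLE (1)
- Numerical solution of the score equation $f(\varphi) = 0$, where $f$ is the first derivative of the log likelihood.
- Choose $g(\varphi) = \varphi + \alpha f(\varphi)$ such that $g$ is a contraction mapping.
- $\varphi_0$: initial value; $\varphi_{k+1} = g(\varphi_k)$. Then $\varphi_k \to \varphi^*$ with $g(\varphi^*) = \varphi^*$, that is, $f(\varphi^*) = 0$.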

23 Applications for MLE (2)
- How to choose $\alpha$ s.t. $g(\varphi) = \varphi + \alpha f(\varphi)$ is a contraction mapping?
- Optimal $\alpha$?
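One way to reason about both questions, assuming the iteration function has the form $g(\varphi) = \varphi + \alpha f(\varphi)$ with $f$ the score function (the form used by the SimpleIteration code in this talk), is to look at the derivative of $g$:

```latex
\begin{align*}
g(\varphi)  &= \varphi + \alpha f(\varphi), \\
g'(\varphi) &= 1 + \alpha f'(\varphi). \\
\intertext{$g$ is a contraction near the root $\varphi^*$ when $|g'(\varphi)| < 1$ there, i.e.}
-2 &< \alpha f'(\varphi) < 0 .
\intertext{The local convergence rate is governed by $|g'(\varphi^*)|$, which vanishes for}
\alpha &= -\frac{1}{f'(\varphi^*)} .
\end{align*}
```

Since $\varphi^*$ is unknown, this choice cannot be used directly. Fixing $\alpha = -1/f'(\varphi_0)$ at the starting value gives the parallel chord method, and updating $\alpha = -1/f'(\varphi_k)$ at every step gives the Newton-Raphson method.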

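24 Parallel Chord Method (1)
- Parallel chord method is also called simple iteration.
- $\varphi_{k+1} = \varphi_k - \frac{f(\varphi_k)}{f'(\varphi_0)}$, where $f$ is the first derivative of the log likelihood and $\varphi_0$ is the initial value.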

25 Parallel Chord Method (2)

26 Plot Parallel Chord Method by R (1)
### Simple iteration ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24
# First and second derivatives of log likelihood #
f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
x = c(10:80)*0.01
y = f1(x)
plot(x, y, type = 'l', main = "Parallel chord method",
     xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)

27 Plot Parallel Chord Method by R (2)
phi0 = 0.25 # Given the initial value 0.25 #
segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)
# Use the tangent line to find the intercept b0 #
b0 = f1(phi0)-f2(phi0)*phi0
curve(f2(phi0)*x+b0, add = T, col = "red")
phi1 = -b0/f2(phi0) # Find the closer phi #
segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)
# Use the parallel line to find the intercept b1 #
b1 = f1(phi1)-f2(phi0)*phi1
curve(f2(phi0)*x+b1, add = T, col = "red")

28 Define Functions for Example 1 in R
- We will define some functions and variables for finding the MLE in Example 1 by R.

# First, second and third derivatives of log likelihood #
f1 = function(y1, y2, y3, y4, phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 = function(y1, y2, y3, y4, phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
f3 = function(y1, y2, y3, y4, phi) {2*y1*(2+phi)^(-3)-2*(y2+y3)*(1-phi)^(-3)+2*y4*(phi)^(-3)}
# Fisher Information with a minus sign, i.e. the expected value of f2, #
# so that it can take the place of f2 in the Newton-Raphson step #
I = function(y1, y2, y3, y4, phi) {(-1)*(y1+y2+y3+y4)*(1/(4*(2+phi))+1/(2*(1-phi))+1/(4*phi))}
y1 = 125; y2 = 18; y3 = 20; y4 = 24; initial = 0.9

29 Parallel Chord Method by R (1)
> fix(SimpleIteration)
function(y1, y2, y3, y4, initial){
  phi = NULL;
  i = 0;
  alpha = -1.0/f2(y1, y2, y3, y4, initial);
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-5){
    i = i+1;
    phi1 = phi2;
    phi2 = alpha*f1(y1, y2, y3, y4, phi1)+phi1;
    phi[i] = phi2;
  }
  print("By parallel chord method(simple iteration)");
  return(list(phi = phi2, iteration = phi));
}

30 Parallel Chord Method by R (2)
> SimpleIteration(y1, y2, y3, y4, initial)

31 Parallel Chord Method by C/C++ 31

32 Newton-Raphson Method (1)
- /Newton'sMethodMod.html
- _method

33 Newton-Raphson Method (2)

34 Plot Newton-Raphson Method by R (1)
### Newton-Raphson Method ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24
# First and second derivatives of log likelihood #
f1 <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
f2 <- function(phi) {(-1)*y1*(2+phi)^(-2)-(y2+y3)*(1-phi)^(-2)-y4*(phi)^(-2)}
x = c(10:80)*0.01
y = f1(x)
plot(x, y, type = 'l', main = "Newton-Raphson method",
     xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)

35 Plot Newton-Raphson Method by R (2)
# Given the initial value 0.25 #
phi0 = 0.25
segments(0, f1(phi0), phi0, f1(phi0), col = "green", lty = 2)
segments(phi0, f1(phi0), phi0, -200, col = "green", lty = 2)
# Use the tangent line to find the intercept b0 #
b0 = f1(phi0)-f2(phi0)*phi0
curve(f2(phi0)*x+b0, add = T, col = "purple", lwd = 2)
# Find the closer phi #
phi1 = -b0/f2(phi0)
segments(phi1, -200, phi1, f1(phi1), col = "green", lty = 2)
segments(0, f1(phi1), phi1, f1(phi1), col = "green", lty = 2)

36 Plot Newton-Raphson Method by R (3)
# Use the parallel line to find the intercept b1 #
b1 = f1(phi1)-f2(phi0)*phi1
curve(f2(phi0)*x+b1, add = T, col = "red")
curve(f2(phi1)*x-f2(phi1)*phi1+f1(phi1), add = T, col = "blue", lwd = 2)
legend(0.45, 250, c("Newton-Raphson", "Parallel chord method"),
       col = c("blue", "red"), lty = c(1, 1))

37 Newton-Raphson Method by R (1)
> fix(Newton)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1);
    phi[i] = phi2;
  }
  print("By Newton-Raphson method");
  return (list(phi = phi2, iteration = phi));
}

38 Newton-Raphson Method by R (2)
> Newton(125, 18, 20, 24, 0.9)
[1] "By Newton-Raphson method"
$phi
[1]
$iteration
[1]

39 Newton-Raphson Method by C/C++ 39

40 Halley’s Method
- The Newton-Raphson iteration function is $g(x) = x - \frac{f(x)}{f'(x)}$.
- It is possible to speed up convergence by using more expansion terms than the Newton-Raphson method does when the objective function is very smooth, like the method by Edmond Halley (1656-1742):
  $g(x) = x - \frac{f(x)}{f'(x)} \left[1 - \frac{f(x) f''(x)}{2 (f'(x))^2}\right]^{-1}$
  (http://math.fullerton.edu/mathews/n2003/Halley'sMethodMod.html)

41 Halley’s Method by R (1)
> fix(Halley)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/(f2(y1, y2, y3, y4, phi1));
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/f2(y1, y2, y3, y4, phi1)*1.0/(1.0-f1(y1, y2, y3, y4, phi1)*f3(y1, y2, y3, y4, phi1)/(f2(y1, y2, y3, y4, phi1)*f2(y1, y2, y3, y4, phi1)*2.0));
    phi[i] = phi2;
  }
  print("By Halley method");
  return (list(phi = phi2, iteration = phi));
}

42 Halley’s Method by R (2)
> Halley(125, 18, 20, 24, 0.9)
[1] "By Halley method"
$phi
[1]
$iteration
[1]

43 Halley’s Method by C/C++ 43

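44 Bisection Method (1)
- Assume that $f \in C[a, b]$ and that there exists a number $r \in [a, b]$ such that $f(r) = 0$. If $f(a)$ and $f(b)$ have opposite signs, and $\{c_n\}$ represents the sequence of midpoints generated by the bisection process, then $|r - c_n| \le \frac{b-a}{2^{n+1}}$ for $n = 0, 1, \ldots$ and the sequence $\{c_n\}$ converges to $r$.
- That is, $\lim_{n \to \infty} c_n = r$.
- More: hod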
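Because each bisection step halves the bracketing interval, the number of steps needed for a given tolerance is known in advance. The sketch below computes the smallest number of halvings that makes the interval no wider than delta; bisection_steps is a hypothetical helper, and the interval [0.25, 1] with tolerance 1e-6 mirrors the values used in the Bisection example of this talk.

```r
# After N halvings the interval width is (b - a) / 2^N.
# Requiring (b - a) / 2^N <= delta gives N >= log2((b - a) / delta).
bisection_steps <- function(a, b, delta) {
  ceiling(log2((b - a) / delta))
}

steps <- bisection_steps(0.25, 1, 1e-6)
print(steps)
print((1 - 0.25) / 2^steps)  # final interval width, at most 1e-6
```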

45 Bisection Method (2)

46 Plot the Bisection Method by R
### Bisection method ###
y1 = 125; y2 = 18; y3 = 20; y4 = 24
f <- function(phi) {y1/(2+phi)-(y2+y3)/(1-phi)+y4/phi}
x = c(1:100)*0.01
y = f(x)
plot(x, y, type = 'l', main = "Bisection method",
     xlab = expression(varphi),
     ylab = "First derivative of log likelihood function")
abline(h = 0)
abline(v = 0.5, col = "red")
abline(v = 0.75, col = "red")
text(0.49, 2200, labels = "1")
text(0.74, 2200, labels = "2")

47 Bisection Method by R (1)
> fix(Bisection)
function(y1, y2, y3, y4, A, B) # A, B are the boundaries of the parameter #
{
  Delta = 1.0E-6; # Tolerance for width of interval #
  Satisfied = 0; # Condition for loop termination #
  phi = NULL;
  YA = f1(y1, y2, y3, y4, A); # Compute function values #
  YB = f1(y1, y2, y3, y4, B);
  # Calculation of the maximum number of iterations #
  Max = as.integer(1+floor((log(B-A)-log(Delta))/log(2)));
  # Check to see if the bisection method applies #
  if(((YA >= 0) & (YB >= 0)) || ((YA < 0) & (YB < 0))){
    print("The values of function in boundary point do not differ in sign.");

48 Bisection Method by R (2)
    print("Therefore, this method is not appropriate here.");
    # Leave the function with an error; quit() would close the whole R session #
    stop("Bisection needs a sign change on [A, B]");
  }
  for(K in 1:Max){
    if(Satisfied == 1) break;
    C = (A+B)/2; # Midpoint of interval #
    YC = f1(y1, y2, y3, y4, C); # Function value at midpoint #
    # R vectors are 1-based: store the K-th midpoint at index K #
    if(K <= 100) phi[K] = C;
    if(YC == 0){
      A = C; # Exact root is found #
      B = C;
    }
    else{
      if((YB*YC) >= 0){

49 Bisection Method by R (3)
        B = C; # Squeeze from the right #
        YB = YC;
      }
      else{
        A = C; # Squeeze from the left #
        YA = YC;
      }
    }
    if((B-A) < Delta) Satisfied = 1; # Check for early convergence #
  } # End of 'for'-loop #
  print("By Bisection Method");
  return(list(phi = C, iteration = phi));
}

50 Bisection Method by R (4)
> Bisection(125, 18, 20, 24, 0.25, 1)
[1] "By Bisection Method"
$phi
[1]
$iteration
 [1]
 [8]
[15]

51 Bisection Method by C/C++ (1) 51

52 Bisection Method by C/C++ (2) 52

53 Secant Method
- More: od /SecantMethodMod.html
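For reference, the standard secant iteration replaces the derivative in the Newton-Raphson step by the slope of the line through the two most recent iterates. Writing $f$ for the score function (notation assumed here, since the slide's own formula did not survive), it reads:

```latex
\begin{align*}
\varphi_{k+1}
  &= \varphi_k - f(\varphi_k)\,
     \frac{\varphi_k - \varphi_{k-1}}{f(\varphi_k) - f(\varphi_{k-1})},
  \qquad k = 1, 2, \ldots
\end{align*}
```

Two starting values $\varphi_0$ and $\varphi_1$ are required, and no second derivative of the log likelihood is needed.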

54 Secant Method by R (1)
> fix(Secant)
function(y1, y2, y3, y4, initial1, initial2){
  phi = NULL;
  phi1 = initial1; # Older of the two current iterates #
  phi2 = initial2; # Newer of the two current iterates #
  i = 0;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;

55 Secant Method by R (2)
    # Reciprocal slope of the secant line through the last two iterates #
    alpha = (phi2-phi1)/(f1(y1, y2, y3, y4, phi2)-f1(y1, y2, y3, y4, phi1));
    # Secant step: the second derivative f2 is not needed #
    phi3 = phi2-alpha*f1(y1, y2, y3, y4, phi2);
    phi1 = phi2;
    phi2 = phi3;
    phi[i] = phi2;
  }
  print("By Secant method");
  return (list(phi=phi2,iteration=phi));
}

56 Secant Method by R (3)
> Secant(125, 18, 20, 24, 0.9, 0.05)
[1] "By Secant method"
$phi
[1]
$iteration
[1]

57 Secant Method by C/C++ 57

58 Secant-Bracket Method
- The secant-bracket method is also called the regula falsi (false position) method.
[Figure: one secant-bracket step, with points labeled S, C, A and B]

59 Secant-Bracket Method by R (1)
> fix(RegularFalsi)
function(y1, y2, y3, y4, A, B){
  phi = NULL;
  i = 0; # R vectors are 1-based, so the first iterate goes to phi[1] #
  Delta = 1.0E-6; # Tolerance for the function value at C #
  Satisfied = 1; # Condition for loop termination #
  # Endpoints of the interval [A,B] #
  YA = f1(y1, y2, y3, y4, A); # Compute function values #
  YB = f1(y1, y2, y3, y4, B);
  # Check to see if the secant-bracket method applies #
  if(((YA >= 0) & (YB >= 0)) || ((YA < 0) & (YB < 0))){
    print("The values of function in boundary point do not differ in sign");
    print("Therefore, this method is not appropriate here");
    # Leave the function with an error; q() would close the whole R session #
    stop("Regula falsi needs a sign change between A and B");
  }

60 Secant-Bracket Method by R (2)
  while(Satisfied){
    i = i+1;
    # Root of the secant line through (A, YA) and (B, YB) #
    C = (B*YA-A*YB)/(YA-YB);
    YC = f1(y1, y2, y3, y4, C); # Function value at C #
    phi[i] = C;
    if(YC == 0){ # First 'if' #
      A = C; # Exact root is found #
      B = C;
    }else{
      if((YB*YC) >= 0){
        B = C; # Squeeze from the right #
        YB = YC;

61 Secant-Bracket Method by R (3)
      }else{
        A = C; # Squeeze from the left #
        YA = YC;
      }
    }
    # Check for convergence on the size of the function value #
    if(abs(YC) < Delta) Satisfied = 0;
  }
  print("By Regular Falsi Method")
  return(list(phi = C, iteration = phi));
}

62 Secant-Bracket Method by R (4)
> RegularFalsi(y1, y2, y3, y4, 0.9, 0.05)
[1] "By Regular Falsi Method"
$phi
[1]
$iteration
 [1]
 [8]
[15]
[22]
[29]
[36]

63 Secant-Bracket Method by C/C++ (1) 63

64 Secant-Bracket Method by C/C++ (2) 64

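65 Fisher Scoring Method
- Fisher scoring method replaces $-\ell''(\varphi)$ in the Newton-Raphson iteration by $I(\varphi) = E[-\ell''(\varphi)]$, where $\ell$ is the log likelihood and $I(\varphi)$ is the Fisher information (a matrix when the parameter is multivariate).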
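For Example 1 the Fisher information has a simple closed form. The derivation below is a sketch that assumes the multinomial cell probabilities $(2+\varphi)/4$, $(1-\varphi)/4$, $(1-\varphi)/4$, $\varphi/4$ and writes $\ell$ for the log likelihood:

```latex
\begin{align*}
-\ell''(\varphi)
  &= \frac{y_1}{(2+\varphi)^2} + \frac{y_2+y_3}{(1-\varphi)^2}
     + \frac{y_4}{\varphi^2}, \\
E[y_1] &= \frac{n(2+\varphi)}{4}, \qquad
E[y_2+y_3] = \frac{n(1-\varphi)}{2}, \qquad
E[y_4] = \frac{n\varphi}{4}, \\
I(\varphi) = E[-\ell''(\varphi)]
  &= n\left[\frac{1}{4(2+\varphi)} + \frac{1}{2(1-\varphi)}
     + \frac{1}{4\varphi}\right], \\
\varphi_{k+1} &= \varphi_k + \frac{\ell'(\varphi_k)}{I(\varphi_k)} .
\end{align*}
```

The R function named `I` in this talk returns this quantity with a minus sign, so it can be dropped into the Newton-Raphson update in place of `f2`.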

66 Fisher Scoring Method by R (1)
> fix(Fisher)
function(y1, y2, y3, y4, initial){
  i = 0;
  phi = NULL;
  phi2 = initial;
  phi1 = initial+1;
  while(abs(phi1-phi2) >= 1.0E-6){
    i = i+1;
    phi1 = phi2;
    alpha = 1.0/I(y1, y2, y3, y4, phi1);
    phi2 = phi1-f1(y1, y2, y3, y4, phi1)/I(y1, y2, y3, y4, phi1);
    phi[i] = phi2;
  }
  print("By Fisher method");
  return(list(phi = phi2, iteration = phi));
}

67 Fisher Scoring Method by R (2)
> Fisher(125, 18, 20, 24, 0.9)
[1] "By Fisher method"
$phi
[1]
$iteration
[1]

68 Fisher Scoring Method by C/C++ 68

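69 Order of Convergence
- Order of convergence is $p$ if $\lim_{k \to \infty} \frac{\|x_{k+1} - x^*\|}{\|x_k - x^*\|^p} = c$ with $0 < c < \infty$ (and $c < 1$ for $p = 1$).
- More: vergence
- Note: $\log \|x_{k+1} - x^*\| \approx \log c + p \log \|x_k - x^*\|$ as $k \to \infty$. Hence, we can use regression to estimate $p$.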
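The regression idea can be tried on a sequence whose order is known. The sketch below is an illustration that is not from the slides: it uses the Newton iteration for the square root of 2, which converges quadratically, and regresses the log error at step k+1 on the log error at step k. The slope should come out close to 2.

```r
# Newton iteration for sqrt(2): x_{k+1} = (x_k + 2 / x_k) / 2, started at 3.
x <- numeric(5)
x[1] <- (3 + 2 / 3) / 2
for (k in 2:5) x[k] <- (x[k - 1] + 2 / x[k - 1]) / 2

# Errors |x_k - x*|; only five iterates are kept so that none of the
# errors has collapsed to zero in floating point.
log_err <- log(abs(x - sqrt(2)))

# Regress log error at step k+1 on log error at step k;
# the slope estimates the order of convergence p.
y_next <- log_err[-1]
x_prev <- log_err[-length(log_err)]
fit <- lm(y_next ~ x_prev)
order_estimate <- unname(coef(fit)["x_prev"])
print(order_estimate)
```

The R sessions in this talk apply the same regression to the iterates returned by each solver, using the final iterate in place of the unknown root.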

70 Theorem for Newton-Raphson Method  If, is a contraction mapping then and  If exists, has a simple zero, then such that of the Newton-Raphson method is a contraction mapping and. 70

71 Find Convergence Order by R (1)
> # Convergence order #
> # The Newton method can be replaced by a different method #
> R = Newton(y1, y2, y3, y4, initial)
[1] "By Newton-Raphson method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

72 Find Convergence Order by R (2)
> R = Fisher(y1, y2, y3, y4, initial)
[1] "By Fisher method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

> R = Bisection(y1, y2, y3, y4, 0.25, 1)
[1] "By Bisection Method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

> R = SimpleIteration(y1, y2, y3, y4, initial)
[1] "By parallel chord method(simple iteration)"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

> R = Secant(y1, y2, y3, y4, initial, 0.05)
[1] "By Secant method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

73 Find Convergence Order by R (3)
> R = RegularFalsi(y1, y2, y3, y4, initial, 0.05)
[1] "By Regular Falsi Method"
> temp = log(abs(R$iteration-R$phi))
> y = temp[2:(length(temp)-1)]
> x = temp[1:(length(temp)-2)]
> lm(y~x)
Call:
lm(formula = y ~ x)
Coefficients:
(Intercept)            x

74 Find Convergence Order by C/C++ 74

75 Exercises
- Write your own programs for those examples presented in this talk.
- Write programs for those examples mentioned at the following web page: elihood
- Write programs for the other examples that you know.

76 More Exercises (1)  Example 3 in genetics: The observed data are where,, and fall in such that Find the MLEs for,, and. 76

77 More Exercises (2)  Example 4 in the positron emission tomography (PET): The observed data are and  The values of are known and the unknown parameters are.  Find the MLEs for. 77

78 More Exercises (3)  Example 5 in the normal mixture: The observed data are random samples from the following probability density function:  Find the MLEs for the following parameters: 78

