
1 Statistical Estimation Vasileios Hatzivassiloglou University of Texas at Dallas

2 Obama contract at intrade.com

3 Instance profiles
Given k observations of maximum length n, construct a |Σ|×n matrix A (the profile) where entry A_ij is the estimated probability that the ith letter occurs in position j.
One way to estimate A_ij is to count the occurrences of each letter at that position (c_ij); then A_ij = c_ij / k.
This is maximum likelihood estimation (MLE).
The estimate becomes better as k increases.
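
A minimal sketch of this counting estimate (in Python; the function name build_profile and the DNA alphabet ACGT are illustrative assumptions, not from the slides):

```python
from collections import Counter

def build_profile(instances, alphabet="ACGT"):
    """MLE profile: A_ij = c_ij / k, assuming all k instances have the same length n."""
    k = len(instances)
    n = len(instances[0])
    profile = {letter: [0.0] * n for letter in alphabet}
    for j in range(n):
        column = Counter(seq[j] for seq in instances)  # c_ij for each letter i at position j
        for letter in alphabet:
            profile[letter][j] = column[letter] / k    # A_ij = c_ij / k
    return profile
```

Each column of the returned profile sums to 1, as in the calculated profile on slide 5.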

4 Example data
23 sample motif instances for the cyclic AMP receptor transcription factor (positions 3-9):
TTGTGGC TTTTGAT AAGTGTC ATTTGCA CTGTGAG ATGCAAA
GTGTTAA ATTTGAA TTGTGAT ATTTATT ACGTGAT ATGTGAG
TTGTGAG CTGTAAC CTGTGAA TTGTGAC GCCTGAC TTGTGAT
GTGTGAA CTGTGAC ATGAGAC TTGTGAG

5 Calculated profile
Position    1      2      3      4      5      6      7
A       0.348  0.043  0.000  0.043  0.130  0.826  0.261
C       0.174  0.087  0.043  0.043  0.000  0.043  0.304
G       0.130  0.000  0.783  0.000  0.826  0.043  0.174
T       0.348  0.870  0.174  0.913  0.043  0.087  0.261

6 Probability of a motif
Suppose that we consider M as a candidate motif consensus.
How do we find the best M given the observations in A?
Assuming independence of positions,
P(M | A) = ∏_{j=1..n} A[M_j, j]
where M_j is the letter of M at position j.
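
A sketch of this computation, reusing the profile dictionary from the build_profile sketch above (motif_probability is an illustrative name):

```python
def motif_probability(motif, profile):
    # P(M | A) = product over positions j of A[M_j, j], assuming independent positions
    p = 1.0
    for j, letter in enumerate(motif):
        p *= profile[letter][j]
    return p
```

Because the positions are treated as independent, the M maximizing this product is the string that picks the most probable letter in each column of A.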

7 Maximum likelihood estimation
General method for estimating unknown parameters when we have
– a sample of values that depend on these parameters
– a formula specifying the probability of obtaining these values given the parameters

8 MLE example: three coins
Suppose we have three coins with probability of heads ⅓, ½, and ⅔.
One of them is used to generate a series of 20 tosses and we observe 11 heads.
θ = the heads probability of the coin used in the experiment.
The number of heads follows a binomial distribution.

9 Binomial distribution
The count of one of two possible outcomes in a series of independent events.
The probabilities of the two outcomes are constant across events.
An example of iid events (independent, identically distributed).

10 Binomial probability mass
If the probability of one outcome (let's call it A) is p and there are n events:
– The probability of the other outcome is 1 − p.
– The probability of obtaining a particular sequence of outcomes with m A's is p^m (1 − p)^(n−m).
– There are C(n, m) = n! / (m! (n − m)!) sequences with the same number m of outcomes A.
Overall, P(m A's in n events) = C(n, m) p^m (1 − p)^(n−m).
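
A small sketch of this mass function (binomial_pmf is an illustrative name; Python's math.comb computes the binomial coefficient C(n, m)):

```python
from math import comb

def binomial_pmf(m, n, p):
    # P(m outcomes A in n events) = C(n, m) * p^m * (1 - p)^(n - m)
    return comb(n, m) * p**m * (1 - p)**(n - m)
```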

11 MLE example: three coins
Result: among the three candidate values, θ = ½ gives the highest probability of observing 11 heads in 20 tosses, so we choose θ = ½.
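
A quick numerical check of this result using the binomial_pmf sketch above (the candidate θ values are from the slide; the printed likelihoods, roughly 0.025, 0.160, and 0.099, are computed by this sketch, not taken from the slides):

```python
from math import comb

def binomial_pmf(m, n, p):
    return comb(n, m) * p**m * (1 - p)**(n - m)

# Likelihood of observing 11 heads in 20 tosses under each candidate coin.
likelihoods = {theta: binomial_pmf(11, 20, theta) for theta in (1/3, 1/2, 2/3)}
for theta, lik in likelihoods.items():
    print(round(theta, 3), round(lik, 3))          # theta = 1/2 gives the largest value
print(max(likelihoods, key=likelihoods.get))       # 0.5
```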

12 MLE example: unknown coins
Now θ can take any value between 0 and 1.
We observe m heads in n tosses, so P(m heads | θ) = C(n, m) θ^m (1 − θ)^(n−m).
Find the maximizing θ by solving the equation dP/dθ = 0.

13 Solving the equation dP/dθ = 0
dP/dθ = C(n, m) [ m θ^(m−1) (1 − θ)^(n−m) − (n − m) θ^m (1 − θ)^(n−m−1) ] = 0
⇒ θ^(m−1) (1 − θ)^(n−m−1) [ m(1 − θ) − (n − m)θ ] = 0
⇒ θ = 0, θ = 1, or m − nθ = 0, i.e., θ = m/n

14 MLE for binomial
Of the three solutions, θ = 0 and θ = 1 result in P(X_1, X_2, ..., X_n | θ) = 0, i.e., local minima.
On the other hand, for 0 < θ < 1 we have P(X_1, X_2, ..., X_n | θ) > 0, so θ = m/n must be a local maximum.
Therefore the MLE estimate is θ̂ = m/n.
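
A grid-search sanity check of this conclusion for the three-coin data (m = 11, n = 20); this is an illustrative sketch that evaluates the likelihood numerically rather than solving dP/dθ = 0 symbolically:

```python
from math import comb

def likelihood(theta, m=11, n=20):
    return comb(n, m) * theta**m * (1 - theta)**(n - m)

# Evaluate the likelihood on a fine grid of theta values in (0, 1).
grid = [i / 1000 for i in range(1, 1000)]
print(max(grid, key=likelihood))   # 0.55, i.e., m/n = 11/20
```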

15 Properties of estimators
The estimation error for a given sample is x̂ − x, where x̂ is the estimate and x is the unknown true value.
An estimator is a random variable
– because it depends on the sample.
The mean square error MSE(x̂) = E[(x̂ − x)²] represents the overall quality of the estimation across all samples.
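
As a concrete illustration (a simulation sketch, not from the slides), the mean square error of the binomial MLE θ̂ = m/n can be approximated by averaging the squared estimation error over many simulated samples:

```python
import random

def mse_of_mle(true_theta=0.5, n=20, trials=100_000):
    total = 0.0
    for _ in range(trials):
        m = sum(random.random() < true_theta for _ in range(n))  # simulated heads count
        estimate = m / n                        # MLE from this sample
        total += (estimate - true_theta) ** 2   # squared estimation error
    return total / trials                       # approximates E[(estimate - true value)^2]

print(mse_of_mle())   # close to theta*(1 - theta)/n = 0.0125 for these settings
```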

16 Expected values
Recall that the expected value of a discrete random variable X is defined as E[X] = Σ_x x · P(X = x).
The expected value of a dependent random variable f(X) is E[f(X)] = Σ_x f(x) · P(X = x).
For continuous distributions, replace the sum with an integral.
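
A direct transcription of these two definitions for a finite discrete distribution (the helper names and the fair-die example are illustrative):

```python
def expected_value(pmf):
    # E[X] = sum over x of x * P(X = x); pmf maps each value x to its probability
    return sum(x * p for x, p in pmf.items())

def expected_value_of(f, pmf):
    # E[f(X)] = sum over x of f(x) * P(X = x)
    return sum(f(x) * p for x, p in pmf.items())

die = {x: 1/6 for x in range(1, 7)}               # fair six-sided die
print(expected_value(die))                        # 3.5
print(expected_value_of(lambda x: x * x, die))    # about 15.17
```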

17 Bias in estimation
An estimator x̂ is unbiased if E[x̂] = x.
MLE is not necessarily unbiased.
Example: standard deviation
– It is the most commonly used measure of dispersion in a data set.
– For a random variable X, it is defined as σ = √( E[ (X − E[X])² ] ).

18 Estimators of standard deviation
MLE estimator (biased): σ̂ = √( (1/n) Σ_{i=1..n} (x_i − x̄)² ), where x̄ = (1/n) Σ_{i=1..n} x_i.
"Almost unbiased" estimator: s = √( (1/(n−1)) Σ_{i=1..n} (x_i − x̄)² ); s² is an unbiased estimator of σ², although s itself is still slightly biased for σ.
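
A simulation sketch contrasting the two variance estimators (the normal distribution, the sample size n = 5, and the trial count are illustrative choices, not from the slides). Averaged over many samples, the 1/n estimator comes out near (n−1)/n · σ², while the 1/(n−1) estimator comes out near σ²:

```python
import random

def average_variance_estimates(n=5, trials=200_000, sigma=1.0):
    sum_mle, sum_unbiased = 0.0, 0.0
    for _ in range(trials):
        sample = [random.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        ss = sum((x - mean) ** 2 for x in sample)   # sum of squared deviations
        sum_mle += ss / n                           # MLE estimate of sigma^2 (biased)
        sum_unbiased += ss / (n - 1)                # (n - 1) estimate (unbiased for sigma^2)
    return sum_mle / trials, sum_unbiased / trials

print(average_variance_estimates())   # roughly (0.8, 1.0) for n = 5, sigma = 1
```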

