Probability: Many Random Variables (Part 2) Mike Wasikowski June 12, 2008

Contents: Indicator RVs, Derived RVs, Order RVs, Continuous RV Transformations

Indicator RVs. I_A = 1 if event A occurs, 0 if not. Consider events A_1, A_2, …, A_n, their indicator RVs I_1, I_2, …, I_n, and the probabilities p_1, p_2, …, p_n that each A_i occurs. Then Σ_j I_j is the total number of events that occur. The mean of a sum of RVs equals the sum of the means (regardless of dependence), so E(Σ_j I_j) = Σ_j E(I_j) = Σ_j p_j.

Indicator RVs. If all the p_i equal a common value p, then E(Σ_j I_j) = np. When the events are independent, the variance of the number of events that occur is p_1(1-p_1) + … + p_n(1-p_n). If all the p_i are equal and the events are independent, the variance is np(1-p); in that case Σ_j I_j has a binomial distribution.
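A quick simulation confirms these moments. Below is a minimal Python sketch (numpy only; n, p, and the trial count are illustrative values, not from the slides) that draws n independent indicators with common success probability p and checks that their sum has mean np and variance np(1-p):

import numpy as np

rng = np.random.default_rng(0)
n, p, trials = 20, 0.3, 100_000

# Each row holds n independent indicator RVs I_j with P(I_j = 1) = p.
indicators = rng.random((trials, n)) < p
counts = indicators.sum(axis=1)          # Σ_j I_j for each trial

print(counts.mean(), n * p)              # both ≈ 6.0, the binomial mean np
print(counts.var(), n * p * (1 - p))     # both ≈ 4.2, the binomial variance np(1-p)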

Ex: Sequencing EST Libraries. Transcription: DNA → mRNA → amino acids/proteins. An EST (expressed sequence tag) is a sequence of 100+ base pairs of mRNA. Different genes are expressed at different levels inside a cell; abundance class L consists of the mRNA “species” of which a cell contains L copies. We generate an EST database by sampling with replacement from the mRNA pool, so rarer species are seen less often. How does the number of samples affect the proportion of rare species we will see?

Ex: Sequencing EST Libraries. Using indicator RVs makes this problem easy to solve. Let I_a = 1 if species a appears among the S samples, 0 if not. The number of species from abundance class L that appear in the database is then Σ_a I_a. Each I_a within the class has the same mean p_L, so if the class contains n_L species, E(Σ_a I_a) = n_L p_L.

Ex: Sequencing EST Libraries. Write p_L = 1 - r_L, where r_L is the probability that a given class-L species is absent from the database. If the pool contains N mRNAs in total, each draw misses the species with probability 1 - L/N, so r_L = (1 - L/N)^S. Thus we get E(Σ_a I_a) = n_L (1 - (1 - L/N)^S).
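This formula makes the dependence on sample size concrete. A minimal sketch (the values of n_L, L, and N are illustrative assumptions, not numbers from the slides) showing how the expected number of species seen from a rare class grows with S:

def expected_species(n_L, L, N, S):
    # Expected number of distinct class-L species seen in S draws
    # (with replacement) from a pool of N mRNAs: n_L * (1 - (1 - L/N)**S).
    return n_L * (1.0 - (1.0 - L / N) ** S)

# Hypothetical rare class: n_L = 500 species, L = 2 copies each, pool of N = 10**6.
for S in (10_000, 100_000, 1_000_000):
    print(S, expected_species(500, 2, 10**6, S))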

Derived RVs. We previously saw how to find joint distributions and density functions. These joint pdfs can be used to define many new RVs: sums, averages, and orderings. Because many statistical operations use these RVs, knowing the properties of their distributions is important.

Sums and Averages. The two most important derived RVs are the sum S_n = X_1 + X_2 + … + X_n and the average X̄ = S_n/n. For iid X_i with mean μ and variance σ^2, the mean of S_n is nμ and its variance is nσ^2, while the mean of X̄ is μ and its variance is σ^2/n. These properties generalize to well-behaved functions of RVs and vectors of RVs as well. Many important applications in probability and statistics use sums and averages.
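These moment formulas are easy to check numerically. A minimal sketch (normal draws and the values of n, μ, and σ are illustrative assumptions) comparing empirical moments of S_n and X̄ with nμ, nσ^2, and σ^2/n:

import numpy as np

rng = np.random.default_rng(1)
n, mu, sigma = 50, 2.0, 3.0

x = rng.normal(mu, sigma, size=(100_000, n))
s = x.sum(axis=1)        # S_n for each replicate
xbar = x.mean(axis=1)    # X̄ for each replicate

print(s.mean(), n * mu)            # mean of S_n ≈ nμ
print(s.var(), n * sigma**2)       # variance of S_n ≈ nσ^2
print(xbar.var(), sigma**2 / n)    # variance of X̄ ≈ σ^2/n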

Central Limit Theorem. If X_1, X_2, …, X_n are iid with finite mean μ and variance σ^2, then as n → ∞ the standardized RV (X̄ - μ)√n/σ converges in distribution to an RV ~ N(0,1). (Figure: illustration of the convergence, from Wikipedia's Central Limit Theorem article.)
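A simulation makes the theorem visible. In the sketch below (parameters are illustrative; Exp(1) is chosen because it is clearly non-normal yet has μ = σ = 1), the standardized sample mean behaves like N(0,1):

import numpy as np

rng = np.random.default_rng(2)
n, trials = 500, 20_000
mu, sigma = 1.0, 1.0               # Exp(1) has mean 1 and variance 1

x = rng.exponential(1.0, size=(trials, n))
z = (x.mean(axis=1) - mu) * np.sqrt(n) / sigma   # standardized sample means

# For N(0,1), about 95% of the mass lies in [-1.96, 1.96].
print(np.mean(np.abs(z) < 1.96))   # ≈ 0.95
print(z.mean(), z.var())           # ≈ 0.0 and ≈ 1.0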

Order Statistics. These involve the ordering of n iid RVs. Call the smallest X_(1), the next smallest X_(2), and so on up to the largest X_(n); then X_min = X_(1) and X_max = X_(n). The order statistics are distinct because P(X_(i) = X_(j)) = 0 for independent continuous RVs when i ≠ j.

Minimum RV (X_min). Let X_1, X_2, …, X_n be iid as X. X_min ≥ x exactly when X_i ≥ x for every i, so P(X_min ≥ x) = P(X ≥ x)^n, also written as 1 - F_min(x) = (1 - F_X(x))^n. Differentiating, we get the density function f_min(x) = n f_X(x) (1 - F_X(x))^(n-1).

Maximum RV (X_max). Let X_1, X_2, …, X_n be iid as X. X_max ≤ x exactly when X_i ≤ x for every i, so P(X_max ≤ x) = P(X ≤ x)^n, also written as F_max(x) = (F_X(x))^n. Differentiating, we get the density function f_max(x) = n f_X(x) (F_X(x))^(n-1).
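Both formulas are easy to verify by simulation. A minimal sketch using Uniform(0,1) draws, where F_X(x) = x (the values of n, x, and the trial count are illustrative):

import numpy as np

rng = np.random.default_rng(3)
n, trials = 5, 200_000

u = rng.random((trials, n))            # n iid Uniform(0,1) draws per trial
xmax, xmin = u.max(axis=1), u.min(axis=1)

x = 0.7
print(np.mean(xmax <= x), x**n)        # F_max(x) = F_X(x)^n
print(np.mean(xmin >= x), (1 - x)**n)  # 1 - F_min(x) = (1 - F_X(x))^n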

Density function of X_(i). Let h be a small value and ignore events of probability o(h). Consider the event that u < X_(i) < u + h. In this event, i - 1 of the RVs are less than u, one lies between u and u + h, and the remaining n - i exceed u + h: a multinomial event with n trials and 3 outcomes. We also have the approximation P(u < X < u + h) ≈ f_X(u) h.

Density function of X_(i). Combining the multinomial probability with the approximation above gives f_(i)(u) = [n! / ((i-1)! (n-i)!)] (F_X(u))^(i-1) f_X(u) (1 - F_X(u))^(n-i).
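This density can be checked against simulation. For Uniform(0,1) draws, F_X(u) = u and f_X(u) = 1, so the formula becomes the Beta(i, n-i+1) density; the sketch below (n, i, the evaluation point u0, and the window h are illustrative choices) compares an empirical estimate of P(u0 < X_(i) < u0 + h)/h with the formula:

import numpy as np
from math import factorial

rng = np.random.default_rng(4)
n, i, trials = 5, 2, 500_000

u = np.sort(rng.random((trials, n)), axis=1)
x_i = u[:, i - 1]                       # the i-th smallest value per trial

def f_order(u0, n, i):
    # Order-statistic density for Uniform(0,1): F(u) = u, f(u) = 1.
    c = factorial(n) / (factorial(i - 1) * factorial(n - i))
    return c * u0**(i - 1) * (1 - u0)**(n - i)

u0, h = 0.3, 0.01
print(np.mean((x_i > u0) & (x_i < u0 + h)) / h)   # empirical density estimate
print(f_order(u0 + h / 2, n, i))                  # ≈ the same value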

Continuous RV Transformations. Consider n continuous RVs X_1, X_2, …, X_n, and let V_1 = V_1(X_1, X_2, …, X_n), with V_2, …, V_n defined similarly; we then have a mapping from (X_1, X_2, …, X_n) to (V_1, V_2, …, V_n). If the mapping is one-to-one and differentiable with a differentiable inverse, we can define the Jacobian matrix of the inverse, and the joint density of the V's is the joint density of the X's evaluated at the inverse, multiplied by the absolute value of the Jacobian determinant. Jacobian transformations are used to find the marginal density of one RV when that would otherwise be difficult; they are used in ANOVA as well as in BLAST.
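A small worked example may help. The sympy sketch below is my own illustration (the specific transformation is not from the slides): take X_1, X_2 iid Exp(1) and set V_1 = X_1 + X_2, V_2 = X_1/(X_1 + X_2), so the inverse map is x_1 = v_1 v_2, x_2 = v_1 (1 - v_2):

import sympy as sp

v1, v2 = sp.symbols('v1 v2', positive=True)

# Inverse map for V1 = X1 + X2, V2 = X1/(X1 + X2).
x1, x2 = v1 * v2, v1 * (1 - v2)

J = sp.Matrix([x1, x2]).jacobian([v1, v2])
print(sp.simplify(J.det()))              # -v1, so |det J| = v1

# X1, X2 iid Exp(1): joint density exp(-x1 - x2) on the positive quadrant.
f_joint = sp.exp(-x1 - x2) * sp.Abs(J.det())
print(sp.simplify(f_joint))              # v1*exp(-v1)

The joint density v1·e^(-v1) factors, so V_1 ~ Gamma(2,1) and V_2 ~ Uniform(0,1) are independent; the marginal of V_2 falls out without integrating over X_1 and X_2 directly.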

Questions?