Probability and Information Theory



Random Variables A random variable is a variable that can take on different values randomly. On its own, it is only a description of the states that are possible; it must be coupled with a probability distribution that specifies how likely each state is. Denoted as a lowercase letter; may be discrete or continuous. Ex) P(x = 'yes')

Probability Distributions A probability distribution is a description of how likely a random variable or set of random variables is to take on each of its possible states.

Discrete Variables and Probability Mass Functions A probability distribution over discrete variables may be described using a probability mass function (PMF), which maps from a state of a random variable to the probability of that random variable taking on that state. P(x = x): the probability that the random variable x takes on the state (value) x.
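As a minimal sketch, a PMF over a discrete variable can be represented as a plain mapping from states to probabilities (the values below are made up for illustration):

```python
# Hypothetical PMF over a discrete variable with two states.
pmf = {"yes": 0.7, "no": 0.3}

def P(state):
    """P(x = state): probability that x takes on `state`."""
    return pmf.get(state, 0.0)

# A valid PMF is nonnegative and sums to 1 over all states.
assert all(q >= 0 for q in pmf.values())
assert abs(sum(pmf.values()) - 1.0) < 1e-12
```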

Discrete Variables and Probability Mass Functions Joint probability distribution P(x = x, y = y) denotes the probability that x = x and y = y simultaneously; this may be abbreviated P(x, y).

Discrete Variables and Probability Mass Functions

Continuous Variables and Probability Density Functions Probability Density Function (PDF)

Continuous Variables and Probability Density Functions When working with continuous random variables, we describe probability distributions using a probability density function (PDF) rather than a probability mass function. To be a probability density function, a function p must satisfy the following properties: • The domain of p must be the set of all possible states of x. • ∀x ∈ x, p(x) ≥ 0. Note that we do not require p(x) ≤ 1. • ∫ p(x) dx = 1. A probability density function p(x) does not give the probability of a specific state directly; instead, the probability of landing inside an infinitesimal region with volume δx is given by p(x)δx. We can integrate the density function to find the actual probability mass of a set of points: the probability that x lies in some set S is given by the integral of p(x) over that set. In the univariate example, the probability that x lies in the interval [a, b] is given by ∫[a,b] p(x) dx. For an example of a PDF corresponding to a specific probability density over a continuous random variable, consider a uniform distribution on an interval of the real numbers, u(x; a, b), where a and b are the endpoints of the interval, with b > a. The ";" notation means "parametrized by"; we consider x to be the argument of the function, while a and b are parameters that define the function. To ensure that there is no probability mass outside the interval, we say u(x; a, b) = 0 for all x ∉ [a, b]. Within [a, b], u(x; a, b) = 1 / (b − a). This is nonnegative everywhere, and it integrates to 1. We often denote that x follows the uniform distribution on [a, b] by writing x ∼ U(a, b).

Marginal Probability Sometimes we know the probability distribution over a set of variables and we want to know the probability distribution over just a subset of them. The probability distribution over the subset is known as the marginal probability distribution. For discrete variables it is found with the sum rule, P(x = x) = Σ_y P(x = x, y = y); for continuous variables, p(x) = ∫ p(x, y) dy.
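A minimal sketch of marginalization on a made-up discrete joint table P(x, y):

```python
# Hypothetical joint distribution over x ∈ {0, 1} and y ∈ {0, 1, 2}.
joint = [
    [0.10, 0.20, 0.10],  # P(x = 0, y = ·)
    [0.30, 0.20, 0.10],  # P(x = 1, y = ·)
]

# Sum rule: P(x = x) = Σ_y P(x = x, y = y)
P_x = [sum(row) for row in joint]
# Likewise for the other marginal: P(y = y) = Σ_x P(x = x, y = y)
P_y = [sum(row[j] for row in joint) for j in range(3)]
```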

Conditional Probability The probability of some event, given that some other event has happened: P(y = y | x = x) = P(y = y, x = x) / P(x = x), defined only when P(x = x) > 0.
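A quick sketch of computing a conditional probability from a joint table; the weather/ground example and its numbers are made up:

```python
# Hypothetical joint P(x, y) over weather x and ground state y.
joint = {("rain", "wet"): 0.30, ("rain", "dry"): 0.10,
         ("sun", "wet"): 0.05, ("sun", "dry"): 0.55}

def P_x(x):
    """Marginal P(x = x) via the sum rule."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def P_y_given_x(y, x):
    """Conditional P(y = y | x = x) = P(x, y) / P(x)."""
    return joint[(x, y)] / P_x(x)

print(P_y_given_x("wet", "rain"))  # 0.3 / 0.4 ≈ 0.75
```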

The Chain Rule of Conditional Probabilities Any joint probability distribution over many random variables may be decomposed into conditional distributions over only one variable each (the chain rule): P(x(1), …, x(n)) = P(x(1)) ∏ᵢ₌₂ⁿ P(x(i) | x(1), …, x(i−1))
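The chain rule can be checked numerically on an arbitrary joint table; the weights below are made up, and the table is over three binary variables:

```python
# Chain-rule sketch: P(a, b, c) = P(a) · P(b | a) · P(c | a, b).
import itertools

raw = [1, 2, 3, 4, 5, 6, 7, 8]  # arbitrary unnormalized weights
joint = {abc: r / sum(raw)
         for abc, r in zip(itertools.product([0, 1], repeat=3), raw)}

def P_a(a):
    return sum(p for (ai, _, _), p in joint.items() if ai == a)

def P_b_given_a(b, a):
    num = sum(p for (ai, bi, _), p in joint.items() if (ai, bi) == (a, b))
    return num / P_a(a)

def P_c_given_ab(c, a, b):
    den = sum(p for (ai, bi, _), p in joint.items() if (ai, bi) == (a, b))
    return joint[(a, b, c)] / den

# The factorization reproduces every joint entry.
for (a, b, c), p in joint.items():
    assert abs(p - P_a(a) * P_b_given_a(b, a) * P_c_given_ab(c, a, b)) < 1e-12
```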

Independence and Conditional Independence Two random variables x and y are independent if their joint distribution factorizes: P(x = x, y = y) = P(x = x) P(y = y) for all x, y. They are conditionally independent given z if P(x = x, y = y | z = z) = P(x = x | z = z) P(y = y | z = z) for all x, y, z.
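A small sketch contrasting independent and dependent joints; both coin examples below share the same marginals, which shows why the factorization test is needed:

```python
def independent(joint, P_x, P_y, tol=1e-9):
    """Check P(x, y) = P(x) · P(y) for every pair of states."""
    return all(abs(joint[(x, y)] - P_x[x] * P_y[y]) <= tol
               for x in P_x for y in P_y)

half = {"H": 0.5, "T": 0.5}

# Two fair coins flipped separately: the joint factorizes.
fair = {(x, y): 0.25 for x in "HT" for y in "HT"}
print(independent(fair, half, half))  # True

# Perfectly correlated coins: same marginals, but NOT independent.
corr = {("H", "H"): 0.5, ("T", "T"): 0.5,
        ("H", "T"): 0.0, ("T", "H"): 0.0}
print(independent(corr, half, half))  # False
```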

Expectation The expectation or expected value of some function f(x) with respect to a probability distribution P(x) the average or mean value that f takes on when x is drawn from P

Expectation Expectations are linear
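Linearity, E[α f(x) + β g(x)] = α E[f(x)] + β E[g(x)], can be verified directly on a small made-up PMF:

```python
# Hypothetical PMF over x ∈ {0, 1, 2}.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

def E(f):
    """Expectation of f(x) under the PMF: Σ_x P(x) f(x)."""
    return sum(p * f(x) for x, p in pmf.items())

alpha, beta = 3.0, -1.0
f = lambda x: x
g = lambda x: x * x

lhs = E(lambda x: alpha * f(x) + beta * g(x))
rhs = alpha * E(f) + beta * E(g)
assert abs(lhs - rhs) < 1e-12

print(E(f))  # E[x] = 0·0.2 + 1·0.5 + 2·0.3 ≈ 1.1
```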

Variance a measure of how much the values of a function of a random variable x vary as we sample different values of x from its probability distribution
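A sketch computing the variance, Var(x) = E[(x − E[x])²], on the same kind of made-up PMF, checked against the equivalent identity E[x²] − E[x]²:

```python
pmf = {0: 0.2, 1: 0.5, 2: 0.3}  # hypothetical distribution

mean = sum(p * x for x, p in pmf.items())               # E[x]
var = sum(p * (x - mean) ** 2 for x, p in pmf.items())  # E[(x − E[x])²]
ex2 = sum(p * x * x for x, p in pmf.items())            # E[x²]

# Equivalent form: Var(x) = E[x²] − E[x]²
assert abs(var - (ex2 - mean ** 2)) < 1e-12
```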

Covariance Gives some sense of how much two values are linearly related to each other
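A sample-based sketch of covariance, Cov(f, g) = E[(f − E[f])(g − E[g])]; the data below are made up, with y roughly a linear function of x so the covariance should come out positive:

```python
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # noisy, increasing with x

def mean(v):
    return sum(v) / len(v)

mx, my = mean(xs), mean(ys)
cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
print(cov > 0)  # True: x and y increase together
```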

Bernoulli Distribution a distribution over a single binary random variable
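A sketch of Bernoulli(φ): P(x = 1) = φ, P(x = 0) = 1 − φ, with mean φ and variance φ(1 − φ); the parameter value is arbitrary:

```python
phi = 0.3  # assumed parameter

def bernoulli_pmf(x, phi):
    return phi if x == 1 else 1 - phi

mean = 1 * phi + 0 * (1 - phi)                           # E[x] = φ
var = (1 - mean) ** 2 * phi + (0 - mean) ** 2 * (1 - phi)
assert abs(var - phi * (1 - phi)) < 1e-12                # Var = φ(1 − φ)
```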

Multinoulli Distribution The multinoulli or categorical distribution is a distribution over a single discrete variable with k different states, where k is finite. It is parametrized by a vector p ∈ [0, 1]^(k−1), where p_i gives the probability of the i-th state. The final k-th state’s probability is given by 1 − 1ᵀp.
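A sketch of the (k−1)-vector parametrization with k = 3 states; the probability values are made up:

```python
p = [0.2, 0.5]         # p_1, p_2: probabilities of the first k−1 states
p_last = 1 - sum(p)    # final state's probability: 1 − 1ᵀp
pmf = p + [p_last]

assert all(0 <= q <= 1 for q in pmf)
assert abs(sum(pmf) - 1.0) < 1e-12
print(p_last)  # ≈ 0.3
```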

Gaussian Distribution The most commonly used distribution over real numbers

Gaussian Distribution

Gaussian Distribution Multivariate normal distribution
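A sketch of the univariate normal density, N(x; μ, σ²) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²)), with a crude midpoint-rule check (integration range [−8, 8] is an assumption; the tail mass beyond it is negligible):

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Numerical check that the density integrates to ≈ 1.
n = 16000
dx = 16.0 / n
total = sum(normal_pdf(-8.0 + (i + 0.5) * dx) * dx for i in range(n))
print(round(total, 6))  # ≈ 1.0
```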

Exponential distribution In the context of deep learning, we often want to have a probability distribution with a sharp point at x = 0. The exponential distribution, p(x; λ) = λ 1_{x≥0} exp(−λx), accomplishes this.

Laplace distribution A closely related distribution that lets us place a sharp peak of probability mass at an arbitrary point µ: Laplace(x; µ, γ) = (1 / 2γ) exp(−|x − µ| / γ)
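Both densities can be sketched in a few lines; the default parameter values below are arbitrary:

```python
import math

def exponential_pdf(x, lam=1.0):
    # p(x; λ) = λ·exp(−λx) for x ≥ 0, and 0 otherwise (sharp point at x = 0).
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def laplace_pdf(x, mu=0.0, gamma=1.0):
    # Laplace(x; μ, γ) = (1 / 2γ)·exp(−|x − μ| / γ): sharp peak at x = μ.
    return math.exp(-abs(x - mu) / gamma) / (2 * gamma)

print(exponential_pdf(-1.0))  # 0.0: no mass below zero
print(laplace_pdf(0.0))       # 0.5: peak height at μ for γ = 1
```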

Mixtures of Distributions A mixture distribution is made up of several component distributions.
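A sketch of a mixture density, p(x) = Σᵢ P(c = i) · p(x | c = i), built here from two Laplace components; the weights and component parameters are made up:

```python
import math

def laplace_pdf(x, mu, gamma):
    return math.exp(-abs(x - mu) / gamma) / (2 * gamma)

weights = [0.4, 0.6]                     # P(c = i); must sum to 1
components = [(-2.0, 1.0), (3.0, 0.5)]   # (μ, γ) for each component

def mixture_pdf(x):
    return sum(w * laplace_pdf(x, mu, g)
               for w, (mu, g) in zip(weights, components))

assert abs(sum(weights) - 1.0) < 1e-12
# The mixture density is high near a component mean, low between them.
print(mixture_pdf(3.0) > mixture_pdf(0.0))  # True
```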

Bayes’ Rule
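Bayes' rule, P(x | y) = P(x) P(y | x) / P(y) with P(y) = Σ_x P(y | x) P(x), sketched on a made-up disease-test example:

```python
P_disease = 0.01             # prior P(x = disease), assumed
P_pos_given_disease = 0.95   # likelihood P(y = + | disease), assumed
P_pos_given_healthy = 0.05   # false-positive rate, assumed

# Evidence by the sum rule: P(+) = Σ_x P(+ | x) · P(x)
P_pos = (P_pos_given_disease * P_disease
         + P_pos_given_healthy * (1 - P_disease))

# Posterior: P(disease | +) = P(+ | disease) · P(disease) / P(+)
P_disease_given_pos = P_pos_given_disease * P_disease / P_pos
print(round(P_disease_given_pos, 3))  # ≈ 0.161
```

Note how the posterior stays small despite the accurate test, because the prior is low; this is the classic base-rate effect that Bayes' rule makes explicit.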