AP **Statistics** Chapter 8 Section 2 If you want to know the number of successes in a fixed number of trials, then we have a binomial setting. If you want to know / a 3 each time 3.Independent events 4. Variable of interest – number of rolls until a 3 occurs Yes, Geometric distribution Rule for calculating Geometric **Probabilities** If X has a geometric distribution with p **probability** of success **and** (1-p) of failure on each observation, the possible values of X are 1, 2, 3, …. If n is any one of these values, then/
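The geometric rule quoted above is cut off before its formula. As a minimal sketch, assuming the standard result P(X = n) = (1 - p)^(n - 1) · p, the die example (rolling until a 3 appears) could be computed as follows. The function name `geometric_pmf` is illustrative and does not come from the slides.

```python
def geometric_pmf(p: float, n: int) -> float:
    """P(X = n): the first success occurs on trial n (n = 1, 2, 3, ...)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # n - 1 failures, each with probability (1 - p), then one success.
    return (1 - p) ** (n - 1) * p


# Rolling a fair die until a 3 appears: success probability 1/6 per roll.
for n in (1, 2, 3):
    print(n, round(geometric_pmf(1 / 6, n), 4))
```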

Which series are better? [Diagram: state graph from START through H, T, HH, HT, HTT, HHT, with transitions labelled P1 to P4.] P i denotes that Andrew is here **and** wins. Gergely Wintsche, Part III – **Probability**, experiments, **statistic**. Head runs: which series are better? [Diagram: state graph from START with states 7 and 12 and transition probabilities 29/36, 1/36, 6/36.]/

© 2003 Pearson Prentice Hall **Statistics** for Business **and** Economics Nonparametric **Statistics** Chapter 14 Learning Objectives 1. Distinguish/Test, 2 Test Nonparametric Test Procedures 1. Do Not Involve Population Parameters. Example: **Probability** Distributions, Independence 2. Data Measured on Any Scale: Ratio or Interval, Ordinal. Example: Good-Better-Best/

or Gaussian **probability** distribution (given the class) The **probability** density function for the normal distribution is defined by two parameters: Sample mean Standard deviation Then the density function f(x) is 12 **Statistics** for weather / = 0.000136 / (0.000036 + 0. 000136) = 79.1% 14 **Probability** densities Relationship between **probability** **and** density: But: this doesn’t change calculation of a posteriori **probabilities** because cancels out Exact relationship: 15 Naïve Bayes: discussion Na/
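The normalisation step in the snippet above (0.000136 / (0.000036 + 0.000136) = 79.1%) can be sketched in a few lines. The density formula itself is missing from the snippet, so `normal_density` below assumes the usual Gaussian density with a mean and standard deviation. Both function names are illustrative.

```python
import math


def normal_density(x: float, mean: float, sd: float) -> float:
    """Gaussian density f(x), assuming the usual two-parameter form."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))


def normalize(scores):
    """Turn unnormalised class scores into a posteriori probabilities."""
    total = sum(scores)
    return [score / total for score in scores]


# The two class scores quoted in the snippet.
posterior = normalize([0.000036, 0.000136])
print(round(posterior[1], 3))
```

Because both scores are divided by the same total, any constant factor shared by the two classes cancels, which is the point the snippet makes about densities versus probabilities.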

,42 What is the mean of these numbers? Jana paid 50 cents for a pack of 14 baseball cards **and** 75 cents for a pack of 25 baseball cards. How many baseball cards did she buy? Math Review 1 NumbersAlgebraGeometry/ Measurement Data, **Statistics**, **Probability** Potpourri If Mari earns $40 a week, how much will she earn in 6 weeks? (11 _ 9) _/

Corinne makes 75% of her free throws. What is the **probability** of making exactly 7 of 12 free throws? binompdf(12,.75,7) = .1032 AP **Statistics**, Section 8.1.2 Binomial Distributions on the calculator Binomial **Probabilities** B(n,p) with k successes binomcdf(n,p,k) /of all adults would “agree”. What is the **probability** that 1520 or more of the sample “agree”? TI-83 calculator B(2500,.6) **and** P(X≥1520): 1-binomcdf(2500,.6,1519) = .2131390887 Exercises 8.8-8./
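The two calculator results above can be cross-checked without a TI-83. The sketch below is one possible approach, not the calculator's own method: it works in log space with `math.lgamma`, because a term such as 0.6^1500 underflows ordinary floating point. The names `binom_pmf` and `binom_cdf` are illustrative stand-ins for `binompdf` and `binomcdf`.

```python
import math


def binom_log_pmf(n: int, p: float, k: int) -> float:
    """log P(X = k) for X ~ B(n, p), computed via log-gamma to avoid underflow."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)


def binom_pmf(n: int, p: float, k: int) -> float:
    """Counterpart of binompdf(n, p, k)."""
    return math.exp(binom_log_pmf(n, p, k))


def binom_cdf(n: int, p: float, k: int) -> float:
    """Counterpart of binomcdf(n, p, k): P(X <= k)."""
    return sum(binom_pmf(n, p, i) for i in range(k + 1))


print(round(binom_pmf(12, 0.75, 7), 4))            # exactly 7 of 12 free throws
print(round(1 - binom_cdf(2500, 0.6, 1519), 4))    # 1520 or more of 2500 agree
```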

used terms Mean Standard deviation or variance Coefficient of variation Median Skewness Correlation Distributions Types **Statistics** “ Building Strong “ Delivering Integrated, Sustainable, Water Resources Solutions Mean –The average of a set of values –Excel command – AVERAGE Expected value –the centroid of the **probability** distribution on a random variable Mean **and** Expected Value “ Building Strong “ Delivering Integrated, Sustainable, Water Resources Solutions Variance –The average squared/

option under the Regression package, but the axes are linear percents (unlike MINITAB **and** StatCrunch) … that can be changed manually ●StatCrunch The option Graph – QQ Plot in StatCrunch creates normal **probability** plots (also called QQ plots) The StatCrunch axes are switched compared to the MINITAB axes Sullivan – Fundamentals of **Statistics** – 2 nd Edition – Chapter 7 Section 4 – Slide 8 of 11 Chapter 7/

Inferential **statistics** High **probability** that sample generally representative of the population on variables of interest Non-random Samples Purposive Quota Accidental Generalizability based on “argument” Replication Sample “like” the population Selecting a sampling method Depends on the population Problem **and** aims of the research Existence of sampling frame Conclusion The purpose of sampling is to select a set of elements from the population/

Inferential **Statistics** Population Curve Mean Mean Group of 30 Population/ group mean could have occurred by chance (remember the relationship of z scores to percentile rank). 50.034.13 **Probability** of higher mean 15.87% (.16) Measuring Group Means Against the Sampling Distribution Sampling Distribution (n=30) 2/ you look up the cut-off value by determining the df (identifying the distribution) **and** then looking up on a table of “critical values” whether the difference was significant. Now you can read the/

that a randomly chosen, unrelated individual from a given population would have the same DNA profile observed in a sample? Mixture **statistics**: Combined **Probability** of Inclusion (CPI) or Likelihood Ratios (LR) Mixed DNA samples Put two peoples names into a mixture. How many names/. Whose might it be? Could the actual source be: Caucasian, Afro-Caribbean, or Indo-Pakistan? If it cannot be **and** there is no one else in the alternative suspect pool then the suspect must be the source. A suspect pool D matches/

detailed consideration of molecules as individuals. 2. Is a Microscopic, **statistical** approach to the calculation of Macroscopic quantities. 3. Applies the methods of **Probability** & **Statistics** to Macroscopic systems with HUGE numbers of particles. **Statistical** Mechanics 3. For systems with known energy (Classical or Quantum) it gives BOTH A. Relations between Macroscopic quantities (like Thermo) **AND** B. NUMERICAL VALUES of them (like Kinetic Theory). This course/

standard annotation **and** segmentation BAStat orth. transcript / tagging Verbmobil (manually) lexicon SAM-PA (manually) phonetic segmentation MAUS (automatic) syllabification U. Reichel (automatic) 10LREC 2010 Valletta, Malta OnFocus / OffFocus } BAStat : Phone **Statistic** two phoneme sets: basic (52) + extended (76) including all possible vocalized /r/ diphthongs (e.g. /E6/ (‚er‘), /u:6/ (‚Uhr‘) etc.) phone **probability** P(phon) phone bigram **probability** P(phon2|phon1) position **probability**: word/

Section 4-2 **Statistics** 300: Introduction to **Probability** **and** **Statistics** **Probability** Chapter 4 –Section 2: Fundamentals –Section 3: Addition Rule –Section 4: Multiplication Rule #1 –Section 5: Multiplication Rule #2 –Section 6: Simulating **Probabilities** –Section 7: Counting Fundamentals Vocabulary (Terms) –Event –Simple /number of all possible outcomes P(A) = (ways for A)/(all ways) Try this: What is the **probability** that I will get a 6 when I roll a die? Complementary Events The complement of event “A” consists/
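The rule P(A) = (ways for A)/(all ways) and the complement idea can be illustrated with exact fractions. This is a minimal sketch assuming equally likely outcomes. The helper `classical_probability` is a made-up name for illustration.

```python
from fractions import Fraction


def classical_probability(event, sample_space) -> Fraction:
    """(ways for A) / (all ways), assuming every outcome is equally likely."""
    outcomes = list(sample_space)
    favorable = [outcome for outcome in outcomes if outcome in event]
    return Fraction(len(favorable), len(outcomes))


die = range(1, 7)                          # faces of a fair six-sided die
p_six = classical_probability({6}, die)    # probability of rolling a 6
p_not_six = 1 - p_six                      # complement: anything but a 6
print(p_six, p_not_six)
```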

**STATISTICS** / sci fi 16 49 4 12 8 16 romance 74 193 11 51 45 43 humor 16 30 8 8 9 13 Conditions = categories, sample = modal verbs 1. # from nltk.corpus import brown 2. # from nltk.**probability** import ConditionalFreqDist 3. >>> cat = ['news', 'religion', 'hobbies', 'science_fiction', 'romance', 'humor'] 4. >>> mod = ['can', 'could',/Prof. Howard, Tulane University 12 Another example The task is to find the frequency of America **and** citizen in NLTK's corpus of presidential inaugural addresses: 1. >>> from nltk.corpus import inaugural 2. /

distribution. We make these **probability** judgments using a sampling distribution. What is a Sampling Distribution? Hypothetical: A frequency distribution of sample **statistics** from an infinite number of samples. Imagining a Sampling Distribution 1. Take a random sample. 2. Compute the mean. 3. Take another random sample **and** compute the mean. 4/

Answer: Estimates of a slope (b) have a sampling distribution, like any other **statistic** – It is the distribution of every value of the slope, based on all /then the sampling distribution would center at zero – Since the sampling distribution is a **probability** distribution, we can identify the likely values of b if the population slope is/ Assumptions Normality: Examine sub-samples at different values of X. Make histograms **and** check for normality. Good Not very good Bivariate Regression Assumptions 4. The /

© 2001 Prentice-Hall, Inc. **Statistics** for Business **and** Economics Simple Linear Regression Chapter 10 Learning Objectives / Prediction & Estimation Regression Modeling Steps 1. Hypothesize Deterministic Component 2. Estimate Unknown Model Parameters 3. Specify **Probability** Distribution of Random Error Term (Estimate Standard Deviation of Error) 4. Evaluate Model 5. Use Model for Prediction & Estimation/

0.002 0.000 P( x ) x Table A-1 Binomial **Probability** Distribution For n = 15 **and** p = 0.10 Method 2 Chapter 4. Section 4-3. Triola, Elementary **Statistics**, Eighth Edition. Copyright 2001. Addison Wesley Longman Example: Using Table A/is limited because a table may not be available for every n **and**/or p. Method 2 – Using a table **Probabilities** with “Exact” successes Press 2nd, VARS (DISTR). Select /

known as discrimination, has been well studied **and** quantified for binary outcomes using measures such as the estimated area under the Receiver Operating Characteristics (ROC) curve (AUC), which is also referred to as a “C-**statistic**” (Uno 2011) data Admissions; length /non- event % of time model discriminated correctly Result c 0.72 There is a 0.72 **probability** of the model assigning a higher predicted **probability** to a randomly selected event case, compared with a randomly selected non-event case. 0.50 /
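The interpretation of the C-statistic given above (the probability that a randomly selected event case receives a higher predicted probability than a randomly selected non-event case) can be computed directly by comparing all pairs. The sketch below assumes ties count as half a concordant pair, a common convention the snippet does not state. The data values are invented for illustration.

```python
def c_statistic(event_scores, non_event_scores) -> float:
    """Share of (event, non-event) pairs in which the event case scores higher."""
    concordant = 0.0
    for event_score in event_scores:
        for non_event_score in non_event_scores:
            if event_score > non_event_score:
                concordant += 1.0
            elif event_score == non_event_score:
                concordant += 0.5  # assumption: a tie counts as half
    return concordant / (len(event_scores) * len(non_event_scores))


# Hypothetical predicted probabilities, not taken from the source data.
events = [0.9, 0.7, 0.4]
non_events = [0.8, 0.3, 0.2, 0.1]
print(round(c_statistic(events, non_events), 3))
```

A value of 0.50 corresponds to a model that ranks the pairs no better than chance.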

Distributions Sampling Distribution Is the Theoretical **probability** distribution of a sample **statistic** A sample **statistic** is a random variable: E.g/ possess characteristic Sampling Distribution of a Sample Proportion Approximated by normal distribution if: **and** Mean of samples: Standard error of proportion : p = /

Shi, **and** Dongsong Zhang Presenter: 黃烱育 (M97G0217, master's program in Computer Science and Information Engineering, class 1-A), 2015/11/23 Outline Introduction **Statistical** Language Models Data Sets Discussion Conclusion Introduction There is a growing need to develop effective ways to detect online deception. Developing SLMs does not require an explicit feature selection process. **Statistical** Language Models (1) n-gram models Predicting the next word by the **probability** function/

For use with Classroom Response Systems Introductory **Statistics**: Exploring the World through Data, 1e by Gould **and** Ryan Chapter 9: Inferring Population Means Slide 9 - 1 If the conditions fail to be met for a hypothesis test, the z-**statistic** will not follow a Normal distribution when/2013 Pearson Education, Inc. Response Counter True or False A sampling distribution is a **probability** distribution of a **statistic**. A. True B. False Slide 9 - 7 © 2013 Pearson Education, Inc. Response Counter True or False When a/

To Tell If the Difference Is **Statistically** Significant We want to do a **statistical** test to calculate the **probability** that one model fits better than another 9 Using An F-Test To Tell If the Difference Is **Statistically** Significant 10 Method: Compute F inverse/ does not mean rate equation is correct. The quality of kinetic data vary with the equipment used **and** the method of temperature measurement **and** control. Data taken on one apparatus is often not directly comparable to data taken on different apparatus./

Beginning of the chapter Simple disease risk **statistics** 16 GENETICS Disease risk **statistics** GENETICS How do we know the risk? Concept: Odds Ratio The risk of developing a disease due to a genetic variation Example: G = /0.6OR 1.4 3 x 0.36 x 0.6 = 0.64 OR Disease **probability** 13.46% 3 x 1.4 = 4.2 OR Disease **probability** 60.82% Disease **probability** average: 20% GENETICS Summary: The genetic risk is stable **and** unchangeable. Different analyses of the same genes should yield the same OR, even when conducted/

**Probability** Distribution **Probability** Distribution 1.A **probability** distribution is a listing of all the possible outcomes of an experiment along with the relative frequency/**probability** of each outcome. 2.**Probability** distribution play a major role in the use of inferential **statistics**. 8.0 **Probability** Distribution Discrete **Probability**/: 8.0 **Probability** Distribution (x i )Freq 025 1185 2137 398 445 510 a) Develop a **probability** distribution for this data b) Calculate the mean, variance **and** standard deviation. /

N is even, so is m. A Fundamental Assumption is that successive steps are **statistically** independent Let p ≡ the **probability** of stepping to the right **and** q = 1 – p ≡ the **probability** of stepping to the left. Since each step is **statistically** independent, the **probability** of a given sequence of n1 steps to the right followed by n2 steps to the left is given by multiplying the/
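The multiplication rule described above gives p^n1 · q^n2 for one particular ordering of steps. The sketch below computes that, and also multiplies by the number of orderings to get the probability of n1 right steps in any order. That second function is the standard binomial extension and lies beyond what the truncated snippet shows. Function names are illustrative.

```python
import math


def sequence_probability(p: float, n1: int, n2: int) -> float:
    """Probability of one specific sequence with n1 right and n2 left steps."""
    q = 1 - p
    return p ** n1 * q ** n2


def walk_probability(p: float, n1: int, n2: int) -> float:
    """Probability of n1 right steps out of n1 + n2, in any order (assumed extension)."""
    orderings = math.comb(n1 + n2, n1)
    return orderings * sequence_probability(p, n1, n2)


print(sequence_probability(0.5, 2, 1), walk_probability(0.5, 2, 1))
```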

Group Sampling Distribution of the Proportion When the sample **statistic** is generated by a count not a measurement, the proportion of successes in a sample of n trials is p, where Shape: Whenever both n p **and** n(1 – p) are greater than or equal/ claims that 55% of registered voters favor the candidate over her strongest opponent. Assuming that this claim is true, what is the **probability** that in a simple random sample of 300 voters, at least 60% would favor the candidate over her strongest opponent? p = 0/
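The voter question above can be worked through with the normal approximation the snippet describes. This is a sketch under that approximation only, using the standard error sqrt(p(1 - p)/n). The function names are illustrative.

```python
import math


def upper_tail_normal(z: float) -> float:
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def prob_sample_proportion_at_least(p: float, n: int, threshold: float) -> float:
    """Normal approximation to P(sample proportion >= threshold)."""
    standard_error = math.sqrt(p * (1 - p) / n)
    z = (threshold - p) / standard_error
    return upper_tail_normal(z)


# Claimed proportion 0.55, sample of 300 voters, at least 60% in favor.
print(round(prob_sample_proportion_at_least(0.55, 300, 0.60), 4))
```

With these inputs n·p = 165 and n·(1 - p) = 135, so the shape condition mentioned in the snippet holds comfortably.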

REVIEW OF BASICS PART II **Probability** Distributions Confidence Intervals **Statistical** Significance **Probability** Distributions **Probabilities** are relative frequencies **Probabilities** vary between 0 **and** 1 MEAN +1SD+2SD-1SD-2SD 68% 95% +3SD-3SD 99% z - SCORES z-score: standard score measuring in units of standard deviations standard normal distribution: normal distribution in z-/

**Statistics**! What are the chances? Weather service collects precipitation data around the country Burlington Data (values in inches) How can you describe? TAKE MEAN = AVERAGE(##:##) 29.5 1.8 TAKE MEDIAN = MEDIAN(##:##) 26.3 1.9 STANDARD DEVIATION = STDEV(##:##) 4.1 0.9 % STANDARD DEVIATION = STDEV/MEAN*100 14% 47% MAXIMUM = MAX(##:##) 2.7 34.1 MINIMUM = MIN(##:##) 0.9 25.8 **Probability**/ on **probability** paper Extreme values +1 84% 50th percentile, median **and** mean if distribution is “normal” -1 1 Standard Deviation /

Value Sum 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Outcome Count Histogram 8 Event **Probability** Histogram 9 **Probabilities** of a Loaded Die 10 OutcomeProbability {1}1/8 {2}1/16 {3}1/8 {4}1/16 {5}1/1281/2561/16 Total1/81/161/81/161/81/163/81/161 **Probabilities** from Rolling Two Loaded Eight-sided Dice Event **Probability** Histogram 12 References 13 Sources: Foundations of **Statistical** Natural Language Processing, by Christopher Manning **and** Hinrich Schütze The MIT Press Discrete Mathematics with Applications, by /

a t-Test? **Statistical** test used in hypothesis testing –Example: Comparing Group A to Group B Used to determine if the difference between 2 mean values is significantly different or just difference due to random chance –Example: Compare the mean between Per. 2 **and** Per. 3 / value: 2.101 & Calculated t-value:.33; this calculated t-value is lower than the critical t-value 2c) What is the **probability** that the difference between the two groups is due to chance? 50% Let’s do # 6 from Part 2 6) When comparing the/

: aim for 18 hours spent by the end of this week Jan 30th Target Date for Descriptive **Statistics** Watch videos: 1.Picturing Distributions 2.Describing Distributions 3.Normal Distributions Quiz 1 NOT GRADED available starting /, 2nd, 3rd) **and** order a 3-item combo plate. How many different ways can this happen? Calculating **Probabilities** Counting rules (Permutation, Combination, Multiplication): Define sample space (# possible outcomes) **Probability** of a specific outcome: 1 sample space **Probability** of an event?/

atom in a 2d-plane T ¿ 4 Example: Epigallocatechine A. Fischer, Ch. Schütte, P. Deuflhard, **and** F. Cordes (2000) 5 Sampling Scheme q 1 T ¿ T ¿ T ¿ … q 2 q 3 q/A ( q ) ¼ ( q ) d q = C 1 [ ::: [ C N 10 ZIBgridfree M. Weber (2006) adaptive sampling (hierarchical) curse of dimensionality 11 Transition **Probabilities** © 1 ;:::; © N : ! [ 0 ; 1 ] N P i = 1 © i ( q ) = 1 ; 8 q 2 P ( i ; j/= 1 A ( q ( l ) k ) **statistical** weights w w > = w > P Learn more about transition matrices in the talk by Susanna Kube on /

range or standard deviation etc. If a sample of size n is being taken from the population, then the **statistic** is calculated for all the possible samples of size n. The **probability** distribution is then found. Example 1: A bag contains a large number of coins. 70% are 2p coins /arrangements) 5, 5, 5 P( sample contains three 2p coins) = 0.7 3 = P( sample contains two 2p coins **and** one 5p) =0.7 2 × 0.3 × 3 = P( sample contains one 2p **and** two 5p coins) = 0.7 × 0.3 2 × 3 = P( sample contains three 5p coins) =0.3 3/
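The procedure described above (compute the statistic for every possible sample of size n, then collect the probabilities) can be sketched by brute-force enumeration. The snippet leaves its results blank, so the code below fills in nothing from the source beyond the 70% / 30% split and the sample size of 3. The helper `sampling_distribution` is illustrative.

```python
from collections import defaultdict
from itertools import product


def sampling_distribution(value_probs: dict, n: int, statistic) -> dict:
    """Enumerate every ordered sample of size n and total the probability per statistic value."""
    distribution = defaultdict(float)
    for sample in product(value_probs, repeat=n):
        probability = 1.0
        for value in sample:
            probability *= value_probs[value]   # independent draws
        distribution[statistic(sample)] += probability
    return dict(distribution)


coins = {2: 0.7, 5: 0.3}   # 70% are 2p coins, the rest 5p coins
count_2p = sampling_distribution(coins, 3, lambda sample: sample.count(2))
for number_of_2p in sorted(count_2p, reverse=True):
    print(number_of_2p, round(count_2p[number_of_2p], 3))
```

Passing a different `statistic`, such as the sample mean or the median, gives the sampling distribution of that statistic from the same enumeration.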

AP **Statistics**: Chapter 8 Intro. You come to class totally unprepared for a quiz (imagine that!!!). The quiz consists of 10 multiple choice questions with 5 possible answers. Since you/question correct) = _______ How many questions would you expect to get correct? _______ Let the random variable X represent the number of questions you get correct **and** complete this **probability** distribution. To find the **probabilities**, let’s do a simulation: Each of you do 10 simulations. 0 1 2 3 4 5 6 7 8 9 10 We can also find /
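The classroom simulation suggested above (guessing on 10 questions with 5 choices each) can be run many times in code. This is a rough sketch with a fixed seed so the run is repeatable. The number of repetitions and the function name are choices made for illustration.

```python
import random


def simulate_quiz(num_questions: int = 10, num_choices: int = 5, rng=random) -> int:
    """Number of correct answers when every question is a blind guess."""
    # Treat choice 0 as the correct answer for each question.
    return sum(1 for _ in range(num_questions) if rng.randrange(num_choices) == 0)


rng = random.Random(0)                       # fixed seed for repeatability
scores = [simulate_quiz(rng=rng) for _ in range(10_000)]
average = sum(scores) / len(scores)
print(round(average, 2))                     # should land near 10 * (1/5) = 2
```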

an average of such events can be expected to occur. The **probability** of k occurrences of this event is For values of k = 0, 1, 2, … The mean **and** standard deviation of the Poisson random variable are Mean: Standard /would be very unusual (small **probability**) since x = 8 lies standard deviations above the mean. **STATISTIC** & INFORMATION THEORY (CSNB134) **PROBABILITY** DISTRIBUTIONS OF RANDOM VARIABLES (POISSON/
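The Poisson formulas did not survive in the snippet above. As a sketch, assuming the standard form P(k) = μ^k · e^(-μ) / k! with mean μ and standard deviation sqrt(μ), the calculation could look like this. The symbol `mu` and the function names are assumptions, since the slide's own notation is not visible.

```python
import math


def poisson_pmf(mu: float, k: int) -> float:
    """P(exactly k occurrences) when mu occurrences are expected on average."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def poisson_mean_sd(mu: float):
    """Mean and standard deviation of a Poisson random variable."""
    return mu, math.sqrt(mu)


mean, sd = poisson_mean_sd(2.0)              # illustrative average of 2 events
print(round(poisson_pmf(2.0, 8), 6), round((8 - mean) / sd, 2))
```

The second printed value is the number of standard deviations by which x = 8 lies above the mean for this illustrative average, which is the kind of check the snippet uses to call an outcome unusual.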

& Sons, Inc. The Box Plot (or Box-**and**-Whisker Plot) Chapter 3, Introduction to **Statistical** Quality Control, 7th Edition by Douglas C. Montgomery. Copyright (c) 2012 John Wiley & Sons, Inc. Comparative Box Plots **Probability** Distributions/

250 trials 350 trials **Probability**: Relative Frequency An estimate of the **probability** of an event happening can be obtained by looking back at experimental or **statistical** data to obtain relative frequency. ColourfreqRelative freq Red50 Blue80 Green30 White40 Silver130 Black20 956 /discs. Rebecca selects a disc at random from the bag, notes its colour, then replaces it. She does this 500 times **and** her results are recorded in the table below. Rebecca hands the bag to Peter who is going to select one disc from /

sources of bias –Scope of inference: random sampling vs. random assignment 9 What math teachers like 10 What **statistics** teachers like Using simulations to estimate **probabilities** 11 Finding common ground? Suppose that one percent of the women in a certain region have breast cancer. For/ Total 1000 10990 9 89 98 9/98 = 0.092 What did it for me? APES/Stats –Field study work in forests **and** streams –Playing the role of data consultant 15 The data are trying to tell us a story… Son #1: The firefighter –Call/
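The 9/98 = 0.092 figure in the table above is the share of positive test results that come from women who actually have the disease. A minimal sketch using only the counts visible in the snippet (9 true positives among 98 positives, so 89 false positives) follows. The function name is illustrative.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Share of positive results that are true positives."""
    return true_positives / (true_positives + false_positives)


# Counts from the snippet's table: 9 of the 98 positive results are true positives.
ppv = positive_predictive_value(true_positives=9, false_positives=89)
print(round(ppv, 3))
```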

have been obtained by random chance? Part of this comes from scientific intuition but another part comes from **statistics**. Types of **statistics** used in bioinformatics Yes-Likelihood methods No-ANOVA, regression analysis, hypothesis testing When one performs a sequence /are the sources of error for this approach? How to compute relevant **probabilities**? 1)Obtain all sequences of known DNA binders. Check for The particular aa sequence **and** compute its percentage. P(aa sequence/DNA binder)= # of protein with/

11th13 © Taylor & Francis 2014 Mean = 11.14286 Median = 13 Mode = 13 **and** 15 Range = 11 points Standard deviation = 3.42927 © Taylor & Francis 2014 The descriptive **statistics** give a sense of... ◦ central tendency ◦ dispersion The histogram gives a sense of... ◦ the/ 0.958, p-value = 0.7225 © Taylor & Francis 2014 The observed value of the Shapiro–Wilk **statistic** is: W = 0.958 The exact **probability** of the observed value, W = 0.958, is: p-value = 0.7225 © Taylor & Francis 2014 I’m reminding /

approximation Conclusion **Statistics** model A **statistical** model is a **probability** distribution constructed to enable inferences to be drawn or decisions made from data. Population sample Inference Make a decision : Hypothesis testing designer consumer We have to choose a **statistics** model for /negative integer values {0, 1, 2, 3,...}, **and** where these integers arise from counting rather than ranking. We tend to use fixed fractions of genes. The **probability** that reads appeared in this region The number of read/

reject the hypothesis that = 50. ... if in fact this were the population mean Level of Significance **Probability** Defines unlikely values of sample **statistic** if null hypothesis is true Called rejection region of sampling distribution Designated (alpha) Typical values are .01,/the average capacity of batteries at least 140 ampere-hours? A random sample of 20 batteries had a mean of 138.47 **and** a standard deviation of 2.66. Assume a normal distribution. Test at the .05 level of significance. One-Tailed t Test/
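The battery example above can be worked through as a one-tailed, one-sample t test. In this sketch the critical value 1.729 is the usual table value for 19 degrees of freedom at the .05 level and is hard-coded as an assumption rather than computed. The function name is illustrative.

```python
import math


def one_sample_t(sample_mean: float, hypothesized_mean: float,
                 sample_sd: float, n: int) -> float:
    """t statistic: distance from the hypothesized mean in standard-error units."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error


t_stat = one_sample_t(sample_mean=138.47, hypothesized_mean=140,
                      sample_sd=2.66, n=20)

T_CRITICAL = 1.729                 # assumed table value: df = 19, alpha = .05, one tail
reject_null = t_stat < -T_CRITICAL # lower-tail test of "at least 140"
print(round(t_stat, 3), reject_null)
```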

Algebraic **Statistics** for Computational Biology Lior Pachter **and** Bernd Sturmfels Ch.5: Parametric Inference R. Mihaescu Presentation: Aγγελίνα Βιδάλη, Algebraic & Geometric Algorithms in Molecular Biology /**probabilities** Viterbi algorithm problem of computing p_σ Tropicalization: u_ij = -log(p'_ij), v_ij = -log(p_ij). We can now efficiently find an explanation h_1,…,h_m for the observation σ_1,…,σ_n using the recursion: It is again the Forward algorithm. Pair Hidden Markov Model (pHMM) The algebraic **statistical**/

How many distinct 4-person teams can be chosen? Random sampling from finite population Example(cont.): The **probability** that students A,B,C,D are chosen to work on the project is 1/330. Suppose the group consists of 5 juniors **and** 6 seniors. How many samples of 4 have exactly 3 juniors? Think of selecting a sample as a 2/ = (# of samples with only good cars) = C(37,4)=66,045 Thus, (# of samples with ≥1 defective cars) = 91,390 – 66,045 = 25,345 Figure 4.4 (p. 157) **Probability** versus **statistical** inference.
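The counts quoted above can be reproduced with `math.comb`. The snippet gives 91,390 as the total number of samples without stating the fleet size, so the figure of 40 cars below is an inference from 91,390 = C(40, 4) and should be read as an assumption.

```python
import math

# 4-person teams from a group of 5 juniors and 6 seniors (11 people).
teams = math.comb(11, 4)
p_specific_team = 1 / teams                          # chance that A, B, C, D are chosen

# Samples of 4 with exactly 3 juniors: pick 3 of 5 juniors, then 1 of 6 seniors.
exactly_three_juniors = math.comb(5, 3) * math.comb(6, 1)

# Car example: 37 good cars; 40 cars in total is an assumed figure (see lead-in).
all_samples = math.comb(40, 4)
only_good = math.comb(37, 4)
at_least_one_defective = all_samples - only_good

print(teams, exactly_three_juniors, at_least_one_defective)
```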

of **statistics** typically used in bioinformatics Yes-Likelihood methods No-ANOVA, regression analysis, hypothesis testing When one performs a sequence comparison search one must ask what is the likelihood that one would obtain a match based on random chance. This depends on the sequence you are searching for **and** the amount of data within the database you are mining. Equally likely outcomes/

}1/8 {5}1/8 {6}1/8 {7}1/8 {8}1/8 Examples of Marginal **Probabilities** 8 Complementary **Probabilities** 9 Outcome {1} {2} {3} {4} {5} {6} {7} {8} Complementary Examples 10 References 11 Sources: Foundations of **Statistical** Natural Language Processing, by Christopher Manning **and** Hinrich Schütze The MIT Press Discrete Mathematics with Applications, by Susanna S. Epp Brooks/Cole, Cengage/

pnorm(q=0, mean=0, sd=1) gives the cumulative **probability** for the given value of x How to compute p value Z-test **statistic**: 2.5 pnorm(2.5, lower.tail=FALSE) *note: one-tail test Cumulative **probability** X qnorm qnorm(0.975) returns x that corresponds to the/two-tail). Cumulative **probability** X Tips 1 Handling vectors rnorm(10, mean=1:10, sd=1:10) rnorm(5, mean=c(1, 1, 2, 2, 2)) – # sampling from different distributions dnorm(0, mean=1:2) dnorm(c(0, 1), mean=1:2) – # similarly, qnorm **and** pnorm can handle /
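For readers without R, the `pnorm` and `qnorm` calls above have close counterparts in Python's standard library. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) and mirrors only the scalar calls. The vectorised `rnorm`/`dnorm` behaviour is not reproduced.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# pnorm(q=0, mean=0, sd=1): cumulative probability up to 0.
cumulative_at_zero = standard_normal.cdf(0)

# pnorm(2.5, lower.tail=FALSE): one-tailed p-value for a z statistic of 2.5.
upper_tail = 1 - standard_normal.cdf(2.5)

# qnorm(0.975): the x whose cumulative probability is 0.975.
quantile = standard_normal.inv_cdf(0.975)

print(cumulative_at_zero, round(upper_tail, 5), round(quantile, 3))
```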

1. Some Basics A random experiment Population **and** Sample point An event Mutually exclusive **and** equally likely Random variable: r.v. for short An example: coin tossing 2. Definition of **Probability** The classical definition The empirical definition Absolute frequency vs. relative frequency 2. Definition of **Probability** (continued) **Probability** distribution function (PDF) Joint **probability** Unconditional **probability** vs. conditional **probability** **Statistical** independence Independence vs. non-correlation 3/
