# Probability and Statistics

Chapter 1 Notes

I. Section 1-1
  A. Definition of Statistics
    1. Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.
      a. Data – information coming from observations, counts, measurements, or responses.
        1) There are two types of data sets.
          a) Population – the collection of all outcomes, responses, measurements, or counts that are of interest.
            1. In other words, the set of all possible measurements, counts, or observations that are of interest in a particular study.

I. Section 1-1
          b) Sample – a subset of the population.
            1. Since it is usually impractical or even impossible, in terms of time or money, to obtain every possible response, we must often rely on information obtained from a sample.
            2. Random sample – a sample in which every member of the population has an equal chance of being selected.
        2) A central theme in the study of statistics is using information obtained from a sample to make decisions or inferences about the entire population from which the sample was drawn.
          a) We will study techniques that enable us to do this with a high level of reliability.

I. Section 1-1
        3) There are two types of numerical descriptions.
          a) Parameter – a numerical description of a population characteristic.
          b) Statistic – a numerical description of a sample characteristic.
  B. Branches of Statistics
    1. Descriptive Statistics
      a. The branch of statistics that involves the organization, summarization, and display of data.
    2. Inferential Statistics
      a. The branch of statistics that involves using a sample to draw conclusions about a population.
        1) A basic tool in the study of inferential statistics is probability.
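
To make the parameter/statistic distinction concrete, here is a minimal Python sketch. The population of ages is made-up data, and the function names are illustrative assumptions, not something taken from the notes.

```python
import random
import statistics

def population_parameter(population):
    """Mean of the whole population: a parameter."""
    return statistics.mean(population)

def sample_statistic(population, sample_size, seed=0):
    """Mean of a random sample drawn without replacement: a statistic."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)

# Hypothetical population: ages of 1,000 people (made-up data).
rng = random.Random(42)
ages = [rng.randint(18, 90) for _ in range(1000)]

mu = population_parameter(ages)       # describes the population
x_bar = sample_statistic(ages, 50)    # describes one sample of 50
print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The two printed values will usually be close but not equal. Inferential statistics is about reasoning from the second value to the first.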

II. Section 1-2
  A. Types of Data
    1. Qualitative Data
      a. Attributes, labels, or nonnumerical entries.
    2. Quantitative Data
      a. Numerical measurements or counts.
  B. Levels of Measurement
    1. Nominal Data
      a. Consists of names, categories, qualities, or labels. Example: the type of car you drive.
      b. The data can be put into categories, but we cannot determine whether one piece of data is better or higher than another.
      c. When numbers are used as labels, such as on an athletic jersey, they are classified as nominal data.

II. Section 1-2
        1) It is of no use whatsoever to know the average of all jersey numbers of the King’s Fork field hockey team.
    2. Ordinal Data
      a. Designations or numerical rankings that can be arranged in ascending or descending order.
        1) TV ratings for the #1 show, #2 show, etc.
      b. We can compare rankings to see which is higher; however, it does not make sense to subtract one rank value from another.
        1) Differences in rankings are not meaningful computations.
          a) If there are three candidates for a job, they can be ranked 1, 2, and 3, but there is no way to tell how far ahead of the second candidate the first candidate is.

II. Section 1-2
    3. Interval Data
      a. Can be subtracted to find the difference between two values, put in order, and put into categories.
      b. The data are numerical; 0 can be used to indicate a position in time or space; however, the zero at this level does not correspond to “none” of the variable being measured.
        1) A reading of zero degrees on a thermometer does not indicate that there is absolutely no heat present.
      c. Differences between data values are meaningful, but it does not make sense to describe one data value as being twice (or any multiple of) another.
        1) A temperature of 2 degrees is not twice as warm as a temperature of 1 degree.

II. Section 1-2
    4. Ratio Data
      a. The highest level of measurement.
        1) Example: the number of gallons of gasoline you put into your car today.
      b. There is a zero on this scale that is interpreted as “none” of the variable in question.
        1) It is possible to put zero gallons of gas into your tank today.
        2) This is called an “inherent” zero.
      c. It is meaningful to say one measure is two times, or three times, as much as another.
        1) You may have put twice as much gas in your car today as you did last week.

II. Section 1-2
    5. How to tell interval data from ratio data
      a. Does the expression “twice as much” have any meaning in the context of the data?
        1) \$2 is twice as much as \$1, so these data points are at the ratio level.
        2) A temperature of 2 degrees is NOT twice as warm as a temperature of 1 degree, so these data points are at the interval level.
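
One informal way to test whether “twice as much” means anything is to ask whether the ratio survives a change of units. The sketch below is an illustration, not part of the notes: it assumes the temperatures are in degrees Celsius and uses gallons as the ratio-level example, and the helper names are made up for this example.

```python
def celsius_to_fahrenheit(c):
    """Convert a Celsius reading to Fahrenheit (the zero point shifts)."""
    return c * 9 / 5 + 32

def gallons_to_liters(g):
    """Convert US gallons to liters (zero gallons is still zero liters)."""
    return g * 3.785411784

def ratio_survives_unit_change(a, b, convert):
    """True if a/b equals convert(a)/convert(b), up to rounding."""
    return abs(a / b - convert(a) / convert(b)) < 1e-9

# Ratio level: 2 gallons is twice 1 gallon in any unit of volume.
print(ratio_survives_unit_change(2, 1, gallons_to_liters))      # True
# Interval level: 2 °C vs 1 °C becomes 35.6 °F vs 33.8 °F, not a 2:1 ratio.
print(ratio_survives_unit_change(2, 1, celsius_to_fahrenheit))  # False
```

The ratio breaks for temperature because the conversion moves the zero point, which is another way of saying that the zero on that scale is not an inherent zero.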

III. Section 1-3
  A. Design of a Statistical Study
    1. Identify the variable(s) of interest (the focus) and the population of the study.
    2. Develop a detailed plan for collecting data.
    3. Collect the data.
    4. Describe the data using descriptive statistics techniques.
    5. Interpret the data and make decisions about the population using inferential statistics.
    6. Identify any possible errors.
  B. Data Collection
    1. Do an observational study
      a. Observe and measure characteristics of interest in part of a population, but do NOT change existing conditions.

III. Section 1-3
  B. Data Collection
    2. Do an experiment
      a. Apply a treatment to part of a population and observe the responses or results.
      b. Observe another part of the population as a control group.
        1) The control group may receive a placebo in place of the treatment being tested.

III. Section 1-3
  B. Data Collection
    3. Use a simulation
      a. Use a mathematical or physical model to reproduce the conditions of a situation or process.
        1) Simulations allow us to study situations that are impractical or even dangerous to create in real life.
          a) Testing the effects of alcohol on a pilot’s ability to fly is best done in a flight simulator.
        2) Simulations often save time and/or money.
    4. Use a survey (census)
      a. A survey is an investigation of one or more characteristics of a population.
        1) Usually carried out on people by asking them to respond to questions.

III. Section 1-3
  B. Data Collection
      b. It’s important to word the questions so that they do not lead to biased results.
  C. Experimental Design
    1. Experiments must be carefully designed in order to produce meaningful, unbiased results.
      a. The Hawthorne effect occurs in an experiment when subjects change their behavior simply because they know they are participating in an experiment.
    2. Three key elements of a well-designed experiment are control, randomization, and replication.

III. Section 1-3
  C. Experimental Design
      a. Control
        1) It is important to control as many influential factors as possible in a study.
        2) A confounding variable occurs when an experimenter cannot tell the difference between the effects of different factors in an experiment.
        3) The placebo effect occurs when a subject reacts favorably to a placebo when in fact they have been given no medical treatment at all.
          a) Blinding is a technique in which the subject does not know whether he or she is receiving a real treatment or a placebo.

III. Section 1-3
  C. Experimental Design
          b) In a double-blind experiment, neither the subjects nor the experimenter knows which individual subjects are receiving a treatment or a placebo.
            1. The experimenter only finds out which subjects are which after all the data have been collected.
      b. Randomization is the process of randomly assigning subjects to different treatment groups.
        1) Randomized block design – divide subjects with similar characteristics into blocks, and then randomly assign the subjects within each block to different treatment groups.
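
A randomized block design can be sketched in a few lines of Python. This is only one possible reading of the description above: the subject names, the age-group blocks, and the function name are all hypothetical, and the sketch splits each block as evenly as it can among the treatments.

```python
import random
from collections import defaultdict

def randomized_block_assignment(subjects, block_of, treatments, seed=None):
    """Assign treatments at random within each block of similar subjects."""
    rng = random.Random(seed)

    # Step 1: divide subjects into blocks by a shared characteristic.
    blocks = defaultdict(list)
    for subject in subjects:
        blocks[block_of(subject)].append(subject)

    # Step 2: shuffle each block, then deal its members out to the treatments.
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)
        for position, subject in enumerate(members):
            assignment[subject] = treatments[position % len(treatments)]
    return assignment

# Hypothetical subjects, blocked by age group (made-up data).
age_group = {
    "Ann": "under 40", "Bo": "under 40", "Cy": "under 40", "Di": "under 40",
    "Ed": "40 and over", "Flo": "40 and over", "Gus": "40 and over", "Hal": "40 and over",
}
groups = randomized_block_assignment(
    list(age_group), age_group.get, ["treatment", "placebo"], seed=1
)
for name, group in groups.items():
    print(f"{name:>3} ({age_group[name]}): {group}")
```

Because the shuffle happens inside each block, every age group ends up with the same number of treated and placebo subjects, so age cannot be confounded with the treatment.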

III. Section 1-3
  C. Experimental Design
        2) Matched-pairs design – subjects are paired up according to a similarity.
          a) One subject in each pair is randomly selected to receive one treatment, while the other receives a different treatment.
      c. Replication is the repetition of an experiment using a large group of subjects.
        1) The larger the sample size, the better.
  D. Sampling Techniques
    1. Census – a count or measure of an entire population.
      a. Provides complete information, but is often too costly or difficult to perform.

III. Section 1-3
  D. Sampling Techniques
    2. Sampling – a count or measure of part of a population.
      a. The researcher must ensure that the sample is representative of the population.
        1) This is necessary to ensure that inferences about a population are valid.
          a) Sampling error – the difference between the results of a sample and those of the population.
      b. Random sample – a sample in which every member of the population has an equal chance of being selected.
        1) Methods of sampling randomly

III. Section 1-3
  D. Sampling Techniques
          a) Simple random sample – assign each member of the population a number and then randomly select the numbers of the members you will survey.
            1. Random number table (Appendix B of the book)
              a. Randomly pick a starting point.
              b. Count off digits in groups that match the number of digits in your population size.
              c. Record the numbers, ignoring those that are larger than the population size.
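
The random number table steps above can be sketched as follows. The digit row is made up rather than copied from Appendix B, the function name is hypothetical, and two details are assumptions not stated in the notes: labels start at 1 (so a group such as 00 is skipped) and repeated numbers are ignored.

```python
def read_random_number_table(digits, start, population_size, sample_size):
    """Read fixed-width groups of digits and keep those that are valid labels."""
    width = len(str(population_size))  # e.g. 85 members -> groups of 2 digits
    chosen = []
    pos = start
    while len(chosen) < sample_size and pos + width <= len(digits):
        number = int(digits[pos:pos + width])
        pos += width
        # Ignore numbers larger than the population size.
        # Assumption: also ignore 0 and any number already recorded.
        if 1 <= number <= population_size and number not in chosen:
            chosen.append(number)
    return chosen

row = "92630782401927534"  # made-up row of table digits
print(read_random_number_table(row, start=0, population_size=85, sample_size=5))
# → [63, 7, 82, 40, 19]   (92 is skipped because it exceeds 85)
```

If the row runs out before enough numbers are found, the function simply returns what it has. With a real table you would continue on the next row.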

III. Section 1-3
  D. Sampling Techniques
            2. Calculator
              a. Press MATH, select PRB, and press 5 (randInt).
              b. Enter the first number you assigned when labeling your population, then a comma, then the last number you assigned, then a comma, and then the sample size you wish to use.
                1) The calculator will generate the requested quantity of random numbers.
            3. If you do not want any member of the population to be included in the sample twice, the sampling process is said to be without replacement.
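
For readers without the calculator, Python's standard `random` module can play the same role. This is a rough analogue rather than a description of how the calculator works internally, and the two function names are made up for this sketch.

```python
import random

def rand_int(lower, upper, count, rng=random):
    """Draw `count` labels between lower and upper (inclusive); repeats are possible."""
    return [rng.randint(lower, upper) for _ in range(count)]

def rand_int_no_repeats(lower, upper, count, rng=random):
    """Draw `count` distinct labels, i.e. sampling without replacement."""
    return rng.sample(range(lower, upper + 1), count)

# Population labeled 1 through 30, sample of 5 members.
print(rand_int(1, 30, 5))             # the same label may appear twice
print(rand_int_no_repeats(1, 30, 5))  # every label is different
```

The second function is the safer choice when each member of the population should be surveyed at most once.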

III. Section 1-3
  D. Sampling Techniques
            4. If it does not matter whether a member of the population is included twice, the sampling process is said to be with replacement.
          b) Stratified Sample
            1. Separate the population into two or more subsets, called strata, using some shared characteristic.
              a. Randomly select members from each stratum to make up your sample.
          c) Cluster Sample
            1. When the population is already divided into subsets that are very similar to each other, you can randomly select a number of entire groups (not all the groups) and collect data on those groups.
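
The difference between stratified and cluster sampling is easier to see side by side. The sketch below uses a hypothetical school of 40 students, with grade levels as strata and homerooms as the pre-existing groups. All names and data are invented for illustration.

```python
import random
from collections import defaultdict

def group_by(members, key):
    """Split members into groups according to key(member)."""
    groups = defaultdict(list)
    for member in members:
        groups[key(member)].append(member)
    return groups

def stratified_sample(members, stratum_of, per_stratum, seed=None):
    """Randomly select some members from EVERY stratum."""
    rng = random.Random(seed)
    sample = []
    for stratum_members in group_by(members, stratum_of).values():
        sample.extend(rng.sample(stratum_members, per_stratum))
    return sample

def cluster_sample(members, cluster_of, n_clusters, seed=None):
    """Randomly select a few ENTIRE groups and keep all of their members."""
    rng = random.Random(seed)
    clusters = group_by(members, cluster_of)
    chosen = rng.sample(list(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical school: 4 grades, 2 homerooms per grade, 5 students per homeroom.
students = []
for grade in (9, 10, 11, 12):
    for room in ("A", "B"):
        for seat in range(5):
            students.append({"grade": grade, "homeroom": f"{grade}{room}", "seat": seat})

by_grade = stratified_sample(students, lambda s: s["grade"], per_stratum=3, seed=0)
by_room = cluster_sample(students, lambda s: s["homeroom"], n_clusters=2, seed=0)
print(len(by_grade), "students, 3 from each grade")
print(len(by_room), "students from homerooms", sorted({s["homeroom"] for s in by_room}))
```

In the stratified sample every grade is represented by a few students, while the cluster sample contains every student from just two homerooms.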

III. Section 1-3
  D. Sampling Techniques
              a. We call these groups clusters.
          d) Systematic Sample
            1. Each member of the population is assigned a number.
              a. Put the members of the population in order somehow.
              b. Randomly select a starting point.
              c. Randomly select an interval.
              d. Survey every nth member of the population from your starting point, where n is the interval.
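
The systematic sampling steps can be sketched as below. The function name and roster are hypothetical. As a simplifying assumption, the interval is passed in by the caller (who may choose it at random), and the random starting point is restricted to the first interval so that the sample spans the whole list.

```python
import random

def systematic_sample(ordered_members, interval, seed=None):
    """Pick a random start, then take every interval-th member after it."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    if not ordered_members:
        return []
    rng = random.Random(seed)
    # Assumption: the starting point falls within the first interval.
    start = rng.randrange(min(interval, len(ordered_members)))
    return ordered_members[start::interval]

roster = list(range(1, 101))  # members numbered 1 to 100, already in order
sample = systematic_sample(roster, interval=10, seed=3)
print(sample)  # ten members, each 10 positions apart
```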

III. Section 1-3
  D. Sampling Techniques
          e) Convenience Sample
            1. NOT RECOMMENDED!
              a. Simply select those members of the population who are readily available.

QUIZ on Chapter 1, Sections 1 and 2, during the next class block: Friday (ODD) and Monday (EVEN).
TEST on Chapter 1 next week: Tuesday (ODD) and Wednesday (EVEN).