 # What is statistics? STATISTICS BOOT CAMP Study of the collection, organization, analysis, and interpretation of data Help us see what the unaided eye misses.


STATISTICS BOOT CAMP

What is statistics?

- Study of the collection, organization, analysis, and interpretation of data
- Helps us see what the unaided eye misses

TWO TYPES

- Descriptive statistics:
  - Describe data
  - Help us organize bits of data into meaningful patterns and summaries
  - Tell us only about the sample we studied
- Inferential statistics:
  - Allow us to determine whether our findings can be applied to the larger population from which the sample was selected

DESCRIPTIVE STATISTICS

FREQUENCY DISTRIBUTION

- Example question: If you could have any animal in the world for a pet, what would it be?
- Definition: an arrangement of data from high to low, indicating the frequency of each piece of data
- Frequency polygon: a frequency distribution illustrated as a line graph
- Histogram: a frequency distribution illustrated as a bar graph
- Note: frequency is always on the Y (vertical) axis
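A frequency distribution like the one described on this slide can be tallied in a few lines. This is a minimal sketch; the survey answers are made up for illustration and are not from the presentation.

```python
from collections import Counter

# Hypothetical answers to the pet question (invented data).
answers = ["dog", "cat", "dog", "tiger", "dog", "cat", "panda"]

# Count each answer, then order from most to least frequent.
frequency = Counter(answers).most_common()

for animal, count in frequency:
    # Text version of a histogram: bar length equals the frequency.
    print(f"{animal:<6} {count} {'#' * count}")
```

In a plotted histogram the counts would go on the Y axis; here they are the bar lengths.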

MEASURES OF CENTRAL TENDENCY

- Definition: a single score that represents a whole set of scores
- Attempts to mark the center of a distribution
- Three types: mean, median, mode
- Mean: numerical average of a set of scores
  - Most commonly reported
- Median: halfway mark in the data set; half of the scores are above and half are below
  - Write down the numbers in ascending or descending order and find the halfway point; if there is an even number of scores, take the average of the middle two
  - Why would we ever look at this? Extreme scores can drastically affect the mean but barely move the median
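The median recipe above (sort, find the halfway point, average the middle two when the count is even) can be sketched as follows. The scores are hypothetical, and `median_by_hand` is an illustrative helper name, not something from the slides.

```python
import statistics

def median_by_hand(values):
    """Sort the scores and return the halfway point."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: the middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the middle two

# Hypothetical scores with one extreme value.
scores = [2, 4, 6, 8, 10, 60]

mean_score = statistics.mean(scores)   # pulled upward by the 60
median_score = median_by_hand(scores)  # stays near the bulk of the scores

print(mean_score, median_score)
```

The mean comes out at 15 while the median is 7, which is why the median is worth checking when extreme scores are present.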

MCT CONT.

- Mode: simplest measure; the score that occurs most frequently
- When is this used? Depends on the research question
  - 72% of Americans report having 0-1 drinks of alcohol per week; if the government puts a tax on alcohol, it won't affect most Americans
- Bimodal (two modes): better to use the mode over the mean/median in this case
  - Mean onset age for an eating disorder is 17
  - Two modes: a peak around 14 and a peak around 18
  - An intervention program would be better suited for ages 14 and 18 than for 17
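To illustrate the bimodal case, the sketch below uses invented onset ages chosen so that the mean is 17 while the peaks sit at 14 and 18. The data are an assumption for illustration only.

```python
import statistics

# Hypothetical onset ages (invented), with peaks at 14 and 18.
ages = [13, 14, 14, 14, 15, 16, 17, 18, 18, 18, 19, 21, 24]

mean_age = statistics.mean(ages)   # a single "center" that few cases actually match
modes = statistics.multimode(ages)  # every value tied for most frequent

print(mean_age, modes)
```

Here the mean is 17, yet only one person in the list is actually 17; the two modes describe where the cases cluster.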

OUTLIERS

- The mean is the most commonly used measure of central tendency, but it can be biased by a few scores (extreme scores, outliers)
- Examples:
  - Bill Gates walks into a coffee shop. The average income of all patrons soars. The median income remains essentially unchanged.
  - Republicans use the average income to discuss income growth; Democrats refer to the median
  - 19 of your 20 friends have a car valued at \$12,000, but the other has a car valued at \$120,000
    - Mean is \$17,400
    - Not the best measure; the median (\$12,000) is better
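The car example can be checked directly. This short sketch uses only the figures given on the slide.

```python
import statistics

# 19 cars valued at 12,000 and one valued at 120,000.
car_values = [12_000] * 19 + [120_000]

mean_value = statistics.mean(car_values)      # dragged up by the single outlier
median_value = statistics.median(car_values)  # unaffected by the outlier

print(mean_value, median_value)
```

The mean is 17,400, a value no car in the group comes close to, while the median stays at 12,000.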

MEASURES OF VARIABILITY

- Attempt to depict the diversity of a distribution of scores
- Show us how clustered our scores are around the mean
- We can be more confident in our data if there is less variability
- Example: a basketball player who averages 15 pts a game
  - Are you more confident if their scores ranged between 13 and 17 pts in the first 10 games, or between 5 and 25 pts?
- Range: gap between the highest and lowest score
  - Subtract the low score from the high score
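The basketball comparison can be sketched with two invented ten-game logs that share the same 15-point average but differ in spread. The per-game numbers are hypothetical.

```python
def score_range(values):
    """Range: subtract the low score from the high score."""
    return max(values) - min(values)

# Two hypothetical players, both averaging 15 points over 10 games.
steady = [13, 14, 15, 15, 16, 17, 15, 14, 16, 15]
streaky = [5, 25, 10, 20, 15, 8, 22, 12, 18, 15]

steady_range = score_range(steady)
streaky_range = score_range(streaky)

print(sum(steady) / 10, steady_range)
print(sum(streaky) / 10, streaky_range)
```

Both averages are 15, but the ranges are 4 and 20, so the average alone hides how predictable each player is.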

MEASURES OF VARIABILITY: STANDARD DEVIATION

- Standard deviation (SD): a measure of how tightly clustered a group of scores is around their mean
- Calculated by taking the square root of the variance
- Both the SD and the variance relate the average distance of any score in the distribution to the mean
- The higher the variance and SD, the more spread out the distribution
- The smaller the SD, the more clustered the scores are around the mean

STANDARD DEVIATION EXAMPLE

- How much do employees at small businesses make?
  - 40,000
  - 45,000
  - 47,000
  - 52,000
  - 350,000
- Mean = 106,800
- Standard deviation = 136,021; the typical distance between a score and the mean is roughly 136,021
- Discard the extreme score, and the SD is now 4,966.55
- The distribution of the first four is tightly clustered; the distribution of all five is spread out
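The salary figures can be reproduced step by step. The sketch below assumes the sample formula (dividing by n - 1), since that is the version that matches the numbers on the slide; dividing by n instead would give smaller values.

```python
import math
import statistics

salaries = [40_000, 45_000, 47_000, 52_000, 350_000]

def sample_sd(values):
    """Square root of the sample variance (sum of squared deviations / (n - 1))."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    variance = sum(squared_deviations) / (len(values) - 1)
    return math.sqrt(variance)

sd_all = sample_sd(salaries)                  # all five salaries
sd_without_outlier = sample_sd(salaries[:4])  # extreme score discarded

print(round(sd_all), round(sd_without_outlier))
```

With all five salaries the SD is about 136,021; without the 350,000 salary it drops to about 4,967.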

NORMAL DISTRIBUTION/BELL CURVE

- Shows how scores on many traits are distributed in nature
- Example: height of humans
- Symmetrical; mean, median, and mode are all in the center
- About 68% of all scores fall within one standard deviation of the mean; about 95% within two SD
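The 68% and 95% figures can be checked against an idealized normal curve. The height parameters below (mean 170 cm, SD 10 cm) are assumptions picked for illustration, and `share_within` is a hypothetical helper name.

```python
from statistics import NormalDist

# Hypothetical height distribution: mean 170 cm, SD 10 cm.
heights = NormalDist(mu=170, sigma=10)

def share_within(dist, k):
    """Fraction of the distribution lying within k SDs of the mean."""
    upper = dist.cdf(dist.mean + k * dist.stdev)
    lower = dist.cdf(dist.mean - k * dist.stdev)
    return upper - lower

within_one = share_within(heights, 1)
within_two = share_within(heights, 2)

print(f"{within_one:.1%} within 1 SD, {within_two:.1%} within 2 SD")
```

The shares do not depend on the chosen mean or SD; any normal distribution gives roughly 68% and 95%.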

Z-SCORES

- Used to compare scores from different distributions
- Scores from the different distributions can be converted into z scores. Z scores measure the distance of a score from the mean in units of standard deviation
  - Scores below the mean have negative z scores
  - Scores above the mean have positive z scores
- Amy scored a 72 on a test with a mean of 80 and an SD of 8; her z score is -1
- Clarence scored an 84 on the same test; his z score is +0.5
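The two z scores on this slide follow from one formula: subtract the mean from the score and divide by the SD. A minimal sketch using the slide's own numbers:

```python
def z_score(score, mean, sd):
    """Distance of a score from the mean, in units of standard deviation."""
    return (score - mean) / sd

amy = z_score(72, mean=80, sd=8)       # below the mean, so negative
clarence = z_score(84, mean=80, sd=8)  # above the mean, so positive

print(amy, clarence)
```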

INFERENTIAL STATISTICS

INFERENTIAL STATISTICS

- Allow us to draw inferences from our data
- Sometimes sets of data differ because of chance, not because of a real difference
- When differences between data are statistically significant, the observed difference is probably not due to chance variation between the groups
- A result is considered statistically significant (SS) if the probability of it occurring by chance is less than 5%
  - p < .05
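The slides do not say which significance test to use. As one illustration of asking "could chance alone produce this difference?", the sketch below runs an exact permutation test on two small invented groups: it tries every way of splitting the pooled scores into two groups and counts how often the gap in means is at least as large as the observed one. The data and function names are assumptions.

```python
from itertools import combinations

# Hypothetical scores for two groups (invented data).
group_a = [88, 92, 94, 97]
group_b = [70, 72, 75, 78]

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(a, b):
    """Share of all regroupings whose mean gap is at least the observed gap."""
    pooled = a + b
    observed = abs(mean(a) - mean(b))
    extreme = 0
    total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        first = [pooled[i] for i in idx]
        second = [pooled[i] for i in range(len(pooled)) if i not in idx]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(first) - mean(second)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

p = permutation_p_value(group_a, group_b)
print(f"p = {p:.4f}", "significant" if p < 0.05 else "not significant")
```

Only 2 of the 70 possible splits produce a gap this large, so p is about 0.029, below the .05 cutoff.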

PERCENTILES

- Indicate the distance of a score from 0, expressed as the percentage of scores that fall below it
- Scoring at the 90th percentile means someone scored better than 90% of the people who took the test
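A percentile rank in the sense used on this slide (the percentage of test takers someone outscored) can be sketched as below. The scores are hypothetical, and other definitions exist that also count ties.

```python
def percentile_rank(score, all_scores):
    """Percentage of scores strictly below the given score."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)

# Hypothetical test: 100 people with scores 1 through 100.
scores = list(range(1, 101))

rank = percentile_rank(91, scores)
print(rank)
```

A score of 91 beats 90 of the 100 test takers, placing it at the 90th percentile.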

