Descriptive Statistics


Presentation on theme: "Descriptive Statistics"— Presentation transcript:

1 Statistical Reasoning “He told me I was average. I told him he was mean.”

2 Descriptive Statistics
Used to organize and summarize data in a meaningful way.
Frequency distributions – Where are the majority of the scores? They organize raw scores, or data, so that the information makes sense at a glance: the scores are arranged in order of magnitude, along with the number of times each score occurs.
I always remember that descriptive statistics describe the group that was actually measured.

3 Histograms & Frequency Polygons (showing you data at a glance)
Ways of showing your frequency distribution data.
Histogram – graphically represents a frequency distribution as a bar chart whose vertical bars touch. Histograms are used when you have a continuous scale (for example, scores on a test go from 0-100, continuously getting larger). That’s why the bars touch: every score has to fall into some class, so you can’t have any “gaps.”
This is different from a simple bar graph, which is used when you have non-continuous classes (for example, which candidate do you support: Obama, Clinton, or Edwards?). You’d have a bar for each, with gaps in between, because you can’t fall between two candidates; you have to pick one.
We usually call frequency polygons line graphs.

4 Uses a Bar Graph to show data
Histogram – uses a bar graph to show data.
This is a bar graph. If you wanted it to be a histogram, you would make the bars equal in width and touching. You would also use a numerical scale instead of letter grades, indicating the cut-offs for A, B, C, and D.

5 Uses a line graph to show data
Frequency Polygon – uses a line graph to show data.
It graphically represents a frequency distribution by marking the frequency of each score category above the graph’s horizontal axis and connecting the marks with straight lines (a line graph).

6 Measures of Central Tendency
A single number that gives us information about the “center” of a frequency distribution.
Measures of central tendency – 3 types. Example scores: 4, 4, 3, 4, 5
1. Mode = most common = 4 (Reports what there is more of. Used in data with no connection; you can’t average men & women.)
2. Mean = arithmetic average = 20/5 = 4 (Has the most statistical value but is susceptible to the effects of extreme scores.)
3. Median = middle score = 4 (Half the scores are higher, half are lower. Used when there are extreme scores.)
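A quick way to check the three measures for the slide’s scores (4, 4, 3, 4, 5) is Python’s standard `statistics` module. This is only a sketch for checking the arithmetic, not part of the original slides.

```python
from statistics import mean, median, mode

# The five example scores from the slide.
scores = [4, 4, 3, 4, 5]

print("mode:", mode(scores))      # most common score
print("mean:", mean(scores))      # sum of scores divided by how many there are
print("median:", median(scores))  # middle score once the scores are sorted
```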

7 Central Tendency
An extremely high or low price/score can skew the mean. Sometimes the median is better at showing you the central tendency.
TOPPS Baseball Cards
Nolan Ryan $1500
Billy Williams $8
Luis Aparicio $5
Harmon Killebrew $5
Orlando Cepeda $3.50
Maury Wills $3.50
Jim Bunning $3
Tony Conigliaro $3
Tony Oliva $3
Lou Piniella $3
Mickey Lolich $2.50
Elston Howard $2.25
Jim Bouton $2
Rocky Colavito $2
Boog Powell $2
Luis Tiant $2
Tim McCarver $1.75
Tug McGraw $1.75
Joe Torre $1.50
Rusty Staub $1.25
Curt Flood $1
With Ryan: Median=$2.50 Mean=$74.14
Without Ryan: Median=$2.38 Mean=$2.85
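The card prices above can be used to see how much one extreme value moves the mean compared with the median. The sketch below simply re-enters the 21 listed prices; the variable names are made up for illustration.

```python
from statistics import mean, median

# Prices of the 21 cards listed on the slide; the first entry is Nolan Ryan.
with_ryan = [
    1500, 8, 5, 5, 3.50, 3.50, 3, 3, 3, 3, 2.50,
    2.25, 2, 2, 2, 2, 1.75, 1.75, 1.50, 1.25, 1,
]
without_ryan = with_ryan[1:]  # drop the single extreme price

for label, prices in (("With Ryan", with_ryan), ("Without Ryan", without_ryan)):
    print(label, "median:", median(prices), "mean:", round(mean(prices), 2))
```

Removing one card changes the mean by more than seventy dollars, while the median moves by only a few cents.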

8 Does the mean accurately portray the central tendency of incomes? NO!
Another good example that kids get easily is home values. We have lots of big $$$ homes in GH (out by the lake) that, when averaged in, skew the mean home price. That’s why you always hear realtors talk about the median home value.
What measure of central tendency would more accurately show income distribution? Median – the majority of the incomes surround that number.

9 Measures of Variability
Gives us a single number that presents us with information about how spread out scores are in a frequency distribution. (See example of why this is important.)
Range – the difference between the highest and lowest scores. Take the highest score and subtract the lowest score from it. (Can be skewed by an extreme score.)
Standard Deviation – How spread out is your data?
The larger this number is, the more spread out scores are from the mean. The larger the standard deviation, the flatter the curve.
The smaller this number is, the closer the scores stay to the mean. The smaller the standard deviation, the more peaked the curve.
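One way to see why a measure of variability matters is to compare two sets of scores that share the same mean. The two classes below are hypothetical numbers invented for this sketch, and `value_range` is just an illustrative helper name.

```python
from statistics import mean, pstdev

def value_range(scores):
    """Range: highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical test scores: both classes average 70.
class_a = [68, 70, 70, 72]    # tightly packed around the mean
class_b = [40, 60, 80, 100]   # widely spread around the mean

for name, scores in (("Class A", class_a), ("Class B", class_b)):
    print(name, "mean:", mean(scores),
          "range:", value_range(scores),
          "SD:", round(pstdev(scores), 2))
```

The means are identical, so only the range and standard deviation reveal how differently the two classes are spread out.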

10 Calculating Standard Deviation How spread out (consistent) is your data?
1. Calculate the mean.
2. Take each score and subtract the mean from it.
3. Square the new scores to make them positive.
4. Mean (average) the new scores. This average is the variance.
5. Take the square root of the mean to get back to your original measurement.
6. The smaller the number, the more closely packed the data. The larger the number, the more spread out it is.
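The steps above can be sketched directly in code. The function name `standard_deviation` is made up for this example, and it follows the slide in dividing by the number of scores (the population form); many textbooks divide by one less than the number of scores when working from a sample. The data are the four punt distances used in this deck.

```python
import math

def standard_deviation(scores):
    """Follow the slide's steps; returns (variance, standard deviation)."""
    mean = sum(scores) / len(scores)               # step 1: calculate the mean
    deviations = [x - mean for x in scores]        # step 2: subtract the mean
    squared = [d ** 2 for d in deviations]         # step 3: square to make positive
    variance = sum(squared) / len(squared)         # step 4: average the squares
    return variance, math.sqrt(variance)           # step 5: square root

punts = [36, 38, 41, 45]  # punt distances in yards
variance, sd = standard_deviation(punts)
print("variance:", variance, "SD:", round(sd, 1))
```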

11 Numbers multiplied by itself & added together
Standard Deviation
Deviation Squared: numbers multiplied by themselves and added together.

Punt Distance   Deviation from Mean   Deviation Squared
36              36 – 40 = -4          16
38              38 – 40 = -2          4
41              41 – 40 = +1          1
45              45 – 40 = +5          25
Total: 160                            Total: 46

Mean: 160/4 = 40 yds
Variance: 46/4 = 11.5
Standard Deviation: √variance = √11.5 = 3.4 yds

12 Are these scores consistent? Is there a skew?
[Figure: three score distributions labeled Multiple Choice, Composite, and Essay, with the statistics Mean=9.3 SD=2.3; Mean=10.2 SD=2.0; Mean=34.3 SD=4.2.]
Are these scores consistent? Is there a skew?
I don’t understand this slide… probably your oral presentation clarifies it!

13 Which class did you perform better in compared to your classmates?
Z-Scores
A number expressed in standard deviation units that shows an individual score’s deviation from the mean. Basically, it shows how you did compared to everyone else. A + Z-score means you are above the mean; a – Z-score means you are below the mean.
Z-Score = (your score – the average score) / standard deviation
Which class did you perform better in compared to your classmates?

Test      Total   Your Score   Average Score   S.D.
Biology   200     168          160             4
Psych.    100     44           38              2

Ok – we do examples exactly like this!
Z score in Biology: 168 – 160 = 8, 8/4 = +2 Z score
Z score in Psych: 44 – 38 = 6, 6/2 = +3 Z score
You performed better in Psych compared to your classmates.
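The Biology and Psych comparison can be checked with a few lines of code. The helper name `z_score` is chosen for this sketch; the numbers come straight from the table on the slide.

```python
def z_score(score, average, sd):
    """How many standard deviations a score sits above (+) or below (-) the mean."""
    return (score - average) / sd

# (your score, average score, standard deviation) from the slide
biology = z_score(168, 160, 4)
psych = z_score(44, 38, 2)

print("Biology z:", biology)
print("Psych z:", psych)
print("Better relative to classmates:", "Psych" if psych > biology else "Biology")
```

Note that the raw percentage is higher in Biology (168 out of 200) than in Psych (44 out of 100), yet the z-score shows the Psych result stands further above the class average.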

14 9/14/2010 Photo courtesy of Judy Davidson, DNP, RN

15 Standard Normal Distribution Curve
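Characteristics of the normal curve: a bell-shaped curve where the mean, median, and mode are all the same and fall exactly in the middle.
[Figure: normal curve with the horizontal axis marked in standard deviations above and below the mean (-3, -2, -1, +1, +2, +3); example: Wechsler Intelligence Scores.]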

16 Skewed Curves
Skewed Distribution – when more scores pile up on one side of the distribution than the other.
Positively skewed means more people have low scores. Negatively skewed means more people have high scores.
Positive & negative refer to the direction of the “tail” of the curve; they do not mean “good” or “bad.”
We usually say skewed right (positive) or left (negative).
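The direction of the tail can also be read from a number. The sketch below computes one common skewness coefficient (the average cubed z-score), which is not mentioned on the slide and is used here only as an illustration; both data sets are hypothetical.

```python
from statistics import mean, pstdev

def skewness(scores):
    """Average cubed z-score: positive for a right tail, negative for a left tail."""
    m = mean(scores)
    sd = pstdev(scores)
    return sum(((x - m) / sd) ** 3 for x in scores) / len(scores)

# Hypothetical scores: most are low, with one long tail to the right.
low_pile = [1, 1, 2, 2, 2, 3, 3, 10]
# Mirror image: most are high, with one long tail to the left.
high_pile = [0, 7, 7, 8, 8, 8, 9, 9]

print("low scores pile up  ->", "positive" if skewness(low_pile) > 0 else "negative", "skew")
print("high scores pile up ->", "positive" if skewness(high_pile) > 0 else "negative", "skew")
```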

17 Inferential Statistics
Help us determine if our results are legit and can be generalized to the public.
Help to determine whether a study’s outcome is more than just chance events.
Used to predict things about a population based on a sample.
3 Principles of Inferential Statistics:
1. Non-biased sample – representative samples are better than biased samples for generalizing data.
2. Less variability is better – the average is better when it comes from scores of low variability.
3. More cases are better than fewer – averages based on many cases are more reliable.
Usually you would use inferential statistics to try to predict things about a population based on a sample. For example, we surveyed 50 staff members in the district about their level of education and are trying to use that to predict the average level of education for all staff in the district.
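The third principle, that averages based on many cases are more reliable, can be illustrated with a small simulation. Everything here is assumed for the sketch: the population of 5,000 staff members, their years of education, and the helper name `spread_of_sample_means` are all invented, and a fixed random seed keeps the run repeatable.

```python
import random
from statistics import mean, pstdev

def spread_of_sample_means(population, sample_size, trials, rng):
    """Draw many samples and report how much their averages vary."""
    means = [mean(rng.sample(population, sample_size)) for _ in range(trials)]
    return pstdev(means)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical population: years of education for 5,000 staff members.
population = [rng.gauss(16, 2) for _ in range(5000)]

small = spread_of_sample_means(population, 5, 500, rng)    # surveys of 5 people
large = spread_of_sample_means(population, 50, 500, rng)   # surveys of 50 people

print("spread of averages, n=5: ", round(small, 2))
print("spread of averages, n=50:", round(large, 2))
```

The averages from the larger surveys cluster much more tightly around the true population average, which is why a survey of 50 is a safer basis for prediction than a survey of 5.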

18 Statistically Significant
The possibility that the differences in results between the experimental and control groups could have occurred by chance is no more than 5 percent.
Must be at least 95% certain the differences between the groups are due to the independent variable.

19 Statistical Significance
p value = the probability of getting a result at least this extreme by chance alone. In other words, are the results statistically significant? If the answer is yes, then they are more likely to generalize to a larger population.
Researchers want this number to be as small as possible to show that any change in their experiment was caused by an independent variable and not some outside force.
Results are considered statistically significant if the probability of obtaining them by chance alone is no more than .05, or a p value of 5% (p ≤ .05). The researcher must be 95% certain the results are not caused by chance.
Replication of the experiment helps show whether the result holds up.
Effect Size – measures the magnitude, or size, of an effect between two groups.
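One concrete way to picture “the probability of obtaining it by chance alone” is to reshuffle the scores between the two groups in every possible way and count how often chance produces a difference at least as large as the one observed. This is a one-sided exact permutation test, a technique the slides do not name; it is swapped in here purely as an illustration, and the scores and the function name `permutation_p_value` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Share of all possible regroupings whose mean difference is at least the observed one."""
    observed = mean(group_a) - mean(group_b)
    pooled = group_a + group_b
    indices = range(len(pooled))
    at_least_as_large = 0
    total = 0
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if mean(a) - mean(b) >= observed - 1e-12:
            at_least_as_large += 1
    return at_least_as_large / total

experimental = [12, 14, 15, 16, 18]  # hypothetical scores
control = [8, 9, 10, 11, 13]         # hypothetical scores

p = permutation_p_value(experimental, control)
print("p =", round(p, 4), "| significant at .05:", p <= 0.05)
```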

20 Check out P Values made simple for more help.
This means the chance that random variation, rather than the independent variable, is responsible for our results. In other words… the chance you are willing to take that your study is wrong due to random chance.
It describes the percent of the population/area under the curve (in the tail) that is beyond our statistic.
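The “area under the curve in the tail” can be computed when the statistic is a z-score on a standard normal curve, which is an assumption made for this sketch. It uses `NormalDist` from Python’s standard `statistics` module; the helper name `upper_tail_area` is made up.

```python
from statistics import NormalDist

def upper_tail_area(z):
    """Fraction of the standard normal curve lying beyond z in the upper tail."""
    return 1 - NormalDist().cdf(z)

# A few z-scores and the share of the curve beyond each one.
for z in (1.0, 1.645, 2.0, 3.0):
    print("z =", z, "-> tail area:", round(upper_tail_area(z), 4))
```

Under that assumption, a z-score of about 1.645 leaves roughly 5% of the curve in the upper tail, which is where the one-tailed .05 cut-off comes from.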

