1 Research Methods in Psychology AS Descriptive Statistics
2 There was widespread panic today, with fans fainting, as top girl band ‘Central Tendency’ revealed their true values: mean, median and mode. Top? More like pretty average.
3 Hi! The name’s Ave Rage and I’m a pretty MEAN character. I get really MEAN because people add up all the facts about me and then divide them by the total number. I suppose that makes me an average guy most of the time
4 I’m MEAN and powerful because I make use of all the data. Those weaklings MEDIAN and MODE chuck most of it away, but I can only be used to measure data that:
1. Start from a true zero, e.g., physical quantities such as time, height, weight
2. Are on a scale of fixed units separated by equal intervals that allow us to make accurate comparisons, e.g., someone completing a memory task in 20 seconds did it twice as fast as someone taking 40 seconds
I’m also affected by extreme scores.
5 Extreme scores
Time in seconds to solve a puzzle: 135, 109, 95, 121, 140
Mean = 600 secs ÷ 5 participants = 120 secs
Add a 6th participant, who stares at it for 8 mins: 135, 109, 95, 121, 140, 480
Mean = 1080 secs ÷ 6 participants = 180 secs
Get out of here, kid! You’re taking too long. You’re about to wreck my experiment!
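The effect of the slow sixth participant can be checked in a few lines. This is only a small sketch using Python's built-in statistics module; the variable names are made up for illustration.

```python
from statistics import mean

# Puzzle-solving times in seconds, taken from the slide.
times = [135, 109, 95, 121, 140]
mean_without_outlier = mean(times)  # 600 / 5

# Add the sixth participant, who took 8 minutes (480 seconds).
times_with_outlier = times + [480]
mean_with_outlier = mean(times_with_outlier)  # 1080 / 6

print(mean_without_outlier)  # 120
print(mean_with_outlier)     # 180
```

One extreme score drags the mean up by a full minute, even though five of the six people finished in under two and a half minutes.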
6 Median
Middle value of scores arranged in rank order. Half the scores will lie above the median, half below it.
Unlike the MEAN, it can be used on ranked data, e.g., placing a group of people in order of position on a memory test rather than counting their actual score.
Unaffected by extremes, so we can use it on data with a skewed distribution, where results would be a bit one-sided – if we were to plot a graph, it would look like this:
7 Do you know where my middle is?
Odd number of scores: 2, 3, 5, 6, 7, 10, 14 median = 6 (middle value)
Even number of scores: 2, 3, 5, 6, 7, 10, 14, 15 median = (6 + 7) ÷ 2 = 6.5 (mean of the two middle values)
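The two cases can be sketched with Python's built-in statistics module, which sorts the scores and, for an even number of scores, averages the two middle values. The last two lines reuse the puzzle times from the extreme scores slide to show how little the median moves; the variable names are only illustrative.

```python
from statistics import median

odd_number_of_scores = [2, 3, 5, 6, 7, 10, 14]
even_number_of_scores = [2, 3, 5, 6, 7, 10, 14, 15]

print(median(odd_number_of_scores))   # 6   (the single middle value)
print(median(even_number_of_scores))  # 6.5 (mean of 6 and 7)

# Puzzle times in seconds, without and with the 480-second participant.
median_without_outlier = median([135, 109, 95, 121, 140])
median_with_outlier = median([135, 109, 95, 121, 140, 480])

print(median_without_outlier)  # 121
print(median_with_outlier)     # 128.0
```

The mean for the same puzzle data jumps from 120 to 180 seconds, while the median only shifts from 121 to 128.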
8 Disadvantages of the median
Does not work well on small data sets, e.g.,
1. 10, 12, 13, 14, 18, 19, 22, 22 median = 16
2. …, 12, 13, 14, 15, 19, 22, 22 median = 14.5
Not as powerful as the MEAN: we can only say one value is higher than another on ranked data
9 Mode
Most frequently occurring value in a data set, e.g., 2, 4, 6, 7, 7, 7, 10, 12 mode = 7
Unaffected by extremes, as we’re just looking at the most common value rather than its position
Can be used on basic data forming nominal categories – we could do a frequency count on these, e.g., number of people preferring vanilla, strawberry or chocolate ice-cream
Those mongrels are just sooooooo common
10 Disadvantages of the Mode
Small changes can make a big difference, e.g.,
1. 3, 6, 8, 9, 10, 10 mode = 10
2. 3, 3, 6, 8, 9, 10 mode = 3
Can be bi/multimodal, e.g., 3, 5, 8, 8, 10, 12, 16, 16, 16, 20
But the MODE doesn’t tell me much. What about the rest of the data? There’s this really interesting figure…
So we’re too common for you, now? You can calculate the MEDIAN and MEAN if you want to waste time…we’re off to have fun!
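Here is a rough sketch of the mode in Python, using the built-in statistics module. The first three data sets come from the slides; the ice-cream votes and the two-mode data set are hypothetical, invented purely for illustration.

```python
from statistics import mode, multimode

print(mode([2, 4, 6, 7, 7, 7, 10, 12]))  # 7

# Changing a single score moves the mode from one end of the data to the other.
mode_before = mode([3, 6, 8, 9, 10, 10])
mode_after = mode([3, 3, 6, 8, 9, 10])
print(mode_before)  # 10
print(mode_after)   # 3

# The mode also works on nominal categories (hypothetical votes).
ice_cream_votes = ["vanilla", "chocolate", "strawberry", "chocolate"]
print(mode(ice_cream_votes))  # chocolate

# Hypothetical data with two equally common values: multimode lists them all.
two_modes = multimode([3, 5, 8, 8, 10, 16, 16])
print(two_modes)  # [8, 16]
```

multimode is the safer choice when you suspect more than one mode, because mode on its own reports only one value.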
11 Measures of central tendency are always accompanied by a measure of dispersion
When using the …    Use …
Mean                Standard deviation
Median              Interquartile range or range
Mode                Range
12 Measures of Dispersion
Describe how spread out the values in a data set are
Standard Deviation
13 Range
The difference between the highest and lowest scores in a set of data
Quick to calculate
Gives us a basic measure of how much the data varies
Tells us nothing about data in the middle of a set of scores
Affected by outlying values
14 Interquartile Range
This measures the spread of the middle 50% of scores
Avoids extreme scores lying in the top 25% and bottom 25%
Still uses only half of the available data
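The range and the interquartile range can be compared on the puzzle times from the extreme scores slide. This is a sketch only: textbooks and software use slightly different rules for locating quartiles, so the exact interquartile figure depends on the method (here, the default of Python's statistics.quantiles). The variable names are illustrative.

```python
from statistics import quantiles

# Puzzle times in seconds, including the 480-second participant.
times = [135, 109, 95, 121, 140, 480]

# Range: highest score minus lowest score.
data_range = max(times) - min(times)  # 480 - 95

# Interquartile range: spread of the middle 50% of scores.
# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median) and Q3.
q1, q2, q3 = quantiles(times, n=4)
interquartile_range = q3 - q1

print(data_range)  # 385
print(interquartile_range)
```

The range is dominated by the single 480-second score, whereas the interquartile range looks only at the middle half of the ranked scores.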
15 Standard Deviation
Measures the variability of our data, i.e., how scores spread out in relation to the mean score
Allows us to make statements about probability – how likely or unlikely a given value is to occur
Most powerful measure of dispersion, as all the data is used
Data cannot be ranked or from categories
Data must form a normal distribution curve, as SD is affected by skewed data
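16 [Figure: normal distribution curve centred on the mean, with the horizontal axis marked from -3 SD to +3 SD. About 68% of scores lie within 1 SD of the mean, about 95% within 2 SD and about 99% within 3 SD.]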
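The percentages on the normal curve can be checked with Python's built-in NormalDist class. This is a small sketch assuming a perfectly normal distribution; the helper name proportion_within is made up for illustration.

```python
from statistics import NormalDist

# A standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of scores lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"Within {k} SD of the mean: {proportion_within(k):.1%}")
```

This is what lets us make statements about probability: a score more than 2 SD from the mean turns up only about 5% of the time.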
17 The AS syllabus doesn’t require you to work out SD, but you must know why we use it and what it means. However, previous students found this much easier to understand when they saw how it was calculated and how it related to data in a study, so stick with it if you can.
18 Formula for calculating standard deviation
$S = \sqrt{\dfrac{\sum d^2}{N - 1}}$
19 It’s really not so bad! You can do it!
S = the standard deviation we are trying to calculate
√ = square root
∑ = sum of – add up
d² = the squared deviation from the mean for each value
N − 1 = number of scores less one for error
I knew I should’ve done art
20 The easiest way to calculate SD is to put all your data into a very simple table…Come on, I don’t think you’re even trying, but it’s not hard.
Let’s look at some test scores on a reaction time task
22 Make a table like this:
Test Scores | Mean | Difference (d) | Difference Squared (d²)
23 Calculate the mean of the data set: total of all test scores ÷ 10 (number of participants) = 100
24 Test Scores | Mean | Difference (d) | Difference Squared (d²)
25 Find the difference between your results and the mean score to give you column d. Then square all the values of d to give you the next column, d². Add up all the figures in the d² column to give you the total sum for use in the formula.
26 Test Scores | Mean | Difference (d) | Difference Squared (d²)
∑d² = 900
27 You now have all the figures you need to place into the equation
900 is the sum of d²
10 is your number of participants
Subtract 1 from your number of participants to allow for errors in sampling method, then divide the top by the bottom number
Find the square root of this figure
This will give you your figure of standard deviation
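The table method can be sketched in Python, one column per step. The ten test scores below are hypothetical: they were chosen only so that the mean is 100 and the sum of d² is 900, matching the figures on the slides, and are not the original data.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical test scores (chosen to give mean = 100 and sum of d squared = 900).
scores = [80, 120, 95, 105, 95, 105, 100, 100, 100, 100]

mean_score = mean(scores)                         # 1000 / 10
d = [score - mean_score for score in scores]      # column d
d_squared = [difference ** 2 for difference in d] # column d squared
sum_d_squared = sum(d_squared)                    # total for the formula

n = len(scores)
sd = sqrt(sum_d_squared / (n - 1))                # sqrt(900 / 9)

print(mean_score)      # 100
print(sum_d_squared)   # 900
print(sd)              # 10.0
print(stdev(scores))   # the built-in sample SD, for comparison
```

Python's statistics.stdev also divides by N − 1, so it should agree with the hand calculation.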
28 But what does it actually mean in terms of the reaction time task? I’m still very confused
The mean time taken to do the task was 100 seconds, but our SD figure shows us that individual performances typically varied from the mean by about 10 seconds. Many people would’ve taken somewhere between 90 and 110 seconds to complete the task
29 Here’s an example where we can compare performance between two differing conditions in a repeated measures design
Raw scores in seconds for a reaction time task
Participant Number | Control Condition (before caffeine) | Experimental Condition (after caffeine)
30 Summary data table of reaction time scores in control (before caffeine) and experimental (after caffeine) conditions
 | Control Condition (time in secs) | Experimental Condition (time in secs)
Mean
Median
Mode 65
Standard Deviation
You can see from the table that performance after caffeine was faster than before caffeine. However, the SD tells us that there was greater variation in the scores in the experimental condition. Scores in the control condition were closer to the mean, so that mean is more representative of the group, as there’s less variation around it. SD provides us with detailed information about data spread that the range cannot.
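A summary table like this one can be produced with a short helper. The reaction times below are hypothetical, invented only to show the same pattern the slide describes (faster but more variable after caffeine); they are not the original scores, and the function name summarise is illustrative.

```python
from statistics import mean, median, multimode, stdev

# Hypothetical reaction times in seconds for the same ten participants.
control = [70, 68, 72, 69, 71, 70, 70, 73, 67, 70]       # before caffeine
experimental = [55, 75, 60, 48, 65, 52, 70, 45, 65, 58]  # after caffeine

def summarise(scores):
    """Return the central tendency and dispersion figures for one condition."""
    return {
        "mean": mean(scores),
        "median": median(scores),
        "mode": multimode(scores),  # a list, in case there is more than one mode
        "sd": stdev(scores),        # sample SD, dividing by N - 1
    }

control_summary = summarise(control)
experimental_summary = summarise(experimental)

for name, summary in (("Control", control_summary), ("Experimental", experimental_summary)):
    print(name, summary)
```

With these made-up scores the experimental mean is lower (faster) but its SD is larger, so the control mean is the more representative of the two.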