Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS1512 www.csd.abdn.ac.uk/~jhunter/teaching/CS1512/lectures/ 1 CS1512 Foundations of Computing Science 2 Lecture 20 Probability and statistics (2) © J.

Similar presentations


Presentation on theme: "CS1512 www.csd.abdn.ac.uk/~jhunter/teaching/CS1512/lectures/ 1 CS1512 Foundations of Computing Science 2 Lecture 20 Probability and statistics (2) © J."— Presentation transcript:

1 CS CS1512 Foundations of Computing Science 2 Lecture 20 Probability and statistics (2) © J R W Hunter, 2006

2 CS Ordinal data X is an ordinal variable with values: a 1, a 2, a 3,... a k,... a K ordinal means that: a 1 a 2 a 3... a k... a K cumulative frequency at level k : c k = sum of frequencies of values less than or equal to a k c k = f 1 + f 2 + f f k = (f 1 + f 2 + f f k-1 ) + f k = c k-1 + f k also (%) cumulative relative frequency

3 CS CAS marks % relative frequencies % cumulative relative frequencies

4 CS Enzyme concentrations Concentration Freq. Rel.Freq. % Cum. Rel. Freq c < % 39.5 c < % 59.5 c < % 79.5 c < % 99.5 c < % c < % c < % c < % c < % c < % Totals

5 CS Cumulative histogram

6 CS Discrete two variable data

7 CS Continuous two variable data X Y

8 CS Time Series Time and space are fundamental (especially time) Time series: variation of a particular variable with time

9 CS Summarising data by numerical means Further summarisation (beyond frequencies) Measures of location (Where is the middle?) Mean Median Mode

10 CS Mean _ sum of observed values of X Sample Mean ( X ) = number of observed values x = n use only for quantitative data

11 CS Sigma Sum of n observations n x i = x 1 + x x i x n-1 + x n i = 1 If it is clear that the sum is from 1 to n then: x = x 1 + x x i x n-1 + x n Sum of squares x 2 = x x x i x n x n 2

12 CS x from frequencies If X is a categorical variable with values: a 1, a 2, a 3,... a k,... a K x = x 1 + x x i x n-1 + x n ( order of summation isnt important ) (e.g. piglets: ) Group together those x s which have value a 1, those with value a 2,... x = x.. + x.. + x x s which have value a 1 - there are f 1 of them x.. + x x s which have value a 2 - there are f 2 of them... x.. + x.. x s which have value a K - there are f K of them = f 1 * a 1 + f 2 * a f k * a k f K * a K K = f k * a k k = 1

13 CS Mean Litter size Frequency Cum. Freq a k f k Total 36 K x = f k * a k k = 1 = 1*5 + 0* 6 + 2*7 + 3*8 3*9 + 9*10 + 8*11 5*12 + 3*13 + 2*14 = 375 _ X = 375 / 36 = 10.42

14 CS Median Sample median of X = middle value when n sample observations are ranked in increasing order = the ((n + 1)/2) th value n odd: values: 183, 163, 152, 157 and 157 rank order: 152, 157, 157, 163, 183 median: 157 n even: values: 165, 173, 180, 164 rank order: 164, 165, 173, 180 median: ( )/2 = 169

15 CS Median Litter size Frequency Cum. Freq Total 36 Median = 10.5

16 CS Median from cumulative distribution cumulative % frequency polygon

17 CS Mode Sample mode = value with highest frequency (may not be unique) Litter size Frequency Cum. Freq Mode = 10

18 CS Skew left skewed symmetric right skewed mean mode

19 CS Variance Measure of spread: variance

20 CS Variance sample variance = s 2 sample standard deviation = s = variance

21 CS Variance and standard deviation Litter size Frequency Cum. Freq a k f k Total 36 K x 2 = f k * a k 2 k = 1 = 1*25 + 2*49 + 3*64 3*81 + 9* *121 5* * *196 = 4145 x = 375 ( x) 2 / n = 375*375 / 36 = 3906 s 2 = ( ) / (36-1) = 6.83 s = 2.6

22 CS Piglets Mean = Median = 10.5 Mode = 10 Std. devn. = 2.6

23 CS Quartiles and Range Lower quartile: value such that 25% of observations are below it ( Q1 ). Median: value such that 50% of observations are below (above) it ( Q2 ). Upper quartile: value such that 25% of observations are above it ( Q3 ). Range: the minimum ( m ) and maximum ( M ) observations. Box and Whisker plot: m Q1 Q2 Q3 M

24 CS Estimating quartiles

25 CS Linear Regression y = mc + c Calculate m and c so that (distance of point from line) 2 is minimised y x

26 CS Time Series - Moving Average Time Y 3 point MA 0 24 * * smoothing function can compute median, max, min, std. devn, etc. in window


Download ppt "CS1512 www.csd.abdn.ac.uk/~jhunter/teaching/CS1512/lectures/ 1 CS1512 Foundations of Computing Science 2 Lecture 20 Probability and statistics (2) © J."

Similar presentations


Ads by Google