Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS1512 1 CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability and statistics Propositional Logic Elementary set theory.

Similar presentations


Presentation on theme: "CS1512 1 CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability and statistics Propositional Logic Elementary set theory."— Presentation transcript:

1 CS CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability and statistics Propositional Logic Elementary set theory © J R W Hunter, 2006; C J van Deemter 2007

2 CS What this is going to be about 1. Suppose you know that the statement p is true and that the statement q is true. What can you say about the statement that p and q ?

3 CS What this is going to be about 1.Suppose you know that the statement p is true and that the statement q is true. What can you say about the statement that p and q ? In this case, you know that p and q is also true. 2.Suppose you know that the statement p has a probability of.5 and the statement q has a probability of.5. What can you say about the statement that p and q ?

4 CS What this is going to be about 1.Suppose you know that the statement p is true and the statement q is true. What can you say about the statement that p and q ? In this case, p and q is also true. 2.Suppose you know that the statement p has a probability of.5 and the statement q has a probability of.5. What can you say about the statement that p and q ? It depends! If p and q are independent of each other then you know that p and q has a probability of.25 But suppose (1) p = It will snow (some time) tomorrow and q = It will be below zero (some time) tomorrow Then p and q has a probability >.25

5 CS What this is going to be about 1.Suppose you know that the statement p is true and the statement q is true. What can you say about the statement that p and q ? In this case, p and q is also true. 2.Suppose you know that the statement p has a probability of.5 and the statement q has a probability of.5. What can you say about the statement that p and q ? It depends! If p and q are independent of each other then you know that p and q has a probability of.25 But suppose (2) p = It will snow (sometime) tomorrow and q = It will not snow (any time) tomorrow Then p and q has a probability 0

6 CS Before we get there... Some basic concepts in statistics different kinds of data ways of representing data ways of summarising data This will be useful later in your CS career. For example to assess whether a computer simulation is accurate assess whether one user interface is more user friendly than another estimate the expected run time of a computer program (on typical data)

7 CS Lecture slides on statistics and probability are based on originals by Professor Jim Hunter.

8 CS CS1512 Foundations of Computing Science 2 Lecture 1 Probability and statistics (1) © J R W Hunter, 2006; C J van Deemter 2007

9 CS Sources Text book (parts of chapters 1-6): Essential Statistics (Fourth Edition) D.G.Rees Chapman and Hall 2001 (Blackwells, ~£28) Courses: ST1505 Mathematical Sciences CS1012 Sets

10 CS Some definitions Sample space (population) Set of entities of interest, also called elements this set may be infinite entities can be physical objects, events, etc.... Sample subset of the sample space

11 CS More definitions Variable an attribute of an element which has a value (e.g., its height, weight, etc.) Observation the value of a variable as recorded for a particular element an element will have variables with values but they are not observations until we record it Sample data set of observations derived from a sample

12 CS Descriptive and Inferential Statistics Descriptive statistics: Summarising the sample data (as a number, graphic...) Inferential statistics: Using data from a sample to infer properties of the sample space Chose a representative sample (properties of sample match those of sample space – difficult) In practice, use a random sample (each element has the same likelihood of being chosen)

13 CS Variable types Qualitative: Nominal/Categorical (no ordering in values) e.g. sex, occupation Ordinal (ranked) e.g. class of degree (1, 2.1, 2.2,...) Quantitative: Discrete (countable) – [integer] e.g. number of people in a room Continuous – [double] e.g. height

14 CS Examples 1.A persons marital status 2.The length of a CD 3.The size of a litter of piglets 4.The temperature in degrees centigrade

15 CS Examples 1.A persons marital status Nominal/categorical 2.The length of a CD Quantitative; continuous or discrete? This depends on how you model length (minutes or bits) 3.The size of a litter of piglets Quantitative, discrete (if we mean the number of pigs) 4.The temperature in degrees centigrade Quantitative, continuous (Even though it does not make sense to say that 20 0 is twice as warm as 10 0 ) Footnote: We us the term `Continuous` a bit loosely: For us a variable is continuous/dense (as opposed to discrete) if between any values x and y, there lies a third value z.

16 CS Summarising data Categorical (one variable): X is a categorical variable with values: a 1, a 2, a 3,... a k,... a K ( k = 1, 2, 3,... K) f k = number of times that a k appears in the sample f k is the frequency of a k if we have n observations then: relative frequency = frequency / n percentage relative frequency = relative frequency 100

17 CS Frequency Blood Type Frequency Relative Frequency Percentage RF A % AB % B % O % Totals % sample of 572 patients (n = 572) sum of frequencies = n

18 CS Bar Chart

19 CS Summarising data Categorical (two variables): contingency table number of patients with blood type A who are female is 108 Blood Type Sex Totals male female A AB B O Totals

20 CS Summarising data Categorical (two variables): contingency table number of patients with blood type A who are female is 108 Blood Type Sex Totals % Blood Type by sex male female male female A % 51% AB % 66% B % 50% O % 49% Totals

21 CS Bar Chart

22 CS Ordinal data X is an ordinal variable with values: a 1, a 2, a 3,... a k,... a K ordinal means that: a 1 a 2 a 3... a k... a K cumulative frequency at level k : c k = sum of frequencies of values less than or equal to a k c k = f 1 + f 2 + f f k = (f 1 + f 2 + f f k-1 ) + f k = c k-1 + f k Can be applied to quantitative data as well...

23 CS Cumulative frequencies Number of piglets in a litter: (discrete data) c1=f1=1, c2=f1+f2=1, c3=f1+f2+f3=3, c4=f1+f2+f3+f4=6, etc. Litter size Frequency=f Cum. Freq =c Total 36 c K = n

24 CS Plotting frequency cumulative frequency

25 CS Continuous data A way to obtain discrete numbers from continuous data: Divide range of observations into non-overlapping intervals (bins) Count number of observations in each bin Enzyme concentration data in 30 observations: Range: 25 to 201 For example, you can use 10 bins of width 20:

26 CS Enzyme concentrations Concentration Freq. Rel.Freq. % Cum. Rel. Freq c < % 39.5 c < % 59.5 c < % 79.5 c < % 99.5 c < % c < % c < % c < % c < % c < % Totals

27 CS Cumulative histogram


Download ppt "CS1512 1 CS1512 Foundations of Computing Science 2 (Theoretical part) Kees van Deemter Probability and statistics Propositional Logic Elementary set theory."

Similar presentations


Ads by Google