Published by Alexander Harrington. Modified over 4 years ago.

1
School of Computing, FACULTY OF ENGINEERING
MJ11 (COMP1640) Modelling, Analysis & Algorithm Design
Vania Dimitrova
Lecture 18: Statistical Data Analysis: Types of Data, Sampling Methods, Descriptive Statistics
November 2011

2
In the previous lectures
  Mathematical Modelling
    Identify factors
    Make assumptions
    Formulate model
    Examine behaviour
  Assumptions are crucial
    Estimated values for parameters
    Assumed dependencies between variables
  Check validity of assumptions
    Grounded on data analysis

3
In this series of lectures
  Analysis of data
    How to collect data samples?
    How to make estimations of values (point/interval)?
    How to infer possible dependencies between variables?
    How to check the validity of a hypothesis?

4
Do you agree with these statements?
  Average earnings in the UK grow steadily.
  There are more overseas visits to the UK than UK visits abroad.
  House prices are dependent on average family income.
  Young people prefer to shop online.
  Advertisement improves sales figures.
  TV advertisement is more powerful than Radio advertisement.
  Women are more likely to buy computer games than men.
  Men are more likely to buy cosmetics products than women.

5
Population
  Large collection of objects or events which vary in respect of some characteristics
  The whole set of measurements or counts about which we want to draw a conclusion
  Characteristics of a population: height, age, reading abilities, fitness level
  What is the population for each of the claims on the previous slide?

6
Sample
  Subset of the population: a set of some of the measurements or characteristics of the population
  [Figure: a sample drawn from the population]
  Measures describing population characteristics: PARAMETERS
  Measures describing sample characteristics: STATISTICS
  Statistics estimate the parameters
  Systematic error: the sample is not representative of the population
  Sampling error: influenced by the size of the sample and the variation in the population

7
Sampling methods
  Random sampling: selecting members of the population in a random order
    Pros & Cons
  Systematic sampling: selecting members of the population in a systematic order (quasi-random)
    Pros & Cons
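The two methods above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of IDs 1 to 100, the sample size of 10, and the function names are made up for the example and do not come from the lecture.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Pick k distinct members; every subset of size k is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, k)

def systematic_sample(population, k, seed=None):
    """Pick every step-th member, starting at a random offset within the first step."""
    step = len(population) // k
    if step < 1:
        raise ValueError("sample size exceeds population size")
    start = random.Random(seed).randrange(step)
    return [population[start + i * step] for i in range(k)]

# Hypothetical population: member IDs 1..100 (an assumption for this sketch).
population = list(range(1, 101))
random_picks = simple_random_sample(population, 10, seed=1)
systematic_picks = systematic_sample(population, 10, seed=1)
```

Systematic sampling needs only one random draw (the starting offset), which makes it easy to carry out by hand, but it can mislead if the population list has a periodic pattern that matches the step.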

8
Sampling methods (cont.)
  Stratified sampling: dividing the population into homogeneous groups and selecting at random within each group
    Pros & Cons
  Cluster sampling: when the population is too big, we may select certain clusters (e.g. UK students)
    Pros & Cons
  Stage sampling: random selection of clusters
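A minimal Python sketch of stratified sampling and of randomly selecting clusters follows. The student records, the "UG"/"PG" strata, the city names, and the function names are all hypothetical and chosen only to make the example runnable.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Group members into strata, then draw the same fraction at random from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)
    sample = []
    for group in strata.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose whole clusters at random and keep every member of the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical data: 40 students, 30 undergraduates (UG) and 10 postgraduates (PG).
students = [("s%d" % i, "UG" if i % 4 else "PG") for i in range(40)]
by_level = stratified_sample(students, lambda s: s[1], 0.5, seed=2)

# Hypothetical clusters keyed by city.
clusters = {"Leeds": ["a", "b", "c"], "York": ["d", "e"], "Hull": ["f", "g", "h"]}
picked = cluster_sample(clusters, 2, seed=2)
```

Stratifying guarantees that each group appears in the sample in proportion to its size, whereas cluster selection trades some representativeness for lower collection cost.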

9
Sample size
  What precision do we want?
    Increase the size to get better precision
  What is the likely variability in the population?
    Increase the size to account for higher variability
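Both effects can be seen in a standard textbook rule of thumb for estimating a mean, which is not taken from the slides: choose the smallest n such that z * sigma / sqrt(n) is no larger than the desired margin of error. The sketch below assumes a guessed population standard deviation `sigma`, a 95% confidence multiplier z = 1.96, and a hypothetical function name.

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= margin (normal approximation)."""
    return math.ceil((z * sigma / margin) ** 2)

# Illustrative numbers only.
base = required_sample_size(sigma=10, margin=2)
tighter = required_sample_size(sigma=10, margin=1)        # better precision
more_variable = required_sample_size(sigma=20, margin=2)  # higher variability
```

Under this rule, halving the margin of error or doubling the standard deviation each roughly quadruples the required sample size.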

10
Types of data
  Nominal: categories, classes
  Ordinal: nominal with order
  Discrete: numbers that are distinct points on a scale
  Continuous: can take any value between points on a scale
  Give examples

11
Descriptive Statistics
  Mean: average score
  Median: middle point on the scale of measurement; helpful for oddly shaped distributions
  General description of the sample
  2 3 5 7 9 10 12 13 14 16 18 20 21
  3 4 4 5 5 5 5 6 7
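A short Python sketch shows why the median helps with oddly shaped distributions. The values below are made up for illustration (they are not data from the lecture) and contain one extreme score.

```python
import statistics

# Made-up scores with a single extreme value at the end.
earnings = [20, 22, 25, 27, 30, 32, 250]

mean_value = statistics.mean(earnings)      # pulled upwards by the extreme score
median_value = statistics.median(earnings)  # middle value of the sorted scores
```

Here the mean is 58 while the median is 27, so the median is closer to what a typical member of this sample looks like.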

12
Distribution of scores
  Standard Deviation
  Variance
  Coefficient of variation
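The three measures can be computed with Python's standard `statistics` module. This sketch uses a made-up sample and assumes the sample (n - 1) versions of variance and standard deviation; the slide does not say which version is intended.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

mean_score = statistics.mean(scores)    # 5
variance = statistics.variance(scores)  # sample variance: squared deviations / (n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance
coeff_var = std_dev / mean_score        # spread relative to the mean (unitless)
```

The coefficient of variation is useful for comparing the spread of samples measured on different scales, since it expresses the standard deviation as a fraction of the mean.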

13
Example (EU-Area-Current-Accounts.xls) http://epp.eurostat.ec.europa.eu/

14
Normal and Skewed Distribution
  [Figure: a normal distribution and a skewed distribution; source: Wikipedia]

15
Approximating Normal Distribution
  As the sample size increases, the shape of the sampling distribution of the mean approaches normal (see also Central Limit Theorem)
  http://www.statsoft.com/textbook/esc.html
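This effect can be checked with a small simulation. The sketch below is an illustration under assumptions not in the lecture: the population is exponential (strongly right-skewed), each sample size is repeated 2000 times, and skewness is measured with a simple moment-based formula, which is close to 0 for a symmetric bell shape.

```python
import random
import statistics

def sample_means(n, repeats, rng):
    """Means of `repeats` samples of size n from a right-skewed exponential population."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(repeats)
    ]

def skewness(values):
    """Moment-based skewness: about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

rng = random.Random(42)
skew_n2 = skewness(sample_means(2, 2000, rng))    # small samples: means still skewed
skew_n50 = skewness(sample_means(50, 2000, rng))  # larger samples: much closer to 0
```

Plotting histograms of the two lists of means would show the same thing visually: the distribution for samples of size 50 looks far more symmetric than the one for samples of size 2.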

16
Correlation between two variables
  Measure of the relation between two or more variables
  Correlation coefficient r
    Negative correlation: r near -1
    Positive correlation: r near 1
  Different methods to calculate r
    Simplest: based on deviations from the mean
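One common way to compute r from deviations from the mean is the Pearson product-moment coefficient: the sum of the products of paired deviations, divided by the square root of the product of the two sums of squared deviations. The sketch below assumes this is the method meant; the function name and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations from the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_products = sum(a * b for a, b in zip(dx, dy))
    return sum_products / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

# Made-up paired observations.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])  # perfectly increasing line
r_negative = pearson_r(x, [10, 8, 6, 4, 2])  # perfectly decreasing line
r_moderate = pearson_r(x, [2, 1, 4, 3, 5])   # noisy upward trend
```

The function fails with a division by zero if either variable is constant, because r is undefined when one of the variables has no variation.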

17
Example: Positive Correlation r=0.998

18
Example: Negative Correlation r=-0.99

19
Example: Limited (or No) Correlation r=0.179

20
Summary
  Types of data
  Population vs Sample
  Sampling methods
  Descriptive statistics
  Normal & Skewed distribution
  Correlation between variables
References
  Rees, D.G., Essential Statistics, Chapman & Hall/CRC, 2000.
  Cohen, L., Holliday, M., Practical Statistics for Students, Chapman, 1996.
  http://www.statsoft.com/textbook/esc.html
