Published by Alexander Harrington. Modified over 3 years ago.

1
School of Computing
FACULTY OF ENGINEERING
MJ11 (COMP1640) Modelling, Analysis & Algorithm Design
Vania Dimitrova
Lecture 18
Statistical Data Analysis: Types of Data, Sampling Methods, Descriptive Statistics
November 2011

2
In the previous lectures
Mathematical Modelling
  Identify factors
  Make assumptions
  Formulate model
  Examine behaviour
Assumptions are crucial
  Estimated values for parameters
  Assumed dependencies between variables
Check validity of assumptions
  Grounded on data analysis

3
In this series of lectures
Analysis of data
  How to collect data samples?
  How to make estimations of values (point/interval)?
  How to infer possible dependencies between variables?
  How to check the validity of a hypothesis?

4
Do you agree with these statements?
  Average earnings in the UK grow steadily.
  There are more overseas visits to the UK than UK visits abroad.
  House prices are dependent on average family income.
  Young people prefer to shop online.
  Advertisement improves sales figures.
  TV advertisement is more powerful than Radio advertisement.
  Women are more likely to buy computer games than men.
  Men are more likely to buy cosmetics products than women.

5
Population
  A large collection of objects or events which vary in respect of some characteristics
  The whole set of measurements or counts about which we want to draw a conclusion
  Characteristics of a population: height, age, reading abilities, fitness level
What is the population for each of the claims on the previous slide?

6
Sample
  A subset of the population: a set of some of the measurements or characteristics of the population
  [Diagram labels: population, sample]
  PARAMETERS: measures describing population characteristics
  STATISTICS: measures describing sample characteristics
  Statistics estimate the parameters
  Systematic error: the sample is not representative of the population
  Sampling error: influenced by the size of the sample and the variation in the population
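The difference between the two kinds of error can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of heights is made up (it is not data from the lecture), and the names `population`, `sample_mean` and `biased_mean` are hypothetical.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up heights in cm (not lecture data).
population = [rng.gauss(170, 10) for _ in range(10_000)]
mu = statistics.mean(population)  # PARAMETER: the population mean


def sample_mean(n):
    """STATISTIC: the mean of a simple random sample of size n."""
    return statistics.mean(rng.sample(population, n))


# Sampling error: sample means scatter around mu, less so for larger samples.
small = [sample_mean(10) for _ in range(200)]
large = [sample_mean(400) for _ in range(200)]
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

# Systematic error: a non-representative sample (only the tallest 1,000 members)
# stays biased however many members it contains.
biased_mean = statistics.mean(sorted(population)[-1000:])

print(f"population mean:            {mu:.2f}")
print(f"spread of means, n = 10:    {spread_small:.2f}")
print(f"spread of means, n = 400:   {spread_large:.2f}")
print(f"mean of the tallest 1,000:  {biased_mean:.2f}")
```

Taking a larger random sample shrinks the sampling error, but it does nothing to correct a sample that was chosen in a non-representative way.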

7
Sampling methods
Random sampling
  Selecting members of the population in a random order
  Pros & Cons
Systematic sampling
  Selecting members of the population in a systematic order (quasi-random)
  Pros & Cons
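The two methods on this slide might be sketched as follows. The function names and the population of 100 member IDs are hypothetical, and the systematic version assumes the common "random start, then every k-th member" scheme, which the slide does not spell out.

```python
import random


def random_sample(population, n, seed=None):
    """Simple random sampling: every member is equally likely to be chosen."""
    return random.Random(seed).sample(population, n)


def systematic_sample(population, n, seed=None):
    """Systematic sampling: pick a random start, then take every k-th member.

    Assumes n <= len(population), so the interval k is at least 1.
    """
    k = len(population) // n                  # sampling interval
    start = random.Random(seed).randrange(k)  # random start within the first interval
    return population[start::k][:n]


members = list(range(100))  # hypothetical population: 100 member IDs

print(random_sample(members, 10, seed=1))
print(systematic_sample(members, 10, seed=1))
```

Systematic sampling only needs one random number, but it can give a misleading sample if the ordering of the population has a period that matches the interval k.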

8
Sampling methods (cont.)
Stratified sampling
  Dividing the population into homogeneous groups and selecting randomly within each group
  Pros & Cons
Cluster sampling
  When the population is too big, we may select certain clusters (e.g. UK students)
  Pros & Cons
Stage sampling: random selection of clusters
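A minimal sketch of stratified sampling and of randomly selecting whole clusters is given below. The student records, the cluster names and both function names are made up for illustration, and the proportional allocation per stratum is an assumption rather than something the slide prescribes.

```python
import random
from collections import defaultdict


def stratified_sample(population, stratum_of, fraction, seed=None):
    """Split the population into strata, then sample randomly within each one.

    Uses proportional allocation: each stratum contributes the same fraction.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)
    sample = []
    for group in strata.values():
        size = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, size))
    return sample


def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every member of the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical population: 80 undergraduates (UG) and 20 postgraduates (PG).
students = [("UG", i) for i in range(80)] + [("PG", i) for i in range(20)]
strat = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.1, seed=1)

# Hypothetical clusters: students grouped by campus.
campuses = {"A": list(range(0, 30)), "B": list(range(30, 70)), "C": list(range(70, 100))}
clus = cluster_sample(campuses, n_clusters=2, seed=1)

print(strat)
print(len(clus))
```

With a 10% fraction the stratified sample always contains 8 undergraduates and 2 postgraduates, so the smaller group cannot be missed by chance.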

9
Sample size
What precision do we want?
  Increase the size to get better precision
What is the likely variability in the population?
  Increase the size to account for higher variability
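Both questions can be made concrete with one standard textbook formula for estimating a mean, n = (z * sigma / E)^2, where E is the desired margin of error and sigma the population standard deviation. The formula is not taken from the slides, the function name is hypothetical, and the sketch assumes sigma is known (or guessed from a pilot study) and that the sample mean is approximately normally distributed.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma, margin, confidence=0.95):
    """Smallest n for which the confidence interval half-width is <= margin.

    Assumes a known population standard deviation `sigma` and an
    approximately normal sample mean.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil((z * sigma / margin) ** 2)


print(required_sample_size(sigma=10, margin=2))  # baseline
print(required_sample_size(sigma=10, margin=1))  # better precision -> larger n
print(required_sample_size(sigma=20, margin=2))  # higher variability -> larger n
```

Halving the margin of error or doubling the variability each roughly quadruples the required sample size.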

10
Types of data
Nominal
  Categories, classes
Ordinal
  Nominal with order
Discrete
  Numbers that are distinct points on a scale
Continuous
  Can take any value between points on a scale
GIVE EXAMPLES

11
Descriptive Statistics
General description of the sample
  Mean: average score
  Median: middle point on the scale of measurement; helpful for oddly shaped distributions
Example scores:
  2 3 5 7 9 10 12 13 14 16 18 20 21
  3 4 4 5 5 5 5 6 7
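As a rough illustration, the mean and median can be computed with Python's standard `statistics` module. The sketch assumes the numbers on this slide form two separate lists of scores (the transcript does not say so explicitly), and the extra score of 60 is made up to show why the median helps with oddly shaped distributions.

```python
import statistics

# Assumed reading of the slide: two separate lists of scores.
scores_a = [2, 3, 5, 7, 9, 10, 12, 13, 14, 16, 18, 20, 21]
scores_b = [3, 4, 4, 5, 5, 5, 5, 6, 7]

mean_a, median_a = statistics.mean(scores_a), statistics.median(scores_a)
mean_b, median_b = statistics.mean(scores_b), statistics.median(scores_b)

# Hypothetical outlier: one extreme score pulls the mean but not the median.
scores_c = scores_b + [60]
mean_c, median_c = statistics.mean(scores_c), statistics.median(scores_c)

print(f"A: mean = {mean_a:.2f}, median = {median_a}")
print(f"B: mean = {mean_b:.2f}, median = {median_b}")
print(f"B + outlier: mean = {mean_c:.2f}, median = {median_c}")
```

Adding the single outlier moves the mean of the second list from about 4.89 to 10.4, while the median stays at 5.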

12
Distribution of scores
  Standard Deviation
  Variance
  Coefficient of variation
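The formulas on this slide did not survive in the transcript, so the following is only a sketch of the usual definitions. It assumes the sample versions (dividing by n - 1) and takes the coefficient of variation as the standard deviation divided by the mean; the list of scores is an illustrative sample.

```python
import statistics

scores = [3, 4, 4, 5, 5, 5, 5, 6, 7]  # illustrative sample of scores

mean = statistics.mean(scores)
variance = statistics.variance(scores)  # sample variance: sum((x - mean)^2) / (n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation: sqrt(variance)
coeff_var = std_dev / mean              # coefficient of variation (unit-free)

print(f"variance = {variance:.4f}")
print(f"standard deviation = {std_dev:.4f}")
print(f"coefficient of variation = {coeff_var:.4f}")
```

Because the coefficient of variation is a ratio, it allows the spread of samples measured on different scales to be compared. For population rather than sample measures, `statistics.pvariance` and `statistics.pstdev` divide by n instead.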

13
Example (EU-Area-Current-Accounts.xls) http://epp.eurostat.ec.europa.eu/

14
Normal and Skewed Distribution
[Figures: Skewed Distribution; Normal Distribution (source: wikipedia)]

15
Approximating Normal Distribution
As the sample size increases, the shape of the sampling distribution approaches the normal distribution (see also Central Limit Theorem)
http://www.statsoft.com/textbook/esc.html
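This effect can be checked with a small simulation. The sketch below assumes a strongly skewed (exponential) population, which is a choice made here for illustration, and uses skewness as a rough measure of how far the distribution of sample means is from the symmetric normal shape. The function names are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def sample_means(n, trials=2000):
    """Means of `trials` samples of size n drawn from an exponential population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(trials)]


def skewness(values):
    """Average cubed standardised deviation; 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return statistics.mean(((v - m) / s) ** 3 for v in values)


# Skewness of the sampling distribution of the mean for growing sample sizes.
skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 50)}

for n, skew in skew_by_n.items():
    print(f"n = {n:>2}: skewness of the sample means = {skew:.2f}")
```

The skewness shrinks towards 0 as n grows, even though the underlying population stays just as skewed.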

16
Correlation between two variables
  Measure of the relation between two or more variables
  Correlation coefficient r
    Negative correlation: r close to -1
    Positive correlation: r close to 1
  Different methods to calculate r
    Simplest: based on deviations from the mean
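A calculation based on deviations from the mean might look like the sketch below, which assumes that the slide refers to the Pearson product-moment coefficient. The advertising and sales figures are made up for illustration, and `pearson_r` is a hypothetical name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from deviations from the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations of x from its mean
    dy = [y - mean_y for y in ys]  # deviations of y from its mean
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum(a * b for a, b in zip(dx, dy)) / denominator


# Hypothetical data: advertising spend and sales figures.
advertising = [1, 2, 3, 4, 5]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

print(f"r = {pearson_r(advertising, sales):.3f}")
```

The product of deviations is positive when both variables lie on the same side of their means, so r approaches 1 when they rise together and -1 when one rises as the other falls.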

17
Example: Positive Correlation r=0.998

18
Example: Negative Correlation r=-0.99

19
Example: Limited (or No) Correlation r=0.179

20
Summary
  Types of data
  Population vs Sample
  Sampling methods
  Descriptive statistics
  Normal & Skewed distribution
  Correlation between variables
References
  Rees, D.G., Essential Statistics, Chapman & Hall/CRC, 2000.
  Cohen, L., Holliday, M., Practical Statistics for Students, Chapman, 1996.
  http://www.statsoft.com/textbook/esc.html
