School of Computing FACULTY OF ENGINEERING MJ11 (COMP1640) Modelling, Analysis & Algorithm Design Vania Dimitrova Lecture 18 Statistical Data Analysis:

Presentation transcript:

School of Computing FACULTY OF ENGINEERING MJ11 (COMP1640) Modelling, Analysis & Algorithm Design Vania Dimitrova Lecture 18 Statistical Data Analysis: Types of Data, Sampling Methods, Descriptive Statistics November 2011

In the previous lectures: Mathematical Modelling
Identify factors
Make assumptions
Formulate model
Examine behaviour
Assumptions are crucial:
  Estimated values for parameters
  Assumed dependencies between variables
Check validity of assumptions: grounded on data analysis

In this series of lectures: Analysis of data
How to collect data samples?
How to make estimations of values (point/interval)?
How to infer possible dependencies between variables?
How to check the validity of a hypothesis?

Do you agree with these statements?
Average earnings in the UK grow steadily.
There are more overseas visits to the UK than UK visits abroad.
House prices are dependent on average family income.
Young people prefer to shop online.
Advertisement improves sales figures.
TV advertisement is more powerful than Radio advertisement.
Women are more likely to buy computer games than men.
Men are more likely to buy cosmetics products than women.

Population
Large collection of objects or events which vary in respect of some characteristics.
The whole set of measurements or counts about which we want to draw a conclusion.
Characteristics of population: height, age, reading abilities, fitness level.
What is the population for each of the claims on the previous slide?

Sample
Subset of the population: a set of some of the measurements or characteristics of the population.
(Diagram labels: population, sample.)
Parameters: measures describing population characteristics.
Statistics: measures describing sample characteristics. Statistics estimate the parameters.
Systematic error: the sample is not representative of the population.
Sampling error: influenced by the size of the sample and the variation in the population.
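
The distinction between a parameter and a statistic can be illustrated with a small simulation. This is only a sketch: the population of heights is hypothetical (generated data, not from the lecture), and the function name `sample_mean_error` is an assumption introduced here.

```python
import random
import statistics

def sample_mean_error(population, sample_size, seed=0):
    """Return (parameter, statistic, error) for the mean of one random sample."""
    rng = random.Random(seed)
    parameter = statistics.mean(population)        # population mean (a parameter)
    sample = rng.sample(population, sample_size)   # draw without replacement
    statistic = statistics.mean(sample)            # sample mean (a statistic)
    return parameter, statistic, statistic - parameter

# Hypothetical population: 10,000 heights in cm (illustrative values only).
_rng = random.Random(42)
population = [_rng.gauss(170, 10) for _ in range(10_000)]

for n in (10, 100, 1000):
    param, stat, err = sample_mean_error(population, n)
    print(f"n={n:5d}  parameter={param:.2f}  statistic={stat:.2f}  error={err:+.2f}")
```

The error printed here is sampling error only. A systematic error (for example, sampling heights only from a basketball team) would not shrink as the sample grows.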

Sampling methods
Random sampling: selecting members of the population in a random order. Pros & Cons
Systematic sampling: selecting members of the population in a systematic order (quasi-random). Pros & Cons
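
A minimal sketch of the two methods on this slide, assuming the population is available as a Python list. The function names are hypothetical, and the systematic variant shown (every k-th member after a random start) is one common way to do it, not necessarily the one used in the lecture.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Pick n distinct members, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=None):
    """Pick every k-th member after a random start, with k = len(population) // n."""
    k = len(population) // n
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)            # random starting point in the first interval
    return population[start::k][:n]     # step through the list, keep n members

members = list(range(1, 101))           # hypothetical population of 100 member ids
print(simple_random_sample(members, 10, seed=1))
print(systematic_sample(members, 10, seed=1))
```

Systematic sampling is easy to carry out, but it can give a biased sample if the ordering of the list has a period that matches the step k.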

Sampling methods (cont.)
Stratified sampling: dividing the population into homogeneous groups and selecting randomly within each group. Pros & Cons
Cluster sampling: when the population is too big, we may select certain clusters (e.g. UK students). Pros & Cons
Stage sampling: random selection of clusters.
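
Stratified and cluster selection could be sketched as follows. The data layout (a list of members plus a function returning each member's stratum, and a dict of clusters) and the proportional allocation rule are assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, fraction, seed=None):
    """Sample the same fraction at random from every stratum (at least one each)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in population:
        strata[stratum_of(member)].append(member)   # group members by stratum
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and return all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical student population tagged by region.
students = [("UK", i) for i in range(60)] + [("EU", i) for i in range(40)]
print(len(stratified_sample(students, lambda s: s[0], 0.1, seed=0)))

schools = {"A": [1, 2], "B": [3], "C": [4, 5, 6]}
print(cluster_sample(schools, 2, seed=0))
```

Stratification keeps each group represented in proportion to its size, while cluster selection trades some representativeness for lower collection cost.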

Sample size
What precision do we want? Increase the size to get better precision.
What is the likely variability in the population? Increase the size to account for higher variability.
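
Both questions can be made concrete with a standard textbook rule for estimating a mean, n = (z * sigma / E)^2, where sigma is the assumed population standard deviation, E the desired margin of error, and z the normal quantile for the chosen confidence level. This formula is not stated on the slide; it is brought in here as an illustrative assumption, and `required_sample_size` is a hypothetical helper.

```python
import math

def required_sample_size(sigma, margin, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= margin (z = 1.96 for about 95% confidence)."""
    if sigma <= 0 or margin <= 0:
        raise ValueError("sigma and margin must be positive")
    return math.ceil((z * sigma / margin) ** 2)

# Tighter precision (smaller margin) needs a larger sample.
print(required_sample_size(sigma=10, margin=2))
print(required_sample_size(sigma=10, margin=1))
# Higher variability (larger sigma) also needs a larger sample.
print(required_sample_size(sigma=20, margin=2))
```

Halving the margin of error roughly quadruples the required sample size, because n grows with the square of sigma / E.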

Types of data
Nominal: categories, classes
Ordinal: nominal with order
Discrete: numbers that are distinct points on a scale
Continuous: can take any values between points on a scale
GIVE EXAMPLES

Descriptive Statistics
General description of the sample.
Mean: average score
Median: middle point on the scale of measurement; helpful for oddly shaped distributions
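
A short sketch of why the median helps with oddly shaped distributions. The income figures are made up for illustration, with one extreme value.

```python
import statistics

# Hypothetical annual incomes in thousands; the last value is an outlier.
incomes = [18, 20, 21, 22, 23, 25, 250]

mean = statistics.mean(incomes)       # pulled upwards by the outlier
median = statistics.median(incomes)   # middle value of the sorted data

print(f"mean   = {mean:.2f}")
print(f"median = {median}")
```

Here the mean is more than twice the median, so the median is the better description of a typical income in this sample.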

Distribution of scores
Standard deviation
Variance
Coefficient of variation
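
The three measures could be computed as below. The slide does not say whether the population or the sample form of the variance is meant; this sketch assumes the sample form (dividing by n - 1), and takes the coefficient of variation to be the standard deviation divided by the mean.

```python
import math

def describe_spread(values):
    """Return (variance, standard deviation, coefficient of variation) of a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Sample variance: sum of squared deviations from the mean over n - 1.
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    cv = sd / mean    # unit-free; only meaningful when the mean is non-zero
    return variance, sd, cv

variance, sd, cv = describe_spread([1, 2, 3, 4, 5])
print(f"variance={variance:.3f}  sd={sd:.3f}  cv={cv:.3f}")
```

The coefficient of variation expresses spread relative to the mean, which makes it possible to compare variability across data measured on different scales.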

Example (EU-Area-Current-Accounts.xls)

Normal and Skewed Distribution
(Figures: skewed distribution; normal distribution. Source: Wikipedia.)

Approximating Normal Distribution
As the sample size increases, the shape of the sampling distribution of the mean approaches a normal distribution (see also the Central Limit Theorem).
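
This behaviour can be checked with a small simulation. The sketch assumes a strongly skewed population (exponential, chosen here only for illustration) and uses skewness as a rough indicator of how far the distribution of sample means is from the symmetric normal shape. The helper names are hypothetical.

```python
import random
import statistics

def skewness(values):
    """Average cubed standardised deviation; about 0 for a symmetric distribution."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((x - m) / s) ** 3 for x in values) / len(values)

def sample_means(n, repetitions=2000, seed=0):
    """Means of `repetitions` samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.mean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repetitions)
    ]

for n in (2, 10, 50):
    print(f"n={n:3d}  skewness of sample means = {skewness(sample_means(n)):.3f}")
```

The skewness should move towards zero as n grows, even though the underlying population stays skewed.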

Correlation between two variables
Measure of the relation between two or more variables.
Correlation coefficient r:
  Negative correlation: r close to -1
  Positive correlation: r close to 1
Different methods to calculate r. Simplest: based on deviations from the mean.
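
A sketch of the deviations-from-the-mean calculation, assuming the Pearson product-moment coefficient is the method meant on the slide. The sample data are invented for illustration.

```python
import math

def pearson_r(xs, ys):
    """Correlation coefficient computed from deviations from the two means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]    # deviations of x from its mean
    dy = [y - mean_y for y in ys]    # deviations of y from its mean
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denominator == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum(a * b for a, b in zip(dx, dy)) / denominator

# Hypothetical data: advertising spend against sales, and price against demand.
print(pearson_r([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 8.0, 9.8]))
print(pearson_r([1, 2, 3, 4, 5], [9.5, 8.1, 6.0, 4.2, 1.9]))
```

A value of r near 1 or -1 indicates a strong linear relation only. It does not show that one variable causes the other.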

Example: Positive Correlation r=0.998

Example: Negative Correlation r=-0.99

Example: Limited (or No) Correlation r=0.179

Summary
Types of data
Population vs Sample
Sampling methods
Descriptive statistics
Normal & Skewed distribution
Correlation between variables

References
Rees D.G., Essential Statistics, Chapman & Hall/CRC,
Cohen, L., Holliday, M., Practical Statistics for Students, Chapman,