Statistical Data Analysis 2011/2012 M. de Gunst

1 Statistical Data Analysis 2011/2012 M. de Gunst

2 Statistical Data Analysis 2 About the course (1)
Lecturer: Mathisca de Gunst (degunst@cs.vu.nl, 020-5987680, WN-R3.20)
Assistants: Geert Geeven (geert@few.vu.nl, 020-5987980, WN-S2.22); Beata Ros (b.p.ros@vu.nl, 020-5987724, WN-S2.30)
Website: www.math.vu.nl/~degunst/sda.html (with login)
Literature: Reader, available on SDA-website
Entry requirements: Algemene Statistiek (W/Ectrie) or Algemene Statistiek (BWI)

3 Statistical Data Analysis 3 About the course (2)
Elements of the course
1. Lectures: discussion of part of the reader and of important points for the assignments.
2. Homework: weekly assignment + deadline (STRICT) on SDA-website; concerns material of that week’s lecture. Made in groups of 2; groups will be formed in the break.
3. Exercise classes: discussion of homework assignments; presence compulsory.
Note: tentative schedule of topics of lectures and exercise classes on website.

4 Statistical Data Analysis 4 About the course (3)
R
- Weekly assignments are made with the aid of the computer package R.
- R is installed on the faculty's computer systems, but can also be downloaded for free; link on SDA-website.
- The R-website provides an introductory manual. Link to beginners’ manual for R (in Dutch) on SDA-website.
Hand in homework on the deadline at the beginning of the lecture: proper language, compact, printed, relevant R-code in appendix, proper graphs, appropriate rounding of numbers, etc.

5 Statistical Data Analysis 5 About the course (4)
Assessment: via weekly homework assignments and written exam.
- Both the average score of the homework assignments and the exam score must be ≥ 5.5.
- One can take part in the written exam only if the average score of the homework is ≥ 5.5.
- Obtained scores for weekly exercises are valid for one course-year.
- Final score = (average score of homework + exam score)/2 if exam score ≥ 5.5; final score = exam score if exam score < 5.5.
Topics for exam: complete content of reader (except proofs of Thm 3.1 and Thm 6.1) and, if applicable, other lecture notes.
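
The final-score rule can be written out as a small function. This is only a minimal sketch, written in Python rather than in R (the package used in the course); the function name `final_score` and the example scores are hypothetical, and raising an error for a homework average below 5.5 is an assumption about how "no admission to the exam" would be represented.

```python
def final_score(homework_avg: float, exam: float, threshold: float = 5.5) -> float:
    """Final score following the rule stated on the slide (hypothetical helper)."""
    if homework_avg < threshold:
        # Assumption: without a sufficient homework average there is no exam score at all.
        raise ValueError("homework average below threshold: no admission to the exam")
    if exam >= threshold:
        # Sufficient exam: homework and exam count equally.
        return (homework_avg + exam) / 2
    # Insufficient exam: the exam score is the final score.
    return exam


print(final_score(7.0, 6.0))  # 6.5
print(final_score(8.0, 5.0))  # 5.0
```

Under this rule a good homework average cannot compensate for an insufficient exam score.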

6 Statistical Data Analysis 6 About the course (5)
To do at home each week
1) Study part of reader and, if applicable, other lecture notes
2) Do homework (together)
Where to go with questions
- About exercises and R → the teacher of your own exercise class
- About reader, lectures and other → De Gunst

7 Statistical Data Analysis 7 About the course (6)
Schedule
- Lectures: Tuesdays 11:00-12:45, M6.23
- Exercise classes: Tuesdays 10:00-10:45, presence compulsory; WN-C6.68 (group Geert Geeven), WN-F6.54 (group Beata Ros); start Sept 13.
Today
- 10:00-10:45: Lecture
- 10:45-11:00: Form “homework groups”
- 11:00-12:45: Computer class Introduction R, presence compulsory; rooms WN-S3.45, WN-S3.29.

8 Statistical Data Analysis 8 Statistical Data Analysis: Introduction (1) Statistics is everywhere: Daily life: sports statistics, … ? Areas of application: industry, … ?

9 Statistical Data Analysis 9 Statistical Data Analysis: Introduction (2)
Education in statistics
bachelor
- Algemene Statistiek (General/Basic Statistics)
- Statistical Data Analysis
master
- Statistical Models
- Mathematical Statistics/Asymptotic Statistics
- Time Series
- Statistics in Genetics
- Bayesian Statistics (UvA)
- Semiparametric Statistics (UvA)

10 Statistical Data Analysis 10 Statistical Data Analysis: Introduction (3)
Stages of statistical study
- Experimental question
- Experimental design
- Data collection
- Data analysis (this course)
- Interpretation of analysis results (this course)
- Presentation of results and conclusions
Aim of course: give practical insight in data analysis.
Needed: knowledge and understanding of possible methods; practice in application of methods.

11 Statistical Data Analysis 11 Statistical Data Analysis: Introduction (4)
Statistical Data Analysis
Starting point: data
Steps of data analysis
- First: get impression of data, make suitable summary: descriptive statistics
- If more knowledge is wanted, next: stochastic models for data → estimation (point and interval estimation), testing, investigation of relation between variables; this uses assumptions
- Translation of results to practical situation

12 Statistical Data Analysis 12 Statistical Data Analysis: Introduction (5)
Topics of this course
- Summarizing data
- Exploring distributions
- Bootstrap
- Robust methods
- Nonparametric tests
- Analysis of categorical data
- Multiple linear regression

13 Statistical Data Analysis 13 Today’s topic: Summarizing data (Reader: Chapter 2)
2.1. Data
  Data types and measurement scales
2.2. Summarizing data
  Basic questions
  Summary methods
    Univariate data: graphical summaries, numerical summaries
    Bivariate data: graphical summaries, numerical summaries
    (Multivariate data)

14 Statistical Data Analysis 14 2.1. Data Data: quantified measurement results of a study. Variable: characteristic(s) or property(ies) of individual or unit that is(are) measured. Variables can be univariate, bivariate, multivariate. Variables can be independent or dependent. Data in terms of variables: measured values of variable(s) on individuals or units. Different types of data are measured on different scales.

15 Statistical Data Analysis 15 Data: data types and measurement scales
Type: qualitative
  Measurement scale: nominal, ordinal
Type: quantitative
  Range of values: discrete
    Measurement scale: interval, ratio
  Range of values: continuous
    Measurement scale: interval, ratio

16 Statistical Data Analysis 16 2.2. Summarizing data
Summarizing data
Which goals?
- to describe
- to detect structure
Which questions should a good summary answer?

17 Statistical Data Analysis 17 Summarizing data: basic questions
Which questions should a good summary answer?
Depends on goal and context, but a good summary should tell
- where the data lie: location, scale, range, extremes, accumulation, holes, symmetry?
Sometimes also
- are the data from a known distribution (symmetric/asymmetric)?
- are the data rounded?
- is there a need to divide the data into groups for separate analysis?
- is there influence of other variables, like time?
- is there a relationship between different variables?
Plus specific, context-related questions.

18 Statistical Data Analysis 18 Summarizing univariate data (1)
Graphical summaries?
- Stem-and-leaf plot
- Histogram
- Empirical distribution function
- Boxplot (also numerical)
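
Two of these graphical summaries, the stem-and-leaf plot and the empirical distribution function, are simple enough to sketch by hand. The block below is an illustration in Python (the course itself uses R); the helper names `stem_and_leaf` and `ecdf` and the data values are hypothetical, and the stem-and-leaf helper assumes non-negative integers with the tens as stem and the units as leaf.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf plot (stem = tens, leaf = units)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    lines = []
    # Walk over every stem in the range, so that empty stems ("holes") stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


def ecdf(values, x):
    """Empirical distribution function: fraction of observations <= x."""
    return sum(1 for v in values if v <= x) / len(values)


data = [12, 15, 21, 23, 23, 38, 41, 47, 47, 59]  # hypothetical measurements
for line in stem_and_leaf(data):
    print(line)
print(ecdf(data, 23))  # 0.5
```

The stem-and-leaf plot keeps every individual value readable, while the empirical distribution function shows which fraction of the data lies at or below a given value.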

19 Statistical Data Analysis 19 Summarizing univariate data (2) Numerical summaries?
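
Common numerical summaries of a univariate sample are measures of location (mean, median) and of scale (standard deviation, variance, range, interquartile range). The sketch below computes them with Python's standard `statistics` module; it is an illustration, not the course's R workflow. The function name `numerical_summary` and the data are hypothetical, and the quartiles use the module's default ("exclusive") definition, which may differ slightly from the definitions used by other software.

```python
import statistics


def numerical_summary(data):
    """Location and scale summaries of a univariate sample (hypothetical helper)."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartiles, default 'exclusive' method
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "std_dev": statistics.stdev(data),      # sample standard deviation (n - 1)
        "variance": statistics.variance(data),  # sample variance (n - 1)
        "min": min(data),
        "max": max(data),
        "range": max(data) - min(data),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,
    }


sample = [2, 4, 4, 5, 7, 9, 11]  # hypothetical measurements
for name, value in numerical_summary(sample).items():
    print(f"{name:>8}: {value}")
```

Mean and standard deviation are sensitive to extreme values; median and interquartile range are much less so, which is why both kinds are usually reported together.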

20 Statistical Data Analysis 20 Summarizing univariate data (3)
Example Data: Cotinine content in blood (breakdown product of nicotine)
Three groups of size 50
- non-smokers
- environmental smokers
- smokers
Interesting questions?

21 Statistical Data Analysis 21 Summarizing univariate data (4)
Example Data: cotinine content blood smokers
[Histogram]
Numerical summary
Count     50
Mean      145.349
Median    33.385
Std Dev   226.358
Variance  51237.729
Range     983.33
Min       0.08
Max       983.41
IQR       174.76
25th%     12.58
75th%     187.34
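
The values in this numerical summary are related to each other, which gives a quick sanity check: range = max - min, IQR = 75th percentile - 25th percentile, and variance = (standard deviation)². The sketch below (Python, for illustration; the dictionary keys are hypothetical names) uses only the numbers quoted on the slide.

```python
import math

# Values as quoted in the numerical summary for the smokers group.
summary = {
    "mean": 145.349, "median": 33.385,
    "std_dev": 226.358, "variance": 51237.729,
    "min": 0.08, "max": 983.41, "range": 983.33,
    "q1": 12.58, "q3": 187.34, "iqr": 174.76,
}

# Range and IQR follow directly from the extremes and the quartiles.
range_ok = math.isclose(summary["max"] - summary["min"], summary["range"], abs_tol=0.005)
iqr_ok = math.isclose(summary["q3"] - summary["q1"], summary["iqr"], abs_tol=0.005)

# The variance is the squared standard deviation, up to rounding of the printed values.
var_ok = math.isclose(summary["std_dev"] ** 2, summary["variance"], rel_tol=1e-4)

# A mean far above the median points to a right-skewed distribution.
right_skewed = summary["mean"] > summary["median"]

print(range_ok, iqr_ok, var_ok, right_skewed)  # True True True True
```

The mean is more than four times the median here, so a few very large cotinine values pull the mean upward; the median and IQR describe the bulk of the group better.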

22 Statistical Data Analysis 22 Summarizing univariate data (5)
Example Data: cotinine content
[Boxplot]
[Histograms]

23 Statistical Data Analysis 23 Summarizing bivariate data (1)
Graphical summaries?
- Scatter plot
- Time plot
- Contingency table
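
A contingency table counts how often each combination of two categorical values occurs. The following is a minimal sketch in Python (not the course's R code); the function name `contingency_table` and the category labels are hypothetical.

```python
from collections import Counter


def contingency_table(pairs):
    """Cross-tabulate a list of (row_category, column_category) observations."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


# Hypothetical observations: (group, outcome)
pairs = [("A", "x"), ("A", "y"), ("B", "x"), ("A", "x"), ("B", "y"), ("B", "y")]
rows, cols, table = contingency_table(pairs)

print("    " + "  ".join(cols))
for label, counts_row in zip(rows, table):
    print(f"{label}   " + "  ".join(str(n) for n in counts_row))
```

For these observations group A has two `x` and one `y`, and group B has one `x` and two `y`.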

24 Statistical Data Analysis 24 Summarizing bivariate data (2) Numerical summaries?
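
Two standard numerical summaries of bivariate data are the sample covariance and the (Pearson) sample correlation coefficient. The sketch below computes both from their definitions in plain Python, as an illustration only; the function names and the data are hypothetical.

```python
import math


def sample_covariance(x, y):
    """Sample covariance with divisor n - 1."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)


def pearson_correlation(x, y):
    """Sample correlation: covariance scaled by both standard deviations."""
    return sample_covariance(x, y) / math.sqrt(
        sample_covariance(x, x) * sample_covariance(y, y)
    )


x = [1, 2, 3, 4, 5]    # hypothetical measurements
y = [2, 4, 6, 8, 10]   # exactly linear in x, increasing
z = [5, 4, 3, 2, 1]    # exactly linear in x, decreasing

print(sample_covariance(x, y))    # 5.0
print(pearson_correlation(x, y))  # 1.0
print(pearson_correlation(x, z))  # -1.0
```

The correlation coefficient measures only linear association, so it is best read alongside a scatter plot.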

25 Statistical Data Analysis 25 Summarizing bivariate data (3) Example Data: sales of PCs Two time plots. Same data?

26 Statistical Data Analysis 26 Summarizing multivariate data (1) Graphical summaries Chernoff faces display multivariate data in the shape of a human face. Chernoff faces handle each variable differently: the individual parts, such as eyes, ears, mouth and nose represent values of the variables by their shape, size, placement and orientation. Idea behind using faces: humans easily recognize faces and notice small changes without difficulty. Example Data: Characteristics of 12 U.S. judges

27 Statistical Data Analysis 27 Summarizing multivariate data (2)
Graphical summaries
Modern science: highly complex data sets ...
Example Data: Different types of neuroscience data
But we will deal with much simpler data sets!

28 Statistical Data Analysis 28 Recap
2.1 Data
  Data types and measurement scales
2.2 Summarizing data
  Basic questions
  Summary methods
    Univariate data: graphical summaries, numerical summaries
    Bivariate data: graphical summaries, numerical summaries
    (Multivariate data)

29 Statistical Data Analysis 29 After the break: Introduction R in computer rooms WN-S3.45, WN-S3.29

