
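1 What you need to know before the recitation class Teaching assistant: 陳佳滎 (滎 is pronounced ㄧㄥˊ, yíng) TA responsibilities – Friday problem-solving session (will not necessarily run until 2 p.m.) – Each session reviews the key points of that week's lecture and works through practice problems – Assigning homework – Review and summary of key points before exams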

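2 What you need to know before the recitation class You may eat lunch and have drinks during class; if you need to leave early, you may simply leave on your own. All slides will be uploaded to the instructor's web page.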

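3 Contents of statistics Descriptive statistics – Studies how to summarize and present existing statistical data – Graphical methods – Numerical methods Probability distributions Inferential statistics – The scientific method of using a sample drawn from a population to estimate, test, or predict unknown characteristics of that population [Diagram: population and sample, linked by sampling (population to sample) and inference (sample to population)]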

4 Chapter 4 Numerical Descriptive Techniques

5 Measures of Central Location – Mean, Median, Mode Measures of Variability – Range, Standard Deviation, Variance, Coefficient of Variation Measures of Relative Standing – Percentiles, Quartiles Measures of Linear Relationship – Covariance, Coefficient of Correlation, Coefficient of Determination, Least Squares Line

6 The Arithmetic Mean This is the most popular and useful measure of central location. Mean = (Sum of the observations) / (Number of observations), that is, $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ for a sample of $n$ observations.

7 The Median The median of a set of observations is the value that falls in the middle when the observations are arranged in order of magnitude. Example: Find the median of the time on the internet for the 10 adults. Even number of observations: 0, 0, 5, 7, 8, 9, 12, 14, 22, 33. The median is 8.5, the average of the two middle values (8 and 9). Suppose only 9 adults were sampled (exclude, say, the longest time (33)). Odd number of observations: 0, 0, 5, 7, 8, 9, 12, 14, 22. The median is 8. Comment: Sample and population medians are computed the same way.
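The mean and median rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names are hypothetical, and the data are the ten internet times listed on the slide.

```python
def mean(values):
    """Arithmetic mean: sum of the observations / number of observations."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the sorted data; for even n, average the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Internet times for the 10 adults, taken from the slide.
internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

mean_10 = mean(internet_times)          # 110 / 10 = 11.0
median_10 = median(internet_times)      # even n: (8 + 9) / 2 = 8.5
median_9 = median(internet_times[:-1])  # odd n (33 excluded): 8
```

Dropping the largest value moves the median only from 8.5 to 8, which shows how little the median reacts to an extreme observation.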

8 The Mode The mode of a set of observations is the value that occurs most frequently. A set of data may have one mode (or modal class), or two or more modes. The modal class: for large data sets the modal class is much more relevant than a single-value mode.
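Because a data set may have more than one mode, a helper that returns every most-frequent value is a reasonable way to sketch the definition. The function name `modes` and the second data set are assumptions made for illustration; the first data set is the internet-time example from the median slide.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


# Internet times from the median example: 0 is the only repeated value.
single_mode = modes([0, 0, 5, 7, 8, 9, 12, 14, 22, 33])

# Hypothetical data with two equally frequent values (bimodal).
two_modes = modes([1, 1, 2, 2, 3])
```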

9 Example 1 The times (to the nearest minute) that a sample of 9 bank customers waited in line were recorded and are listed here Determine the mean, median, and mode for these data.

10 Solution

11 Relationship among Mean, Median, and Mode If a distribution is symmetrical, the mean, median and mode coincide. If a distribution is asymmetrical, and skewed to the left or to the right, the three measures differ. [Figure: a positively skewed distribution (“skewed to the right”) with the mode, median, and mean marked; typically mode < median < mean]

12 Relationship among Mean, Median, and Mode If a distribution is symmetrical, the mean, median and mode coincide. If a distribution is not symmetrical, and skewed to the left or to the right, the three measures differ. [Figure: a positively skewed distribution (“skewed to the right”), where typically mode < median < mean, and a negatively skewed distribution (“skewed to the left”), where typically mean < median < mode]

13 The range – The range of a set of observations is the difference between the largest and smallest observations: Range = Largest observation – Smallest observation. – Its major advantage is the ease with which it can be computed. – Its major shortcoming is its failure to provide information on the dispersion of the observations between the two end points. But how do all the observations spread out between the smallest and the largest observation? The range cannot assist in answering this question.

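14 Variance … The variance of a population is: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$, where $\mu$ is the population mean and $N$ is the population size. The variance of a sample is: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$, where $\bar{x}$ is the sample mean. Note! The denominator is the sample size ($n$) minus one!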

15 Variance … As you can see, you have to calculate the sample mean (x-bar) in order to calculate the sample variance. Alternatively, there is a shortcut formula that calculates the sample variance directly from the data without the intermediate step of calculating the mean. It is given by: $s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right]$
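To check that the definitional and shortcut forms of the sample variance agree, both can be coded side by side. This is a sketch under stated assumptions: the function names are hypothetical, and the data are the ten internet times from the median example rather than anything given on the variance slides.

```python
def population_variance(values):
    """Average squared deviation from the mean, dividing by N."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def sample_variance(values):
    """Definitional form: squared deviations from x-bar, dividing by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Shortcut form: uses only the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)


internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

s2_definition = sample_variance(internet_times)          # 922 / 9
s2_shortcut = sample_variance_shortcut(internet_times)   # (2132 - 1210) / 9
sigma2 = population_variance(internet_times)             # 922 / 10
```

For these data both sample formulas reduce to 922 / 9, roughly 102.44. In floating point the shortcut form can lose precision when the mean is large relative to the spread, so the definitional form is usually the safer choice in code.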

16 Coefficient of Variation … The coefficient of variation of a set of observations is the standard deviation of the observations divided by their mean, that is: Population coefficient of variation: $CV = \frac{\sigma}{\mu}$ Sample coefficient of variation: $cv = \frac{s}{\bar{x}}$
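A short sketch of the sample coefficient of variation, using Python's standard `statistics` module (`stdev` divides by n - 1, matching the sample formula). The function name `sample_cv` is hypothetical and the data are the internet times from the median example, used here only for illustration.

```python
import statistics


def sample_cv(values):
    """Sample coefficient of variation: sample standard deviation / sample mean."""
    return statistics.stdev(values) / statistics.mean(values)


internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]
cv = sample_cv(internet_times)  # sqrt(922 / 9) / 11, about 0.92
```

A cv near 0.92 says the standard deviation is almost as large as the mean, which is a unit-free way to describe how widely these times vary.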

17 The Empirical Rule… If the histogram is bell shaped: Approximately 68% of all observations fall within one standard deviation of the mean. Approximately 95% of all observations fall within two standard deviations of the mean. Approximately 99.7% of all observations fall within three standard deviations of the mean.

18 Chebysheff’s Theorem… A more general interpretation of the standard deviation is derived from Chebysheff’s Theorem, which applies to all shapes of histograms (not just bell shaped). The proportion of observations in any sample that lie within k standard deviations of the mean is at least: $1 - \frac{1}{k^2}$ for $k > 1$. For k=2 (say), the theorem states that at least 3/4 of all observations lie within 2 standard deviations of the mean. This is a “lower bound” compared to the Empirical Rule’s approximation (95%).
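The lower bound of 1 - 1/k² (for k > 1) can be compared with the proportion actually observed in a data set. The sketch below is illustrative only: the function names are hypothetical, and the internet times from the median example stand in for real data.

```python
import statistics


def chebysheff_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2


def proportion_within(values, k):
    """Observed proportion of values within k sample standard deviations of the mean."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    inside = [x for x in values if abs(x - x_bar) <= k * s]
    return len(inside) / len(values)


internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

bound_k2 = chebysheff_lower_bound(2)               # 1 - 1/4 = 0.75
observed_k2 = proportion_within(internet_times, 2) # 9 of 10 values, 0.9
```

Here 90% of the observations fall within two standard deviations, comfortably above the guaranteed 75%. The theorem gives a floor for any shape of histogram, not an estimate.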

19 Example 2 Determine the variance, standard deviation, range, and the cv of the following sample

20 Solution Range = 31 – 9 = 22 cv = s / x-bar = 6.82/17.22 ≈ 0.396

21 Measures of Relative Standing and Box Plots Percentile – The pth percentile of a set of measurements is the value for which p percent of the observations are less than that value and (100 – p) percent of all the observations are greater than that value. – Example: Suppose your score is the 60th percentile of an SAT test. Then 60% of all the scores lie below your score and 40% lie above it.

22 Quartiles Commonly used percentiles – First (lower) decile = 10th percentile – First (lower) quartile, Q1, = 25th percentile – Second (middle) quartile, Q2, = 50th percentile – Third (upper) quartile, Q3, = 75th percentile – Ninth (upper) decile = 90th percentile

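23 Location of Percentiles Find the location of any percentile using the formula $L_P = (n + 1)\frac{P}{100}$, where $L_P$ is the location of the Pth percentile in the ordered data and $n$ is the number of observations.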
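As a sketch of how a percentile location is turned into a value, the code below assumes the common textbook location formula L_P = (n + 1)P/100 and linear interpolation between the two neighbouring ordered observations when the location is fractional. The function name `percentile` and the clamping at the ends are assumptions made for this illustration; the data are the internet times from the median example.

```python
def percentile(values, p):
    """Value at location L_P = (n + 1) * p / 100 in the sorted data (1-based)."""
    ordered = sorted(values)
    n = len(ordered)
    location = (n + 1) * p / 100
    lower = int(location)           # position of the observation just below
    fraction = location - lower     # how far to move toward the next observation
    if lower < 1:
        return ordered[0]           # location falls before the first observation
    if lower >= n:
        return ordered[-1]          # location falls at or beyond the last one
    below, above = ordered[lower - 1], ordered[lower]
    return below + fraction * (above - below)


internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]

q1 = percentile(internet_times, 25)  # location 2.75: 0 + 0.75 * (5 - 0) = 3.75
q2 = percentile(internet_times, 50)  # location 5.5:  8 + 0.5 * (9 - 8) = 8.5
q3 = percentile(internet_times, 75)  # location 8.25: 14 + 0.25 * (22 - 14) = 16.0
```

Other software may use a different interpolation convention, so quartiles from a spreadsheet or statistics package can differ slightly from these values.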

24 Example 3 (Textbook 4.40) Determine the first, second, and third quartiles of the following data

25 Solution

26 Example 4 (Textbook 4.38) Find the third and eighth deciles (30th and 80th percentiles) of the following data set

27 Solution

28 Interquartile Range Interquartile range = Q3 – Q1 This is a measure of the spread of the middle 50% of the observations. A large value indicates a large spread of the observations.

29 Box Plot – This is a pictorial display that provides the main descriptive measures of the data set: L - the largest observation Q3 - the upper quartile Q2 - the median Q1 - the lower quartile S - the smallest observation [Figure: box plot marking S, Q1, Q2, Q3, and L, with a whisker and the length 1.5(Q3 – Q1) indicated]
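The quantities a box plot displays can be computed directly. The sketch below uses `statistics.quantiles` from the Python standard library (Python 3.8+), whose `"exclusive"` method interpolates at location (n + 1)P/100. It also assumes the common convention that observations more than 1.5(Q3 – Q1) beyond the box are flagged as outliers; the function name and dictionary keys are hypothetical.

```python
import statistics


def box_plot_summary(values):
    """Five-number summary, interquartile range, and 1.5 * IQR outlier fences."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "S": min(values),
        "Q1": q1,
        "Q2": q2,
        "Q3": q3,
        "L": max(values),
        "IQR": iqr,
        "outliers": [x for x in values if x < lower_fence or x > upper_fence],
    }


internet_times = [0, 0, 5, 7, 8, 9, 12, 14, 22, 33]
summary = box_plot_summary(internet_times)
```

For these data the interquartile range is 16 – 3.75 = 12.25, so the upper fence sits at 34.375 and even the largest time (33) is not flagged as an outlier.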

30 Measures of Linear Relationship … We now present two numerical measures of linear relationship that provide information as to the strength & direction of a linear relationship between two variables (if one exists). They are the covariance and the coefficient of correlation. – Covariance - is there any pattern to the way two variables move together? – Coefficient of correlation - how strong is the linear relationship between two variables?

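31 Covariance … Population covariance: $\sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$, where $\mu_x$ and $\mu_y$ are the population means of variable X and variable Y. Sample covariance: $s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$, where $\bar{x}$ and $\bar{y}$ are the sample means of variable X and variable Y. Note: the divisor is n-1, not n as you may expect.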

32 Covariance … In much the same way there was a “shortcut” for calculating sample variance without having to calculate the sample mean, there is also a shortcut for calculating sample covariance without having to first calculate the means: $s_{xy} = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i y_i - \frac{\sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n}\right]$
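The definitional and shortcut forms of the sample covariance can be checked against each other in code. The data below are hypothetical values chosen so the arithmetic is easy to follow by hand, and the function names are illustrative.

```python
def sample_covariance(xs, ys):
    """Definitional form: products of deviations from the means, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_covariance_shortcut(xs, ys):
    """Shortcut form: uses only the sums of x, y, and x * y."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)


# Hypothetical paired observations (not from the textbook exercises).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

cov_definition = sample_covariance(xs, ys)          # (4 + 0 + 0 + 0 + 2) / 4 = 1.5
cov_shortcut = sample_covariance_shortcut(xs, ys)   # (66 - 15 * 20 / 5) / 4 = 1.5
```

The positive result reflects that y tends to rise with x in this small sample; the size of the number depends on the units of both variables.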

33 Covariance … (Generally speaking) When two variables move in the same direction (both increase or both decrease), the covariance will be a large positive number. When two variables move in opposite directions, the covariance is a large negative number. When there is no particular pattern, the covariance is a small number.

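34 Coefficient of Correlation … The coefficient of correlation is defined as the covariance divided by the product of the standard deviations of the variables: Population coefficient of correlation: $\rho = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$ ($\rho$ is the Greek letter “rho”) Sample coefficient of correlation: $r = \frac{s_{xy}}{s_x s_y}$ This coefficient answers the question: How strong is the association between X and Y?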

35 Coefficient of Correlation … The advantage of the coefficient of correlation over covariance is that it has a fixed range from -1 to +1, thus: If the two variables are very strongly positively related, the coefficient value is close to +1 (strong positive linear relationship). If the two variables are very strongly negatively related, the coefficient value is close to -1 (strong negative linear relationship). No straight line relationship is indicated by a coefficient close to zero.

36 Coefficient of Correlation … ρ or r = +1: strong positive linear relationship ρ or r = 0: no linear relationship ρ or r = -1: strong negative linear relationship
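A minimal sketch of the sample coefficient of correlation, computed as the sample covariance divided by the product of the two sample standard deviations. All names and data here are hypothetical and serve only to show the three reference cases (+1, -1, and a value in between).

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance with divisor n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def sample_std(values):
    """Sample standard deviation (the covariance of a variable with itself, rooted)."""
    return math.sqrt(sample_covariance(values, values))


def correlation(xs, ys):
    """Sample coefficient of correlation r = s_xy / (s_x * s_y)."""
    return sample_covariance(xs, ys) / (sample_std(xs) * sample_std(ys))


xs = [1, 2, 3, 4, 5]

r_moderate = correlation(xs, [2, 4, 5, 4, 5])     # 1.5 / sqrt(2.5 * 1.5), about 0.77
r_positive = correlation(xs, [3, 5, 7, 9, 11])    # exact line with positive slope
r_negative = correlation(xs, [10, 8, 6, 4, 2])    # exact line with negative slope
```

Unlike the covariance, r stays the same if either variable is rescaled (for example, hours converted to minutes), which is why it is easier to interpret.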

37 Example 5 (Textbook 4.58) Are the marks one receives in a course related to the amount of time spent studying the subject? To analyze this mysterious possibility, a student took a random sample of 10 students who had enrolled in an accounting class last semester. She asked each to report his or her mark in the course and the total number of hours spent studying accounting. These data are listed here (columns: Time Spent Studying, Marks). a. Calculate the covariance. b. Calculate the coefficient of correlation. c. Determine the least squares line. d. What do the statistics calculated above tell you about the relationship between marks and study time? e. Calculate the coefficient of determination.

38 Solution

39

40 e. $R^2 = r^2 = (0.8811)^2 = 0.7763$
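Parts c and e of the example can be sketched in code. The example's data table did not survive in this transcript, so the data below are hypothetical. The sketch assumes the usual least squares formulas: slope b1 = s_xy / s_x² and intercept b0 = y-bar – b1 · x-bar, with the coefficient of determination taken as r². Function names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (b0, b1, r_squared) for the line y-hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n - 1)   # sample variance of x
    s_yy = sum((y - y_bar) ** 2 for y in ys) / (n - 1)   # sample variance of y
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    r_squared = s_xy ** 2 / (s_xx * s_yy)                # equals r ** 2
    return b0, b1, r_squared


# Hypothetical study-time and mark pairs (NOT the textbook data).
hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]

b0, b1, r_squared = least_squares_line(hours, marks)  # b0 = 2.2, b1 = 0.6, R^2 = 0.6

# The slide's own arithmetic for part e: r = 0.8811 gives R^2 of about 0.7763.
slide_r_squared = round(0.8811 ** 2, 4)
```

An R² of 0.7763 would mean that about 78% of the variation in marks is explained by the variation in study time.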

