1 ES9 A random sample of registered voters was selected and each was asked his or her opinion on Proposal 129, a property tax reform bill. The distribution.


1 ES9 Chapter 6 ~ Two Way Tables: Presentation of Bivariate Data

A random sample of registered voters was selected, and each was asked his or her opinion on Proposal 129, a property tax reform bill. The distribution of responses is given in the table below: [table not reproduced in transcript]

Test the hypothesis "political party is independent of opinion on Proposal 129."

2 Example

A pharmaceutical company conducted an experiment to determine the effectiveness of three new cough suppressants. Each cough syrup was given to 100 randomly selected subjects: [table not reproduced in transcript]

Is there any evidence to suggest the syrups act differently to suppress coughs?

3 Bivariate Data

Bivariate data: Consists of the values of two different response variables that are obtained from the same population of interest.

Three combinations of variable types:
1. Both variables are qualitative (attribute)
2. One variable is qualitative (attribute) and the other is quantitative (numerical)
3. Both variables are quantitative (both numerical)

4 Two Qualitative Variables

When bivariate data results from two qualitative (attribute or categorical) variables, the data is often arranged in a cross-tabulation or contingency table.

Example: A survey was conducted to investigate the relationship between promotion and gender. The results are given in the table below: [table not reproduced in transcript]

5 Example

A large company is being sued for discrimination in its promotion practices. The suit alleges that women have been systematically denied promotion. The lawyers for the women have obtained access to the company's personnel records for the past 5 years. However, the court has allowed only limited access since the information is highly personal. It will allow the plaintiff to select a sample of 200 employees hired 5 years ago and follow their careers within the firm. The evidence from the sample will be used to determine whether the company is discriminating. Both sides have agreed to the procedure.

How would the judge rule in the gender discrimination case?

6 Concerning Contingency Tables

Contingency table: an arrangement of data into a two-way classification.
Data is sorted into cells, and the observed frequency in each cell is reported.
A contingency table involves two factors, or variables.
Usual question: Are the two variables independent or dependent? Or: Are the distributions of proportions the same or different?

7 Important Definitions

r × c Contingency Table:
1. r: number of rows; c: number of columns
2. Used to test the independence of the row factor and the column factor
3. n = grand total
4. Expected frequency in the ith row and the jth column:

   $E_{i,j} = \dfrac{\text{row total} \times \text{column total}}{\text{grand total}} = \dfrac{R_i \times C_j}{n}$

   Each E should be at least 5.
5. $R_1, R_2, \ldots, R_r$ and $C_1, C_2, \ldots, C_c$: marginal totals
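As a minimal sketch of definition 4, the expected frequencies can be computed directly from the marginal totals. The counts below are the gender/promotion counts that appear in the Minitab output later in this transcript; the function name `expected_counts` and the variable names are illustrative assumptions, not part of the slides.

```python
# Observed counts from the gender/promotion example
# (rows: Female, Male; columns: not promoted, promoted).
observed = [
    [65, 25],
    [30, 80],
]


def expected_counts(table):
    """Return E[i][j] = (row total i * column total j) / grand total."""
    row_totals = [sum(row) for row in table]        # R_1, ..., R_r
    col_totals = [sum(col) for col in zip(*table)]  # C_1, ..., C_c
    n = sum(row_totals)                             # grand total
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for label, row in zip(["Female", "Male"], expected):
    print(label, row)

# Rule of thumb from the slide: each expected count should be at least 5.
all_large_enough = all(e >= 5 for row in expected for e in row)
print("All expected counts >= 5:", all_large_enough)
```

The printed values can be compared against the "Expected count" entries in the Minitab output.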

8 Note

Minitab output for the previous example:

Tabulated Statistics: Gender, Promoted

Rows: Gender   Columns: Promoted

             No      Yes    All
Female       65       25     90
          42.75    47.25
Male         30       80    110
          52.25    57.75
All          95      105    200

Cell Contents: Count
               Expected count
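The slides ask whether the row and column factors are independent but the transcript does not show the test statistic itself. The sketch below assumes the standard Pearson chi-square statistic, the sum of (observed − expected)² / expected over all cells, and an assumed significance level of 0.05; neither the statistic's value nor the significance level comes from the slides.

```python
# Observed and expected counts copied from the Minitab output
# (rows: Female, Male; columns: not promoted, promoted).
observed = [[65, 25], [30, 80]]
expected = [[42.75, 47.25], [52.25, 57.75]]

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: alpha = 0.05. The chi-square critical value for df = 1
# at that level is 3.841 (standard table value).
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841
reject_independence = chi_square > CRITICAL_VALUE_DF1_ALPHA_05

print(f"chi-square = {chi_square:.3f}, df = {df}")
print("Reject independence at alpha = 0.05:", reject_independence)
```

Under these assumptions the statistic is far above the critical value, so the sample is inconsistent with gender and promotion being independent. That alone does not establish discrimination, which is why the later slides look at intervening variables.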

9 Marginal Totals

This table also displays the marginal totals (or marginals). The total of the marginal totals is the grand total: here the row totals give 90 + 110 = 200 and the column totals give 95 + 105 = 200.

Tabulated Statistics: Gender, Promoted

Rows: Gender   Columns: Promoted

             No      Yes    All
Female       65       25     90
Male         30       80    110
All          95      105    200

Cell Contents: Count

Note: Contingency tables often show percentages (relative frequencies). These percentages are based on the entire sample or on the subsample (row or column) classifications.

10 Percentages Based on the Grand Total (Entire Sample)

The previous contingency table may be converted to percentages of the grand total by dividing each frequency by the grand total and multiplying by 100.

For example, 25 becomes 12.5%: $\dfrac{25}{200} \times 100 = 12.5$

Tabulated Statistics: Gender, Promoted

Rows: Gender   Columns: Promoted

              No      Yes      All
Female     32.50    12.50    45.00
Male       15.00    40.00    55.00
All        47.50    52.50   100.00

Cell Contents: % of Total
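The conversion to percentages of the grand total can be sketched in a few lines. The helper name `percent_of_total` is an illustrative assumption; the counts are those of the gender/promotion table.

```python
# Observed counts (rows: Female, Male; columns: not promoted, promoted).
counts = [
    [65, 25],
    [30, 80],
]


def percent_of_total(table):
    """Divide every cell by the grand total and multiply by 100."""
    grand_total = sum(sum(row) for row in table)
    return [[100 * cell / grand_total for cell in row] for row in table]


pct_total = percent_of_total(counts)
for label, row in zip(["Female", "Male"], pct_total):
    print(label, [f"{p:.2f}" for p in row])
```

The cell for promoted women reproduces the slide's worked example, 25 / 200 × 100 = 12.5.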

11 Illustration

These same statistics (numerical values describing sample results) can be shown in a (side-by-side) bar graph: [bar graph not reproduced in transcript]

12 Percentages Based on Row (Column) Totals

The entries in a contingency table may also be expressed as percentages of the row (column) totals by dividing each row (column) entry by that row's (column's) total and multiplying by 100. The entries in the contingency table below are expressed as percentages of the row totals:

Tabulated Statistics: Gender, Promoted

Rows: Gender   Columns: Promoted

              No      Yes      All
Female     72.22    27.78   100.00
Male       27.27    72.73   100.00
All        47.50    52.50   100.00

Cell Contents: % of Row

Note: These statistics may also be displayed in a side-by-side bar graph.
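A short sketch of both variants, row percentages and column percentages, follows. The function names are illustrative assumptions; the counts are those of the gender/promotion table, and the row-percentage result corresponds to the "% of Row" output.

```python
# Observed counts (rows: Female, Male; columns: not promoted, promoted).
counts = [
    [65, 25],
    [30, 80],
]


def percent_of_row(table):
    """Express each cell as a percentage of its row total."""
    return [[100 * cell / sum(row) for cell in row] for row in table]


def percent_of_column(table):
    """Express each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*table)]
    return [
        [100 * cell / total for cell, total in zip(row, col_totals)]
        for row in table
    ]


row_pct = percent_of_row(counts)
col_pct = percent_of_column(counts)

for label, row in zip(["Female", "Male"], row_pct):
    print(label, [f"{p:.2f}" for p in row])
```

Row percentages answer "of the women (or men), what share was promoted?", while column percentages answer "of those promoted, what share were women?". Each row of `row_pct` sums to 100, and each column of `col_pct` sums to 100.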

13 Side by Side Bar Graph / Conditional Probability [figure not reproduced in transcript]

14 Intervening Variables

There can be more than one intervening variable. For example, we could classify people by course work and overseas experience. Here we are using two intervening variables as potential explanations for the differences in percentages of males and females that have been promoted. As more intervening variables are included, the number of cross-tabs tables increases while the sample size for each table decreases. The sample size for each cell of the cross-tabs table may become so small that we cannot reach valid conclusions about potential gender discrimination.

15 Effect of Course Work

Tabulated Statistics: Gender, IB Work?, Promoted

Rows: Gender / IB Work?   Columns: Promoted

                  No    Yes    All
Female   No       64      6     70
         Yes       1     19     20
Male     No       27      3     30
         Yes       3     77     80
All      All      95    105    200

Cell Contents: Count

16 Effect of Course Work

Tabulated Statistics: Gender, IB Work?, Promoted

Rows: Gender / IB Work?   Columns: Promoted

                   No      Yes      All
Female   No     91.43     8.57   100.00
         Yes     5.00    95.00   100.00
Male     No     90.00    10.00   100.00
         Yes     3.75    96.25   100.00
All      All    47.50    52.50   100.00

Cell Contents: % of Row
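To see what the intervening variable does, the promotion rate can be computed within each course-work group and again with the groups pooled. The sketch below uses the counts from the "Effect of Course Work" tables; the dictionary layout and function names are illustrative assumptions.

```python
# Counts keyed by (gender, took IB course work?) -> (not promoted, promoted).
counts = {
    ("Female", "No"): (64, 6),
    ("Female", "Yes"): (1, 19),
    ("Male", "No"): (27, 3),
    ("Male", "Yes"): (3, 77),
}


def promoted_pct(not_promoted, promoted):
    """Percentage promoted within one group (a row percentage)."""
    return 100 * promoted / (not_promoted + promoted)


# Promotion rate within each gender / course-work stratum.
stratum_pct = {key: promoted_pct(*cell) for key, cell in counts.items()}


def pooled_pct(gender):
    """Promotion rate for one gender, ignoring course work."""
    no = sum(cell[0] for (g, _), cell in counts.items() if g == gender)
    yes = sum(cell[1] for (g, _), cell in counts.items() if g == gender)
    return promoted_pct(no, yes)


pooled = {gender: pooled_pct(gender) for gender in ("Female", "Male")}

for key, pct in stratum_pct.items():
    print(key, f"{pct:.2f}% promoted")
for gender, pct in pooled.items():
    print(gender, "(pooled)", f"{pct:.2f}% promoted")
```

Pooled, the promotion rates for women and men differ by about 45 percentage points, but within each course-work group they differ by less than 2 points. In this sample, most of the gap is associated with who took the course work (20 of 90 women versus 80 of 110 men).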

17 Simpson's Paradox

Suppose we observe several groups and establish a relationship or correlation within each of them. Simpson's paradox says that when we combine all of the groups and look at the data in aggregate form, the correlation that we noticed before may reverse itself. This is most often due to lurking variables that have not been considered, but sometimes it is due to the numerical values of the data.

