
1 EXCEL PROJECT TUTORIAL

2 GETTING YOUR UNIQUE DATA SET … Go to the Stat 216 homepage, http://www.stat.wmich.edu/s216, and click on the Weekly Homework link.

3 GETTING YOUR UNIQUE DATA SET … Under the Excel Projects section, click on HW Data.

4 GETTING YOUR UNIQUE DATA SET … You will be directed to a page containing several data sets. Click on the one assigned for this semester: Realestate Data. You will then be directed to a page for that data set. Under the Select Variables section, check the box in front of every variable.

5 GETTING YOUR UNIQUE DATA SET … At the bottom of the page, enter 30 for the sample size and the 4-digit PIN that you use to access your weekly homework, then click on the Submit button.

6 GETTING YOUR UNIQUE DATA SET … You will be directed to a page containing your unique data set.

7 COPYING YOUR DATA SET INTO EXCEL … On the page containing your unique data set, select all, then copy. Open Microsoft Excel and paste your data set into the first cell.

8 COPYING YOUR DATA SET INTO EXCEL … Click on the Data tab, then Text to Columns, to separate the variables into several columns.

9 COPYING YOUR DATA SET INTO EXCEL … A dialogue box will appear. Choose Delimited, then click Next.

10 COPYING YOUR DATA SET INTO EXCEL … In the next dialogue box, select how your data set is delimited. In our case, the variables are separated by commas, so make sure that only the Comma box is checked. Then click Finish.

11 COPYING YOUR DATA SET INTO EXCEL … You will then see your data set separated into columns. You may change the font size or any other formatting. Since you will use this data set in all three phases of the project, save it under a filename that you will remember, e.g. stat216project.
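For readers who want to check the result outside Excel, the comma-splitting step above can be sketched in Python. This is only an illustration: the sample text, column names, and the helper split_into_columns are hypothetical and do not come from the course data page.

```python
import csv
import io

# Hypothetical sample standing in for the text copied from the data page;
# the column names and values are made up for illustration.
raw_text = """Price,Bedrooms,Pool
189000,3,0
245500,4,1
172300,2,0
"""


def split_into_columns(text):
    """Split comma-delimited text into a dict of column name -> list of strings."""
    rows = list(csv.reader(io.StringIO(text)))
    header, records = rows[0], rows[1:]
    return {name: [row[i] for row in records] for i, name in enumerate(header)}


columns = split_into_columns(raw_text)
print(columns["Price"])
```

Each variable ends up in its own list, which mirrors what Text to Columns does with a comma delimiter. The values are still text at this point and would need converting to numbers before any computation.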

12 EXCEL PROJECT Phase I

13 PHASE I In this phase, you are expected to identify the type and level of measurement of each variable in your data set. Then, depending on the kind of variable, decide which method of data presentation is appropriate for it and which measures of location and spread you could compute to better describe your data set.

14 PHASE I You may construct a table to help guide you on what to do with your variables. Example:
Variable   Type          Level of Measurement
Price      Numerical     Ratio
Color      Categorical   Nominal

15 PHASE I Once you have identified the type and level of measurement of each variable, decide which graphs or tables you could use to describe the categorical variables and which to describe the numerical ones. Microsoft Excel has an Analysis ToolPak add-in that can assist you in producing graphs. On your Data tab, you should see a button labeled Data Analysis. If you do not, you need to install the ToolPak.

16 INSTALLING DATA ANALYSIS TOOLPAK... In Excel 2007, click on the Office button at the top, then choose Excel Options.

17 INSTALLING DATA ANALYSIS TOOLPAK... The Excel Options box will appear. Click on Add-Ins.

18 INSTALLING DATA ANALYSIS TOOLPAK... You will then see the Add-Ins menu. At the bottom of this menu, select Excel Add-ins from the Manage drop-down list, then click Go.

19 INSTALLING DATA ANALYSIS TOOLPAK... The Add-Ins dialogue box will appear. Check the box for Analysis ToolPak, then click OK.

20 INSTALLING DATA ANALYSIS TOOLPAK... You will then see the Data Analysis button on the Data tab.

21 GRAPHING VARIABLES … Suppose you want to create a graph for a variable. Let's say, for example, that your variable has two categories: 1-Yes and 0-No. The first thing you need to do is count the number of observations in each category. Then select the graph that you want to make.

22 GRAPHING VARIABLES … Open the file containing your data set. Suppose it contains a categorical variable, say Pool (0-No, 1-Yes).

23 GRAPHING VARIABLES … In this example, suppose the observations for Pool run from D2 to D31. To graph a categorical variable, you must create a “bin” that contains all the categories of the variable.

24 GRAPHING VARIABLES … Since Pool has only two categories, we create a bin with the same two values, i.e. 0 and 1.

25 GRAPHING VARIABLES … Once you have created the bin, click on the Data tab, then on the Data Analysis button. You will see a menu listing the contents of the Analysis ToolPak. Since our goal is to count the number of observations in each category, choose Histogram, then click OK.

26 GRAPHING VARIABLES … You will then be prompted to enter the Input Range and the Bin Range. The input range is the column containing the observations for the variable. The bin range is the column containing the categories of the variable.

27 GRAPHING VARIABLES … In our example, the observations for Pool run from D2 to D31, while the bin runs from K3 to K4.

28 GRAPHING VARIABLES … Once you click OK, a new worksheet is created showing the count for each category of the variable.
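The counting step that the Histogram tool performs for a categorical variable can be sketched as follows. The observations below are invented for illustration (they are not the course data), and count_categories is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical Pool observations (0 = No, 1 = Yes), standing in for cells D2:D31.
pool = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]


def count_categories(values, bins):
    """Return the number of observations in each category listed in bins."""
    counts = Counter(values)
    return {category: counts.get(category, 0) for category in bins}


# The bins play the role of the bin range (0 and 1 in the tutorial example).
frequency = count_categories(pool, bins=[0, 1])
print(frequency)
```

The resulting counts are the numbers that a pie or bar graph of the variable would display.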

29 GRAPHING VARIABLES … On this worksheet, click on the Insert tab, then choose the graph you want. For example, for a pie graph, click on Pie, then choose the type of pie that you want. Excel will then display the pie graph.

30 COMPUTING SUMMARY STATISTICS … Suppose you want to describe a variable using numerical descriptive measures. Let's say our variable is the price of a house and that it is in the first column of our data set. Again, click on the Data tab, then on the Data Analysis button. From the menu, select Descriptive Statistics.

31 COMPUTING SUMMARY STATISTICS … In the Input Range box, enter the range of the variable for which you want to compute statistics.

32 COMPUTING SUMMARY STATISTICS … If the first row contains the label of the variable, check the box that says Labels in First Row. Then check the box for Summary statistics and click OK.

33 COMPUTING SUMMARY STATISTICS … On a new worksheet, the values of several numerical descriptive measures will be displayed. Adjust the column width to see the values clearly.
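As a cross-check on the Excel output, a few common measures of location and spread can be computed with Python's standard library. This sketch does not reproduce every figure the Descriptive Statistics tool reports; the prices are made up, and describe is a hypothetical helper.

```python
import statistics

# Hypothetical house prices; in the tutorial this would be the Price column.
prices = [189000, 245500, 172300, 210000, 198750, 230400]


def describe(values):
    """Return a few measures of location and spread for a numerical variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sample_sd": statistics.stdev(values),  # sample (n - 1) standard deviation
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "count": len(values),
    }


summary = describe(prices)
for name, value in summary.items():
    print(name, value)
```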

34 PHASE I WRITE-UP Using all the graphs and computations that you made, describe the data set that you have on hand. You do not have to use every variable in your write-up, but you must briefly explain why you decided to include each variable that you do use.

35 EXCEL PROJECT Phase II

36 PHASE II The second phase of the project focuses on estimation and tests of hypotheses. In this phase, you are to compute point and interval estimates for a specific variable of interest and draw a conclusion based on the confidence interval or the p-value of the test.

37 PHASE II For example, suppose we go back to our data set with the variables Price and Pool. We might be interested in the average price of a house, or in the difference between the average prices of houses with and without a pool.

38 POINT ESTIMATION If we are interested in just a point estimate of the average of a specific variable, we can use the Descriptive Statistics option under the Data Analysis menu (see the previous slides for instructions). If you want a confidence interval instead, you can use an Excel worksheet that we have provided for you.

39 CONFIDENCE INTERVAL We made an Excel worksheet that helps you compute a confidence interval for the mean easily. The spreadsheet looks like this:

40 CONFIDENCE INTERVAL The first worksheet is designed for a confidence interval for one population mean. Just follow the instructions written on the spreadsheet; the worksheet then displays your confidence interval.
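The slides do not show the formulas inside the provided worksheet, so the following is only a sketch of one standard approach, assuming a t-based interval: mean ± t × s / √n. The prices are invented, mean_confidence_interval is a hypothetical helper, and the critical value is passed in by hand because Python's standard library has no t distribution.

```python
import math
import statistics


def mean_confidence_interval(values, t_critical):
    """Return (lower, upper) for a t-based confidence interval for the mean."""
    n = len(values)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    margin = t_critical * standard_error
    return mean - margin, mean + margin


# Hypothetical prices (n = 6, so 5 degrees of freedom).
prices = [189000, 245500, 172300, 210000, 198750, 230400]

# 2.571 is the two-sided 95% t critical value for 5 degrees of freedom.
# For a sample of 30 (29 degrees of freedom) it would be about 2.045.
lower, upper = mean_confidence_interval(prices, t_critical=2.571)
print(round(lower, 2), round(upper, 2))
```

Taking the critical value as a parameter keeps the sketch dependency-free; it can be read from a t table for the sample size actually used.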

41 CONFIDENCE INTERVAL If you are interested in a confidence interval for the difference between two independent means, use the second worksheet.

42 CONFIDENCE INTERVAL First, you need to sort the data set so that the values are separated by category. For example, suppose we want a confidence interval for the difference in average price between homes with a pool (pool=1) and homes without a pool (pool=0). We need to sort the data set so that all the rows with pool=0 are next to each other and all the rows with pool=1 are next to each other.

43 SORTING YOUR DATA SET Select the entire data set (CTRL + A). Click on the Data tab, then choose the Sort button.

44 SORTING YOUR DATA SET The Sort dialogue box will appear. Since our data set has the variable names in the first row, check the "My data has headers" box.

45 SORTING YOUR DATA SET Then, from the Sort By drop-down menu, choose the variable to sort on. In our case, we use Pool. Once you have selected the variable, click OK.

46 SORTING YOUR DATA SET You will then see your data set sorted by that variable. All the rows with Pool=0 are next to each other, followed by all the rows with Pool=1.
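The purpose of the sort is to separate the prices into one block per Pool value. A minimal sketch of the same idea in Python, using invented (price, pool) rows and a hypothetical helper split_by_group:

```python
# Hypothetical (price, pool) rows; the values are made up for illustration.
rows = [(189000, 0), (245500, 1), (172300, 0), (230400, 1), (198750, 0)]


def split_by_group(records):
    """Sort records by the grouping variable and collect prices per group."""
    groups = {}
    # sorted() is stable, so rows keep their original order within each group.
    for price, pool in sorted(records, key=lambda record: record[1]):
        groups.setdefault(pool, []).append(price)
    return groups


prices_by_pool = split_by_group(rows)
print(prices_by_pool[0])  # prices for homes without a pool
print(prices_by_pool[1])  # prices for homes with a pool
```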

47 CONFIDENCE INTERVAL Once your data set is sorted, follow the instructions in the worksheet, which then displays your confidence interval.

48 CONFIDENCE INTERVAL Note that since our interest is the difference in price between homes with and without a pool, what you copy into the worksheet are the PRICES for the homes with pool=0 under the 0 column and the PRICES for the homes with pool=1 under the 1 column. You can also use this confidence interval to draw a conclusion.

49 TEST OF HYPOTHESIS There are several functions in the Analysis ToolPak that you can use to conduct a test of hypothesis. Choose the one appropriate for the test that you are going to conduct.

50 TEST OF HYPOTHESIS Suppose, in our example, we want to know whether there is a difference in the average price of houses with and without a pool. The test that we would use is the one shown on this slide.

51 TEST OF HYPOTHESIS Once you click OK, a dialogue box appears. Specify the range of prices with Pool = 0 in the first range box, the range of prices with Pool = 1 in the second range box, and the level of significance in its own box.

52 TEST OF HYPOTHESIS Suppose that in our sample data set, the prices for pool=0 run from A2 to A19 and the prices for pool=1 run from A20 to A31. We want to test the hypothesis at the 5% level of significance.

53 TEST OF HYPOTHESIS The output appears on a new worksheet. Adjust the column widths to see the numbers clearly. The output includes the value of the test statistic, the p-value for the one-tailed test, and the p-value for the two-tailed test.
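The transcript does not say which ToolPak test is selected, so the sketch below assumes a two-sample t test with equal (pooled) variances. It computes only the test statistic and its degrees of freedom; the p-value would still come from Excel or a t table. The sample values and the name pooled_t_statistic are hypothetical.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a pooled-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances (assumes equal variances).
    pooled_variance = (
        (n_a - 1) * statistics.variance(sample_a)
        + (n_b - 1) * statistics.variance(sample_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / standard_error
    return t, df


# Hypothetical prices for homes without a pool (pool=0) and with a pool (pool=1).
no_pool = [189000, 172300, 198750, 181200]
with_pool = [245500, 230400, 219900]

t_value, degrees_of_freedom = pooled_t_statistic(no_pool, with_pool)
print(round(t_value, 3), degrees_of_freedom)
```

If the variances of the two groups look very different, an unequal-variance version of the test would be the more cautious choice.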

54 PHASE II WRITE-UP Your write-up for Phase II should include all your estimates and the conclusions that you drew. You must give supporting evidence for each conclusion, i.e. specify the p-value and explain why it led you to that conclusion.

55 EXCEL PROJECT Phase III

56 PHASE III The final phase of the project is basically Phases I and II combined, with some additional analysis that you could include. For example, by the time you turn in Phase II, we will not yet have covered chi-square, regression, and correlation analysis. In your final phase, you might want to include some of these analyses to give further meaning to your data set.

57 PHASE III For example, in our data set containing the price of a house: Which variables are associated with price? Which variables could you use to predict the price of a house? These are just guide questions to help you analyze your data set further.

58 CORRELATION ANALYSIS Suppose you want to determine the strength of association between the price of a house and the number of bedrooms. In the Analysis ToolPak, choose Correlation.

59 CORRELATION ANALYSIS In the dialogue box, select the columns for price and bedrooms as the Input Range. Also, check the box for Labels in First Row.

60 CORRELATION ANALYSIS The output appears on a new worksheet and shows the correlation coefficient of the two variables.
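The correlation coefficient reported by the tool is the Pearson coefficient, which can be computed by hand as a check. The bedroom counts and prices below are invented, and pearson_r is a hypothetical helper name.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: number of bedrooms and price for six houses.
bedrooms = [2, 3, 3, 4, 4, 5]
prices = [172300, 189000, 198750, 210000, 230400, 245500]

r = pearson_r(bedrooms, prices)
print(round(r, 3))
```

A value near 1 or -1 indicates a strong linear association; a value near 0 indicates a weak one.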

61 REGRESSION ANALYSIS Suppose you want to predict the price of a house using, say, the number of bedrooms. You can use the Regression option in the Analysis ToolPak.

62 REGRESSION ANALYSIS In the Regression dialogue box, specify the range of values of the variable you want to predict (the Input Y Range) and the range of values of the variable you are using to predict it (the Input X Range).

63 REGRESSION ANALYSIS The regression output appears on a new worksheet.
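With a single predictor, the fitted line is a least-squares line, and its slope and intercept can be sketched as below. The data are made up, fit_line is a hypothetical helper, and the sketch leaves out the rest of the regression output (standard errors, R square, and so on).

```python
def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: bedrooms is the predictor (x), price is the response (y).
bedrooms = [2, 3, 3, 4, 4, 5]
prices = [172300, 189000, 198750, 210000, 230400, 245500]

slope, intercept = fit_line(bedrooms, prices)
predicted_price = intercept + slope * 3  # predicted price of a 3-bedroom house
print(round(slope, 2), round(intercept, 2), round(predicted_price, 2))
```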

64 PHASE III WRITE-UP In your final project write-up, you are expected to write an executive summary of the entire project. You might want to include the following sections to give your readers an effective paper.

65 PHASE III WRITE-UP
Introduction: What is your data set about? What are the variables? What questions did you intend to answer in this project, and what methods did you use to answer them?
Executive Summary: What are your findings? What are the answers to the questions you raised? What can you conclude about your data set?

66 PHASE III WRITE-UP
Appendix: a copy of your data set; your references.

