
1 Using Excel Biostatistics 212 Lecture 4

2 Housekeeping Questions about Lab 3? –replace vs. recode –Cross-checking/recoding missing values –Analysis of a variable with missing values: Stata treats missing values as very large numbers! –agecat with chi2 or “exact” vs. age with ttest Final Project Dataset! –Check in

3 Housekeeping Lab ends at 4pm…at least for the TAs –Thanks to the TAs who stayed longer last Tuesday!

4 Today... Why are we talking about spreadsheets? Pros and Cons of using a spreadsheet for: –Data management, Statistics, Calculating, Modeling, Tables, Figures Cells Formulas Cutting and pasting formulas Spreadsheet style Examples

5 Why spreadsheets? Excel is widely used, and for good reason –Store numbers and text –Calculations –Desktop graphics – Tables and Figures –Flexible creation of ledgers, models, other complex programs

6 Why spreadsheets? How is a spreadsheet different than Stata’s data editor? –Less structured –Formulas –Formatting

7 Why spreadsheets? How is a spreadsheet different than a database program like Access? –Less structured –Formula chains –Formatting

8 Pro’s and Con’s of spreadsheets For data management –Pro’s Easy start – just name columns and start typing –Con’s No structure Can’t sort, filter or query data “Flat” file – no relational table structure allowed

9 Pro’s and Con’s of spreadsheets For statistical analysis –Pro’s Easy start, if you know how to do formulas –Con’s Extremely limited range of options Difficult to document

10 Pro’s and Con’s of spreadsheets For calculating, or “modeling” –Pro’s Repetitive calculations easy Complex calculations easy –Con’s Simple, 1-time calculations not as fast as a calculator Sometimes hard to decipher in retrospect

11 Pro’s and Con’s of spreadsheets Tables and Figures – will discuss in Session 6

12 Cells The basic building block of a spreadsheet Can contain: –Numbers –Text –Dates, times, other special formats –“blanks” Start with about 50 million blank cells! (256 cols x 65,536 rows x 3 worksheets, in Excel 2003 and earlier)

13 Cells, cont Enter anything you like into each cell (numbers, text, symbols, etc) using keyboard Contents displayed on spreadsheet Organized and named by column/row

14 Formulas Use when you want the contents of one cell to depend on the contents of other cells ALWAYS starts with: = (an “equals sign”)

15 Formulas Can contain: –Numbers –Text –References to cells –The usual math operators (+ - * / ^ ) –Built-in functions

16 Formulas Cell contents update automatically when a referenced cell content changes “Chains” of formulas make for flexible calculating
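The automatic updating described on this slide can be imitated outside Excel. The following is a minimal Python sketch, not how Excel works internally: the cell names and the `get` helper are hypothetical, and formulas are simply re-evaluated every time a value is read.

```python
# A toy spreadsheet: each cell holds either a constant or a formula.
# A formula is a function that reads other cells through `get`.
cells = {
    "A1": 10,
    "A2": 32,
    "A3": lambda get: get("A1") + get("A2"),  # like =A1+A2
    "A4": lambda get: get("A3") * 2,          # like =A3*2, chained on A3
}

def get(name):
    """Return the displayed value of a cell, evaluating formulas on demand."""
    content = cells[name]
    return content(get) if callable(content) else content

before = get("A4")   # (10 + 32) * 2
cells["A1"] = 20     # change an input cell
after = get("A4")    # the whole chain reflects the new input
print(before, after)
```

Changing `A1` changes `A3`, which in turn changes `A4`: that is the "chain" of formulas.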

17 Formulas Contents of a cell displayed on spreadsheet The formula determining that content is displayed in the “formula box” Example

18 Formulas Types of formulas –Simple arithmetic operators +, -, *, /, ^ –Built-in functions LN(number) –Returns the natural log of a number ABS(number) –Returns the absolute value of a number SUM(range of cells) –Returns the sum of the values in the range –SUM(A5:A10) AVERAGE(range of cells) –Returns the average of the values in the range
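For readers who want to cross-check a worksheet, here is a rough Python counterpart of the functions on this slide. The list `values` is made-up data standing in for the range A5:A10.

```python
import math
from statistics import mean

values = [2, 4, 4, 4, 5, 5]   # hypothetical contents of A5:A10

natural_log = math.log(10)    # LN(10)
absolute = abs(-3.5)          # ABS(-3.5)
total = sum(values)           # SUM(A5:A10)
average = mean(values)        # AVERAGE(A5:A10)

print(natural_log, absolute, total, average)
```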

19 Formulas Types of formulas –Built-in functions, cont: STDEV(range of cells) –Returns the sample standard deviation NORMINV(probability, mean of dist, SD of dist) –Returns the value at a given cumulative probability for a normal distribution with that mean and SD (the z-value when mean = 0 and SD = 1) MAX(number1, number2, etc), MIN(…) PMT(int rate, #payments, principal) –Returns the payment per period for a loan
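The three less familiar functions on this slide can be sketched in Python as follows. The data and loan figures are invented for illustration, and `pmt` is a hand-written helper, not a library call: it uses the standard annuity formula and assumes Excel's defaults (no future value, payments at the end of each period), returning a negative number because the payment is money going out.

```python
from statistics import NormalDist, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

# STDEV: sample standard deviation (n - 1 in the denominator)
sd = stdev(values)

# NORMINV(0.975, 0, 1): inverse of the normal cumulative distribution
z = NormalDist(mu=0, sigma=1).inv_cdf(0.975)

def pmt(rate, nper, principal):
    """Payment per period on a loan, mirroring PMT(rate, nper, pv)."""
    if rate == 0:
        return -principal / nper
    return -principal * rate / (1 - (1 + rate) ** -nper)

# 30-year loan of 200,000 at 5% per year, paid monthly
monthly = pmt(0.05 / 12, 360, 200_000)
print(round(sd, 3), round(z, 2), round(monthly, 2))
```

Note that the rate and the number of payments must use the same period: a yearly rate is divided by 12 when payments are monthly.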

20 Formulas Types of formulas –Logic IF(boolean, value 1, value 2) –Returns value 1 if TRUE, value 2 if FALSE –The value can be a number, an expression, a string, etc. AND(boolean, boolean, boolean…) –Returns TRUE if all booleans are true, otherwise FALSE OR(boolean, boolean, boolean…) –Returns TRUE if any booleans are true, otherwise FALSE

21 Formulas Types of formulas, cont –String manipulation LEFT(text, #characters) –Returns the left part of the string (the # of characters specified) –See also RIGHT(…) and MID(…) VALUE(text) –Converts a number in a string to an actual number (like “destring” in Stata) CONCATENATE(text1, text2,…) –Puts strings together –Misc NOW() –Returns the current date, time; see HOUR(…), etc ISBLANK(…), ISERROR(…), ISNUMBER(…)
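The string functions map closely onto Python slicing. The helpers below are hypothetical names written for this sketch; the main thing to notice is that MID counts characters from 1, while Python counts from 0.

```python
def left(text, n):
    """LEFT(text, n): the first n characters."""
    return text[:n]

def right(text, n):
    """RIGHT(text, n): the last n characters."""
    return text[-n:] if n > 0 else ""

def mid(text, start, n):
    """MID(text, start, n): n characters beginning at 1-based position start."""
    return text[start - 1:start - 1 + n]

def value(text):
    """VALUE(text): turn a number stored as text into a real number."""
    return float(text)

def concatenate(*parts):
    """CONCATENATE(text1, text2, ...): join strings end to end."""
    return "".join(parts)

print(left("Biostat", 3), right("Biostat", 4), mid("Biostat", 4, 2))
print(value("3.5") + 1, concatenate("Lab", " ", "4"))
```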

22 Formulas Tips –Use parentheses IF(SUM(A5:A10)>5,1,IF(C9="y",2,3)) –Or do in multiple steps
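The "multiple steps" tip is easier to see when the nested IF on this slide is unrolled. This Python sketch assumes `range_values` stands for A5:A10 and `c9` for cell C9; the function name is made up. The comparison is lower-cased because Excel's `=` ignores case when comparing text.

```python
def classify(range_values, c9):
    """Mirror IF(SUM(A5:A10)>5, 1, IF(C9="y", 2, 3)) one step at a time."""
    total = sum(range_values)          # step 1: SUM(A5:A10)
    if total > 5:                      # step 2: the outer IF
        return 1
    if str(c9).lower() == "y":         # step 3: the inner IF
        return 2
    return 3

print(classify([1, 2, 3], "n"))   # sum is 6, so the outer IF returns 1
print(classify([1, 1], "y"))      # sum is 2, C9 is "y", so 2
print(classify([1, 1], "n"))      # sum is 2, C9 is not "y", so 3
```

In a worksheet the same idea means putting the SUM in its own cell and pointing a shorter IF at that cell.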

23 Cutting/Copying and Pasting Cutting and Copying treat formulas differently!

24 Cutting and pasting formulas Excel assumes the cell references are ABSOLUTE, and you’re just moving the location of the formula cell Example

25 Copying and pasting formulas Excel assumes the cell references are RELATIVE Example Shortcut: drag little square in the corner…

26 Copying and pasting formulas If you want to FIX the position of a referenced cell, use $’s = A5 + $B$6 Example
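What Excel does to references when a formula is copied can be approximated in a few lines of Python. This is a simplified sketch, not Excel's own logic: `copy_formula` and its helpers are hypothetical, it only handles plain references such as A5 or $B$6, and it would be fooled by function names that end in digits (for example LOG10).

```python
import re

# Optional $, column letters, optional $, row digits
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")

def col_to_num(col):
    """Convert column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n

def num_to_col(n):
    """Convert a column number back to letters: 1 -> A, 27 -> AA."""
    letters = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return letters

def copy_formula(formula, d_rows, d_cols):
    """Shift relative references by the paste offset; leave $-fixed parts alone."""
    def shift(match):
        col_fixed, col, row_fixed, row = match.groups()
        if not col_fixed:
            col = num_to_col(col_to_num(col) + d_cols)
        if not row_fixed:
            row = str(int(row) + d_rows)
        return f"{col_fixed}{col}{row_fixed}{row}"
    return REF.sub(shift, formula)

# Copy the formula one row down: A5 moves, $B$6 stays put
print(copy_formula("=A5+$B$6", d_rows=1, d_cols=0))   # =A6+$B$6
```

Cutting and pasting, by contrast, would leave the formula text unchanged, which is the difference the preceding slides describe.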

27 Examples Repetitive calculations –Back-transforming linear regression coefficients Complex calculations –2 x 2 template –Converting “Plate” map to data for Telomere Length data Modeling –Mortgage calculator –Risk integrator –Figure 2 for LDL-lowering paper

28 Spreadsheet style Formatting –Text –Column width –Borders –Placement of stuff on the page

29 Spreadsheet style For models: –Inputs on the left, in red –Outputs on the right, in blue, boxed, bolded, etc –Calculations on other sheets –“Protect” all cells besides inputs Format/Cells…/Protection Tools/Protect

30 Take home points Understand cells and formulas Use copy/paste with and without fixed cells ($A$45) Good formatting adds significant value to your spreadsheet

31 Today in Lab… Practice with: –A repetitive calculation spreadsheet –A complex calculation spreadsheet –Introduction to making a figure with Excel NOTE: a “picture” of the answers is on the back sheet! Due before lecture next week Extra credit puzzle challenge – 2x2 Excel template –Due Sept 21st – email to mpletcher@epi.ucsf.edu

32 To come… Next lecture –Epidemiologic analysis with Stata 2 x 2 tables, confounding and interaction Epitab commands Logistic regression introduction

