Part 0: Introduction 0-1/18 Statistics and Data Analysis Professor William Greene Stern School of Business IOMS Department Department of Economics
Part 0: Introduction 0-2/18 Statistics and Data Analysis Part 0 - Introduction
Part 0: Introduction 0-3/18 Professor William Greene; Economics and IOMS Departments Office: KMEC, 7-90 (Economics Department) Office phone: 212-998-0876 Email: email@example.com URL: http://people.stern.nyu.edu/wgreene http://people.stern.nyu.edu/wgreene/Statistics/Outline.htm
Part 0: Introduction 0-4/18 Course Objectives Basic Understanding Understand random outcomes and random information Understand statistical information as the measured outcomes of random processes Technical Know How Learn how to analyze statistical information Statistical analysis Model building Learn how to present statistical information
Part 0: Introduction 0-5/18 What Does it Mean? Slightly more than one-third of Americans have a favorable opinion of the Democratic-led Congress, a poll said Wednesday. The Pew Research Center for the People & the Press said the 37% expressing a positive opinion represents a decline of 13 points since April. The favorable percentage is one of the lowest in more than two decades of Pew surveys – if not the lowest, the poll said. The previous low was 40% in January, but the result is not statistically significant because of the margin of error. (USA Today, 9/3/09, page 4)
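To make the "margin of error" in the quote concrete, here is a minimal sketch of the usual 95% margin of error for a sample proportion, using the normal approximation. The sample size of 1,000 is an assumption for illustration only: the article does not say how many people Pew surveyed. The function name is hypothetical.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)

# ASSUMPTION: n = 1000 respondents; the article does not report the sample size.
n = 1000
for p_hat in (0.37, 0.40):
    moe = margin_of_error(p_hat, n)
    print(f"p = {p_hat:.2f}: +/- {100 * moe:.1f} percentage points")
```

Under that assumed sample size each percentage carries a margin of roughly three points, so a three-point gap between two polls is hard to tell apart from sampling noise.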
Part 0: Introduction 0-7/18 Really? To Get Rid of Hiccups, Have Someone Startle You. The truth is: Most home remedies, like holding your breath or drinking from a glass of water backward, haven't been medically proven to be effective, says Pollack. However, you can try this trick dating back to 1971, when it was published in The New England Journal of Medicine: Swallow one teaspoon of white granulated sugar. According to the study, this tactic resulted in the cessation of hiccups in 19 out of 20 afflicted patients. Posted August 31, 2010, cnn.com http://www.cnn.com/2010/HEALTH/08/31/rs.12.health.myths/index.html?iref=allsearch
Part 0: Introduction 0-8/18 Heard on the Street? Dear Professor Greene, The WSN is trying to poll people on the Park51 Mosque debate. I saw that you were an statistics/data analysis professor and I was wondering if you could explain how we should go about conducting this poll. For example, approximatley [sic] how many people would we need to poll for the data to be completley [sic] unbaised? Email received September 5, 2010
Part 0: Introduction 0-9/18 The following was taken from http://www.msnbc.msn.com/id/27339545/ An msnbc.com guide to presidential polls Why results, samples and methodology vary from survey to survey WASHINGTON - A poll is a small sample of some larger number, an estimate of something about that larger number. For instance, what percentage of people reports that they will cast their ballots for a particular candidate in an election? A sample reflects the larger number from which it is drawn. Let's say you had a perfectly mixed barrel of 1,000 tennis balls, of which 700 are white and 300 orange. You do your sample by scooping up just 50 of those tennis balls. If your barrel was perfectly mixed, you wouldn't need to count all 1,000 tennis balls; your sample would tell you that 30 percent of the balls were orange.
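The barrel example can be tried out directly. The sketch below repeatedly scoops 50 balls from a barrel of 700 white and 300 orange and records the orange share of each scoop. The function name, the seed, and the choice of 1,000 repetitions are illustrative assumptions, not part of the quoted article.

```python
import random

def sample_orange_share(n_white=700, n_orange=300, sample_size=50, rng=None):
    """Scoop `sample_size` balls without replacement; return the share that is orange."""
    rng = rng or random.Random()
    barrel = ["white"] * n_white + ["orange"] * n_orange
    scoop = rng.sample(barrel, sample_size)  # random draw without replacement
    return scoop.count("orange") / sample_size

# ASSUMPTION: 1000 repeated scoops and a fixed seed, chosen only so runs are repeatable.
rng = random.Random(12345)
shares = [sample_orange_share(rng=rng) for _ in range(1000)]

print(f"smallest orange share: {min(shares):.2f}")
print(f"largest orange share:  {max(shares):.2f}")
print(f"average orange share:  {sum(shares) / len(shares):.3f}")
```

The average across scoops should sit close to 30 percent, but any single scoop of 50 can land some distance from it. That spread is what a poll's margin of error describes.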
Part 0: Introduction 0-10/18 Technical Help Wanted Our firm is looking for a [Ph.D.-level] statistician to assist us in analyzing a simple database of compensation levels. Our database includes 93 unique records for different institutions. We expect to analyze two dependent variables against 13 independent variables. We need to perform multivariate regression analysis to determine which of the variables are statistically significant. We also need to calculate the t-statistics for each of the independent variables and adjusted r-squared values for the multivariate regression model developed. We expect that some of the variables may need to be transformed prior to creating the regression analysis. Additional statistical approaches and techniques may be required as appropriate. Subsequent to the analysis of each of the variables, we will require a brief write-up detailing any relationships (or lack thereof) uncovered through the analysis. We anticipate that this write-up will be approximately 2-3 pages in length, excluding any supporting appendices. This write up should describe, in plain English, all relevant details regarding the analysis.
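As a rough illustration of what the ad is asking for, the sketch below fits a least-squares regression and reports a t-statistic for each coefficient along with the adjusted R-squared. The data are synthetic, generated only to match the stated dimensions (93 records, 13 independent variables); nothing here comes from the firm's database. It handles a single dependent variable, the function name is hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def ols_summary(X, y):
    """Least squares with an intercept: coefficients, t-statistics, adjusted R^2."""
    n, k = X.shape
    Z = np.column_stack([np.ones(n), X])           # prepend an intercept column
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)   # least-squares coefficients
    resid = y - Z @ beta
    dof = n - (k + 1)                              # residual degrees of freedom
    s2 = resid @ resid / dof                       # estimated error variance
    cov = s2 * np.linalg.inv(Z.T @ Z)              # coefficient covariance matrix
    t_stats = beta / np.sqrt(np.diag(cov))
    sst = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - (resid @ resid) / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    return beta, t_stats, adj_r2

# SYNTHETIC data for illustration only: 93 records, 13 independent variables.
rng = np.random.default_rng(0)
n, k = 93, 13
X = rng.normal(size=(n, k))
y = 2.0 + 1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=n)

beta, t_stats, adj_r2 = ols_summary(X, y)
for j, (b, t) in enumerate(zip(beta, t_stats)):
    label = "intercept" if j == 0 else f"x{j}"
    print(f"{label:>9}: coef = {b:7.3f}, t = {t:7.2f}")
print(f"adjusted R^2 = {adj_r2:.3f}")
```

In this synthetic setup only the first two variables actually drive the outcome, so their t-statistics should be large in absolute value while most of the others stay small.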
Part 0: Introduction 0-11/18 Course Prerequisites Basic algebra. (Especially summation) Geometry (straight lines) Logs and exponents NOTE: I (you) will use only base e (natural) logs, not base 10 (common) logs in this course. A smattering of simple calculus. (I may use two or three derivatives during the entire semester.)
Part 0: Introduction 0-12/18 Course Materials Notes: Distributed in first class Text: Hildebrand, Ott and Gray. Basic Statistical Ideas for Managers, 2nd ed. (Recommended, not required) On the course website: Miscellaneous notes and materials Class slide presentations Problem sets http://people.stern.nyu.edu/wgreene/Statistics/Outline.htm
Part 0: Introduction 0-13/18 Course Software: Minitab The Current Version: Minitab 16 Buy: Professional Bookstore Rent: e5.onthehub.com $29.99 to rent for 6 months, $99.99 to own Search: e5.onthehub.com minitab
Part 0: Introduction 0-14/18 Course Outline and Overview 1. Presenting Data Data Types Information content Data Description Graphical devices: Plots, histograms Statistical: Summary statistics
Part 0: Introduction 0-15/18 Data: House Price Listings and Per Capita Income by State How to describe/summarize them. How to explain the variation across states How to determine if there is any connection between the two variables.
Part 0: Introduction 0-16/18 Course Outline and Overview 2. Explaining How Random Data Arise Probability: Understanding unpredictable outcomes Precise mathematical principles of random outcomes e.g., gambling and games of chance Models = descriptions of random outcomes that don't have fixed mathematical laws The Normal distribution THE fundamental model for outcomes involving behavior Model building for random outcomes using the normal distribution
Part 0: Introduction 0-17/18 Course Outline and Overview 3. Modeling Relationships Between Outcomes What is correlation? Simple linear regression: Connecting one variable with another Multiple regression Model building Understanding covariation of more than one variable. Correlation = 0.428. Is this large? Hawaii. Outlier?
Part 0: Introduction 0-18/18 Course Outline and Overview 4. Statistical Inference Hypothesis testing: (Is the correlation large? Can we be confident that it is not actually zero?) Hypothesis tests for specific applications Mean of a population: Is it a specific value? Applications in regression: Are the variables in the model really related? An application in marketing: Did the sales promotion work? How would you find out?
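One standard way to ask whether a sample correlation differs from zero is the t statistic t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies it to the correlation of 0.428 shown on the preceding slide. The sample size of 50 is an assumption (one observation per state); the slides do not state the number of observations. The function name is hypothetical.

```python
import math

def correlation_t_stat(r, n):
    """t statistic for H0: population correlation = 0, from sample correlation r and n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

r = 0.428
n = 50  # ASSUMPTION: one observation per state; not stated in the slides
t = correlation_t_stat(r, n)
print(f"t = {t:.2f} with {n - 2} degrees of freedom")
```

As a rule of thumb, a t statistic larger than about 2 in absolute value leads to rejecting a zero correlation at the 5% level. The test treats all observations alike, so it does not settle whether a single outlier is driving the result.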