Life after linear regression


1 Life after linear regression
A survey of Penn State applied statistics graduate courses

2 The courses
Stat 500: Applied Statistics
Stat 501: Regression Methods
Stat 502: Analysis of Variance and Design of Experiments
Stat 503: Design of Experiments
Stat 504: Analysis of Discrete Data
Stat 505: Applied Multivariate Statistical Analysis
Stat 506: Sampling Theory and Methods
Stat 509: Biostatistical Methods
Stat 510: Applied Time Series Analysis

3 Stat 500: Applied Statistics
Topics covered:
  Descriptive statistics
  Hypothesis testing and power
  Estimation and confidence intervals
  Regression
  One- and two-way ANOVA
  Chi-square tests
Prerequisites:
  2 credits of algebra

4 Stat 501: Regression Methods
Topics covered:
  Analysis of research data through simple and multiple regression and correlation
  Polynomial models
  Indicator variables
  Stepwise and piecewise regression
  Logistic regression
Prerequisites:
  6 credits of statistics or Stat 500; matrix algebra

5 Stat 502: Analysis of Variance and Design of Experiments
Analysis of data when:
  the response y is continuous
  the predictors (called factors or treatments) are all qualitative
  the errors satisfy the same assumptions as in regression
Do the means differ among the groups defined by the factor combinations?

6 Stat 502: Analysis of Variance and Design of Experiments
Topics covered:
  Analysis of variance and design concepts
  Factorial, nested and unbalanced data
  Analysis of covariance
  Blocked designs
  Latin-square, split-plot, repeated measures designs
  Multiple comparisons
Prerequisites:
  Stat 501 (or undergraduate version Stat 462)

7 A Stat 502 Example: Intertidal Seaweed Grazers
To study the influence of ocean grazers on regeneration rates of seaweed in the intertidal zone, a researcher scraped square rock plots free of seaweed and observed the seaweed regeneration when certain types of seaweed-grazing animals were denied access.
Research questions:
  Which grazer consumes the most seaweed?
  Do different grazers influence each other's impact?
  Are grazing effects similar in all microhabitats?

8 A Stat 502 Example: Intertidal Seaweed Grazers
The grazers were limpets (L), small fishes (f), and large fishes (F):
  LfF: all three grazers were allowed access
  fF: limpets were excluded using caustic paint
  Lf: large fish were excluded using coarse net
  f: limpets and large fish were excluded
  L: small and large fish were excluded using fine net
  C: the control group, all grazers excluded

9 A Stat 502 Example: Intertidal Seaweed Grazers
The intertidal zone is a highly variable environment. The researcher applied treatments in 8 blocks of 12 plots each:
  #1: Just below high tide, exposed to heavy surf
  #2: Just below high tide, protected from surf
  #3: Midtide, exposed
  #4: Midtide, protected
  #5: Just above low tide level, exposed
  #6: Just above low tide level, protected
  #7: On near-vertical rock wall, midtide, protected
  #8: On near-vertical rock wall, above low tide, protected
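The blocked layout can be sketched in a few lines of Python. The slides do not say how treatments were assigned to plots within a block, so the sketch assumes the usual randomized-block approach: each of the six treatments appears twice in every block, in a random order. The function name and the seed are illustrative choices, not part of the study.

```python
import random

TREATMENTS = ["C", "L", "f", "Lf", "fF", "LfF"]  # labels used on the slides
N_BLOCKS = 8
REPS_PER_BLOCK = 2  # 6 treatments x 2 replicates = 12 plots per block


def randomize_blocks(seed=0):
    """Return {block number: list of 12 treatment labels, one per plot}.

    Assumption: treatments are shuffled independently within each block.
    """
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    layout = {}
    for block in range(1, N_BLOCKS + 1):
        plots = TREATMENTS * REPS_PER_BLOCK
        rng.shuffle(plots)
        layout[block] = plots
    return layout


layout = randomize_blocks()
for block, plots in layout.items():
    print(f"block {block}: {' '.join(plots)}")
```

Because every treatment occurs equally often in every block, differences between microhabitats are balanced across treatments rather than mixed into the treatment comparison.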

10 A Stat 502 Example: Percent of regenerated seaweed on intertidal plots with some grazers excluded
Block  Control  L       f       Lf      fF      LfF
1      14, 23   4, 4    11, 24  3, 5    10, 13  1, 2
2      22, 35   7, 8    14, 31  3, 6    10, 15
3      67, 82   28, 58  52, 59  9, 31   44, 50  6, 9
4      94, 95   27, 35  83, 89  21, 57  57, 73  7, 22
5      34, 53   11, 33  33, 34  5, 9    26, 42  5, 6
6      58, 75   16, 31  39, 52  26, 43  38, 42  10, 17
7      19, 47   6, 8    43, 53  4, 12   29, 36  5, 14
8      53, 61   15, 17  30, 37  12, 18  11, 40  5, 7
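As a rough illustration of the kind of analysis Stat 502 covers, the sketch below computes a block-by-treatment analysis of variance on the raw percentages. It is not the analysis from the course: it omits block 2 (the transcript shows only five of its six cells), works on the untransformed scale, and tests treatments against the within-cell error. The function and variable names are made up for this example.

```python
# Percent regeneration, two plots per cell, copied from the table above.
# Block 2 is omitted because only five of its six cells survive in the transcript.
TREATMENTS = ["C", "L", "f", "Lf", "fF", "LfF"]
REPS = 2
DATA = {
    1: [(14, 23), (4, 4), (11, 24), (3, 5), (10, 13), (1, 2)],
    3: [(67, 82), (28, 58), (52, 59), (9, 31), (44, 50), (6, 9)],
    4: [(94, 95), (27, 35), (83, 89), (21, 57), (57, 73), (7, 22)],
    5: [(34, 53), (11, 33), (33, 34), (5, 9), (26, 42), (5, 6)],
    6: [(58, 75), (16, 31), (39, 52), (26, 43), (38, 42), (10, 17)],
    7: [(19, 47), (6, 8), (43, 53), (4, 12), (29, 36), (5, 14)],
    8: [(53, 61), (15, 17), (30, 37), (12, 18), (11, 40), (5, 7)],
}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def block_anova(data):
    """Balanced two-way (block x treatment) sums of squares with replication."""
    blocks = sorted(data)
    b, t = len(blocks), len(TREATMENTS)
    all_values = [y for blk in blocks for cell in data[blk] for y in cell]
    grand = mean(all_values)
    block_means = [mean(y for cell in data[blk] for y in cell) for blk in blocks]
    trt_means = [mean(y for blk in blocks for y in data[blk][j]) for j in range(t)]
    cell_means = [mean(data[blk][j]) for blk in blocks for j in range(t)]

    ss = {
        "block": t * REPS * sum((m - grand) ** 2 for m in block_means),
        "treatment": b * REPS * sum((m - grand) ** 2 for m in trt_means),
        "within": sum((y - mean(cell)) ** 2
                      for blk in blocks for cell in data[blk] for y in cell),
        "total": sum((y - grand) ** 2 for y in all_values),
    }
    ss_cells = REPS * sum((m - grand) ** 2 for m in cell_means)
    ss["interaction"] = ss_cells - ss["block"] - ss["treatment"]

    df = {"block": b - 1, "treatment": t - 1,
          "interaction": (b - 1) * (t - 1), "within": b * t * (REPS - 1)}
    ms = {name: ss[name] / df[name] for name in df}
    return {
        "ss": ss, "df": df, "ms": ms,
        "f_treatment": ms["treatment"] / ms["within"],
        "treatment_means": dict(zip(TREATMENTS, trt_means)),
    }


result = block_anova(DATA)
for name, m in result["treatment_means"].items():
    print(f"{name:>4}: mean regeneration {m:5.1f}%")
print(f"F for treatments: {result['f_treatment']:.1f} "
      f"on {result['df']['treatment']} and {result['df']['within']} df")
```

Percentages near 0 or 100 often have unequal variance, so a fuller analysis might transform the response before fitting; that step is left out here to keep the arithmetic visible.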

11 Stat 503: Design of Experiments
The key word is “experiments”.
When you can control the values of your predictors (factors), you should ensure you can answer your research question by:
  Collecting the appropriate measurements
  Setting the values of your factors appropriately
  Reducing extraneous variation by “blocking”
  Having an appropriate sample size

12 Stat 503: Design of Experiments
Topics covered:
  Design principles
  Optimality
  Confounding in split-plot designs
  Repeated measures designs, fractional factorial designs, response surface designs
  Balanced/partially balanced incomplete block designs
Prerequisites:
  Stat 501 (or undergraduate Stat 462)
  Stat 502

13 A Stat 503 Example: The BARGE Study
Current standard treatment for patients with mild to moderate asthma is scheduled daily use of inhaled albuterol. Now hypothesized that such regular use has a negative effect on lung function in patients with B16Arg/Arg genotype, but not in those with B16Gly/Gly genotype.

14 A Stat 503 Example: The BARGE Study
The BARGE Study compares the regular use of inhaled albuterol (A) to placebo (P) in patients with the B16Arg/Arg genotype (R) and in patients with the B16Gly/Gly genotype (G). The primary hypothesis concerns inference about whether $(\mu_{RA} - \mu_{RP}) - (\mu_{GA} - \mu_{GP})$ is 0.

15 A Stat 503 Example: BARGE Study’s Paired Crossover
Genotype  Order   Period 1  Washout  Period 2
R         1 (AP)  Y1jRA     ---      Y1jRP
R         2 (PA)  Y2jRP     ---      Y2jRA
G         1 (AP)  Y1jGA     ---      Y1jGP
G         2 (PA)  Y2jGP     ---      Y2jGA
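One way to see how this design targets the primary hypothesis is to estimate the albuterol-minus-placebo effect within each genotype and then take the difference between genotypes. The sketch below uses a standard crossover estimate that averages the two orders so that a common period effect cancels. All numbers are hypothetical lung-function scores invented for illustration, and nothing here is taken from the actual BARGE analysis.

```python
def mean(values):
    return sum(values) / len(values)


def treatment_effect(order_ap, order_pa):
    """Estimate the A-minus-P effect for one genotype.

    Each argument is a list of (period 1, period 2) responses, one pair
    per subject. Averaging the two orders cancels a shared period effect.
    """
    d_ap = mean([p1 - p2 for p1, p2 in order_ap])  # A first, then P
    d_pa = mean([p2 - p1 for p1, p2 in order_pa])  # P first, then A
    return (d_ap + d_pa) / 2


# Hypothetical responses, NOT study data.
arg_ap = [(381, 404), (377, 407), (385, 402)]
arg_pa = [(398, 386), (403, 384), (401, 388)]
gly_ap = [(409, 414), (412, 418), (405, 411)]
gly_pa = [(410, 416), (407, 411), (413, 418)]

effect_r = treatment_effect(arg_ap, arg_pa)
effect_g = treatment_effect(gly_ap, gly_pa)
contrast = effect_r - effect_g  # estimates (mu_RA - mu_RP) - (mu_GA - mu_GP)

print(f"Arg/Arg effect: {effect_r:.1f}")
print(f"Gly/Gly effect: {effect_g:.1f}")
print(f"interaction contrast: {contrast:.1f}")
```

A contrast near zero would suggest the treatment effect is the same in both genotypes; a formal test would also need a standard error, which this sketch does not compute.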

16 Stat 504: Analysis of Discrete Data
Analysis of data when:
  the response y is binary or discrete
  the predictors are qualitative or quantitative
  summarized data are frequency counts
How do the predictors affect the response?

17 Stat 504: Analysis of Discrete Data
Topics covered:
  Models for frequency arrays
  Goodness-of-fit tests
  Two-, three- and higher-way tables
  Latent models
  Logistic and Poisson regression models
Prerequisites:
  Stat 502 (or undergraduate Stat 460 or major Stat 512)
  Matrix algebra

18 A Stat 504 Example: Survival in the Donner Party
In 1846, the Donner and Reed families traveled from Illinois to California by covered wagon. The group became stranded in the eastern Sierra Nevada mountains when hit by heavy snow. Of the 87 members, 40 died from famine and exposure; 45 of the members were adults over age 15. Are females better able to withstand harsh conditions than are males?

19 A Stat 504 Example: Survival in the Donner Party

20 A Stat 504 Example: Survival in the Donner Party
Link Function: Logit

Response Information
Variable  Value             Count
STATUS    SURVIVED (Event)
          DIED
          Total

Logistic Regression Table
Predictor  Coef  SE Coef  Z  P  Odds Ratio  % CI Lower  % CI Upper
Constant
AGE
Gender
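The numeric values of this output did not survive in the transcript, but the columns are related in a fixed way that a short sketch can show. The code assumes Wald-type statistics and a 95% interval (the confidence level is missing from the transcript, so 1.96 is an assumption). The coefficients and the 0/1 gender coding are hypothetical, chosen only to exercise the functions.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_row(coef, se, z_crit=1.96):
    """Derive Z, P, odds ratio and interval from a coefficient and its SE.

    Assumption: Wald test, with z_crit = 1.96 for a 95% interval.
    """
    z = coef / se
    return {
        "Z": z,
        "P": 2.0 * (1.0 - normal_cdf(abs(z))),
        "Odds Ratio": math.exp(coef),
        "Lower": math.exp(coef - z_crit * se),
        "Upper": math.exp(coef + z_crit * se),
    }


def survival_probability(constant, b_age, b_gender, age, gender):
    """Invert the logit link: P(survived) = 1 / (1 + exp(-linear predictor))."""
    eta = constant + b_age * age + b_gender * gender
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients, NOT the fitted Donner Party values.
row = logistic_row(coef=-0.05, se=0.03)
print({name: round(value, 3) for name, value in row.items()})
print(round(survival_probability(1.5, -0.05, 1.0, age=30, gender=1), 3))
```

An odds ratio below 1 for AGE would mean the odds of survival fall with each additional year; the interval shows how precisely that ratio is estimated.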

21 Stat 505: Applied Multivariate Statistical Analysis
Analysis of data when you have several correlated, continuous responses is called multivariate data analysis. A repeated measure is a special kind of multivariate response obtained by measuring the same variable on each subject several times, possibly under different conditions.

22 Stat 505: Applied Multivariate Statistical Analysis
Topics covered:
  Multivariate data: matrix review, graphical displays, probability theory, multivariate normal distribution, partial correlations
  Inferences about multivariate means: Hotelling’s T² tests, multivariate analysis of variance, repeated measures experiments and growth curves, discriminant analysis
  Data reduction: principal components, factor analysis, canonical correlation analysis, cluster analysis
  Structural equation modeling
Prerequisites:
  6 credits in statistics
  Matrix algebra

23 A Stat 505 Example: Pottery Data
Pottery samples were collected from four sites in the British Isles: Llanedyrn, Caldicot, Isle Thornes, and Ashley Rails. Each piece analyzed for its aluminum, iron, magnesium, calcium, and sodium content. Do the pottery samples from the four sites differ with respect to their composition?

24 A Stat 505 Example: Pottery Data

25 Stat 506: Sampling Theory and Methods
Topics covered:
  Basic methods: simple random sampling, selecting sample sizes, unequal probability sampling, ratio and regression estimation, stratified sampling, cluster and systematic sampling, multistage designs, double sampling
  Special topics: sampling hidden human populations, environmental sampling, sampling to study cause-and-effect relationships, resampling of data, measurement errors and nonresponse in surveys, adaptive sampling, network and snowball sampling
Prerequisites:
  Calculus
  3 credits in statistics

26 A Stat 506 Example: A Water Pollution Survey
The study region of interest has 320 lakes. Take a random sample of the lakes by:
  Drawing a rectangle of length l and width w around the study region.
  Generating pairs of (0,1) random numbers.
  Multiplying the first number by l and the second by w to get random location coordinates within the rectangle.
  If the location is a lake, then the lake is selected.
  Continuing until the required number of lakes is selected.
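The procedure above can be sketched as follows. The lakes here are hypothetical circles inside a hypothetical 100 by 60 rectangle, since the slides give no map, and the sketch assumes a lake that is hit a second time is not counted again. The names `find_lake` and `sample_lakes` are illustrative only.

```python
import random

# Hypothetical lakes: name -> (centre x, centre y, radius). NOT survey data.
LAKES = {
    "A": (20, 20, 8),
    "B": (50, 40, 10),
    "C": (80, 15, 6),
    "D": (75, 45, 7),
    "E": (30, 48, 5),
}


def find_lake(x, y, lakes):
    """Return the name of the lake containing point (x, y), or None."""
    for name, (cx, cy, r) in lakes.items():
        if (x - cx) ** 2 + (y - cy) ** 2 <= r ** 2:
            return name
    return None


def sample_lakes(lakes, length, width, n, seed=0, max_draws=1_000_000):
    """Draw random points in the rectangle until n distinct lakes are hit."""
    rng = random.Random(seed)  # arbitrary seed, only for reproducibility
    selected = []
    for _ in range(max_draws):
        if len(selected) == n:
            break
        x = rng.random() * length  # first (0,1) number times l
        y = rng.random() * width   # second (0,1) number times w
        lake = find_lake(x, y, lakes)
        if lake is not None and lake not in selected:
            selected.append(lake)
    return selected


chosen = sample_lakes(LAKES, length=100, width=60, n=3)
print(chosen)
```

Under this scheme a lake with a larger surface area is more likely to be hit by a random point, so the lakes are not selected with equal probability. This is an instance of the unequal probability sampling listed among the course topics.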

27 Stat 509: Biostatistics
Topics covered:
  An introduction to the design and statistical analysis of randomized and observational studies in biomedical research
Prerequisites:
  Stat 500

28 Stat 510: Applied Time Series Analysis
Topics covered:
  Identification of models for empirical data collected over time
  Use of models in forecasting
Prerequisites:
  Stat 501 (or undergraduate Stat 462 or major Stat 511)

29 A Stat 510 Example: Measuring Global Warming
Temperature (in degrees Celsius) averaged for the northern hemisphere over a full year. Temperature series collected from 1880 to 1987. All measurements expressed as differences from their 108-year mean.
Research questions:
  Is the mean temperature increasing over the 108 years?
  What is the rate of increase in global temperature over the past century?
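A first pass at the second research question is to express the series as deviations from its mean and fit a straight line by least squares. The sketch below does this on a synthetic series, because the actual temperature data are not in the transcript. The baseline, trend and noise level used to generate it are invented.

```python
import random

YEARS = list(range(1880, 1988))  # 1880 through 1987, 108 annual values


def anomalies(series):
    """Express each value as a difference from the series mean."""
    m = sum(series) / len(series)
    return [value - m for value in series]


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


# Synthetic temperatures, NOT the real record: a small linear trend plus noise.
rng = random.Random(1)
synthetic = [14.0 + 0.005 * (yr - 1880) + rng.gauss(0, 0.15) for yr in YEARS]

deviations = anomalies(synthetic)
slope = ols_slope(YEARS, deviations)
print(f"estimated trend: {100 * slope:.2f} degrees C per century")
```

Ordinary least squares treats the yearly errors as independent. Values in consecutive years are usually correlated, which is one reason a series like this is taken up in a time series course rather than handled with plain regression.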

30 A Stat 510 Example: Measuring Global Warming

31 A Stat 510 Example: Measuring Global Warming

