Properties of Community Data in Ecology
Adapted from Ecological Statistical Workshop, FLC, Daniel Laughlin.


Community Data Summary
- Community data matrices
- Species on gradients
- Problems with community data
- Normality assumptions
Key questions to keep in the back of your mind:
1. How do species abundances relate to each other?
2. How do species relate to environmental gradients?

Community data matrices
- Rows: independent sample units
- Columns: species or molecular markers (abundance or presence/absence used as a measure of species performance)
- Traits
- Community data matrices are sparse

Full Community Dataset
n = # of sample units (plots)
p = # of species
t = # of traits
e = # of environmental variables or factors
d = # of dimensions

Matrices:
- n x p: plots in species space
- n x e: plots in environmental space
- n x t: plots in trait space
- n x d: plots in reduced species space
- t x p: traits in species space
- t x e: traits in environmental space
- e x p: used for species in environmental space (A'E)
- d x p: species in reduced plot space

Ordination can address more questions than how plots differ in composition.
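The matrix dimensions listed on this slide can be checked with a small sketch. The abundance matrix `A`, the environmental matrix `E`, and all values below are made-up toy data, and the helper functions are hypothetical names written for this illustration only. The product A'E comes out as p x e here; its transpose is the e x p form.

```python
# Toy community matrix A: n = 4 plots (rows) x p = 3 species (columns).
# All values are invented for illustration.
A = [
    [5, 0, 1],
    [2, 1, 0],
    [0, 4, 0],
    [0, 2, 3],
]

# Toy environmental matrix E: n = 4 plots x e = 2 variables
# (say, elevation and percent moisture; both hypothetical).
E = [
    [100, 20],
    [200, 50],
    [300, 90],
    [400, 40],
]


def shape(m):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(m), len(m[0]))


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    y_cols = transpose(y)
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]


def zero_fraction(m):
    """Share of cells equal to zero, a simple measure of sparsity."""
    cells = [v for row in m for v in row]
    return sum(1 for v in cells if v == 0) / len(cells)


# Species in environmental space: A'E has one row per species, one column per variable.
AtE = matmul(transpose(A), E)

print("A is n x p:", shape(A))
print("E is n x e:", shape(E))
print("A'E is p x e:", shape(AtE))
print("Fraction of zeros in A:", zero_fraction(A))
```

Real community matrices usually have far more species than this toy example, and a much larger share of zeros.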

Species on environmental gradients
- Gaussian ideal: peak abundances, nonlinear; this is challenging to analyze
- Linear responses to gradients: okay for short gradients
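A minimal sketch of the Gaussian ideal, assuming the common bell-shaped parametrization with an optimum, a tolerance (width), and a peak abundance. The function name, parameter names, and numbers are illustrative assumptions, not taken from the workshop. It shows why a short stretch of the gradient can look monotonic while the full gradient does not.

```python
import math


def gaussian_response(x, optimum, tolerance, peak):
    """Abundance of a species at gradient position x under a Gaussian response.

    optimum   -- gradient position where abundance peaks (assumed parameter)
    tolerance -- width of the curve, like a standard deviation (assumed parameter)
    peak      -- maximum abundance, reached at the optimum (assumed parameter)
    """
    return peak * math.exp(-((x - optimum) ** 2) / (2.0 * tolerance ** 2))


def is_monotonic(values):
    """True if the values never decrease or never increase."""
    increasing = all(b >= a for a, b in zip(values, values[1:]))
    decreasing = all(b <= a for a, b in zip(values, values[1:]))
    return increasing or decreasing


# Long gradient: covers both sides of the optimum, so the response is hump-shaped.
long_gradient = [gaussian_response(x, 50, 10, 80) for x in range(0, 101, 10)]

# Short gradient: covers only one flank of the curve, so the response is monotonic.
short_gradient = [gaussian_response(x, 50, 10, 80) for x in range(30, 36)]

print("long gradient monotonic: ", is_monotonic(long_gradient))
print("short gradient monotonic:", is_monotonic(short_gradient))
```

A monotonic segment is still curved, so a straight line is only an approximation even on a short gradient.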

Major Problems with Community Data
1. Species responses have the "zero truncation problem"
2. Curves are "solid" due to the action of many other factors
3. Response curves can be complex
4. High beta diversity
5. Nonnormal species distributions

Major Problems with Community Data
1. Zero truncation
- Species responses are truncated at zero
- Only zeros are possible beyond a species' limits
- No information on how unfavorable the environment is for a species
2. "Solid" curves
- "Curves" are typically solid envelopes rather than curves
- A species is usually less abundant than its potential (even zeros are possible)

Major Problems with Community Data
3. Complex curves: polymodal, asymmetric, discontinuous
[Figure: Average lichen cover on twigs in shore pine bogs in SE Alaska.]

High beta diversity
Beta diversity = the difference in community composition between communities along an environmental gradient or among communities within a landscape.

Whittaker's (1972) Beta Diversity
β_w = (γ / ᾱ) - 1
γ = number of species in the composite sample (total number of species)
ᾱ = average species richness in the sample units
No formal units, but β_w can be thought of as the "number of distinct communities".
The one is subtracted to make zero beta diversity correspond to zero variation in species turnover.
Rule of thumb: values of β_w > 5 are high.
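As a rough sketch, Whittaker's beta diversity (gamma divided by mean alpha, minus one) can be computed directly from a plots-by-species matrix. The function name and the toy matrices below are assumptions made for illustration.

```python
def whittaker_beta(matrix):
    """Whittaker's beta diversity for a plots-by-species abundance matrix.

    gamma     = number of species present in at least one sample unit
    alpha_bar = average species richness per sample unit
    beta_w    = gamma / alpha_bar - 1
    """
    # Richness of each sample unit: number of species with abundance > 0.
    richness = [sum(1 for v in row if v > 0) for row in matrix]
    alpha_bar = sum(richness) / len(matrix)
    if alpha_bar == 0:
        raise ValueError("no species present in any sample unit")

    # Gamma: species (columns) that occur in at least one sample unit.
    gamma = sum(1 for col in zip(*matrix) if any(v > 0 for v in col))
    return gamma / alpha_bar - 1


# Every plot holds the same species: no turnover.
identical_plots = [
    [3, 1, 2],
    [1, 5, 4],
    [2, 2, 9],
]

# Every plot holds a different species: complete turnover.
disjoint_plots = [
    [4, 0, 0],
    [0, 7, 0],
    [0, 0, 2],
]

print(whittaker_beta(identical_plots))
print(whittaker_beta(disjoint_plots))
```

The measure uses presence only, so abundances affect it only through whether a value is zero.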

Are species distributions normal?
- Univariate normality (it's what we're used to)
- Bivariate normality (it's easy to visualize)
  - Idealized community data
  - Real community data
- Multivariate normality (straightforward extension of bivariate normality to multiple dimensions)

Univariate normality
Normality can be assessed by skewness (asymmetry) and kurtosis (peakiness).
For a normal distribution, skew = 0 and kurtosis = 0 (kurtosis here measured relative to the normal distribution).

Skewness
- Community data will nearly always be positively skewed due to lots of zeroes
- Linear models require |skew| < 1
- Assess skewness of data in PC-ORD (Row and Column Summary)
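A minimal sketch of the |skew| < 1 check, using the simple population-moment formulas for skewness and excess kurtosis. PC-ORD and other packages may apply sample-size corrections, so their values can differ slightly; the function names and the abundance vectors are invented for illustration.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness: third central moment over variance^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# A symmetric variable versus a zero-heavy species abundance column (toy data).
symmetric = [1, 2, 3, 4, 5]
zero_heavy = [0, 0, 0, 0, 0, 0, 0, 1, 2, 10]

for name, column in [("symmetric", symmetric), ("zero_heavy", zero_heavy)]:
    skew = skewness(column)
    verdict = "within" if abs(skew) < 1 else "outside"
    print(name, "skew:", round(skew, 2), "->", verdict, "the |skew| < 1 guideline")
```

The zero-heavy column fails the guideline because a few large values sit far above a mass of zeros, which is the typical shape of a species column.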

[Figure: Positively skewed distributions typical of community data. Panels: PLHE, HYVI, HYIN.]

Bivariate Normality
[Figure: views from above.]

Bivariate Species Distributions
Idealized Gaussian species response curves
- Positive association
- Negative association
- The bivariate distribution is nonlinear
- Dust bunny distribution: plotting one species against another gives lots of points near the origin and along the axes
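The dust bunny pattern can be reproduced with a small deterministic sketch: two species with Gaussian responses on one gradient, with abundances below a detection limit recorded as zero. The optima, tolerance, peak, detection limit, and all names below are assumptions chosen for illustration.

```python
import math


def abundance(x, optimum, tolerance=8.0, peak=50.0, detection_limit=0.5):
    """Gaussian response, truncated to zero below a (hypothetical) detection limit."""
    value = peak * math.exp(-((x - optimum) ** 2) / (2.0 * tolerance ** 2))
    return value if value >= detection_limit else 0.0


# 101 plots spaced evenly along a gradient from 0 to 100.
plots = range(0, 101)
species_a = [abundance(x, optimum=30) for x in plots]
species_b = [abundance(x, optimum=70) for x in plots]

# Classify each plot by where it falls in the species A vs. species B scatterplot.
at_origin = sum(1 for a, b in zip(species_a, species_b) if a == 0 and b == 0)
on_an_axis = sum(1 for a, b in zip(species_a, species_b) if (a == 0) != (b == 0))
interior = sum(1 for a, b in zip(species_a, species_b) if a > 0 and b > 0)

print("plots at the origin (both absent):  ", at_origin)
print("plots on an axis (one absent):      ", on_an_axis)
print("plots in the interior (both present):", interior)
```

Most plots land on an axis or at the origin, and only the few plots where the two response curves overlap fall in the interior of the scatterplot.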

Bivariate Species Distributions
Realistic data with "solid" response curves
- Positive association
- Negative association
- Dust bunny distribution

Bi- and Tri-variate Distributions
- Bivariate normal distribution forms an elliptical cloud
- Bivariate distribution with most points lying near one or two axes
- Multivariate normal distribution (hyperellipsoid)
- Multivariate dust bunny distribution

Dust bunny in 3-D species space
Environmental gradients form a strongly nonlinear shape in species space.

A: A cluster within the cloud of points (stands) occupying vegetation space. This is the result of a classification approach (here attempted after ordination), in which similar individuals are grouped and considered as a single cell or unit.
B: Three-dimensional abstract vegetation space, in which each dimension (X, Y, Z axes) represents an element in the analysis (e.g. the proportion of a certain species). This is the result of an ordination approach, in which similar stands nevertheless retain their unique properties and thus no information is lost (X1, Y1, Z1 axes).
Key point: Abstract space has no connection with the real space from which the records were initially collected.

Multivariate Normality
- Linear algebra easily extends these concepts into multiple dimensions
- Most multivariate methods assume multivariate normality (linear ordination methods)
- Ecological data are seriously abnormal
- Thus, we will often require different methods