Maxent
Implements “Maximum Entropy” modeling
– Entropy = randomness
– Maximizes randomness by removing patterns
– The pattern is the response
Website with papers:
– nt/

Densities

Logit – Inverse of Logistic
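Since the logit is the inverse of the logistic function, a quick numeric round trip (a minimal sketch, not MaxEnt code) confirms the relationship:

```python
import math

def logistic(x):
    # Logistic (sigmoid): maps any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # Logit: maps a probability in (0, 1) back to the real line
    return math.log(p / (1.0 - p))

# Round trip: applying the logit to the logistic recovers x
for x in (-2.0, 0.0, 1.5):
    assert abs(logit(logistic(x)) - x) < 1e-9
```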

MaxEnt’s “Model”

MaxEnt Optimizes “Gain”
– “Gain in MaxEnt is related to deviance” (see Phillips in the tutorial)
– MaxEnt generates a probability distribution over the pixels in the grid, starting from uniform and improving the fit to the data
– “Gain indicates how closely the model is concentrated around presence samples” (Phillips)

Gain

Regularization

Background Points
– 10,000 random points
– Uses all pixels if there are fewer than 10,000
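The background-sampling rule above can be sketched as follows (hypothetical names and a toy pixel grid, not MaxEnt's actual implementation):

```python
import random

def sample_background(pixels, n_background=10000, seed=0):
    # The rule above: draw 10,000 random background pixels,
    # or use every pixel when the grid has fewer than 10,000
    if len(pixels) < n_background:
        return list(pixels)
    return random.Random(seed).sample(pixels, n_background)

grid = [(row, col) for row in range(200) for col in range(100)]  # 20,000 pixels
assert len(sample_background(grid)) == 10000
assert len(sample_background(grid[:500])) == 500  # small grid: all pixels used
```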

MaxEnt really…
MaxEnt tries to create a probability surface in hyperspace where:
– Values are near 1.0 where there are lots of points
– Values are near 0.0 where there are few or no points

Synthetic Habitat & Species

MaxEnt Outputs

Threshold ~0.5 | Threshold ~0.2 | Threshold ~0.0

Cumulative Threshold
– Threshold of 0 = entire area; no omission
– Threshold of 100 = no area; all points omitted

Definitions
– Omission Rate: proportion of points left out of the predicted area for a given threshold
– Sensitivity: proportion of points within the predicted area = 1 – Omission Rate
– Fractional Predicted Area: proportion of the total area within the thresholded area
– Specificity: proportion of the area outside the thresholded area = 1 – Fractional Predicted Area
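These four quantities can be computed directly from the model values at the presence points and at every map pixel (a sketch with a hypothetical helper `threshold_stats` and toy scores, not MaxEnt output):

```python
def threshold_stats(point_scores, pixel_scores, threshold):
    # point_scores: model values at the presence points
    # pixel_scores: model values at every pixel in the map
    in_points = sum(s >= threshold for s in point_scores)
    in_pixels = sum(s >= threshold for s in pixel_scores)
    sensitivity = in_points / len(point_scores)   # points inside the predicted area
    omission = 1.0 - sensitivity                  # points left out
    fpa = in_pixels / len(pixel_scores)           # fractional predicted area
    specificity = 1.0 - fpa                       # area outside the predicted area
    return omission, sensitivity, fpa, specificity

om, sens, fpa, spec = threshold_stats([0.9, 0.8, 0.3], [0.9, 0.8, 0.3, 0.2, 0.1], 0.5)
assert abs(sens - 2 / 3) < 1e-9 and abs(fpa - 0.4) < 1e-9
```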

Receiver Operating Characteristic (ROC) Curve
Area Under the Curve (AUC)

– y-axis: the proportion of the sample points within the thresholded area
– x-axis: the proportion of the total area within the thresholded area
– The curve rises quickly if the points fall within a sub-set of the overall predictor values

AUC: Area Under the Curve
– 0.5 = the model is no better than random
– The closer to 1.0, the better
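One way to compute AUC is the rank interpretation (a sketch, not necessarily how MaxEnt computes it): the probability that a randomly chosen presence point scores higher than a randomly chosen background point.

```python
def auc(presence_scores, background_scores):
    # Rank-based AUC: probability that a randomly chosen presence
    # point scores higher than a randomly chosen background point
    # (ties count as half); 0.5 = random, closer to 1.0 is better
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

assert auc([0.9, 0.8], [0.2, 0.1]) == 1.0  # perfect separation
assert auc([0.5, 0.5], [0.5, 0.5]) == 0.5  # indistinguishable = random
```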

Best Explanation Ever!

Fitting Features
Types of “features”:
– Threshold: step response to the predictor (0 below a knot, constant above it)
– Hinge: flat below a knot, then linear response to the predictor
– Linear: linear response to the predictor
– Quadratic: square of the predictor
– Product: two predictors multiplied together
– Binary: categorical levels
The following slides are from the tutorial you’ll run in lab
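The feature types above can be sketched as simple transforms (simplified, unscaled forms; MaxEnt normalizes its features, so these are illustrative only):

```python
def linear(x):
    return x                          # linear response to the predictor

def quadratic(x):
    return x * x                      # square of the predictor

def product(x1, x2):
    return x1 * x2                    # two predictors multiplied together

def threshold(x, knot):
    return 1.0 if x > knot else 0.0   # step: 0 below the knot, 1 above

def hinge(x, knot, xmax):
    # flat at 0 below the knot, then rises linearly to 1 at xmax
    return min(1.0, max(0.0, x - knot) / (xmax - knot))

assert threshold(5.0, 3.0) == 1.0
assert hinge(5.0, 3.0, 7.0) == 0.5
```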

Threshold Features

Linear

Quadratic

Hinge Features

Product Features

Getting the “Best” Model
– AUC does not account for the number of parameters
– Use the regularization parameter to control over-fitting
– MaxEnt will let you know which predictors are explaining the most variance; use this, and your judgment, to reduce the predictors to the minimum number
– Then rerun MaxEnt for the final outputs

Number of Parameters
(Excerpt of the fitted model’s .lambdas file: one line per feature with its coefficient and scaling range, spanning linear features such as cld6190_ann, quadratic features such as tmn6190_ann^2, product features such as cld6190_ann*dtr6190_ann, threshold features such as (85.5<pre6190_l10), and hinge features such as `h_dem, ending with linearPredictorNormalizer, densityNormalizer, numBackgroundPoints, and entropy.)

Running Maxent
– Folder for layers: must be in ASCII Grid “.asc” format
– CSV file for samples: must be Species, X, Y
– Folder for outputs: Maxent will put a number of files here

Avoiding Problems
– Create a folder for each modeling exercise
– Add a sub-folder for “Layers”; layers must have the same extent and the same number of rows and columns of pixels
– Save your samples to a CSV file with Species, X, Y as columns
– Add a sub-folder for each “Output”; number or rename it for each run
– Some points may be missing environmental data

Running Maxent
Batch file:
– maxent.bat contents: java -mx512m -jar maxent.jar
– The 512 sets the maximum RAM (in MB) for Java to use
Or double-click on the jar file:
– Works, but with Java’s default memory limit

Maxent GUI

Douglas-Fir Points

AUC Curve

Response Curves
– Each response when all predictors are used
– Each response when only one predictor is used

Surface Output Formats

Percent Contribution
– Precipitation contributes the most

Settings

Regularization = 2 AUC = 0.9

Resampling
– Resampling: the most general term
– Cross-validation: typically validates against an independent data set
– k-fold / leave-one-out cross-validation (LOOCV): break the data set into N “chunks” and run the model leaving out each chunk in turn
– Replication: MaxEnt’s term for resampling

MaxEnt: Replication
– Cross-Validation: k-fold; with 10 replicates, each replicate is trained on 90% of the data
– Repeated Subsampling: breaks the data into “training” and “test” data sets
– “Bootstrapping”: sub-samples the data with replacement, so the training set can contain duplicate records (not recommended)
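The replication schemes above can be sketched as follows (hypothetical helpers, not MaxEnt's implementation):

```python
import random

def kfold_splits(records, k=10, seed=0):
    # Cross-validation: each replicate trains on (k-1)/k of the data
    # (k = 10 -> 90%) and tests on the held-out fold
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]

def bootstrap_sample(records, seed=0):
    # Bootstrapping: sample with replacement, so the training set
    # can contain duplicate records
    rng = random.Random(seed)
    return [rng.choice(records) for _ in records]

data = list(range(100))
for train, test in kfold_splits(data):
    assert len(train) == 90 and len(test) == 10
```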

Optimizing Your Model
– Select the “Sample Area” carefully
– Use “Percent Contribution”, jackknife, and correlation statistics to determine the set of “best” predictors
– Try different regularization parameters to obtain response curves you are comfortable with and to reduce the number of parameters (and/or remove features)
– Run “replication” to determine how robust the model is to your data

Model Optimization & Selection
– Modeling approach
– Predictor selection
– Coefficient estimation
– Validation: against a sub-sample of the data, or against a new dataset
– Parameter sensitivity
– Uncertainty estimation

                     | Linear                   | GAM                                     | BRT                                        | Maxent
Number of predictors | N                        | N                                       | N                                          | N
“Base” equation      | Linear (or linearized)   | Link + splines (typical)                | Trees                                      | Linear, product, threshold, etc.
Fitting approach     | Direct analytic solution | Solve derivative for maximum likelihood | Make a tree, add one, if better keep going | Search for best solution
Response variable    | Continuous               | Continuous or categorical               | Continuous                                 |
Sample measure       | Continuous               | Continuous or categorical               |                                            | Presence-only
Predictors           | Continuous               | Continuous or categorical               |                                            |
Uniform residuals    | Yes                      |                                         |                                            |
Independent samples  | Yes                      |                                         |                                            |
Complexity           | Simple                   | Moderate                                | Complex                                    |
Over fit             | No                       | Unlikely                                | Probably                                   |