Classification and Regression Trees for Land Cover Mapping
Lecture materials taken from a presentation by B. Wylie, C. Homer, C. Huang, L. Yang, M. Coan.


Title slide: Classification and Regression Trees for Land Cover Mapping. B. Wylie, C. Homer, C. Huang, L. Yang, M. Coan, EROS Data Center, Sioux Falls. [Background graphic: an example decision tree with MNF-band split rules such as Mnf1<=28 leading to classes including deciduous, shrub, cedar, and ponderosa pine.]

How do you eat an elephant? One bite at a time! Divide & Conquer. Stratify & Predict.

Classification Trees: Separating Apples and Oranges (Lemons?). [Scatter plot of fruit samples by specific gravity and weight; tree splits partition the plot into regions with purities of 6/6 = 100% and 6/7 = 86%.] A minimal tree sketch follows.
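To make the fruit example concrete, here is a minimal classification tree sketch. The measurements are invented for illustration, and scikit-learn's CART stands in for the S-Plus/C5 tools used in the original work.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical measurements: [weight_g, specific_gravity]
X = np.array([
    [140, 0.96], [150, 0.95], [145, 0.97],   # apples
    [160, 0.88], [170, 0.87], [155, 0.89],   # oranges
    [110, 0.90],                             # the odd lemon
])
y = ["apple", "apple", "apple", "orange", "orange", "orange", "lemon"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["weight", "specific_gravity"]))
```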

Example Decision Tree (left branch = True, right branch = False). [Diagram: internal nodes test MNF transform bands, e.g. Mnf1<=28, Mnf3<=19, Mnf13>56; leaves assign classes such as deciduous, shrub, cedar, and ponderosa pine.]

Advantages of Decision Trees
+ Rapid
+ Repeatable
+ Nonparametric
+ Utilize categorical data
+ Capture non-linear relationships
+ Less sensitive to errors in training data

Disadvantages of Decision Trees
+ Need lots of training data
+ Over-fitting
+ Weighted toward the relative % of training data
+ Short-sighted (stepwise methods fail to identify optimal subsets of regressors; Fox, 1991)

Other Methods:

Unsupervised Clustering
+ Cluster busting can be time consuming
+ Cluster interpretation is subjective
+ Cannot include categorical inputs
+ Difficult to interpret if multiple dates or non-spectral data (e.g. DEM) are included
+ Parametric (assumes normal distribution)

Supervised Classification
+ Parametric (assumes normal distribution)
+ Cannot include categorical inputs
+ Problematic with multiple dates or non-spectral data (e.g. DEM)
+ Difficult for large-area applications

Neural Nets
+ Long convergence times (training)
+ High CPU demands (training)
+ Grey box
+ Tricked by local minima
+ Non-repeatable results (random search functions)
+ Sensitive to errors in training data

Spectral variability: even a monoculture of wheat varies spectrally. Training data must capture the variability of a class (adequate sample size).

[Scatter plot of two classes along axes Predictor 1 and Predictor 2.] Would 2 examples of each class produce a reliable separation?

Training samples. Classification trees are a “data mining” method, so they perform well with large training data sets. Sampling of classes should reflect their relative frequency in the study area: rare classes = few training points, common classes = many training points. Adequately but do not over-sample rare classes. Samples should be widely distributed over the study area to minimize spatial autocorrelation effects on the surface and in the image. A sampling sketch follows.
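A small sketch of the sampling guidance above: allocate training points in proportion to class frequency, with a floor so rare classes are adequately but not over-sampled. Class names and pixel counts are hypothetical.

```python
def allocate_samples(class_pixel_counts, total_samples, min_per_class=20):
    """Allocate training samples proportionally to class area,
    but guarantee a minimum for rare classes."""
    total_pixels = sum(class_pixel_counts.values())
    alloc = {}
    for cls, n_pixels in class_pixel_counts.items():
        proportional = round(total_samples * n_pixels / total_pixels)
        alloc[cls] = max(min_per_class, proportional)
    return alloc

counts = {"shrub": 500_000, "cedar": 120_000, "ponderosa_pine": 8_000}
print(allocate_samples(counts, total_samples=1_000))
# {'shrub': 796, 'cedar': 191, 'ponderosa_pine': 20}
```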

Descriptive or Prediction decision trees? (De’ath and Fabricius 2000)
DESCRIPTIVE TREES:
1) Usually a single tree
2) The objective is to understand important factors or functional relationships between predictor and dependent variables
3) The decisions generated by the tree are as important as the predictions

PREDICTION TREES:
1) The objective is the best possible prediction of the dependent variable
2) Usually consist of a combination of multiple trees aimed at producing higher accuracies and more stable and robust models (DeFries and Chan 2000)

Multiple Tree Approaches: Prediction
1) Bagging (bootstrap sampling of training data) (Splus & C5)
2) Subset data layers (Splus & C5)
3) Boosting * (C5)
A bagging sketch follows.
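A minimal bagging sketch, again assuming scikit-learn's CART as a stand-in for Splus/C5: each tree sees a bootstrap sample of the training data, and the ensemble predicts by majority vote.

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def bagged_trees(X, y, n_trees=10):
    """Fit n_trees CART models, each on a bootstrap sample of (X, y)."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = rng.integers(0, n, size=n)   # sample rows with replacement
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def vote(trees, X):
    """Majority vote across the ensemble, one label per input row."""
    preds = np.stack([t.predict(X) for t in trees])   # (n_trees, n_rows)
    return np.array([Counter(col).most_common(1)[0][0] for col in preds.T])
```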

Multiple Tree Approaches: Prediction. 2) Subset of data layers (Splus & C5). [Diagram: separate trees are trained on different data-layer subsets (soils, spectral, LUDA); Tree 1, Tree 2, and Tree 3 predictions are combined by VOTE.]

Multiple Tree Approaches. 3) Boosting (iterative trees try to account for previous trees' errors) (C5). Different over-fitting issues associated with each tree tend to be averaged out. [Diagram: boosted trees combined by VOTE.] A boosting sketch follows.
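A toy AdaBoost-style loop showing the mechanism the slide describes: each round re-weights the training samples the previous tree got wrong, so later trees focus on them. This is a generic sketch (binary labels assumed), not C5's exact boosting algorithm.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def boost(X, y, n_rounds=10):
    """Each round re-weights the samples the previous tree misclassified."""
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    w = np.full(n, 1.0 / n)                       # uniform starting weights
    trees, alphas = [], []
    for _ in range(n_rounds):
        t = DecisionTreeClassifier(max_depth=3).fit(X, y, sample_weight=w)
        miss = t.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)     # this tree's vote weight
        w *= np.exp(np.where(miss, alpha, -alpha))  # up-weight the misses
        w /= w.sum()
        trees.append(t)
        alphas.append(alpha)
    return trees, alphas
```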

[Side-by-side maps: single tree versus boosted classification.]

Boosting versus Single Tree (USGS NLCD Zone 16, Independent Test Data)

Trees provide more than just land cover predictions. At each “terminal node” or “leaf” we know the number of training points correct and incorrect, and the % right. [Diagram: example decision tree with MNF-band splits and class leaves.]

Model output: a “leaf” map identifies the terminal node number behind each pixel's decision, accompanying the land cover map and a % right (confidence) layer. Since each terminal node is coded with a probability, a confidence map can be generated. A sketch follows.
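A sketch of how such leaf and confidence maps could be derived from a fitted scikit-learn tree; array names and shapes are assumptions.

```python
import numpy as np

def confidence_map(tree, pixel_features, rows, cols):
    """tree: fitted DecisionTreeClassifier; pixel_features: (rows*cols, n_bands)."""
    leaf_ids = tree.apply(pixel_features)        # terminal node per pixel
    proba = tree.predict_proba(pixel_features)   # class mix of that leaf
    confidence = proba.max(axis=1)               # purity of the winning class
    return leaf_ids.reshape(rows, cols), confidence.reshape(rows, cols)
```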

Use the relative frequency with which data layers are used in trees built on the training data as a crude index of data layer “utility”. This can:
1) Reduce inputs to the decision tree
2) Yield a reduced tree that may have improved accuracies
3) Increase the speed with which the tree can be applied to the study area
4) Aid interpretation of underlying functional relationships (drivers)
5) Produce multiple trees for class “voting”
[Chart: top “utility” data layers from 40 possible input layers.] A utility-ranking sketch follows.
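A sketch of the utility index under the same scikit-learn assumption: count how often each layer is chosen as a split variable across the ensemble.

```python
from collections import Counter

def layer_utility(trees, layer_names):
    """Count how often each input layer is used as a split across trees."""
    counts = Counter()
    for t in trees:
        features = t.tree_.feature            # split feature id per node
        for f in features[features >= 0]:     # negative ids mark leaves
            counts[layer_names[f]] += 1
    return counts.most_common()

# e.g. layer_utility(trees, ["mnf1", "mnf3", "dem", "soils"])
```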

Accuracy Assessment: Cross Validation versus Independent Test, Zone 16, Utah

Uses of Cross Validation
+ Accuracy assessment
+ Optimal tree data sets
+ Pruning
+ All training data used for prediction
Cautions
+ Spatial autocorrelation
+ Look for “significant” error changes when pruning or selecting tree parameters
A cross-validation sketch follows.
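A minimal 5-fold cross-validation sketch (scikit-learn assumed) returning the fold-mean accuracy and its standard error, the statistics reported on the zone 60 and zone 16 slides below.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

def five_fold_accuracy(X, y):
    """Mean accuracy and its standard error over 5 stratified folds."""
    X, y = np.asarray(X), np.asarray(y)
    accs = []
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, test_idx in skf.split(X, y):
        tree = DecisionTreeClassifier().fit(X[train_idx], y[train_idx])
        accs.append((tree.predict(X[test_idx]) == y[test_idx]).mean())
    accs = np.asarray(accs)
    return accs.mean(), accs.std(ddof=1) / np.sqrt(len(accs))
```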

Past Experiences: hierarchical implementation of trees. Landsat 7 ETM+ Mosaic (bands 5,4,3), Mapping Zone 60, Spring, 2000 and 2001.

Forest and Non-Forest Classification, Mapping Zone 60, 2001. Established a classification tree model for mapping forest and non-forest classes using FIA plot data (669 forest and non-forest plots). The classification was run using a 5-fold cross-validation procedure. The agreement between mapped and reference/validation data is 95%, with standard error (SE) less than 1.0%.

Forest Classification Based on the NLCD 2000 Classification System, Mapping Zone 60, 2001. Established a classification tree model for mapping three MRLC forest classes using 669 FIA plots and a 5-fold cross-validation procedure (134 plots for validation in each of the 5 runs). The agreement between mapped and reference/validation data is 80%, with SE 1.0%.

Forest Type Group Classification Based on the USFS Classification System, Mapping Zone 60, 2001. Established a classification tree model for mapping six forest type groups using 669 FIA plots and a 5-fold cross-validation procedure (134 plots for validation in each of the 5 runs). The agreement between mapped and reference/validation data is 65%, with SE 2.3%.

Leaf-on Landsat 7 ETM+ scene mosaic (bands 5,4,3) for mapping zone 16 – Utah/Southern Idaho

Forest/non-forest classification for mapping zone 16 – Utah/Southern Idaho

Fold   Misclassification
0      16.8%
1      18.8%
2      19.1%
3      19.4%
4      16.3%
Mean   18.1%
SE      0.6%

Deciduous/evergreen/mixed classification for mapping zone 16 – Utah/Southern Idaho

Fold   Decision Tree Size   Errors
0      *                    19.7%
1      *                    19.9%
2      *                    18.5%
3      *                    19.4%
4      *                    20.8%
Mean                        19.7%
SE                           0.4%

Forest type group classification for mapping zone 16 – Utah/Southern Idaho

Fold   Decision Tree Size   Errors
0      *                    31.9%
1      *                    37.3%
2      *                    36.8%
3      *                    40.0%
4      *                    31.9%
5      *                    37.8%
6      *                    33.0%
7      *                    35.7%
8      *                    35.5%
9      *                    32.3%
Mean                        35.2%
SE                           0.9%
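As a check, the reported mean and standard error follow directly from the ten fold errors in the table above:

```python
import numpy as np

errors = np.array([31.9, 37.3, 36.8, 40.0, 31.9, 37.8, 33.0, 35.7, 35.5, 32.3])
mean = errors.mean()                              # mean fold error
se = errors.std(ddof=1) / np.sqrt(len(errors))    # standard error of the mean
print(f"Mean {mean:.1f}%  SE {se:.1f}%")          # Mean 35.2%  SE 0.9%
```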

Mean and standard error (in parentheses) of the overall accuracy (%) of classifications developed using a descriptive tree and a hierarchical tree approach in 5 repeated experiments (Zone 16).

Recipe for Success
+ Adequate and representative training (adequately represent rare classes; preserve relative proportions of training and population)
+ Model construction assessed with cross validation (boosting, pruning, and data layer exclusions)
+ Multiple trees for mapping (boosting)
+ Visually inspect land cover, add training in misclassified areas, and reconstruct the model
+ Consider “hierarchical” trees to allow trees to focus on problematic separations

Edgematching

ACCURACY ASSESSMENT
+ Accuracy assessment is based on a 20% sample of training sites for each mapped type.
+ Since the 20% sample is not independent, this is not considered a true accuracy assessment.
+ The 20% validation sample is not used in classification tree model development.
+ In addition to validation with the 20% sample, we have supported our findings with a manual 5-fold cross validation with sample replacement, to test model stability and to help understand the effects of validation sample size on model stability.
A holdout sketch follows.
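A sketch of the 20% per-type holdout as a stratified split (scikit-learn assumed; variable names are hypothetical).

```python
from sklearn.model_selection import train_test_split

def split_sites(site_features, site_labels):
    """Hold out 20% of sites within each mapped type for validation."""
    return train_test_split(site_features, site_labels,
                            test_size=0.20,        # the 20% validation sample
                            stratify=site_labels,  # 20% within every type
                            random_state=0)
```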

Each site is considered correctly classified if the majority of its pixels agree with the sample polygon's label. A scoring sketch follows.
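A sketch of that site-level scoring rule; inputs are hypothetical.

```python
import numpy as np

def site_correct(mapped_pixels, reference_label):
    """mapped_pixels: 1-D array of predicted classes inside the polygon."""
    values, counts = np.unique(mapped_pixels, return_counts=True)
    return values[counts.argmax()] == reference_label   # majority agrees?
```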

Validation: Lower Wasatch Range MZ

Summary Statistics, 5-fold Validation. This is a 5-fold cross validation with sample replacement, used to test model stability and to help understand the effects of validation sample size on model stability.

Sample Size vs Accuracy and Coefficient of Variation

% of area mapped vs. available sample size (Great Salt Lake Desert)

% of area mapped vs. accuracy level

[Maps grouped by available validation sample size: < 20 samples, > 20 samples, > 40 samples.]

Conclusions
+ Current validation shows between 60-65% “accuracy” for completed mapping zones.
+ Variance of accuracies is directly linked to validation sample size.
+ A validation set of 40 points seems to provide some stability to model results.
+ A minimum of 40 samples for validation per cover type should be the target size. This is difficult due to the preponderance of “rare” types and the lack of training sites.
+ Using a 20% validation set, we need a minimum of 200 field sites per mapped type. This is difficult given the availability of unique sample sites for a particular cover type.
+ Overall map “accuracy” seems to coincide with expected results given the map detail.
+ Rare ecological systems and systems that span a wide range of cover types (e.g. shrub steppe types) tend to lower accuracies.

A STRATEGY FOR ESTIMATING TREE CANOPY DENSITY USING LANDSAT 7 ETM+ AND HIGH RESOLUTION IMAGES OVER LARGE AREAS. Chengquan Huang*, Limin Yang, Bruce Wylie, Collin Homer. Raytheon ITSS, EROS Data Center, Sioux Falls, SD 57198, USA.