Feature Selection for Tree Species Identification in Very High Resolution Satellite Images Matthieu Molinier and Heikki Astola VTT Technical Research Centre.

Presentation transcript:

Feature Selection for Tree Species Identification in Very High Resolution Satellite Images
Matthieu Molinier and Heikki Astola
VTT Technical Research Centre of Finland
IGARSS 2011, Vancouver

2 17/11/2015 Introduction
NewForest: Renewal of Forest Resource Mapping
- A 1.5-year study funded by the Finnish Funding Agency for Technology and Innovation (TEKES), with Finnish forest companies and research organizations (VTT and University of Eastern Finland, UEF)
Study motivation
- Improve methods for operational forest inventory from remote sensing data
- Species-wise estimates (e.g. stem volume) are not accurate enough (accuracy vs. cost)

3 NewForest approach in forest variable estimation
- Modelling based on satellite image pixel reflectances and contextual features
- Individual tree crown (ITC) detection and crown width estimation
- Combining data to predict total amount and size variation by species (diagram labels: segmentation, estimates)
- Refined, more accurate species-wise estimates

4 Study site
Karttula / Kuopio, Central Finland
º N º E
- Karttula GeoEye image, RGB NIR
- 10.5 km x 11.5 km, 3% clouds
- Mixed forest, spruce dominated
- 25% pine, 45% spruce, 30% deciduous (mainly birch)

5 Optical image data pre-processing
- Rectification to geographic coordinate system (WGS84, NUTM35)
- Geo-coding corrected using a Digital Elevation Model (Airborne Laser Scanning DEM): mean correction 2.65 m, maximum 20 m
- Calibration to Top Of Atmosphere (TOA) reflectances using the band-specific calibration coefficients
- Atmospheric correction into surface reflectances by applying the SMAC4 radiative transfer code
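The TOA calibration step can be illustrated with the standard radiance-to-reflectance conversion. This is a minimal sketch, not the authors' processing chain: the slide does not give the calibration coefficients, so `gain`, `offset`, `esun` (band solar irradiance) and the acquisition geometry are hypothetical parameters that would come from the image metadata and sensor documentation.

```python
import math

def toa_reflectance(dn, gain, offset, esun, sun_elevation_deg, earth_sun_dist_au):
    """Convert a digital number (DN) to unitless top-of-atmosphere reflectance.

    All coefficients are placeholders: real values depend on sensor and band.
    """
    # Linear band-specific calibration from DN to at-sensor spectral radiance.
    radiance = gain * dn + offset
    # Solar zenith angle is the complement of the sun elevation.
    solar_zenith = math.radians(90.0 - sun_elevation_deg)
    # Standard TOA formula: rho = pi * L * d^2 / (ESUN * cos(theta_s)).
    return (math.pi * radiance * earth_sun_dist_au ** 2) / (esun * math.cos(solar_zenith))
```

The subsequent atmospheric correction to surface reflectance (SMAC4 on the slide) needs aerosol, water vapour and ozone inputs and is not reproduced here.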

6 Ground reference data
Training data, from 222 field plots
- 212 field plots within the GeoEye image area (2009)
- 10 additional zero-stem-volume plots extracted visually
- Tree species classification: training data from 20 pure-species field plots
Testing data, from 178 field plots (mixed species)
- 178 field plots acquired in 2009, limited spatial distribution (several plots per forest stand)
In total: 1164 ground objects mapped (276 pines, 277 spruces, 347 deciduous, 264 non-trees)
GeoEye image: 10.5 km x 11.5 km

7 Input for feature selection: features
SPECTRAL (5), set A
- R, G, B, NIR, PAN: mean intensity within 1.5 m radius around tree candidates (TC)
CONTEXTUAL (9), set B: from PAN, 7.5 m radius around TC
- mean
- mean / median
- skewness
- kurtosis
- contrast
- pm1: mean of brightest pixels
- ps1: std of brightest pixels
- pm2: mean of darkest pixels
- ps2: std of darkest pixels
SEGMENT-WISE (21), set C: from PAN, 3 segment sizes (50 m², 85 m², 125 m²)
- mean
- mean / median
- skewness
- kurtosis
- std: standard deviation
- pmean: partial mean
- pstd: partial standard deviation
Probe variables: random vectors or random permutations of a feature vector
- probe_gauss1, probe_gauss2
- probe_shuffle1, probe_shuffle2
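As an illustration of the contextual statistics and the probe variables listed on this slide, the sketch below computes most of set B for one window of PAN pixel values and builds one Gaussian and one shuffled probe. The fraction of pixels counted as "brightest" or "darkest" is not stated in the slides, so `tail_fraction=0.25` is an assumption; "contrast" is left out because its definition is not given.

```python
import random
import statistics

def window_stats(pixels, tail_fraction=0.25):
    """Contextual statistics of the PAN pixels in a window around a tree candidate."""
    values = sorted(pixels)
    n = len(values)
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    # Standardized third and fourth central moments (zero spread gives 0.0).
    skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3) if std else 0.0
    kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4) if std else 0.0
    k = max(2, int(n * tail_fraction))  # assumed size of the bright/dark tails
    darkest, brightest = values[:k], values[-k:]
    return {
        "mean": mean,
        "mean_over_median": mean / statistics.median(values),
        "skewness": skew,
        "kurtosis": kurt,
        "pm1": statistics.fmean(brightest), "ps1": statistics.pstdev(brightest),
        "pm2": statistics.fmean(darkest), "ps2": statistics.pstdev(darkest),
    }

def make_probes(feature, rng):
    """Return (gaussian probe, shuffled probe) with the same length as `feature`."""
    gauss = [rng.gauss(0.0, 1.0) for _ in feature]
    shuffled = list(feature)
    rng.shuffle(shuffled)  # same values, association with the labels destroyed
    return gauss, shuffled
```

Probes carry no class information by construction, so a real feature that ranks below them is a candidate for removal.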

8 Class definitions and training scheme
Class #   Class name
1         pine
2         spruce
3         deciduous
4         shadow
5         open area / sunlit
6         bare ground
7         green vegetation
Tree classes: 1 to 3. Non-tree classes: 4 to 7.
Training scheme (stratified sampling to preserve class proportions)
- WHOLE DATASET (1164 samples): 900 trees, 264 non-trees
  - MODEL DESIGN (773): 2/3
    - TRAINING (512): 2/3, model building
    - VAL (261): 1/3, ranking
  - TESTING (391): 1/3
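The two-level stratified split can be sketched as follows. This is an illustrative implementation, not the authors' code; the four non-tree classes are lumped into one label because the slides only give their combined count, and the per-class rounding rule is an assumption.

```python
import random
from collections import defaultdict

def stratified_split(labels, holdout_fraction, rng):
    """Split sample indices into (kept, held_out), class by class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    kept, held_out = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out the same fraction of every class to preserve proportions.
        n_hold = round(len(indices) * holdout_fraction)
        held_out.extend(indices[:n_hold])
        kept.extend(indices[n_hold:])
    return sorted(kept), sorted(held_out)

# Class counts from the ground reference slide.
labels = ["pine"] * 276 + ["spruce"] * 277 + ["deciduous"] * 347 + ["non-tree"] * 264
rng = random.Random(0)  # fixed seed, an arbitrary choice for repeatability

# First level: model design (2/3) vs. testing (1/3).
design_idx, test_idx = stratified_split(labels, 1 / 3, rng)
# Second level: training (2/3) vs. validation (1/3) within the design set.
design_labels = [labels[i] for i in design_idx]
train_pos, val_pos = stratified_split(design_labels, 1 / 3, rng)
train_idx = [design_idx[p] for p in train_pos]
val_idx = [design_idx[p] for p in val_pos]
```

With these assumptions the subset sizes come out close to, but not exactly equal to, the 391 / 773 / 512 / 261 reported on the slide; the difference depends on how rounding was handled per class.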

9 Feature selection preparation (Guyon et al., 2003)
- Feature normalization to the range [0, 1]
- Visual screening of scatter plots of the 35 real features: no obvious correlations, very few outlier samples
- Variable ranking: features assessed one by one with the simplest classifier (single threshold), one (+) vs. all (-). Four scores:
  - Fisher criterion F, scaled to [0, 1]
  - R²: squared Pearson correlation coefficient of a single feature vs. the +/- labels
  - AUC: area under the ROC (receiver operating characteristic) curve
  - sum of the previous scores (FR2AUC)
- All scores were computed for every class, then averaged to rank the variables for all 7 classes and for the tree classes only (1, 2, 3)
- No single feature significantly and consistently outperformed the others
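The single-feature ranking scores can be sketched for one one-vs-all problem as below. The exact formulas are not given in the slides, so this uses common textbook definitions as assumptions: Fisher criterion as squared mean difference over summed class variances (scaled by the maximum over features), squared Pearson correlation against +1/-1 labels, and AUC folded to max(AUC, 1 - AUC) so that the direction of the threshold does not matter.

```python
import statistics

def fisher_score(pos, neg):
    """Squared difference of class means over the sum of class variances."""
    spread = statistics.pvariance(pos) + statistics.pvariance(neg)
    return (statistics.fmean(pos) - statistics.fmean(neg)) ** 2 / spread

def pearson_r2(values, is_positive):
    """Squared Pearson correlation between a feature and +1/-1 labels."""
    labels = [1.0 if p else -1.0 for p in is_positive]
    mv, ml = statistics.fmean(values), statistics.fmean(labels)
    cov = sum((v - mv) * (l - ml) for v, l in zip(values, labels))
    var_v = sum((v - mv) ** 2 for v in values)
    var_l = sum((l - ml) ** 2 for l in labels)
    return cov * cov / (var_v * var_l)

def auc_score(pos, neg):
    """Probability that a positive sample scores above a negative one (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return max(auc, 1.0 - auc)  # assumption: either threshold direction is allowed

def rank_features(features, is_positive):
    """features: dict name -> list of values. Returns names, best first."""
    raw = {}
    for name, values in features.items():
        pos = [v for v, p in zip(values, is_positive) if p]
        neg = [v for v, p in zip(values, is_positive) if not p]
        raw[name] = (fisher_score(pos, neg), pearson_r2(values, is_positive), auc_score(pos, neg))
    max_fisher = max(f for f, _, _ in raw.values()) or 1.0
    # Combined score: scaled Fisher + R^2 + AUC (the "FR2AUC" sum on the slide).
    totals = {name: f / max_fisher + r2 + a for name, (f, r2, a) in raw.items()}
    return sorted(totals, key=totals.get, reverse=True)
```

For the multi-class case described on the slide, the same scores would be computed once per class and averaged before sorting.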

10 Feature selection and image classification
- Classification accuracy on the validation set VAL (261) used as the score
- Sequential Forward Selection (SFS) with three classification methods:
  - Linear Discriminant Analysis (LDA)
  - Quadratic LDA
  - k-nearest neighbor (kNN) classifier, k ∈ [2, 9]; feature selection and choice of k at the same time
- Find the best minimal feature subset by a brute-force approach:
  - 10 best features from the SFS
  - retrain the best model using the whole modeling dataset (TRAIN + VAL) and test with the independent TEST set
  - brute-force approach is tractable in this case with simple classifiers
  - overcomes the sub-optimality of SFS
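The combination of SFS followed by an exhaustive search over the shortlisted features can be sketched with a small kNN classifier. This is a simplified illustration under stated assumptions (Euclidean distance, majority vote, validation accuracy as the score, ties resolved in favour of the smaller subset); function names are hypothetical and it is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations

def knn_predict(train_x, train_y, sample, features, k):
    """Majority vote among the k nearest training samples on the given feature indices."""
    dists = sorted(
        (sum((row[f] - sample[f]) ** 2 for f in features), label)
        for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def accuracy(train, val, features, k):
    """Fraction of validation samples classified correctly."""
    (train_x, train_y), (val_x, val_y) = train, val
    hits = sum(knn_predict(train_x, train_y, s, features, k) == y
               for s, y in zip(val_x, val_y))
    return hits / len(val_y)

def sequential_forward_selection(train, val, n_features, k, max_selected):
    """Greedily add the feature that most improves validation accuracy."""
    selected, history = [], []
    remaining = list(range(n_features))
    while remaining and len(selected) < max_selected:
        best = max(remaining, key=lambda f: accuracy(train, val, selected + [f], k))
        selected.append(best)
        remaining.remove(best)
        history.append((list(selected), accuracy(train, val, selected, k)))
    return history

def brute_force_best_subset(train, val, candidates, k):
    """Try every non-empty subset of the shortlisted features, smallest first."""
    best_subset, best_acc = None, -1.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            acc = accuracy(train, val, list(subset), k)
            if acc > best_acc:  # strict comparison keeps the smaller subset on ties
                best_subset, best_acc = subset, acc
    return best_subset, best_acc
```

With 10 shortlisted features the exhaustive search evaluates 2^10 - 1 = 1023 subsets, which is why it stays tractable for simple classifiers. Choosing k jointly, as on the slide, would mean repeating the search for each k from 2 to 9 and keeping the best pair.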

11 17/11/ features is enough
- Spectral features performed best
- Segment-wise features are not suited to a mixed-species study
- Overall classification accuracy on tree classes was over 80%
- Probe variables were selected in the first places more often with LDA than with kNN: the linear classifier is too simple
- Quadratic LDA was overfitting
- kNN with k = 5 gave the best overall performance and the lowest difference between training and validation error, hence a lower risk of overfitting

12 Example of tree species classification map
Map legend: pine: 76 %, spruce: 76 %, deciduous: 88 %, non-forest
- Pan-sharpened GeoEye image extract of 1 km x 1 km
- Individual tree crown classification with a 5-NN classifier trained with pure-species training data
- Non-forest mask generated with k-means clustering + cluster labeling

13 Predicted species-wise stem numbers vs. field plot data
- Predicted stem number per species plotted against test data (178 test plots)
- Systematic under-estimation of predicted stem number for the spruce and deciduous classes
- Noise partly due to the small collecting radius (r = 8 m) of test data, and to geolocation differences between satellite and ground data
[Figure: three scatter plots of predicted vs. true number of stems per field plot (Nspruce, Npine, Ndecid, in stems/ha) for spruces, pines and broadleaved trees. Fit annotations as extracted: y=0.98*x, R² = 0.24; y=0.98*x; y=0.33*x; y=0.56*x, R² = 0.54; y=0.85*x, R² = 0.34]
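The fit lines on this slide have the form y = a*x, i.e. a regression forced through the origin. A minimal sketch of such a fit is given below; how R² was computed for the slide is not stated, so the definition used here (1 minus residual sum of squares over total sum of squares about the mean) is an assumption.

```python
def fit_through_origin(x, y):
    """Least-squares slope of y = a*x and an R^2 value for the fit."""
    # Closed-form slope for a zero-intercept model: a = sum(x*y) / sum(x^2).
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    mean_y = sum(y) / len(y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, 1.0 - ss_res / ss_tot
```

A slope below 1 with true counts on the x axis corresponds to the under-estimation noted above, while a low R² reflects scatter around the line rather than bias.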

14 Conclusions
- The methodology could detect individual treetops, identify their species and determine species proportions in mixed forest.
- Feature ranking and feature selection were performed on a set of 35 features for tree species classification.
- Several classifiers (each model combining a feature subset and a classification method) were built. The best turned out to be 5-NN with a subset of 6 features, mostly spectral. Segment-wise features could be discarded.
- The tree species proportion accuracy was good (1.4% to 3.5%), but the correlation of stem numbers per species was not as good as expected.
Future work
- Model selection with more elaborate classifiers (e.g. SVMs)
- Embedding feature selection into a cross-validation scheme
- Improve stem number estimation with adaptive filtering
- Validation of tree crown width estimation with ground data

15 Thank you