Species-specific estimation and non-parametric methods (Puulajeittainen estimointi ja ei-parametriset menetelmät). Multi-scale Geospatial Analysis of Forest Ecosystems, Tahko, 23.3.2011. Petteri Packalén, Faculty of Science and Forestry.

Species-specific estimation and non-parametric methods
Multi-scale Geospatial Analysis of Forest Ecosystems, Tahko
Petteri Packalén, Faculty of Science and Forestry

How to consider tree species?
Stratification
– Inventory area is stratified as a separate step by main tree species
– Is this species-specific estimation at all?
Estimation by tree species
– Individual tree approach
– Area based method
Note: here the term species-specific growing stock refers to attributes obtained by tree species

Species-specific estimation by ALS data
The estimation by tree species is rather difficult
– Species-specific estimates are substantially less accurate than the estimates of total characteristics
– Species-specific estimation may be practically impossible if the number of tree species is high or the stand is multilayered by species
How to relate ALS data to tree species? Height or density does not correlate very well with tree species.
– Lidar intensity?
– Fusion of optical images and ALS point cloud
Compatible stand description

Species-specific estimation by individual tree approach
Detect and delineate individual trees
Classify detected canopies by tree species
– Several classification methods have been tested
Estimate tree attributes by tree species
Aggregate to desired level

Species-specific estimation by individual tree approach
Well-defined case: classify each tree
The final error in a species-specific stand attribute is the accumulated error of each processing step
– It may be difficult to trace the cause of a given error at the aggregated level
Simultaneous NN imputation of all required tree attributes

Species-specific estimation by area based method
Instead of classifying individual trees, all stand attributes are estimated by tree species
In mixed forests several tree species may or may not occur in the same estimation unit (e.g. plot or grid cell). Two principles:
– First estimate whether a particular tree species occurs or not. If it occurs, estimate the required attributes; if not, set them to zero or null.
– Always estimate attributes for all tree species. For non-existing species the prediction should be zero or null, i.e. rely on the capability of the estimation procedure to discover the non-existing tree species.
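
The two principles above can be sketched roughly as follows. This is only an illustration: the function names, the presence probability, and the 0.5 threshold are hypothetical and do not come from the presentation.

```python
def two_stage_estimate(presence_prob, attribute_estimate, threshold=0.5):
    """Principle 1: decide first whether the species occurs.

    presence_prob and threshold are assumed inputs from some
    hypothetical presence/absence classifier.
    """
    if presence_prob >= threshold:
        return attribute_estimate
    return 0.0  # species judged absent: attribute set to zero


def direct_estimate(neighbor_values):
    """Principle 2: always estimate, here as a plain mean of neighbor values.

    An absent species comes out as zero only if the reference
    observations used as neighbors hold zeros for that species.
    """
    return sum(neighbor_values) / len(neighbor_values)


# Spruce volume (m3/ha) in a hypothetical grid cell
print(two_stage_estimate(0.2, 35.0))
print(direct_estimate([0.0, 0.0, 0.0]))
```

The second function relies entirely on the estimation procedure to produce the zero, which is the point made in the second principle.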

Compatibility in area based method
Compatibility is important in practical applications. This means that:
– the estimator does not predict negative values
– the sum of species-specific estimates is equal to the total estimate
– attribute relations are logical; if the mean diameter of pine is 7 cm, the mean height must be realistic, not e.g. 25 m
The above requirements are difficult to fulfill by regression modeling; compatibility may be obtained e.g. by non-parametric methods
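
A small sketch of why a nearest neighbor type estimator tends to satisfy the first two requirements: an unweighted mean of observed, non-negative reference values cannot be negative, and the species-specific means add up to the mean of the totals. The reference plots and volumes below are invented for illustration.

```python
import math

# Hypothetical nearest reference plots: volume (m3/ha) by tree species
neighbors = [
    {"pine": 120.0, "spruce": 30.0, "deciduous": 0.0},
    {"pine": 90.0, "spruce": 60.0, "deciduous": 15.0},
    {"pine": 150.0, "spruce": 0.0, "deciduous": 5.0},
]


def impute_mean(neighbors):
    """Species-specific estimate as the unweighted mean over the neighbors."""
    k = len(neighbors)
    return {s: sum(n[s] for n in neighbors) / k for s in neighbors[0]}


estimate = impute_mean(neighbors)

# Total volume estimated directly from the same neighbors
total_estimate = sum(sum(n.values()) for n in neighbors) / len(neighbors)

print(estimate)
print(math.isclose(sum(estimate.values()), total_estimate))
```

Logical attribute relations (the third requirement) follow in the same way only when all attributes are imputed together from the same neighbors.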

Accuracy
Based on current knowledge it is impossible to state which approach is better in terms of species-specific estimates
Aggregation improves the accuracy in both approaches (not related to species-specific estimation)
It is very difficult, or at least expensive, to validate accuracy at the stand level since it requires separate validation measurements
– from trees to stand
– from cells to stand

Validity of accuracy assessment in terms of species-specific predictions
RMSE is not always the best metric to assess the accuracy of species-specific attributes
From the silvicultural point of view minority tree species are often less important
– if there are 250 m³/ha of pine in a stand, it is quite irrelevant whether the volume of spruce is estimated to be 5 m³/ha or 10 m³/ha
RMSE of dominant tree species, RMSE of minor tree species, ignore “near zero” observations
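
One possible reading of the last line, sketched under assumptions: the dominant species of a plot is taken to be the one with the largest observed volume, and "near zero" is an arbitrary cutoff of 1 m³/ha. Neither definition is given in the presentation, and the plot data are invented.

```python
import math


def rmse(pairs):
    """Root mean squared error over (observed, predicted) pairs."""
    if not pairs:
        return float("nan")
    return math.sqrt(sum((obs - pred) ** 2 for obs, pred in pairs) / len(pairs))


def stratified_rmse(plots, near_zero=1.0):
    """Return (RMSE of dominant species, RMSE of minor species).

    plots: list of dicts mapping species -> (observed, predicted) volume.
    Minor-species observations at or below near_zero are ignored.
    """
    dominant, minor = [], []
    for plot in plots:
        dom = max(plot, key=lambda s: plot[s][0])  # largest observed volume
        for species, (obs, pred) in plot.items():
            if species == dom:
                dominant.append((obs, pred))
            elif obs > near_zero:
                minor.append((obs, pred))
    return rmse(dominant), rmse(minor)


# Hypothetical validation plots, volumes in m3/ha
plots = [
    {"pine": (250.0, 240.0), "spruce": (5.0, 10.0), "deciduous": (0.0, 2.0)},
    {"pine": (40.0, 45.0), "spruce": (180.0, 200.0), "deciduous": (0.5, 0.0)},
]
rmse_dominant, rmse_minor = stratified_rmse(plots)
print(rmse_dominant, rmse_minor)
```

Reporting the two figures separately keeps a large relative error in a minor species from dominating the overall assessment.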

Which one makes more sense?

Nearest Neighbor imputation
Measure n predictor variables and one or more dependent variables on a set of observation units (reference observations)
Dependent variables are assigned to locations in an n-dimensional prediction space; the dimensions are formed by the ranges of values of the n predictor layers
Values of the dependent variables can then be imputed to other locations in the prediction space (target observations)
Numerous distance metrics, e.g. Euclidean, Manhattan, Minkowski, Mahalanobis, MSN, RF etc.
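
A minimal sketch of single nearest neighbor imputation with a Minkowski distance (p = 1 gives Manhattan, p = 2 Euclidean). The predictor names (an ALS mean height and a canopy cover ratio) and all values are assumptions made for the example; in practice predictors would usually be scaled first, and Mahalanobis, MSN and RF distances are not covered here.

```python
def minkowski(a, b, p=2.0):
    """Minkowski distance between two predictor vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest_neighbor_impute(target_x, reference, p=2.0):
    """Return the dependent variables of the closest reference observation.

    reference: list of (predictors, dependent_variables) tuples.
    """
    _, responses = min(reference, key=lambda r: minkowski(target_x, r[0], p))
    return responses


# Hypothetical reference observations: (ALS mean height in m, canopy cover)
reference = [
    ((18.0, 0.80), {"volume": 210.0, "mean_height": 19.0}),
    ((9.0, 0.55), {"volume": 70.0, "mean_height": 10.5}),
    ((14.0, 0.70), {"volume": 140.0, "mean_height": 15.0}),
]

# Target observation: predictors known, dependent variables imputed
imputed = nearest_neighbor_impute((13.0, 0.65), reference)
print(imputed)
```

All dependent variables of the chosen reference observation are copied together, which is what keeps the imputed attributes mutually consistent.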

Nearest Neighbor imputation
Issues to consider
– Distance metric, how to calculate the weight matrix? Exhaustive search, heuristic algorithms, … In Mahalanobis and MSN the weights are solved analytically
– What is the optimal value of k?
– How to weight the NNs in order to obtain the estimate?
Multiple dependent variables may be estimated simultaneously → often utilized in forestry applications
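
One common way to address the questions about k and neighbor weighting is sketched below: inverse-distance weights for the k nearest neighbors, and a leave-one-out RMSE to compare values of k. This is an assumed approach, not necessarily the one used in the presentation; the data are synthetic and the pooled RMSE over both dependent variables ignores their different units.

```python
import math


def knn_impute(target_x, reference, k=3, eps=1e-9):
    """Impute all dependent variables at once from the k nearest neighbors,
    weighted by inverse Euclidean distance (eps avoids division by zero)."""
    ranked = sorted(reference, key=lambda r: math.dist(target_x, r[0]))[:k]
    weights = [1.0 / (math.dist(target_x, x) + eps) for x, _ in ranked]
    total = sum(weights)
    n_resp = len(ranked[0][1])
    return [
        sum(w * y[j] for w, (_, y) in zip(weights, ranked)) / total
        for j in range(n_resp)
    ]


def loo_rmse(reference, k):
    """Leave-one-out RMSE pooled over all dependent variables."""
    sq, n = 0.0, 0
    for i, (x, y) in enumerate(reference):
        rest = reference[:i] + reference[i + 1:]
        pred = knn_impute(x, rest, k)
        sq += sum((p - o) ** 2 for p, o in zip(pred, y))
        n += len(y)
    return math.sqrt(sq / n)


# Synthetic reference data: one predictor, two dependent variables
reference = [
    ((0.0,), [10.0, 1.0]),
    ((1.0,), [20.0, 2.0]),
    ((2.0,), [30.0, 3.0]),
    ((3.0,), [40.0, 4.0]),
]

print(knn_impute((0.5,), reference, k=2))
print({k: loo_rmse(reference, k) for k in (1, 2, 3)})
```

The k with the smallest leave-one-out error would be a natural candidate, although the choice also depends on how much smoothing of the imputed values is acceptable.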

NN development in Kärkihanke (spearhead project)
Nearest Neighbor Imputation in Airborne Laser Scanning Based Forest Inventory
– Visit at Oregon State University
Two topics
– Heuristic optimization in NN imputation
– Tree list estimation in complex stands

Heuristic optimization in NN imputation
Selecting an appropriate set of independent variables for NN imputation is a difficult and laborious task
There are several distance metrics used to define the neighborhood
→ Use heuristic optimization to solve both tasks
– Select the goodness-of-fit criterion to be minimized (e.g. weighted RMSE)
– Distance metric: weighted Euclidean distance
– Minimize the goodness-of-fit criterion by SA, GA, …
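
The idea can be sketched with simulated annealing (SA) over the weights of a weighted Euclidean distance. A weight driven to zero drops the variable, so variable selection and the distance metric are handled together. Everything specific here is an assumption for illustration: the criterion is a plain (unweighted) leave-one-out RMSE of 1-NN imputation for a single dependent variable, and the cooling schedule, step size, and synthetic data are arbitrary.

```python
import math
import random


def weighted_dist(a, b, w):
    """Weighted Euclidean distance; w holds one non-negative weight per variable."""
    return math.sqrt(sum(wi * (x - y) ** 2 for wi, x, y in zip(w, a, b)))


def loo_rmse(data, w):
    """Leave-one-out RMSE of 1-NN imputation under weights w."""
    sq = 0.0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        _, y_nn = min(rest, key=lambda r: weighted_dist(x, r[0], w))
        sq += (y - y_nn) ** 2
    return math.sqrt(sq / len(data))


def anneal_weights(data, n_iter=200, t0=1.0, cooling=0.99, seed=0):
    """Search for weights that minimize loo_rmse by simulated annealing."""
    rng = random.Random(seed)
    n_vars = len(data[0][0])
    current = [1.0] * n_vars
    current_cost = loo_rmse(data, current)
    best, best_cost = current[:], current_cost
    temp = t0
    for _ in range(n_iter):
        cand = current[:]
        j = rng.randrange(n_vars)
        cand[j] = max(0.0, cand[j] + rng.gauss(0.0, 0.3))  # perturb one weight
        cost = loo_rmse(data, cand)
        # Accept improvements always, worse moves with a temperature-dependent probability
        if cost < current_cost or rng.random() < math.exp((current_cost - cost) / temp):
            current, current_cost = cand, cost
            if cost < best_cost:
                best, best_cost = cand[:], cost
        temp *= cooling
    return best, best_cost


# Synthetic data: the dependent variable depends on x1 only, x2 is noise
data_rng = random.Random(1)
data = []
for _ in range(30):
    x1, x2 = data_rng.uniform(0, 10), data_rng.uniform(0, 10)
    data.append(((x1, x2), 3.0 * x1))

initial_cost = loo_rmse(data, [1.0, 1.0])
best_weights, best_cost = anneal_weights(data)
print(initial_cost, best_cost, best_weights)
```

A genetic algorithm (GA) could be substituted for the annealing loop without changing the criterion or the distance function.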

Tree list estimation in complex stands

Thank you for your attention!