1 Lecture 14: Classification Thursday 18 February 2010 Reading: Ch. 7.13 – 7.19 Last lecture: Spectral Mixture Analysis

2 Classification vs. Spectral Mixture Analysis In SMA, image pixels were regarded as being mixed from various proportions of common materials. The goal was to find what those materials were in an image, and what their proportions were pixel by pixel. In classification, the image pixels are regarded as grouping into thematically and spectrally distinct clusters (in DN space). Each pixel is tested to see what group it most closely resembles. The goal is to produce a map of the spatial distribution of each theme or unit. (Example groups: Water – group 1, Forests – group 2, Desert – group 3.)

3  =2  =48  =60 Multi-unit veg map AVHRR Images with pixels similar to vegetation flagged according to distance at different tolerances 

4 What is spectral similarity? Plot two spectra A and B as points (or vectors) in DN space with bands X and Y as axes. Spectral distance: the Euclidean distance between A and B, d = sqrt(Σi (Ai − Bi)²). Spectral angle: the angle between A and B treated as vectors from the origin, θ = cos⁻¹(A·B / (|A| |B|)). Spectral contrast between similar objects is small.
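
A minimal sketch (not from the lecture) of how the two measures could be computed for a pair of pixel spectra, assuming each spectrum is an array of band DNs; the example values are made up:

```python
import numpy as np

def spectral_distance(a, b):
    """Euclidean distance between two spectra in DN space."""
    return np.sqrt(np.sum((a - b) ** 2))

def spectral_angle(a, b):
    """Angle (radians) between two spectra treated as vectors from the origin."""
    cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos_theta, -1.0, 1.0))

# Hypothetical 4-band spectra for two pixels, A and B
A = np.array([45.0, 60.0, 52.0, 110.0])
B = np.array([47.0, 63.0, 55.0, 118.0])
print(spectral_distance(A, B), np.degrees(spectral_angle(A, B)))
```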

5 Manual Classification (Seattle image) 1) Association of pixels into units by spectral similarity; 2) naming those units, generally using independent information: reference spectra, field determinations, photo-interpretation.

6 Basic steps in image classification: 1) Data reconnaissance and self-organization 2) Application of the classification algorithm 3) Validation

7 Reconnaissance and data organization
Reconnaissance: What is in the scene? What is in the image? What bands are available? What questions are you asking of the image? Can they be answered with image data? Are the data sufficient to distinguish what’s in the scene?
Organization of data: How many data clusters in n-space can be recognized? What is the nature of the cluster borders? Do the clusters correspond to desired map units?

8 Classification algorithms [flowcharts]
Unsupervised: form images of the data → separate the data into groups with clustering → classify the data into those groups → assign a name to each group → satisfactory? If no, iterate; if yes, done.
Supervised: form images of the data → choose training pixels for each category → calculate statistical descriptors → classify the data into the categories defined → satisfactory? If no, iterate; if yes, done.

9 Unsupervised Classification: K-Means algorithm Pick the number of themes; set a distance tolerance. The 1st pixel defines the 1st theme. Is the 2nd pixel within tolerance? YES: redefine the theme; NO: define a 2nd theme. Interrogate the 3rd pixel… Iterate, using the “found” themes as the new seed. How do you estimate the number of themes? It can be greater than the number of bands.
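
A bare-bones sketch of k-means clustering over image pixels, assuming the image has been reshaped to a list of pixel spectra. This is the textbook iterate-on-means variant, not the seeded, tolerance-based procedure sketched above, and the image values are made up:

```python
import numpy as np

def kmeans(pixels, k, n_iter=20, seed=0):
    """Bare-bones k-means: pixels is (n_pixels, n_bands); returns labels and theme means."""
    rng = np.random.default_rng(seed)
    means = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Distance of every pixel to every current theme mean
        dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):                      # recompute each mean from its members
            if np.any(labels == j):
                means[j] = pixels[labels == j].mean(axis=0)
    return labels, means

# Hypothetical 3-band image, reshaped to a list of pixel spectra
image = np.random.default_rng(1).integers(0, 256, size=(100, 100, 3)).astype(float)
labels, means = kmeans(image.reshape(-1, 3), k=4)
theme_map = labels.reshape(100, 100)            # unsupervised theme map
```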

10 Supervised Classification: What are some algorithms? Parallelepiped, Minimum Distance, Maximum Likelihood, Decision-Tree. “Hard” vs. “soft” classification: Hard means winner takes all; Soft means the “answer” is expressed as the probability that pixel x belongs to A, to B, and so on. “Fuzzy” classification is very similar to spectral unmixing.

11 Parallelepiped Classifier Assigns a DN range in each band for each class (a parallelepiped). Advantages: simple. Disadvantages: low accuracy; especially when the distribution in DN space has strong covariance, large areas of the parallelepipeds may not be occupied by data, and the parallelepipeds may overlap.
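
A minimal sketch of a parallelepiped (box) classifier; the class names and per-band DN ranges are invented for illustration:

```python
import numpy as np

def parallelepiped_classify(pixels, boxes):
    """pixels: (n, n_bands); boxes: {class: (low, high)} giving a DN range per band.
    A pixel gets the first class whose box contains it, else stays 'unclassified'."""
    labels = np.array(["unclassified"] * len(pixels), dtype=object)
    for name, (low, high) in boxes.items():
        inside = np.all((pixels >= low) & (pixels <= high), axis=1)
        labels[inside & (labels == "unclassified")] = name   # overlaps: first box wins
    return labels

# Hypothetical per-band DN ranges for two classes in a 3-band image
boxes = {"water":  (np.array([10, 20, 5]),  np.array([40, 60, 30])),
         "forest": (np.array([30, 50, 20]), np.array([90, 120, 80]))}
pixels = np.array([[20, 40, 15], [60, 90, 50], [200, 210, 190]], dtype=float)
print(parallelepiped_classify(pixels, boxes))   # ['water' 'forest' 'unclassified']
```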

12 Minimum-Distance Classifier Uses only the mean of each class. The unknown pixel is classified using its distance to each of the class means; the shortest distance wins. [Figure: decision boundaries between the class means in DN space.]
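
A minimal sketch of the minimum-distance rule, assuming class means have already been computed from training areas (the means and pixel values here are invented):

```python
import numpy as np

def minimum_distance_classify(pixels, class_means):
    """pixels: (n, n_bands); class_means: {name: mean spectrum}. Shortest distance wins."""
    names = list(class_means)
    means = np.stack([class_means[n] for n in names])            # (k, n_bands)
    dist = np.linalg.norm(pixels[:, None, :] - means[None, :, :], axis=2)
    return [names[i] for i in dist.argmin(axis=1)]

# Hypothetical class means taken from training areas (3 bands)
class_means = {"water":  np.array([25.0, 40.0, 15.0]),
               "forest": np.array([60.0, 85.0, 45.0])}
print(minimum_distance_classify(np.array([[30.0, 45.0, 20.0]]), class_means))  # ['water']
```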

13 Maximum Likelihood The most commonly used classifier. A pixel is assigned to a class on the basis of statistical probability, using statistics (mean and covariance) for each class. A (Bayesian) probability function is calculated from the inputs for classes established from training sites. Each pixel is then judged as to the class to which it most probably belongs.

14 Maximum Likelihood For each DN n-tuple in the image: 1) calculate the distance to each cluster mean; 2) scale by the number of standard deviations in the direction of the n-tuple from the mean; 3) construct rule images, pixel by pixel for each cluster, in which that number of standard deviations is recorded; 4) threshold the rule images (null out pixels too far from every cluster); 5) pick the best match (the smallest number of standard deviations) and record it in the appropriate pixel of the output image or map.
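
A sketch of this procedure, not the lecture's code: it uses the Mahalanobis distance as the "number of standard deviations" measure, which is one common way to build the rule images; a full maximum-likelihood classifier would evaluate the class-conditional Gaussian probability instead, but the thresholding and best-match steps are the same. Class names, statistics, and the threshold of 3 are assumptions:

```python
import numpy as np

def rule_images(pixels, class_stats):
    """One 'rule image' per class: the Mahalanobis distance of each pixel from the
    class mean, a stand-in for 'number of standard deviations'.
    pixels: (n, n_bands); class_stats: {name: (mean, covariance)} from training sites."""
    rules = {}
    for name, (mean, cov) in class_stats.items():
        inv_cov = np.linalg.inv(cov)
        diff = pixels - mean
        rules[name] = np.sqrt(np.einsum("ij,jk,ik->i", diff, inv_cov, diff))
    return rules

def classify(rules, threshold=3.0):
    """Threshold the rule images, then pick the best (smallest-distance) match."""
    names = list(rules)
    stack = np.stack([rules[n] for n in names])                  # (k, n_pixels)
    labels = np.array([names[i] for i in stack.argmin(axis=0)], dtype=object)
    labels[stack.min(axis=0) > threshold] = None                 # null: too far from every class
    return labels

# Hypothetical 2-band class statistics and three pixels to classify
stats = {"water":  (np.array([25.0, 40.0]), np.array([[20.0, 5.0], [5.0, 15.0]])),
         "forest": (np.array([60.0, 85.0]), np.array([[30.0, 10.0], [10.0, 25.0]]))}
pixels = np.array([[27.0, 43.0], [59.0, 88.0], [150.0, 10.0]])
print(classify(rule_images(pixels, stats)))     # ['water' 'forest' None]
```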

15 Decision-Tree Classifier A hierarchical classifier compares the data sequentially against carefully selected features. Features are determined from the spectral distributions or the separability of the classes. There is no general procedure: each decision tree or set of rules is custom-designed. A decision tree that provides only two outcomes at each stage is called a “binary decision tree” (BDT) classifier.
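
Because each tree is custom-designed, there is no single implementation; a toy binary-decision-tree sketch with invented features, thresholds, and class names:

```python
def binary_decision_tree(pixel):
    """Toy hand-built BDT; real trees are custom-designed from the class separability.
    pixel: dict of band values / derived features (hypothetical names and thresholds)."""
    if pixel["ndvi"] > 0.4:                       # vegetated branch
        return "forest" if pixel["nir"] > 90 else "grass"
    return "water" if pixel["nir"] < 30 else "bare soil"

print(binary_decision_tree({"ndvi": 0.55, "nir": 120}))   # -> forest
```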

16 Pre-processing: dimension transformation (ratioing, NDVI, spectral angle). One goal: reduce the impact of topography on the outcome. [Figure: a line of constant ratio in (x, y) DN space; classes A and B re-plotted in ratio space (x/y vs. y/z).]
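
A minimal sketch of one such transformation, the NDVI band ratio, assuming red and near-infrared bands are available as arrays (the sample values are made up):

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, a band ratio; dividing bands suppresses
    the overall brightness differences caused by topography and illumination."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / np.maximum(nir + red, 1e-6)

# Hypothetical red and near-infrared bands of a tiny image
red = np.array([[30, 40], [80, 25]])
nir = np.array([[90, 60], [85, 120]])
print(ndvi(nir, red))
```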

17 Validation: photointerpretation. Look at the original data: does your map make sense to you?

18 Confusion matrices Well-named. Also known as contingency tables or error matrices. Here’s how they work… The matrix cross-tabulates the training areas (columns, classes A–F) against the classified data (rows, classes A–F), with column sums, row sums, and a grand sum. All non-diagonal elements are errors: row sums give “commission” errors, column sums give “omission” errors, and overall accuracy is the diagonal sum over the grand total. This is the assessment only for the training areas; what do you do for the rest of the data? (Example matrix from p. 586, LKC 6th ed.; the full numbers appear on the next slide.)

19 Again: columns give the cover types used for training (reference data), and rows give the pixels actually classified into each category by the classifier. Column sums are the number of reference pixels in each class; row sums are the number of pixels classified as each class.

Classified \ Training    A     B     C     D     E     F  | Row sum
A                      480     0     5     0     0     0  |   485
B                        0    52     0    20     0     0  |    72
C                        0     0   313    40     0     0  |   353
D                        0    16     0   126     0     0  |   142
E                        0     0     0    38   342    79  |   459
F                        0     0    38    24    60   359  |   481
Column sum             480    68   356   248   402   438  |  1992 (grand sum)

Producer’s accuracy (correctly classified training pixels ÷ column total): A 480/480 = 100%, B 52/68 = 76%, C 313/356 = 88%, D 126/248 = 51%, E 342/402 = 85%, F 359/438 = 82%.
User’s accuracy (correctly classified pixels ÷ total classified as that category, the row total): A 480/485 = 99%, B 52/72 = 72%, C 313/353 = 87%, D 126/142 = 89%, E 342/459 = 74%, F 359/481 = 75%.
Overall accuracy = diagonal sum ÷ grand sum = 84%.

Producer’s accuracy: the number of correctly classified pixels used for training divided by the number in the training area. User’s accuracy: the number of correctly classified pixels in each category divided by the total number classified as that category. Overall accuracy: the total number of correctly classified pixels divided by the total reference pixels in all categories.
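
The same accuracy figures can be recomputed directly from the matrix; a short NumPy sketch, with the matrix values transcribed from the slide:

```python
import numpy as np

# Error matrix from the slide: rows = classified data, columns = training (reference) data
classes = ["A", "B", "C", "D", "E", "F"]
cm = np.array([[480,   0,   5,   0,   0,   0],
               [  0,  52,   0,  20,   0,   0],
               [  0,   0, 313,  40,   0,   0],
               [  0,  16,   0, 126,   0,   0],
               [  0,   0,   0,  38, 342,  79],
               [  0,   0,  38,  24,  60, 359]])

correct = np.diag(cm)
producers = correct / cm.sum(axis=0)     # correct / column (reference) totals
users     = correct / cm.sum(axis=1)     # correct / row (classified) totals
overall   = correct.sum() / cm.sum()     # diagonal sum / grand sum  (≈ 0.84)
print(dict(zip(classes, producers.round(2))))
print(dict(zip(classes, users.round(2))))
print(round(overall, 2))
```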

20 A basic problem with classification Mystery pixel X is found to be spectrally similar to Theme A = grass. What’s actually on the ground? A meadow (= bear habitat), a golf course, or a cemetery (≠ bear habitat): all three look similar to A because they are grassy. We tend to want to classify by land use, and from the remote sensing perspective this may lead to ambiguity. I want to find bears. Bears like meadows. I train on a meadow (Theme A) and classify an image to see where the bears are. Pixel X is classified as similar to A. Will I find bears there? Maybe not: 1) they might be somewhere else even though they like the meadow; 2) what a meadow is, from the RS perspective, is a high fraction of green vegetation (GV), and other cover types share this equivalence. Therefore X may indeed belong to A spectrally, but not according to use. Thought exercise: what would you need to do in order to classify by land use?

21 Next class: Radar

