Automated Objective Tropical Cyclone Eye Detection


1 Automated Objective Tropical Cyclone Eye Detection
Robert DeMaria1, John Knaff2, Galina Chirokova1, and Jack Beven3  (1) CIRA, Colorado State University, Fort Collins, CO (2) NOAA Center for Satellite Applications and Research, Fort Collins, CO (3) NOAA/NWS/National Hurricane Center, Miami, FL +Hi, I’m Robert DeMaria and I work for CIRA out at Colorado State University. +Typically, I do programming support for the RAMM team at CIRA, but this past year I received my master’s degree in computer science and worked on some combined computer science and tropical cyclone research for my thesis. +I think we got some pretty good results detecting tropical cyclone eyes using machine learning algorithms. 32nd Conference on Hurricanes and Tropical Meteorology, San Juan, PR, 18 – 22 April 2016

2 Why Do We Care About Detecting Hurricane Eyes?
Initial eye formation is a sign a tropical cyclone is getting more organized. It is often a signal that the storm is about to get much stronger. Eye detection is an important step in estimating the current strength of a hurricane. Current methods require aircraft data, manual inspection using the Dvorak method, or a limited number of automated methods. The Dvorak method is only performed four times a day, leaving a large amount of data unused. +So, first of all, why do we care about detecting whether a hurricane has an eye or not? +Eye detection can help tell us how strong a storm currently is and whether it may experience rapid intensification. +Generally, if you want to perform eye detection, you either need to look at aircraft data or you need a human being to perform the Dvorak method. +But right now, the Dvorak method is only performed 4 times per day.

3 IR Data from Geostationary Satellites
Available every 30 minutes around the globe +So what this means is that the majority of the data available to us is not being used for eye detection. +An automated algorithm could make eye detection information available at more times, which could then be used as input to other schemes, such as a rapid intensification scheme.

4 Objective Replicate human-performed eye detection with an automated procedure. Utilize the same input routinely available to forecasters. Generate a scheme with accuracy close to human-performed accuracy (target: 95%). Make eye detection available at more times to supplement human-performed eye detection. +So what we set out to do with this project is to replicate human-performed eye detection with an automated procedure using the same input routinely available to forecasters. +And we wanted to get the answer right about 95% of the time to match the accuracy of human-performed eye detection. +Again, the ultimate goal is to make eye detection available at more times to supplement human-performed eye detection.

5 Method Used geostationary IR data and best track data
Used archive of classified images (from Dvorak fixes) as training/truth: Atlantic Basin, 4109 samples. Used simple statistical computer vision/machine learning techniques: Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Quadratic Discriminant Analysis (QDA). +So what we did is we took the Atlantic Dvorak fixes and matched them up with geostationary IR images and best track data available at the same time. +This gave us a total of 4109 images with a classification of “Eye Present” or “Eye Absent”. +We then used this data as input into some simple statistical techniques from the field of computer vision and machine learning. +I like these methods because in order to use them, you just need to perform a little linear algebra.

6 IR vs Microwave vs Visible
Visible: Used in addition to IR in Dvorak. Only available during the day. Adds complexity to algorithm. Microwave: 3D view inside storm. Easier to determine if eye is present. Only available every 12 hr per satellite. Can miss a storm entirely. +So just as a quick aside, I’d like to point out that we only used IR data when we could have also used visible and microwave data. +Even though visible data is also used in the Dvorak scheme, it’s only available during the day, and producing an algorithm that used both IR and visible data would have made the algorithm more complex than what we wanted. +Additionally, even though it can be easier to determine if an eye is present using microwave data, it’s only available every 12 hours and it can miss a storm entirely. +Again, using this data would add more complexity.

7 Algorithm Subsect/unroll satellite imagery Form training/testing sets
Perform PCA for dimension reduction Combine with ancillary data Train QDA/LDA Use testing set to evaluate performance +So the algorithm follows these six basic steps +First we subsect each of our sample images +Then we form training and testing sets from our sample images +After that, we reduce the dimension of each image +Then we combine each of our samples with ancillary data +Finally, we train QDA/LDA and use the testing set to evaluate performance

8 Step 1) Subsecting Imagery
Cut out 80x80 pixel box centered on best track estimate. +So the first thing we do is take each of our images and cut out an 80 pixel by 80 pixel box centered on the best track estimate. +This box is about 320km by 320km.

9 Unroll Imagery Unroll 80x80 pixel box to form 6400 element vector.
Then we unroll the box so it forms a 6400 element vector.
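The subsect-and-unroll steps above can be sketched in a few lines of NumPy; the image size and center location here are illustrative placeholders, not values from the real dataset:

```python
import numpy as np

# Illustrative stand-in for a geostationary IR image (brightness temperatures);
# the image size and best-track pixel position are made up for this sketch.
ir_image = np.zeros((200, 200))
center_row, center_col = 100, 100

# Cut out an 80x80 pixel box centered on the best-track estimate...
half = 40
box = ir_image[center_row - half:center_row + half,
               center_col - half:center_col + half]

# ...then unroll the box into a 6400-element vector.
sample = box.ravel()
print(sample.shape)   # (6400,)
```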

10 Step 2) Training/Testing Sets
Data randomly shuffled 70% used for training 30% used for testing Then we randomly shuffle our images and use 70% of the samples to train the algorithm and the remaining 30% to test the algorithm.
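A minimal sketch of this shuffle-and-split, assuming the samples are addressed by index (the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed for reproducibility
n_samples = 4109                 # total classified IR images in the archive

# Randomly shuffle the sample indices, then split 70% / 30%.
indices = rng.permutation(n_samples)
n_train = int(0.7 * n_samples)
train_idx, test_idx = indices[:n_train], indices[n_train:]
```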

11 Step 3) Principal Component Analysis
Problem: Have 4109 total samples and 6400 predictors per image Solution: Use PCA to reduce data to the most important patterns. PCA used to project data from 6400 predictors down to 10 while only losing ~10% of the variance in the data. +So at this point, we have a problem. +We have 6400 predictors and 4109 samples. +These machine learning algorithms tend to get confused when you have too many predictors compared to the number of samples you have. +To solve this problem, we use PCA. +If you’re not familiar with this technique, then the bottom line is that you can find the basic patterns in your dataset and only select the most important ones. +In our case, we can project our data down to just 10 predictors while only losing about 10% of the variance of our data.
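The PCA projection can be sketched with plain NumPy via an SVD of the centered data. A small random placeholder matrix stands in for the real 4109 x 6400 sample matrix, so the retained-variance number here will not match the ~90% quoted above:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for the unrolled IR samples; fewer rows than the real
# 4109 x 6400 matrix so the sketch runs quickly.
X = rng.normal(size=(300, 6400))

# Center the data and find the EOF patterns (principal components) via SVD.
X_centered = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Project each sample from 6400 predictors down to the leading 10 EOFs.
n_components = 10
X_reduced = X_centered @ Vt[:n_components].T

# Fraction of the total variance retained by those components.
retained = (s[:n_components] ** 2).sum() / (s ** 2).sum()
```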

12 Top 10 Most “Significant” EOFs Selected
+If you are familiar with PCA, then these are the top 25 EOFs. +Using a measure that estimated how well each EOF separates “eye” images from “no-eye” images, the 10 best EOFs were selected. +Interestingly, all the EOFs that resemble eyes were selected. (EOFs 7 & 8 resemble a curved band.)

13 Step 4) Combine With Ancillary Data
Each IR sample projected down from 6400 predictors to 10. Added 4 additional predictors: VMAX, Storm Motion U, Storm Motion V, Storm Lat. 14 predictors total per sample. Analysis indicated VMAX most important; the 10 IR predictors were next most important. +Once each sample is projected down from 6400 predictors to 10 predictors, we add 4 additional predictors to each sample. +So we end up with a vector of 14 predictors per sample. +Our analysis indicated that the most important predictor was the VMAX, followed by the predictors associated with the IR data.
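Assembling the 14-predictor vector is just a concatenation; the ancillary values below are purely illustrative placeholders:

```python
import numpy as np

# 10 PCA-reduced IR predictors for one sample (placeholder values).
x_ir = np.zeros(10)

# The 4 ancillary predictors -- values here are made up for the sketch.
vmax = 65.0        # current intensity
motion_u = 3.2     # storm motion, zonal component
motion_v = 1.1     # storm motion, meridional component
storm_lat = 24.5   # storm latitude

# Concatenate into the final 14-predictor vector for this sample.
x = np.concatenate([x_ir, [vmax, motion_u, motion_v, storm_lat]])
print(x.shape)   # (14,)
```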

14 Step 5) QDA/LDA
[Diagrams: LDA (linear boundary) and QDA (curved boundary) separating an Eye region from a No-Eye region in a 2-predictor space. A No-Eye sample in the Eye region is a false positive; an Eye sample in the No-Eye region is a false negative. The QDA curve does a better job of classifying weak cases. The real version uses 14 predictors/dimensions.] +Finally, we train QDA and LDA with our training samples. +The way this works is that you have to imagine that each sample represents a point in space where each predictor is an axis in this space. +Each training sample has a classification of whether it’s an “Eye” sample or a “No-eye” sample. +So QDA and LDA create a boundary in this space and try to get as many “Eye” samples on one side of the boundary and “No-eye” samples on the other side of the boundary. +LDA creates a linear boundary and QDA creates a curved boundary. +Given a new sample, you can perform classification by determining which side of the boundary the sample lies on.
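A from-scratch sketch of the LDA side on synthetic 2-predictor data (the real version uses the 14 predictors, and in practice a library such as scikit-learn's LinearDiscriminantAnalysis/QuadraticDiscriminantAnalysis would typically be used):

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy 2-predictor training data standing in for the real samples:
# class 0 = "no eye", class 1 = "eye" (synthetic, illustrative only).
X0 = rng.normal(loc=[-1.0, -1.0], size=(200, 2))
X1 = rng.normal(loc=[+1.0, +1.0], size=(200, 2))

mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
# LDA assumes a single covariance matrix shared (pooled) across classes.
cov = np.cov(np.vstack([X0 - mu0, X1 - mu1]).T)
cov_inv = np.linalg.inv(cov)

def lda_discriminant(x, mu, prior):
    # Linear discriminant: x' S^-1 mu - 0.5 mu' S^-1 mu + ln(prior)
    return x @ cov_inv @ mu - 0.5 * mu @ cov_inv @ mu + np.log(prior)

def classify(x):
    # Pick the class whose discriminant value is larger; the boundary
    # between the two regions is linear in x.
    d0 = lda_discriminant(x, mu0, 0.5)
    d1 = lda_discriminant(x, mu1, 0.5)
    return int(d1 > d0)

print(classify(np.array([2.0, 2.0])))   # 1 -- deep in the "eye" region
```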

15 Verification Correctly classifies ~90% of imagery in testing set
Target: ~95%. [Images: Hurricane Katrina] +So the algorithm gets the right answer about 90% of the time. +That’s pretty close to our target of 95%. +It does very well on cases like the one on the left, where the storm has no structure, and the one on the right, where the eye is extremely well defined.

16 Algorithm Analysis for Hurricane Danielle
[Plots: LDA/QDA eye probability over the lifetime of Hurricane Danielle] +The output of LDA and QDA can also be interpreted as a probability that an image contains an eye. +If the probability is above 50%, then the algorithm classifies the image as an “eye” case. +Here you can see the performance of LDA and QDA over the lifetime of Hurricane Danielle. +LDA and QDA are represented by the green and red lines, and they’re compared to whether an eye was actually observed, which is the black dashed line. +Most of the time, the algorithm got the correct answer. +However, there were a few cases where the algorithm doesn’t do so well.

17 Future Work
Further steps: re-center; add wind shear; add VIIRS (most important for cases with high uncertainty); use algorithm results as an input into the RII; predict eye formation. [Images: a false positive (misclassified as eye) and false negatives (misclassified as no eye), Hurricane Danielle] +The algorithm doesn’t perform well on cases where the storm is heavily sheared, where the storm is offset from the center of the image, or when we have a small eye. +So some of the things we’d like to do with the algorithm are to +try to re-center off-center images +use wind shear data as a predictor +use the algorithm with high-resolution VIIRS data when the algorithm is uncertain, to help accommodate cases with a small eye. +Additionally, we would like to use the algorithm’s output as input in a rapid intensification scheme +and potentially use the algorithm to help predict eye formation.

18 Conclusions LDA and QDA can be used to objectively identify tropical cyclone eyes using commonly available data: IR satellite imagery and basic storm parameters. 90% success rate. Most important predictor is VMAX; IR data more important than all other ancillary data. +To summarize, LDA and QDA can correctly identify tropical cyclone eyes using IR data and basic storm parameters about 90% of the time. +We’re working on improving the algorithm to get the accuracy closer to 95%. +Of the 14 predictors, VMAX was the most important predictor, followed by the IR data and then the remaining ancillary data.

19 Extra Slides

20 LDA/QDA Verification Results
About 90% correct classification (close to the 95% accuracy of “truth”). BSS better for LDA (accuracy of probabilities); PSS better for QDA (accuracy of classifications). QDA has fewer samples for the eye covariance matrix. Note: all data used for case studies. Average performance metrics from 1000 shufflings of the training/testing sets and with all cases.

21 QDA/LDA Discriminant Functions
where Σk is the covariance matrix for class k and μk is the mean for class k. LDA: where Σ is the weighted average of all Σk. Pick class k with the largest δk value Bi-variate normal distributions provide probability of being in class k
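The discriminant-function formulas referenced on this slide appear as images in the original; under the stated Gaussian assumptions they take the standard form (a reconstruction with class priors written as πk, not necessarily the slide's exact notation):

```latex
% QDA: class-specific covariance \Sigma_k, quadratic in x
\delta_k(x) = -\tfrac{1}{2}\ln\lvert\Sigma_k\rvert
              - \tfrac{1}{2}(x-\mu_k)^{\mathsf{T}}\Sigma_k^{-1}(x-\mu_k)
              + \ln\pi_k

% LDA: shared (pooled) covariance \Sigma; the terms that do not depend
% on k cancel when comparing classes, leaving a function linear in x
\delta_k(x) = x^{\mathsf{T}}\Sigma^{-1}\mu_k
              - \tfrac{1}{2}\mu_k^{\mathsf{T}}\Sigma^{-1}\mu_k
              + \ln\pi_k
```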

22 Dimension Reduction Can construct approximation of original data with a few EOFs

23 Sensitivity Vector Use LDA to give insight into the importance of predictors. Calculate the change of the discriminant-function difference for a 1 standard deviation change in each predictor value z: Δz(δ2 − δ1) = ∂/∂z[(δ2 − δ1)]·σz. The LDA δk are linear in x, so the Δz are constants.

24 Additional Metrics
Peirce Skill Score (PSS): evaluates skill of classifications compared to random guesses based on the training data distribution. Brier Skill Score (BSS): evaluates skill of probabilities compared to constant probabilities obtained from the training data distribution. For BSS and PSS: 1 is perfect, 0 is no better than a no-skill scheme, and < 0 is worse than no skill.
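Both scores are simple to compute from a set of forecasts. The truth and probability values below are made up for illustration (with these numbers, both scores happen to come out to 0.5):

```python
import numpy as np

# Illustrative forecasts: truth (1 = eye observed), predicted probabilities,
# and the yes/no classifications from a 50% threshold.
truth = np.array([1, 1, 1, 0, 0, 0, 0, 1])
prob  = np.array([0.9, 0.7, 0.4, 0.2, 0.1, 0.6, 0.3, 0.8])
pred  = (prob >= 0.5).astype(int)

# Peirce Skill Score: hit rate minus false-alarm rate.
hits = np.sum((pred == 1) & (truth == 1))
misses = np.sum((pred == 0) & (truth == 1))
false_alarms = np.sum((pred == 1) & (truth == 0))
corr_negatives = np.sum((pred == 0) & (truth == 0))
pss = hits / (hits + misses) - false_alarms / (false_alarms + corr_negatives)

# Brier Skill Score: probability forecasts vs. a constant forecast of the
# climatological base rate from the training data.
base_rate = truth.mean()
bs = np.mean((prob - truth) ** 2)
bs_ref = np.mean((base_rate - truth) ** 2)
bss = 1.0 - bs / bs_ref

print(pss, bss)   # 0.5 0.5
```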

25 Step 5) QDA/LDA Need estimate of probability of being in class k, given input vector x: P(C=k | x). Bayes’ rule relates P(C=k | x) to P(x | C=k). Fit bivariate normal distributions to get P(x | C=k) for each class. Take the natural log of P(C=k | x) to get the discriminant function for each class. For LDA, assume the class covariance matrices are equal; this makes the discriminant function linear in x.

26 Algorithm Analysis for Hurricane Katrina

27 Katrina Misclassified Images
False Negatives:

