Presentation is loading. Please wait.

Presentation is loading. Please wait.

Examining Hurricane Irma with Twitter Data

Similar presentations


Presentation on theme: "Examining Hurricane Irma with Twitter Data"— Presentation transcript:

1 Examining Hurricane Irma with Twitter Data
and Machine Learning Gbemisola Oladosu, Dylan Johnson, (Mentors: Dr. Chien-fei Chen, Dr. Xiaojing Xu, Zach McMichael, Jullian Ball)

2 Outline Introduction Purpose Research Questions Methods
Data Collection Text Cleaning Descriptives Bag-of-words Doc2Vec Doc2Vec Models Results Conclusion Future Work

3 Introduction Proportion of Category 4 and 5 hurricanes in the US has increased in the past two decades[1] Cost of hurricane damage is increasing[2] Larger responses are necessary Higher risk for social issues Measuring response effectiveness and social issues is important

4 Introduction (Cont.) How did tweet content and purpose change over time? What tweet content categories had the most number of complaints, messages of appreciation, or requests for help? How did people feel about the government?

5 Methods 10,784 Irma-related tweets from September 2017
Tweets were labeled for content (Code 1) and purpose (Code2) Removed: Duplicates Labeling mistakes

6 Methods (Cont.) Text was cleaned: Non-English Retweets Punctuation
Articles URLs Set to lowercase Original Text: Senior community finally gets power restored 18 days after Irma on.wtsp.com/2xOJoDN pic.twitter.com/Og1dyXLMEa Cleaned Text: senior community finally gets power restored days irma

7 Methods (Cont.) Oversampled to 10,000
Composition of Tweet Purpose (Code2) Composition of Tweet Content (Code1)

8 Methods (Cont.) Neural networks require numeric inputs
Bag of Words feature vector Represent relationships btw. words Word similarity Word order (‘Large’ as far from ‘fat’ as ‘cat’) Vocabulary Vector The 2 Fat 1 Cat Sat On Mat Rat Blue Ran Large The fat cat sat on the mat.

9 Doc2Vec Uses word embeddings (many dimensional vectors) to represent:
Word similarity Word order Relationships between words [3]

10 Methods (Cont.) Eight Doc2Vec models were created to predict Code 1 and Code 2 Dependent Variable Text Type Sampling Model Accuracy Code1 Original Normal 17.5% Cleaned 18.6% Oversampled 38.1% 43.3% Code2 11.1% 15.3% 20.6% 31.6%

11 Results (Cont.) Only complaints at govt.
Mostly regarding infrastructure and animals Tweets of appreciation were related to health, infrastructure, and social issues/crimes Requests for help related to animals Code1 vs Code2

12 Results (Cont.) Tweets dramatically increased after landfall
Tweet composition remained constant Code1 vs Time Code1 vs Time

13 Results (Cont.) Not very accurate ‘Relief efforts’ was predicted a lot before the hurricane made landfall ‘Recovery info’ and ‘relief efforts’ were predicted disproportionately

14 Conclusion Predictions were inaccurate due to a small training set of fewer than 1400 tweets Code 1 model’s accuracy of 43% and Code 2 model’s accuracy of 31% suggest more data could produce better results

15 Conclusion (Cont.) Analyzing the results of a more accurate model can allow future researchers to determine the most impactful hurricane problems This can inform hurricane response groups on how to address these problems

16 References Holland, G., & Bruyère, C. L. (2013). Recent
intense hurricane response to global climate change. Climate Dynamics, 42(3–4), 617–627. 2. Hurricane Costs. (2017). Retrieved July 23, 2019, from Noaa.gov website: 3. Le, Q., & Mikolov, T. (2014). Distributed Representations of Sentences and Documents. Retrieved from

17 Acknowledgements Special thanks to our mentors:
Dr. Chien-fei Chen, Dr. Xiaojing Xu, Zach McMichael, and Julian Ball

18 Acknowledgements This work was supported primarily by the ERC Program of the National Science Foundation and DOE under NSF Award Number EEC and the CURENT Industry Partnership Program. Other US government and industrial sponsors of CURENT research are also gratefully acknowledged.


Download ppt "Examining Hurricane Irma with Twitter Data"

Similar presentations


Ads by Google