Bringing Together the Social and Technical in Big Data Analytics: Why You Can't Predict the Flu from Twitter, and Here's How
David A. Broniatowski, Asst. Prof., EMSE

Presentation transcript:

Bringing Together the Social and Technical in Big Data Analytics: Why You Can't Predict the Flu from Twitter, and Here's How
David A. Broniatowski, Asst. Prof., EMSE

PUBLIC HEALTH CYCLE
Population, Doctors, Surveillance, Intervention

Traditional mechanisms: surveys, clinical visits
REQUIRES: DATA ON THE POPULATION
This has limited research

TWITTER
Short messages (140 chars) posted to the public internet
Content: news, conversation, pointless babble
Huge volume: 500 million a day

WHY TWITTER?
Huge volumes of data
A constant stream of small updates:
"Nothing like waiting in line to buy cigarettes behind a guy in a business suit buying gasoline with ten dollars in dimes"
"I eat pizza too much"
"I'm at Cvs Pharmacy (117th and kendall, Miami)"

INFLUENZA SURVEILLANCE

The CDC has a nationwide surveillance network with 2700 outpatient centers reporting influenza-like illness (ILI)
Cons:
Slow (2 weeks)
Varying levels of geographic granularity

TWITTER SURVEILLANCE
Twitter influenza surveillance must:
1) Accurately track ground truth: identify infection tweets
2) Be effective at both the municipal and national level: expand tweet geolocation and evaluate municipal accuracy
3) Be predictive in real time: deploy a previously trained system on this flu season

PIPELINE CLASSIFIERS
Three steps using supervised machine learning + NLP
Step 1: Identify health tweets
Step 2: Identify flu-related tweets
Step 3: Distinguish awareness from infection
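The three-step cascade can be sketched as below. This is only an illustration of the filtering order: the keyword sets and patterns are hypothetical stand-ins for the trained supervised classifiers the slide refers to, and the label names are assumptions made for this example.

```python
import re

# Hypothetical keyword rules standing in for trained supervised classifiers.
HEALTH_TERMS = {"flu", "influenza", "sick", "fever", "cough", "doctor", "pharmacy"}
FLU_TERMS = {"flu", "influenza"}
INFECTION_PATTERNS = (
    r"\bi (have|got|caught)\b.*\bflu\b",
    r"\bdown with (the )?flu\b",
)


def tokens(text: str) -> set:
    """Lowercase the tweet and split it into word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def is_health(text: str) -> bool:
    """Step 1: is the tweet about health at all?"""
    return bool(tokens(text) & HEALTH_TERMS)


def is_flu(text: str) -> bool:
    """Step 2: among health tweets, is it flu related?"""
    return bool(tokens(text) & FLU_TERMS)


def is_infection(text: str) -> bool:
    """Step 3: does the author report being infected (vs. general awareness)?"""
    lowered = text.lower()
    return any(re.search(pattern, lowered) for pattern in INFECTION_PATTERNS)


def classify(text: str) -> str:
    """Run the cascade; each step only sees tweets that passed the previous one."""
    if not is_health(text):
        return "not_health"
    if not is_flu(text):
        return "health_other"
    return "infection" if is_infection(text) else "awareness"


for tweet in (
    "I eat pizza too much",
    "I'm at Cvs Pharmacy (117th and kendall, Miami)",
    "Get your flu shot this week",
    "I think I got the flu, staying home",
):
    print(classify(tweet), "|", tweet)
```

In a real system each `is_*` function would be replaced by a classifier trained on labeled tweets; the cascade structure is what lets only infection tweets feed the surveillance signal.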

LOCAL EFFECTIVENESS
Current work focuses on US national flu rates
Useful surveillance is needed by region/state/city
How can Twitter track local trends?
Is it accurate?
Is there enough data?
Only about 1% of Twitter is geocoded

CARMEN (Dredze et al., 2013)
Over 4000 known locations (countries, states, counties, cities)
Geocoordinates only: ~1%
Expanded locations: ~22%
Available in Python and Java
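One way to picture how coverage can grow beyond geocoded tweets is a fallback from exact coordinates to the free-text profile location. The sketch below is a toy illustration of that idea, not Carmen's API or algorithm: the gazetteer, the bounding box, and the `resolve` function are all hypothetical, and the tweet dictionaries only assume the `coordinates` and `user.location` fields of Twitter's tweet JSON.

```python
from typing import Optional, Tuple

Location = Tuple[str, str, str]  # (city, state, country)

# Hypothetical bounding boxes: (min_lon, min_lat, max_lon, max_lat).
CITY_BOXES = {
    ("Baltimore", "Maryland", "United States"): (-76.72, 39.19, -76.52, 39.38),
}

# Hypothetical gazetteer mapping normalized profile strings to locations.
KNOWN_LOCATIONS = {
    "baltimore, md": ("Baltimore", "Maryland", "United States"),
    "miami": ("Miami", "Florida", "United States"),
}


def resolve(tweet: dict) -> Optional[Location]:
    """Prefer exact coordinates; fall back to the user's profile location."""
    coords = tweet.get("coordinates")
    if coords:
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude, latitude
        for location, (min_lon, min_lat, max_lon, max_lat) in CITY_BOXES.items():
            if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat:
                return location
    profile = (tweet.get("user") or {}).get("location") or ""
    return KNOWN_LOCATIONS.get(profile.strip().lower())


geocoded = {"coordinates": {"type": "Point", "coordinates": [-76.61, 39.29]}}
profile_only = {"coordinates": None, "user": {"location": " Miami "}}
unknown = {"user": {"location": "somewhere"}}

for tweet in (geocoded, profile_only, unknown):
    print(resolve(tweet))
```

The second tweet has no coordinates but is still resolved from its profile string, which is the kind of expansion that moves coverage from the geocoded-only share toward the larger share quoted on the slide.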

SURVEILLANCE RESULTS
[Figure: Pearson correlation for Keywords, Flu Classifier, Google Flu Trends, and Infection]

GOOGLE FLU TRENDS GETS IT WRONG?
Lohr, S. (2014). Google flu trends: the limits of big data. New York Times.

Pearson Correlation:
Keywords: 0.75
Infection: 0.93
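For readers unfamiliar with the metric, the Pearson correlation between a tweet-derived weekly series and a reference series such as ILI rates can be computed as in the minimal sketch below. The weekly values are made up purely for illustration and are not data from the talk; the function and variable names are likewise assumptions.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Made-up weekly values, for illustration only.
ili_rate = [1.2, 1.5, 2.1, 3.0, 2.6, 1.9]
infection_tweets = [110, 140, 205, 310, 250, 180]

r = pearson(ili_rate, infection_tweets)
print(f"r = {r:.2f}")
```

A value near 1 means the two weekly series rise and fall together; it says nothing about whether their absolute levels agree.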

BLIND EVALUATION
ILI counts:
Infection: 0.88
Keywords: 0.72

MOST RECENT DATA
Broniatowski, D. A., Dredze, M., Paul, M. J., & Dugas, A. (2015). Using Social Media to Perform Local Influenza Surveillance in an Inner-City Hospital: A Retrospective Observational Study. JMIR Public Health and Surveillance, 1(1), e5.

PREDICTING ACTUAL FLU IN BALTIMORE
Broniatowski, D. A., Dredze, M., Paul, M. J., & Dugas, A. (2015). Using Social Media to Perform Local Influenza Surveillance in an Inner-City Hospital: A Retrospective Observational Study. JMIR Public Health and Surveillance, 1(1), e5.

HEALTHTWEETS.ORG

HEALTHTWEETS WORLDWIDE

Some Other Projects
David A. Broniatowski, Asst. Prof., EMSE

BIG DATA FOR GROUP DECISION MAKING: EXTRACTING SOCIAL NETWORKS FROM FDA ADVISORY PANEL MEETING TRANSCRIPTS
(Broniatowski & Magee, 2013, American Journal of Therapeutics; Broniatowski & Magee, 2012, IEEE Signal Processing Magazine; Broniatowski & Magee, in preparation)

“GERMS ARE GERMS” AND “WHY NOT TAKE A RISK?” MODELS AND DATA FOR RISKY DECISION MAKING IN THE ED
(Broniatowski, Klein, & Reyna, in press, Medical Decision Making; Broniatowski & Reyna, in preparation)

HOW DO WE DESIGN SYSTEMS TO USE INFORMATION FLOW TO OUR ADVANTAGE?
Tree Hierarchy. Examples: phylogenetic trees, General Motors, problem decomposition
Layered Hierarchy. Examples: levels of abstraction, law firm organization, problem abstraction
Grid Networks and Teams. Examples: contagion, markets, crowdsourcing, families (teams)
We would like to deepen our intuition regarding system architectures (Broniatowski & Moses, in preparation)

QUESTIONS?
Big data
Influenza tracking and coupled contagion
Group decision-making
Individual decision-making
Formal models
Medical and engineering applications
Formal and mathematical models
Systems architecture
Design for flexibility