Towards Real Time Epidemic Vigilance through Online Social Networks. Lingji Chen [1], Harshavardhan Achrekar [2], Benyuan Liu [2], Ross Lazarus [3]. MobiSys.

Presentation transcript:

Towards Real Time Epidemic Vigilance through Online Social Networks
SNEFT: Social Network Enabled Flu Trends
Lingji Chen [1], Harshavardhan Achrekar [2], Benyuan Liu [2], Ross Lazarus [3]
[1] Scientific Systems Company Inc, Woburn, MA
[2] Computer Science Department, University of Massachusetts Lowell
[3] Department of Population Medicine, Harvard Medical School
MobiSys 2010, San Francisco, CA, USA

Outline
- Background
- Related Work
- Our Approach
- SNEFT System Architecture
- Detection and Prediction
- Initial Stage Results
- Conclusion

Seasonal flu
- Influenza (flu) is a contagious respiratory illness caused by influenza viruses.
- Seasonal: it occurs in waves.
- 5 to 20% of the population gets the flu.
- About 200,000 people are hospitalized from flu-related complications.
- 36,000 people die from flu every year in the USA.
- The worldwide death toll is 250,000 to 500,000.
- Epidemiologists use early detection of disease outbreaks to reduce the number of people affected.

Historical Background

| Historical data | Flu pandemic / 1918 Spanish flu | SARS | Swine flu / H1N1 |
|---|---|---|---|
| Cause | Overreaction of the body's immune system | SARS coronavirus | Swine influenza virus |
| Origin | USA and France, before reaching Spain | Guangdong, China | USA and Mexico |
| Infected masses / areas | Predominant in healthy young adults, as opposed to the juvenile, elderly, or weak | 37 countries, including the USA | 207 countries |
| Timeline | Mar Jun 1920 (World War I) | Nov Jul 2003 | Aug 2009 onwards |
| Infected cases | 500 million (1/3 of the world's population) | 8,273 | 622,482 so far |
| Deaths | 50 million (3% of the world's population, 1.6 billion at that time) | 775 | 15,174 so far |

Related Work: Google Flu Trends
- Certain web search terms are good indicators of flu activity.
- Google Flu Trends uses aggregated search data on these flu indicators to estimate current flu activity around the world in real time.
- Limitation: accuracy of the data, since not every person who searches for "flu" is sick.
- CDC stands for Centers for Disease Control and Prevention.

Our Approach
- Online social networks (OSNs) have emerged as a popular platform for people to make connections, share information, and interact.
- OSNs represent a previously untapped data source for detecting the onset of an epidemic and predicting its spread.
- Messages exchanged between users, such as "i am down with flu" and "get well soon", provide early, robust predictions.
- Twitter and Facebook mobile users post updates together with their geo-location, which helps in carrying out refined analysis.
- User demographics such as age, gender, location, and affiliated networks can be inferred from the data.
- The result is a snapshot of the current epidemic condition and a preview of what to expect next, on a daily or hourly basis.
User population (in millions): Facebook 400, Myspace 200, Twitter 80.

System Architecture of SNEFT
[Architecture diagram. Labels: Internet; Data Collection Engine (downloader, crawler); ILI Data; OSN Data; OSN models; Math models; ARMA Model; Novelty Detector; Filter / Predictor; outputs: ILI Prediction, Flu Warning, State Estimate.]
ILI stands for Influenza-Like Illness.

Components of SNEFT Architecture
Data Collection
- Downloader: stores CDC ILI data and reports in the ILI database.
- Crawler: collects publicly available data from online social networking sites.
  - Choose a list of keywords that are likely to be significant.
  - Use OSN public search interfaces to collect relative keyword frequencies.
  - Store relevant information in an OSN spatio-temporal database.
Novelty Detection
- Detect the transition from a "normal" baseline situation to a pandemic in real time by monitoring the volume and content of OSN data.
- Provide timely (early-stage) warnings to public health authorities for investigation.
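
The slides do not say how the novelty detector decides that keyword volume has left its "normal" baseline. As a minimal sketch under that caveat, the example below flags a day when its count sits far above the mean of the preceding days. The function name, the 7-day window, the threshold of 3 standard deviations, and the sample counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def novelty_alerts(counts, window=7, threshold=3.0):
    """Return indices whose count is far above the preceding baseline.

    Assumption: the baseline is the previous `window` observations and
    "far above" means more than `threshold` standard deviations.
    """
    alerts = []
    for t in range(window, len(counts)):
        baseline = counts[t - window:t]
        mu = mean(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline
        # does not cause a division by zero.
        sigma = max(stdev(baseline), 1.0)
        if (counts[t] - mu) / sigma > threshold:
            alerts.append(t)
    return alerts


# Made-up daily counts of posts containing a flu keyword.
daily_flu_posts = [20, 22, 19, 21, 23, 20, 22, 21, 24, 60, 95]
print(novelty_alerts(daily_flu_posts))
```

A rolling baseline of this kind adapts to slow drift in posting volume, but it also absorbs a sustained outbreak into the baseline after a few days, so a real detector would likely need a more careful baseline model.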

Components of SNEFT Architecture
ILI prediction / ARMA (Auto-Regressive Moving Average) model
- Build an ARMA model to predict ILI incidence as a linear function of current and past OSN data and past ILI data.
- Provide a valuable "preview" of ILI cases well ahead of CDC reports.
Integration with mathematical models
- Mathematical models are used to understand the dynamics of influenza spread and the effects of intervention; their parameters are obtained by fitting historical data.
- Build an "OSN sensor model" that describes what would be observed on the OSN if the population were infected in a given way.
- Integrate real-time OSN data with the predictions of the mathematical models to obtain a posterior estimate of the "infected state" of the population.
- Possible parameter values that are not consistent with OSN observations are weighted less, while those that are consistent are weighted more.
- In this way, OSN data "sharpen" the predictions of the mathematical models.
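
To make "a linear function of current and past OSN data and past ILI data" concrete, here is a simplified sketch. It is not a full ARMA fit: it omits the moving-average error terms and uses a single lag, fitted by ordinary least squares. The model form, the function names, and the synthetic series are assumptions for illustration only.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_arx(ili, osn):
    """Fit ili[t] ~ a*ili[t-1] + b0*osn[t] + b1*osn[t-1] by least squares."""
    rows = [[ili[t - 1], osn[t], osn[t - 1]] for t in range(1, len(ili))]
    targets = ili[1:]
    k = 3
    # Normal equations: (X^T X) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_next(coeffs, last_ili, osn_now, osn_prev):
    """One-step-ahead ILI prediction from the fitted coefficients."""
    a, b0, b1 = coeffs
    return a * last_ili + b0 * osn_now + b1 * osn_prev


# Synthetic weekly series generated from known coefficients (0.5, 0.2, 0.1).
osn = [float((7 * t) % 11) for t in range(40)]
ili = [1.0]
for t in range(1, 40):
    ili.append(0.5 * ili[t - 1] + 0.2 * osn[t] + 0.1 * osn[t - 1])

coeffs = fit_arx(ili, osn)
print([round(c, 3) for c in coeffs])
```

Because OSN data for the current week are available before the CDC report for that week, `predict_next` can be evaluated as soon as the current OSN value arrives, which is the sense in which such a model could offer a "preview".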

OSN Data Collection
Design of the Facebook data collection engine / crawler
[Diagram. Labels: Facebook Search API; Result Set (public posts containing keywords); HTML Content Scraper; Database (content, timestamp, profile ID; profile info, location details); Facebook Profile Scan Engine (individual users, organizations, community).]

Facebook Data Collection / Crawler
Facebook Search Engine
- Sign in with a valid account.
- Enter a keyword and search with the "Post by everyone" option to retrieve users' status updates and posts containing the keyword.
Result set containing the keyword
- Privacy settings: a user can publish a post or update to friends, a group, or everyone.
- The "everyone" option (the default setting) makes the corresponding updates public and searchable by the Facebook search engine.
- Results are available for public viewing for a limited time span.

Facebook Data Collection / Crawler
HTML Content Scraper
- A screen scraper for web pages.
- Extracts useful information from the posts returned as the result set of the keyword search.
- The HTML content of the search response is fed into a DOM parser / regular-expression matcher, and pattern-matching techniques are applied.
- Retrieves the profile ID, the time-stamp of the post, and the post content (with story_id).
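
The regular-expression step can be sketched as follows. The slides do not show the HTML returned by the search page, so the markup, attribute names, and pattern below are entirely hypothetical and serve only to illustrate pulling the profile ID, time-stamp, story ID, and content out of a result page.

```python
import re

# Hypothetical markup: the real search-result HTML is not given in the slides.
SAMPLE_HTML = '''
<div class="post" data-profile-id="1001" data-story-id="s-1" data-time="2010-03-01T09:15:00">
  <p class="content">i am down with flu</p>
</div>
<div class="post" data-profile-id="1002" data-story-id="s-2" data-time="2010-03-01T10:40:00">
  <p class="content">get well soon</p>
</div>
'''

# One named group per field the scraper is said to retrieve.
POST_PATTERN = re.compile(
    r'<div class="post" data-profile-id="(?P<profile_id>\d+)" '
    r'data-story-id="(?P<story_id>[^"]+)" data-time="(?P<timestamp>[^"]+)">\s*'
    r'<p class="content">(?P<content>.*?)</p>',
    re.DOTALL,
)


def extract_posts(html):
    """Return one dict per matched post, with all values as strings."""
    return [match.groupdict() for match in POST_PATTERN.finditer(html)]


for post in extract_posts(SAMPLE_HTML):
    print(post["profile_id"], post["timestamp"], post["content"])
```

A pattern tied this closely to attribute order breaks as soon as the page layout changes, which is one reason a DOM parser is often preferred over regular expressions for anything beyond a quick extraction.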

Facebook Data Collection / Crawler
Facebook Profile Scan Engine
- Given a profile ID, retrieve the detailed information of the profile: name, gender, age, affiliations (school, work, region), birthday, location, education history, friends count, and profile last update time.
- A profile may belong to an individual user, an organization, or a community.

Constraints in OSN Data Collection
- Search rate limit
- Return result limit
- User activity pattern: user activity differs across hours of the day, days of the week, and special holidays.
- Continuous data collection / preventing data loss: search times must be scheduled to guarantee a complete set of blog posts containing the keywords, with no gaps in the collected data.

Mitigation
- Search rate limit: launch multiple concurrent search sessions from different IP addresses. The sessions coordinate among themselves and collect data at different time intervals so that each session stays within the search rate limit.
- Return result limit: issue continuous HTTP requests and store each response.
- Continuous data collection: a dedicated scheduling mechanism.

Exponentially Weighted Moving Average (EWMA) Scheme in OSN Data Collection
EWMA scheduling mechanism to prevent data loss
- The volume of returned search results determines the number of active search sessions.
- Let $v(k)$ and $u(k)$ denote the estimated average and the current search result volume at search round $k$, respectively, and let $\alpha$ be the smoothing factor that reflects the weight of the previous estimate.
- The EWMA of the search result volume is computed as $v(k) = \alpha\, v(k-1) + (1 - \alpha)\, u(k)$.
- If the required rate exceeds the rate limit, new search sessions are triggered to share the load.
- When the search result volume becomes lighter, the number of active search sessions is reduced.
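
The EWMA update and the resulting session count can be sketched as below. The update rule is the one stated on the slide; the rule that maps the estimate to a session count (estimate divided by a per-session limit, rounded up), the parameter names, and the sample volumes are assumptions for illustration.

```python
import math


def ewma_update(prev_estimate, current_volume, alpha=0.8):
    """v(k) = alpha * v(k-1) + (1 - alpha) * u(k)."""
    return alpha * prev_estimate + (1 - alpha) * current_volume


def sessions_needed(estimated_volume, per_session_limit):
    """Assumed rule: enough sessions that each stays within its limit."""
    return max(1, math.ceil(estimated_volume / per_session_limit))


def plan_sessions(volumes, per_session_limit, alpha=0.8):
    """Return the number of active sessions after each search round."""
    estimate = volumes[0]  # seed the estimate with the first observation
    plan = []
    for u in volumes:
        estimate = ewma_update(estimate, u, alpha)
        plan.append(sessions_needed(estimate, per_session_limit))
    return plan


# Made-up result volumes per round: a burst followed by a quiet period.
volumes = [100, 100, 400, 400, 400, 100, 100]
print(plan_sessions(volumes, per_session_limit=150, alpha=0.5))
```

A larger `alpha` gives more weight to the previous estimate, so the session count reacts more slowly to a burst but also avoids starting and stopping sessions on every short spike.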

Detection and Prediction (SIR Model)
Susceptible-Infectious-Removed (SIR) model, where the dynamics of the population in each compartment are described by
$$\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \gamma I, \qquad R = N - S - I,$$
with $N$ the total population, $\beta$ the transmission rate, and $\gamma$ the recovery rate.
- Let $x(t)$ be the "state" of the population, which in this case is given by $x = [S, I]^T$.
- Let $\theta$ be the parameter vector used in the model, which is given by $\theta = [\beta, \gamma]^T$.
- Transition probability of disease spread: $\mathrm{Prob}(x(t+1) \mid x(t), \theta)$.
[State diagram: Susceptible to Infectious (infection); Infectious to Removed (recovery or death); Removed to Susceptible (loss of immunity).]
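
The SIR equations, and the idea of weighting parameter values by how well they agree with observations, can be sketched together. The example integrates the model with a simple forward-Euler step, treating S and I as population fractions (N = 1), and then weights a few candidate (β, γ) pairs with a Gaussian likelihood. The Euler scheme, the step size, the Gaussian "sensor model" with its `sigma`, and the candidate values are all assumptions; the slides do not specify any of them.

```python
import math


def simulate_sir(beta, gamma, s0, i0, steps, dt=0.1):
    """Forward-Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I.

    S and I are fractions of the population, so R = 1 - S - I.
    """
    s, i = s0, i0
    trajectory = [(s, i)]
    for _ in range(steps):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s, i = s - new_infections, i + new_infections - recoveries
        trajectory.append((s, i))
    return trajectory


def weight_parameters(candidates, observed_i, s0, i0, dt=0.1, sigma=0.02):
    """Normalized weights for each (beta, gamma) given observed infected fractions.

    Assumed sensor model: each observation equals the infected fraction
    plus Gaussian noise with standard deviation `sigma`.
    """
    weights = []
    for beta, gamma in candidates:
        traj = simulate_sir(beta, gamma, s0, i0, len(observed_i) - 1, dt)
        sq_err = sum((obs - i) ** 2 for obs, (_, i) in zip(observed_i, traj))
        weights.append(math.exp(-sq_err / (2 * sigma ** 2)))
    total = sum(weights)
    return [w / total for w in weights]


# Synthetic "observations" generated from beta = 0.5, gamma = 0.1.
truth = simulate_sir(0.5, 0.1, s0=0.99, i0=0.01, steps=100)
observed = [i for _, i in truth]
candidates = [(0.3, 0.1), (0.5, 0.1), (0.8, 0.1)]
weights = weight_parameters(candidates, observed, s0=0.99, i0=0.01)
print([round(w, 3) for w in weights])
```

Parameter values whose simulated trajectory disagrees with the observations receive weights close to zero, which mirrors the earlier slide's description of inconsistent parameter values being "weighted less".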

Initial Stage Results

Conclusion and Future Work
- Goal: achieve faster, near real-time detection and predict the emergence and spread of an influenza epidemic.
- We presented the design of a system called SNEFT for collecting and aggregating OSN data, extracting information from it, and integrating it with mathematical models of influenza.
- OSN data are individually noisy but collectively revealing.
- Potential uses: disaster relief, supply chain management, epidemic vigilance.

Thank You