
David Inouye, Georgia Institute of Technology
2011 DIMACS REU Intern at Rutgers University
William M. Pottenger, Ph.D., Mentor
* The content of this presentation has been adapted from a presentation given by Nir Grinberg.
06/07/2011

Introduction to Entity Resolution
Entity resolution is the problem of deciding if two sets of data elements refer to the same real-world entity.
[Figure: elements from Source 1 and elements from Source 2, with candidate links between them marked "?"]


Objective/Approach
Standardize and Encode
Calculate Similarity Scores
Classify Using Ground Truth Data

Incidents in GTD*: Month: 6; Day: 28; Year: 2005; City: Dardsun, Kupwara; Type: Arson
Incidents in WITS*: Date: 06/27/2005; City: Kupwara; Type: Fire attack

* WITS - GTD -

Phase 1: Standardize and Encode

WITS
Incident_ID | Date | City | State_Prov | Country
… | /3/06 | Udhampur | Jammu and Kashmir | India
… | /27/2005 | Kupwara | Jammu and Kashmir | India

GTD
Eventid | Iyear | Imonth | Iday | City | Provstate | country
… | … | … | … | Patna | Bihar | India
… | … | … | … | Dardsun Kupwara | Jammu & Kashmir (State) | India

Phase 1: Standardize and Encode
Standardize dates
Map WITS weapon types to GTD weapon types
Geocode location to latitude and longitude
Extract topic model distribution using LDA
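The date-standardization step can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `gtd_date` and `wits_date` are hypothetical, and the WITS format is assumed to be MM/DD/YYYY based on the example incident shown earlier.

```python
from datetime import date, datetime


def gtd_date(iyear: int, imonth: int, iday: int) -> date:
    """Build a date from GTD-style separate year/month/day fields."""
    return date(iyear, imonth, iday)


def wits_date(text: str) -> date:
    """Parse a WITS-style date string (MM/DD/YYYY is an assumption)."""
    return datetime.strptime(text, "%m/%d/%Y").date()


# The example incident pair from the slides, in one common representation.
gtd = gtd_date(2005, 6, 28)
wits = wits_date("06/27/2005")
print(gtd.isoformat(), wits.isoformat(), (gtd - wits).days)
```

Once both sources share one date type, the gap in days between two incidents is a simple subtraction.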

Phase 1: Latent Dirichlet Allocation
Generative probabilistic model
Assumes topics are probability distributions over words
Assumes documents are probability distributions over topics
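The generative story behind LDA can be sketched in a few lines. This is a toy illustration only: the vocabulary, the two topic distributions, and the Dirichlet parameter `alpha` are made up for the example, and the sketch covers only how a document is generated, not how topics are inferred from data.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_document(topics, vocab, alpha, n_words, rng):
    """Generate one document: topic mixture first, then one topic per word."""
    theta = sample_dirichlet(alpha, rng)        # document's distribution over topics
    words = []
    for _ in range(n_words):
        z = sample_categorical(theta, rng)      # choose a topic for this word
        w = sample_categorical(topics[z], rng)  # choose a word from that topic
        words.append(vocab[w])
    return theta, words


rng = random.Random(0)
vocab = ["bomb", "blast", "grenade", "police", "officer", "kidnapped"]  # toy vocabulary
topics = [                                                               # toy topics
    [0.4, 0.3, 0.3, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.5, 0.4, 0.1],
]
theta, words = generate_document(topics, vocab, alpha=[0.5, 0.5], n_words=8, rng=rng)
print(theta, words)
```

Fitting LDA runs this story in reverse: given only the words, it estimates the topics and each document's `theta`, and that per-document topic distribution is what gets used as a feature.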

Phase 1: LDA Example
* Example from “Probabilistic Topic Models” by Mark Steyvers.

Phase 1: Latent “Topics” (most probable words in topic)
killed, kashmir, attack, injured, militants, suspected, blast, kill, bomb
fired, upon, armed, killed, manipur, civilian, imphal, member, former
civilian, kashmir, jammu, night, residence, kidnapped, one, village, doda
police, one, killing, wounding, officers, two, officer, others, injuring
jammu, kashmir, baramula, security, one, armed, anantnag, hizbul, mujahedin
assam, explosive, front, improvised, device, liberation, united, ied, ulfa
widely, two, civilians, national, tripura, kidnapped, three, village, karbi
causing, injuries, damage, damaging, fire, station, set, detonated, train
maoist, party, communist, cpi, widely, pradesh, andhra, chhattisgarh, village
grenade, threw, civilians, wounding, srinagar, vehicle, two, kashmir, jammu


Phase 2: Compute Similarity
Dates: 05/23/2001 vs. 05/22/2001
Nominal strings such as country or city: “Jammu” vs. “Jammuu”
GeoLocation: Lat 32.8/Long 74.7 vs. Lat 32.27/Long 75.6
Topic distribution
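One possible way to score each of the four field types is sketched below. The specific measures are assumptions chosen for illustration (a linear day-gap score with a hypothetical 30-day window, `difflib` string ratio, haversine distance, and a Hellinger-based score for topic distributions); the slides do not say which measures were actually used.

```python
import math
from datetime import date
from difflib import SequenceMatcher


def date_similarity(a: date, b: date, max_days: int = 30) -> float:
    """1.0 for identical dates, falling linearly to 0.0 at max_days apart."""
    gap = abs((a - b).days)
    return max(0.0, 1.0 - gap / max_days)


def string_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1] for nominal strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two lat/long points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    h = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(h))


def topic_similarity(p, q) -> float:
    """1 minus the Hellinger distance between two topic distributions."""
    squared = sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(p, q))
    return 1.0 - math.sqrt(squared) / math.sqrt(2)


# The example pairs from the slide.
print(date_similarity(date(2001, 5, 23), date(2001, 5, 22)))
print(string_similarity("Jammu", "Jammuu"))
print(haversine_km(32.8, 74.7, 32.27, 75.6))
print(topic_similarity([0.7, 0.2, 0.1], [0.6, 0.3, 0.1]))
```

Each function returns one number per incident pair, so a pair of records reduces to a short vector of scores.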

Phase 3: Classify as Match/Non-match
Similarity Scores -> Classifier Model Based on Ground Truth* -> Match or Non-match
* The Center for the Study of Terrorism and Responses to Terrorism (START) at the University of Maryland provided the human annotated ground truth data.
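The flow from similarity scores to a match decision can be illustrated with a small classifier. The slides do not name the classifier that was used, so this sketch swaps in a plain logistic regression trained by stochastic gradient descent; the training rows are invented toy values standing in for labeled ground-truth pairs, with columns assumed to be date, city, location, and topic similarity.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, x) -> float:
    """Predicted probability that a pair of incidents is a match."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


def train_logistic(rows, labels, epochs: int = 500, lr: float = 0.5):
    """Fit weights and bias with per-example gradient steps on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            err = score(weights, bias, x) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def classify(weights, bias, x) -> str:
    return "Match" if score(weights, bias, x) >= 0.5 else "Non-match"


# Toy ground truth (invented): [date, city, location, topic] similarity scores.
rows = [
    [0.97, 0.91, 0.90, 0.80], [0.90, 1.00, 0.95, 0.70], [1.00, 0.80, 0.85, 0.90],
    [0.20, 0.30, 0.10, 0.20], [0.50, 0.20, 0.30, 0.10], [0.10, 0.60, 0.20, 0.30],
]
labels = [1, 1, 1, 0, 0, 0]  # 1 = match, 0 = non-match

weights, bias = train_logistic(rows, labels)
print(classify(weights, bias, [0.95, 0.90, 0.90, 0.85]))
print(classify(weights, bias, [0.20, 0.20, 0.20, 0.20]))
```

Any classifier that accepts a fixed-length numeric vector would fit the same slot; logistic regression is used here only because it is short enough to write out in full.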

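Phase 3: Classifier Results
[Table: confusion matrix of actual class (Non-match, Match) against classified label (Non-match, Match). Only the fused digits "116246" on the Match row survive.]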

My research possibilities
Clean up the ground truth data
Improve upon the HO-LDA algorithm
Consider how to compute different similarity scores
