Computing & Information Sciences Kansas State University University of Kansas Center for Business Analytics research Seminar Laboratory for Knowledge Discovery.

Presentation transcript:

Dynamic Topic Modeling for Spatiotemporal Event Extraction: Probabilistic Approaches and The Dim Sum Process
Computing & Information Sciences, Kansas State University
University of Kansas Center for Business Analytics Research Seminar
Laboratory for Knowledge Discovery in Databases
University of Kansas CBAR, Wednesday, 04 September 2013
William H. Hsu, Laboratory for Knowledge Discovery in Databases, Kansas State University
Acknowledgements
Kansas State: Wesam Elshamy, Ming Yang, Surya Teja Kallumadi, Majed Alsadhan
Illinois: Chengxiang Zhai, Jiawei Han, Kevin Chang, Dan Roth
iQGateway: Praveen Koduru, Krishna Kumar Vallyatodi

Motivation: Thematic Mapping [1] Summarizing News from The Web
Based on NLP Group NER Toolkit © Stanford University; Simile © Massachusetts Institute of Technology; Google Maps © Tele Atlas, Inc. and Google, Inc.

Motivation: Thematic Mapping [2] HealthMap
© 2006 – 2013 Brownstein, J. & Freifeld, C.

Motivation: Thematic Mapping [4] TextMap & Topic Models
© 2011 – 2012 TextMap.org

Motivation: Thematic Mapping [5] Existing Systems & Limitations
Volkova, S., Caragea, D., Hsu, W. H., Drouhard, J., & Fowles, L. (2010). Boosting Biomedical Entity Extraction by using Syntactic Patterns for Semantic Relation Discovery. Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2010).
See also: Volkova, S. (2010). Entity Extraction, Animal Disease-related Event Recognition and Classification from Web. M.S. thesis, Kansas State University.

Outline
Simultaneous Topic Enumeration & Formation
• Topic Modeling: Static (Atemporal) to Dynamic
• Continuous Time vs. Variable Number of Topics
• Dim Sum Process for Hybrid STEF
Dynamic Topic Modeling Test Bed
• News Monitoring: Geotagging & Timelines
• Recent Results
STEF & Heterogeneous Info Network Analysis

Timeline Formation: General Task Illustrated
Elshamy (2012)

Simultaneous Topic Enumeration & Formation (STEF)
Adapted from Elshamy (2012)
Time t: 3 extant topics
Time t + k: 2 extant topics
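The slide's figure is not in the transcript, so here is a minimal sketch of the "enumeration" half of STEF: counting how many topics are extant at a given time. The topic names and lifespans in `LIFESPANS` are hypothetical, chosen only so that the counts match the slide (3 at time t, 2 at time t + k), and `extant_topics` is an illustrative helper, not part of any model in the talk.

```python
# Hypothetical topic lifespans as half-open intervals [birth, death).
LIFESPANS = {
    "topic_a": (0, 10),
    "topic_b": (2, 6),
    "topic_c": (3, 12),
    "topic_d": (20, 25),
}


def extant_topics(lifespans, t):
    """Return the sorted names of topics whose lifespan contains time t."""
    return sorted(
        name for name, (birth, death) in lifespans.items() if birth <= t < death
    )


t, k = 4, 3
print(len(extant_topics(LIFESPANS, t)))      # 3 extant topics at time t
print(len(extant_topics(LIFESPANS, t + k)))  # 2 extant topics at time t + k
```

In a real dynamic topic model the lifespans are not given; they are inferred together with the topic contents, which is the "simultaneous" part of the task.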

Topic Modeling [1]: Basic Task (Static)
Elshamy (2012)

Topic Modeling [2]: Understanding Plate Notation
Adapted from Elshamy (2012)
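One way to read plate notation is as nested loops: each plate is a "repeat this many times" box around the variables inside it. The sketch below assumes the static model on the slide is latent Dirichlet allocation (LDA), which the transcript does not state. The function names, the default hyperparameters `alpha` and `beta`, and the toy vocabulary are all illustrative choices.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_index(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(vocab, num_topics, num_docs, doc_len,
                    alpha=0.5, beta=0.5, seed=0):
    """Run the LDA generative process once and return (topics, corpus)."""
    rng = random.Random(seed)
    # Topic plate (K): one word distribution per topic.
    topics = [sample_dirichlet([beta] * len(vocab), rng)
              for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):        # Document plate (D)
        theta = sample_dirichlet([alpha] * num_topics, rng)
        doc = []
        for _ in range(doc_len):     # Word plate (N), nested inside D
            z = sample_index(theta, rng)      # topic assignment for this word
            w = sample_index(topics[z], rng)  # word drawn from that topic
            doc.append((z, vocab[w]))
        corpus.append((theta, doc))
    return topics, corpus


vocab = ["outbreak", "cattle", "virus", "vote", "poll", "senate"]
topics, corpus = generate_corpus(vocab, num_topics=2, num_docs=3, doc_len=5)
for theta, doc in corpus:
    print([round(p, 2) for p in theta], [word for _, word in doc])
```

Inference runs in the opposite direction: only the words are observed, and the topic distributions and assignments are estimated from them.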

Topic Modeling [3]: Hyperparameters (Another Model)
Adapted from Elshamy (2012)

Continuous Time vs. Variable Number of Topics
Elshamy (2012)
State of the Field Goal

Events from Text: Markov Model for Topic Detection & Tracking
Adapted from Elshamy (2012)

Continuous-time Dynamic Topic Model (cDTM)
Elshamy (2012)

Discrete Time Online Hierarchical Dirichlet Process (oHDP)
Elshamy (2012)

Continuous-time Infinite Dynamic Topic Model (CIDTM)
Elshamy (2012)

HealthMap Redux: Thematic Mapping, Health Informatics, & Epidemiology
© 2006 – 2013 Brownstein, J. & Freifeld, C.

Thematic Mapping Tasks [1]: Entities
Example: CNN, 2007 Foot-and-Mouth Disease
Tests have confirmed a second foot-and-mouth outbreak in southern England, the government announced, raising fears that the highly contagious animal virus is spreading. Chief Veterinary Officer Debby Reynolds said Tuesday that tests showed a herd of cattle had been infected. The animals were culled Monday evening after showing signs of the disease.
Update Summarization
A second foot-and-mouth disease infection in a herd of cattle in southern England was responded to by culling on Monday evening and announced by Debby Reynolds on Tuesday. (Second since earlier report – hence “update”.)
Compare: Recognizing Textual Entailment
A foot-and-mouth disease infection was reported the day after culling. (True.)

Thematic Mapping Tasks [2]: Aspects
© 2008 C. Zhai, University of Illinois

Thematic Mapping Tasks [3]: Location & Disambiguation
Current off-the-shelf applications run into ambiguity problems
© 2008 W. Elshamy

Thematic Mapping Tasks [4]: Time & Timelines
Search phrase: “smallpox”
© 2007 – 2009 Google, Inc.

Thematic Mapping Tasks [5]: Timeline Reconstruction
Murphy, Hsu, Elshamy, Kallumadi, & Volkova (2012)

Recent Results [1]: Meth Lab Mapping
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

Recent Results [2]: Visual Analytics
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

Recent Results [3]: Topic Proportions
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

Sentiment Analysis Tasks: Polarity
© 1999 – 2012 dslreports.com

Aggregation & OLAP: Wikipedia Infobox as Fact Table
Infobox: Albert Einstein © 2001 – 2010 Wikimedia Foundation
Q: Where can this information be found?
A: It depends…
• How much formatting does source page have?
• Marked up? (Machine-readable?)
• Semantically rich markup?
Albert Einstein © 2001 – 2010 Wikimedia Foundation
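When the source page is marked up, the "fact table" view is close to mechanical. As a rough sketch, the code below turns `| key = value` lines of infobox-style wikitext into one row (a dict). `SAMPLE` is a simplified, hand-written snippet for illustration, not the actual wikitext of the Albert Einstein article, and `infobox_to_row` is a hypothetical helper. Real infoboxes contain nested templates and links that this pattern does not handle.

```python
import re

# Simplified, hand-written infobox snippet (illustrative only).
SAMPLE = """{{Infobox scientist
| name        = Albert Einstein
| birth_date  = 14 March 1879
| birth_place = Ulm, German Empire
}}"""

# Matches lines of the form "| key = value".
FIELD = re.compile(r"^\|\s*([^=|]+?)\s*=\s*(.*?)\s*$")


def infobox_to_row(wikitext):
    """Collect key/value pairs from infobox-style lines into one row."""
    row = {}
    for line in wikitext.splitlines():
        match = FIELD.match(line.strip())
        if match:
            row[match.group(1)] = match.group(2)
    return row


row = infobox_to_row(SAMPLE)
print(row["name"], "|", row["birth_date"], "|", row["birth_place"])
```

Rows extracted this way from many pages share column names, which is what makes OLAP-style aggregation over them possible. Unmarked prose needs information extraction instead.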

Opinion Mapping Example [1]: Health Blogs on Chronic Disease

Opinion Mapping Example [2]: New Entities & Relationships

Opinion Mapping Example [3]: Polarity
© 2012 Twitrratr

Opinion Mapping Example [4]: Aims & Approach
Aim 1 – Extend Algorithms to Detect New:
• Entities: Diseases, Treatments, Complications
• Relationships: Adverse Reactions, Controversies
Aim 2 – Domain-Specific Ontology
• Symptoms, Disease Attributes
• Treatments, Complications
• Comparisons
Aim 3 – Better Recognition of Scope, Polarity
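To make "scope" and "polarity" in Aim 3 concrete, here is a toy lexicon-based scorer in which a negator flips the polarity of the next few tokens. This is a common baseline and not the approach described in the talk. The word lists and the fixed window `scope=3` are assumptions made purely for illustration.

```python
# Tiny illustrative lexicons (assumptions, not from the talk).
POSITIVE = {"good", "great", "effective", "helpful"}
NEGATIVE = {"bad", "painful", "useless", "worse"}
NEGATORS = {"not", "no", "never"}


def polarity(text, scope=3):
    """Sum word polarities, flipping any word within `scope` tokens of a negator."""
    score = 0
    negate_left = 0  # tokens remaining inside the current negation scope
    for token in text.lower().split():
        word = token.strip(".,!?;:")
        if word in NEGATORS:
            negate_left = scope
            continue
        value = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1, or 0
        if negate_left > 0:
            value = -value
            negate_left -= 1
        score += value
    return score


print(polarity("The new treatment is effective"))      # 1
print(polarity("The new treatment is not effective"))  # -1
```

A fixed window is the crudest possible notion of scope: it flips words the negator does not actually govern and misses long-range negation, which is one reason better scope recognition is listed as an aim.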

User Groups: Goals & Primary Use Cases
Goal: Thematic Opinion Map (Choropleth, etc.)
User Groups
• Experienced: policymakers, health professionals
• Individual stakeholders: patients, activists, voters
Primary Use Case: Infographics as IE Views
© 2011 Mediabistro
Are Germans really the happiest Twitter users by country, and Tennesseans by U.S. state?