
1 Computing & Information Sciences, Kansas State University. University of Kansas Center for Business Analytics Research (CBAR) Seminar, Wednesday, 04 September 2013.
Dynamic Topic Modeling for Spatiotemporal Event Extraction: Probabilistic Approaches and The Dim Sum Process
William H. Hsu, Laboratory for Knowledge Discovery in Databases, Kansas State University, http://www.kddresearch.org
Acknowledgements: Kansas State: Wesam Elshamy, Ming Yang, Surya Teja Kallumadi, Majed Alsadhan. Illinois: Chengxiang Zhai, Jiawei Han, Kevin Chang, Dan Roth. iQGateway: Praveen Koduru, Krishna Kumar Vallyatodi.

2 Motivation: Thematic Mapping [1] Summarizing News from The Web. http://fingolfin.user.cis.ksu.edu/timemap2gs Based on NLP Group NER Toolkit © 2005-2010 Stanford University; Simile © 2003-2010 Massachusetts Institute of Technology; Google Maps © 2007-2010 Tele Atlas, Inc. and Google, Inc.

3 Motivation: Thematic Mapping [2] HealthMap. http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.

4 Motivation: Thematic Mapping [2] HealthMap. http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.

5 Motivation: Thematic Mapping [4] TextMap & Topic Models. © 2011 – 2012 TextMap.org

6 Motivation: Thematic Mapping [5] Existing Systems & Limitations. Volkova, S., Caragea, D., Hsu, W. H., Drouhard, J., & Fowles, L. (2010). Boosting Biomedical Entity Extraction by using Syntactic Patterns for Semantic Relation Discovery. Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2010). See also: Volkova, S. (2010). Entity Extraction, Animal Disease-related Event Recognition and Classification from Web. M.S. thesis, Kansas State University.

7 Outline
Simultaneous Topic Enumeration & Formation
  - Topic Modeling: Static (Atemporal) to Dynamic
  - Continuous Time vs. Variable Number of Topics
  - Dim Sum Process for Hybrid STEF
Dynamic Topic Modeling Test Bed
  - News Monitoring: Geotagging & Timelines
  - Recent Results
STEF & Heterogeneous Info Network Analysis

8 Timeline Formation: General Task Illustrated. Elshamy (2012)

9 Simultaneous Topic Enumeration & Formation (STEF). Time t: 3 extant topics. Time t + k: 2 extant topics. Adapted from Elshamy (2012)

10 Outline (repeat of slide 7)

11 Topic Modeling [1]: Basic Task (Static). Elshamy (2012)

12 Topic Modeling [2]: Understanding Plate Notation. Adapted from Elshamy (2012)
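
The diagram for this slide did not survive in the transcript. As a rough illustration of what plate notation expresses, the sketch below treats each plate as a loop over repeated random draws. It assumes an LDA-style static topic model, which the slide text does not confirm; the function names, parameters `alpha` and `beta`, and the toy vocabulary are hypothetical.

```python
import random


def sample_dirichlet(concentration, dimension, rng):
    """Draw from a symmetric Dirichlet by normalizing independent gamma draws."""
    draws = [rng.gammavariate(concentration, 1.0) for _ in range(dimension)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(num_docs, doc_len, num_topics, vocab,
                    alpha=0.5, beta=0.5, seed=0):
    """Generate a toy corpus from an assumed LDA-style model (illustrative only)."""
    rng = random.Random(seed)
    # Topic plate: one word distribution per topic.
    topics = [sample_dirichlet(beta, len(vocab), rng) for _ in range(num_topics)]
    corpus = []
    for _ in range(num_docs):  # document plate
        # Per-document topic proportions.
        theta = sample_dirichlet(alpha, num_topics, rng)
        doc = []
        for _ in range(doc_len):  # word plate, nested inside the document plate
            z = rng.choices(range(num_topics), weights=theta)[0]  # topic for this word
            w = rng.choices(vocab, weights=topics[z])[0]          # word from that topic
            doc.append(w)
        corpus.append(doc)
    return corpus


if __name__ == "__main__":
    toy_vocab = ["outbreak", "cattle", "virus", "election", "vote"]
    for doc in generate_corpus(num_docs=3, doc_len=6, num_topics=2, vocab=toy_vocab):
        print(" ".join(doc))
```

Each nested loop corresponds to one plate: variables drawn inside a loop are repeated once per iteration, while variables drawn outside it are shared.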

13 Topic Modeling [3]: Hyperparameters (Another Model). Adapted from Elshamy (2012)

14 Outline (repeat of slide 7)

15 Continuous Time vs. Variable Number of Topics: State of the Field; Goal. Elshamy (2012)

16 Events from Text: Markov Model for Topic Detection & Tracking. Adapted from Elshamy (2012)

17 Outline (repeat of slide 7)

18 Continuous-time Dynamic Topic Model (cDTM). Elshamy (2012)

19 Discrete Time Online Hierarchical Dirichlet Process (oHDP). Elshamy (2012)
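
To make the "variable number of topics" idea concrete, here is a minimal sketch of the Chinese restaurant process, a standard construction of the Dirichlet process prior on which hierarchical Dirichlet processes build. It is not the oHDP inference procedure from the talk; the function name and the `concentration` parameter are hypothetical, and "tables" stand in for topics.

```python
import random


def chinese_restaurant_process(num_customers, concentration, seed=0):
    """Seat customers one at a time; the number of tables is not fixed in advance."""
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = customers currently at table k
    assignments = []   # assignments[n] = table index chosen by customer n
    for _ in range(num_customers):
        # An existing table is chosen in proportion to its size;
        # a new table is opened in proportion to the concentration parameter.
        weights = table_sizes + [concentration]
        choice = rng.choices(range(len(weights)), weights=weights)[0]
        if choice == len(table_sizes):
            table_sizes.append(1)      # open a new table (a new topic)
        else:
            table_sizes[choice] += 1   # join an existing table
        assignments.append(choice)
    return assignments, table_sizes


if __name__ == "__main__":
    _, sizes = chinese_restaurant_process(num_customers=100, concentration=1.0)
    print(len(sizes), "tables with sizes", sizes)
```

A larger `concentration` tends to open more tables, which is how this kind of prior lets the data determine how many topics are in use.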

20 Continuous-time Infinite Dynamic Topic Model (CIDTM). Elshamy (2012)

21 Outline (repeat of slide 7)

22 HealthMap Redux: Thematic Mapping, Health Informatics, & Epidemiology. http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.

23 Outline (repeat of slide 7)

24 Thematic Mapping Tasks [1]: Entities
Example: CNN, 2007 Foot-and-Mouth Disease (http://bit.ly/3gof6o)
Tests have confirmed a second foot-and-mouth outbreak in southern England, the government announced, raising fears that the highly contagious animal virus is spreading. Chief Veterinary Officer Debby Reynolds said Tuesday that tests showed a herd of cattle had been infected. The animals were culled Monday evening after showing signs of the disease.
Update Summarization: A second foot-and-mouth disease infection in a herd of cattle in southern England was responded to by culling on Monday evening and announced by Debby Reynolds on Tuesday. (Second since earlier report, hence “update”.)
Compare: Recognizing Textual Entailment: A foot-and-mouth disease infection was reported the day after culling. (True.)

25 Thematic Mapping Tasks [2]: Aspects. © 2008 C. Zhai, University of Illinois, http://sifaka.cs.uiuc.edu/ir/

26 Thematic Mapping Tasks [3]: Location & Disambiguation. Current off-the-shelf applications run into ambiguity problems. © 2008 W. Elshamy

27 Thematic Mapping Tasks [4]: Time & Timelines. Search phrase: “smallpox”. © 2007 – 2009 Google, Inc.

28 Thematic Mapping Tasks [5]: Timeline Reconstruction. Murphy, Hsu, Elshamy, Kallumadi, & Volkova (2012)

29 Outline (repeat of slide 7)

30 Recent Results [1]: Meth Lab Mapping. Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

31 Recent Results [2]: Visual Analytics. Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

32 Recent Results [3]: Topic Proportions. Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)

33 Outline (repeat of slide 7)

34 Sentiment Analysis Tasks: Polarity. http://dslreports.com © 1999 – 2012 dslreports.com

35 Aggregation & OLAP: Wikipedia Infobox as Fact Table
Infobox: Albert Einstein © 2001 – 2010 Wikimedia Foundation
Q: Where can this information be found?
A: It depends… How much formatting does source page have? Marked up? (Machine-readable?) Semantically rich markup?

36 Opinion Mapping Example [1]: Health Blogs on Chronic Disease

37 Opinion Mapping Example [2]: New Entities & Relationships

38 Opinion Mapping Example [3]: Polarity. http://twitrratr.com/search/EuroHCIR © 2012 Twitrratr

39 Opinion Mapping Example [4]: Aims & Approach
Aim 1 – Extend Algorithms to Detect New:
  - Entities: Diseases, Treatments, Complications
  - Relationships: Adverse Reactions, Controversies
Aim 2 – Domain-Specific Ontology
  - Symptoms, Disease Attributes
  - Treatments, Complications
  - Comparisons
Aim 3 – Better Recognition of Scope, Polarity

40 User Groups: Goals & Primary Use Cases
Goal: Thematic Opinion Map (Choropleth, etc.)
User Groups
  - Experienced: policymakers, health professionals
  - Individual stakeholders: patients, activists, voters
Primary Use Case: Infographics as IE Views
Are Germans really the happiest Twitter users by country, Tennesseans by U.S. state? http://bit.ly/fu04zf © 2011 Mediabistro

