1 Kalev Leetaru, Eric Shook, and Shaowen Wang
CyberInfrastructure and Geospatial Information Laboratory (CIGI)
Department of Geography and Geographic Information Science
School of Earth, Society, and Environment
National Center for Supercomputing Applications (NCSA)
University of Illinois at Urbana-Champaign
CyberGIS ’12, Urbana IL, August 8, 2012
A CyberGIS Approach to Digital Humanities and Social Sciences: The World of Textual Geography and a Case Study of Wikipedia’s History of the World

14 http://www.sgi.com/go/wikipedia

20 Workflow CyberGIS Sentiment Mining Fulltext Geocoding

21 Inside the CyberGIS “black box” Security Domain Decomposition XSEDE GISolve Middleware CI Data & Viz Resource Selection Task Scheduling Clouds Workflow Management Services Open Service API OSG Emotional Heatmap

22 Data Input for a Topic
A set of locations with 3 attributes
Latitude, longitude point location
1. Number of articles mentioning this location
2. Number of articles mentioning both this location and topic
3. Average tone of articles mentioning both this location and topic
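One way to picture this input is as a list of records, one per location. The sketch below is illustrative only: the class name, field names, and sample values are hypothetical and are not taken from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationRecord:
    """One input point for a topic (hypothetical field names)."""
    lat: float              # latitude of the point location
    lon: float              # longitude of the point location
    n_articles: int         # 1. articles mentioning this location
    n_topic_articles: int   # 2. articles mentioning both this location and the topic
    avg_tone: float         # 3. average tone of the articles counted in attribute 2


# Made-up sample values, for illustration only.
records = [
    LocationRecord(lat=40.11, lon=-88.21, n_articles=120, n_topic_articles=30, avg_tone=-2.5),
    LocationRecord(lat=48.86, lon=2.35, n_articles=900, n_topic_articles=45, avg_tone=1.0),
]

# Attribute 2 counts a subset of the articles counted by attribute 1.
consistent = all(r.n_topic_articles <= r.n_articles for r in records)
```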

24 Spatializing Emotion
3 important elements
1. Importance of location
2. Prevalence of topic
3. Emotion toward topic
Goal: Capture 3 elements on a single map

25 1) Importance of Location
Every mention of a location increases its importance
Generate a density map of the number of times a location is mentioned in text using Kernel Density Estimation (KDE) based on k nearest neighbor search
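The slide does not say which KDE variant was used. A minimal sketch of one common reading follows, in which the bandwidth at each evaluation point is the distance to its k-th nearest location. The function name, the Epanechnikov kernel, the planar (x, y) distances, and the sample data are all assumptions made for illustration.

```python
import math


def knn_kde(points, counts, query, k=3, min_bandwidth=1e-6):
    """Mention density at `query`, with the bandwidth set to the distance
    from `query` to its k-th nearest location (assumed variant)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not points:
        return 0.0
    # Distance from the query to every location, paired with its mention count.
    by_distance = sorted((math.dist(p, query), c) for p, c in zip(points, counts))
    # Adaptive bandwidth: distance to the k-th nearest location, guarded against zero.
    kth = min(k, len(by_distance)) - 1
    h = max(by_distance[kth][0], min_bandwidth)
    total = 0.0
    for d, c in by_distance:
        u = d / h
        if u < 1.0:
            # 2-D Epanechnikov kernel, weighted by the number of mentions.
            total += c * (2.0 / math.pi) * (1.0 - u * u)
    return total / (h * h)


# Hypothetical planar coordinates and mention counts.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
counts = [10, 5, 5, 1]
dense = knn_kde(points, counts, (0.0, 0.0))   # inside the cluster
sparse = knn_kde(points, counts, (5.0, 5.0))  # at the isolated location
```

Evaluating `knn_kde` at every cell of a regular grid would give the density map. Real latitude and longitude data would need a projection or a great-circle distance in place of `math.dist`.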

26 1) Importance of Location

27 2) Prevalence of Topic
We use the term topic intensity for the prevalence of a topic relative to other topics, and estimate it with a method commonly used in epidemiological studies
Relative risk is the ratio of the KDE of disease infection locations to the KDE of case control locations

28 Topic Intensity = KDE(articles that mention a topic) / KDE(articles that do not mention the topic)
Relative Risk = KDE(points with disease) / KDE(points without disease)
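The ratio can be sketched cell by cell on two density grids of the same shape. The function name and sample grids below are hypothetical. Returning NaN where the denominator is zero is an assumption, since the slides do not say how such cells were handled.

```python
import math


def topic_intensity(topic_kde, non_topic_kde):
    """Cell-wise ratio of two density grids given as lists of rows."""
    result = []
    for topic_row, other_row in zip(topic_kde, non_topic_kde):
        row = []
        for t, o in zip(topic_row, other_row):
            # Assumption: the ratio is undefined (NaN) where the denominator is zero.
            row.append(t / o if o > 0.0 else math.nan)
        result.append(row)
    return result


# Made-up 1 x 2 grids, for illustration only.
intensity = topic_intensity([[2.0, 1.0]], [[4.0, 0.0]])
```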

29 Topic Intensity

30 3) Emotion Toward a Topic
Challenging question: Is the emotional measure, tone, discrete or continuous?
–Is tone "countable" like trees or does it exist as a continuum like air temperature?
Tone is a continuum:
–Cannot have "number of tones"

31 3) Emotion Toward a Topic
A different method is used, because tone is continuous and not discrete
Inverse distance weighted (IDW) interpolation is used to estimate tone across space, creating a tone map
Tone map captures positive and negative tone toward a particular topic across space
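A minimal sketch of IDW interpolation at a single point is shown below. The function name, the power of 2, the planar distances, and the sample tones are assumptions; the slides do not give the parameters that were used.

```python
import math


def idw_tone(samples, query, power=2.0):
    """Inverse-distance-weighted tone at `query`.

    `samples` is a list of ((x, y), tone) pairs.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for location, tone in samples:
        d = math.dist(location, query)
        if d == 0.0:
            # The query coincides with a sample, so its tone is returned as is.
            return tone
        w = 1.0 / d ** power
        weighted_sum += w * tone
        weight_total += w
    return weighted_sum / weight_total


# Two hypothetical locations with opposite tone.
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), -10.0)]
midpoint_tone = idw_tone(samples, (1.0, 0.0))  # equal weights cancel out
near_first = idw_tone(samples, (0.5, 0.0))     # pulled toward the positive sample
```

Evaluating `idw_tone` at every grid cell would give a tone map in which nearer locations dominate the estimate.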

32 3) Emotion Toward a Topic

33 Overview – 3 layers
1) Article density - Proxy: Importance of location
2) Topic intensity - Proxy: Prevalence of topic relative to other topics
3) Tone - Proxy: Emotion toward a topic

34 Overview – 3 layers
1) Article density - Proxy: Importance of location
2) Topic intensity - Proxy: Prevalence of topic relative to other topics
3) Tone - Proxy: Emotion toward a topic
First two layers represent scaling factors for tone
Value range: 0 - 1
Value range: 0 - 100
Value range: -100 - 100

35 Emotional Heatmap
Emotional Heatmap = Article Density * Topic Intensity * Tone
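Read as a cell-wise product of three grids, the combination might look like the sketch below. The function name and sample values are hypothetical, and any rescaling of the layers before multiplication is left out because the slides do not describe it.

```python
def emotional_heatmap(density, intensity, tone):
    """Cell-wise product of three equally shaped grids (lists of rows)."""
    return [
        [d * i * t for d, i, t in zip(d_row, i_row, t_row)]
        for d_row, i_row, t_row in zip(density, intensity, tone)
    ]


# Made-up 1 x 2 layers: the first two scale the tone, which keeps its sign.
heatmap = emotional_heatmap(
    density=[[1.0, 0.5]],
    intensity=[[0.5, 0.25]],
    tone=[[-10.0, 40.0]],
)
```

Because the first two layers act as scaling factors, a strongly toned cell contributes little to the heatmap when the location is rarely mentioned or the topic is rare there.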

36 Emotional Heatmap of Armed Conflict in 2003 (Wikipedia)

37 Summary
First steps, but started the dialogue
Balance
–Managing the complexity of cyberinfrastructure access
–Simplifying the workflow of chaining spatial analytics
–Making sense of what’s involved
Scientific rigor

38 Ongoing Work
Translate spatial knowledge to domain knowledge by answering a basic question: why is this here and not there?
Tackle spatial aggregation issues
–Represent locations as areas not points
–Areal interpolation

39 Acknowledgments
Guofeng Cao, Anand Padmanabhan
National Science Foundation
–BCS-0846655
–OCI-
–OCI-1047916
–Open Science Grid
–XSEDE SES070004N

40 Thanks!

