Mumbai, India. November 26, 2008. Another chapter in the war against civilization.

Presentation transcript:

Mumbai, India

November 26, 2008

another chapter in the war against civilization

and

the world saw it through the eyes of the people

the world read it through the words of the people

PEOPLE told their stories to PEOPLE

A powerful new era in information dissemination had taken firm ground

Making it possible for us to create a global network of citizens
Citizen Sensors – Citizens observing, processing, transmitting, reporting

Semantic Integration of Citizen Sensor Data and Multilevel Sensing: A comprehensive path towards event monitoring and situational awareness
Amit P. Sheth, LexisNexis Eminent Scholar, Kno.e.sis Center, Wright State University

Image Metadata – latitude: 18° 54′ 59.46″ N, longitude: 72° 49′ 39.65″ E
Geocoder (Reverse Geo-coding)
Address to location database – 18 Hormusji Street, Colaba; Nariman House
Identify and extract information from tweets
Spatio-Temporal Analysis
Structured Meta Extraction
Income Tax Office Vasant Vihar
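
A reverse geocoder typically expects decimal degrees, while the image metadata above is given in degrees, minutes and seconds. The following is a minimal sketch of that conversion step only; the function name `dms_to_decimal` is an assumption made for illustration, and the address lookup itself is not shown.

```python
def dms_to_decimal(degrees: int, minutes: int, seconds: float, hemisphere: str) -> float:
    """Convert degrees/minutes/seconds plus a hemisphere letter to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # By convention, southern latitudes and western longitudes are negative.
    return -value if hemisphere.upper() in ("S", "W") else value


# Coordinates taken from the image metadata on the slide.
lat = dms_to_decimal(18, 54, 59.46, "N")
lon = dms_to_decimal(72, 49, 39.65, "E")
print(f"{lat:.4f}, {lon:.4f}")
```

The resulting pair could then be passed to whichever geocoding service or address-to-location database is available.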

Research Challenge #1: Spatio-Temporal and Thematic Analysis
– What else happened “near” this event location?
– What events occurred “before” and “after” this event?
– Any message about “causes” for this event?

Spatial Analysis: Which tweets originated from an address near °N °E?

Which tweets originated on Nov 27th, 2008, from 11 PM to 12 PM?

Giving us: Which tweets originated from an address near °N, °E during the time interval on 27th Nov 2008 between 11 PM and 12 PM?
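
The combined "near this place, during this interval" question can be sketched as a filter over geotagged, timestamped messages. This is an illustration under stated assumptions, not the authors' system: the `Tweet` record, the 1 km radius, the sample messages and the time window are all hypothetical, and great-circle (haversine) distance stands in for whatever notion of "near" the original analysis used.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Tweet:  # hypothetical record shape
    text: str
    lat: float
    lon: float
    posted_at: datetime


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def near_and_during(tweets, lat, lon, radius_km, start, end):
    """Keep tweets within radius_km of (lat, lon) and posted in [start, end)."""
    return [
        t for t in tweets
        if haversine_km(t.lat, t.lon, lat, lon) <= radius_km
        and start <= t.posted_at < end
    ]


# Made-up sample messages; the window below is an arbitrary assumption.
tweets = [
    Tweet("near, in window", 18.9170, 72.8280, datetime(2008, 11, 27, 22, 30)),
    Tweet("near, too early", 18.9170, 72.8280, datetime(2008, 11, 27, 9, 0)),
    Tweet("far, in window", 28.6139, 77.2090, datetime(2008, 11, 27, 22, 30)),
]
hits = near_and_during(
    tweets, 18.9165, 72.8277, radius_km=1.0,
    start=datetime(2008, 11, 27, 22, 0), end=datetime(2008, 11, 27, 23, 0),
)
print([t.text for t in hits])
```

For large collections a spatial index would be a more realistic choice than a linear scan; the scan keeps the sketch short.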

Research Challenge #2: Understanding and Analyzing Casual Text
Casual text
– Microblogs are often written in SMS-style language
– Slang, abbreviations

Understanding Casual Text
Not the same as news articles or scientific literature
– Grammatical errors: implications for NL parser results
– Inconsistent writing style: implications for learning algorithms that generalize from a corpus

Nature of Microblogs
Additional constraint of limited context
– Max. of x chars in a microblog
– Context often provided by the discourse
Entity identification and disambiguation
– Pre-requisite to other sophisticated information analytics

NL understanding is hard to begin with...
Not so hard
– “commando raid appears to be nigh at Oberoi now”
– Oberoi = Oberoi Hotel, Nigh = high
Challenging
– “new wing, live taj 2nd floor on iDesi TV stream”
– Fire on the second floor of the Taj hotel, not on iDesi TV

Social Context surrounding content
The social context in which a message appears is also a valuable added resource
Post 1:
– “Hareemane House hostages said by eyewitnesses to be Jews. 7 Gunshots heard by reporters at Taj”
Follow-up post:
– that is Nariman House, not (Hareemane)
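
One simple way to see how domain knowledge helps with a misspelled mention such as "Hareemane House" is to compare it against a short list of known places. The sketch below uses Python's standard `difflib.SequenceMatcher` for string similarity; the gazetteer contents, the `resolve_place` name and the 0.6 cutoff are assumptions for illustration, and a real system would also draw on the follow-up posts and other social context described above.

```python
from difflib import SequenceMatcher

# Hypothetical gazetteer of places mentioned in this deck.
GAZETTEER = ["Nariman House", "Oberoi Hotel", "Taj Hotel", "Colaba"]


def resolve_place(mention, gazetteer=GAZETTEER, cutoff=0.6):
    """Return the most similar gazetteer entry, or None if nothing is close enough."""
    best_name, best_score = None, 0.0
    for name in gazetteer:
        # ratio() is 2 * matches / total length, a value in [0, 1].
        score = SequenceMatcher(None, mention.lower(), name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= cutoff else None


print(resolve_place("Hareemane House"))
```

Character-level similarity is only a stand-in here; phonetic matching or community corrections may work better for names transcribed by ear.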

Research Opportunities
NER and disambiguation in casual, informal text are a budding area of research
Another important area of focus: combining information of varied quality from
– a corpus (statistical NLP),
– domain knowledge (tags, folksonomies, taxonomies, ontologies),
– social context (explicit and implicit communities)

What Drives the Spatio-Temporal-Thematic Analysis and Casual Text Understanding?
Semantics, with the help of
1. Domain Models
2. Domain Models
3. Domain Models (ontologies, folksonomies)

And who creates these models? YOU, ME, We DO!

Domain Knowledge: A key driver
– Places that are nearby ‘Nariman House’: spatial query
– Messages originated around this place: temporal analysis
– Messages about related events / places: thematic analysis

Research Challenge #3: But where does the Domain Knowledge come from?
Community-driven knowledge extraction
– How to create models that are “socially scalable”?
– How to organically grow and maintain this model?

The Wisdom of the Crowds
The most comprehensive and up-to-date account of the present state of knowledge is given by
Everybody = The Web in general = Blogs = Wikipedia

Collecting Knowledge
Wikipedia = Concise concept descriptions
+ An article title denotes a concept
+ Community takes care of disambiguation
+ Large, highly connected, sparsely annotated graph structure that connects named entities
+ Category hierarchy

Goal: Harness the Wisdom of the Crowds to automatically define a domain with up-to-date concepts
We can safely take advantage of existing (semi)structured knowledge sources
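
To make the idea of deriving a domain from a category hierarchy concrete, here is a minimal sketch of a breadth-first walk over category links, bounded by depth. It is not the authors' extraction method: the `CATEGORY_LINKS` table is toy data invented for illustration (not real Wikipedia content), and `build_hierarchy` and `max_depth` are hypothetical names.

```python
from collections import deque

# Toy category -> children edges, invented for illustration only.
CATEGORY_LINKS = {
    "Disasters": ["Explosions", "Fires"],
    "Explosions": ["Gas explosion"],
    "Fires": ["Building fire", "Wildfire"],
}


def build_hierarchy(root, links, max_depth=2):
    """Breadth-first walk from root; returns {concept: depth below root}."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depths[node] >= max_depth:
            continue  # do not expand beyond the depth limit
        for child in links.get(node, []):
            if child not in depths:  # category graphs can contain cycles
                depths[child] = depths[node] + 1
                queue.append(child)
    return depths


print(build_hierarchy("Disasters", CATEGORY_LINKS))
```

The depth limit is one simple way to keep the harvested domain focused, since broad category graphs tend to drift off topic after a few hops.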

Collecting Instances

Creating a Hierarchy

Hierarchy Creation - summary

Snapshot of final Topic Hierarchy

Great to know Explosion and Fire are related!
But knowing that Explosion “causes” Fire is powerful
Relationships are at the heart of semantics!

Identifying relationships: hard, harder than many hard things
But NOT that hard, when WE do it

Games with a purpose
Get humans to give their solitaire time
– Solve real hard computational problems
– Image tagging, identifying part of an image
– Tag a Tune, Squigl, Verbosity, and Matchin
– Pioneered by Luis von Ahn

OntoLablr: Relationship Identification Game
– Candidate relationships: leads to, causes
– Concepts: Explosion, Traffic congestion
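
Games of this kind usually accept a label only when enough independent players agree on it. The sketch below shows that agreement rule in isolation; it does not describe OntoLablr's actual mechanics, and the function name, the vote list and both thresholds are assumptions chosen for illustration.

```python
from collections import Counter


def agreed_relation(votes, min_votes=3, min_share=0.6):
    """Return the relation most players chose, if agreement is strong enough."""
    if len(votes) < min_votes:
        return None  # too few players to trust the result
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_share else None


# Hypothetical player answers for the pair (Explosion, Traffic congestion).
votes = ["causes", "causes", "leads to", "causes"]
print(agreed_relation(votes))
```

Raising `min_share` trades coverage for precision: fewer relationships are accepted, but those that are accepted have stronger support.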

And the infrastructure
Semantic Sensor Web
– How can we annotate and correlate the knowledge from machine sensors around the event location?

Research Challenge #4: Semantic Sensor Web

Semantically Annotated O&M T05:00:00,29.1

Semantic Sensor ML – Adding Ontological Metadata
– Domain Ontology: Person, Company
– Spatial Ontology: Coordinates, Coordinate System
– Temporal Ontology: Time Units, Timezone
Mike Botts, "SensorML and Sensor Web Enablement," Earth System Science Center, University of Alabama in Huntsville

Semantic Query: Semantic Temporal Query
Model references from SML to OWL-Time ontology concepts provide the ability to perform semantic temporal queries
Supported semantic query operators include:
– contains: user-specified interval falls wholly within a sensor reading interval (also called inside)
– within: sensor reading interval falls wholly within the user-specified interval (inverse of contains or inside)
– overlaps: user-specified interval overlaps the sensor reading interval
Example SPARQL query defining the temporal operator ‘within’
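
The three operators listed above can be stated precisely as comparisons on interval endpoints. The following is a sketch in Python rather than SPARQL, following the slide's definitions; the `Interval` class and the sample times are hypothetical, and treating `overlaps` as "the two intervals share some span of time" is an assumption, since the slide does not define it further.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Interval:  # hypothetical closed time interval
    start: datetime
    end: datetime


def contains(user: Interval, reading: Interval) -> bool:
    """User-specified interval falls wholly within the sensor reading interval."""
    return reading.start <= user.start and user.end <= reading.end


def within(user: Interval, reading: Interval) -> bool:
    """Sensor reading interval falls wholly within the user-specified interval."""
    return user.start <= reading.start and reading.end <= user.end


def overlaps(user: Interval, reading: Interval) -> bool:
    """The two intervals share some span of time (assumed meaning)."""
    return user.start < reading.end and reading.start < user.end


# Made-up example: a one-hour query window and a 30-minute sensor reading.
user = Interval(datetime(2008, 11, 27, 5, 0), datetime(2008, 11, 27, 6, 0))
reading = Interval(datetime(2008, 11, 27, 5, 15), datetime(2008, 11, 27, 5, 45))
print(within(user, reading), contains(user, reading), overlaps(user, reading))
```

Note that `within(user, reading)` is equivalent to `contains(reading, user)`, which matches the slide's description of the two operators as inverses.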

Kno.e.sis’ Semantic Sensor Web

Synthetic but realistic scenario: an image taken from a raw satellite feed

Synthetic but realistic scenario: an image taken by a camera phone with an associated label, “explosion”

Synthetic but realistic scenario: textual messages (such as tweets) using STT analysis

Synthetic but realistic scenario: correlating to get

Create better views (smart mashups)

A few more things
Use of background knowledge
Event extraction from text – time and location extraction
– Such information may not be present
– Someone from Washington DC can tweet about Mumbai
Scalable semantic analytics – subgraph and pattern discovery
– Meaningful subgraphs like relevant and interesting paths
– Ranking paths

The Sum of the Parts
Spatio-Temporal analysis – Find out where and when
+ Thematic – What and how
+ Semantic Extraction from text, multimedia and sensor data – tags, time, location, concepts, events
+ Semantic models & background knowledge – Making better sense of STT – Integration
+ Semantic Sensor Web – The platform
= Situational Awareness

[Diagram labels: Text; Multimedia Content and Web data; Metadata Extraction; Patterns / Inference / Reasoning; Domain Models; Metadata / Semantic Annotations; Relationship Web; Search; Integration; Analysis; Discovery; Question Answering; Situational Awareness; Sensor Data; RDB; Structured and Semi-structured data]

Interested in more background?
– Semantics-Empowered Social Computing
– Semantic Sensor Web
– Traveling the Semantic Web through Space, Theme and Time
– Relationship Web: Blazing Semantic Trails between Web Resources
Contact/more details: knoesis.orgamitknoesis.org
Special thanks: Karthik Gomadam, Meena Nagarajan, Christopher Thomas
Partial Funding: NSF (Semantic Discovery: IIS: , Spatio Temporal Thematic: IIS ), AFRL and DAGSI (Semantic Sensor Web), Microsoft Research and IBM Research (Analysis of Social Media Content), and HP Research (Knowledge Extraction from Community-Generated Content).