Context Aware Semantic Association Ranking. SWDB Workshop, Berlin, September 7, 2003. Boanerges Aleman-Meza, Chris Halaschek, I. Budak.

Context Aware Semantic Association Ranking
SWDB Workshop, Berlin, September 7, 2003
Boanerges Aleman-Meza, Chris Halaschek, I. Budak Arpinar, Amit Sheth
Large Scale Distributed Information Systems Lab, Computer Science Department, University of Georgia
This material is based upon work supported by the National Science Foundation under Grant No

From “finding things” to “finding out about” [Belew00] relationships!

Outline
- From Search to Analysis: Semantic Associations
- Using Context for Ranking
- Ranking Algorithm
- Preliminary Results / Demo
- Related Work
- Conclusion & Future Work

Changing expectations
- Not documents, not search, not even entities, but actionable information and insight
- Emergence of text/content analytics, knowledge discovery, etc. for business intelligence, national security, and other emerging markets

Example in 9-11 context
What are the relationships between Khalid Al-Midhar and Majed Moqed?
- Connections: bought tickets using the same frequent flier number
- Similarities:
  - Both purchased tickets originating from Washington DC, paid by cash, and picked up their tickets at the Baltimore-Washington Int'l Airport
  - Both have seats in Row 12
“What relationships exist (if any) between Osama bin Laden and the 9-11 attackers?”

Semantic Associations

ρ-Association
Two entities e_1 and e_n are semantically connected if there exists a sequence e_1, P_1, e_2, P_2, e_3, …, e_{n-1}, P_{n-1}, e_n in an RDF graph, where e_i, 1 ≤ i ≤ n, are entities and P_j, 1 ≤ j < n, are properties.
[Figure: RDF graph labeled "Semantically Connected" with resources &r1, &r5 and &r6, properties purchased, for, fname and lname, and literals “M’mmed”, “Atta”, “Abdulaziz”, “Alomari”]

σ-Association
Two entities are semantically similar if both have ≥ 1 similar paths starting from the initial entities, such that for each segment of the path:
- Property P_i is either the same as, or a subproperty of, the corresponding property in the other path
- Entity E_i belongs to the same class, to classes that are siblings, or to a class that is a subclass of the corresponding class in the other path
[Figure: two paths labeled "Semantic Similarity" with resources &r1, &r2, &r3, &r7, &r8 and &r9, properties purchased, paidby, fname and lname, literals “M’mmed”, “Atta”, “Marwan”, “Al-Shehhi”, and classes Passenger, Ticket and Cash]

Semantic Association
- ρ-Query: a ρ-Query, expressed as ρ(x, y), where x and y are entities, results in the set of all semantic paths that connect x and y
- σ-Query: a σ-Query, expressed as σ(x, y), where x and y are entities, results in the set of all pairs of semantically similar paths originating at x and y
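
A query for all semantic paths connecting x and y can be sketched as a bounded depth-first enumeration of simple paths. This is a minimal illustration, not the authors' implementation: the function name `connecting_paths`, the `max_entities` bound, the choice to follow statements in either direction, and the sample triples (including the `friendOf` statement) are all assumptions made for the example.

```python
from collections import defaultdict


def connecting_paths(triples, x, y, max_entities=6):
    """Return every simple path from x to y as [e1, P1, e2, ..., en].

    Assumptions (not from the slides): statements may be followed in either
    direction, and a path may visit at most `max_entities` entities.
    """
    # Index every (subject, property, object) statement from both ends.
    neighbours = defaultdict(list)
    for subj, prop, obj in triples:
        neighbours[subj].append((prop, obj))
        neighbours[obj].append((prop, subj))

    results = []

    def dfs(entity, path, visited):
        if entity == y:
            results.append(list(path))  # copy, because path is mutated below
            return
        if len(visited) >= max_entities:
            return
        for prop, nxt in neighbours[entity]:
            if nxt not in visited:
                visited.add(nxt)
                path.extend([prop, nxt])
                dfs(nxt, path, visited)
                del path[-2:]  # backtrack
                visited.remove(nxt)

    dfs(x, [x], {x})
    return results


# Hypothetical statements loosely modelled on the figure's resource names.
triples = [
    ("r1", "purchased", "r5"),
    ("r5", "for", "r6"),
    ("r1", "friendOf", "r6"),
]
paths = connecting_paths(triples, "r1", "r6")
for p in paths:
    print(" -> ".join(p))
```

The entity bound keeps the enumeration finite on dense graphs; without some such limit the number of simple paths can grow very quickly, which is one reason the results need ranking.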

The Need For Ranking
- Current test bed with > 6,000 entities and > 11,000 explicit relations
- The semantic association query ρ(“Nasir Ali”, “AlQeada”) results in 2,234 associations
- The results must be presented to the user in order of relevance, hence the need for ranking

Context Use For Ranking

Context: Why, What, How?
- Context => relevance; reduction in computation space
- Context captures the user's interest, so that the user is given the relevant knowledge from among the numerous relationships between the entities
- By defining regions (or sub-graphs) of the ontology, we capture the user's areas of interest

Context Specification
- Topographic approach (current)
  - Regions ‘capture’ the user’s interest: a region is a subset of the classes (entities) and properties of an ontology
- View approach (future)
- Each region can have a relevance weight

Ranking Algorithm

Ranking: Introduction
Our ranking approach defines a path rank as a function of several ranking criteria.
Ranking criteria:
- Universal: query (or context) independent
  - Subsumption
- User-defined: query (or context) specific
  - Path Length
  - Context
  - Trust

Subsumption Weight
- Specialized instances are considered more relevant
- More “specific” relations convey more meaning
[Figure: class hierarchy Organization > Political Organization > Democratic Political Organization. "H. Dean memberOf Democratic Party" is ranked higher; "H. Dean memberOf AutoClub" is ranked lower.]
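
The slide states the principle but not a formula. One way to make "more specific ranks higher" concrete, offered only as an assumption, is to score each class by its depth in the hierarchy relative to the hierarchy's height and average over the scored entities. The names `depth` and `subsumption_weight`, and the guess that Democratic Party is typed as a Democratic Political Organization while AutoClub is typed as a plain Organization, are hypothetical.

```python
def depth(cls, parent):
    """Number of classes on the chain from cls up to the root, inclusive."""
    d = 1
    while cls in parent:
        cls = parent[cls]
        d += 1
    return d


def subsumption_weight(entity_classes, parent):
    """Average of depth / hierarchy height over the given classes.

    Assumption: this depth ratio is only one plausible reading of
    "specialized instances are considered more relevant".
    """
    all_classes = set(parent) | set(parent.values())
    height = max(depth(c, parent) for c in all_classes)
    return sum(depth(c, parent) / height for c in entity_classes) / len(entity_classes)


# child -> parent links for the hierarchy shown in the figure
parent = {
    "PoliticalOrganization": "Organization",
    "DemocraticPoliticalOrganization": "PoliticalOrganization",
}

specific = subsumption_weight(["DemocraticPoliticalOrganization"], parent)
general = subsumption_weight(["Organization"], parent)
print(specific > general)
```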

Path Length Weight
- Interest in the most direct paths (i.e., the shortest path)
  - May imply a stronger relationship between two entities
- Interest in hidden, indirect, or discreet paths (i.e., longer paths)
  - Terrorist cells are often hidden
  - Money laundering involves deliberately innocuous-looking transactions

Path Length - Example
[Figure: paths among ABU ZUBAYDAH, SAAD BIN LADEN, Osama Bin Laden, Al Qeada, SAIF AL-ADIL and OMAR AL-FAROUQ, linked by friendOf and memberOf. Short paths favored: ranked higher (1.0), ranked lower ( ). Long paths favored: ranked higher (0.889), ranked lower (0.01).]
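
The slides do not give the length formula. A simple reciprocal form reproduces two of the legible values in the example: 1.0 for a path of length 1 when short paths are favored, and 0.889 for a path of length 9 when long paths are favored. Both the formula and the length of 9 are assumptions. The slide's 0.01 for the lowest-ranked path is not reproduced exactly (this sketch gives 0.0 for length 1), so the original may apply a floor or a different scaling.

```python
def path_length_weight(length, favor_long=False):
    """Reciprocal length weight (assumed form, not confirmed by the slides).

    Short paths favored: 1 / length.
    Long paths favored:  1 - 1 / length.
    """
    if length < 1:
        raise ValueError("a path must have at least one component")
    short = 1.0 / length
    return 1.0 - short if favor_long else short


print(path_length_weight(1))                            # shortest path, short favored
print(round(path_length_weight(9, favor_long=True), 3))  # long path, long favored
```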

Context Weight
- Consider the user’s domain of interest (user-weighted regions)
- Issues
  - Paths can pass through numerous regions of interest
  - Large and/or small portions of paths can pass through these regions
- Paths outside context regions rank lower or are discarded

Context Weight - Example
- Region 1: Financial Domain, weight = 0.50
- Region 2: Terrorist Domain, weight = 0.75
[Figure: graph over e1:Person, e2:Financial Organization, e3:Organization, e4:Terrorist Organization, e5:Person, e6:Financial Organization, e7:Terrorist Organization, e8:Terrorist Attack and e9:Location, with properties friendOf, memberOf, locatedIn, supports, hasAccount, worksFor, involvedIn and atLocation]
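
To illustrate how region weights could feed into a path score, the sketch below averages, over the classes of the entities on a path, the weight of the best-matching region, with entities outside every region contributing zero. This averaging rule is an assumption, as are the function name `context_weight`, the region memberships, and the two sample paths. The slides only say that paths crossing weighted regions rank higher and paths outside them rank lower or are discarded.

```python
def context_weight(path_classes, regions):
    """Mean region weight over the entity classes on a path.

    regions: list of (set_of_class_names, weight) pairs.
    Assumption: a class that falls in several regions takes the largest
    weight, and a class in no region contributes 0.
    """
    if not path_classes:
        return 0.0
    total = 0.0
    for cls in path_classes:
        total += max((w for members, w in regions if cls in members), default=0.0)
    return total / len(path_classes)


# Region weights from the slide; the class memberships are guesses.
regions = [
    ({"FinancialOrganization"}, 0.50),
    ({"TerroristOrganization", "TerroristAttack"}, 0.75),
]

via_terror = context_weight(
    ["Person", "TerroristOrganization", "TerroristAttack", "Location"], regions
)
via_finance = context_weight(
    ["Person", "FinancialOrganization", "Organization", "Location"], regions
)
print(via_terror, via_finance)
```

Under these assumptions the path through the terrorism region scores higher, matching its larger region weight.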

Trust Weight
- Relationships (properties) originate from sources that are trusted to different degrees
- Trust values need to be assigned to relationships depending on the source
  - e.g., Reuters could be more trusted than some of the other news sources
- The current approach penalizes low-trust relationships (it may overweight the lowest trust value on a path)
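
A rule that penalizes low-trust relationships and is dominated by the lowest value can be read as taking the minimum trust over the statements on a path. The sketch below assumes that reading. The trust values and the source name `UnverifiedBlog` are invented for illustration.

```python
def trust_weight(statement_sources, source_trust):
    """Trust of a path = trust of its least trusted statement (assumed rule)."""
    if not statement_sources:
        raise ValueError("a path must contain at least one statement")
    return min(source_trust[source] for source in statement_sources)


# Hypothetical trust assignments per metadata source.
source_trust = {"Reuters": 0.9, "UnverifiedBlog": 0.3}

w = trust_weight(["Reuters", "Reuters", "UnverifiedBlog"], source_trust)
print(w)
```

With a minimum, one weakly sourced statement caps the whole path, which is the overweighting effect the slide mentions.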

Ranking Criterion
The overall path weight of a semantic association is a linear function:
Ranking Score = k_1 × Subsumption + k_2 × Length + k_3 × Context + k_4 × Trust
where the k_i add up to 1.0.
This allows fine-tuning of the ranking criteria.
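
The linear combination can be sketched directly. The coefficients below are the ones quoted on the demo slide (0.6 context, 0.1 subsumption, 0.2 path length, 0.1 trust). The per-path criterion values, the path labels, and the function name `ranking_score` are made up for illustration.

```python
import math


def ranking_score(weights, criteria):
    """k_1*Subsumption + k_2*Length + k_3*Context + k_4*Trust, with sum(k_i) == 1."""
    if set(weights) != set(criteria):
        raise ValueError("weights and criteria must use the same names")
    if not math.isclose(sum(weights.values()), 1.0):
        raise ValueError("the coefficients k_i must add up to 1.0")
    return sum(weights[name] * criteria[name] for name in weights)


weights = {"subsumption": 0.1, "length": 0.2, "context": 0.6, "trust": 0.1}

# Hypothetical criterion values for two candidate associations.
paths = {
    "path_a": {"subsumption": 1.0, "length": 0.889, "context": 0.375, "trust": 0.9},
    "path_b": {"subsumption": 0.333, "length": 0.5, "context": 0.125, "trust": 0.3},
}

ranked = sorted(paths, key=lambda p: ranking_score(weights, paths[p]), reverse=True)
print(ranked)
```

Because every criterion lies in [0, 1] and the coefficients sum to 1, the score also stays in [0, 1], so scores remain comparable when a user shifts weight from one criterion to another.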

Preliminary Results & Demo

Preliminary Results
- Metadata sources cover the terrorism domain
- Ontology in RDFS, metadata in RDF
- Semagix Freedom suite used for metadata extraction
- Currently > 6,000 entities and > 11,000 relations/assertions (plan to increase by two orders of magnitude)

PISTA Ontology

Preliminary Results
- Have implemented naïve algorithms for ρ and σ
  - Using a depth-first graph traversal algorithm
  - Used Jena to interact with RDF graphs (i.e., metadata in main memory)

Demo
- Context
  - ‘A’ defines a region covering ‘terrorism’: weight of 0.6
  - ‘B’ captures the ‘financial’ region: weight of 0.4
- Ranking criteria (this example)
  - 0.6 to context
  - 0.1 to subsumption
  - 0.2 to path length (longer paths favored)
  - 0.1 to trust weight

Related Work

- Ranking in Semantic Web Portals: [Maedche et al 2001]
- Our Earlier Work: [Anyanwu et al 2003]
- Contemporary information retrieval ranking approaches: [Brin et al 1998], [Teoma]
- Context Modeling: [Kashyap et al 1996], [Crowley et al 2002]

Conclusions & Future Work

Summary and Future Work
- This paper: ranking of ρ paths
  - Even more important than ranking of documents in contemporary Web search
- Ongoing: ranking of σ paths
- Future:
  - A formal query language for semantic associations is currently under development
  - Develop evaluation metrics for context-aware ranking (different from the traditional precision and recall)
  - Use the ranking scheme in the semantic-association discovery algorithms (scalability in very large data sets)

Questions, Comments, ...
For more info: PISTA Project, papers, presentations