INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID


INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID Lecture # 39 Search Computing

ACKNOWLEDGEMENTS
The presentation of this lecture has been taken from the following sources:
  “Introduction to Information Retrieval” by Prabhakar Raghavan, Christopher D. Manning, and Hinrich Schütze
  “Managing Gigabytes” by Ian H. Witten, Alistair Moffat, and Timothy C. Bell
  “Modern Information Retrieval” by Ricardo Baeza-Yates
  “Web Information Retrieval” by Stefano Ceri, Alessandro Bozzon, and Marco Brambilla

Outline
  Multi-domain queries with ranking
  Why can’t search engines do it?
  Observed trends
  Search Computing
  The Search Computing “Manifesto”
  Search Computing architecture

Motivation: multi-domain queries with ranking
A class of queries search engines are not good at:
  “Where can I attend an interesting scientific conference in my field and at the same time relax on a beautiful beach nearby?”
  “Retrieve jobs as a Java developer in Silicon Valley, near affordable fully-furnished flats and close to good schools.”
  “Find a theater close to Union Square, San Francisco, showing a recent thriller movie, close to a steak house.”
These are queries:
  With a complex notion of “best”, with many factors contributing to optimality
  Involving several different data sources, possibly hidden in the deep Web, typically returning ranked results (search services)
  With possibly articulated “join” conditions, capturing search sessions rather than one-shot queries
Search engines struggle with them because of query complexity, not data heterogeneity or unavailability.

Search for a solution using all keywords

Split the task, and search for theaters first

Inspect theater details: looks good…

But there’s no thriller! Try another theater: found! (The Next Three Days), close enough to Union Square.

Independent search for a steak house

Done! Close enough! (data integration and ranking happen in the user’s brain)

Motivating examples: why can’t search engines do it?
  The query is about distinct domains that should be linked
  The query deals with multiple rankings that are hard to compute (“close” theater, “recent” thriller, “good” steak house)
  Note that enough data is on the Web, but not on a single web page.

Observed trends
  More and more data sources become accessible through Web APIs (as services)
    Surface & deep Web
  Data sources are often coupled with search APIs
  Publishing of structured and interconnected data is becoming popular (Linked Open Data)
Opportunity for building focused search systems composing results of several data sources
  easy-to-build, easy-to-query, easy-to-maintain, easy-to-scale...
  covering the functionalities of vertical search systems (e.g. “expedia”, “amazon”) on more focused application domains (e.g. localized real estate or leisure planning, sector-specific job market offers, support of biomedical research, ...)

Search Computing = service composition “on demand”
Composition abstractions should emphasize a few aspects:
  service invocations
  fundamental operations (parallel invocations, joins, pipelining, …)
  global constraints on execution
Data composition should be search-driven
  aimed at producing a few top results very fast
Example: a house in a walkable area, close to public transportation and located in a pleasant neighborhood
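
The composition operations named on this slide can be illustrated with a minimal sketch, using the house example. Everything in it is an assumption for illustration: the three service functions, the neighborhood names, the scores, and the choice of a plain average as the combined score are invented and are not part of the Search Computing framework itself.

```python
from concurrent.futures import ThreadPoolExecutor
import heapq

# Hypothetical stand-ins for three ranked search services. Each maps a
# neighborhood name to a score in [0, 1]. All names and scores are invented.
def walkability_service():
    return {"Riverside": 0.9, "Old Town": 0.8, "Hillcrest": 0.4}

def transit_service():
    return {"Riverside": 0.6, "Old Town": 0.9, "Harbor": 0.7}

def pleasantness_service():
    return {"Riverside": 0.8, "Old Town": 0.5, "Hillcrest": 0.9}

def compose_top_k(services, k):
    # Parallel invocation: call every service concurrently.
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        futures = [pool.submit(service) for service in services]
        results = [future.result() for future in futures]
    # Join: keep only the neighborhoods returned by every service.
    common = set(results[0])
    for result in results[1:]:
        common &= set(result)
    # Combined score: plain average of the partial scores (an assumption).
    scored = [(sum(r[name] for r in results) / len(results), name)
              for name in common]
    # Search-driven composition: return only the k best combinations.
    return heapq.nlargest(k, scored)

services = [walkability_service, transit_service, pleasantness_service]
for score, name in compose_top_k(services, 2):
    print(f"{name}: {score:.2f}")
```

Neighborhoods missing from any one service are dropped by the join, so only candidates that satisfy all three criteria can reach the top results.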

The Search Computing “Manifesto”
Build theories, methods, and tools to support search-oriented multi-domain queries
Given a multi-domain query over a set of search services:
  Build global answers by combining data from each service
  Rank global answers according to a global ranking and output results in ranking order
  Support user-friendly query formulation and browsing of results
    Include new domains while the search process proceeds
    Possibly change the relative weight of each partial ranking
“Searching via interactive/dynamic mashups of ranked data sources”
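
One simple way to picture a global ranking with adjustable weights is a weighted sum of the partial scores. The sketch below assumes that form. The slide does not specify the ranking function, and the combination names, scores, and weights are invented.

```python
def global_rank(combinations, weights):
    """Sort combinations by the weighted sum of their partial scores."""
    def score(combo):
        return sum(weights[domain] * combo["scores"][domain]
                   for domain in weights)
    return sorted(combinations, key=score, reverse=True)

# Two hypothetical global answers, each with one partial score per domain.
combos = [
    {"name": "A", "scores": {"conference": 0.9, "beach": 0.3, "flight": 0.5}},
    {"name": "B", "scores": {"conference": 0.5, "beach": 0.9, "flight": 0.6}},
]

# Changing the relative weight of each partial ranking changes the order.
conference_first = {"conference": 0.6, "beach": 0.2, "flight": 0.2}
beach_first = {"conference": 0.2, "beach": 0.6, "flight": 0.2}

print([c["name"] for c in global_rank(combos, conference_first)])
print([c["name"] for c in global_rank(combos, beach_first)])
```

With the conference weighted most heavily, answer A ranks first; shifting the weight to the beach puts B first, without invoking the underlying services again.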

Search Computing architecture: overall view
High level query: “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”
Sub query 1: “Where can I attend a DB scientific conference?”
Sub query 2: “place close to a beautiful beach?”
Sub query 3: “place reachable with cheap flight?”
Low level query 1: ConfSearch(“DB”,placeX,dateY)
Low level query 2: TourSearch(“Beach”,PlaceX)
Low level query 3: Flight(“cost<200”,PlaceX,DateY)
Query plan
Services invocations and operators execution
Results
Presented results:
  ESWC-Crete-Olympic
  CAISE-Hammamet-Alitalia
  TOOLS-Malaga-EasyJet
Legend: main query flow; <Uses> relation
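
The flow from low level queries to presented results can be sketched as a small pipelined plan, in which the place and date bound by the conference search are passed on to the other two services. This is only an illustration under stated assumptions: the function names mirror the slide's low level queries, but their signatures, the in-memory data, the date labels, the costs, and the extra DEMO rows are invented. Only the three result strings come from the slide.

```python
# Invented sample data. Only the conference, place, and airline names of the
# first three rows come from the slide's presented results.
CONFERENCES = [
    ("ESWC", "DB", "Crete", "date1"),
    ("CAISE", "DB", "Hammamet", "date2"),
    ("TOOLS", "DB", "Malaga", "date3"),
    ("DEMO-A", "DB", "Zurich", "date4"),   # hypothetical: no beach
    ("DEMO-B", "DB", "Nice", "date5"),     # hypothetical: flight too costly
]
BEACH_PLACES = {"Crete", "Hammamet", "Malaga", "Nice"}
FLIGHTS = {
    ("Crete", "date1"): ("Olympic", 180),
    ("Hammamet", "date2"): ("Alitalia", 150),
    ("Malaga", "date3"): ("EasyJet", 90),
    ("Zurich", "date4"): ("DemoAir", 120),
    ("Nice", "date5"): ("DemoAir", 320),
}

def conf_search(topic):
    """Stand-in for ConfSearch: yields (conference, place, date) bindings."""
    for name, conf_topic, place, date in CONFERENCES:
        if conf_topic == topic:
            yield name, place, date

def tour_search(kind, place):
    """Stand-in for TourSearch: is there an attraction of this kind nearby?"""
    return kind == "Beach" and place in BEACH_PLACES

def flight_search(max_cost, place, date):
    """Stand-in for Flight: a flight strictly cheaper than max_cost, or None."""
    flight = FLIGHTS.get((place, date))
    if flight is not None and flight[1] < max_cost:
        return flight
    return None

def run_plan(topic, max_cost):
    # Pipelining: each conference binding flows on to the next services
    # as soon as it is produced, and combinations are emitted one by one.
    for conf, place, date in conf_search(topic):
        if not tour_search("Beach", place):
            continue
        flight = flight_search(max_cost, place, date)
        if flight is None:
            continue
        yield f"{conf}-{place}-{flight[0]}"

print(list(run_plan("DB", 200)))
```

Because `run_plan` is a generator, a caller can stop after the first few combinations, which matches the goal of producing a few top results quickly.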

Search Computing architecture: incremental prototyping
Prototype 4: NL or keyword queries
Prototype 3: Ontology-driven search
  Ontological query interpretation
  Ontological description & annotation of services
Prototype 2: Vertical solutions
  ER Domain description
  Query planner
  Application design tools
Prototype 1: Core behaviour of the system
  Query engine
  Domain repository
  Service repository
  Result presentation
Legend: layering; <Uses> relation