Presented by: Siddhant Kulkarni Spring 2016. Authors: Publication:  ICDE 2015 Type:  Research Paper 2.

Presented by: Siddhant Kulkarni Spring 2016

Authors: Publication:  ICDE 2015 Type:  Research Paper 2

 Text Segmentation  As per previous slide  POS Tagging  Part of Speech Tagging  Context Assignment to words  jazz is popular in this country  he likes jazz more than country  Concept Labelling  Use of models to assign meaning  read harry potter <- BOOK  watch harry potter <- MOVIE 4

5 Build a graph -> Weigh the terms using acquired knowledge -> Maximum Spanning Tree for Segmentation

 Pairwise model for POS  Don’t JUST look at the adjacent terms 6

Presented by: Zohreh Raghebi Spring 2016:

 The explosion of a vast number of location aware services  Predictive queries offer a fundamental type of location-based services  based on users’ future locations  Predictive range query  “find all hotels that are located within two miles from a user’s anticipated location after 30 minutes“  Predictive KNN query  “find the three taxis that are closest to a user’s location within the next 10 minutes“  Predictive aggregate query  “find the number of cars expected to be around the stadium during the next 20 minutes“ 9

 (1) Traffic management  To predict areas with high traffic in the next half hour  (2) Location aware advertising  To distribute coupons and sales promotions to customers more likely to show up around a certain store during the sale time in the next hour  (3) Routing services  To take into consideration the predicted traffic on each road segment to find the shortest path of a user’s trip starting after 15 minutes from the present time 10

 (4) Ride sharing systems  To match the drivers that will pass by a rider’s location within few minutes  (5) Store finders  To recommend the closest restaurants to a user’s predicted destination in 15 minutes 11

 To process predictive queries for moving objects on road networks efficiently  Index structure, named the Predictive tree (P-tree)  Proposed to precompute the predicted moving objects around each node in the underlying road network graph over time 12

 (1) They consider an Euclidean space where objects can move freely in a two dimensional space  Yet, practical predictive location-based services target moving objects on road networks  (2) Many techniques using a massive amount of objects’ historical trajectories  Historical data is not easily obtainable for many reasons  Users’ privacy and data confidentiality  Unavailability of historical data in rural areas  (3) To support a specific query type only  Predictive range query  Predictive KNN query  Predictive aggregate query 13

 A novel data structure named Predictive tree (P-tree)  We introduce a probability model that computes the likelihood of a node in the road network being a destination to a moving object  The probability assignment is made  Node’s position in the tree  Travel time between the object and this node  Number of the sibling nodes  Provide an incremental approach to update the predictive tree  As objects move around  Propose the iRoad framework that leverages the introduced predictive tree  To support a wide variety of predictive queries 14

 At each node in the given road network  We keep track of a list of objects predicted to appear in this node,  along with their probabilities, and travel time  When an object moves  we incrementally update the predictive tree  This update is reflected to the original nodes in the road network.  When a predictive query is received  We fetch the up-to-date answer from the node of interest  Compile it according to the query type. 15

Presented by: Zohreh Raghebi Spring 2016:

 Bump hunting is a common approach to extracting insights from data  Looking for regions of a dataset  Where a property of interest occurs frequently.  In this paper, we apply that approach on graphs  A subset of the nodes exhibit a property of interest  The goal is to find a connected subgraph (the “bump”)  Query nodes appear more often compared to non-query nodes 18

 We find such a subgraph by maximizing the linear discrepancy  The possible weighted difference between the number of query and non-query nodes in the subgraph  we consider the problem under a local-access model  we assume that only query nodes are provided as input  while all other nodes and edges can be discovered only via calls to a costly get- neighbors function from a previously discovered node  For example, accessing the social graph of an online social network (e.g., Twitter) is based on such a function 19

 When one submits a text query to Twitter’s search engine  Twitter returns a list of messages that match the query, along with the author of each message.  For example, when one submits “Ukraine,” Twitter returns all recent messages that contain that keyword, together with the users who posted them.  We wish to perform the following task:  Among all Twitter users who posted a relevant message,  Find a subset of them that form a local cluster on Twitter’s social network.  Our goal is to discover a community of users who talk about that topic  Our input consists only of those users who have recently published a relevant message  Note that we do not have the entire social graph  We can only retrieve the social connections of a user by submitting a query to Twitter’s API 20

 Consider an online library system that allows its users to perform text search on top of a bibliographic database.  Upon receiving a query, the system returns a list of authors that match the query.  For example, when one submits the keyword “discrepancy,”  The system returns a list of authors who have published about the topic 21

 We wish to perform the following task:  Among all authors in this particular result set, identify a subset of similar authors  Nodes represent authors, and edges indicate pairs of authors whose similarity exceeds a user-defined threshold  As the similarity threshold is user-defined the graph cannot be pre-computed.  An all-pairs similarity join to materialize the graph at query time is impractical  We can thus solve the problem without materializing the entire similarity graph 22

 To devise algorithms that, given as input a set of query nodes  Find the maximum-discrepancy connected subgraph  Using as few calls to the get-neighbors function as possible  Consider the discrepancy-maximization problem under an unrestricted-access model  The algorithm can access nodes and edges of the graph in an arbitrary manner 23

 The solution for the local-access model based on a two-phase approach  In the first phase, we make calls to the get-neighbors function to retrieve a subgraph  In many cases, the optimal subgraph can be found without retrieving the entire graph  In the second phase, we can use any algorithm for the unrestricted-access model on the retrieved subgraph 24

 Prove that the problem of linear-discrepancy maximization on graphs is NP-hard in the general case  Prove that the problem has a polynomial-time solution when the graph is a tree  To define fast heuristic algorithms to produce solutions 25

Presented by: Siddhant Kulkarni Spring 2016. Authors: Publication:  ICDE 2015 Type:  Research Paper 2.

Similar presentations

Presentation on theme: "Presented by: Siddhant Kulkarni Spring 2016. Authors: Publication:  ICDE 2015 Type:  Research Paper 2."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Presented by: Siddhant Kulkarni Spring 2016. Authors: Publication:  ICDE 2015 Type:  Research Paper 2.

Similar presentations

Presentation on theme: "Presented by: Siddhant Kulkarni Spring 2016. Authors: Publication:  ICDE 2015 Type:  Research Paper 2."— Presentation transcript:

Similar presentations

About project

Feedback