1 A DATA MINING APPROACH FOR LOCATION PREDICTION IN MOBILE ENVIRONMENTS* by Gökhan Yavaş Feb 22, 2005 *: To appear in Data and Knowledge Engineering, Elsevier.

2 Outline
- Introduction
- Background Work
- Mobility Prediction Based On Mobility Rules
- Experimental Results
- Conclusion
- Future Work

3 Introduction
- Personal Communication Systems (PCS) are becoming more popular
- Dynamic relocation of users gives rise to the problem of Mobility Management
  - Methods for storing and updating the location information of users
- Mobility Prediction: the prediction of a user's next inter-cell movement

4 Motivation
- Predicted movement can be used to allocate resources effectively instead of blindly allocating excessive resources
- Broadcast program generation benefits as well [1]: data items can be broadcast to the predicted cell
- Location prediction is crucial in processing location-dependent queries [2], since the answer depends on the location of the user
- Queries that depend on future positions can be answered through effective location prediction

[1] Y. Saygin and O. Ulusoy. Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments. IEEE Transactions on Knowledge and Data Engineering, 14(6): 1387-1399, 2002.
[2] G. Gok and O. Ulusoy. Transmission of Continuous Query Results in Mobile Computing Systems. Information Sciences, 125(1-4): 37-63, 2000.

5 Network Model
- The PCS network is partitioned into smaller areas called cells
- Each cell has a Base Station (BS), used for broadcasting and receiving information
- Home Location Register (HLR): a database that keeps the inter-cell movement history of a user
- Visitor Location Register (VLR): a database at each BS that keeps the profiles of the users located in that cell

6 Problem Definition
- The movement history of a mobile user can be obtained from the user's HLR
- Movement trajectories have the form T = [sequence not preserved in the transcript]
- Trajectories are partitioned into subsequences, named user actual paths (UAPs)
- UAPs have the form U = [sequence not preserved in the transcript]
- We mine UAPs to find user mobility patterns (UMPs)

7 Related Work
- The roots of our method go back to the Apriori algorithm [3]
  - Association rule mining
- Sequential pattern mining problem [4]
  - Ordering of the items in an itemset must be taken into consideration
  - Not appropriate for our domain, because it does not take the network topology into account

[3] R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules. In Proceedings of the Very Large Databases Conference (VLDB'94), pages 487-499, 1994.
[4] R. Agrawal and R. Srikant. Mining Sequential Patterns. In Proceedings of the IEEE Conference on Data Engineering (ICDE'95), pages 3-14, 1995.

8 Mobility Prediction Based On Mobility Rules
1. Mining UMPs from Graph Traversals: movement data is mined to discover regularities (UMPs) in inter-cell movements
2. Generation of Mobility Rules: mobility rules are extracted from the UMPs
3. Mobility Prediction: the next inter-cell movement is predicted based on the mobility rules

9 Mining UMPs from Graph Traversals
- Vertices of G: the cells in the coverage region
- Edges of G: if two cells, A and B, are neighbors in the coverage region, then there are two edges in G, A → B and B → A
[Figure: an example coverage region and the corresponding graph G]
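The graph construction described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `build_cell_graph`, the integer cell ids, and the list of neighboring pairs are all hypothetical.

```python
from collections import defaultdict

def build_cell_graph(neighbor_pairs):
    """Build the directed graph G as an adjacency map.

    Every pair of neighboring cells (a, b) contributes two directed
    edges, a -> b and b -> a, as stated on the slide.
    """
    graph = defaultdict(set)
    for a, b in neighbor_pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

# Hypothetical coverage region with four cells arranged in a ring.
pairs = [(0, 1), (1, 2), (2, 3), (0, 3)]
G = build_cell_graph(pairs)
```

Storing out-neighbors per cell makes it cheap to look up the cells reachable in one move, which is what candidate generation needs later.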

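10 Mining UMPs from Graph Traversals
- Subsequence definition: assume we have two UAPs, A and B [sequences not preserved in the transcript]. B is a subsequence of A iff all cells in B also exist in A and appear there in the same order as in B
- Example: for a given UAP A, a pattern B made of two cells of A in their original order is a length-2 subsequence of A [concrete sequences not preserved in the transcript]. In other words, B is contained by A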
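The containment test in this definition can be sketched in a few lines. The function name `is_subsequence` and the example cell ids are illustrative assumptions, since the slide's own sequences did not survive in the transcript.

```python
def is_subsequence(b, a):
    """Return True if every cell of b occurs in a in the same relative order.

    The iterator over a is consumed left to right, so each cell of b
    must be found after the position where the previous one matched.
    """
    remaining = iter(a)
    return all(cell in remaining for cell in b)

# Hypothetical UAP and patterns.
uap = [3, 4, 0, 1, 6, 5]
contained = is_subsequence([4, 5], uap)      # order preserved
not_contained = is_subsequence([5, 4], uap)  # order reversed
```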

11 Mining UMPs from Graph Traversals
- Every candidate has a count value that keeps the support given to this candidate by the UAPs
- This is the point where our work extends the algorithm in [5, 6]
- The method in [5, 6] increments the count value of a candidate by 1 if the candidate is contained by a UAP
- This is unfair, because it treats in the same way
  - a highly corrupted candidate pattern
  - a slightly corrupted (or not corrupted at all) candidate pattern

[5] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. A Data Mining Algorithm for Generalized Web Prefetching. IEEE Transactions on Knowledge and Data Engineering, 15(5): 1155-1169, 2003.
[6] A. Nanopoulos, D. Katsaros, Y. Manolopoulos. Effective Prediction of Web User Accesses: A Data Mining Approach. In Proceedings of the WebKDD Workshop (WebKDD'01), 2001.

12 Mining UMPs from Graph Traversals
- The degree of corruption should be considered in the mobile motion prediction context
- suppInc: the support assigned to a candidate pattern B by a UAP A [formula not preserved in the transcript]

13 Mining UMPs from Graph Traversals
- The totDist value is defined by means of the notion of string alignment
- Definition 2.1: if x and y are each a single character or a space, then σ(x, y) denotes the score of aligning x and y. In our case, the scoring function is defined as follows: [formula not preserved in the transcript]

14 Mining UMPs from Graph Traversals
- Definition 2.3: let A be a UAP and B be a pattern. A containment alignment X' maps A and B into strings A' and B' where:
  - |A'| = |B'|
  - B is contained by A, and
  - removal of all spaces from A' and B' leaves A and B
- Total score of the alignment X': [formula not preserved in the transcript]

15 Mining UMPs from Graph Traversals
- For any two patterns, there may be more than one alignment
- Example: [the sequences A and B and their alignments are not preserved in the transcript]

16 Mining UMPs from Graph Traversals
- Definition 2.4: an optimal containment alignment of UAP A and pattern B is one that has the minimum possible total score for these two sequences
  - Total score of an alignment: the sum of its penalties
  - An optimal alignment has the minimum number of mismatches, and therefore the minimum alignment score
- totDist(A, B) = score of the optimal alignment for the UAP A and pattern B

17 Mining UMPs from Graph Traversals
- Example: given a UAP A and a pattern B [sequences and alignment not preserved in the transcript], the optimal containment alignment gives:
  - Score of the alignment = totDist(A, B) = 3
- Support assigned to the candidate pattern B by the UAP A: [formula not preserved in the transcript]
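Because the scoring function and the suppInc formula are missing from this transcript, the following is only a sketch under stated assumptions. It assumes that each UAP cell skipped between the first and last matched cell of the pattern costs 1, that cells outside that span cost nothing, and that the support increment is 1 / (1 + totDist). The names `tot_dist` and `supp_inc` and the example cells are hypothetical.

```python
def tot_dist(uap, pattern):
    """Smallest number of UAP cells interleaved inside an embedding of pattern.

    ASSUMPTION: a skipped cell between the first and last matched cell
    costs 1; this scoring is not taken from the slides.
    Returns None when pattern is not contained in uap.
    """
    if not pattern:
        return 0
    best = None
    for start in range(len(uap)):
        if uap[start] != pattern[0]:
            continue
        matched, last = 1, start
        # Greedily match the rest of the pattern as early as possible,
        # which minimises the end index for this start position.
        for i in range(start + 1, len(uap)):
            if matched == len(pattern):
                break
            if uap[i] == pattern[matched]:
                matched += 1
                last = i
        if matched == len(pattern):
            gaps = (last - start + 1) - len(pattern)
            if best is None or gaps < best:
                best = gaps
    return best

def supp_inc(uap, pattern):
    """ASSUMED form of the support increment: 1 / (1 + totDist), else 0."""
    dist = tot_dist(uap, pattern)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)

# Hypothetical UAP: three cells (0, 1, 6) lie between 4 and 5.
uap = [3, 4, 0, 1, 6, 5]
dist = tot_dist(uap, [4, 5])
support = supp_inc(uap, [4, 5])
```

Under these assumptions an uncorrupted occurrence contributes a full unit of support, and each inserted cell shrinks the contribution, which is the behaviour the slides argue for.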

18 Mining UMPs from Graph Traversals
- The quality of the patterns improves, since this method counts support more accurately
  - The degree of corruption is taken into account
- This gives rise to more accurate mobility rules
- As a result, prediction accuracy improves compared to the accuracy obtained with rules generated by the former way of support counting
- Applying different methods for totDist will affect the quality of the rules

19 Mining UMPs from Graph Traversals
- Candidate generation, example: C = [sequence not preserved in the transcript], whose last cell is c_k
- N+(c_k): the set of all nodes in G that have an incoming edge from the cell c_k
- A cell from N+(c_k) is attached to the end of C to generate C'
- C' is added to the set of candidates
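A minimal sketch of this candidate generation step follows. The adjacency map, the tuple representation of patterns, and the name `generate_candidates` are assumptions made for illustration.

```python
def generate_candidates(patterns, graph):
    """Extend each pattern C by every cell in N+(c_k), c_k being its last cell.

    graph maps a cell to the set of cells that have an incoming edge
    from it, i.e. its out-neighbors in G.
    """
    candidates = []
    for pattern in patterns:
        last_cell = pattern[-1]
        for next_cell in sorted(graph.get(last_cell, ())):
            candidates.append(pattern + (next_cell,))
    return candidates

# Hypothetical graph fragment and one length-2 pattern.
graph = {0: {1}, 1: {0, 2, 3}, 2: {1}, 3: {1}}
new_candidates = generate_candidates([(0, 1)], graph)
```

Restricting extensions to out-neighbors is what ties the mining to the network topology: a candidate can only contain moves that are physically possible between adjacent cells.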

20 Mining UMPs from Graph Traversals
- Can Apriori pruning be used? No, due to the nature of our new support counting method
- Support is no longer monotonically decreasing with the increasing size of the pattern
- A length-(k-1) subpattern S of a length-k pattern P does not need to be large even if P is large
- Example: a UAP may assign different supports to a pattern P_1 and to its subpattern P_2 [sequences and support values not preserved in the transcript]
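A small worked case makes the loss of monotonicity concrete. It is a hypothetical illustration, not the slide's own example, and it assumes the support increment has the form 1 / (1 + totDist), which this transcript does not confirm. Take the UAP A = ⟨1, 2, 3⟩, the pattern P_1 = ⟨1, 2, 3⟩, and its subpattern P_2 = ⟨1, 3⟩. P_1 matches A with no skipped cell, while P_2 skips cell 2:

```latex
\[
\mathrm{totDist}(A, P_1) = 0
\;\Rightarrow\;
\mathrm{suppInc}(A, P_1) = \frac{1}{1 + 0} = 1,
\qquad
\mathrm{totDist}(A, P_2) = 1
\;\Rightarrow\;
\mathrm{suppInc}(A, P_2) = \frac{1}{1 + 1} = 0.5
\]
```

Under that assumption the shorter pattern receives less support than the longer one that contains it, so a large pattern can have a subpattern that is not large, and pruning by subpattern support would discard valid candidates.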

21 Mining UMPs from Graph Traversals
- Example: use supp_min = 1.33
[Tables: database of UAPs; set of all large patterns (UMPs)]

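22 Generation of Mobility Rules
- Rules are extracted from the UMPs
- Each rule R consists of a head and a tail [rule form not preserved in the transcript]
- A confidence value is calculated for each rule: [formula not preserved in the transcript]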

23 Generation of Mobility Rules
- The rules whose confidence is higher than conf_min are selected
- All possible mobility rules for the UMPs given in the previous example are:
[Table: mobility rules and their confidence values]
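Rule generation can be sketched as below. The confidence formula is missing from the transcript, so this sketch assumes the usual association-rule definition, support of the whole pattern divided by support of the head, expressed as a percentage. The function name, the dictionary of supports, and the numbers are hypothetical.

```python
def mobility_rules(ump_support, conf_min):
    """Split every UMP into head -> tail rules and filter by confidence.

    ump_support maps a pattern (tuple of cells) to its support.
    ASSUMPTION: confidence = 100 * support(pattern) / support(head).
    Returns tuples (head, tail, confidence, support).
    """
    rules = []
    for pattern, support in ump_support.items():
        for cut in range(1, len(pattern)):
            head, tail = pattern[:cut], pattern[cut:]
            head_support = ump_support.get(head)
            if not head_support:
                continue  # head is not a mined UMP, so no confidence is defined
            confidence = 100.0 * support / head_support
            if confidence > conf_min:  # "higher than conf_min", as on the slide
                rules.append((head, tail, confidence, support))
    return rules

# Hypothetical UMPs with their supports.
supports = {(1,): 4.0, (1, 2): 3.0, (1, 2, 3): 1.5}
rules = mobility_rules(supports, conf_min=70)
```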

24 Mobility Prediction
- The user has followed a path P up to now, ending at cell c_{i-1} [sequence not preserved in the transcript]
- Find the rules whose head parts are contained in P and whose head ends with c_{i-1}
- Store the first cell of each tail along with the (confidence + support) of the rule as a tuple
- Sort these tuples by their (confidence + support) values in descending order
- Select the first m tuples

25 Mobility Prediction
- Example: assume a current trajectory P of the user [sequence not preserved in the transcript]
- Matching rules: [not preserved in the transcript]
- The sorted tuple array is: TupleArray = [(5, 85.83), (0, 76.5)]
- If m = 1, then Predicted Cells Set = {5}
- If m = 2, then Predicted Cells Set = {5, 0}
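The prediction procedure can be sketched as follows. The trajectory and the rule heads below are hypothetical, because the transcript lost them, and the split of each score into confidence and support is invented so that the sums reproduce the tuple values 85.83 and 76.5 shown on the slide.

```python
def is_contained(head, path):
    """True if the cells of head appear in path in the same order."""
    remaining = iter(path)
    return all(cell in remaining for cell in head)

def predict_next_cells(path, rules, m):
    """Return up to m predicted cells for the current path.

    rules is an iterable of (head, tail, confidence, support) tuples.
    A rule matches when its head is contained in path and ends at the
    user's current cell.
    """
    scored = []
    for head, tail, confidence, support in rules:
        if head[-1] == path[-1] and is_contained(head, path):
            scored.append((tail[0], confidence + support))
    scored.sort(key=lambda item: item[1], reverse=True)
    return [cell for cell, _ in scored[:m]]

# Hypothetical trajectory and rules.
path = [2, 3, 4]
rules = [
    ((3, 4), (0,), 75.0, 1.5),   # score 76.5
    ((4,), (5,), 83.33, 2.5),    # score 85.83
    ((1,), (7,), 90.0, 3.0),     # head does not end at the current cell
]
top1 = predict_next_cells(path, rules, m=1)
top2 = predict_next_cells(path, rules, m=2)
```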

26 Simulation Design
- Mobile users travel on a 15 by 15 network of hexagonal cells
- To generate UAPs, UMPs are generated first
- UMPs are taken as random walks over the network
- Two types of UAPs:
  - Outliers: a random walk over the network
  - Non-outliers: those which follow a UMP
- o (outlier percentage): ratio of the number of outliers to the number of non-outliers

27 Simulation Design
- Corruption mechanism: insert random cells between the consecutive cells of a UMP
- c (corruption ratio): the ratio of the number of such random cells to the number of cells in the corresponding UMP
- Three possible outcomes of a prediction:
  - Correct prediction
  - Incorrect prediction
  - No prediction
- Two performance measures, precision and recall: [formulas not preserved in the transcript]
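Since the formulas for the two measures are missing here, the sketch below assumes one common reading of the three outcomes: precision is the share of correct predictions among the predictions actually made, and recall is the share of correct predictions among all prediction requests. The function name and the outcome labels are hypothetical.

```python
def precision_recall(outcomes):
    """Compute (precision, recall) from per-request outcomes.

    outcomes is a list of "correct", "incorrect", or "none" labels.
    ASSUMED definitions, not taken from the slides:
      precision = correct / (correct + incorrect)
      recall    = correct / total number of requests
    """
    correct = outcomes.count("correct")
    incorrect = outcomes.count("incorrect")
    made = correct + incorrect
    precision = correct / made if made else 0.0
    recall = correct / len(outcomes) if outcomes else 0.0
    return precision, recall

# Hypothetical run of four prediction requests.
precision, recall = precision_recall(["correct", "incorrect", "none", "correct"])
```

Under this reading a method that declines to predict when no rule matches keeps its precision but loses recall, which is consistent with the comparison against TM reported later in the slides.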

28 Algorithms Used for Comparison
- Mobility Prediction Based on Transition Matrix (TM)
  - A cell-to-cell transition matrix is formed
  - The m most probable cells are selected from the transition matrix
- Ignorant Prediction
  - m neighboring cells of the current cell are selected at random
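The TM baseline can be sketched as a table of transition counts. This is an illustrative reading of the slide rather than the implementation used in the experiments; the function names and the training UAPs are hypothetical.

```python
from collections import Counter, defaultdict

def build_transition_counts(uaps):
    """Count how often each cell-to-cell transition occurs in the UAPs."""
    counts = defaultdict(Counter)
    for uap in uaps:
        for current_cell, next_cell in zip(uap, uap[1:]):
            counts[current_cell][next_cell] += 1
    return counts

def tm_predict(counts, current_cell, m):
    """Return the m most frequent successors of current_cell."""
    return [cell for cell, _ in counts[current_cell].most_common(m)]

# Hypothetical training data: cell 2 is followed by 3 twice and by 4 once.
counts = build_transition_counts([[1, 2, 3], [1, 2, 4], [1, 2, 3]])
tm_top1 = tm_predict(counts, 2, m=1)
tm_top2 = tm_predict(counts, 2, m=2)
```

Unlike the rule-based method, this baseline looks only at the current cell, so it can answer any request for a cell seen in training.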

29 Impact of m on Precision and Recall
- Precision decreases for both our algorithm and TM
  - The probability of making some incorrect predictions increases as m increases
- Recall increases for all algorithms, but the increase is more significant for TM and Ignorant Prediction

30 Impact of m on Precision and Recall
- Setting m as small as possible suits our method
- For TM, the increase in recall is largest when m goes from 1 to 2
- m ≥ 3 would cause excessive waste of network resources
- Thus choose m = 2

31 Impact of supp_min
- Recall and precision are reduced as supp_min increases
- An increase in the supp_min value leads to a decrease in the number of mined mobility rules
- The number of correct predictions is reduced
- Choose supp_min = 0.1

32 Impact of conf_min
- Precision increases
  - Rule quality rises with increasing conf_min
  - The number of predictions decreases faster than the number of correct predictions
- Recall decreases
  - The number of mined rules is reduced, leading to a decrease in the number of correct predictions
- Choose conf_min = 80

33 Impact of Corruption Factor
- Precision and recall decrease for our method and TM
- For all c, our method has better precision than TM but worse recall than TM
- For our method, as c increases:
  - The number of mined mobility rules decreases
  - In some cases no prediction is made, because the corrupted UAPs leave no matching rules

34 Impact of Outlier Percentage
- Neither performance measure is affected significantly for any of the methods
- Rules extracted from outlier UAPs are rarely used, so they do not reduce recall and precision significantly

35 Conclusion
- A data mining algorithm for the prediction of user movements in a mobile computing system
- The algorithm is based on
  - Mining the mobility patterns of users
  - Then forming mobility rules from these patterns
  - Finally predicting a mobile user's next movements by using the mobility rules
- Good performance when compared to the Ignorant Method

36 Conclusion
- Performance when compared to TM:
  - Better precision: more accurate predictions
    - Most of the predictions made at each request are correct
  - Worse recall: our method may not make a prediction in response to some of the prediction requests
    - There may not be any matching rule for the current trajectory of the user when a prediction request is made

37 Future Work
- For calculating the totDist value, our method:
  - Decreases the support given to a pattern by a UAP as the number of corrupted cells in the pattern increases
  - Other methods may be employed for calculating the totDist value
- The time domain of the mobility patterns and mobility rules is not considered
  - In real life, mobility patterns might be related to time
  - Some specific rules may be valid only for a specific time interval
  - Extend our algorithm to include the time domain of mobility rules
- A candidate pruning criterion suitable for our support counting method may be employed

38 Questions & Comments

