Mining Trajectory Profiles for Discovering User Communities Speaker : Chih-Wen Chang National Chiao Tung University, Taiwan 2009.11.03 Chih-Chieh Hung,


1 Mining Trajectory Profiles for Discovering User Communities
Chih-Chieh Hung, Chih-Wen Chang, Wen-Chih Peng
Speaker: Chih-Wen Chang
National Chiao Tung University, Taiwan
2009.11.03

2 Outline
Motivation
Goal
Framework
–Preprocess
–Construct User’s Profiles
–Formulate Distance Function
–Identify Community
Experiments
Conclusion

3 Motivation (1/2)
With the rapid development of positioning techniques, users can easily collect their trajectories
–GPS loggers, smart phones, and navigation devices

4 Motivation (2/2)
Many GPS community sites have been established
–Users can share their own trajectories
–Users can search trajectories
[Figure labels: My tracks, Every Trail, Query]

5 Goal
Mine user communities from raw trajectories
–User communities: sets of users who have similar moving behaviors
Applications
–Find new friends
–Recommendation
–Ranking of trajectories

6 [Figure labels: Profile; Measure Distance Between Users; Community 1; Community 2]
1. Construct user’s profile
2. Formulate distance function
3. Identify user communities

7 Outline
Motivation
Goal
Framework
–Preprocess
–Construct User’s Profiles
–Formulate Distance Function
–Identify Community
Experiments
Conclusion

8 Framework
Preprocess → Construct User’s Profile → Measure Distance Between Users → Identify Community

9 Preprocessing
Step 1:
–Find frequent regions
  Input: all trajectories of users
  Output: frequent regions
  Density-based approach
Step 2:
–Transform trajectories into sequences of frequent region ids
  T1 :
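
The two preprocessing steps can be sketched in Python as follows. The slide only says "density-based approach", so this sketch swaps in a simple grid count as a stand-in: a grid cell is treated as a frequent region when it holds at least `min_points` points. The names `frequent_regions`, `to_region_sequence`, `cell_size`, and `min_points`, and the sample coordinates, are all assumptions made for illustration and do not come from the presentation.

```python
from collections import Counter

def frequent_regions(trajectories, cell_size, min_points):
    """Return {grid_cell: region_id} for cells holding at least min_points points.

    Assumption: a fixed grid stands in for the density-based method on the slide.
    """
    counts = Counter(
        (int(x // cell_size), int(y // cell_size))
        for traj in trajectories
        for x, y in traj
    )
    dense = sorted(cell for cell, c in counts.items() if c >= min_points)
    return {cell: rid for rid, cell in enumerate(dense)}

def to_region_sequence(traj, regions, cell_size):
    """Map a trajectory to region ids, skipping non-frequent cells and immediate repeats."""
    seq = []
    for x, y in traj:
        rid = regions.get((int(x // cell_size), int(y // cell_size)))
        if rid is not None and (not seq or seq[-1] != rid):
            seq.append(rid)
    return seq

# Hypothetical sample data: two short trajectories of (x, y) points.
trajectories = [
    [(0.1, 0.2), (0.4, 0.3), (1.2, 0.1), (2.5, 2.5)],
    [(0.3, 0.3), (1.5, 0.4), (1.7, 0.2)],
]
regions = frequent_regions(trajectories, cell_size=1.0, min_points=3)
sequences = [to_region_sequence(t, regions, 1.0) for t in trajectories]
```

With this sample data, the isolated point at (2.5, 2.5) falls in a sparse cell and is dropped, so both trajectories reduce to the same sequence of two region ids.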

10 Framework
Preprocess → Construct User’s Profile → Measure Distance Between Users → Identify Community

11 Construct User’s Profiles (1/2)
User’s profile
–Probabilistic Suffix Tree (abbreviated as PST)
  Finds and organizes trajectory patterns
  Records the probability of next movements
[Figure labels: frequently moving sequence; conditional tables (next possible movements)]

12 Construct User’s Profiles (2/2)
Construct PST
–Level by level
–Two operations:
  Create a child node
  –The count of the Before symbol > MinSup
  Add a symbol into the related conditional table
  –The count of the After symbol > MinSup
Example (MinSup = 0.2)
  Sequences: ABE, ABA, AC, B, ADF, H, JHI, EDH
  Tree nodes: root; A: 0.5; B: 0.375; AB: 0.25
  Before symbol of node B: A : 2 → 2/3 × 0.375 = 0.25
  After symbols of node B: A : 1 → 1/2 = 0.5; E : 1 → 1/2 = 0.5
  Conditional table of node B:
  SID  Count  C. Prob.
  A    1      0.5
  E    1      0.5
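
The arithmetic in the node B example can be reproduced with a short Python sketch. It is not a full PST builder. It assumes that a symbol's probability is the fraction of sequences containing it, an interpretation chosen because it matches the slide's values A: 0.5 and B: 0.375, and that MinSup is compared against the resulting child probability. The function names are hypothetical.

```python
from collections import Counter

def symbol_support(sequences):
    """Fraction of sequences that contain each symbol at least once (assumed definition)."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))
    return {sym: c / len(sequences) for sym, c in counts.items()}

def neighbors(sequences, node):
    """Count occurrences of `node` plus the symbols directly before and after it."""
    before, after, occurrences = Counter(), Counter(), 0
    k = len(node)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            if seq[i:i + k] == node:
                occurrences += 1
                if i > 0:
                    before[seq[i - 1]] += 1
                if i + k < len(seq):
                    after[seq[i + k]] += 1
    return occurrences, before, after

sequences = ["ABE", "ABA", "AC", "B", "ADF", "H", "JHI", "EDH"]
min_sup = 0.2
support = symbol_support(sequences)
occurrences, before, after = neighbors(sequences, "B")

# Operation 1: create a child node for each Before symbol that clears MinSup.
children = {}
for sym, cnt in before.items():
    prob = cnt / occurrences * support["B"]  # 2/3 x 0.375 for "AB"
    if prob > min_sup:
        children[sym + "B"] = prob

# Operation 2: conditional table of node B built from the After symbols.
total_after = sum(after.values())
cond_table = {sym: cnt / total_after for sym, cnt in after.items()}
```

Under these assumptions the only child created below B is AB, and the conditional table of B gives A and E equal weight, as on the slide.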

13 Framework
Preprocess → Construct User’s Profile → Measure Distance Between Users → Identify Community

14 Formulate Distance Function (1/3)
Determine the distance between users
1. Transform the PST into a moving sequence list
  Each element of the moving sequence list is a branch of the PST together with its probability
  L_1[1..2] =

15 Formulate Distance Function (2/3)
2. Define the distance between PSTs
–Find the minimal dist(L_i[1..m], L_j[1..n])
–Use three editing operations
Insertion
  Before: L_1 = {m_1:0.3, m_2:0.2, m_3:0.3}, L_2 = {m_1:0.3, m_2:0.2}
  After: L_1 = {m_1:0.3, m_2:0.2, m_3:0.3}, L_2 = {m_1:0.3, m_2:0.2, m_3:0.3}
  Cost = 0.3
  [Figure labels: T1, T2, Insert, 0.2, 0.1]

16 Formulate Distance Function (3/3)
Deletion
  Before: L_1 = {m_1:0.2, m_2:0.3}, L_2 = {m_1:0.2, m_2:0.3, m_3:0.3}
  After: L_1 = {m_1:0.2, m_2:0.3}, L_2 = {m_1:0.2, m_2:0.3, ____}
  Cost = 0.3
Replacement
  Before: L_1 = {m_1:0.2, m_2:0.2, m_3:0.2}, L_2 = {m_1:0.2, m_2:0.2, m_4:0.3}
  After: L_1 = {m_1:0.2, m_2:0.2, m_3:0.2}, L_2 = {m_1:0.2, m_2:0.2, m_3:0.2}
  Cost = 0.3 + 0.2 = 0.5
  [Figure labels: T1, T2]
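
The three editing operations suggest a weighted edit distance over moving sequence lists. The sketch below is a standard dynamic-programming formulation in Python, not necessarily the authors' exact definition. The insertion, deletion, and replacement costs follow the slide examples. The cost for two elements with the same pattern but different probabilities is not given in the transcript, so the absolute difference used here is an assumption. The name `edit_distance` is hypothetical.

```python
def edit_distance(source, target):
    """Minimal cost of editing `source` into `target`.

    Each list holds (pattern, probability) pairs.
    Insertion or deletion costs the probability of the affected element;
    replacement costs the sum of both probabilities (as in the slide examples).
    """
    m, n = len(source), len(target)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = dp[i - 1][0] + source[i - 1][1]  # delete everything
    for j in range(1, n + 1):
        dp[0][j] = dp[0][j - 1] + target[j - 1][1]  # insert everything
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            (ps, ws), (pt, wt) = source[i - 1], target[j - 1]
            if ps == pt:
                change = abs(ws - wt)  # assumption: not specified on the slides
            else:
                change = ws + wt       # replacement
            dp[i][j] = min(
                dp[i - 1][j] + ws,         # delete source element
                dp[i][j - 1] + wt,         # insert target element
                dp[i - 1][j - 1] + change  # match or replace
            )
    return dp[m][n]

# The three slide examples, editing L_2 so that it equals L_1.
insertion = edit_distance(
    [("m1", 0.3), ("m2", 0.2)],
    [("m1", 0.3), ("m2", 0.2), ("m3", 0.3)],
)
deletion = edit_distance(
    [("m1", 0.2), ("m2", 0.3), ("m3", 0.3)],
    [("m1", 0.2), ("m2", 0.3)],
)
replacement = edit_distance(
    [("m1", 0.2), ("m2", 0.2), ("m4", 0.3)],
    [("m1", 0.2), ("m2", 0.2), ("m3", 0.2)],
)
```

Under these cost assumptions the three calls reproduce the slide costs of 0.3, 0.3, and 0.5.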

17 Framework
Preprocess → Construct User’s Profile → Measure Distance Between Users → Identify Community

18 Identify Community (1/4)
User community
–Users belong to the same community when δ_MLS(T_i, T_j) < threshold δ
–The number of communities is minimal
Transform the relation between PSTs into a graph
–A vertex represents a user
–An edge exists between two vertices when δ_MLS(T_i, T_j) < threshold δ
[Figure: graph with vertices O1, O2, O3, O4, O5]

19 Identify Community (2/4)
Model as a minimum clique problem
–A clique is a set of pair-wise adjacent vertices
–The goal is to cover all users with as few cliques as possible
Example
[Figure: graph with vertices O1, O2, O3, O4, O5]
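
A rough Python sketch of the graph construction and clique grouping follows. Partitioning a graph into the fewest cliques is NP-hard in general, so the sketch uses a simple greedy heuristic that is not guaranteed to find the minimum; the slides do not state which algorithm the authors use. The function names, the distance values, and the edges among O1 to O5 are hypothetical, since the figure's edges are not preserved in the transcript.

```python
def build_graph(users, dist, delta):
    """Connect two users when their profile distance is below the threshold delta."""
    adj = {u: set() for u in users}
    for i, u in enumerate(users):
        for v in users[i + 1:]:
            if dist[frozenset((u, v))] < delta:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def greedy_clique_partition(users, adj):
    """Greedy heuristic: add each user to the first community it is fully adjacent to."""
    communities = []
    for u in users:
        for group in communities:
            if all(v in adj[u] for v in group):
                group.append(u)
                break
        else:
            communities.append([u])
    return communities

# Hypothetical distances: O1, O2, O3 are mutually close, and so are O4 and O5.
users = ["O1", "O2", "O3", "O4", "O5"]
close = {("O1", "O2"), ("O1", "O3"), ("O2", "O3"), ("O4", "O5")}
dist = {
    frozenset((u, v)): (0.1 if (u, v) in close else 0.9)
    for i, u in enumerate(users)
    for v in users[i + 1:]
}
adj = build_graph(users, dist, delta=0.5)
communities = greedy_clique_partition(users, adj)
```

With these made-up distances the heuristic yields two communities, {O1, O2, O3} and {O4, O5}. The result of a greedy pass depends on the order in which users are visited.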

20 Identify Community (3/4)
Select a representative PST for each community
–Represents all PSTs in the same community
–Advantages
  Reduces storage overhead
  Speeds up query processing
  Identifies the communities of new users
[Figure labels: Representative PST; Add into ?]

21 Identify Community (4/4)
Two factors
1. Size of the representative PST
  ▪The number of tree nodes, denoted as N(T_i)
2. Distance between the selected PST and the others in the same community
  ▪The error sum, denoted as ES: the sum of the distances between the selected PST and the others
Representative PST
–Minimize
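
The two factors can be computed as in the Python sketch below. The objective that combines them is not preserved in this transcript, so the sketch takes it as a caller-supplied `score` function, and the sum used in the example is only a placeholder, not the authors' formula. All names, sizes, and distances here are hypothetical.

```python
def error_sum(candidate, community, distance):
    """ES: sum of distances between the candidate PST and the others in its community."""
    return sum(distance(candidate, other) for other in community if other != candidate)

def pick_representative(community, size, distance, score):
    """Return the member minimizing score(N, ES); `score` is supplied by the caller."""
    return min(
        community,
        key=lambda t: score(size(t), error_sum(t, community, distance)),
    )

# Hypothetical community of three PSTs, identified by name.
sizes = {"T1": 5, "T2": 3, "T3": 8}  # N(T_i): number of tree nodes
pair_dist = {
    frozenset(("T1", "T2")): 0.2,
    frozenset(("T1", "T3")): 0.4,
    frozenset(("T2", "T3")): 0.3,
}

representative = pick_representative(
    ["T1", "T2", "T3"],
    size=lambda t: sizes[t],
    distance=lambda a, b: pair_dist[frozenset((a, b))],
    score=lambda n, es: n + es,  # placeholder objective, an assumption
)
```

In this example T2 has both the fewest nodes and the smallest error sum, so it would be selected under any objective that increases with both factors.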

22 Outline
Motivation
Goal
Framework
–Preprocess
–Construct User’s Profiles
–Formulate Distance Function
–Identify Community
Experiments
Conclusion

23 Experiments (1/4)
Simulator model
–Use real trajectories from CarWeb to simulate the group mobility of users
  Total: 2400 trajectories

24 Experiments (2/4)
Compare to the Generalized Sequential Pattern mining algorithm (GSP)
–Set of sequential patterns
  Ex. sp_1, sp_2, ..., sp_n
–Trajectory profile of a user represented as a vector
–Distance function between profiles
  Cosine similarity measurement: similarity(V_i, V_j) = \frac{V_i \cdot V_j}{|V_i| \, |V_j|}
[Figure: example similarity computation]
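
The cosine similarity measurement used for the GSP baseline can be written in a few lines of Python. The sketch assumes each profile is a numeric vector with one entry per sequential pattern; the exact entries (for example, pattern support or a 0/1 indicator) are not shown in the transcript, so the 0/1 vectors below are made up for illustration.

```python
import math

def cosine_similarity(v_i, v_j):
    """Cosine of the angle between two equal-length profile vectors."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm_i = math.sqrt(sum(a * a for a in v_i))
    norm_j = math.sqrt(sum(b * b for b in v_j))
    if norm_i == 0 or norm_j == 0:
        return 0.0  # convention chosen here for empty profiles
    return dot / (norm_i * norm_j)

# Hypothetical profiles over four sequential patterns sp_1..sp_4 (1 = pattern present).
user_a = [1, 0, 1, 1]
user_b = [1, 1, 0, 1]
sim = cosine_similarity(user_a, user_b)  # 2 shared patterns, each norm is sqrt(3)
```

Two users with identical pattern sets score 1, and users with no pattern in common score 0.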

25 Experiments (3/4)
Impact of trajectory profiles
–GSP profiles are always larger than PST profiles, especially when MinSup is smaller than 0.15
[Figure panels: Storage, Prediction]

26 Experiments (4/4)
Impact of the threshold δ and MinSup
–A smaller threshold δ yields more communities
[Figure panels: Storage, Prediction]

27 Outline
Motivation
Goal
Framework
–Preprocess
–Construct User’s Profiles
–Formulate Distance Function
–Identify Community
Experiments
Conclusion

28 Conclusion
Explore the problem of mining communities from trajectories
Preprocess
–Find frequent regions
–Replace trajectories by region ids
Construct User’s Profile
–Build probabilistic suffix tree (abbreviated as PST)
Measure Distance Between Users
–Formulate distance function
Identify Community
–Cluster users by distance function
–Select representative PSTs

29 THANK YOU!

