# Trajectory Segmentation (Marc van Kreveld)

Trajectory Segmentation Marc van Kreveld

Algorithms researchers …
– want their problems to be well-defined (fully specified)
– care about efficiency
– often consider themselves to be puzzlers

Jigsaw Puzzles
John finishes a 100-piece jigsaw puzzle with a lot of blue sky in 1 hour. How much time will John need for a 200-piece jigsaw puzzle with a lot of blue sky?
A. About 2 hours
B. About 3 hours
C. About 4 hours
D. More than 4 hours

Jigsaw Puzzles
Suppose you have a puzzle without any image, and you have already finished the borders. The puzzle is well-made: if a piece fits, you know it is right. Suppose every piece is normal. How many tries do you need in the worst case to finish the puzzle?
– With only one piece left: 2
– With only two pieces left: 4 + 2
– With only three pieces left: 6 + 4 + 2

Jigsaw Puzzles

| Pieces remaining | 1 | 2 | 3 | 4 | n |
|---|---|---|---|---|---|
| Tries (max) | 2 | 6 | 12 | 20 | n² + n |

What is the last entry, expressed in n? It is n² + n. On average one would need half as many tries to finish a puzzle with n pieces: ½(n² + n).

Jigsaw Puzzles
For the puzzle with 100 pieces, John tries a piece ½(100² + 100) = 5050 times, and that takes 1 hour. For the puzzle with 200 pieces, John tries a piece ½(200² + 200) = 20,100 times, which should take 20,100/5050 ≈ 3.98 hours.
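The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names `max_tries` and `average_tries` are made up for illustration, and the model (2k tries for the k-th remaining piece, half of that on average) is taken directly from the slides.

```python
def max_tries(n: int) -> int:
    """Worst-case number of tries with n pieces left: 2 + 4 + ... + 2n = n^2 + n."""
    return sum(2 * k for k in range(1, n + 1))


def average_tries(n: int) -> float:
    """Average number of tries, assumed (as on the slide) to be half the worst case."""
    return max_tries(n) / 2


# Time for 200 pieces, measured in units of "time for 100 pieces" (1 hour).
ratio = average_tries(200) / average_tries(100)
print(f"{ratio:.2f} hours")
```

Doubling the number of pieces roughly quadruples the work, which is the quadratic scaling the next slide names.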

Scalability
Most data analysis tasks work on large data sets whose size is represented by a variable n. An algorithm that performs a task will have a running time that grows with n. Algorithms that scale well are considered efficient. O-notation is used to denote scaling behavior:
– Linear is O(n)
– Quadratic is O(n²)
[Figure: running time plotted against n for O(n) and O(n²)]

Well-defined Problems
The properties of the output, expressed in the input, are fully determined. An algorithm is a sequence of steps that converts an input into an output according to the specifications.
– For a given set of numbers, put them in sorted order: O(n log n)
– For a given set of points, find the two that are closest together: O(n log n)
– For a given set of line segments, find all intersection points: O(n²)
An algorithms researcher designs the sequence of steps that is correct for the problem, and analyzes the efficiency by determining the scaling behavior.

Trajectories
A trajectory is usually represented by (collected as) a sequence of points in the plane, each with a time stamp: (x_1, y_1, t_1), (x_2, y_2, t_2), …, (x_i, y_i, t_i), …, (x_n, y_n, t_n). The moving object has to travel from point to point, so we must assume something in between: linear interpolation of position → constant speed on every edge.
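As a minimal sketch of the linear-interpolation assumption, the function below returns the position at an arbitrary time t. The name `position_at` and the representation of a trajectory as a list of `(x, y, t)` tuples with strictly increasing time stamps are assumptions made for this example.

```python
from bisect import bisect_right


def position_at(traj, t):
    """Position at time t, assuming linear interpolation between samples.

    traj: list of at least two (x, y, t) tuples with strictly increasing t.
    """
    times = [p[2] for p in traj]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the time span of the trajectory")
    # Index of the sample that ends the edge containing t.
    i = min(bisect_right(times, t), len(traj) - 1)
    x0, y0, t0 = traj[i - 1]
    x1, y1, t1 = traj[i]
    u = (t - t0) / (t1 - t0)  # fraction of the edge travelled at time t
    return (x0 + u * (x1 - x0), y0 + u * (y1 - y0))


traj = [(0, 0, 0), (10, 0, 10), (10, 20, 20)]
print(position_at(traj, 5), position_at(traj, 15))
```

Because position is linear in time on each edge, speed is constant on each edge, which the later speed-based criteria rely on.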

Trajectories What can happen if you just take the data points and not the trajectories?

What can you analyze? Simple things like travel distance, average speed, etc. Complicated things like similarity of different pieces of the same trajectory Kevin Buchin, Maike Buchin, Joachim Gudmundsson, Maarten Löffler, Jun Luo: Detecting Commuting Patterns by Clustering Subtrajectories. ISAAC (2008)

What can you analyze?
Computing travel distance, average speed, etc. is simple because:
– It is well-defined
– A simple algorithm achieves linear efficiency (scaling)
Computing similarity of different pieces of the same trajectory is complicated because:
– It is not clear how to define it precisely
– Algorithms will be longer (in code) and less efficient
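To illustrate why the simple measures are easy, here is a sketch that computes travel distance and average speed in a single linear pass. The function names and the `(x, y, t)` tuple representation are assumptions for the example, not part of the slides.

```python
import math


def travel_distance(traj):
    """Total length of the polyline through the samples; O(n) time."""
    return sum(math.dist(p[:2], q[:2]) for p, q in zip(traj, traj[1:]))


def average_speed(traj):
    """Travel distance divided by elapsed time (assumes first and last time stamps differ)."""
    return travel_distance(traj) / (traj[-1][2] - traj[0][2])


traj = [(0, 0, 0), (3, 4, 1), (3, 10, 3)]
print(travel_distance(traj), average_speed(traj))
```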

What can you analyze?
For multiple trajectories:
– An average similarity [What similarity measure?]
– A clustering [What similarity measure, optimize what?]
– The most typical trajectory [By which definition?]
– The most outlying trajectory
– Similar subtrajectories
– Flocking
– …
Kevin Buchin, Maike Buchin, Marc van Kreveld, Jun Luo: Finding long and similar parts of trajectories. SIGSPATIAL (2009)

Segmentation
Cutting a trajectory into pieces that are “similar” within the piece. Similar in: heading, speed, curvature, sinuosity, …

Why segmentation? Explaining behavior of a moving entity: one type of behavior may be characterized by similarity of movement Detecting outliers: short segments in a segmentation may be caused by outlying observations

Segmentation in other areas Image segmentation: partition a digital image in parts with similar characteristics (hopefully meaningful pieces)

Segmentation in other areas
Time series segmentation: partition time series data into pieces with similar characteristics. Assume a series of n data points is given and k segments are desired (in the example shown, n = 20 and k = 4). Minimize the sum of distances or the sum of squared distances for piecewise-constant or piecewise-linear approximations. Running time: O(n²k).
Richard Bellman: On the approximation of curves by line segments using dynamic programming. Communications of the ACM (1961)
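The O(n²k) bound can be made concrete with a standard dynamic program. The sketch below handles only the piecewise-constant case with sum of squared distances. It is a textbook-style formulation and not necessarily the one in the cited paper, and the name `segment_series` is made up for this example. Prefix sums give each segment cost in O(1), so the three nested loops account for the O(n²k) running time.

```python
def segment_series(values, k):
    """Split values into k contiguous pieces minimizing total squared error.

    Each piece is approximated by its mean. Requires 1 <= k <= len(values).
    Returns (total_cost, [(start, end), ...]) with end exclusive.
    """
    n = len(values)
    s1, s2 = [0.0], [0.0]  # prefix sums of values and of squared values
    for v in values:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def cost(i, j):
        """Squared error of approximating values[i:j] by its mean."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]  # best[m][j]: first j values, m pieces
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j], cut[m][j] = c, i

    # Walk the stored cut positions backwards to recover the pieces.
    pieces, j = [], n
    for m in range(k, 0, -1):
        i = cut[m][j]
        pieces.append((i, j))
        j = i
    pieces.reverse()
    return best[k][n], pieces


print(segment_series([1, 1, 1, 5, 5, 5, 9, 9], 3))
```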

Segmentation of trajectories Can be treated as trajectory simplification: reduce (minimize) the number of vertices that represents a trajectory Gill Barequet, Danny Chen, Ovidiu Daescu, Michael Goodrich, Jack Snoeyink: Efficiently approximating polygonal paths in three and higher dimensions. Algorithmica (2002) Hu Cao, Ouri Wolfson, Goce Trajcevski: Spatio-temporal data reduction with deterministic error bounds. VLDB Journal (2006) Joachim Gudmundsson, Jyrki Katajainen, Damian Merrick, Cahya Ong, Thomas Wolle: Compressing spatio-temporal trajectories. ISAAC (2007) Falko Schmid, Kai-Florian Richter, Patrick Laube: Semantic trajectory compression. SSTD (2009)

Segmentation of trajectories Can be treated as trajectory simplification: reduce (minimize) the number of vertices that represents a trajectory The segments are the subtrajectories that would be replaced by a single edge in the simplification

Segmentation: two steps back
Cutting a trajectory into pieces that are “similar” within the piece (similar in: heading, speed, curvature, sinuosity, …). We want few pieces. How do we define “similar”?
Next part based on: Maike Buchin, Anne Driemel, Marc van Kreveld, Vera Sacristan: An algorithmic framework for segmenting trajectories based on spatio-temporal criteria. SIGSPATIAL (2010)

Segmentation: heading
On every edge of the trajectory, heading is well-defined. Similarity can mean: in the same cardinal direction (North, East, West, South). In the example shown we would segment at every vertex, while we want one single segment: a bad idea → over-segmentation.

Segmentation: heading
Use relative directions: we require that within any single segment the headings are within an angle π/2 everywhere.
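One way to test the relative heading criterion is to compute the smallest arc that contains all edge headings of a candidate segment. The sketch below does this by sorting the headings and subtracting the largest gap from the full circle. The function names are hypothetical, the bound π/2 is used as a default, and zero-length edges (for which `atan2` returns 0) are assumed not to occur.

```python
import math


def headings(points):
    """Heading (in radians) of every edge of the polyline through points."""
    return [math.atan2(q[1] - p[1], q[0] - p[0]) for p, q in zip(points, points[1:])]


def angular_range(angles):
    """Size of the smallest arc on the circle that contains all angles."""
    a = sorted(x % (2 * math.pi) for x in angles)
    if len(a) <= 1:
        return 0.0
    gaps = [b - c for c, b in zip(a, a[1:])]
    gaps.append(a[0] + 2 * math.pi - a[-1])  # gap that wraps around 0
    return 2 * math.pi - max(gaps)


def heading_ok(points, max_angle=math.pi / 2):
    """True if all headings of the segment fit within an arc of size max_angle."""
    return angular_range(headings(points)) <= max_angle


print(heading_ok([(0, 0), (1, 0), (2, 1)]))          # east, then north-east
print(heading_ok([(0, 0), (1, 0), (1, 1), (0, 1)]))  # east, north, west
```

Working on the circle rather than with fixed cardinal directions is what avoids over-segmentation when the heading fluctuates around a direction boundary.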

Segmentation: speed
Linear interpolation of position between the vertices makes speed piecewise constant (constant on every edge). Segmentation can be based on absolute intervals like [0-2], [2-5], [5-10], [10-15], [15-20], [20-30], [30-..] km/h. Speeds that alternate around an interval boundary (31, 29, 31, 29, … km/h) → over-segmentation.

Segmentation: speed Linear interpolation of position between the vertices makes speed piecewise constant (constant on every edge) Segmentation can be based on absolute intervals like [0-2], [2-5], [5-10], [10-15], [15-20], [20-30], [30-..] km/h Segmentation can also be based on relative speeds: within any single segment the speed ratio is at most, say, 1.5 (alternatively: the speed difference is at most 10 km/h)

Segmentation: heading and speed
Suppose we require that within any single segment:
– the headings are within an angle π/2 everywhere, and
– the speed ratio is at most 2
[Figure: speed and heading along a trajectory]

Segmentation In all three cases (heading, speed, heading&speed), a greedy approach works: make each next segment as long as possible Easy from the algorithms perspective: O( n ) time for a trajectory with n vertices Why does the greedy approach work? Because any sub-segment of a valid segment is also a valid segment. Therefore, it can never hurt to let a segment extend as far as it can if the goal is a minimum number of segments.
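The greedy idea can be sketched for the relative speed criterion. The code below assumes the per-edge speeds are already computed and positive, and the function name `greedy_speed_segments` and the default ratio 1.5 are choices made for this example. Each segment is extended as long as the ratio between its largest and smallest speed stays within the bound, which takes O(n) time overall.

```python
def greedy_speed_segments(speeds, max_ratio=1.5):
    """Greedy minimum segmentation of edges under a speed-ratio criterion.

    speeds: positive speed of every edge, in order.
    Returns a list of (start, end) edge-index ranges with end exclusive.
    """
    segments = []
    if not speeds:
        return segments
    start = 0
    lo = hi = speeds[0]  # smallest and largest speed in the current segment
    for i in range(1, len(speeds)):
        v = speeds[i]
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi <= max_ratio * new_lo:
            lo, hi = new_lo, new_hi        # edge i still fits: extend
        else:
            segments.append((start, i))    # close the segment before edge i
            start, lo, hi = i, v, v
    segments.append((start, len(speeds)))
    return segments


print(greedy_speed_segments([10, 12, 14, 30, 31, 29, 5]))
print(greedy_speed_segments([31, 29, 31, 29]))
```

With the relative criterion, speeds alternating between 31 and 29 km/h stay in one segment, unlike with fixed absolute intervals.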

Segmentation: attributes
Heading and speed are examples of attributes that are defined at (almost) every point on the trajectory. Location, curvature, sinuosity, and curviness are also attributes → we need a framework to handle different attributes and ways of combining them.

Segmentation: framework
– Attribute: some value defined at every point on the trajectory
– Criterion: restriction on allowed values of an attribute within the same segment
Segmentation on any combination (conjunction or disjunction) of criteria:
→ Optimal (minimum number of segments)
→ Guaranteed properties within each segment

Segmentation: location The attribute “location” is defined by a pair of values (x,y) (for trajectories in 2D) Possible criteria: – Any two points within one segment are no more than 5 km apart (diameter criterion) – For any segment, there is a point in the plane that is within 3 km of every point in the segment (enclosing disk criterion)

Segmentation: location An optimal segmentation on the diameter or disk criterion for location requires segmentation potentially anywhere on edges We can segment optimally on these criteria in O( n log n ) time Also in combination with heading and speed criteria

Segmentation: algorithm
Greedy algorithm for optimal segmentation:
– Start at the beginning of the trajectory
– Test vertices at increasing distance from the start; as long as the subtrajectory up to the tested vertex is good, continue
– When a tested vertex is not good, binary search between the last good and the first not-good tested vertex for the last good vertex
– Decide precisely where on the edge after the last good vertex the segment ends
– We have found a maximal segment; iterate with its end as the new start

Segmentation: algorithm
Suppose we have an algorithm for Test(s, v_i), which tests whether the subtrajectory from s to v_i is good or not. Suppose we also have an algorithm for Furthest(s, v_j), which returns the furthest point on the edge v_{j-1}v_j up to which the subtrajectory from s is still good.

Segmentation: algorithm

    s = v_1; i = 1
    while s ≠ v_n
    {
        a = 1;
        while ( i+a ≤ n && Test(s, v_{i+a}) )
            { a = 2a; }
        j = binary search in [ i+a/2, min(i+a, n) ] such that Test(s, v_{j-1}) = true && Test(s, v_j) = false
            (if Test(s, v_n) = true, no such j exists: accept the subtrajectory from s to v_n and stop)
        q = Furthest(s, v_j)
        Accept the subtrajectory from s to q as the next segment
        s = q; i = j - 1
    }
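The doubling and binary search steps can be sketched in Python as follows. This is a simplified, vertex-only variant: segments end at vertices, so the Furthest step that locates the exact end point on an edge is left out. The function name `greedy_segments` and the callback signature `test(s, j)` are assumptions for the example. The callback must be monotone (every sub-segment of a good segment is good), and a single edge is assumed to be always good.

```python
def greedy_segments(n, test):
    """Minimum segmentation of vertices 0..n-1 using doubling plus binary search.

    test(s, j): True if the subtrajectory from vertex s to vertex j is good.
    Returns a list of (s, e) vertex-index pairs; consecutive segments share a vertex.
    """
    segments = []
    s = 0
    while s < n - 1:
        # Doubling phase: test vertices s+1, s+2, s+4, ... until one fails.
        a = 1
        while s + a <= n - 1 and test(s, s + a):
            a *= 2
        lo = s + a // 2          # last vertex known to be good (s itself if a == 1)
        hi = min(s + a, n - 1)   # the last good vertex is at most hi
        # Binary search for the last good vertex in [lo, hi].
        while lo < hi:
            mid = (lo + hi + 1) // 2
            if test(s, mid):
                lo = mid
            else:
                hi = mid - 1
        if lo == s:
            raise ValueError("a single edge violates the criterion")
        segments.append((s, lo))
        s = lo
    return segments


# Example criterion: speed ratio at most 1.5 on the edges between vertices s and j.
speeds = [10, 12, 14, 30, 31, 29, 5]   # speed of edge i, between vertex i and vertex i+1


def speed_test(s, j):
    return max(speeds[s:j]) <= 1.5 * min(speeds[s:j])


print(greedy_segments(len(speeds) + 1, speed_test))
```

A segment spanning m vertices needs only O(log m) calls to `test`, which is where the logarithmic factor in the running time on the following slide comes from.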

Segmentation: algorithm
Assume that Test runs in T(m) time and Furthest in F(m) time on a subtrajectory with m vertices. Then optimal segmentation takes O(T(n) log n + F(n)) time. For almost all criteria we have, T(n) = F(n) = O(n) → optimal segmentation takes O(n log n) time.

Segmentation: more attributes
– Curvature: 3-point estimators
– Sinuosity
  – Detour: arc-length divided by distance
  – Winding: angular range of heading
– Curviness: total angular change in some neighborhood
Absolute or relative criteria specify bounds on the attribute values.
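As a small illustration of one of these attributes, the detour of a piece of trajectory can be computed as its arc length divided by the straight-line distance between its end points. The function name `detour` and the handling of coinciding end points (returning infinity) are assumptions made for this sketch.

```python
import math


def detour(points):
    """Arc length of the polyline divided by the distance between its end points.

    points: list of (x, y) tuples with at least two entries.
    """
    arc = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    chord = math.dist(points[0], points[-1])
    if chord == 0:
        return math.inf  # closed loop: the ratio is unbounded
    return arc / chord


print(detour([(0, 0), (3, 4), (6, 0)]))   # arc length 10, distance 6
print(detour([(0, 0), (1, 0), (2, 0)]))   # straight line
```

A criterion on this attribute would then bound the detour within each segment, for example by a fixed maximum value.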

Segmentation
For any logical combination (conjunction, disjunction) of the given criteria, we can compute an optimal segmentation in O(n log n) time for a trajectory with n vertices: we just combine the outcomes of the Test and Furthest functions of each criterion.

Segmentation: algorithmic approach
– Formalize the problem at hand using well-defined concepts before thinking about solutions
– Try to ensure that the problem statement is such that the output has guaranteed properties that can be proved
  – Guarantees on within-segment similarity
  – Guarantee on number of segments (minimum)
– Search for the most efficient (scalable) algorithm
  – Proof of running time

Segmentation: what’s next?
– Can we make the criteria more robust, to deal with real-world data? Possibly, but not easily
– Can we extend the approach to include criteria for the environment in the segmentation? Probably
– Can behavioral characteristics indeed be stated as combinations of criteria based on basic attributes? Unknown
