Trajectory Segmentation Marc van Kreveld

Algorithms Researchers … … want their problems to be well-defined (fully specified) … care about efficiency … often consider themselves to be puzzlers

Jigsaw Puzzles John finishes a jigsaw puzzle with 100 pieces with a lot of blue sky in 1 hour How much time will John need for a jigsaw puzzle with a lot of blue sky and 200 pieces? A. About 2 hours B. About 3 hours C. About 4 hours D. More than 4 hours

Jigsaw Puzzles Suppose you have a puzzle without any image You have already finished the borders The puzzle is well-made: if a piece fits, you know it is right Suppose every piece is normal Suppose you have only one piece left How many tries do you need in the worst case to finish the puzzle? Answer: 2

Jigsaw Puzzles Same puzzle, but suppose you have only two pieces left How many tries do you need in the worst case to finish the puzzle? Answer: 4 + 2

Jigsaw Puzzles Same puzzle, but suppose you have only three pieces left How many tries do you need in the worst case to finish the puzzle? Answer: 6 + 4 + 2

Jigsaw Puzzles
Pieces remaining: 1, 2, 3, 4, …, n
Tries (max): 2, 6, 12, 20, …, and for n pieces? n² + n
On average one would need half as many tries to finish a puzzle with n pieces: ½ (n² + n)

Jigsaw Puzzles For the puzzle with 100 pieces, John tries ½ (100² + 100) = 5050 times a piece, and that takes 1 hour For the puzzle with 200 pieces, John tries ½ (200² + 200) = 20,100 times a piece, which should take 20,100/5050 ≈ 3.98 hours
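The arithmetic of the jigsaw example can be checked with a few lines of Python. This is only an illustrative sketch: the function names are made up here, and it assumes, as the slides do, that the time spent is proportional to the average number of tries.

```python
def worst_case_tries(n: int) -> int:
    """Worst-case tries with n pieces left: 2n + 2(n-1) + ... + 2."""
    return sum(2 * k for k in range(1, n + 1))


def average_tries(n: int) -> float:
    """The slides assume half the worst case on average."""
    return worst_case_tries(n) / 2


# The sum agrees with the closed form n^2 + n.
for n in (1, 2, 3, 4, 100, 200):
    assert worst_case_tries(n) == n * n + n

# John needs 1 hour for the 100-piece puzzle, so scale by the ratio of tries.
hours_200 = average_tries(200) / average_tries(100)
print(round(hours_200, 2))  # 3.98
```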

Scalability Most data analysis tasks work on large data sets whose size is represented by a variable n An algorithm that performs a task will have a running time that grows with n Algorithms that scale well are considered efficient O-notation is used to denote scaling behavior: – Linear is O(n) – Quadratic is O(n²) [Figure: running time plotted against n for O(n) and O(n²)]

Well-defined Problems The properties of the output, expressed in the input, are fully determined An algorithm is a sequence of steps that converts an input into an output according to the specifications For a given set of numbers, put them in sorted order For a given set of points, find the two that are closest together For a given set of line segments, find all intersection points An algorithms researcher designs the sequence of steps that is correct for the problem, and analyzes the efficiency by determining the scaling behavior [Slide annotations: O(n log n), O(n²)]

Trajectories A trajectory is usually represented by (collected as) a sequence of points in the plane with a time stamp: (x_1, y_1, t_1), (x_2, y_2, t_2), …, (x_i, y_i, t_i), …, (x_n, y_n, t_n) The moving object has to travel from point to point, so we must assume something in between: linear interpolation of position → constant speed on each edge
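A minimal sketch of the linear-interpolation assumption, in Python. The function name `position_at` and the tuple layout `(x, y, t)` are choices made for this illustration, and the time stamps are assumed to be strictly increasing.

```python
from bisect import bisect_right


def position_at(trajectory, t):
    """Position at time t on a trajectory given as (x, y, t) tuples.

    Assumes strictly increasing time stamps and linear interpolation
    (constant speed) between consecutive points.
    """
    times = [p[2] for p in trajectory]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the time span of the trajectory")
    i = bisect_right(times, t) - 1          # last point with time <= t
    if i == len(trajectory) - 1:            # t is exactly the final time stamp
        return trajectory[-1][0], trajectory[-1][1]
    x0, y0, t0 = trajectory[i]
    x1, y1, t1 = trajectory[i + 1]
    f = (t - t0) / (t1 - t0)                # fraction of the edge travelled
    return x0 + f * (x1 - x0), y0 + f * (y1 - y0)


traj = [(0, 0, 0), (10, 0, 10), (10, 20, 20)]
print(position_at(traj, 5))   # (5.0, 0.0)
print(position_at(traj, 15))  # (10.0, 10.0)
```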

Trajectories What can happen if you just take the data points and not the trajectories?

What can you analyze? Simple things like travel distance, average speed, etc. Complicated things like similarity of different pieces of the same trajectory Kevin Buchin, Maike Buchin, Joachim Gudmundsson, Maarten Löffler, Jun Luo: Detecting Commuting Patterns by Clustering Subtrajectories. ISAAC (2008)

What can you analyze? Computing travel distance, average speed, is simple because: – It is well-defined – A simple algorithm achieves linear efficiency (scaling) Computing similarity of different pieces of the same trajectory is complicated because: – It is not clear how to define it precisely – Algorithms will be longer (in code) and less efficient
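As an illustration of the "simple" case, travel distance and average speed can each be computed in one linear pass over the vertices. The sketch below assumes a trajectory stored as `(x, y, t)` tuples with linear interpolation between them; the function names are hypothetical.

```python
from math import hypot


def travel_distance(trajectory):
    """Total length of the polyline through the (x, y, t) points: O(n) time."""
    return sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0, _), (x1, y1, _) in zip(trajectory, trajectory[1:])
    )


def average_speed(trajectory):
    """Travel distance divided by elapsed time between first and last point."""
    duration = trajectory[-1][2] - trajectory[0][2]
    if duration <= 0:
        raise ValueError("trajectory must span a positive amount of time")
    return travel_distance(trajectory) / duration


traj = [(0, 0, 0), (3, 4, 1), (3, 10, 3)]
print(travel_distance(traj))  # 11.0
```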

What can you analyze? For multiple trajectories: – An average similarity [What similarity measure?] – A clustering [What similarity measure, optimize what? ] – The most typical trajectory [By which definition?] – The most outlying trajectory – Similar subtrajectories – Flocking – … Kevin Buchin, Maike Buchin, Marc van Kreveld, Jun Luo: Finding long and similar parts of trajectories. SIGSPATIAL (2009)

Segmentation Cutting a trajectory in pieces that are “similar” within the piece Similar in: heading, speed, curvature, sinuosity, …

Why segmentation? Explaining behavior of a moving entity: one type of behavior may be characterized by similarity of movement Detecting outliers: short segments in a segmentation may be caused by outlying observations

Segmentation in other areas Image segmentation: partition a digital image in parts with similar characteristics (hopefully meaningful pieces)

Segmentation in other areas Time series segmentation: partition time series data into pieces with similar characteristics Assume a series of n data points is given and k segments are desired (in the example figure, n = 20 and k = 4) Minimize sum-of-distances or sum-of-squared distances for piecewise-constant or piecewise-linear approximations Running time: O(n²k) Richard Bellman: On the approximation of curves by line segments using dynamic programming. Communications of the ACM (1961)
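The O(n²k) bound can be illustrated with a textbook dynamic program for the piecewise-constant case with sum-of-squared distances. This is a sketch under that assumption, not necessarily Bellman's original formulation; `segment_series` and its return format are invented for this example. Prefix sums make the cost of any candidate segment available in constant time.

```python
from itertools import accumulate


def segment_series(values, k):
    """Split values into k contiguous pieces minimizing the total squared
    error of a piecewise-constant fit. Returns (cost, [(start, end), ...])
    with half-open index ranges. Runs in O(n^2 k) time."""
    n = len(values)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= len(values)")
    s1 = [0.0] + list(accumulate(values))                 # prefix sums
    s2 = [0.0] + list(accumulate(v * v for v in values))  # prefix sums of squares

    def cost(i, j):
        """Squared error of values[i:j] around its mean, in O(1)."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    best = [[inf] * (n + 1) for _ in range(k + 1)]  # best[m][j]: first j values, m pieces
    cut = [[0] * (n + 1) for _ in range(k + 1)]     # start index of the last piece
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j], cut[m][j] = c, i

    segments, j = [], n
    for m in range(k, 0, -1):                       # walk the cuts backwards
        i = cut[m][j]
        segments.append((i, j))
        j = i
    return best[k][n], segments[::-1]


cost_value, pieces = segment_series([1, 1, 1, 5, 5, 9, 9, 9], 3)
print(pieces)  # [(0, 3), (3, 5), (5, 8)]
```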

Segmentation of trajectories Can be treated as trajectory simplification: reduce (minimize) the number of vertices that represents a trajectory Gill Barequet, Danny Chen, Ovidiu Daescu, Michael Goodrich, Jack Snoeyink: Efficiently approximating polygonal paths in three and higher dimensions. Algorithmica (2002) Hu Cao, Ouri Wolfson, Goce Trajcevski: Spatio-temporal data reduction with deterministic error bounds. VLDB Journal (2006) Joachim Gudmundsson, Jyrki Katajainen, Damian Merrick, Cahya Ong, Thomas Wolle: Compressing spatio-temporal trajectories. ISAAC (2007) Falko Schmid, Kai-Florian Richter, Patrick Laube: Semantic trajectory compression. SSTD (2009)

Segmentation of trajectories Can be treated as trajectory simplification: reduce (minimize) the number of vertices that represents a trajectory The segments are the subtrajectories that would be replaced by a single edge in the simplification

Segmentation: two steps back Cutting a trajectory in pieces that are “similar” within the piece (similar in: heading, speed, curvature, sinuosity, … ) We want few pieces How do we define “similar”? Next part based on: Maike Buchin, Anne Driemel, Marc van Kreveld, Vera Sacristan: An algorithmic framework for segmenting trajectories based on spatio-temporal criteria. SIGSPATIAL (2010)

Segmentation: heading On every edge of the trajectory, heading is well-defined Similarity can mean: in the same cardinal direction [Figure labels: Northbound, East, West, South] We would segment at every vertex, while we want one single segment: bad idea → over-segmentation

Segmentation: heading Use relative directions: We require that within any single segment the headings are within an angle π/2 everywhere

Segmentation: speed Linear interpolation of position between the vertices makes speed piecewise constant (constant on every edge) Segmentation can be based on absolute intervals like [0-2], [2-5], [5-10], [10-15], [15-20], [20-30], [30-..] km/h → over-segmentation

Segmentation: speed Segmentation can also be based on relative speeds: within any single segment the speed ratio is at most, say, 1.5 (alternatively: the speed difference is at most 10 km/h)

Segmentation: heading and speed Suppose we require that within any single segment: – the headings are within an angle π/2 everywhere, and – the speed ratio is at most 2 [Figure: speed and heading along the trajectory]

Segmentation In all three cases (heading, speed, heading&speed), a greedy approach works: make each next segment as long as possible Easy from the algorithms perspective: O( n ) time for a trajectory with n vertices Why does the greedy approach work? Because any sub-segment of a valid segment is also a valid segment. Therefore, it can never hurt to let a segment extend as far as it can if the goal is a minimum number of segments.
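The greedy idea can be sketched for the relative speed criterion alone. The code below is an illustration, not the authors' implementation: it takes the per-edge speeds as input, assumes all speeds are non-negative, and uses hypothetical names. Because speed is constant on every edge, cutting only at vertices is sufficient here; a heading criterion would be handled analogously with an angular range in place of a ratio.

```python
def greedy_speed_segments(speeds, max_ratio):
    """Greedy segmentation on the speed-ratio criterion.

    speeds[e] is the (constant) speed on edge e. Returns a list of
    (first_edge, last_edge) pairs, inclusive, such that within each
    segment max speed <= max_ratio * min speed. Runs in O(n) time.
    """
    segments = []
    start = 0
    lo = hi = None                       # min and max speed in the current segment
    for e, v in enumerate(speeds):
        if lo is None:                   # first edge of the trajectory
            lo = hi = v
            continue
        new_lo, new_hi = min(lo, v), max(hi, v)
        if new_hi <= max_ratio * new_lo:
            lo, hi = new_lo, new_hi      # edge e still fits: extend the segment
        else:
            segments.append((start, e - 1))
            start, lo, hi = e, v, v      # edge e starts the next segment
    if speeds:
        segments.append((start, len(speeds) - 1))
    return segments


print(greedy_speed_segments([4, 5, 6, 10, 11, 3, 3.5], 1.5))
# [(0, 2), (3, 4), (5, 6)]
```

Each segment is extended as far as the criterion allows, which is exactly the "as long as possible" rule described above.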

Segmentation: attributes Heading and speed are examples of attributes that are defined at (almost) every point on the trajectory Location, curvature, sinuosity, and curviness are also attributes → need a framework to handle different attributes and ways of combining them

Segmentation: framework Attribute: some value defined at every point on the trajectory Criterion: restriction on allowed values of an attribute within the same segment Segmentation on any combination (conjunction or disjunction) of criteria – Optimal (minimum number of segments) – Guaranteed properties within each segment

Segmentation: location The attribute “location” is defined by a pair of values (x,y) (for trajectories in 2D) Possible criteria: – Any two points within one segment are no more than 5 km apart (diameter criterion) – For any segment, there is a point in the plane that is within 3 km of every point in the segment (enclosing disk criterion)
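A brute-force check of the diameter criterion might look as follows. This is a quadratic-time sketch for illustration only, not the efficient test behind the running times quoted in the talk; `diameter_ok` is a hypothetical name. It relies on the fact that the largest distance between two points of a polygonal chain is attained at two of its vertices, so only vertices (including the two segment endpoints) need to be compared.

```python
from itertools import combinations
from math import dist


def diameter_ok(points, max_diameter):
    """True if every pair of the given (x, y) points is at most
    max_diameter apart. Brute force over all pairs: O(m^2) time."""
    return all(dist(p, q) <= max_diameter for p, q in combinations(points, 2))


print(diameter_ok([(0, 0), (3, 4)], 5))          # True
print(diameter_ok([(0, 0), (3, 4), (6, 8)], 5))  # False
```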

Segmentation: location An optimal segmentation on the diameter or disk criterion for location requires segmentation potentially anywhere on edges We can segment optimally on these criteria in O( n log n ) time Also in combination with heading and speed criteria

Segmentation: algorithm Greedy algorithm for optimal segmentation [Animation: from the start point, successive vertices are tested and found good until one is found not good] Binary search between the last good and the first not-good tested vertex for the last good vertex Decide precisely where on the edge the end is We have found a maximal segment; iterate with its end as the new start

Segmentation: algorithm Suppose we have an algorithm for Test(s, v_i), which tests whether the subtrajectory from s to v_i is good or not Suppose we have an algorithm for Furthest(s, v_j), which returns the furthest point on the edge v_{j-1} v_j up to which the subtrajectory from s is still good

Segmentation: algorithm
while s ≠ v_n
{
    a = 1;
    while ( i+a ≤ n && Test(s, v_{i+a}) )
        { a = 2a; }
    j = binary search in [ i+a/2, min(i+a, n) ] such that Test(s, v_{j-1}) = true && Test(s, v_j) = false
    q = Furthest(s, v_j)
    Accept the subtrajectory from s to q as the next segment
    s = q; i = j – 1
}

Segmentation: algorithm Assume that Test runs in T(m) time and Furthest in F(m) time on a subtrajectory with m vertices Then optimal segmentation takes O( T(n) log n + F(n) ) time For almost all criteria we have, T(n) = F(n) = O(n) → optimal segmentation takes O(n log n) time
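The doubling-plus-binary-search loop can be sketched in Python in a simplified form. The assumptions are loud ones: segments may end only at vertices (so there is no Furthest step), the criterion is monotone (any part of a good subtrajectory is good), and a single edge is always accepted so that the loop makes progress. The name `segment_at_vertices` and the callback signature `test(i, j)` are invented for this illustration.

```python
def segment_at_vertices(n, test):
    """Greedy segmentation of a trajectory with vertices 0..n-1.

    test(i, j) must return True if the subtrajectory from vertex i to
    vertex j is good. Returns a list of (start_vertex, end_vertex) pairs.
    Each segment costs O(log m) calls to test, m being its length.
    """
    segments = []
    s = 0
    while s < n - 1:
        a = 1
        while s + a <= n - 1 and test(s, s + a):   # exponential search
            a *= 2
        lo = max(s + a // 2, s + 1)   # known good (a single edge is forced)
        hi = min(s + a, n - 1)        # everything beyond hi is not good
        while lo < hi:                # binary search for the last good vertex
            mid = (lo + hi + 1) // 2
            if test(s, mid):
                lo = mid
            else:
                hi = mid - 1
        segments.append((s, lo))
        s = lo                        # the next segment starts where this one ends
    return segments


# Example criterion: speed ratio at most 1.5, with one speed per edge.
speeds = [4, 5, 6, 10, 11, 3, 3.5]

def speed_test(i, j):
    part = speeds[i:j]                # edges between vertex i and vertex j
    return max(part) <= 1.5 * min(part)

print(segment_at_vertices(len(speeds) + 1, speed_test))
# [(0, 3), (3, 5), (5, 7)]
```

The doubling phase keeps every tested subtrajectory within a constant factor of the final segment length, which is what keeps the total cost of the Test calls near O(T(n) log n).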

Segmentation: more attributes Curvature 3-point estimators Sinuosity – Detour: arc-length divided by distance – Winding: angular range of heading Curviness: total angular change in some neighborhood Absolute or relative criteria specify bounds on the attribute values
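The detour attribute, arc length divided by straight-line distance, is easy to compute for a piece of trajectory given by its vertices. The sketch below is illustrative; the name `detour` and the choice to return infinity for a closed piece are assumptions made here.

```python
from math import hypot


def detour(points):
    """Arc length of the polyline through the (x, y) points divided by the
    distance between its first and last point. Always at least 1; returned
    as infinity when the two endpoints coincide."""
    arc = sum(
        hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    (xa, ya), (xb, yb) = points[0], points[-1]
    straight = hypot(xb - xa, yb - ya)
    return float("inf") if straight == 0 else arc / straight


print(detour([(0, 0), (3, 0), (3, 4)]))  # 1.4
```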

Segmentation For any logical combination (conjunction, disjunction) of the criteria given, we can compute an optimal segmentation in O(n log n) time, for a trajectory with n vertices → we just combine the outcome of the different Test and Furthest functions of each criterion
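Combining criteria can be sketched on the Test side with two small combinators. This is an assumption-laden illustration: `all_of`, `any_of`, and the `test(i, j)` signature are hypothetical, and combining the Furthest functions is not shown. If every individual criterion is monotone, so are the conjunction and the disjunction, which is what the greedy approach needs.

```python
def all_of(*tests):
    """Conjunction: a subtrajectory is good if every criterion accepts it."""
    return lambda i, j: all(t(i, j) for t in tests)


def any_of(*tests):
    """Disjunction: a subtrajectory is good if at least one criterion accepts it."""
    return lambda i, j: any(t(i, j) for t in tests)


# Two toy monotone criteria on vertex indices, for illustration only.
at_most_two_edges = lambda i, j: j - i <= 2
at_most_four_edges = lambda i, j: j - i <= 4

both = all_of(at_most_two_edges, at_most_four_edges)
either = any_of(at_most_two_edges, at_most_four_edges)
print(both(0, 3), either(0, 3))  # False True
```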

Segmentation: algorithmic approach Formalize the problem at hand using well-defined concepts before thinking about solutions Try to ensure that the problem statement is such that the output has guaranteed properties that can be proved – Guarantees in within-segment similarity – Guarantee on number of segments (minimum) Search for the most efficient (scalable) algorithm – Proof of running time

Segmentation: what’s next? Can we make the criteria more robust, to deal with real-world data? Possibly, but not easily Can we extend the approach to include criteria for the environment in the segmentation? Probably Can behavioral characteristics indeed be stated as combinations of criteria based on basic attributes? Unknown