Cut-And-Stitch: Efficient Parallel Learning of Linear Dynamical Systems on SMPs
Lei Li
Computer Science Department, School of Computer Science, Carnegie Mellon University
leili@cs.cmu.edu

– Motion stitching via effort minimization, with James McCann, Nancy Pollard and Christos Faloutsos [Eurographics 2008]
– Parallel learning of linear dynamical systems, with Wenjie Fu, Fan Guo, Todd Mowry and Christos Faloutsos [KDD 2008]

Background: Motion Capture
– Markers are placed on the human body, optical cameras capture the marker positions, and the positions are translated into body-local coordinates.
– Applications: the movie, game, and medical industries

Outline
– Background
– Motivation: effortless motion stitching
– Parallel learning with Cut-And-Stitch
– Experiments and Results
– Conclusion

Motivation
– Given two human motion sequences, how can we stitch them together so that the result looks natural to the human eye (e.g. walking into running)?
– Given a human motion sequence, how can we find the most natural stitchable motion in a motion capture database?

Intuition
– Laziness is a virtue: natural motion uses minimum energy.
– Laziness-score (L-score) = energy used during stitching
– Objective: minimize the laziness-score

Example
[Figure: motion with "taking off" and "landing" phases]

Example: natural stitching
[Figure: taking off, landing]

But how about this way?
[Figure: taking off, landing]

Observations
– Naturalness depends on smoothness
– Naturalness also depends on motion speed

Proposed Method
– Estimate the stitching path using Linear Dynamical Systems

Proposed Method (cont'd)
– Estimate the velocity and acceleration during the stitching, then compute the energy (defined as the L-score)
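The slide does not give the exact energy formula, so the following is only a minimal sketch of the idea under a loud assumption: velocity and acceleration are estimated by finite differences along a one-dimensional stitching path, and the sum of squared accelerations stands in for the energy. The name `effort`, the parameter `dt`, and the choice of energy proxy are all hypothetical and are not the authors' definition of the L-score.

```python
def effort(positions, dt=1.0):
    """Hypothetical energy proxy for a 1-D path sampled every dt seconds.

    ASSUMPTION: energy is approximated by the sum of squared accelerations;
    the slides do not define the L-score formula.
    """
    # Finite-difference velocity between consecutive frames.
    velocities = [(b - a) / dt for a, b in zip(positions, positions[1:])]
    # Finite-difference acceleration between consecutive velocities.
    accelerations = [(b - a) / dt for a, b in zip(velocities, velocities[1:])]
    return sum(acc * acc for acc in accelerations) * dt


constant_speed = effort([0.0, 1.0, 2.0, 3.0])   # no acceleration at all
speeding_up = effort([0.0, 1.0, 4.0, 9.0])      # acceleration of 2 per frame
```

Under this proxy a path traversed at constant speed costs nothing, while a path that keeps accelerating is penalized, which matches the observation that naturalness depends on both smoothness and speed.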

Proposed Method (cont'd)
– Minimize the L-score with respect to any stitching hops (defined as the elastic L-score)

Example stitching
[Link to video]

Parallel Learning for LDS
– Challenge: learning a Linear Dynamical System is slow for long sequences
– Traditional method: Maximum Likelihood Estimation via the Expectation-Maximization (EM) algorithm
– Objective: parallelize the learning algorithm
– Assumption: shared memory architecture

Linear Dynamical System (a.k.a. Kalman Filter)
– Parameters: $\theta = (u_0, V_0, A, \Gamma, C, \Sigma)$
– Observations: $y_1, \dots, y_n$
– Hidden variables: $z_1, \dots, z_n$
– Model: $z_1 \sim \mathcal{N}(u_0, V_0)$, $z_{i+1} \sim \mathcal{N}(A z_i, \Gamma)$, $y_i \sim \mathcal{N}(C z_i, \Sigma)$
[Figure: chain graphical model with hidden states $z_1, \dots, z_5$ and observations $y_1, \dots, y_5$]
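To make the generative model concrete, here is a minimal sketch that samples from a scalar LDS, assuming one-dimensional states and observations so that A, Γ, C and Σ are plain numbers (Γ and Σ are variances). The function name `simulate_lds` and the `seed` parameter are illustrative and do not come from the talk.

```python
import random


def simulate_lds(n, u0, v0, a, gamma, c, sigma, seed=0):
    """Sample hidden states z_1..z_n and observations y_1..y_n (scalar case)."""
    rng = random.Random(seed)
    zs, ys = [], []
    z = rng.gauss(u0, v0 ** 0.5)                    # z_1 ~ N(u0, V0)
    for _ in range(n):
        zs.append(z)
        ys.append(rng.gauss(c * z, sigma ** 0.5))   # y_i ~ N(C z_i, Sigma)
        z = rng.gauss(a * z, gamma ** 0.5)          # z_{i+1} ~ N(A z_i, Gamma)
    return zs, ys


# With all variances set to zero the model is deterministic:
# z halves at every step and y is always twice z.
states, observations = simulate_lds(4, u0=1.0, v0=0.0, a=0.5, gamma=0.0, c=2.0, sigma=0.0)
```

Learning goes in the opposite direction: only the observations are given, and both the hidden states and the parameters have to be estimated.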

Example: given positions, estimate the dynamics (i.e. the parameters)
[Figure: position of left elbow over time, with hidden states $z_1, \dots, z_6$ and observations $y_1, \dots, y_6$]

Traditional: How to learn LDS?

Sequential Learning (EM): forward steps
[Figure, repeated on each slide: position of left elbow over time, measured vs. estimated, for $z_1, \dots, z_6$ and $y_1, \dots, y_6$]
– Compute $P(z_1 \mid y_1)$
– From $P(z_1 \mid y_1)$, compute $P(z_2 \mid y_1, y_2)$. Intuition: $z_2$ may be close to $z_1$.
– From $P(z_2 \mid y_1, y_2)$, compute $P(z_3 \mid y_1, y_2, y_3)$
– From $P(z_3 \mid y_1, y_2, y_3)$, compute $P(z_4 \mid y_1, \dots, y_4)$
– From $P(z_4 \mid y_1, \dots, y_4)$, compute $P(z_5 \mid y_1, \dots, y_5)$
– From $P(z_5 \mid y_1, \dots, y_5)$, compute $P(z_6 \mid y_1, \dots, y_6)$
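The chain of forward steps is the standard Kalman filter recursion. The sketch below is a scalar version, assuming one-dimensional states and observations; the function name `kalman_filter` is illustrative, and the talk's own software is not shown in the slides.

```python
def kalman_filter(ys, u0, v0, a, gamma, c, sigma):
    """Return means and variances of P(z_i | y_1..y_i) for a scalar LDS."""
    means, variances = [], []
    pred_mean, pred_var = u0, v0              # prior on z_1
    for y in ys:
        # Correct the prediction with the new measurement y_i.
        gain = pred_var * c / (c * c * pred_var + sigma)
        mean = pred_mean + gain * (y - c * pred_mean)
        var = (1.0 - gain * c) * pred_var
        means.append(mean)
        variances.append(var)
        # Predict z_{i+1} from the posterior on z_i.
        pred_mean = a * mean
        pred_var = a * a * var + gamma
    return means, variances


f_means, f_vars = kalman_filter([2.0, 2.0], u0=0.0, v0=1.0, a=1.0, gamma=0.0, c=1.0, sigma=1.0)
```

Each pass through the loop needs the posterior from the previous pass, which is why the forward steps cannot be run concurrently as they stand.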

Sequential Learning (EM): backward steps
[Figure, repeated on each slide: position of left elbow over time, measured vs. estimated]
Intuition: take the future backward.
– From $P(z_6 \mid y_1, \dots, y_6)$, compute $P(z_5 \mid y_1, \dots, y_6)$
– From $P(z_5 \mid y_1, \dots, y_6)$, compute $P(z_4 \mid y_1, \dots, y_6)$
– From $P(z_4 \mid y_1, \dots, y_6)$, compute $P(z_3 \mid y_1, \dots, y_6)$
– From $P(z_3 \mid y_1, \dots, y_6)$, compute $P(z_2 \mid y_1, \dots, y_6)$
– From $P(z_2 \mid y_1, \dots, y_6)$, compute $P(z_1 \mid y_1, \dots, y_6)$
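The backward steps correspond to the standard Rauch-Tung-Striebel smoother. The sketch below is a scalar version that takes already-filtered means and variances as input; the name `rts_smoother` and the returned `cross` list (the posterior covariance between neighbouring states) are illustrative choices, not taken from the slides.

```python
def rts_smoother(f_means, f_vars, a, gamma):
    """Turn filtered P(z_i | y_1..y_i) into smoothed P(z_i | y_1..y_n), scalar case."""
    n = len(f_means)
    # The last filtered posterior already conditions on every observation.
    s_means, s_vars = list(f_means), list(f_vars)
    cross = [0.0] * max(n - 1, 0)   # cross[i] = Cov(z_i, z_{i+1} | y_1..y_n), 0-based
    for i in range(n - 2, -1, -1):
        pred_var = a * a * f_vars[i] + gamma      # variance of z_{i+1} predicted from z_i
        j = f_vars[i] * a / pred_var              # smoother gain
        s_means[i] = f_means[i] + j * (s_means[i + 1] - a * f_means[i])
        s_vars[i] = f_vars[i] + j * j * (s_vars[i + 1] - pred_var)
        cross[i] = j * s_vars[i + 1]
    return s_means, s_vars, cross


# Filtered values for a constant hidden state (a = 1, gamma = 0).
s_means, s_vars, cross = rts_smoother([1.0, 4.0 / 3.0], [0.5, 1.0 / 3.0], a=1.0, gamma=0.0)
```

In this example the state never changes, so smoothing pulls the first estimate up to the final one: information from the future is carried backward.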

Sequential Learning (EM)
From all posteriors of $z_1, \dots, z_6$, i.e. $P(z_1 \mid y_1, \dots, y_6)$, $P(z_2 \mid y_1, \dots, y_6)$, …, compute the sufficient statistics $E[z_i]$, $E[z_i z_i']$ and $E[z_{i-1} z_i']$.
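For Gaussian posteriors these expectations follow directly from the posterior means, variances and neighbour covariances, since a second moment is a covariance plus a product of means. A minimal scalar sketch, where `sufficient_statistics` and its argument names are hypothetical:

```python
def sufficient_statistics(s_means, s_vars, cross):
    """Compute E[z_i], E[z_i^2] and E[z_i z_{i+1}] from smoothed posteriors.

    cross[i] is the posterior covariance between z_i and z_{i+1} (0-based).
    """
    ez = list(s_means)
    # Second moment = variance + squared mean.
    ezz = [v + m * m for m, v in zip(s_means, s_vars)]
    # Lagged moment = covariance of neighbours + product of their means.
    ezz_lag = [cv + s_means[i] * s_means[i + 1] for i, cv in enumerate(cross)]
    return ez, ezz, ezz_lag


ez, ezz, ezz_lag = sufficient_statistics([1.0, 2.0], [0.5, 0.25], [0.1])
```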

Sequential Learning (EM)
With the sufficient statistics, compute $\theta \leftarrow \arg\max_\theta \text{likelihood}(\theta)$.
[Figure: position of left elbow over time, measured points and reconstructed signal]
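The slides do not spell out the update formulas. The sketch below uses the textbook closed-form M-step for a scalar LDS, which may differ in detail from the authors' implementation; `m_step` and its argument names are hypothetical, and at least two observations are assumed.

```python
def m_step(ys, ez, ezz, ezz_lag):
    """Closed-form parameter updates for a scalar LDS (needs len(ys) >= 2).

    ez[i] = E[z_i], ezz[i] = E[z_i^2], ezz_lag[i] = E[z_i z_{i+1}] (0-based).
    """
    n = len(ys)
    u0 = ez[0]
    v0 = ezz[0] - ez[0] ** 2
    # Transition coefficient: regress each state on its predecessor.
    a = sum(ezz_lag) / sum(ezz[:-1])
    gamma = sum(ezz[i + 1] - 2 * a * ezz_lag[i] + a * a * ezz[i]
                for i in range(n - 1)) / (n - 1)
    # Observation coefficient: regress each observation on its state.
    c = sum(y * m for y, m in zip(ys, ez)) / sum(ezz)
    sigma = sum(y * y - 2 * c * y * m + c * c * s
                for y, m, s in zip(ys, ez, ezz)) / n
    return u0, v0, a, gamma, c, sigma


# Noise-free toy input: states 1 and 2 known exactly, observations twice the state.
params = m_step([2.0, 4.0], ez=[1.0, 2.0], ezz=[1.0, 4.0], ezz_lag=[2.0])
```

On this toy input the update recovers a transition coefficient of 2, an observation coefficient of 2, and zero noise variances.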

Sequential Learning (EM)
– From $P(z_6 \mid y_1, \dots, y_6)$, compute $P(z_5 \mid y_1, \dots, y_6)$
[Figure: position over time, measured vs. estimated]

How to parallelize it?
Speed bottleneck: sequential computation of the posteriors.
[Figure: chain of hidden states $z_1, \dots, z_6$ with observations $y_1, \dots, y_6$]

“Leap of faith”
Start computation without feedback from the previous node (cut), and reconcile later (stitch).

Proposed Method: Cut-And-Stitch
– Start computation without feedback from the previous node (cut)
– Reconcile later (stitch)
[Figure: the chain $z_1, \dots, z_6$ is cut into three blocks, $(z_1, z_2)$, $(z_3, z_4)$ and $(z_5, z_6)$; block $k$ carries its own parameters $\upsilon_k, \Phi_k, \eta_k, \Psi_k$, and copies $z'_2$ and $z'_4$ appear at the block boundaries]

Cut-And-Stitch: cut step, estimate posteriors (E)
[Figure, repeated on each slide: three blocks with parameters $\upsilon_k, \Phi_k, \eta_k, \Psi_k$; position of left elbow over time, measured vs. estimated]
– Forward: compute $P(z_1 \mid y_1)$, $P(z_3 \mid y_3)$ and $P(z_5 \mid y_5)$. Intuition: compute all three at once.
– Forward: continue within every block at the same time.
– Backward: intuition: backward adjust all at once.

Cut-And-Stitch: stitch step
[Figure: three blocks with parameters $\upsilon_k, \Phi_k, \eta_k, \Psi_k$; position of left elbow over time, measured points]
– Collect sufficient statistics for each block (C): $E[z_i]$, $E[z_i z_i']$, $E[z_{i-1} z_i']$
– Maximize parameters (M)

Cut-And-Stitch: stitch together
[Figure: three blocks with parameters $\upsilon_k, \Phi_k, \eta_k, \Psi_k$; position of left elbow over time, measured points and reconstructed signal]
– Re-estimate block parameters (R). Intuition: exchange messages across blocks.
– Iterate…
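The slides do not give the update rules for the block parameters, so the following is only a rough sketch of the forward half of the idea, under stated assumptions: the model is scalar, each block starts from its own prior (standing in for $\upsilon_k, \Phi_k$), the blocks are filtered concurrently, and the stitch hands the end-of-block prediction to the next block as its new prior. The backward parameters $\eta_k, \Psi_k$ are omitted, and the names `filter_block` and `cut_and_stitch_forward` are hypothetical. Python threads only illustrate the structure here; the authors' software is reported to be C++ with OpenMP.

```python
from concurrent.futures import ThreadPoolExecutor


def filter_block(ys, start_mean, start_var, a, gamma, c, sigma):
    """Forward Kalman pass over one block, starting from the block's own prior."""
    pred_mean, pred_var = start_mean, start_var
    means = []
    for y in ys:
        gain = pred_var * c / (c * c * pred_var + sigma)
        mean = pred_mean + gain * (y - c * pred_mean)
        var = (1.0 - gain * c) * pred_var
        means.append(mean)
        pred_mean, pred_var = a * mean, a * a * var + gamma
    # The final prediction describes the first state of the NEXT block.
    return means, (pred_mean, pred_var)


def cut_and_stitch_forward(ys, block_priors, a, gamma, c, sigma):
    """One cut pass over all blocks, then one stitch of the block priors.

    ASSUMPTION: ys is non-empty and block_priors holds one (mean, var) per block.
    """
    k = len(block_priors)
    size = -(-len(ys) // k)                       # ceiling division
    blocks = [ys[i:i + size] for i in range(0, len(ys), size)]

    def run(job):
        block, (mean, var) = job
        return filter_block(block, mean, var, a, gamma, c, sigma)

    # Cut: every block is filtered without waiting for its predecessor.
    with ThreadPoolExecutor(max_workers=k) as pool:
        results = list(pool.map(run, zip(blocks, block_priors)))
    means = [m for block_means, _ in results for m in block_means]
    # Stitch: block k+1 will start its next pass from what block k learned.
    new_priors = [block_priors[0]] + [handoff for _, handoff in results[:-1]]
    return means, new_priors


ys = [2.0, 2.0, 2.0, 2.0]
priors = [(0.0, 1.0), (0.0, 1.0)]
means1, priors = cut_and_stitch_forward(ys, priors, 1.0, 0.0, 1.0, 1.0)
means2, priors = cut_and_stitch_forward(ys, priors, 1.0, 0.0, 1.0, 1.0)
```

In this two-block toy run the first pass treats the second block as if it were the start of the sequence, and the second pass, using the stitched prior, reproduces what a sequential filter would compute. With more blocks, more passes are needed before information from the first block reaches the last one.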

Experiments
– Q1: How much speedup can we get?
– Q2: How good is the reconstruction accuracy?

Experiments
Dataset:
– 58 human motion sequences, 200 to 500 frames each
– Each frame has 93 bone positions in body-local coordinates
– http://mocap.cs.cmu.edu
Setup:
– Supercomputer: SGI Altix system, distributed shared memory architecture
– Multi-core desktop: 4 Intel Xeon cores, shared memory
Task:
– Learn the dynamics and hidden variables, and reconstruct the motion

Q1: How much speedup?
[Figure: supercomputer result, speedup vs. number of processors, average of 58 compared with ideal]

Q1: How much speedup?
[Figure: multi-core result, speedup vs. number of cores, average of 58 compared with ideal]

Q2: How good?
Result: approximately identical accuracy

Conclusion & Contributions
– A distance function for motion stitching, based on a first principle: minimize effort
– A general approximate parallel learning algorithm for LDS
  – Near-linear speedup
  – Accuracy (NRE): approximately identical to sequential learning
  – Easily extended to HMM and other chain Markovian models
– Software (C++ with OpenMP) and datasets: www.cs.cmu.edu/~leili/paralearn

Promising Extensions
– Extensions: HMM and other Markov models (similar graphical model)
– Open problem: can an error bound be proved?

Thank you
Questions

Approximate Parallel Learning
[Figure: schedule over columns 1 to 5. Iteration 0: warm up. Iteration 1: steps 1 and 2. Iteration 2: steps 3 and 4. Iteration 3: steps 5 and 6.]
