Structure from motion (Tomasi and Kanade)


2 Structure from motion (Tomasi and Kanade)
Input: a set of point tracks
Output: 3D location of each point (shape), camera parameters (motion)

3 Orthographic SFM: Setup
$I_1, I_2, \ldots, I_f$: a collection of images (video frames) depicting a rigid scene
Orthographic projection (no scale)
$p$ point tracks in those $f$ frames
Unknown 3D location: $P_j = (X_j, Y_j, Z_j)^T \in \mathbb{R}^3$, $j = 1, \ldots, p$
Projected locations: denote by $(x_{ij}, y_{ij})^T$ the location of $P_j$ at frame $i$, then
$$x_{ij} = \mathbf{r}_i^T P_j + c_i, \qquad y_{ij} = \mathbf{s}_i^T P_j + d_i$$
$\mathbf{r}_i^T, \mathbf{s}_i^T$ are the two top rows of a rotation matrix

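4 Orthographic SFM: Objective
Find $\mathbf{r}_i, \mathbf{s}_i \in \mathbb{R}^3$ and $c_i, d_i \in \mathbb{R}$ that minimize
$$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( (\mathbf{r}_i^T P_j + c_i) - x_{ij} \right)^2 + \left( (\mathbf{s}_i^T P_j + d_i) - y_{ij} \right)^2 \right]$$
Subject to $\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1$, $\mathbf{r}_i^T \mathbf{s}_i = 0$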

5 Eliminate translation
We can eliminate translation by representing the location of each point relative to the centroid of all $p$ points:
Assume without loss of generality that the centroid of $P_1, \ldots, P_p$ coincides with the origin $\mathbf{0} \in \mathbb{R}^3$
Translate each image point by setting
$$x_{ij} \leftarrow x_{ij} - \bar{x}_i, \qquad y_{ij} \leftarrow y_{ij} - \bar{y}_i$$
$(\bar{x}_i, \bar{y}_i)$ denotes the centroid of the points $(x_{ij}, y_{ij})$, $j = 1, \ldots, p$, in frame $i$
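
The centering step can be sketched in a few lines of NumPy. This is a minimal illustration, assuming the tracks are stored as two arrays of shape (f, p) with one row per frame; the function name `center_tracks` and this layout are assumptions made for the example, not something the slides specify.

```python
import numpy as np

def center_tracks(x, y):
    """Subtract each frame's centroid from its tracked points.

    x, y: arrays of shape (f, p); row i is frame i, column j is point j
    (an assumed layout). Returns the centered coordinates and the centroids.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar = x.mean(axis=1, keepdims=True)  # per-frame centroid, shape (f, 1)
    y_bar = y.mean(axis=1, keepdims=True)
    return x - x_bar, y - y_bar, x_bar, y_bar
```

After centering, every row of the returned coordinate arrays sums to zero, which is what removes the per-frame offsets from the objective.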

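6 Objective (w/o translation)
Find $\mathbf{r}_i, \mathbf{s}_i \in \mathbb{R}^3$ that minimize (with $x_{ij}, y_{ij}$ now centered)
$$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( \mathbf{r}_i^T P_j - x_{ij} \right)^2 + \left( \mathbf{s}_i^T P_j - y_{ij} \right)^2 \right]$$
Subject to $\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1$, $\mathbf{r}_i^T \mathbf{s}_i = 0$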

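7 Measurement matrix
$$M = \begin{bmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & & \vdots \\ x_{f1} & \cdots & x_{fp} \\ y_{11} & \cdots & y_{1p} \\ \vdots & & \vdots \\ y_{f1} & \cdots & y_{fp} \end{bmatrix}_{2f \times p}$$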

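8 Transformation and shape matrices
$$T = \begin{bmatrix} \mathbf{r}_1^T \\ \vdots \\ \mathbf{r}_f^T \\ \mathbf{s}_1^T \\ \vdots \\ \mathbf{s}_f^T \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ \vdots & \vdots & \vdots \\ r_{f1} & r_{f2} & r_{f3} \\ s_{11} & s_{12} & s_{13} \\ \vdots & \vdots & \vdots \\ s_{f1} & s_{f2} & s_{f3} \end{bmatrix}_{2f \times 3}$$
$$S = \begin{bmatrix} X_1 & X_2 & \cdots & X_p \\ Y_1 & Y_2 & \cdots & Y_p \\ Z_1 & Z_2 & \cdots & Z_p \end{bmatrix}_{3 \times p}$$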

9 Objective: matrix notation
Find $T$ and $S$ that minimize $\|M - TS\|_F$
Subject to $\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1$, $\mathbf{r}_i^T \mathbf{s}_i = 0$
$M$ is $2f \times p$, $T$ is $2f \times 3$, $S$ is $3 \times p$
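
As a rough illustration of the matrix form, the sketch below stacks centered coordinates into a measurement matrix and evaluates the Frobenius residual for a candidate factorization. The helper names `measurement_matrix` and `factorization_residual`, and the (f, p) input layout, are assumptions made for this example.

```python
import numpy as np

def measurement_matrix(x, y):
    """Stack the f rows of x on top of the f rows of y to get a 2f x p matrix."""
    return np.vstack([np.asarray(x, dtype=float), np.asarray(y, dtype=float)])

def factorization_residual(M, T, S):
    """Frobenius norm of M - T S, the quantity minimized in matrix notation."""
    return np.linalg.norm(M - T @ S, ord="fro")
```

For noise-free orthographic data with a centered shape, the residual evaluated at the true motion and shape matrices is zero up to rounding.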

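10 $M = TS + \text{Noise}$
$$\begin{bmatrix} x_{11} & \cdots & x_{1p} \\ \vdots & & \vdots \\ x_{f1} & \cdots & x_{fp} \\ y_{11} & \cdots & y_{1p} \\ \vdots & & \vdots \\ y_{f1} & \cdots & y_{fp} \end{bmatrix}_{2f \times p} = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ \vdots & \vdots & \vdots \\ r_{f1} & r_{f2} & r_{f3} \\ s_{11} & s_{12} & s_{13} \\ \vdots & \vdots & \vdots \\ s_{f1} & s_{f2} & s_{f3} \end{bmatrix}_{2f \times 3} \begin{bmatrix} X_1 & \cdots & X_p \\ Y_1 & \cdots & Y_p \\ Z_1 & \cdots & Z_p \end{bmatrix}_{3 \times p} + \text{Noise}$$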

11 TK-Factorization
$M = TS + \text{Noise}$
Step 1: find rank 3 approximation to $M$ using SVD
$M = U \Sigma V^T$ where
$U$ is $2f \times 2f$, $U^T U = I$
$\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots)$, size $2f \times p$, and $\sigma_1 \ge \sigma_2 \ge \ldots \ge 0$
$V$ is $p \times p$, $V^T V = I$

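12 TK-Factorization
$\hat{M} = U \Sigma_3 V^T$ where $\Sigma_3 = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3, 0, 0, \ldots)$
Note: this is a relaxation, only noise components outside the 3D space are annihilated
Step 2: factorization
$$\hat{T} = U \sqrt{\Sigma_3}, \qquad \hat{S} = \sqrt{\Sigma_3}\, V^T$$
(keeping only the three non-zero columns of $\hat{T}$ and the three non-zero rows of $\hat{S}$, so that $\hat{T}$ is $2f \times 3$ and $\hat{S}$ is $3 \times p$)
Ambiguity: $\hat{M} = (\hat{T} A)(A^{-1} \hat{S})$ for any non-singular $3 \times 3$ matrix $A$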

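Steps 1 and 2 can be sketched with `numpy.linalg.svd`. This is a minimal example, assuming $M$ has at least three rows and columns; the name `rank3_factorize` is made up for the illustration, and splitting the singular values evenly via square roots is one conventional choice rather than the only valid one.

```python
import numpy as np

def rank3_factorize(M):
    """Rank-3 truncation of M via SVD, split as T_hat @ S_hat.

    Returns T_hat (2f x 3) and S_hat (3 x p). The pair is only defined up to
    an invertible 3x3 matrix A: (T_hat @ A, inv(A) @ S_hat) fits equally well.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    sqrt_s3 = np.sqrt(s[:3])               # square roots of the top 3 singular values
    T_hat = U[:, :3] * sqrt_s3             # scale the first three columns of U
    S_hat = sqrt_s3[:, None] * Vt[:3, :]   # scale the first three rows of V^T
    return T_hat, S_hat
```

When $M$ is exactly rank 3, the product of the two factors reproduces $M$; with noise it gives the closest rank-3 matrix in the Frobenius norm.
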
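13 TK-Factorization
Step 3: resolve ambiguity using the constraints $\|\mathbf{r}_i\| = \|\mathbf{s}_i\| = 1$, $\mathbf{r}_i^T \mathbf{s}_i = 0$
Let $R_i = \begin{bmatrix} \mathbf{r}_i^T \\ \mathbf{s}_i^T \end{bmatrix}_{2 \times 3}$, note that $R_i R_i^T = I$
Let $\hat{T}_i = \begin{bmatrix} \hat{\mathbf{r}}_i^T \\ \hat{\mathbf{s}}_i^T \end{bmatrix}_{2 \times 3}$ be the corresponding rows in $\hat{T}$, then $R_i = \hat{T}_i A$
Find a $3 \times 3$ symmetric matrix $A A^T$ such that
$$\hat{T}_i A A^T \hat{T}_i^T = R_i R_i^T = I$$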

14 TK-Factorization
$$\hat{T}_i A A^T \hat{T}_i^T = R_i R_i^T = I$$
Equation is linear in $A A^T$
There are $3f$ equations in 6 unknowns
Find $A$ by eigen-decomposition: $A A^T = W \Delta W^T$ so that $A = W \sqrt{\Delta}$
Solution is obtained up to a rotation ambiguity: $\hat{T}_i (AB)(B^T A^T) \hat{T}_i^T = I$ for any $B$ such that $B B^T = I$

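The linear system for $A A^T$ can be written out explicitly. The sketch below is one way to do it, assuming $\hat{T}$ stores the $f$ rows $\hat{\mathbf{r}}_i^T$ first and the $f$ rows $\hat{\mathbf{s}}_i^T$ after them, as in the definition of $T$. The names `metric_upgrade` and `_quad_row` are hypothetical, and clipping negative eigenvalues to zero is an assumption added to cope with noise, not something stated in the slides.

```python
import numpy as np

def _quad_row(a, b):
    """Coefficients of a^T Q b in the 6 unknowns (Q11, Q12, Q13, Q22, Q23, Q33)."""
    return np.array([a[0] * b[0],
                     a[0] * b[1] + a[1] * b[0],
                     a[0] * b[2] + a[2] * b[0],
                     a[1] * b[1],
                     a[1] * b[2] + a[2] * b[1],
                     a[2] * b[2]])

def metric_upgrade(T_hat):
    """Estimate A (up to an orthogonal matrix) so that T_hat @ A has orthonormal row pairs."""
    f = T_hat.shape[0] // 2
    rows, rhs = [], []
    for r, s in zip(T_hat[:f], T_hat[f:]):
        rows += [_quad_row(r, r), _quad_row(s, s), _quad_row(r, s)]
        rhs += [1.0, 1.0, 0.0]             # |r| = 1, |s| = 1, r . s = 0
    q, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    Q = np.array([[q[0], q[1], q[2]],
                  [q[1], q[3], q[4]],
                  [q[2], q[4], q[5]]])      # Q plays the role of A A^T
    w, W = np.linalg.eigh(Q)                # Q = W diag(w) W^T
    w = np.clip(w, 0.0, None)               # assumption: drop negative eigenvalues from noise
    return W @ np.diag(np.sqrt(w))
```

The returned matrix satisfies the orthonormality constraints but is not unique: multiplying it on the right by any orthogonal $B$ gives another valid solution, which is the rotation and reflection ambiguity noted above.
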
15 TK-Factorization: Summary
Eliminate translation, construct $M$
$\mathrm{SVD}(M)$ to get rank 3 $\hat{M}$ and factorize $\hat{M} = \hat{T} \hat{S}$ ($3 \times 3$ ambiguity $A$ remains)
Resolve ambiguity: estimate $A A^T$ from orthonormality and factorize to obtain $A$
Solution up to rotation and reflection

16 Incomplete tracks
Tracks are often incomplete
Factorization with missing data
Rank is difficult to enforce
Surrogate: minimize the nuclear norm (the sum of singular values), $\sigma_1 + \sigma_2 + \sigma_3 + \ldots$
Nuclear norm is convex, minimization often achieves low rank
Accurate reconstruction usually requires accounting for perspective distortion
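
To make the surrogate concrete, the sketch below computes the nuclear norm and applies singular-value soft-thresholding, a standard building block in many nuclear-norm solvers. It is not a full missing-data factorization, and the slides do not say which solver is used; the names `nuclear_norm` and `soft_threshold_singular_values` are chosen for this example.

```python
import numpy as np

def nuclear_norm(M):
    """Sum of the singular values of M."""
    return np.linalg.svd(M, compute_uv=False).sum()

def soft_threshold_singular_values(M, tau):
    """Shrink every singular value of M by tau, clamping at zero.

    This is the proximal operator of tau * nuclear norm; small singular
    values are zeroed, which is how the convex surrogate encourages low rank.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```

For example, thresholding a matrix with singular values 5, 2, 1 at tau = 1.5 leaves singular values 3.5 and 0.5, so the rank drops from 3 to 2.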

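17 Perspective projection
A point $P = (X, Y, Z)$ is projected to (here $f$ is the focal length, not the number of frames)
$$x = \frac{fX}{Z}, \qquad y = \frac{fY}{Z}$$
A point rotated by $R$ and translated by $\mathbf{t}$ projects to
$$x = \frac{f(\mathbf{r}_1^T P + t_x)}{\mathbf{r}_3^T P + t_z}, \qquad y = \frac{f(\mathbf{r}_2^T P + t_y)}{\mathbf{r}_3^T P + t_z}$$
$\mathbf{r}_i^T$ denotes the rows of $R$
We call $C = K[R, \mathbf{t}]_{3 \times 4}$ a camera matrix
$K$ calibration matrix, $R$ camera orientation, $\mathbf{t}$ camera location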
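
The projection formula translates directly into code. This is a small sketch, assuming points are stored as rows of an (n, 3) array and that all points lie in front of the camera (positive depth); the function name `project` and the parameter name `focal` are assumptions made for the example.

```python
import numpy as np

def project(P, R, t, focal=1.0):
    """Perspective projection of 3D points after rotating by R and translating by t.

    P: (n, 3) array of points, R: (3, 3) rotation, t: (3,) translation.
    Returns an (n, 2) array of image coordinates (x, y).
    """
    P = np.atleast_2d(np.asarray(P, dtype=float))
    Pc = P @ R.T + t                       # row k is R @ P[k] + t
    return focal * Pc[:, :2] / Pc[:, 2:3]  # divide x and y by the depth
```

With `R` the identity, `t` zero, `focal=3` and the point (2, 4, 2), the projected coordinates are (3, 6).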

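18 Bundle adjustment
Given $p$ points in $f$ frames, $(x_{ij}, y_{ij})$, find camera matrices $C_i$ and positions $P_j$ ($j = 1, \ldots, p$) that minimize
$$\sum_{i=1}^{f} \sum_{j=1}^{p} \left[ \left( \frac{f(\mathbf{r}_{i1}^T P_j + t_x)}{\mathbf{r}_{i3}^T P_j + t_z} - x_{ij} \right)^2 + \left( \frac{f(\mathbf{r}_{i2}^T P_j + t_y)}{\mathbf{r}_{i3}^T P_j + t_z} - y_{ij} \right)^2 \right]$$
where $(t_x, t_y, t_z)$ are the components of $\mathbf{t}_i$, and $f$ inside the sum is the focal length
Alternate optimization:
Given $R_i$ and $\mathbf{t}_i$, solve for $P_j$
Given $P_j$, solve for $R_i$ and $\mathbf{t}_i$
Very good initial guess is required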
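
The sketch below evaluates the reprojection cost and illustrates one half of the alternation: with the cameras fixed, each point can be estimated from a linear system obtained by multiplying the projection equations through by the depth. Note that this linear step minimizes an algebraic error rather than the reprojection error itself, so it is an approximation chosen for simplicity. The function names and array layouts are assumptions made for the example.

```python
import numpy as np

def reprojection_cost(Rs, ts, P, x, y, focal=1.0):
    """Sum of squared reprojection errors.

    Rs: (F, 3, 3) rotations, ts: (F, 3) translations, P: (p, 3) points,
    x, y: (F, p) observed image coordinates.
    """
    cost = 0.0
    for R, t, xi, yi in zip(Rs, ts, x, y):
        Pc = P @ R.T + t
        cost += np.sum((focal * Pc[:, 0] / Pc[:, 2] - xi) ** 2
                       + (focal * Pc[:, 1] / Pc[:, 2] - yi) ** 2)
    return cost

def triangulate_point(Rs, ts, xj, yj, focal=1.0):
    """Linear estimate of one point from its observations (xj[i], yj[i]) in each frame.

    Uses x (r3.P + tz) = focal (r1.P + tx), and likewise for y, which is linear in P.
    """
    rows, rhs = [], []
    for R, t, u, v in zip(Rs, ts, xj, yj):
        rows.append(u * R[2] - focal * R[0])
        rhs.append(focal * t[0] - u * t[2])
        rows.append(v * R[2] - focal * R[1])
        rhs.append(focal * t[1] - v * t[2])
    P, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
    return P
```

On noise-free observations from two or more cameras with distinct centers, the linear estimate coincides with the true point and the reprojection cost is zero up to rounding.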

19 Bundler (photo-tourism)
(Snavely et al.)

20 Bundler (photo-tourism)
Given images, identify feature points, describe them with SIFTs
Match SIFTs, accept each match $p_i \leftrightarrow p_j$ whose score is at least twice that of any other match $p_i \leftrightarrow p_k$
For every pair of images with sufficiently many matches use RANSAC to recover essential matrices
Starting with two images and adding one image at a time: use essential matrix to recover depth and apply bundle adjustment
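
The acceptance rule for matches can be sketched as a ratio test on descriptor distances. This is an illustrative version only: it assumes the "score" is measured by Euclidean distance between descriptors (smaller is better), so "twice as good" becomes "best distance at most half of the second-best distance". The function name `ratio_test_matches` and the brute-force distance computation are assumptions, not Bundler's actual implementation.

```python
import numpy as np

def ratio_test_matches(desc_a, desc_b, ratio=0.5):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b.

    desc_a: (n, d) array, desc_b: (m, d) array with m >= 2.
    A match (i, j) is kept only if the best distance is at most
    `ratio` times the second-best distance.
    """
    desc_a = np.asarray(desc_a, dtype=float)
    desc_b = np.asarray(desc_b, dtype=float)
    # Pairwise Euclidean distances, shape (n, m).
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    best, second = order[:, 0], order[:, 1]
    idx = np.arange(len(desc_a))
    keep = d[idx, best] <= ratio * d[idx, second]
    return [(int(i), int(best[i])) for i in idx[keep]]
```

Descriptors whose two nearest neighbors are almost equally close are rejected as ambiguous, which is the purpose of the rule.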

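21 Simultaneous solutions
$E_{ij}$: essential matrix between $I_i$ and $I_j$, $i, j = 1, \ldots, f$
$E_{ij} = [\mathbf{t}_{ij}]_\times R_{ij}$ (on a subset of image pairs)
Objective: recover camera orientation $R_i$ and location $\mathbf{t}_i$ relative to a global coordinate system
$$\min_{\{R_i\}} \sum_{(i,j)} \left\| R_{ij} - R_i R_j^T \right\|_F$$
where the sum runs over the image pairs for which $E_{ij}$ is available
This can be solved in various ways, for example
$$\min_{\{R_i\}} \sum_{(i,j)} \left\| R_{ij} R_j - R_i \right\|_F$$
which has a least squares solution if we ignore the orthonormality constraints for $R_i$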

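One possible reading of the relaxed problem is sketched below: treat the entries of every $R_i$ as free unknowns, solve the linear equations $R_{ij} R_j - R_i = 0$ in the least squares sense, and then project each $3 \times 3$ block back to the nearest rotation. Fixing the first camera to the identity to remove the global rotation, squaring the residuals, and the SVD projection step are all assumptions of this example; the names `relaxed_rotations` and `nearest_rotation` are hypothetical.

```python
import numpy as np

def nearest_rotation(X):
    """Closest rotation matrix to X in the Frobenius norm, via SVD."""
    U, _, Vt = np.linalg.svd(X)
    D = np.diag([1.0, 1.0, np.linalg.det(U @ Vt)])  # keep determinant +1
    return U @ D @ Vt

def relaxed_rotations(n, rel):
    """Estimate n global rotations from relative ones.

    rel: dict mapping (i, j), i != j, to R_ij, assumed to satisfy R_ij ~ R_i R_j^T.
    The gauge is fixed by asking for R_0 = I.
    """
    rows, rhs = [], []
    for (i, j), R_ij in rel.items():
        block = np.zeros((3, 3 * n))
        block[:, 3 * j:3 * j + 3] = R_ij           # + R_ij R_j
        block[:, 3 * i:3 * i + 3] -= np.eye(3)     # - R_i
        rows.append(block)
        rhs.append(np.zeros((3, 3)))
    gauge = np.zeros((3, 3 * n))
    gauge[:, :3] = np.eye(3)                       # R_0 = I
    rows.append(gauge)
    rhs.append(np.eye(3))
    X, *_ = np.linalg.lstsq(np.vstack(rows), np.vstack(rhs), rcond=None)
    return [nearest_rotation(X[3 * i:3 * i + 3]) for i in range(n)]
```

With exact relative rotations on a connected set of image pairs, this recovers each orientation relative to camera 0. With noisy inputs it is only a rough initializer.
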
22 Essential in global coordinates
Corresponding points, $p$ and $q$, satisfy the following relation
$$p^T R_i^T \left( [\mathbf{t}_i]_\times - [\mathbf{t}_j]_\times \right) R_j\, q = 0$$
This generalizes the formula for the essential matrix (plug in $R_i = I$, $\mathbf{t}_i = \mathbf{0}$)
Once camera orientations $R_i$ are known we can solve for camera locations
Solution suffers from shrinkage problems
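
The relation can be checked numerically. The sketch below assumes one particular convention, which the slides do not spell out: a world point $P$ is seen by camera $i$ as the ray $p = R_i^T (P - \mathbf{t}_i)$, with $\mathbf{t}_i$ the camera location. The helper names `skew` and `global_epipolar_residual` are made up for the example.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def global_epipolar_residual(p, q, R_i, t_i, R_j, t_j):
    """Left-hand side of p^T R_i^T ([t_i]_x - [t_j]_x) R_j q."""
    return float(p @ R_i.T @ (skew(t_i) - skew(t_j)) @ R_j @ q)
```

Under the stated convention the residual vanishes for true correspondences, because the two viewing rays and the baseline between the camera locations are coplanar.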

23 Reconstruction example

