Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.


1 Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington

2 Static Gestures (Hand Poses). Given a hand model and a single image of a hand, estimate: –3D hand shape (joint angles). –3D hand orientation. [Figure: input image; articulated hand model with joints labeled.]

4 Goal: Hand Tracking Initialization. Given the 3D hand pose in the previous frame, a tracker estimates the pose in the current frame. –Problem: no good way to automatically initialize a tracker. Trackers: Rehg et al. (1995), Heap et al. (1996), Shimada et al. (2001), Wu et al. (2001), Stenger et al. (2001), Lu et al. (2003), …

6 Assumptions in Our Approach. A few tens of distinct hand shapes. –All 3D orientations should be allowed. –Motivation: American Sign Language. Input: a single image and the bounding box of the hand.

7 Assumptions in Our Approach. We do not assume precise segmentation! –No clean contour extracted. [Figure: input image, skin detection, segmented hand.]

8 Approach: Database Search. The database holds over 100,000 computer-generated images. –The hand pose of each database image is known. [Figure: input.]

9 Why? We avoid direct estimation of 3D information. –With a database, we only match 2D to 2D. We can find all plausible estimates. –Hand pose is often ambiguous. [Figure: input.]

10 Building the Database. 26 hand shapes.

11 Building the Database. 4128 images are generated for each hand shape. Total: 26 × 4128 = 107,328 images.

12 Features: Edge Pixels. We use edge images. –Easy to extract. –Stable under illumination changes. [Figure: input, edge image.]

13 Chamfer Distance. Overlaying input and model: how far apart are they? [Figure: input, model.]

16 Chamfer Distance. Input: two sets of points. –red, green. Directed chamfer distance c(red, green): –Average distance from each red point to the nearest green point. Directed chamfer distance c(green, red): –Average distance from each green point to the nearest red point. Chamfer distance: C(red, green) = c(red, green) + c(green, red).
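The directed and symmetric chamfer distances defined on this slide can be sketched in a few lines of Python. This is a minimal brute-force illustration, not the implementation used in the lecture: the function names `directed_chamfer` and `chamfer` and the toy point sets are assumptions made here, and points are taken to be (x, y) pixel coordinates of edge pixels.

```python
import math

def directed_chamfer(src, dst):
    """c(src, dst): average distance from each src point to its nearest dst point."""
    if not src or not dst:
        raise ValueError("both point sets must be non-empty")
    total = 0.0
    for p in src:
        # Brute-force nearest neighbour: scan every point in dst.
        total += min(math.dist(p, q) for q in dst)
    return total / len(src)

def chamfer(a, b):
    """C(a, b) = c(a, b) + c(b, a), the symmetric chamfer distance."""
    return directed_chamfer(a, b) + directed_chamfer(b, a)

# Toy example (made-up points, not data from the slides).
red = [(0, 0), (4, 0)]
green = [(0, 3)]
print(directed_chamfer(red, green))  # (3 + 5) / 2 = 4.0
print(directed_chamfer(green, red))  # 3.0
print(chamfer(red, green))           # 7.0
```

Note that the two directed distances generally differ, which is why the symmetric version adds both. The nested scan here costs time proportional to the product of the two set sizes; it is meant only to make the definition concrete.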

17 Evaluating Retrieval Accuracy. A database image is a correct match for the input if: –the hand shapes are the same, –the 3D hand orientations differ by at most 30 degrees. [Figure: input, correct matches, incorrect matches.]
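One way to make the correct-match test concrete is sketched below. The slides do not say how the orientation difference is measured, so this sketch assumes each orientation is stored as a 3x3 rotation matrix and takes the difference to be the angle of the relative rotation. The record layout (`"shape"`, `"rotation"`), the helper `rot_z`, and all function names are hypothetical.

```python
import math

def rotation_angle_deg(r1, r2):
    """Angle in degrees of the relative rotation between two 3x3 rotation matrices."""
    # trace(r1^T r2) equals the sum of element-wise products of r1 and r2.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against round-off before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def is_correct_match(query, candidate, max_angle_deg=30.0):
    """Same hand shape, and orientations within max_angle_deg of each other."""
    same_shape = query["shape"] == candidate["shape"]
    angle = rotation_angle_deg(query["rotation"], candidate["rotation"])
    return same_shape and angle <= max_angle_deg

def rot_z(deg):
    """Hypothetical helper: rotation about the z axis by deg degrees."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

query = {"shape": "A", "rotation": rot_z(0)}
print(is_correct_match(query, {"shape": "A", "rotation": rot_z(20)}))  # True
print(is_correct_match(query, {"shape": "A", "rotation": rot_z(45)}))  # False
print(is_correct_match(query, {"shape": "B", "rotation": rot_z(0)}))   # False
```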

18 Evaluating Retrieval Accuracy. An input image has 25-35 correct matches among the 107,328 database images. –Ground truth for input images is estimated by humans. [Figure: input, correct matches, incorrect matches.]

19 Evaluating Retrieval Accuracy. Retrieval accuracy measure: what is the rank of the highest-ranking correct match? [Figure: input, correct matches, incorrect matches.]

20 Evaluating Retrieval Accuracy. [Figure: input and the database images at ranks 1 through 6, …, with the highest-ranking correct match marked.]
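The accuracy measure from these slides, the rank of the highest-ranking correct match, can be sketched as follows. The function name, the use of list indices as image identifiers, and the toy distances are assumptions for illustration only.

```python
def rank_of_best_correct_match(distances, correct_ids):
    """Return the 1-based rank of the best-ranked correct match, or None.

    distances[i] is the distance from the input to database image i;
    correct_ids is the set of indices judged to be correct matches.
    """
    # Rank database images from most similar (smallest distance) to least.
    ranked = sorted(range(len(distances)), key=lambda i: distances[i])
    for rank, idx in enumerate(ranked, start=1):
        if idx in correct_ids:
            return rank
    return None  # no correct match exists in the database

# Toy example: image 1 is nearest but incorrect; image 2 is the best correct one.
distances = [0.9, 0.2, 0.5, 0.7]
print(rank_of_best_correct_match(distances, {0, 2}))  # 2
```

A rank of 1 means the top retrieved image was already correct; larger values mean a user or a later verification step would have to look further down the list.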

22 Results on 703 Real Hand Images. Rank of highest-ranking correct match vs. percentage of test images: rank 1: 15%; ranks 1-10: 40%; ranks 1-100: 73%. Results are better on “nicer” images: –Dark background. –Frontal view. –For half of these images, the top match was correct.
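A table of this kind can be produced from per-image ranks by counting how many test images have their best correct match within each cutoff. The sketch below assumes a hypothetical function name and uses made-up ranks, not the 703 results reported on the slide.

```python
def percent_within(ranks, cutoffs=(1, 10, 100)):
    """Percentage of test images whose best correct match has rank <= each cutoff."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    n = len(ranks)
    return {k: 100.0 * sum(1 for r in ranks if r <= k) / n for k in cutoffs}

# Made-up ranks for ten test images (illustration only).
ranks = [1, 3, 12, 644, 1, 33, 7, 150, 2, 90]
print(percent_within(ranks))  # {1: 20.0, 10: 50.0, 100: 80.0}
```

The percentages are cumulative, so each row of the table includes the images counted in the rows above it.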

23 Examples. [Figure: initial image, segmented hand, edge image, correct match (rank: 1).]

24 Examples. [Figure: initial image, segmented hand, edge image, correct match (rank: 644).]

25 Examples. [Figure: initial image, segmented hand, edge image, incorrect match (rank: 1).]

26 Examples. [Figure: initial image, segmented hand, edge image, correct match (rank: 1).]

27 Examples. [Figure: initial image, segmented hand, edge image, correct match (rank: 33).]

28 Examples. [Figure: initial image, segmented hand, edge image, incorrect match (rank: 1).]

29 Examples. [Figure: a “hard” case and an “easy” case, each shown as segmented hand and edge image.]

30 Research Directions. More accurate similarity measures. Better tolerance to segmentation errors. –Clutter. –Incorrect scale and translation. Verifying top matches. Registration.

31 Efficiency of the Chamfer Distance. Computing chamfer distances is slow. –For images with d edge pixels, O(d log d) time. –Comparing the input to the entire database takes over 4 minutes. Must measure 107,328 distances. [Figure: input, model.]
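The figures quoted in the slides allow a rough back-of-the-envelope estimate of the cost per distance. The sketch below treats "over 4 minutes" as exactly 240 seconds, so the result is only a lower bound, and it assumes the whole time is spent on distance computations; the variable names are illustrative.

```python
num_shapes = 26            # hand shapes in the database
images_per_shape = 4128    # rendered images per hand shape
database_size = num_shapes * images_per_shape

total_seconds = 4 * 60     # "over 4 minutes", taken as a lower bound
ms_per_distance = 1000.0 * total_seconds / database_size

print(database_size)              # 107328
print(round(ms_per_distance, 2))  # 2.24 (milliseconds, at least)
```

Each individual distance is cheap, so the total cost comes from the sheer number of distances that brute-force search has to measure.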

32 The Nearest Neighbor Problem. [Figure: database.]

33 The Nearest Neighbor Problem. Goal: –find the k nearest neighbors of query q. [Figure: query, database.]

34 The Nearest Neighbor Problem. Goal: –find the k nearest neighbors of query q. Brute-force time is linear in: –n (size of database). –the time it takes to measure a single distance. [Figure: query, database.]
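Brute-force k-nearest-neighbor search, as described on this slide, measures the distance from the query to every database item and keeps the k smallest. The sketch below is a generic illustration with a hypothetical function name; the distance function is passed in as a parameter, so a chamfer distance could be plugged in, and the toy example uses absolute difference on numbers instead.

```python
import heapq

def k_nearest_neighbors(query, database, distance, k):
    """Return the k (distance, index) pairs closest to query, nearest first.

    Calls `distance` exactly once per database item, so the running time
    grows linearly with both the database size n and the cost of one distance.
    """
    scored = ((distance(query, item), idx) for idx, item in enumerate(database))
    return heapq.nsmallest(k, scored)

# Toy example with 1-D "images" and absolute difference as the distance.
database = [10, 2, 7, 3]
print(k_nearest_neighbors(4, database, lambda a, b: abs(a - b), k=2))
# [(1, 3), (2, 1)]
```

Because every query touches all n items, speeding up retrieval means either making a single distance cheaper or avoiding most of the n comparisons.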


