Cameras and Vision: Giving Your Games Sight Antonio Haro Nokia Research Center Computer Graphics and Vision Group.


Presentation on theme: "Cameras and Vision: Giving Your Games Sight Antonio Haro Nokia Research Center Computer Graphics and Vision Group."— Presentation transcript:

1 Cameras and Vision: Giving Your Games Sight Antonio Haro Nokia Research Center Computer Graphics and Vision Group

2 Cameras and Vision: Giving Your Games Sight Antonio Haro Nokia Research Center Computer Graphics and Vision Group Crash Course

3 Current Mobile Challenges (Non-technical)  Carrier closed camera APIs  Depends on hardware/company/country  APIs can be poorly documented  Biggest challenge: portability  Camera APIs will be different  Most are still changing

4 Current Mobile Challenges (Non-technical)  Chicken and egg problem  Situation should improve soon – demand for imaging/camera applications is increasing  Third-party development is key

5 Current Mobile Challenges  Limited computation  Lens quality  Limited frame rate  Imaging processors  Lack of floating point (for now)  Adaptive exposure (for now)

6 Current Mobile Challenges  Limited computation  Lens quality  Limited frame rate  Imaging processors  Lack of floating point (for now)  Adaptive exposure (for now) = Mid 80s – early 90s desktop hardware

7 Outline  Cameras & eyeballs  Motion & tracking  Gesture recognition  Case studies

8 1. Cameras & Eyeballs

9 Computer vision vs. Human vision © Harvard Whole Brain Atlas (http://www.med.harvard.edu/AANLIB/) CPU =

10 Computer vision vs. Human vision © Harvard Whole Brain Atlas (http://www.med.harvard.edu/AANLIB/) CPU =

11 Challenge: Find sign

12

13 Challenge 2: Find sign

14

15 Challenge 3: Find ‘P’

16

17 Challenge 3: Find ‘P’ (smaller) (20 x 23)

18 Challenge 4: Find ‘P’ (smaller)

19 ?

20 Challenge

21 Well, just look for blue

22 Pick “blue” What does “blue” mean?

23 Pick “blue” Ok, Adobe Photoshop to select color range

24 Pick “blue”

25 Picking “blue” Threshold 41

26 Picking “blue” Thresholds 41, 84

27 Picking “blue” Threshold

28 Picking “blue” Threshold

29 1. Colors change wildly - even in 20 seconds 2. Colors can change frame to frame

30 1. Colors change wildly - even in 20 seconds 2. Colors can change frame to frame

31 Well, just look at the edges

32 Edges - Canny

33 Edges - Sobel

34 Edges – Adobe Photoshop

35 Edges - Comparison Canny | Sobel | Adobe Photoshop

36 Edges - Comparison Canny | Sobel | Adobe Photoshop

37 Edges - Comparison Canny | Sobel | Adobe Photoshop

38 1. Edges are rarely connected 2. Edges change frame to frame

39 1. Edges are rarely connected 2. Edges change frame to frame

40 Q: Why are edges and colors unreliable?

41 Q: Why are edges and colors unreliable? A: Meet the enemy….

42 Q: Why are edges and colors unreliable? A: Meet the enemy…. NOISE

43 Noise?  Source of problems described (and more)

44 Noise?  Source of problems described (and more)  Affects colors, edges, motion, …, everything!

45 Noise?  Source of problems described (and more)  Affects colors, edges, motion, …, everything!  We can fight it!

46 But first…

47 How are images created?

48 Image formation Object

49 Image formation Object

50 Image formation Object

51 Image formation Object

52 Image formation Object

53 Image formation Object

54 Image formation Lens

55 Image formation Lens CCD

56 CCD Images One color per element More green (like eye) RGB pixels created from array (Figure from Wikipedia)
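
The "one color per element" idea can be sketched as follows. This is only an illustration: it assumes an RGGB mosaic layout (two green elements in every 2x2 cell) and the crudest possible reconstruction, one RGB pixel per cell. The function name `rggb_cells_to_rgb` and the layout are assumptions rather than anything stated on the slides, and real imaging chains interpolate far more carefully.

```python
def rggb_cells_to_rgb(raw):
    """Collapse a raw RGGB mosaic into one RGB pixel per 2x2 cell.

    raw is a list of rows with even width and height, laid out as
        R G
        G B
    repeated across the sensor (assumed layout).
    """
    rgb = []
    for y in range(0, len(raw), 2):
        row = []
        for x in range(0, len(raw[0]), 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) // 2  # average the two greens
            b = raw[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb


raw = [
    [200, 100, 10, 50],
    [120, 30, 70, 90],
]
print(rggb_cells_to_rgb(raw))  # [[(200, 110, 30), (10, 60, 90)]]
```

Each sensor element only ever measured one channel, so two of the three values in every output pixel are estimates. That is one place where noise enters before any game code sees the image.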

57 Image formation Lens CCD “Imaging” Image

58 CCD Images (Figures from Wikipedia) “Real” | CCD | RGB Image (possible)

59 CCD Images (Figures from Wikipedia) “Real” | CCD | RGB Image (possible)

60 Image noise sources  Bad lens

61 Image noise sources  Bad lens  Electronic noise (CCD)

62 Image noise sources  Bad lens  Electronic noise (CCD)  Imaging chain  White balance, correction: exposure, gamma, color, shading, geometrical, noise reduction, etc.

63 Imaging Chain Implementations Bad | Ok | Good

64 Also…  Amount of intra-frame processing varies per chain  What happens over seconds? Minutes? Hours?

65 Noise is always present Noise is unavoidable

66 Images are unstable

67 (unlike human vision)

68 2. Motion & Tracking

69 Tracking  Used in video for:  Determining motion of objects  Determining global motion of camera (Figure: frames along a time axis)

70 Tracking  Very complex algorithms possible  “Guts” composed of:  Image filtering  Thresholding  Statistics  Linear algebra

71 Tracking  Very complex algorithms possible  “Guts” composed of:  Image filtering  Thresholding  Statistics  Linear algebra

72 The Tracking Problem  Found something to track in frame n  Where is it in frame n + 1? (Figure: frame n | frame n + 1 with “?”)

73 Tracking  Many approaches to: Finding something to start tracking Finding it over and over

74 Tracking  Many approaches to: Finding something to start tracking Finding it over and over (for now)

75 Simplest tracking  Two algorithms, but there are many more:  Template Matching  Optical flow  Most games out now use one/both

76 Template matching  Template = an image region to track  Reliability issues, but speed is major advantage  Larger windows capture more motion, but more processing needed (Figure: frame n | frame n + 1 with N x N search window)

77 Template Matching Example (3x3) Template Image Best match here?

78 Template Matching Example (3x3) Template Image Best match here?

79 Template Matching Example (3x3) Template Image Best match here?

80 Template Matching Example (3x3) Template Image Best match here?

81 Template Matching Example (3x3) Template Image Best match here?

82 Template Matching Example (3x3) Template Image Best match here?

83 Template Matching Example (3x3) Template Image Best match here?

84 Template Matching Example (3x3) Template Image Best match here?

85 Template Matching Example (3x3) Template Image Best match here?

86 Template Matching Example (3x3) Template Image Best match here?

87 Matching criteria Template | Image location

88 Matching criteria Template | Image location

89 Matching criteria: SSD Template | Image location Sum of squared differences

90 Matching criteria: SSD Template (t) | Image location (i) (For grayscale image)

91 Matching criteria: SSD Template (t) | Image location (i) Sqrt is monotonic, so it does not change which match is smallest - remove it!

92 Matching criteria: SSD (nxn)

93 Matching criteria: Faster SSD (nxn) If non-changing template/image (may fail and mainly for small hoods)
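
The SSD criterion for a grayscale template can be sketched like this. It is a plain, unoptimized version (not the "faster SSD" variant from the slide). The function name `ssd` and the list-of-rows image representation are assumptions made for the example.

```python
def ssd(template, image, top, left):
    """Sum of squared differences between a grayscale template and the
    same-sized image region whose top-left corner is (top, left)."""
    total = 0
    for r, t_row in enumerate(template):
        for c, t in enumerate(t_row):
            d = t - image[top + r][left + c]
            total += d * d  # no square root: it would not change the ranking
    return total


template = [[1, 2], [3, 4]]
image = [
    [0, 0, 0],
    [0, 1, 2],
    [0, 3, 5],
]
print(ssd(template, image, 1, 1))  # 1  (one pixel differs by 1)
print(ssd(template, image, 0, 0))  # 23 (poor match)
```

Everything stays in integer arithmetic, which matters on devices without floating point.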

94 Template Matching for Tracking track this Frame n

95 Template Matching for Tracking Frame n

96 Template Matching for Tracking err1 Frame n + 1

97 Template Matching for Tracking err1 err2 Frame n + 1

98 Template Matching for Tracking err1 err2 err3 Frame n + 1

99 Template Matching for Tracking err1 err2 err3 err4 Frame n + 1

100 Template Matching for Tracking err1 err2 err3 err4 err5 Frame n + 1

101 Template Matching for Tracking err1 err2 err3 err4 err5 err6 Frame n + 1

102 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 Frame n + 1

103 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 err8 Frame n + 1

104 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 err8 err9 Frame n + 1

105 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 err8 err9 Frame n + 1 Choose min

106 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 err8 err9 Frame n + 1 Choose min

107 Template Matching for Tracking err1 err2 err3 err4 err5 err6 err7 err8 err9 Frame n + 1

108 Template Matching for Tracking Frame n + 2 Start again

109 Template Matching properties  Pros: Simple to implement; Good performance – if tuned  Cons: O(N^2 x #color channels) operations per feature!!; Need good things to track; Window size must match feature size and motion
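
The err1 … err9 walkthrough above amounts to the loop below: compute an error at every candidate position in the search window and keep the minimum. This is a sketch under assumptions. The names `ssd` and `track`, the `radius` parameter, and the list-of-rows grayscale frames are invented for the example, and a real tracker would also decide when the best error is too large to trust.

```python
def ssd(template, frame, top, left):
    """Sum of squared differences at one candidate position."""
    return sum(
        (t - frame[top + r][left + c]) ** 2
        for r, t_row in enumerate(template)
        for c, t in enumerate(t_row)
    )


def track(template, frame, prev_top, prev_left, radius):
    """Search +/- radius pixels around the previous position and return
    (top, left, error) for the candidate with the smallest SSD."""
    th, tw = len(template), len(template[0])
    fh, fw = len(frame), len(frame[0])
    best = None
    for top in range(max(0, prev_top - radius),
                     min(fh - th, prev_top + radius) + 1):
        for left in range(max(0, prev_left - radius),
                          min(fw - tw, prev_left + radius) + 1):
            err = ssd(template, frame, top, left)
            if best is None or err < best[2]:
                best = (top, left, err)
    return best


frame_n = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 6, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
frame_n1 = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 8, 0],
    [0, 0, 7, 6, 0],
    [0, 0, 0, 0, 0],
]
template = [row[1:3] for row in frame_n[1:3]]  # "track this" in frame n
print(track(template, frame_n1, 1, 1, 1))      # (2, 2, 0)
```

With `radius=1` the loop evaluates nine candidates, matching the nine errors on the slides. The cost grows with the square of the search radius, which is why the window size has to be tuned to the expected motion.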

110 Optical flow  Velocity field of pixels between 2 frames (all or some pixels)

111 Optical flow properties  Pros: More correct solutions (sometimes) Entire image can be used (patch-wise), instead of individual pixel hoods  Cons: Vector field may not be smooth (pixel disagreements) Brightness constancy assumption

112 Optical flow algorithms  Fast, low accuracy: Horn-Schunck, Camus  Slow, high accuracy: Lucas-Kanade, Black-Anandan

113 Optical flow algorithms  Fast, low accuracy: Horn-Schunck, Camus  Slow, high accuracy: Lucas-Kanade, Black-Anandan
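
To make the brightness constancy assumption concrete, here is a single-patch least-squares flow estimate in the style of Lucas-Kanade. It is a teaching sketch, not a full implementation: there are no pyramids or iterations, the function name `patch_flow` and its parameters are assumptions, and it uses floating point for clarity.

```python
def patch_flow(prev, curr, top, left, size):
    """Estimate one (u, v) displacement for a size x size patch.

    Solves the 2x2 least-squares system built from the brightness
    constancy equation  Ix*u + Iy*v = -It  over all pixels in the patch.
    u is motion along x (columns), v along y (rows), in pixels per frame.
    The patch must leave a one-pixel border for the derivatives.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(top, top + size):
        for x in range(left, left + size):
            ix = (prev[y][x + 1] - prev[y][x - 1]) / 2.0  # central differences
            iy = (prev[y + 1][x] - prev[y - 1][x]) / 2.0
            it = curr[y][x] - prev[y][x]                  # temporal difference
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:
        return None  # flat or edge-only patch: motion is ambiguous
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


# Smooth synthetic pattern, shifted one pixel to the right between frames.
prev = [[x * x + y * y for x in range(9)] for y in range(9)]
curr = [[(x - 1) ** 2 + y * y for x in range(9)] for y in range(9)]
u, v = patch_flow(prev, curr, 2, 2, 5)
print(round(u, 2), round(v, 2))  # close to (1, 0), not exact
```

The estimate is only approximate even on this clean input because the equation is a first-order linearization. On real frames, noise and exposure changes violate brightness constancy further.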

114 Tracking  Many approaches to: Finding something to start tracking Finding it over and over

115 Image filtering  Filters useful for:  Edges  Corners  Enhancing (blurring, sharpening)  Can be cascaded for complex effects  (Adobe Photoshop)  Useful for finding good things to track

116 Image filter  A filter is an array of numbers  Usually 3x3 or 5x5 (can be NxN)  Applying filter = convolution Filter

117 Convolution is…  Mathematically (deeply) related to Fourier Transform and DSP  A weighted average of a pixel’s neighbors

118 Convolution Filter f

119 Convolution Filter f

120 Convolution Filter f Pixel p

121 Convolution Filter f Pixel p’s neighborhood

122 Convolution Filter f Pixel p’s neighborhood Each corresponding pixel multiplied All products added

123 Convolution Filter f Pixel p’s neighborhood

124 Convolution Filter f Pixel p’s neighborhood Normalization Prevents under/overflow (Shift if pow 2)
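
The multiply, add and normalize steps can be sketched as below. The kernel is the standard 3x3 Gaussian whose weights sum to 16, so normalization is a right shift by 4 bits. The function name `filter3x3` and the decision to copy border pixels unchanged are assumptions made for the example.

```python
GAUSSIAN_3X3 = [
    [1, 2, 1],
    [2, 4, 2],
    [1, 2, 1],
]  # weights sum to 16 = 2**4


def filter3x3(image, kernel, shift):
    """Apply a 3x3 integer filter to a grayscale image (list of rows).

    Each output pixel is the weighted sum of its 3x3 neighborhood,
    normalized by a right shift of `shift` bits (divide by 2**shift).
    Border pixels are copied unchanged. The kernel is not flipped, which
    makes no difference for symmetric kernels like the Gaussian.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
            out[y][x] = acc >> shift
    return out


spike = [
    [0, 0, 0],
    [0, 160, 0],
    [0, 0, 0],
]
print(filter3x3(spike, GAUSSIAN_3X3, 4)[1][1])  # 40: the spike is spread out
```

Using a shift instead of a division keeps the inner loop in integer arithmetic, which suits hardware without floating point.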

125 Sample 3x3 filters: Gaussian (blur) (Figure: image and filter, normalized by 1/16)

126 Sample 3x3 filters: Sharpen (one way) (Figure: image and filter containing weight 16, normalized by 1/8)

127 Sample 3x3 filters: Gradient X-direction (Figure: image and normalized filter)

128 Sample 3x3 filters: Gradient Y-direction (Figure: image and normalized filter)

129 Edges: Sobel operator  Combine filters for more power  Sobel operator = sqrt(Gx^2 + Gy^2)
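
Combining the two gradient filters into an edge strength can be sketched as follows. The kernels are the standard Sobel pair, and the edge strength is the usual gradient magnitude, the square root of gx² + gy². The function names and the choice to leave border pixels at zero are assumptions made for the example.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]    # responds to vertical edges
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]    # responds to horizontal edges


def apply3x3(image, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centered on (y, x)."""
    return sum(
        kernel[ky][kx] * image[y + ky - 1][x + kx - 1]
        for ky in range(3)
        for kx in range(3)
    )


def sobel_magnitude(image):
    """Edge strength per pixel; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = apply3x3(image, SOBEL_X, y, x)
            gy = apply3x3(image, SOBEL_Y, y, x)
            out[y][x] = math.hypot(gx, gy)
    return out


step = [[0, 0, 100, 100] for _ in range(4)]   # vertical step edge
print(sobel_magnitude(step)[1][1])            # 400.0
```

On integer-only hardware, `abs(gx) + abs(gy)` is a common cheaper stand-in for the square root, at the cost of a slightly direction-dependent response.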

130 Thresholding  Used to select particular colors, range, etc.  Useful for speeding up processing

131 Thresholding  Difficult to select single number as threshold  Thresholds are almost always: Region varying Time varying

132 Adaptive Thresholding  Don’t just use a set single number  Different threshold calculated for each pixel – neighborhood-based

133 Adaptive Thresholding  Don’t just use a set single number  Different threshold calculated for each pixel – neighborhood-based  Much more robust since best thresh is calculated per pixel

134 Adaptive Thresholding 1) Find min, max

135 Adaptive Thresholding [min-max] 1) Find min, max 2) Threshold = (max + min)/2 - c [find best c for your use case]

136 Adaptive Thresholding [min-max] 1) Find min, max 2) Threshold = (max + min)/2 - c [find best c for your use case] OR

137 Adaptive Thresholding [mean] 1) Threshold = mean - c [find best c for your use case]
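
Both adaptive variants can be sketched in one function. The mean variant follows the slide (mean - c). For the min-max variant this sketch takes the threshold to be the midpoint of the local range, (min + max)/2 - c, which is an assumption about the intended formula. The function name, the `radius` parameter and the clipping of neighborhoods at the image border are also assumptions.

```python
def adaptive_threshold(image, radius, c, use_mean=True):
    """Binarize a grayscale image with a per-pixel threshold computed
    from the (2*radius+1) x (2*radius+1) neighborhood, clipped at borders.

    use_mean=True : threshold = local mean - c
    use_mean=False: threshold = (local min + local max) / 2 - c
    Returns 1 where the pixel is above its threshold, else 0.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            hood = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            if use_mean:
                t = sum(hood) / len(hood) - c
            else:
                t = (min(hood) + max(hood)) / 2 - c
            out[y][x] = 1 if image[y][x] > t else 0
    return out


# Brightness ramps up left to right; dark marks sit at columns 1 and 5.
# No single global threshold works: the mark at column 5 (70) is as
# bright as the background at column 2 (70).
row = [[50, 30, 70, 80, 90, 70, 110]]
print(adaptive_threshold(row, 1, 5))                  # [[1, 0, 1, 1, 1, 0, 1]]
print(adaptive_threshold(row, 1, 5, use_mean=False))  # [[1, 0, 1, 1, 1, 0, 1]]
```

The offset `c` keeps pixels in smooth regions, where a pixel roughly equals its local mean, on the background side rather than flickering between the two classes.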

138 3. Gesture Recognition

139 Gesture recognition  Gesture = a particular movement in front of the camera  In mobile case, motion of the camera

140 Gesture recognition  Gesture = a particular movement in front of the camera  In mobile case, motion of the camera  Can be a motion path (via tracking) [Graffiti]  Or, things like shaking of camera

141 Gesture recognition  Most techniques too intensive for current devices (e.g. Hidden Markov Models)

142 Gesture recognition  Lightweight recognition is possible, but for simpler gestures

143 Motion History Images - MHIs  Used in 90s to recognize sitting/waving/etc.  Very computationally efficient and compact [Davis & Bobick 97]

144 Motion History Images - MHIs  Used in 90s to recognize sitting/waving/etc.  Very computationally efficient and compact  Useful to recognize shaking, how fast camera is moving  Main idea: bin for each pixel with timer inside – timer is reset when pixel exceeds difference threshold [Davis & Bobick 97]

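145 Motion History Images (MHI) (Figure: frames n, n + 1, n + 2 and the resulting MHI) MHI(x, y, t) = tau if D(x, y, t) = 1; max(0, MHI(x, y, t - 1) - 1) otherwise, where D = thresholded image difference and tau = timer (parameter)

146 Motion History Images (MHI) (Figure: frames n, n + 1, n + 2 and the resulting MHI) Mobile device motion: None | Some | Much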
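
The per-pixel timer idea can be sketched as an in-place update run once per frame. The function name `update_mhi`, its parameters, and the returned count of moving pixels are assumptions made for the example. The reset-or-count-down rule follows the description above.

```python
def update_mhi(mhi, prev, curr, diff_threshold, tau):
    """Advance a motion history image by one frame, in place.

    A pixel whose absolute frame difference exceeds diff_threshold has
    its timer reset to tau; every other timer counts down toward 0.
    Returns the number of pixels that moved in this frame.
    """
    moving = 0
    for y in range(len(curr)):
        for x in range(len(curr[0])):
            if abs(curr[y][x] - prev[y][x]) > diff_threshold:
                mhi[y][x] = tau
                moving += 1
            else:
                mhi[y][x] = max(0, mhi[y][x] - 1)
    return moving


f0 = [[10, 10, 10]]
f1 = [[10, 90, 10]]
f2 = [[10, 90, 90]]
mhi = [[0, 0, 0]]
update_mhi(mhi, f0, f1, 20, 3)
print(mhi)  # [[0, 3, 0]]
update_mhi(mhi, f1, f2, 20, 3)
print(mhi)  # [[0, 2, 3]]  older motion fades, newest motion is brightest
```

A cheap shake or speed cue is the fraction of pixels with a non-zero timer: near zero when the device is still, and large when the whole view is moving.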

147 4. Case Studies

148 Case Studies  Mozzies (Siemens.com), Attack of the Killer Virus (Ojom.com)  Move device to aim at enemies  Optical flow based  Many clones  Flow is good because: Only non-exact tracking is needed; Motion = sprite translation  But… use tiny images for speed

149 Case Studies  Marble Revolution by bit-side GmbH  Also optical flow based  Motion mapped to game physics (bit-side.com)

150 Case Studies  AR Soccer – foot-based game  (1) Sobel filter for edges, (2) edge thinning, (3) line extraction  Done on interleaved frames (Pocket PC) [Stichling, Kleinjohann 02]

151 Case Studies [Haro et al. 05]  Edge based tracking as joypad pressing - aim  MHIs to detect shaking – map to button pressing: jumping, shooting, etc. (Sprites based on “Track & Field”, © Konami 1985)

152 Summary  Cameras & eyeballs  Motion & tracking  Gesture recognition  Case studies

153 Further reading Davies, “Machine Vision” (2005) [third edition] Deep overview of field and core algorithms

154 Further reading Jain, et al., “Machine Vision” (1995) Classic textbook, good introduction

155 Further reading Duda, et al., “Pattern Classification” (2001) [second edition] Essential for any work on recognition/classification/learning

156 Further reading - conferences IEEE International Conf. on Computer Vision; IEEE Computer Vision and Pattern Recognition (CVPR); IAPR International Conf. on Pattern Recognition; International Conf. on Computer Vision Theory and Applications; IEEE International Conf. on Image Processing; ACM SIGGRAPH; Eurographics; Vision, Modeling, and Visualization (VMV)

157 Publications  A. Haro, K. Mori, V. Setlur, T. Capin, "Mobile Camera-based Adaptive Viewing", 4th International Conference on Mobile Ubiquitous Multimedia, ACM MUM 2005, Christchurch, New Zealand December  V. Paelke, C. Reimann, D. Stichling, "Foot-based mobile Interaction with Games", ACM SIGCHI International Conference on Advances in Computer Entertainment Technology (ACE). Singapore, June 2004  J. Davis and A. Bobick, “The Representation and Recognition of Action Using Temporal Templates”, IEEE Conference on Computer Vision and Pattern Recognition, June 1997, pp  Others…

158 Software  For learning/prototyping algorithms:  Matlab  Scilab  Desktop PC testing:  Intel’s OpenCV  Mobile device development….

159 Plug: Nokia Computer Vision Library  Available from research.nokia.com in near future  Also available from Forum Nokia Pro (forum.nokia.com)  For Nokia Symbian OS devices  Core vision functionality – image/video processing

160 Plug: Nokia Computer Vision Library  Available from research.nokia.com in near future  Also available from Forum Nokia Pro (forum.nokia.com)  For Nokia Symbian OS devices  Core vision functionality – image/video processing Almost everything presented!

161 Thanks! Questions/comments?

162

