
Aerial Video Surveillance and Exploitation Roland Miezianko CIS 750 - Video Processing and Mining Prof. Latecki.


1 Aerial Video Surveillance and Exploitation Roland Miezianko CIS 750 - Video Processing and Mining Prof. Latecki

2 Agenda
Aerial Surveillance Comparisons
Technical Challenges and the Mission
Framework Ideas for Video Surveillance
– Alignment and Change Detection
– Mosaicing
– Tracking Moving Objects
– Geo-location
– Enhanced Visualization
Image Mosaics

3 Types of Aerial Surveillance
Using film and framing cameras
– High-resolution still images
– Examined by human or machine
Video captures dynamic events
– Used to detect and geo-locate moving objects in real time
– Follow detected motion
– Constantly monitor a site

4 Technical Challenges, 1
Video cameras have lower resolution than framing cameras
– Video uses a telephoto lens to get resolution high enough to identify objects
– A telephoto lens has a narrow field of view
– Provides a “soda straw” view of the scene [2]

5 Technical Challenges, 2
Camera must scan the region of interest to get the “full picture”
Objects of interest move in and out of the field of view
Difficulty in perceiving the relative locations of objects

6 Technical Challenges, 3
Manually tracking an object is challenging because of the camera’s small field of view
Video contains much more data than film frames; storage is expensive

7 The Mission
The new aerial surveillance systems must provide a framework for spatio-temporal aerial video analysis

8 Video Analysis Framework, 1
Frame-to-frame alignment and decomposition of video frames into motion layers
Mosaicing static background layers to form panoramas as compact representations of the static scene

9 Video Analysis Framework, 2
Detecting and tracking independently moving objects in the presence of background clutter
Geo-locating the video and tracked objects by registering them to controlled reference imagery, digital terrain maps, and models

10 Video Analysis Framework, 3
Enhanced visualization of the video by re-projecting and merging it with reference imagery, terrain, and maps to provide a larger context

11 Alignment and Change Detection, 1
Displacement of pixels between video frames may occur due to the following:
– Motion of the video sensor
– Independent motion of objects in the field of view
– Motion of the source of illumination

12 Alignment and Change Detection, 2
Global motion estimation
– Displacement of pixels due to the motion of the sensor is computed
Alignment of video frames
– Pyramid processing
– Lock onto the motion of the background scene
– Warp images into a common coordinate frame
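The coarse-to-fine idea behind pyramid processing can be sketched as follows. This is only a minimal sketch under strong assumptions, not the method of [2]: the global motion is taken to be a pure integer translation, circular shifts (`np.roll`) stand in for a true warp, and the pyramid is built by 2×2 block averaging. All function names are hypothetical.

```python
import numpy as np

def reduce2(img):
    """Halve resolution by averaging 2x2 blocks (simplified pyramid step)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def search(a, b, center, radius):
    """Try integer shifts of b around `center`; return the one closest to a."""
    cy, cx = center
    best, best_err = center, np.inf
    for dy in range(cy - radius, cy + radius + 1):
        for dx in range(cx - radius, cx + radius + 1):
            shifted = np.roll(b, (dy, dx), axis=(0, 1))  # circular shift, not a real warp
            err = np.mean((a - shifted) ** 2)
            if err < best_err:
                best, best_err = (dy, dx), err
    return best

def estimate_translation(a, b, levels=2, radius=2):
    """Estimate (dy, dx) such that shifting b by (dy, dx) aligns it with a."""
    pyr_a, pyr_b = [a.astype(float)], [b.astype(float)]
    for _ in range(levels):
        pyr_a.append(reduce2(pyr_a[-1]))
        pyr_b.append(reduce2(pyr_b[-1]))
    # Coarsest level: search a small window around zero motion.
    shift = search(pyr_a[-1], pyr_b[-1], (0, 0), radius)
    # Finer levels: double the estimate, then refine by one pixel.
    for lvl in range(levels - 1, -1, -1):
        shift = search(pyr_a[lvl], pyr_b[lvl], (2 * shift[0], 2 * shift[1]), 1)
    return shift
```

A small search window at the coarsest level covers a large displacement at full resolution, which is why the pyramid keeps the search cheap.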

13 Alignment and Change Detection, 3
Moving objects are detected by aligning video frames and detecting pixels with poor correlation across the temporal domain
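A rough illustration of change detection on already-aligned frames is shown below. It is a sketch under assumptions: the frames are grayscale arrays that have been warped into a common coordinate frame, and a simple absolute difference from the temporal median stands in for the correlation measure mentioned on the slide. The function name and the threshold are hypothetical.

```python
import numpy as np

def detect_changes(aligned_frames, threshold):
    """Flag pixels in the latest frame that disagree with the temporal background.

    aligned_frames: sequence of 2D arrays, already registered to one another.
    Returns a boolean mask with True where the newest frame deviates.
    """
    stack = np.stack(aligned_frames).astype(float)
    background = np.median(stack, axis=0)        # static scene estimate
    residual = np.abs(stack[-1] - background)    # disagreement of newest frame
    return residual > threshold
```

For example, with five aligned frames in which only the last one contains a bright 2×2 object, the mask marks exactly those four pixels.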

14 Mosaicing, 1
Images are accumulated into the mosaic as the camera pans
Construction of a 2D mosaic requires computation of alignment parameters that relate all of the images in the collection to a common world coordinate system
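One way to relate every image to a common coordinate system is to chain the frame-to-frame alignments. The sketch below assumes each pairwise alignment is available as a 3×3 homogeneous matrix mapping frame k+1 into frame k, and that frame 0 defines the mosaic coordinate system; both the function name and that convention are assumptions made for illustration.

```python
import numpy as np

def frame_to_mosaic_transforms(pairwise):
    """Compose pairwise alignments into frame-to-mosaic transforms.

    pairwise[k] maps homogeneous pixel coordinates of frame k+1 into frame k.
    Returns a list whose entry k maps frame k into the coordinates of frame 0.
    """
    transforms = [np.eye(3)]
    for H in pairwise:
        transforms.append(transforms[-1] @ np.asarray(H, dtype=float))
    return transforms
```

Chaining accumulates small alignment errors over long sequences, so a practical system would likely refine these transforms against the mosaic itself.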

15 Mosaicing, 2
Transformation parameters are used to warp the images into the mosaic coordinate system
Warped images are then combined to form a mosaic
To avoid seams, warped frames are merged in the Laplacian pyramid domain
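Merging in the Laplacian pyramid domain can be sketched as below. This is a simplified illustration rather than the filter design of [3]: 2×2 block averaging and pixel replication stand in for the Gaussian REDUCE and EXPAND operations, the image sides must be divisible by 2 to the power `levels`, and all function names are hypothetical.

```python
import numpy as np

def reduce2(img):
    """Halve resolution by averaging 2x2 blocks (stand-in for Gaussian REDUCE)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def expand2(img):
    """Double resolution by pixel replication (stand-in for EXPAND)."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def laplacian_pyramid(img, levels):
    """Band-pass levels followed by the low-resolution residual."""
    pyr, cur = [], img.astype(float)
    for _ in range(levels):
        nxt = reduce2(cur)
        pyr.append(cur - expand2(nxt))
        cur = nxt
    pyr.append(cur)
    return pyr

def blend(a, b, mask, levels=3):
    """Blend a (where mask is 1) with b (where mask is 0), band by band."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    masks = [mask.astype(float)]
    for _ in range(levels):
        masks.append(reduce2(masks[-1]))  # the mask gets softer at coarser levels
    merged = [m * x + (1.0 - m) * y for m, x, y in zip(masks, la, lb)]
    out = merged[-1]
    for band in reversed(merged[:-1]):
        out = expand2(out) + band         # collapse the pyramid
    return out
```

Because the mask is smoothed more at coarse levels than at fine ones, low frequencies are mixed over a wide zone and fine detail over a narrow one, which is what hides the seam.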

16 Mosaicing, example

17

18 Tracking Moving Objects, 1
Scene analysis includes operations that interpret the source video in terms of objects and activities in the scene
Moving objects are detected and tracked over the cluttered scene

19 Tracking Moving Objects, 2
The state of each moving object is represented by its:
– Motion
– Appearance
– Shape
The state is updated at each instant of time using the Expectation-Maximization (EM) algorithm

20 Tracking Moving Objects, example

21 Geo-location
The video surveillance system must also determine the geodetic coordinates of objects within the camera’s field of view
More precise geo-locations can be estimated by aligning video frames to calibrated reference images

22 Enhanced Visualization
A challenging aspect of aerial video surveillance is formatting video imagery for effective presentation to an operator
The “soda straw” view makes direct observation tedious and disorienting

23 Mosaic-Based Display
A mosaic-based display decouples the observer’s view from the camera
The operator may scroll or zoom to examine one region of the mosaic even as the camera is updating another region

24 Elements of Mosaic Display
[Block diagram; labels: camera, estimate displacement (ED), warp, merge (pyramid merge), image accumulating memory, update window, operator’s display]

25 Camera Input

26 Mosaic Generation, 1

27 Mosaic Generation, 2

28 Pseudocode of the main algorithm [5]
read(base_image);
read(unregistered_image);
base_image = expand(base_image);
confirm three pairs of matched points between base_image and unregistered_image;
calculate initial matrix M;
apply Levenberg-Marquardt minimization to update M;
M = inverse(M);
resample and apply blending function to render the mosaic;
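The "calculate initial matrix M" step can be illustrated with a short sketch. It assumes the initial estimate is an affine transform, which is what three non-collinear point pairs determine exactly; the slide does not say which model [5] uses, so the function name and that choice are assumptions. The Levenberg-Marquardt refinement and the blending steps are not shown.

```python
import numpy as np

def affine_from_three_points(src, dst):
    """Return the 3x3 affine matrix M with M @ [x, y, 1] = [x', y', 1].

    src, dst: three (x, y) pairs each; the src points must not be collinear.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.hstack([src, np.ones((3, 1))])  # rows are [x, y, 1]
    X = np.linalg.solve(A, dst)            # A @ X = dst, X has shape (3, 2)
    M = np.eye(3)
    M[:2, :] = X.T                         # top two rows hold the six affine parameters
    return M
```

A full eight-parameter projective transform needs at least four point pairs, so an affine start leaves the two perspective terms at zero for the later minimization to adjust.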

29 Homogeneous Coordinates Using homogeneous coordinates, we can describe the class of 2D planar projective transformations using matrix multiplication [4]:
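The equation on this slide was not preserved in the transcript. A form consistent with the motion parameters m_0 ... m_7 used on later slides, and with the formulation in [4], would be the following, where p = (x, y, w) is a point in homogeneous coordinates and the name M_2D is a label chosen here:

```latex
\mathbf{u} =
\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix}
=
\begin{bmatrix}
m_0 & m_1 & m_2 \\
m_3 & m_4 & m_5 \\
m_6 & m_7 & m_8
\end{bmatrix}
\begin{bmatrix} x \\ y \\ w \end{bmatrix}
= \mathbf{M}_{2D}\,\mathbf{p}
```

Since homogeneous coordinates are defined only up to scale, m_8 can be fixed to 1, which leaves eight free parameters.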

30 Rigid Transformation
The same hierarchy of transformations exists in 3D. A rigid (Euclidean) transformation is built from R, a 3 × 3 orthonormal rotation matrix, and t, a 3D translation vector.
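The slide's equation is missing from the transcript. The standard homogeneous form of a rigid transformation, written with R and t as defined above (the symbol E is a label chosen here), is:

```latex
\mathbf{p}' = \mathbf{E}\,\mathbf{p},
\qquad
\mathbf{E} =
\begin{bmatrix}
\mathbf{R} & \mathbf{t} \\
\mathbf{0}^{T} & 1
\end{bmatrix}
```

Here E is 4 × 4 and p = (x, y, z, w) is a 3D point in homogeneous coordinates.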

31 Viewing Matrix
The 3 × 4 viewing matrix projects 3D points through the origin onto a 2D projection plane at a distance f along the z axis.
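The matrix itself did not survive in the transcript. One form that matches the description (the symbol V is a label chosen here, and the exact layout on the original slide is an assumption) is:

```latex
\mathbf{V} =
\begin{bmatrix}
f & 0 & 0 & 0 \\
0 & f & 0 & 0 \\
0 & 0 & 1 & 0
\end{bmatrix},
\qquad
\mathbf{V}
\begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}
=
\begin{bmatrix} f x \\ f y \\ z \end{bmatrix}
```

Dividing by the third component gives the familiar perspective projection (f x / z, f y / z).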

32 Combined Equations
The combined equations projecting a 3D world coordinate p = (x, y, z, w) onto a 2D screen location u = (x', y', w') can thus be written as a single matrix product, u = P p, where P is a 3 × 4 camera matrix. This equation is valid even if the camera calibration parameters and/or the camera orientation are unknown.
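A small numerical sketch of the combined projection follows. It assumes the camera matrix is the product of a 3 × 4 viewing matrix with focal length f and a 4 × 4 rigid transformation built from R and t; the function names and that particular factorization are assumptions made for illustration.

```python
import numpy as np

def camera_matrix(R, t, f):
    """Build a 3x4 camera matrix as (viewing matrix) @ (rigid transformation)."""
    E = np.eye(4)
    E[:3, :3] = R   # 3x3 rotation
    E[:3, 3] = t    # 3D translation
    V = np.array([[f, 0, 0, 0],
                  [0, f, 0, 0],
                  [0, 0, 1, 0]], dtype=float)
    return V @ E

def project(P, p):
    """Project homogeneous 3D point p = (x, y, z, w) to 2D screen coordinates."""
    x, y, w = P @ np.asarray(p, dtype=float)
    return x / w, y / w
```

With an identity rotation, zero translation, and f = 2, the point (1, 2, 4, 1) lands at (0.5, 1.0), which is f x / z and f y / z.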

33 Local Image Registration, 1
How do we compute the transformations relating the various scene pieces so that we can paste them together?
– We could manually identify four or more corresponding points between the two views
– Manual approaches are too tedious to be useful

34 Local Image Registration, 2
Directly minimizing the discrepancy in intensities between pairs of images has the advantages of not requiring any easily identifiable feature points and of being statistically optimal, that is, giving the maximum likelihood estimate once we are in the vicinity of the true solution.
To do this, we rewrite our 2D transformations in terms of the unknown motion parameters.
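The rewritten transformations are missing from the transcript. Following the eight-parameter formulation in [4], with the overall scale fixed so that the ninth matrix entry equals 1, they take this form for pixel i:

```latex
x'_i = \frac{m_0 x_i + m_1 y_i + m_2}{m_6 x_i + m_7 y_i + 1},
\qquad
y'_i = \frac{m_3 x_i + m_4 y_i + m_5}{m_6 x_i + m_7 y_i + 1}
```

Setting m_6 = m_7 = 0 reduces this to an affine transformation.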

35 Minimizes Intensity Errors
The technique minimizes the sum of the squared intensity errors over all corresponding pairs of pixels i inside both images I(x, y) and I’(x’, y’). Pixels that are mapped outside the image boundaries do not contribute.
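The error being minimized is E = Σ_i e_i², with e_i = I’(x’_i, y’_i) − I(x_i, y_i). The sketch below evaluates it for a given 3 × 3 transformation M, under stated simplifications: images are grayscale arrays, M is assumed to map every pixel to a finite location, and nearest-neighbor lookup replaces proper interpolation. The function name is hypothetical.

```python
import numpy as np

def ssd_error(I, I_prime, M):
    """Sum of squared intensity errors between I and I_prime under M.

    M maps homogeneous pixel coordinates (x, y, 1) of I into I_prime.
    Returns (error, number of contributing pixels).
    """
    h, w = I.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    u = M @ pts
    xp = np.rint(u[0] / u[2]).astype(int)   # nearest-neighbor x'
    yp = np.rint(u[1] / u[2]).astype(int)   # nearest-neighbor y'
    hp, wp = I_prime.shape
    # Pixels mapped outside the boundaries of I_prime do not contribute.
    inside = (xp >= 0) & (xp < wp) & (yp >= 0) & (yp < hp)
    e = I_prime[yp[inside], xp[inside]].astype(float) - I.ravel()[inside].astype(float)
    return float(np.sum(e ** 2)), int(inside.sum())
```

Returning the number of contributing pixels makes it possible to normalize the error, so that a transformation is not rewarded merely for pushing most pixels out of bounds.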

36 Minimization
To perform the minimization, we use the Levenberg-Marquardt iterative nonlinear minimization algorithm. This algorithm requires computation of the partial derivatives of e_i with respect to the unknown motion parameters {m_0 ... m_7}.
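As a worked derivation, assume the eight-parameter form x'_i = (m_0 x_i + m_1 y_i + m_2) / D_i and y'_i = (m_3 x_i + m_4 y_i + m_5) / D_i with D_i = m_6 x_i + m_7 y_i + 1, and e_i = I'(x'_i, y'_i) − I(x_i, y_i). The chain rule then gives the following partial derivatives, which agree with the expressions in [4]:

```latex
\frac{\partial e_i}{\partial m_0} = \frac{x_i}{D_i}\frac{\partial I'}{\partial x'},
\quad
\frac{\partial e_i}{\partial m_1} = \frac{y_i}{D_i}\frac{\partial I'}{\partial x'},
\quad
\frac{\partial e_i}{\partial m_2} = \frac{1}{D_i}\frac{\partial I'}{\partial x'}

\frac{\partial e_i}{\partial m_3} = \frac{x_i}{D_i}\frac{\partial I'}{\partial y'},
\quad
\frac{\partial e_i}{\partial m_4} = \frac{y_i}{D_i}\frac{\partial I'}{\partial y'},
\quad
\frac{\partial e_i}{\partial m_5} = \frac{1}{D_i}\frac{\partial I'}{\partial y'}

\frac{\partial e_i}{\partial m_6} = -\frac{x_i}{D_i}
\left( x'_i \frac{\partial I'}{\partial x'} + y'_i \frac{\partial I'}{\partial y'} \right),
\quad
\frac{\partial e_i}{\partial m_7} = -\frac{y_i}{D_i}
\left( x'_i \frac{\partial I'}{\partial x'} + y'_i \frac{\partial I'}{\partial y'} \right)
```

The image gradients ∂I'/∂x' and ∂I'/∂y' are evaluated at (x'_i, y'_i).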

37 Complete Registration Algorithm: Step 1 [4]

38 Complete Registration Algorithm: Steps 2-4

39 Bryce Canyon Mosaic

40 Wall Frame, example

41 Conclusion
The techniques presented here automatically register video frames into 2D and partial 3D scene models.
Video mosaics and related techniques will enable an even more exciting range of interactive computer graphics, telepresence, and virtual reality applications.

42 References
[1] Yap-Peng Tan, Sanjeev R. Kulkarni, and Peter J. Ramadge. Automatic Panoramic Image Construction. Princeton University, Department of Electrical Engineering.
[2] Rakesh Kumar. Aerial Video Surveillance and Exploitation (Chapter 2).
[3] Peter J. Burt and Edward H. Adelson. A Multiresolution Spline With Application to Image Mosaics. RCA David Sarnoff Research Center.

43 References
[4] Richard Szeliski. Video mosaics for virtual environments. IEEE Computer Graphics and Applications, 16(2):22-30, March 1996.
[5] Jingbin Wang. Project 1: Image Mosaics. CS580: Advanced Graphics, Boston University.

