Presentation on theme: "PHOTOSYNTH: A 3D Photo Experience! by swami_worldtraveler PREFACE A version of the following was presented before the Central Florida Computer Society."— Presentation transcript:
PHOTOSYNTH: A 3D Photo Experience! by swami_worldtraveler PREFACE A version of the following was presented before the Central Florida Computer Society on July 18, 2010. This current version includes additional web links, plus explanatory slide notes. The latter was done since the original slides simply provide an outline, key points, visuals, and things to elaborate on. Feel free to read these, or skip over them as you desire. This presentation is geared toward the general user as well as a more technical audience. The middle part deals w/ advanced concepts, algorithms, and mathematics, while the latter part lightens back up a bit. Feel free to skip these, or linger as you desire. This slideshow is configured for a 4:3 display, though some cropping may still occur on some displays. If you would like me to present before your group, please contact me at firstname.lastname@example.org. v7.01
PHOTOSYNTH A 3D Photo Experience! presented by swami_worldtraveler
Slide notes: Photosynth is a visual experience. Any description in words is limited and cannot fully convey the experience. So, jump right into an example!...
SO, WHAT IS PHOTOSYNTH?! - IT'S COOL! - AND IT'S BEST EXPERIENCED! SO, HERE GOES!... SYNTH: Hiroshima A-Bomb Dome! (guided tour, and scavenger hunt!)
Slide notes: Added a link to my synth: Hiroshima A-Bomb Dome! (guided tour, and scavenger hunt!). This is a good example for several reasons. 1) It's an interesting subject, especially from an American perspective (my original audience); 2) It's composed of a large number of photos (~600); 3) It has extensive Highlights links to points of interest in the synth; 4) It has an impressive 3D point cloud; 5) It demonstrates a novel use by including a scavenger hunt of hidden items to have fun searching for; and 6) I was there and I made the synth, so it's personal to me and I can share that energy.
PHOTOSYNTH IN A NUTSHELL… - PHOTOSYNTH IS A NOVEL AND ENGAGING WAY TO EXPLORE/EXPERIENCE A LARGE COLLECTION OF RELATED PHOTOS. - IT'S ALSO A WAY TO MAKE A 3D RECONSTRUCTION FROM THAT PHOTO COLLECTION. - AND IT'S MUCH MORE!...
Slide notes: Photosynth breaks from the traditional photo gallery of individual photos or a grid of photos. By arranging the photos in a 3D space and allowing navigation w/i it, a sense of presence is created. Things in the scene can be observed from multiple angles and at multiple resolutions (close to far). This spatial sense and structure from motion (i.e. from viewpoint motion) is present in the collection of photos, but requires something like Photosynth to present it in a way that conveys this. In addition to arranging the photos in 3D, a point cloud is created, which, given enough photos, presents a recognizable model that enhances the sense of spatiality, plus allows viewpoints not present in the photo collection! Pretty amazing, actually.
YOU MIGHT ASK… - IS IT FREE? - WILL IT WORK ON MY COMPUTER? - WILL IT WORK ON MY SMART PHONE? - CAN I MAKE MY OWN SYNTHS?
YOU MIGHT ASK… - IS IT FREE? - WILL IT WORK ON MY COMPUTER? - WILL IT WORK ON MY SMART PHONE? (1, 2) - CAN I MAKE MY OWN SYNTHS? YES!
Slide notes: Synths can be VIEWED on PCs and Macs. Currently (as of July 2010), they can only be CREATED on PCs. This is not strictly true, since a Mac can dual boot into Windows, but this is beyond the average user. Though I haven't investigated it, perhaps a virtual machine (OS emulator) could be run on a Mac, or even a Linux box, thus allowing creation of synths. And, there's an app for it! Yes, for the iPhone there is iSynth. The links in the slide take you to the creator's page, and to the iTunes store, respectively.
YOU MIGHT ALSO ASK… - WHERE DID IT COME FROM? - HOW DOES IT WORK!? - WHAT TECHNOLOGIES DOES IT USE? - HOW TO MAKE A SYNTH? - HOW IS IT BEING USED? - WHAT DOES THE FUTURE HOLD?...
IT STARTED AS A DOCT. THESIS - 2008 (but developed over previous ~2 years). - NOAH SNAVELY, U. of WASHINGTON. - PHOTO TOURISM - 3D RECONSTRUCTION OF NOTRE DAME (& OTHER SITES) USING PHOTOS FROM FLICKR PHOTO SHARING SITE. - ADVISED BY MICROSOFT LIVE LABS.
Slide notes: Added a link to Noah's thesis (as a PDF: all 210 pages of it!). Added link to Noah's faculty page at Cornell Univ., where he is an Assistant Professor in the Computer Science Dept. Added link to Photo Tourism official homepage at the Univ. of Washington.
Slide notes: The top photos are from a presentation at a SIGGRAPH (Special Interest Group on Computer Graphics) conference. Pictured are 1) A sample of photos collected from Flickr (a photo sharing site); 2) The point cloud model and the cameras (small white pyramids); and 3) The viewer. The bottom photos are from Noah's thesis paper (p. 58). Pictured are 1) An image of Notre Dame; 2) The 3D reconstruction; 3) The image connectivity graph showing matching relationships (explained on pp. 33, 34); and 4) Subgraph of reconstructed photos (explained on pp. 56 - 60).
SO, WHAT IS MS LIVE LABS? - MS LIVE LABS, FORMED IN 2006, IS A THINK-TANK FOR INNOVATION AND TRANSFORMATIVE WEB EXPERIENCES. - IT IS PART OF MICROSOFT RESEARCH. - MSLL ENLISTED NOAH SNAVELY, & OTHERS, TO CREATE PHOTOSYNTH.
Slide notes: A third component could be the Photosynth website itself. As an afterthought, I touched on this at the end of the talk. I'm considering creating some slides for this, since it plays a role in the overall experience. The synther and viewer will be covered next, in order…
THE SYNTHER - 3D RECONSTRUCTION USING STRUCTURE FROM MOTION (SfM). - EXTRACT ~ MATCH ~ RECONSTRUCT
Slide notes: Added link for SfM. Humans perform SfM all the time, naturally! Via motion through space over time (observing stationary objects and objects in motion), we reconstruct a spatial sense of the environment, the objects in the environment, and the associations between them. Photosynth takes a discrete set of views (i.e. photos) and reconstructs the scene in a way somewhat analogous to how humans do. After all, we observe a largely 2D world (w/ some stereo binocular vision) and construct a perception of a greater whole.
1) EXTRACT - USE COMPUTER VISION TO EXTRACT FEATURES (EDGES/CORNERS/BLOBS). - 10s, 100s, EVEN 1000s OF FEATURES PER PHOTO. - FEATURES SHOULD BE UNIQUE, & INVARIANT TO SCALE, ORIENTATION, ILLUMINATION, CAMERA DISTORTIONS & OTHER FACTORS. - DEFINE A FEATURE DESCRIPTOR (IN FORM OF A MATRIX).
Slide notes: Though I could not find out what method Photosynth uses (even emailed the programmer at Live Labs: no reply), in Noah Snavely's Photo Tourism he used SIFT (Scale-Invariant Feature Transform). To allow feature matching between pictures taken at different scales (e.g. close, medium, +/or far), different orientations (e.g. rotated, tilted), and under varying illuminations (e.g. day, night, sunny, cloudy), a descriptor must be found that does not vary with these factors (i.e. must be invariant). FYI, the extraction process is programmatically highly parallelizable.
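To make the idea of an invariant descriptor concrete, here is a minimal Python sketch. It is NOT SIFT and not what Photosynth runs; it is a toy descriptor (the function name and the 5x5 patch are made up for illustration) that histograms gradient directions in a small patch and normalizes the result. Gradients ignore a constant brightness offset, and normalization cancels a contrast gain, so the descriptor stays the same when the patch is uniformly brightened.

```python
import math

def orientation_histogram(patch, bins=8):
    """Toy descriptor: histogram of gradient directions, weighted by
    gradient magnitude and normalized to unit length (hypothetical helper)."""
    hist = [0.0] * bins
    rows, cols = len(patch), len(patch[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences: a constant brightness offset cancels out.
            dx = patch[r][c + 1] - patch[r][c - 1]
            dy = patch[r + 1][c] - patch[r - 1][c]
            magnitude = math.hypot(dx, dy)
            angle = math.atan2(dy, dx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    # Unit-length normalization cancels a multiplicative contrast change.
    norm = math.sqrt(sum(v * v for v in hist)) or 1.0
    return [v / norm for v in hist]

# A made-up 5x5 grayscale patch, and the same patch "re-lit"
# (contrast doubled, brightness raised by 10).
patch = [
    [10, 12, 15, 20, 26],
    [11, 14, 18, 24, 31],
    [13, 17, 22, 29, 37],
    [16, 21, 27, 35, 44],
    [20, 26, 33, 42, 52],
]
relit = [[2 * v + 10 for v in row] for row in patch]

original_descriptor = orientation_histogram(patch)
relit_descriptor = orientation_histogram(relit)
```

The two descriptors agree to within floating-point rounding. Real descriptors such as SIFT add scale and rotation handling on top of this kind of normalization.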
Slide notes: Various visualizations of feature detection (NOT from Photosynth). From left to right, and top to bottom: 1) Green circles mark feature points. Note that high texture areas, such as ivy on the walls, yield a high density of feature points; 2) Architectural sculpture from N. Snavely's thesis: size and rotation of boxes, respectively, mark the scale and orientation of the feature points; 3) Another architectural example; and 4) a real-time robotic navigation visualization of feature points.
2) MATCH - COMPARE FEATURES TO DETERMINE PHOTOS THAT OVERLAP.
Slide notes: FYI, this process is programmatically highly parallelizable.
Slide notes: Image connectivity graph. This represents image connectivity AFTER matching.
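The matching step and the resulting connectivity graph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Photosynth's actual matcher: descriptors are toy 2D tuples, the photo names are invented, and the nearest-neighbor "ratio test" (accept a match only if the best candidate is clearly closer than the runner-up) is a common choice in SIFT-style pipelines that I am swapping in here.

```python
import math
from itertools import combinations

def match_features(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbor in desc_b,
    keeping only unambiguous matches (best distance < ratio * second best)."""
    matches = []
    for i, d in enumerate(desc_a):
        ranked = sorted((math.dist(d, other), j) for j, other in enumerate(desc_b))
        if len(ranked) >= 2 and ranked[0][0] < ratio * ranked[1][0]:
            matches.append((i, ranked[0][1]))
    return matches

def connectivity_graph(photos, min_matches=2):
    """Return {(photo_a, photo_b): match_count} for photo pairs that overlap,
    i.e. share at least min_matches matched features."""
    edges = {}
    for (name_a, desc_a), (name_b, desc_b) in combinations(photos.items(), 2):
        count = len(match_features(desc_a, desc_b))
        if count >= min_matches:
            edges[(name_a, name_b)] = count
    return edges

# Hypothetical photos, each with a handful of toy 2D descriptors.
photos = {
    "front": [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)],
    "side": [(0.1, 0.0), (10.0, 0.2), (0.5, 10.0), (-0.5, 10.0)],
    "back": [(50.0, 50.0), (60.0, 50.0)],
}

front_side = match_features(photos["front"], photos["side"])  # [(0, 0), (1, 1)]
graph = connectivity_graph(photos)  # {("front", "side"): 2}
```

The third "front" feature is rejected because two "side" features are equally close to it (think of a repeated texture), and "back" shares nothing distinctive with the others, so it ends up unconnected in the graph. Every photo pair is compared independently, which is why this step parallelizes so well.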
3) RECONSTRUCT - USE BUNDLE ADJUSTMENT. - LARGE OPTIMIZATION PROBLEM: FIND FIT FROM SYSTEM W/ MORE EQUATIONS THAN VARIABLES (ITERATIVE PROCESS). - PHOTO: X, Y, & Z POSITION. - CAMERA: X, Y, & Z POSITION; X, Y, Z ROTATION; FOCAL LENGTH. - REPROJECT IMAGE PT. TO CAMERA; COMPARE THIS TO THE ACTUAL LOCATION (USING LEAST SQUARES).
Slide notes: In hindsight, I think it might be better to start this discussion w/ reprojection error shown graphically on the next slide, then come back to this one. Added link for bundle adjustment, and optimization. The following is a bit technical and mathematical, so proceed if this kind of thing interests you… BUNDLE ADJUSTMENT: Basically, bundles of rays are reprojected from the image to the camera, and the error between the observed and predicted point is used as a measure of fit. OPTIMIZATION: Given a system of equations w/ an equal number of (independent) equations and variables, an exact solution can be found using the method of substitution, or of simultaneous equations. But for a system where there are more equations than variables, generally no exact solution satisfies every equation, so another approach must be used. This approach is called optimization. Optimization here is an iterative method that, rather than solving the system exactly, finds a best fit (in the least-squares sense) w/i an acceptable tolerance. In the case of a synth, there are 10 kinds of variables (3 position variables per point, plus 3 position variables, 3 rotation variables, and 1 focal length variable per camera), but many more equations. By intelligently shuffling the points and cameras around, their positions and orientations can be determined. These feature points located in 3D space form the point cloud, and the original camera position/orientation can be seen using the Overhead feature in the synth viewer. Simply mouse over until a photo is displayed. The camera will appear as a white triangle.
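Here is the "more equations than variables" idea at toy scale, as a hedged Python sketch. It is not bundle adjustment: instead of cameras and points there are just two unknowns (the slope a and intercept b of a line) and four equations (one per made-up data point), and plain gradient descent stands in for the more sophisticated solvers real systems use. The function names, data, step size, and iteration count are all assumptions chosen for illustration.

```python
def squared_error(points, a, b):
    """Sum of squared residuals of the line y = a*x + b over all points."""
    return sum((a * x + b - y) ** 2 for x, y in points)

def fit_line(points, step=0.02, iterations=2000):
    """Iteratively nudge (a, b) downhill on the squared error.
    Four equations, two unknowns: no exact solution, only a best fit."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        grad_a = sum(2 * (a * x + b - y) * x for x, y in points)
        grad_b = sum(2 * (a * x + b - y) for x, y in points)
        a -= step * grad_a
        b -= step * grad_b
    return a, b

# Made-up noisy measurements that roughly follow y = 2x + 1.
points = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8)]

a, b = fit_line(points)                      # converges to a = 1.94, b = 1.09
best_error = squared_error(points, a, b)     # about 0.082
guess_error = squared_error(points, 2.0, 1.0)  # about 0.10 for the "obvious" guess
```

No line passes through all four points, so the error never reaches zero; the iteration simply settles on the line that makes it as small as possible. Bundle adjustment does the same thing with reprojection error, only with vastly more unknowns and equations.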
Slide notes: Pictured top to bottom: 1) Graphic representation of reprojection error (from image back into camera); 2) Objective function, g, is simply a sum of least squares, where C denotes the cameras, X denotes the 3D points (one per track), P gives the position of a point reprojected into an image, i is an image index, j is a track index (i.e. into a track of common points across images), and w is 1 or 0 depending on whether the i-th image is associated w/ the j-th track (i.e. does it or does it not contribute to the summation?). So, this equation is simply a measure of the summed squared distances between projected and actual points. The closer to zero this is, the closer to a fit in 3D of the cameras and image feature points; and 3) A rotation parameterization of the camera.
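Since the equation itself was an image on the slide, here is a reconstruction in LaTeX based on the verbal description above and the standard structure-from-motion form of the objective. Treat it as a sketch: the symbol q_ij (the observed position of track j in image i), the limits n and m, and the reading of X_j as the 3D position of track j are my assumptions, not taken from the slide.

```latex
% n cameras/images, m tracks (assumed limits)
% C_i    : parameters of camera i (position, rotation, focal length)
% X_j    : 3D position of the point for track j (assumed reading)
% P      : projection of a 3D point into a camera's image
% q_{ij} : observed 2D position of track j in image i (assumed symbol)
% w_{ij} : 1 if image i sees track j, otherwise 0
g(C, X) = \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij}
          \left\lVert \, q_{ij} - P(C_i, X_j) \, \right\rVert^{2}
```

Each term is one squared reprojection error, and bundle adjustment searches for the cameras and points that make the total as small as possible.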
THE VIEWER: GENERAL - WEB-BASED. - DIRECT3D & SILVERLIGHT.
Slide notes: The synth is stored remotely on a server maintained by Microsoft. The viewer is a web client on the local machine. So, an Internet connection must be established to view a synth. The original synth viewer was written with DirectX (Direct3D), but this only runs on PCs, so to reach a broader audience, a Silverlight viewer was developed and is the current default viewer.
DIRECT3D - 3D API OF THE DIRECTX SUITE OF APIs. - FAST, SMOOTH. - GOOD FOR COMPUTER GAMES (e.g. XBOX). - ONLY WORKS NATIVELY ON PCs.
Slide notes: Screen-grab of the Direct3D viewer. Note that the navigation controls are distributed around the perimeter. This was found to be non-optimal, so they were consolidated in the Silverlight viewer. The advantage of this viewer is that it's fast, smooth, and can continuously display/update the point cloud when orbiting and panning.
SILVERLIGHT (1, 2) - MICROSOFT MULTIMEDIA ENVIRONMENT (SIMILAR TO FLASH). - REQUIRES DOWNLOAD. - SLOWER, NOT AS SMOOTH, BUT CAPABLE ENOUGH. - ALLOWS DEVELOPMENT OF SOME INTERESTING USER FEATURES (e.g. HIGHLIGHTS, & OVERHEAD). - & LAST, BUT NOT LEAST: CROSS-PLATFORM!
Slide notes: Screen-grab of Silverlight viewer. Note the consolidated controls at the bottom, and the Highlight links to the right that jump w/i the synth. This feature is extremely helpful as navigation can be confusing at times, plus the user can be directed to points of interest, even in a desired sequence or guided tour.
THE VIEWER: DEEP ZOOM (1, 2, 3) - MULTI-SCALE IMAGE (BINARY PYRAMID). - CREATE SUCCESSIVELY SMALLER VERSIONS OF AN IMAGE (1/2 EACH TIME). - BREAK EACH OF THESE INTO TILES. - NOW, JUST SEND THE NECESSARY TILES AT THE APPROPRIATE SCREEN RESOLUTION. - ON PAN/ZOOM, DISCARD/LOAD THE APPROPRIATE TILES. - AND VOILA! VERY LARGE IMAGES OVER A LIMITED THROUGHPUT CHANNEL.
Slide notes: Added links to Deep Zoom: 1) Wikipedia article; 2) Deep Zoom in MSDN (Microsoft Developer Network): programming details; and 3) Deep Zoom Composer: FREE program to create/view Deep Zoom images and collections.
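The pyramid-and-tiles idea above can be sketched in Python. This is an illustration only, not the Deep Zoom file format or API: the function names are invented, the 256-pixel tile size is an assumption, and real Deep Zoom tiles also overlap slightly, which is ignored here.

```python
import math

def pyramid_levels(width, height):
    """Return (width, height) for every level, from full size down to 1x1,
    halving each side (rounding up) at every step."""
    levels = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        levels.append((width, height))
    return levels

def tile_grid(width, height, tile_size=256):
    """Number of tile columns and rows needed to cover one level."""
    return math.ceil(width / tile_size), math.ceil(height / tile_size)

def visible_tiles(view_x, view_y, view_w, view_h, tile_size=256):
    """(column, row) of each tile overlapping a viewport, given in the
    pixel coordinates of the level being displayed."""
    first_col, first_row = view_x // tile_size, view_y // tile_size
    last_col = (view_x + view_w - 1) // tile_size
    last_row = (view_y + view_h - 1) // tile_size
    return [(c, r)
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]

levels = pyramid_levels(1024, 768)            # 11 levels, ending at (1, 1)
total_tiles = sum(cols * rows
                  for cols, rows in (tile_grid(w, h) for w, h in levels))  # 25
needed = visible_tiles(300, 100, 400, 300)    # [(1, 0), (2, 0), (1, 1), (2, 1)]
```

For a 400 x 300 window onto the full-size level, only 4 of its 12 tiles have to be sent; panning or zooming just changes which tiles (and which level) are requested.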
Slide notes: The image in the upper-left shows a multi-scale image. To the back is the original image. This is followed by successively smaller versions, each half as big on each side, which means 1/4 as many pixels. Each of these scaled images is divided up into tiles. The image in the upper-right is a multi-scale index (of thumbnails) into a collection of multi-scale images. This provides certain optimization advantages (e.g. speed across the Internet) and is what Photosynth uses. The image at bottom illustrates the storage overhead due to the multi-scale image process. Since extra images are required for a given image, some extra space is obviously required to store the multi-scale image, but this amount is minimal (at most 1/3 more) and is more than made up for by the ability to display very large images over the Internet. This overhead can be calculated using the infinite summation where each successive image is 1/4 as big as the previous one, hence (1/4)^0 + (1/4)^1 + (1/4)^2 + (1/4)^3 + … which is 1 + 1/4 + 1/16 + 1/64 + … = 4/3 = 1.33333 (repeating). In practice, Photosynth uses 13 levels.
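For those who want the arithmetic spelled out, the overhead is a geometric series with ratio 1/4. The LaTeX below works it through; the finite-sum line for N levels is my addition (standard geometric-series algebra), applied to the 13 levels mentioned above.

```latex
% Infinite pyramid: total storage relative to the original image
S_{\infty} = \sum_{k=0}^{\infty} \left(\tfrac{1}{4}\right)^{k}
           = \frac{1}{1 - \tfrac{1}{4}}
           = \frac{4}{3} \approx 1.333

% Finite pyramid with N levels
S_{N} = \sum_{k=0}^{N-1} \left(\tfrac{1}{4}\right)^{k}
      = \frac{4}{3}\left(1 - 4^{-N}\right)

% With N = 13 levels, 4^{-13} \approx 1.5 \times 10^{-8}, so
S_{13} = \frac{4}{3}\left(1 - 4^{-13}\right)
       \approx \frac{4}{3} - 2 \times 10^{-8}
```

So even with only 13 levels, the total is indistinguishable from the 4/3 limit: about one third extra storage on top of the original image.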
HOW TO MAKE A SYNTH - TAKE LOTS OF PHOTOS! - SUBMIT THEM TO THE SYNTHER. - LOCAL MACHINE CRUNCHES THEM, THEN SENDS RESULTS TO SERVER. - SYNTH LIVES ON THE SERVER. - CLIENT USED FOR VIEWING. (Here's the synth from a live demo at the C. FL Computer Society meeting.)
Slide notes: - This part was illustrated and expanded upon via a live demo! Depending on the audience, this part could receive more attention. This could also entail the creation of a number of accompanying slides. Here's the description from the demo synth: - Synth created before and during my presentation to the Central Florida Computer Society in Orlando, FL on July 18, 2010, titled, "PHOTOSYNTH: A 3D Photo Experience!" - Before the meeting, the majority of the photos were taken at medium-res, then synthed. During the in-meeting demo, several additional low-res photos were taken and added to the synth. - Given the context of creation, this synth does not necessarily represent a good synth, but did serve to illustrate the process of creating a synth and the associated issues. - Highlight links were added after the meeting. - This is an "Unlisted" synth: It will not show up in any search, but can still be linked to, given the URL.
HOW IS PHOTOSYNTH BEING USED? - TO MAKE HISTORY! (OBAMA INAUG.) - TV (e.g. CSI:NY – ADMISSIONS) - TV SHOW PROMO (e.g. STARGATE UNIVERSE) - TOURISM SITES (1, 2) - VIRTUAL REAL ESTATE TOURS - VIRTUAL ART GALLERY - POINT CLOUD TO 3D MESH (1, 2) - VIDEO GAME VISUALIZATION/PROMOTION - BING MAPS/VIRTUAL EARTH - AIRLINES (British Air in-flight synths) - FACEBOOK (post to fb from Photosynth site)
Slide notes: This section sparked interesting comments from the audience. Photosynth is really still just an experiment; a way for Microsoft to explore new technologies. Photosynth is appropriate for some applications, but not suited for others. Considerable investigation is being done to figure out which are which. Maybe YOU will find a cool new application!...
Q & A - IF YOU DON'T HAVE ANY QUESTIONS, THEN YOU WEREN'T PAYING ATTENTION! …