Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman,

1 Image Processing and Computer Vision Lecture 4, Multimedia E-Commerce Course November 5, 2002 Mike Christel (significant input by Henry Schneiderman, http://www.cs.cmu.edu/~hws) © Copyright 2002 Michael G. Christel and Alexander G. Hauptmann Carnegie Mellon

2 Outline
- Defining Image Processing and Computer Vision
- Emerging Technology
  - Digitization of documents
  - Digitization of images/photographs
  - Biometrics
  - Management of images on computers
  - Other: manufacturing, military, games, …
- Research in Image Processing and Computer Vision
  - Automatically Finding Faces and Cars
  - Content-based Image Retrieval

3 Image Processing vs. Computer Vision
- Image Processing
  - Research area within electrical engineering/signal processing
  - Focus on syntax, low level features
- Computer Vision
  - Research area within computer science/artificial intelligence
  - Focus on semantics, symbolic or geometric descriptions
[Figure: an image mapped to labels: Faces, People, Chairs, etc.]

4 Optical Character Recognition (OCR)
- First patent in OCR in 19th century
- First applications in post-office and banks
- Documents easier to distribute, search, organize, and edit in digital form
  - Typewriter has been replaced by word processor
  - Lots of legacy materials (the world’s libraries of books) available only in print
- State of the art not perfect, but 99% accurate on cleanly printed pages
- Examples of errors...

5 Heavy Print (output from 3 commercial OCR systems)

6 Light Print

7 Stray Marks

8 Typography

9 Processing Overlaid Text in Video
[Figure: Video → Text Area Detection → Text Area Preprocessing → Commercial OCR → ASCII Text]
The Video OCR (VOCR) process used by the Informedia research group at Carnegie Mellon

10 Text Area Detection

11 [Figure: Video Frames (1/2 s intervals) → Filtered Frames → AND-ed Frames]
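The AND-ing step in the text area detection figure can be sketched as follows. This is a minimal illustration, not the Informedia implementation: it assumes each filtered frame has already been reduced to a binary mask (1 = candidate text pixel), and the names `and_frames`, `frame_a`, `frame_b`, and `frame_c` are hypothetical. The idea is that overlaid captions stay in place across frames sampled 1/2 s apart while the background changes, so only persistent pixels survive the AND.

```python
def and_frames(masks):
    """Pixel-wise AND of equally sized binary masks (lists of rows of 0/1)."""
    if not masks:
        raise ValueError("need at least one mask")
    result = [row[:] for row in masks[0]]
    for mask in masks[1:]:
        for y, row in enumerate(mask):
            for x, value in enumerate(row):
                result[y][x] = result[y][x] & value
    return result


# Three hypothetical filtered frames, sampled 1/2 s apart.
frame_a = [[0, 1, 1, 0],
           [1, 1, 1, 0]]
frame_b = [[0, 1, 1, 1],
           [0, 1, 1, 0]]
frame_c = [[1, 1, 1, 0],
           [0, 1, 1, 0]]

# Only pixels set in every frame remain: the stable (likely text) region.
persistent = and_frames([frame_a, frame_b, frame_c])
print(persistent)
```

Pixels that flicker on in a single frame (moving background edges, for example) drop out, which is why the AND-ed frame is a cleaner input for the later OCR stage.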

12 VOCR Preprocessing Problems

13 Augmenting VOCR with Dictionary Look-up
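One way dictionary look-up could augment OCR output is to snap each recognized word to its closest dictionary entry when the match is strong enough. The sketch below is an assumption about the general idea, not the method used in VOCR: it relies on Python's standard `difflib.get_close_matches`, and the function name `correct_word`, the word list, and the 0.8 cutoff are all hypothetical.

```python
import difflib


def correct_word(ocr_word, dictionary, cutoff=0.8):
    """Return the closest dictionary word, or the input if nothing is close."""
    lowered = ocr_word.lower()
    if lowered in dictionary:
        return ocr_word  # already a known word, leave it untouched
    matches = difflib.get_close_matches(lowered, dictionary, n=1, cutoff=cutoff)
    return matches[0] if matches else ocr_word


# Hypothetical dictionary; a real system would use a much larger word list.
dictionary = ["carnegie", "mellon", "university", "pittsburgh"]

print(correct_word("un1versity", dictionary))  # digit 1 misread for letter i
print(correct_word("Carnegie", dictionary))    # known word, unchanged
print(correct_word("qqq", dictionary))         # no close match, unchanged
```

A high cutoff keeps the correction conservative: words with no strong dictionary match (names, numbers) pass through unchanged rather than being forced onto the wrong entry.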

14 Handwriting Recognition
- Natural progression from OCR work for print
- Works if there are constraints on the writer, e.g. Palm Pilot, where the user is asked to conform to a specific style or convention

15 Other Document Processing
- Not just for text...
- Examples:
  - Engineering document to CAD file
  - Maps to GIS format
  - Music score to MIDI representation

17 Digital Cameras = Convenience
- Easy to capture photos
- Easy to store and organize photos
- Easy to duplicate photos
- Easy to edit photos
- Rough Multimedia eCommerce class survey:
  - 1999: 10% own digital cameras
  - 2000: 25%
  - 2001: 50%
  - 2002: ??

18 Digital Camera Cautions
Via “Photo Industry Reporter” e-Magazine at: http://www.photoreporter.com/2002/10-21/photokina_report_look_at_35mm.html
- Film cameras still outsell digital cameras by almost three to one
- The household penetration of digital is at about 15%
- “But let’s face it: film’s days are numbered. Anyone staying solely with film these days will have a glorious buggy whip in a market that will be clamoring for cars.”

19 Digital Camera Growth
- Photo Marketing Association on US digital camera sales:
  - 4.5 million in 2000
  - 6.9 million in 2001
  - Projected 9.3 million for 2002
  - http://www.visioneer.com/About/press/june2402.html
- InfoTrends Research Group estimates that the U.S. photo-enabled TV set-top installed base will grow from less than 1 million units in 2002 to over 114 million units in 2006. Household penetration will climb from under 1% to around 85%.
- InfoTrends projects digital camera sales to grow at a rate of 38% through 2003

20 State of the Art: Digital Cameras
- Film is currently better in resolution and color
  - Professional photographers
    - Digital for low quality newspaper advertisements
    - Film for portrait photos
- Computer storage limitations: 1 high resolution digital image = 20-25 megabytes
  - http://pic.templetons.com/brad/photo/pixels.html
  - 3500 line pairs/35 mm or about 5000 dots/inch, but grainy
  - At 3:2 frame size, ~20 million pixels
  - Conclusion: “a 5300 x 4000 digital camera would produce a shot equivalent to a scan from a quality 35mm camera -- provided you can get more than 8 bits per pixel. …A 3000 x 2000 digital camera would match the 35mm for a good percentage of shots.”
- Printing: home printers not comparable to commercial printers
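The pixel counts quoted above can be checked with a few lines of arithmetic. The slide does not say how its 20-25 megabyte figure was derived, so the `bytes_per_pixel` values below are assumptions chosen only to show the range; the function names are hypothetical.

```python
def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1_000_000


def uncompressed_megabytes(width, height, bytes_per_pixel):
    """Raw storage in megabytes (10**6 bytes), with no compression."""
    return width * height * bytes_per_pixel / 1_000_000


# The two camera resolutions mentioned in the quoted conclusion.
print(megapixels(5300, 4000))  # roughly the "~20 million pixels" figure
print(megapixels(3000, 2000))

# Assumed storage depths: 1 byte per pixel vs. 3 bytes per pixel (8-bit RGB).
print(uncompressed_megabytes(5300, 4000, 1))
print(uncompressed_megabytes(5300, 4000, 3))
```

At one byte per pixel the 5300 x 4000 image lands inside the slide's 20-25 megabyte range; at three bytes per pixel it is about three times larger, which illustrates why storage was a limiting factor.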

21 Future of Digital Cameras
- Improved resolution and color
- “Smart” cameras
- More programmable features
  - Auto-focus on object of interest
  - “Everything in focus” photo
  - Capture photo when event X occurs

23 Biometrics
- Technology for identification
  - Finger/palm print
  - Iris
  - Face

24 Fingerprints
- Minutiae – splits and merges of ridges

25 Face Identification
- Not quite reliable yet.
  - Performance degrades rapidly with uncontrolled lighting, facial expression, and size of database
- Several companies exist:
  - Visionics (Rockefeller University spin-off)
  - Viisage (MIT spin-off)
  - EyeMatic (USC spin-off)
  - Miros (MIT spin-off)
  - Banque-Tec Intl (Australia)
  - C-VIS Computer Vision (Germany)
  - LAU Technologies
- Commercial systems installed in London and Brazil to catch criminals

26 Automatic Age Progression
[Figure: Original Image (1962), Computer-Aged (1997), Actual Photo (1997)]

28 Management of images on computers
- Compression – reducing storage size needed for images
- Watermarking – protecting copyright
  - Microsoft, Bell Labs, NEC, etc.
[Figure: visible watermark]

29 Photo Manipulation
- Adobe Photoshop, Corel PhotoPaint, Pixami, PhotoIQ, etc.
- Image editing: crop an image, adjust the color, paint over part of any image, airbrush part of an image, combine images, etc.
- Future: Applications of computer vision, e.g., discriminating foreground from background.

30 Online Digital Image Collections
- Stock photos of use to graphic designers, artists, etc.
- Large collections of images exist
  - Corbis: 67 million images
  - Getty: 70 million stock photography images
  - AP collects 1000s of digitized images per day

32 Inspection for Manufacturing
- Occum – inspection of printed circuit boards ($100M / year)
- Cognex – Do-it-yourself toolkits for inspection (400 employees)

33 Automatic Target Recognition (ATR)
- Finding mines, tanks, etc.
- Billion dollar a year industry
  - Lockheed Martin, TSR, Northrop Grumman, other aerospace contractors.
- Various types of imagery:
  - Synthetic Aperture Radar (SAR), sonar, hyper-spectral imagery (more than 3 colors)

34 Aerial Photo Interpretation
- Also referred to as “automated cartography”
- Classification of land-use: forest, vegetation, water
- Identification of man-made objects: buildings, roads, etc.

35 Better Security Cameras
- Cameras that are responsive to the environment
  - Track and zoom on moving objects
  - Automatic adjustment of contrast

36 Medical imagery
- Medical image libraries for study and diagnosis
- Image overlay to guide surgeons

37 History
- 1980s: ~100 companies, mostly manufacturing applications
- Early 1990s: fewer than 10 companies
- Late 1990s: ~100 companies – face recognition, intelligent teleconferencing, inspection, digital libraries, medical imaging

39 Image Processing: Filtering
Enhancing an image’s quality for human viewing, e.g., in medical imaging or in telescopic views of space
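As one concrete instance of filtering for enhancement, a 3x3 median filter removes isolated speckle noise while leaving flat regions untouched. The slide does not name a specific filter, so this choice is an assumption for illustration; `median_filter` and the sample image are hypothetical, and border pixels are simply copied through.

```python
import statistics


def median_filter(image):
    """Apply a 3x3 median filter to a grayscale image (list of rows).

    Border pixels are left unchanged to keep the sketch short.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1)
                      for dx in (-1, 0, 1)]
            out[y][x] = statistics.median(window)
    return out


# A flat gray patch with a single bright speck in the middle.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255

cleaned = median_filter(noisy)
print(cleaned[2][2])  # the speck is replaced by the neighborhood median
```

A mean filter would smear the speck into its neighbors; the median discards it outright, which is why median filtering is a common choice for this kind of noise.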

40 Image Processing: Compression
- Lossless – no loss in quality: GIF, TIFF
- Lossy – original image cannot be reconstructed exactly: JPEG
- New work on advancing lossy compression strategies with fewer visual artifacts: JPEG 2000 and wavelet transformations
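The lossless/lossy distinction can be demonstrated with a toy example. This is only a sketch: `zlib` (DEFLATE) stands in for a lossless codec and coarse quantization stands in for the information-discarding step of a lossy codec; neither is the actual GIF, TIFF, or JPEG algorithm, and the pixel data and `quantize` helper are hypothetical.

```python
import zlib

# Hypothetical row of 8-bit grayscale pixel values, repeated to give
# the compressor some redundancy to exploit.
pixels = bytes([10, 10, 10, 12, 12, 200, 201, 203] * 32)

# Lossless: the round trip returns exactly the original bytes.
compressed = zlib.compress(pixels)
restored = zlib.decompress(compressed)
print(len(pixels), len(compressed), restored == pixels)


def quantize(data, step):
    """Round each value down to a multiple of `step`, discarding detail."""
    return bytes((value // step) * step for value in data)


# Lossy: fine differences (10 vs. 12, 200 vs. 203) are gone for good.
lossy = quantize(pixels, 8)
print(lossy[:8] == pixels[:8], list(lossy[:8]))
```

After quantization there is no way to tell whether a stored 8 was originally 10 or 12, which is the sense in which the original image cannot be reconstructed.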

41 Image Processing: Watermarking
- Information hiding
  - Protecting copyright
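A very simple form of information hiding is to store message bits in the least significant bit of each pixel value. The sketch below illustrates that idea only; it is not the scheme used by any of the companies named in the lecture, it is not robust to compression or editing, and `embed_bits` and `extract_bits` are hypothetical names.

```python
def embed_bits(pixels, bits):
    """Hide `bits` (0/1 values) in the least significant bits of `pixels`."""
    if len(bits) > len(pixels):
        raise ValueError("not enough pixels to hold the message")
    marked = list(pixels)
    for i, bit in enumerate(bits):
        marked[i] = (marked[i] & ~1) | bit  # clear the low bit, then set it
    return marked


def extract_bits(pixels, count):
    """Read back the first `count` hidden bits."""
    return [value & 1 for value in pixels[:count]]


cover = [200, 201, 57, 58, 120, 121, 33, 34]  # hypothetical pixel values
message = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_bits(cover, message)
print(marked)
print(extract_bits(marked, len(message)))
```

Each pixel changes by at most one intensity level, so the mark is invisible to a viewer, but a practical copyright watermark needs to survive cropping, rescaling, and lossy compression, which this sketch does not attempt.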

42 Image Processing: Transformation
- Transforming image can make it easier to analyze
[Figure: wavelet transform of image]

43 Wavelet Coefficients
[Figure: four subbands: Horizontal LP / Vertical LP; Horizontal HP / Vertical LP; Horizontal LP / Vertical HP; Horizontal HP / Vertical HP]

44 5/3 Linear Phase Wavelets
Linear phase 5/3:
c[n] = {-1, 2, 6, 2, -1}, d[n] = {1, -2, 1}
g[n] = {1, 2, -6, 2, 1}, f[n] = {1, 2, 1}
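The behavior of the two analysis filters can be seen by applying them to simple signals. This is a sketch under stated assumptions: the slide gives no normalization, so the raw integer taps are used as written, downsampling is omitted, and the helper `filter_valid` is a hypothetical name. It treats c[n] as the low-pass filter and d[n] as the high-pass filter, which follows from their tap sums (8 and 0).

```python
C = [-1, 2, 6, 2, -1]  # 5-tap filter from the slide (taps sum to 8)
D = [1, -2, 1]         # 3-tap filter from the slide (taps sum to 0)


def filter_valid(signal, taps):
    """Slide `taps` over `signal`, keeping only fully overlapping positions.

    Both filters are symmetric, so correlation and convolution coincide.
    """
    n = len(taps)
    return [sum(t * signal[i + j] for j, t in enumerate(taps))
            for i in range(len(signal) - n + 1)]


flat = [5] * 8
step = [0, 0, 0, 0, 10, 10, 10, 10]

print(filter_valid(flat, C))  # constant signal passes, scaled by 8
print(filter_valid(flat, D))  # constant signal is blocked entirely
print(filter_valid(step, D))  # response only around the edge
```

The high-pass output is zero everywhere except at the step, which is why the detail subbands of a wavelet transform concentrate on edges.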

45 Computer Vision: 3D Shape Reconstruction
- Use images to build 3D model of object or site
[Figure: 3D site model built from laser range scans collected by CMU autonomous helicopter]

46 Computer Vision: Guiding Motion
- Visually guided manipulation
  - Hand-eye coordination
- Visually guided locomotion
  - robotic vehicles
[Figure: CMU NavLab II]

47 Computer Vision: Recognition & Classification

48 Challenges in Object Recognition
[Figure: an image shown as a grid of pixel values: 245 267 234 142 22 28 38 121 156 187 98 73 32 12 123 21 21 38 209 237 121 99 87 59 197 216 244]

49 Object Recognition Research
[Diagram labels]
- Quality/Quantity Issues: Low Image Quality; Large Quantity of Data
- Object Detection Issues: Intra-class Object Variation; Large Number of Object Classes
- Other labels: Automated Learning; Robust Algorithms; Advanced Image Enhancement; Segmentation and Hierarchical Analysis
- Object Detection: Lips, Face, Text, Building, Hand Gesture, Vehicle, Clock, License Plate

50 Intra-Class Variation

51 Lighting Variation

52 Geometric Variation

53 Simpler Problem: Classification
- Fixed size input
- Fixed object size, orientation, and alignment
[Figure: Decision: “Object is present” (at fixed size and alignment) or “Object is NOT present” (at fixed size and alignment)]

54 Detection: Apply Classifier Exhaustively
- Search in position
- Search in scale
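Exhaustive search in position and scale amounts to a pair of nested loops around the fixed-size classifier. The sketch below is illustrative only: `bright_patch` is a hypothetical stand-in for a trained classifier, and the window size, stride, and scales are arbitrary. A real detector would also resample each window (or the whole image) to the classifier's fixed input size, a step skipped here because the toy classifier only looks at mean brightness.

```python
def sliding_windows(width, height, window, stride, scales):
    """Yield (x, y, size) for every window position at every scale."""
    for scale in scales:
        size = int(round(window * scale))
        if size > width or size > height:
            continue
        for y in range(0, height - size + 1, stride):
            for x in range(0, width - size + 1, stride):
                yield x, y, size


def detect(image, classify, window=2, stride=1, scales=(1.0, 2.0)):
    """Run `classify` on every window and collect the positive ones."""
    height, width = len(image), len(image[0])
    hits = []
    for x, y, size in sliding_windows(width, height, window, stride, scales):
        patch = [row[x:x + size] for row in image[y:y + size]]
        if classify(patch):
            hits.append((x, y, size))
    return hits


def bright_patch(patch):
    """Toy classifier: 'object present' if the patch is bright on average."""
    total = sum(sum(row) for row in patch)
    return total / (len(patch) * len(patch[0])) > 200


image = [[0,   0,   0, 0],
         [0, 255, 255, 0],
         [0, 255, 255, 0],
         [0,   0,   0, 0]]

print(detect(image, bright_patch))
```

The cost grows with the number of positions times the number of scales, which is why the per-window classifier has to be cheap to evaluate.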

55 View-based Classifiers
[Figure: Face Classifier #1, Face Classifier #2, Face Classifier #3]

56 1) Apply Local Operators
$f_1(0, 0) = \#5710$
$f_1(0, 1) = \#3214$
$f_k(n, m) = \#723$

57 2) Look Up Probabilities
$f_1(0, 0) = \#5710$: $P_1(\#5710, 0, 0 \mid \text{obj}) = 0.53$, $P_1(\#5710, 0, 0 \mid \text{non-obj}) = 0.56$
$f_1(0, 1) = \#3214$: $P_1(\#3214, 0, 1 \mid \text{obj}) = 0.57$, $P_1(\#3214, 0, 1 \mid \text{non-obj}) = 0.48$
$f_k(n, m) = \#723$: $P_k(\#723, n, m \mid \text{obj}) = 0.83$, $P_k(\#723, n, m \mid \text{non-obj}) = 0.19$

58 3) Make Decision
$P_1(\#5710, 0, 0 \mid \text{obj}) = 0.53$, $P_1(\#5710, 0, 0 \mid \text{non-obj}) = 0.56$
$P_1(\#3214, 0, 1 \mid \text{obj}) = 0.57$, $P_1(\#3214, 0, 1 \mid \text{non-obj}) = 0.48$
$P_k(\#723, n, m \mid \text{obj}) = 0.83$, $P_k(\#723, n, m \mid \text{non-obj}) = 0.19$
$\dfrac{0.53 \times 0.57 \times \dots \times 0.83}{0.56 \times 0.48 \times \dots \times 0.19} > \text{threshold}$
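The decision step is a likelihood ratio test: multiply the object probabilities, multiply the non-object probabilities, and compare the ratio against a threshold. The sketch below uses only the three probability pairs visible on the slide (the terms elided by "..." are left out), and the threshold of 1.0 is an assumption, since the slide's value is not given. Working in log space is a standard way to avoid numerical underflow when many small probabilities are multiplied.

```python
import math


def log_likelihood_ratio(pairs):
    """Sum of log(P(obj)) - log(P(non-obj)) over all (obj, non_obj) pairs."""
    return sum(math.log(p_obj) - math.log(p_non) for p_obj, p_non in pairs)


def object_present(pairs, threshold=1.0):
    """Decide 'object present' when the likelihood ratio exceeds threshold."""
    return log_likelihood_ratio(pairs) > math.log(threshold)


# (P(code | obj), P(code | non-obj)) for the three terms shown on the slide.
pairs = [(0.53, 0.56), (0.57, 0.48), (0.83, 0.19)]

ratio = math.exp(log_likelihood_ratio(pairs))
print(round(ratio, 2), object_present(pairs))
```

Note that the first pair slightly favors "non-object" on its own; the decision comes from the combined evidence, dominated here by the third pair.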

59 Two Classifiers Trained for Faces

60 Eight Classifiers Trained for Cars

61 Probabilities Estimated Off-Line
$f_1(0, 0) = \#567$: $H_1(\#567, 0, 0) = H_1(\#567, 0, 0) + 1$
$f_k(n, m) = \#350$: $H_k(\#350, 0, 0) = H_k(\#350, 0, 0) + 1$
$P_1(\#567, 0, 0) = \dfrac{H_1(\#567, 0, 0)}{\sum_i H_1(\#i, 0, 0)}$
$P_k(\#350, 0, 0) = \dfrac{H_k(\#350, 0, 0)}{\sum_i H_k(\#i, 0, 0)}$
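The off-line estimation step is histogram counting followed by normalization: each observed operator output increments a bin, and a bin's probability is its count divided by the total. A minimal sketch for one operator at one position follows; `estimate_probabilities` and the list of observed codes are hypothetical, and the same procedure would be run separately on object and non-object training images to obtain the two tables used at decision time.

```python
from collections import Counter


def estimate_probabilities(observed_codes):
    """Turn a list of observed operator codes into a probability table."""
    histogram = Counter(observed_codes)  # H(#code) = number of occurrences
    total = sum(histogram.values())      # sum over i of H(#i)
    return {code: count / total for code, count in histogram.items()}


# Hypothetical codes produced by one operator at one position
# across five training images.
codes = [567, 567, 350, 567, 12]

probs = estimate_probabilities(codes)
print(probs[567], probs[350], probs[12])
```

Codes never seen in training get no entry at all, so a practical system needs some policy for unseen codes (for example, a small floor probability); the slide does not say how this case is handled.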

62 Training Classifiers
- Cars: 300-500 images per viewpoint
- Faces: 2,000 images per viewpoint
- ~1,000 synthetic variations of each original image
  - background scenery, orientation, position, frequency
- 2000 non-object images
  - Samples selected by bootstrapping
- Minimization of classification error on training set
  - AdaBoost algorithm (Freund & Schapire ‘97, Schapire & Singer ‘99)
    - Iterative method
    - Determines weights for samples
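The "iterative method" that "determines weights for samples" can be illustrated with a single round of the textbook discrete AdaBoost update. This is a sketch of the standard algorithm, not necessarily the exact variant used to train the face and car detectors (the Schapire & Singer citation suggests a confidence-rated form may have been used). Labels and predictions are assumed to be +1 or -1, and `adaboost_round` is a hypothetical name.

```python
import math


def adaboost_round(weights, labels, predictions):
    """One discrete AdaBoost round: return (alpha, new normalized weights)."""
    error = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0 < error < 0.5:
        raise ValueError("weak classifier must have weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1 - error) / error)  # vote of this weak classifier
    updated = [w * math.exp(-alpha * y * h)
               for w, y, h in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


weights = [0.25, 0.25, 0.25, 0.25]  # start uniform
labels = [1, 1, -1, -1]
predictions = [1, 1, -1, 1]         # the last sample is misclassified

alpha, new_weights = adaboost_round(weights, labels, predictions)
print(round(alpha, 4), [round(w, 4) for w in new_weights])
```

After the round, the misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.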

65 Web-based Demo of Face Detector http://www.vasc.ri.cmu.edu/cgi-bin/demos/findface.cgi

68 CMU Face Detector in Commercial Product

69 Applications of Face Detection
- Automatic red-eye removal from photographs
- Automatic color balancing in photo-finishing
- Intelligent teleconferencing
- Component in face identification system

70 Difficulty Increases with Complexity of Object
- 2D vs. 3D
- Specific objects – e.g. my coffee mug
- A category of objects – e.g. all coffee mugs
- Amount of intra-category variation
  - Rigid or semi-rigid structure, e.g. face
  - Articulated objects, e.g. human body
  - Functionally defined objects, e.g. chairs

72 Find Images With Similar Colors

73 Find Images with Similar Shape

74 Goal: Find Images with Similar Content

75 Spectrum of Content-Based Image Retrieval
In increasing degree of difficulty:
- Similar color distribution: histogram matching
- Similar texture pattern: texture analysis
- Similar shape/pattern: image segmentation, pattern recognition
- Similar real content: life-time goal :-)
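Histogram matching, the easiest end of the spectrum, can be sketched with histogram intersection as the similarity score. This is one common choice and an assumption here, since the slide does not name a specific measure; the sketch also uses gray levels instead of color to stay short, and the function names and sample "images" are hypothetical.

```python
def gray_histogram(pixels, bins=4, max_value=256):
    """Normalized histogram of 8-bit gray values, as a list of bin fractions."""
    counts = [0] * bins
    for value in pixels:
        counts[value * bins // max_value] += 1
    total = len(pixels)
    return [count / total for count in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: 1 for identical distributions, 0 for disjoint."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Three tiny hypothetical "images", flattened to pixel lists.
dark = [10, 20, 30, 40]
bright = [250, 240, 230, 220]
mixed = [10, 20, 250, 240]

h_dark, h_bright, h_mixed = map(gray_histogram, (dark, bright, mixed))
print(histogram_intersection(h_dark, h_dark))
print(histogram_intersection(h_dark, h_bright))
print(histogram_intersection(h_dark, h_mixed))
```

Because a histogram ignores where pixels are located, two very different scenes with the same color mix score as identical, which is why the harder levels of the spectrum add texture, shape, and segmentation.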

76 Status of Image Search
- Typical Search Features
  - Color
  - Texture
  - Shape
  - Spatial attributes (local color regions; less common than global color, texture, shape metrics)
- Commercial Activity
  - eVision (notes that “visual search engine market segment is projected to reach $1.4 billion by 2005 according to the McKenna Group”, http://www.evisionglobal.com/about/index.html)
  - Virage (www.virage.com)
  - IBM (QBIC part of database toolset)

77 Reference: “A Review of CBIR”
Recommended reading: A Review of Content-Based Image Retrieval Systems, Colin C. Venters and Dr. Matthew Cooper, University of Manchester. Available at http://www.jisc.ac.uk/jtap/htm/jtap-054.html
This review lists features from a number of image retrieval systems, along with heuristic evaluations on the interfaces for a subset of these systems.

78 Search Engines Used by 2001 Multimedia Class
Search engines used for 2001 multimedia retrieval homework (15 others answered a single query each):

79 Search Engines Used in This 2002 Class
Also answering 1 query each were: Excite+, Rexfeature, Webseek+, search.netscape.com+, animalplanet.com+, ask.com, naver.com+

80 For Further Reading on Texture Search
- Texture Search: “Texture features for browsing and retrieval of image data”, B.S. Manjunath and W.Y. Ma, IEEE Trans. on Pattern Analysis and Machine Intelligence 18(8), Aug. 1996, pp. 837-842.
- Texture search via http://www.engin.umd.umich.edu/ceep/tech_day/2000/reports/ECEreport2/ECEreport2.htm (texture features include coarseness, average gray scale value, and number of horizontal and vertical extrema of a specific image region)
- For QBIC, texture search works on global coarseness, contrast and directionality features

81 For Further Exploration of Image Segmentation
- BlobWorld work at UC Berkeley
- Papers, description, sample system available at http://elib.cs.berkeley.edu/photos/blobworld/

82 Further Reading on Wavelet Compression and JPEG 2000
- http://www.gvsu.edu/math/wavelets/student_work/EF/how-works.html
- http://www-ise.stanford.edu/class/psych221/00/shuoyen/
- Henry Schneiderman Ph.D. Thesis “A Statistical Approach to 3D Object Detection Applied to Faces and Cars”, http://www.ri.cmu.edu/pub_files/pub2/schneiderman_henry_2000_2/schneiderman_henry_2000_2.pdf
- http://www.jpeg.org/JPEG2000.html

83 Summary: Image Processing & Computer Vision
- Not as mature as speech recognition
  - Technology not as reliable
  - Fewer companies, fewer products
- Success on limited problems, e.g., documents
- More applicable to fault tolerant problems
- Technology will grow
  - Emergence of digital camera
  - Improved methods

84 Decomposition in Resolution/Frequency
[Figure: fine, intermediate, coarse]

85 Wavelet Decomposition: Vertical subbands (LH)

86 Wavelet Decomposition: Horizontal subbands (HL)

