Using associative memory principles to enhance the perceptual ability of vision systems (Giving a meaning to what you see). CVPR Workshop on Face Processing in Video.

Presentation transcript:

Using associative memory principles to enhance perceptual ability of vision systems (Giving a meaning to what you see) CVPR Workshop on Face Processing in Video June 28, 2004, Washington, DC, USA Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology National Research Council Canada

Designing visual memory using attractor-based neural networks with application to perceptual vision systems CVPR Workshop on Face Processing in Video June 28, 2004, Washington, DC, USA Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology National Research Council Canada

3. Associative memory for video (Dr. Dmitry Gorodnichy) The unique place of this research: at the intersection of computer vision, pattern recognition and neurobiology (your eyes, your brain).

4. Associative memory for video (Dr. Dmitry Gorodnichy) Talk overview:
1. On the neurobiology side: how it works in the brain, from the eye retina, to the primary visual cortex, to neurons, to synapses.
2. Memories as attractors of an associative neural network: finding the best learning rule to tune the synapses.
3. On the computer vision path: evolution of perceptual vision user interface systems, from face detection to face tracking to face localization to face recognition.
4. Putting it all together, visual memory for analyzing faces: what makes processing in video special, the canonical face representation, memories of faces as attractors of the network.

5. Associative memory for video (Dr. Dmitry Gorodnichy) How we (humans) see & memorize what we see. Seeing: the dorsal ("where") stream (V1, V2, V3, …) deals with object localization; the ventral ("what") stream (V1, V2, V4, inferior temporal cortex TE/IT) deals with object recognition. Refs: Perus, Ungerleider, Haxby, Riesenhuber, Poggio, …

6. Associative memory for video (Dr. Dmitry Gorodnichy) How we (humans) see & memorize what we see (cont'd). Recognizing / memorizing: the brain contains billions of interconnected neurons. Neurons are either at rest or activated (modelled as units taking values Yi = {+1, -1}), depending on the values of the other neurons Yj and the strength of the synaptic connections Cij. The brain is thus modelled as a network of binary neurons evolving in time from an initial state (e.g. a stimulus coming from the retina) until it reaches a stable state, an attractor. The attractors of the network are what we actually remember: associative memory. Refs: Hebb'49, Little'74,'78, Willshaw'71, …
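For concreteness, the evolution described above can be written with the standard sign-threshold dynamics of Hopfield-type binary networks (this is the textbook formulation rather than a formula copied from the slide; Yi and Cij follow the slide's notation):

```latex
Y_i(t+1) \;=\; \operatorname{sgn}\!\Big(\sum_{j} C_{ij}\, Y_j(t)\Big),
\qquad Y_i \in \{+1,-1\}.
```

A pattern V is an attractor (a stored memory) when it is a fixed point of this map, i.e. V_i = sgn(Σ_j C_ij V_j) for all i.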

7. Associative memory for video (Dr. Dmitry Gorodnichy) Attractor-based neural networks. Recognition / memorization, formally: V ~ CV. Main question: how to compute Cij so that a) the desired patterns V^m become attractors, i.e. V^m ~ CV^m, and b) the network exhibits the best associative (error-correction) properties, i.e. the largest attraction radius (tolerated noise) and the largest number of stored prototypes M? Refs: Hebb'49, McCulloch-Pitts'43, Amari'71,'77, Hopfield'82, Sejnowski'89, Willshaw'71

8. Associative memory for video (Dr. Dmitry Gorodnichy) How to update weights. Learning rules: from biologically plausible to mathematically justifiable. Neurophysiological postulate: "If two neurons on either side of a synapse are activated, then the strength of the synapse is strengthened." Compare: "When a child is born, she knows nothing. As she repeatedly observes, she learns" (postulate from the Montessori approach to infant development). Models — Hebb: C = (1/N) VV^T, together with its generalized and improved variants. Refs: Hebb'49, Hopfield'82, Sejnowski'77, Willshaw'71
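Written out, the classical Hebb prescription for M prototypes V^1, …, V^M of dimension N is (a standard formulation; the generalized and improved variants shown on the original slide are not reproduced here):

```latex
C_{ij} \;=\; \frac{1}{N}\sum_{m=1}^{M} V_i^{m} V_j^{m},
\qquad\text{i.e.}\qquad
C \;=\; \frac{1}{N}\, V V^{T},
\quad V = [\,V^1 \;\cdots\; V^M\,].
```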

9. Associative memory for video (Dr. Dmitry Gorodnichy) Pseudo-inverse as the best learning rule: C = VV+, so that V = CV. It is obtained mathematically from the stability condition V^m = CV^m. With reduced self-connection (Cii → 0.15 Cii), it is guaranteed [Gorodnichy'97] to retrieve M = 0.5N patterns from 8% noise and M = 0.7N patterns from 2% noise (for comparison: the Hebb rule stops retrieving when M = 0.14N). Widrow-Hoff's (delta) rule is an iterative approximation of it, and the Hebb rule is a special case of it for orthogonal prototypes. Refs: Amari'71,'77, Kohonen'72, Personnaz'85, Kanter-Sompolinsky'86, Gorodnichy'95-'99
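A self-contained sketch of how the projection matrix C = VV+ can be built incrementally, one prototype at a time, using the standard Greville-type update, with the slide's reduced self-connection applied for retrieval. This is only an illustration written for this text, not the PINN library code referred to a couple of slides later; class and variable names are invented.

```cpp
#include <vector>

// Illustrative sketch (not the PINN library): incremental construction of the
// projection matrix C = V V^+ over the stored prototypes, one pattern at a time.
class PseudoInverseMemory {
public:
    explicit PseudoInverseMemory(int n) : N(n), C(n, std::vector<double>(n, 0.0)) {}

    // Store one bipolar prototype v (+1/-1 entries) via the Greville-type update:
    //   e = v - C v;   C <- C + e e^T / (v . e)
    // If v already lies in the span of the stored prototypes (e ~ 0), nothing changes.
    bool store(const std::vector<double>& v) {
        std::vector<double> e(N);
        double ve = 0.0;
        for (int i = 0; i < N; ++i) {
            double cv = 0.0;
            for (int j = 0; j < N; ++j) cv += C[i][j] * v[j];
            e[i] = v[i] - cv;
            ve  += v[i] * e[i];
        }
        if (ve < 1e-9) return false;              // linearly dependent pattern
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j)
                C[i][j] += e[i] * e[j] / ve;
        return true;
    }

    // Retrieval weights with reduced self-connection (C_ii -> 0.15 C_ii),
    // i.e. the desaturation step referred to on the slide.
    std::vector<std::vector<double>> retrievalWeights(double d = 0.15) const {
        auto W = C;
        for (int i = 0; i < N; ++i) W[i][i] *= d;
        return W;
    }

    // One synchronous retrieval sweep: y <- sgn(W y), keeping y_i when the sum is zero.
    void update(std::vector<double>& y,
                const std::vector<std::vector<double>>& W) const {
        std::vector<double> s(N, 0.0);
        for (int i = 0; i < N; ++i)
            for (int j = 0; j < N; ++j) s[i] += W[i][j] * y[j];
        for (int i = 0; i < N; ++i)
            y[i] = (s[i] > 0.0) ? 1.0 : (s[i] < 0.0 ? -1.0 : y[i]);
    }

private:
    int N;
    std::vector<std::vector<double>> C;
};
```

Storing a prototype costs O(N^2) operations, so for N = 576 (the 24x24 faces used later in the talk) learning a new face remains a real-time operation.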

10. Associative memory for video (Dr. Dmitry Gorodnichy) What else is good about the PI rule, besides that it yields the best retrieval for this type of network: It is non-iterative, which is good for fast (real-time) learning, and it is also fast in retrieval. The performance of the network can be examined and improved analytically. It is guaranteed to converge. It can deal with a continuous stream of data without ever becoming saturated, if dynamic desaturation is used: it maintains a capacity of 0.2N (with complete retrieval), provides a means for forgetting obsolete data, and sets the basis for designing adaptive filters. All this makes the network very suitable for real-time memorization and recognition, as needed for video processing tasks. Finally, there is free C++ code at the PINN website which you can compile and try yourself!

11. Associative memory for video (Dr. Dmitry Gorodnichy) These neural networks are known as…
- pseudo-inverse networks, for using the Moore-Penrose pseudoinverse V+ in computing the synapses;
- projection networks, for the synaptic (weight) matrix C = VV+ being the projection matrix onto the space of prototypes;
- Hopfield-like networks, for being binary and fully connected in the stage of learning;
- recurrent networks, for evolving in time based on external input and internal memory;
- attractor-based networks, for storing patterns as attractors (i.e. stable states of the network);
- dynamic systems, for allowing dynamic systems theory to be applied;
- associative memory, for being able to memorize, recall and forget patterns, just as humans do.

12. Associative memory for video (Dr. Dmitry Gorodnichy) Analytical examination By looking at the synaptic weights Cij, one can say a lot … about the properties of memory: - how many main attractors (stored memories) it has. - how good the retrieval is.

13. Associative memory for video (Dr. Dmitry Gorodnichy) Attraction Radius as function of weights Theoretical result: (for direct attraction radius)

14. Associative memory for video (Dr. Dmitry Gorodnichy) Dynamics of the network. The behaviour of the network is governed by the energy function. The network always converges as long as Cij = Cji. Cycles are possible when D < 1; however, they are few (when D > 0.1 [Gorodnichy&Reznik'97]) and they are detected automatically.
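The energy function referred to here is, in the usual Hopfield form (a standard expression supplied for completeness, not copied from the slide):

```latex
E(Y) \;=\; -\tfrac{1}{2}\sum_{i,j} C_{ij}\, Y_i Y_j .
```

For symmetric weights (Cij = Cji) with non-negative self-connections, this quantity does not increase under asynchronous sign-threshold updates, which is the standard argument behind the convergence claim.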

15. Associative memory for video (Dr. Dmitry Gorodnichy) Update-flow neuro-processing [Gorodnichy&Reznik'94]: "Process only those neurons which change during the evolution", i.e. instead of N multiplications per neuron, do only a few of them. It is very fast (as only a few neurons actually change in one iteration), it detects cycles automatically, and it is suitable for parallel implementation. A sketch of this idea follows below.
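A minimal sketch of the update-flow idea, assuming the weight matrix W and bipolar state y from the earlier sketch (names are illustrative): cache the postsynaptic sums and, whenever a neuron flips, patch every sum with a single multiply-add instead of recomputing the full sums.

```cpp
#include <vector>

// Sketch of "update-flow" retrieval: cached postsynaptic sums S_i = sum_j W_ij*y_j
// are corrected only for the neurons that actually flipped, instead of being
// recomputed from scratch on every iteration.
void retrieveUpdateFlow(const std::vector<std::vector<double>>& W,
                        std::vector<double>& y, int maxIters = 1000) {
    const int N = static_cast<int>(y.size());
    std::vector<double> S(N, 0.0);
    for (int i = 0; i < N; ++i)               // initial sums: done once, O(N^2)
        for (int j = 0; j < N; ++j) S[i] += W[i][j] * y[j];

    for (int it = 0; it < maxIters; ++it) {
        std::vector<int> flipped;
        for (int i = 0; i < N; ++i) {         // find neurons that want to change
            double target = (S[i] > 0) ? 1.0 : (S[i] < 0 ? -1.0 : y[i]);
            if (target != y[i]) flipped.push_back(i);
        }
        if (flipped.empty()) return;          // converged to an attractor

        for (int k : flipped) {               // apply flips and patch the sums
            double delta = -2.0 * y[k];       // new y[k] minus old y[k]
            y[k] = -y[k];
            for (int i = 0; i < N; ++i) S[i] += W[i][k] * delta;
        }
        // A cycle detector (comparing the state against previously visited ones)
        // would go here; omitted to keep the sketch short.
    }
}
```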

16. Associative memory for video (Dr. Dmitry Gorodnichy) How is video information processed? Now that we know how to memorize, the question is what should be memorized. What type of video information needs to be processed? Let's see what mother nature (neurobiology) tells us.

17. Associative memory for video (Dr. Dmitry Gorodnichy) Visual processing mechanisms. Images are of very low resolution except at the fixation point; the eyes look at points which attract visual attention. Saliency lies in a) motion, b) colour, c) disparity, d) intensity, and these channels are processed independently in the brain. Intensity means frequencies, orientation, gradient. The brain processes sequences of images rather than a single image: bad image quality is compensated by the abundance of images. Colour & motion are used for segmentation; intensity is used for recognition. Bottom-up (image-driven) visual attention is very fast and precedes top-down (goal-driven) attention: 25 ms vs 1 sec. Refs: Itti, …

18. Associative memory for video (Dr. Dmitry Gorodnichy) Visual recognition mechanism: What to learn: generality vs specifics, invariance vs selectivity. Affine transformations in 2D (rotation in the image plane, scale) are easily dealt with. No 3D model is stored; instead, several view-based 2D models are stored, with one neural network per view. In the context of face recognition: faces are stored in a canonical representation; 2D transformations are easy in image/video processing; and video allows one to wait until a face is in a position in which it was stored. Refs: Poggio, …

19. Associative memory for video (Dr. Dmitry Gorodnichy) Orientation selectivity, top-down vs bottom-up detection. From [Riesenhuber & Poggio, Nature Neuroscience, 2000].

20. Associative memory for video (Dr. Dmitry Gorodnichy) On computer vision side

21. Associative memory for video (Dr. Dmitry Gorodnichy) Perceptual Vision System. Goal: to detect, track and recognize the face and facial movements of the user (the PUI outputs face coordinates x, y, z, binary ON/OFF events, and recognition / memorization results such as "Unknown User!"). Setup: + face close to the camera (within hand distance); + approximately frontal orientation; + limited number of users and motions; - off-the-shelf camera (low quality, low resolution); - desktop computer (with limited processing power).

22. Associative memory for video (Dr. Dmitry Gorodnichy) What can be "perceived": face processing tasks ("I look and see…"):
- Face Segmentation: "Something yellow moves"
- Face Detection: "It's a face"
- Face Tracking (crude): "Let's follow it!"
- Face Localization (precise): "It's at (x, y, z, …)"
- Face Classification: "It's a face of a child"
- Facial Event Recognition: "S/he smiles, blinks"
- Face Memorization: "Face unknown. Store it!"
- Face Identification: "It's Mila!"

23. Associative memory for video (Dr. Dmitry Gorodnichy) Computer vision results achieved:
- Proof-of-concept PUI: colour-based tracking [Bradski] – unlikely to be usable for precise tracking…
- Several good skin colour models developed (HSV, UCS, YCrCb*) – unlikely to get much better than that…
- Subpixel-accuracy convex-shape nose tracking [Nouse]
- Motion-based segmentation & localization: 2001, non-linear change detection; 2003, second-order change detection [Double-blink]
- Viola-Jones face detection using Haar-like wavelets
- Stereotracking using the nose and projective vision [Gorodnichy, Roth - IVC]

24. Associative memory for video (Dr. Dmitry Gorodnichy) Face Detection and Tracking

25. Associative memory for video (Dr. Dmitry Gorodnichy) Face Detection and Tracking (lights off)

26. Associative memory for video (Dr. Dmitry Gorodnichy) Face Detection and Tracking (lights on)

27. Associative memory for video (Dr. Dmitry Gorodnichy) Demand and applications. Press coverage (LA NACION, "Internet, trends & technology"): "The nose used as a mouse. At the Institute for Information Technology in Canada, a system called Nouse was developed that makes it possible to control software with movements of the face. The creator of this program, Dmitry Gorodnichy, explained to LA NACION LINE how it works and what it can be used for." Copyright S.A. LA NACION. All rights reserved.

28. Associative memory for video (Dr. Dmitry Gorodnichy) On the importance of the nose. Test: the user rotates the head only (the shoulders do not move). Precision and convenience are such that the nose can be used as a mouse (or a joystick handle) – to "Nouse".

29. Associative memory for video (Dr. Dmitry Gorodnichy) Nouse TM : range and speed of tracking

30. Associative memory for video (Dr. Dmitry Gorodnichy) Stereotracking with nose feature

31. Associative memory for video (Dr. Dmitry Gorodnichy) Second-order change detection: detecting a change in a change [Gorodnichy'03]. Non-linear change detection deals with changes due to illumination [Durucan'02]. A schematic sketch of the second-order idea is given below.
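A schematic sketch of what "detecting a change in a change" can look like on grayscale frames. This illustrates only the general idea (simple absolute differencing with a fixed threshold) and is not the specific detector of [Gorodnichy'03]; the function names and threshold value are assumptions.

```cpp
#include <vector>
#include <cstdint>
#include <cstdlib>

// First-order change: per-pixel difference between consecutive frames.
// Second-order change: difference between consecutive first-order change maps,
// i.e. "a change in a change" (illustrative version with a fixed threshold).
using Frame = std::vector<uint8_t>;   // grayscale image, row-major

Frame absDiff(const Frame& a, const Frame& b) {
    Frame d(a.size());
    for (size_t i = 0; i < a.size(); ++i)
        d[i] = static_cast<uint8_t>(std::abs(int(a[i]) - int(b[i])));
    return d;
}

// Returns a binary map of second-order change for frames at t-2, t-1 and t.
std::vector<uint8_t> secondOrderChange(const Frame& f0, const Frame& f1,
                                       const Frame& f2, int threshold = 20) {
    Frame d1 = absDiff(f1, f0);       // change between t-2 and t-1
    Frame d2 = absDiff(f2, f1);       // change between t-1 and t
    Frame dd = absDiff(d2, d1);       // change of the change
    std::vector<uint8_t> mask(dd.size());
    for (size_t i = 0; i < dd.size(); ++i)
        mask[i] = (dd[i] > threshold) ? 255 : 0;
    return mask;
}
```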

32. Associative memory for video (Dr. Dmitry Gorodnichy) Eye blink detection: previously very difficult for moving heads, it became possible with second-order change detection. It is currently used to enable face-to-face communication for people with brain injury [AAATE'03].

33. Associative memory for video (Dr. Dmitry Gorodnichy) Something (special) about video. Importance: video is becoming ubiquitous, cameras are everywhere; for security, computer-human interaction, video-conferencing, entertainment… Constraints: real-time processing is required; low resolution (160x120 images, or MPEG-decoded); low quality (weak exposure, blurriness, cheap lenses). Essence: it is inherently dynamic, so temporal information can make up for the bad quality, and it has parallels with biological vision, so it can be processed efficiently.

34. Associative memory for video (Dr. Dmitry Gorodnichy) Applicability of 160x120 video, according to face anthropometrics (studied on the BioID database) and tested with perceptual user interfaces. Legend: ✓ = good, b = barely applicable, - = not good.

Face size:   1/2 image | 1/4 image | 1/8 image | 1/16 image
In pixels:   80x80     | 40x40     | 20x20     | 10x10
(Rows for inter-ocular distance (IOD), eye size and nose size in pixels were also given on the slide.)
FS  (Face Segmentation):               ✓  ✓  ✓  b
FD  (Face Detection):                  ✓  ✓  b  -
FT  (Face Tracking):                   ✓  ✓  b  -
FL  (Face Localization):               ✓  b  -  -
FER (Facial Event Recognition):        ✓  ✓  b  -
FC  (Face Classification):             ✓  ✓  b  -
FM / FI (Memorization / Identification): ✓  ✓  -  -

35. Associative memory for video (Dr. Dmitry Gorodnichy) Choosing the face model. On the importance of eyes: eyes are the most salient features on a face; besides, there are two of them, which makes an excellent reference frame, and they are also the best (and the only) stable landmarks on a face that can be used as a reference. The inter-ocular distance (IOD) makes a very convenient unit of measurement, leading to an eye-centered face model. On resolution: use the lowest resolution possible, so as not to invite overfitting due to the noise that is present (and there is a lot of noise in video!).

36. Associative memory for video (Dr. Dmitry Gorodnichy) Eye-centered face representations (with the inter-ocular distance IOD as the reference scale): suitable for face analysis from video and for face recognition in travel documents [ICAO'02]. A size of 24 x 24 is sufficient for face memorization & recognition and is optimal for low-quality video and for fast processing.

37. Associative memory for video (Dr. Dmitry Gorodnichy) From image pixels to feature vectors. When the eyes are detected and a face is converted to the canonical representation, it is easy to memorize and to recognize. Use (orientational, frequency) features such as Gabor filters? Since faces are already rectified (to the same scale and orientation), there is no need for complex transformations; one just has to deal with illumination changes. Converting a 24x24 face to a binary feature vector: A) V_i = I_xy - I_ave, giving N = 24x24 = 576; B) V_ij = sign(I_i - I_j), giving N = 24^4; C) V_ij = Haar-like(i,j,k,l), giving many more. Some pixels may be ignored (corners, eye locations). A sketch of option A is shown below.
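A small sketch of option A, under the natural reading that the mean-subtracted intensities are binarized to {+1, -1} for the bipolar network; the thresholding step and the names are assumptions, only the formula V_i = I_xy - I_ave comes from the slide.

```cpp
#include <vector>
#include <cstdint>

// Option A: subtract the mean intensity of the 24x24 canonical face patch and
// binarize to a bipolar feature vector of length N = 576.
// (Thresholding at the mean is an assumed detail; the slide gives V_i = I_xy - I_ave.)
std::vector<int> faceToBipolarVector(const std::vector<uint8_t>& patch /* 24*24 pixels */) {
    const int N = 24 * 24;
    double mean = 0.0;
    for (int i = 0; i < N; ++i) mean += patch[i];
    mean /= N;

    std::vector<int> v(N);
    for (int i = 0; i < N; ++i)
        v[i] = (patch[i] >= mean) ? +1 : -1;   // bipolar encoding for the network
    return v;
}
```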

38. Associative memory for video (Dr. Dmitry Gorodnichy) Closer to experiments. A network of size N = 576 stores M = N/2 states with … and M = N/4 states with 25%N error correction. Faces are extracted using the OpenCV Viola-Jones detector (see the sketch below); another way is to locate them from blinking, as in [AVBPA'03].
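For reference, a minimal modern OpenCV (C++ API) sketch of Viola-Jones face detection with a Haar cascade. The original experiments used the OpenCV of that time, so treat this only as an illustrative equivalent; the cascade file path is an assumption and depends on the installation.

```cpp
#include <opencv2/objdetect.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <vector>

// Detect faces in webcam frames with OpenCV's Viola-Jones (Haar cascade) detector.
int main() {
    cv::CascadeClassifier face_cascade;
    if (!face_cascade.load("haarcascade_frontalface_default.xml")) return 1;

    cv::VideoCapture cap(0);                  // off-the-shelf webcam
    if (!cap.isOpened()) return 1;

    cv::Mat frame, gray;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::equalizeHist(gray, gray);         // mild illumination normalization

        std::vector<cv::Rect> faces;
        face_cascade.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(20, 20));

        // Each detected rectangle can then be cropped, rescaled to the
        // eye-centered 24x24 canonical representation and fed to the memory.
        if (!faces.empty()) { /* ... */ }
    }
    return 0;
}
```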

39. Associative memory for video (Dr. Dmitry Gorodnichy) Visual memory for user perception What can be retrieved: –user identity, –face orientation, –facial expression

40. Associative memory for video (Dr. Dmitry Gorodnichy) Retrieving orientation

41. Associative memory for video (Dr. Dmitry Gorodnichy) A few more demos, taped and live, as time allows. Watch how the memory is being filled as new prototypes are learned.

42. Associative memory for video (Dr. Dmitry Gorodnichy) Conclusions: the place of this work at the intersection of computer vision, pattern recognition and neurobiology.

43. Associative memory for video (Dr. Dmitry Gorodnichy) Conclusions. A lot has been done in pattern recognition, computer vision and neurobiology: how to know all of it, and how to use all of it, whichever way you prefer? The attractor-based network is a great tool: it is very easy to understand what it is doing, very suitable for live real-time video processing, and very much in line with biological vision; you are invited to try it yourself, from our website. Other contributions: a canonical face representation for face processing in video (FPIV). Is it possible to work while on parental leave with two kids?

44. Associative memory for video (Dr. Dmitry Gorodnichy) Dealing with a stream of data. Dynamic desaturation: maintains the capacity of 0.2N (with complete retrieval); allows data to be stored in real time (no need for iterative learning methods!); provides a means for forgetting obsolete data; and is the basis for the design of adaptive filters.