AISTATS 2009. Jonathan Huang, Carlos Guestrin (Carnegie Mellon University); Xiaoye Jiang, Leonidas Guibas (Stanford University).

Identity management [Shin et al., '03]. (Figure: two crossing tracks whose identities become confused, labeled "Identity 1,2", Track 1, Track 2; caption: "Where is Donald Duck?")

Identity management. (Figure: four tracks whose identity sets mix as they cross, labeled 1,2 / 1,3 / 1,4 over Tracks 1-4; captions ask "Where is ...?")

Reasoning with permutations. Represent uncertainty in identity management with distributions over permutations of identities to tracks. (Table: each track permutation σ of the identities A, B, C, D listed with its probability P(σ); the individual fractions are garbled in the transcript.) One such entry means: "Alice is at Track 1, Bob is at Track 3, Cathy is at Track 2, and David is at Track 4, with probability 1/10."
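As a concrete illustration of the representation above, here is a minimal Python sketch (not from the paper) that stores a distribution over permutations explicitly; the identity names follow the slides, and the uniform probabilities are placeholder values.

```python
# Minimal sketch: an explicit distribution over permutations of n = 4 identities,
# stored as a dictionary mapping each assignment (identity -> track) to its probability.
import itertools

identities = ["Alice", "Bob", "Cathy", "David"]
n = len(identities)

# sigma is a tuple where sigma[i] is the (0-indexed) track occupied by identity i.
# For example, (0, 2, 1, 3) means Alice -> Track 1, Bob -> Track 3,
# Cathy -> Track 2, David -> Track 4.
P = {sigma: 1.0 / 24 for sigma in itertools.permutations(range(n))}  # uniform placeholder

assert abs(sum(P.values()) - 1.0) < 1e-12  # the probabilities sum to 1
```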

Storage complexity. There are n! permutations in S_n (the set of permutations of n objects), so storing the full joint distribution quickly becomes infeasible; we need to compress (the slide illustrates a compression factor of x 1,800,000).
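To make the gap concrete, a small sketch comparing n! (entries in the full joint) with n^2 (entries in the first-order summary introduced on the following slides); the specific values of n are arbitrary.

```python
# Storage comparison: full joint over S_n (n! numbers) versus a first-order
# summary matrix (n^2 numbers).
import math

for n in (4, 8, 12, 16):
    print(f"n={n}: n! = {math.factorial(n):,}   n^2 = {n * n}")
# Already at n = 12, the joint needs 479,001,600 numbers but the first-order
# summary needs only 144.
```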

Fourier-theoretic approaches. Approximate distributions over permutations with low-frequency basis functions [Kondor 2007, Huang 2007]. (Figure: Fourier analysis on the real line uses a sinusoidal basis ordered from low frequency to high frequency; Fourier analysis on S_n, the permutations of n objects, has an analogous ordering.)
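The real-line analogy can be illustrated with a short sketch: keep only the lowest-frequency Fourier coefficients of a 1D signal and reconstruct. This is only the analogy from the slide, not Fourier analysis on S_n; the signal and the number of retained coefficients are arbitrary choices.

```python
# Bandlimiting a 1D signal: keep only the lowest-frequency Fourier coefficients.
import numpy as np

x = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.exp(-((x - 0.3) ** 2) / 0.01)      # a smooth bump

coeffs = np.fft.rfft(signal)
k = 8                                           # number of low frequencies to keep
truncated = np.zeros_like(coeffs)
truncated[:k] = coeffs[:k]                      # zero out everything above frequency k
approx = np.fft.irfft(truncated, n=signal.size)

print("relative L2 error:", np.linalg.norm(signal - approx) / np.linalg.norm(signal))
```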

1st-order summary matrix. Store marginals of the form P(identity i is at track j); this requires storing only n^2 numbers. (Table: a tracks-by-identities matrix over A, B, C, D with entries such as 0, 1/2, 1/5, 3/100, 1/20, 9/20; the full layout is garbled in the transcript.) For example, "Bob is at Track 2 with zero probability" and "Cathy is at Track 3 with probability 1/20." The 1st-order marginals are reconstructible from the O(n^2) lowest-frequency Fourier coefficients.
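A small sketch of how such a summary could be computed from an explicit distribution like the one above; the row/column convention (tracks by identities) is an assumption made for illustration.

```python
# Sketch: build the n x n first-order summary matrix M, where
# M[j, i] = P(identity i is at track j), from an explicit distribution P
# over permutations (sigma[i] = track of identity i).
import numpy as np

def first_order_marginals(P, n):
    M = np.zeros((n, n))              # rows: tracks, columns: identities
    for sigma, prob in P.items():
        for identity, track in enumerate(sigma):
            M[track, identity] += prob
    return M

# M is doubly stochastic: every row and every column sums to 1.
```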

Uncertainty principle on a line. A uniform distribution is hard to represent sparsely in the "time domain" but easy in the "Fourier domain"; a peaked distribution is easy in the time domain but hard in the Fourier domain. Uncertainty principle: a signal f cannot be sparsely represented in both the time and Fourier domains. (Figure: the signal f and its power spectrum in each case.)

Uncertainty principle on permutations. (Figure: power spectra, i.e., the fraction of energy stored at each frequency; higher frequencies are harder to represent.) The cases range from uniform over one group of 8 tracks, through 2 subgroups of 4 and 4 subgroups of 2, down to 8 subgroups of 1 (a peaked distribution with no uncertainty). With identities A-H, uncertainty lives within each subgroup (e.g., confusion between tracks 1,2 or between tracks 3,4) and there is no mixing between subgroups: one extreme keeps the full joint distribution over S_8, the other keeps 8 independent distributions over S_1. Decomposed distributions can be captured more accurately, with fewer coefficients.

Adaptive decompositions. Our approach: adaptively factor the problem into subgroups, allowing higher-order representations for the smaller subgroups. Example observation: "This is Bob" (and Bob was originally in the blue group). Claim: adaptive identity management can be highly scalable and more accurate for sharp distributions.

Contributions. (1) A characterization of the constraints on Fourier coefficients of distributions over permutations implied by probabilistic independence. (2) Two algorithms: one for factoring a distribution (Split) and one for combining independent factors in the Fourier domain (Join). (3) An adaptive algorithm for scalable identity management (handles up to n ~ 100 tracks).

First-order independence condition. Mutual exclusivity: "Alice and Bob are not both at Track k," so P(Alice at Track k, Bob at Track k) = 0. Under independence this joint probability is the product of first-order marginals, P(Alice at Track k) * P(Bob at Track k) = 0, so for every track k either P(Alice at Track k) = 0 or P(Bob at Track k) = 0.

First-order independence condition. (Figure: two tracks-by-identities marginal matrices, one not independent and one independent; independence shows up as block structure, with zeros outside the blocks.) The condition can be verified using first-order marginals alone.
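A sketch of that check on a first-order marginal matrix; the zero tolerance and the convention that track_block and identity_block index the candidate first factor are illustrative assumptions.

```python
# Sketch: test the first-order independence condition by checking whether the
# marginal matrix M (rows: tracks, columns: identities) is block structured
# for a proposed split into two groups.
import numpy as np

def first_order_independent(M, track_block, identity_block, tol=1e-12):
    """True if identities in identity_block never occupy tracks outside
    track_block, and identities outside identity_block never occupy tracks
    inside track_block (i.e., the off-diagonal blocks of M are zero)."""
    n = M.shape[0]
    other_tracks = [t for t in range(n) if t not in track_block]
    other_identities = [i for i in range(n) if i not in identity_block]
    off_a = M[np.ix_(other_tracks, identity_block)]
    off_b = M[np.ix_(track_block, other_identities)]
    return bool(np.all(off_a < tol) and np.all(off_b < tol))
```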

First-order independence. The first-order condition is insufficient: consider "Alice is on the red team," "Bob is on the blue team," and "Alice guards Bob." First-order independence ignores the fact that Alice and Bob are always next to each other. (Image from [Sullivan 06].)

The problem with first-order. The first-order marginals might look like: P(Alice is at Track 1) = 3/5, P(Bob is at Track 2) = 1/2. Now suppose Alice guards Bob, and Tracks 1 and 2 are very far apart. That constraint can be written as a second-order marginal: P({Alice, Bob} occupy Tracks {1, 2}) = 0.

Second-order summaries. Store summaries for ordered pairs: the marginal probability that identities (k, l) map to tracks (i, j). A 2nd-order summary requires O(n^4) storage. (Table: rows indexed by ordered track pairs such as (1,2), (2,1), (1,3), (3,1); columns by ordered identity pairs such as (A,B), (B,A), (A,C), (C,A); entries are fractions such as 1/6, 1/12, 1/8, 1/24.) For example, "Bob is at Track 1 and Alice is at Track 3 with probability 1/12."
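A sketch of computing such ordered-pair marginals from an explicit distribution; the dictionary keying is an illustrative choice rather than the paper's data structure.

```python
# Sketch: second-order (ordered-pair) marginals. M2[(i, j), (k, l)] is the
# probability that identity k is at track i and identity l is at track j.
# There are O(n^4) such entries.
from collections import defaultdict

def second_order_marginals(P, n):
    M2 = defaultdict(float)
    for sigma, prob in P.items():          # sigma[k] = track of identity k
        for k in range(n):
            for l in range(n):
                if k != l:
                    M2[((sigma[k], sigma[l]), (k, l))] += prob
    return M2
```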

Higher orders and connections to Fourier. Trade-off: capture higher frequencies by storing more numbers. At one extreme, the 0th-order summary is the sum over the entire distribution (always equal to 1); at the other extreme, the nth-order summary recovers the original distribution but requires storing n! numbers. Remark: high-order marginals contain the low-order information.

Fourier coefficient matrices. The Fourier coefficients of a distribution on permutations form a collection of square matrices ordered by "frequency": 0th order, 1st order, 2nd order, ..., nth order. Bandlimiting: keep a truncated set of coefficient matrices. Fourier-domain inference: prediction and conditioning can be carried out directly in the Fourier domain [Kondor et al., AISTATS 07; Huang et al., NIPS 07].

Back to independence. We need to consider two operations: groups join when tracks from two groups mix, and groups split when an observation allows us to reason over smaller groups independently (e.g., "This is Bob," and Bob was originally in the blue group).

Problems. Suppose the joint distribution h factors as a product of distributions f and g, where f is a distribution over tracks {1, ..., p} and g is a distribution over tracks {p+1, ..., n}. (Join problem) Find the Fourier coefficients of the joint h given the Fourier coefficients of the factors f and g. (Split problem) Find the Fourier coefficients of the factors f and g given the Fourier coefficients of the joint h.
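In the explicit (non-Fourier) representation the factorization itself is easy to state; the sketch below builds the joint h from factors f and g, assuming identities 0..p-1 stay on tracks 0..p-1 and the rest stay on tracks p..n-1 (an illustrative indexing convention, not the paper's notation).

```python
# Sketch: combine two independent factors into a joint distribution h = f * g.
# f is a distribution over permutations of the first p tracks (local indices
# 0..p-1); g is over permutations of the remaining n - p tracks (local indices
# 0..n-p-1, shifted by p in the joint).
def join_distributions(f, g, p):
    h = {}
    for sigma_f, prob_f in f.items():
        for sigma_g, prob_g in g.items():
            sigma = tuple(sigma_f) + tuple(track + p for track in sigma_g)
            h[sigma] = prob_f * prob_g
    return h
```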

First-order join. Given the first-order marginals of f and g, what does the matrix of first-order marginals of h look like? (Figure: h's first-order marginal matrix is block diagonal, with the marginal matrices of f and g on the diagonal and zeroes elsewhere.)
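A sketch of that block-diagonal assembly; the example matrices are hypothetical placeholders.

```python
# Sketch: the joint's first-order marginal matrix is block diagonal, with the
# factors' marginal matrices on the diagonal and zeroes off the blocks.
import numpy as np

def join_first_order(M_f, M_g):
    p, q = M_f.shape[0], M_g.shape[0]
    M_h = np.zeros((p + q, p + q))
    M_h[:p, :p] = M_f
    M_h[p:, p:] = M_g
    return M_h

M_f = np.array([[0.7, 0.3],
                [0.3, 0.7]])   # hypothetical 2-track factor
M_g = np.array([[1.0]])        # hypothetical 1-track factor
M_h = join_first_order(M_f, M_g)
```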

Higher-order joining. Given the Fourier coefficients of the factors f and g at each frequency level, compute the Fourier coefficients of the joint distribution h at each frequency level.

Higher-order joining. Joining higher-order coefficients gives a similar block-diagonal structure, and each block additionally has Kronecker product structure. The same block can appear multiple times, with multiplicities related to Littlewood-Richardson coefficients.
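The Kronecker product of two coefficient blocks can be formed directly with numpy; the matrices below are hypothetical stand-ins for Fourier blocks of f and g, and the bookkeeping of where each block sits (and with what multiplicity) is omitted.

```python
# Sketch: one block of the joint's higher-order Fourier coefficients is a
# Kronecker product of a block from f and a block from g.
import numpy as np

A = np.array([[1.0, 0.5],
              [0.0, 2.0]])   # hypothetical Fourier block of f
B = np.array([[0.3, 0.1],
              [0.2, 0.4]])   # hypothetical Fourier block of g
block = np.kron(A, B)        # 4 x 4 block appearing in the joint's coefficients
```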

Problems (recap). Recall that h factors as a product of f (a distribution over tracks {1, ..., p}) and g (a distribution over tracks {p+1, ..., n}); having addressed the Join problem, we now turn to the Split problem: find the Fourier coefficients of the factors f and g given the Fourier coefficients of the joint h.

Splitting. We want to "invert" the Join process. Consider recovering a 2nd-order Fourier block: we need to recover A and B from their Kronecker product, which is only possible up to a scaling factor. Our approach: search for blocks of a particular form. Theorem: such blocks always exist (and are efficient to find).
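The "up to a scaling factor" recovery can be illustrated with a generic Kronecker-factor extraction (a rank-1 factorization of a rearranged matrix, in the style of Van Loan); this only illustrates the ambiguity and is not the paper's Split algorithm.

```python
# Sketch: recover A and B (up to scale) from M = kron(A, B) by stacking the
# vectorized sub-blocks of M into a rank-1 matrix and taking its top singular pair.
import numpy as np

def split_kron(M, a_shape, b_shape):
    (ma, na), (mb, nb) = a_shape, b_shape
    rows = []
    for i in range(ma):
        for j in range(na):
            rows.append(M[i * mb:(i + 1) * mb, j * nb:(j + 1) * nb].ravel())
    R = np.array(rows)                    # equals vec(A) * vec(B)^T for an exact product
    U, s, Vt = np.linalg.svd(R)
    A = np.sqrt(s[0]) * U[:, 0].reshape(ma, na)
    B = np.sqrt(s[0]) * Vt[0].reshape(mb, nb)
    return A, B                           # kron(A, B) matches M; A and B only fixed up to scale
```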

Marginal preservation. Problem: in practice we never have the entire set of Fourier coefficients, only a bandlimited representation. Marginal preservation guarantee. Theorem: given the mth-order marginals of the independent factors, Join exactly recovers the mth-order marginals of the joint (and usually some higher-order information too). Conversely, a similar guarantee holds for splitting.

Detecting independence. To adaptively split large distributions, we need to detect independent subsets. Recall the first-order independence condition: with an appropriate ordering of identities and tracks, the matrix of marginals is block structured. In practice the identities and tracks come unordered, so we use (bi)clustering on the matrix of marginals to discover an appropriate ordering, with a balance constraint that forces the nonzero blocks to be square (p x p).
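As a simple stand-in for that step, the sketch below groups tracks by the connected components of the marginal matrix's nonzero pattern, viewed as a bipartite identity-track graph; the paper's biclustering with a balance constraint is more involved, so treat this as illustrative only.

```python
# Sketch: find groups of tracks that never mix, using the nonzero pattern of the
# first-order marginal matrix M (rows: tracks, columns: identities).
import numpy as np

def detect_groups(M, tol=1e-9):
    n = M.shape[0]
    parent = list(range(2 * n))            # union-find over tracks (0..n-1) and identities (n..2n-1)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for track in range(n):
        for identity in range(n):
            if M[track, identity] > tol:   # this identity can occupy this track
                union(track, n + identity)

    groups = {}
    for track in range(n):
        groups.setdefault(find(track), []).append(track)
    return list(groups.values())           # each list is a set of tracks that never mix with the others
```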

First-order independence. The first-order condition is insufficient (recall "Alice is on the red team," "Bob is on the blue team," "Alice guards Bob"), and we can check for higher-order independence after detecting it at first order. But what if we call Split when only the first-order condition is satisfied? Theorem: whenever first-order independence holds, Split returns the exact marginals of each subset of tracks, even when higher-order independence does not hold.

Experiments: accuracy. (Figure: label accuracy vs. ratio of observations for the adaptive and nonadaptive methods; higher is better. Dataset of 20 ants from [Khan et al. 2006].)

Experiments: running time. (Figures: elapsed time per run in seconds vs. ratio of observations, and running time in seconds vs. number of ants, for the adaptive and nonadaptive methods; lower is better.)

Conclusions. A completely Fourier-theoretic characterization of probabilistic independence (marginalization, conditioning, join, split); two new algorithms; and a scalable, adaptive identity management algorithm that tracks up to n = 100 objects. Thank you.