Uncertain Observation Times
Shaunak Chatterjee & Stuart Russell
Computer Science Division, University of California, Berkeley

Overview
- Why uncertain observation times matter
- Scenarios considered:
  1. Each event is observed: efficient DP algorithm
  2. Missing and false events: practical approximation algorithm
  3. Multiple asynchronous observation streams

Motivation
- Two types of data streams:
  - Automatically time-stamped data traces
  - Human annotations of temporal events
- Many essential facts cannot be recorded automatically
- Human-generated timestamps are often wrong
- Assuming that timestamps are correct can lead to nonsensical results

Example: at 16:30, the nurse enters "gave phenylephrine at 16:00"
(Figure: timeline marking the data entry time (16:30), the event timestamp (16:00), and the actual event time, which may differ from the timestamp.)

Ubiquity of uncertain observation times
- Nurse monitoring a patient in the ICU
  - Hundreds of events recorded by the nurse
  - Usually recorded after the event, sometimes before
- Manual recording of events
  - Science experiments: biology, chemistry, physics
  - Industrial plants
- Multiple observation traces
  - Various historians' accounts of a period
  - Only one underlying truth

Sample trace generated from the model
(Figure: timeline showing the correct chronological ordering of time stamps. The nurse gives medicine at 10:23 a.m. (actual event time a_i), records the event at 11:00 a.m. (recording time d_i), and enters the time of the event as 10:30 a.m. (recorded time m_i); the previous event's time stamp is 10:15 a.m.)

Dynamic Bayesian networks
- DBNs are discrete-time multivariate stochastic process models (including HMMs and Kalman filters)
- DBNs facilitate modeling of complex systems with sensor noise, etc.
- Large-scale physiological models have been pursued since the 1960s, but little attention has been paid to the nature of real data

Simple DBN representation
(Figure: a DBN with hidden state chain X_1, …, X_8 and observations Y_1, …, Y_8, augmented with actual event times a_i, recorded times m_i, and data entry times d_i.)

Objective
- Design a graphical model that allows for uncertainty in observation times
- Derive efficient inference algorithms
  - Naïve algorithm has O(M^T) complexity
  - Reduce to O(MT) via ordering constraints and dynamic programming

Key constraint assumption
- The person recording events gets the order right:
  for all i, j: m_i > m_j ⇒ a_i > a_j
(Figure: two timelines of recorded times m_i and actual times a_i, contrasting a valid association, where the two orders agree, with an invalid one, where they cross.)
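An illustrative check of this constraint (not from the talk; the function name and data layout are assumptions):

```python
def is_valid_association(recorded, actual):
    """Check the ordering assumption: for all i, j, m_i > m_j implies
    a_i > a_j.  `recorded` holds the recorded times m_i and `actual`
    the hypothesized actual times a_i, index-aligned per event."""
    # Sort events by recorded time; actual times must then be strictly
    # increasing as well.  Ties in m_i impose no constraint, so only
    # pairs with strictly larger recorded times are compared.
    pairs = sorted(zip(recorded, actual))
    return all(a1 > a0
               for (m0, a0), (m1, a1) in zip(pairs, pairs[1:])
               if m1 > m0)

print(is_valid_association([2, 5], [1, 4]))  # True  (orders agree)
print(is_valid_association([2, 5], [4, 1]))  # False (association crosses)
```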

Pre-computation step
- Compute the likelihood of the data segment between the current event time stamp (a_k) and the next hypothesized event time stamp (a_{k+1})
- Pre-compute this for all k, and all possible values of a_k and a_{k+1}
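A minimal sketch of this table, under assumptions about the data layout; `segment_likelihood` stands in for a standard forward pass over the observations in the given interval:

```python
def precompute_segment_likelihoods(windows, segment_likelihood):
    """Build L[k][(t1, t2)] = likelihood of the observations between
    hypothesized event times t1 and t2, for every pair of candidate
    times drawn from consecutive uncertainty windows.

    `windows[k]` lists the candidate actual times for event k;
    `segment_likelihood(t1, t2)` is assumed to evaluate the DBN/HMM
    likelihood of the data in the interval (t1, t2]."""
    return [{(t1, t2): segment_likelihood(t1, t2)
             for t1 in windows[k]
             for t2 in windows[k + 1]
             if t1 < t2}                 # ordering constraint prunes pairs
            for k in range(len(windows) - 1)]
```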

Modified Baum-Welch algorithm
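The slide's equations are not reproduced in the transcript; below is a hedged sketch of the forward recursion over candidate event times, using the precomputed segment likelihoods. The actual algorithm also sums over the S hidden system states (hence the S^2 factor on the next slide); this sketch folds that into `seg_like`:

```python
def forward_over_event_times(windows, seg_like, obs_time_like):
    """Sketch of a forward pass in which alpha[k][t] accumulates the
    likelihood of all reports up to event k with a_k = t (assumed
    form, abstracting away the hidden system states).

    `seg_like[k][(t1, t2)]` is the precomputed segment likelihood;
    `obs_time_like(k, t)` is P(m_k | a_k = t), the time-stamp noise model."""
    alpha = [{t: obs_time_like(0, t) for t in windows[0]}]
    for k in range(len(windows) - 1):
        nxt = {}
        for t2 in windows[k + 1]:
            # Sum over earlier candidate times; the ordering constraint
            # (t1 < t2) keeps each sum to at most M terms.
            total = sum(alpha[k][t1] * seg_like[k].get((t1, t2), 0.0)
                        for t1 in windows[k] if t1 < t2)
            nxt[t2] = total * obs_time_like(k + 1, t2)
        alpha.append(nxt)
    # Evidence likelihood = sum(alpha[-1].values()); a matching backward
    # pass yields the beta and gamma quantities of Baum-Welch.
    return alpha
```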

Complexity
- Modified time complexity: O(MS^2 T)
  - M: maximum size of the time window of uncertainty
  - S: number of states in the system
  - T: number of time steps
- Space complexity:
  - O(KM^2) for storing the pre-computed segment likelihoods
  - O(KM) for storing α, β, and γ

Simulation results – Increased likelihood of evidence
(Figure: evidence likelihood plotted against the window of uncertainty.)

Simulation results – General accuracy of inference

Simulation results – Computation time vs size of uncertainty window

Unreported events, false reports
- Not all events are reported
  - Unobserved
  - Negligence
- Not all reports are true
  - Double entry of a single data point
  - Misinterpretation of information
  - Intended actions reported but not carried out

Missing and false reports
(Figure: model with actual event times a_1, …, a_4 and recorded times m_1, …, m_3: θ_i indicates whether event i was reported, and φ_j gives the index of the event corresponding to report j.)

Modified DP and complexity
- The previous algorithm was compact because of the one-to-one correspondence between events and reports
  - Now have to consider all possible associations, unless there are constraints (more on this later)
- The chronological mapping of events' time stamps still holds
  - This again leads to an efficient dynamic program

Computational complexity
- In the general case, uncertainty windows are no longer limited, since event i can be associated with any report j
- O(IJT^2)
  - I: the number of hypothesized events
  - J: the number of reports
  - T: the length of the temporal sequence

Practical assumptions – I
- Data entries are made in blocks
  - All reports in a given block (e.g., the night shift) must be for events that occurred (really or otherwise) in that block
  - Computational complexity is linear in T if blocks are of constant size

Practical assumptions – I
- Data entries are made in blocks
  - A record entered in the afternoon shift cannot correspond to an event in the morning shift
(Figure: actual times a_i and recorded times m_i on a timeline divided into the morning, afternoon, and night shifts.)
- Computational complexity is linear if time blocks are of constant size
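A small illustrative helper (assumed representation, not from the talk) showing how the block assumption confines candidate associations:

```python
from collections import defaultdict

def candidates_by_block(event_blocks, report_blocks):
    """Group events and reports by block label (e.g., a nursing shift)
    so that associations are hypothesized only within a block.
    `event_blocks` and `report_blocks` map ids to block labels."""
    groups = defaultdict(lambda: ([], []))
    for event_id, block in event_blocks.items():
        groups[block][0].append(event_id)
    for report_id, block in report_blocks.items():
        groups[block][1].append(report_id)
    return dict(groups)

# With constant-size blocks, each block contributes O(1) association
# work, so the total cost grows linearly with the sequence length T.
```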

Practical assumptions – II
- When unobserved events and false reports are both rare
  - We can perform approximate inference by NOT considering all possible a_i–m_j associations
  - The posterior distribution is highly concentrated along the "skewed diagonal" corresponding to a small number of errors
  - Assuming a bounded number of errors gives time complexity proportional to T
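A hedged reading of the bounded-error assumption in code (names are illustrative): with at most c missing or false reports, event i can only plausibly match reports near the diagonal:

```python
def banded_associations(num_events, num_reports, c):
    """Enumerate only event-report pairs near the skewed diagonal:
    with at most c missing/false reports, event i is restricted to
    reports j with |i - j| <= c."""
    for i in range(num_events):
        for j in range(max(0, i - c), min(num_reports, i + c + 1)):
            yield i, j

# Roughly (2c + 1) candidates per event instead of J, so the number of
# associations scales linearly with sequence length for fixed c:
print(sum(1 for _ in banded_associations(100, 98, c=2)))  # 487, ~5 per event
```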

Simulation results – Posterior is peaked around the skewed diagonal

Simulation results – Hypothesizing more events leads to better recall

Effect of varying c

Multiple observation sequences
- Formulation
  - Several "sources" reporting on the same events
  - Key assumption: individual report sequences are independent given the actual truth (the X chain)

(Figure: latent trajectory of event times a_i, …, a_I with report indicators θ_i, together with R evidence trajectories; trajectory r has its own association variables φ_j^(r) and recorded times m_j^(r).)

Multiple observation sequences
- Formulation
  - Several "sources" reporting on the same events
  - Key assumption: individual report sequences are independent given the actual truth (the X chain)
- Inference
  - Similar DP algorithms apply, given the assumptions of ordering constraints, blocks, etc.
  - Complexity increases linearly with the number of report sequences
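Because the sources are conditionally independent given the latent trajectory, their per-source likelihood terms simply multiply inside each DP update; a minimal sketch under an assumed interface:

```python
def multi_source_weight(candidate_time, k, obs_time_likes):
    """Combine report-time likelihoods from R sources for one candidate
    actual time of event k.  `obs_time_likes[r](k, t)` is assumed to
    give P(m_k^(r) | a_k = t) for source r; conditional independence
    given the latent trajectory lets the terms multiply, so each DP
    update costs only R extra factors."""
    weight = 1.0
    for obs_time_like in obs_time_likes:   # one factor per source
        weight *= obs_time_like(k, candidate_time)
    return weight
```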

Conclusions
- Handling uncertainty in observation times is critical for correct modeling and inference
- Assumptions about qualitative accuracy (e.g., the order of events) can be very helpful
- Given such assumptions, the computational complexity of inference remains unchanged (modulo some constant factors) while handling the following cases:
  - Noisy observation times
  - Missing and false reports
  - Multiple report sequences

QUESTIONS? Thank You!