Determining the Number of Non-Spurious Arcs in a Learned DAG Model: Investigation of a Bayesian and a Frequentist Approach. Listgarten & Heckerman.

Presentation transcript:

Determining the Number of Non-Spurious Arcs in a Learned DAG Model: Investigation of a Bayesian and a Frequentist Approach. Listgarten & Heckerman

2 Purpose
• Design a vaccine for HIV
  By considering many patients and observing which HLA molecules cause the killer T cells of the immune system to react

3 Definitions
• HLA = human leukocyte antigen
  Each person usually has 3 to 6 of them
• Epitopes = bits of protein
  Result of a T cell attacking an HIV peptide
• Peptide = “small digestible”
  A short chain of linked amino acids

4 How?
• Find out which HIV peptides interact with which HLA molecules by using a graphical model.

5 Solution
• A directed acyclic graph representing HLA molecules and peptides
[Figure: HLA nodes h_1, h_2, ..., h_N with arcs to peptide nodes y_1, y_2, ..., y_M. Model for one patient.]
Designing a vaccine means identifying a set of peptide-HLA pairs that are epitopes for a large part of the population.

6 Properties
• Bipartite model (2 levels)
• An HLA node can have zero or several outgoing arcs
• A peptide node can have zero or several incoming arcs
• Each patient has 3 to 6 HLA nodes that are “on”
• Answers: which HLA molecule(s) are responsible for a given immune system reaction
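
The properties above can be illustrated with a small sketch. It is only an illustration under assumptions: the node names, the arc set, and the helper functions `parents_by_peptide` and `candidate_hlas` are hypothetical and do not come from the slides. It stores the bipartite graph as a set of HLA-to-peptide arcs and looks up which of a patient's "on" HLA nodes point at a given peptide.

```python
from collections import defaultdict

# Hypothetical arcs, each directed from an HLA node to a peptide node.
ARCS = {("h1", "y1"), ("h2", "y1"), ("h2", "y3"), ("h4", "y2")}


def parents_by_peptide(arcs):
    """Group the HLA parents of each peptide node."""
    parents = defaultdict(set)
    for hla, peptide in arcs:
        parents[peptide].add(hla)
    return dict(parents)


def candidate_hlas(arcs, patient_hlas, peptide):
    """HLA nodes that are 'on' for this patient and have an arc into the peptide."""
    return parents_by_peptide(arcs).get(peptide, set()) & set(patient_hlas)


patient = {"h2", "h3", "h4"}  # hypothetical patient with 3 HLA nodes "on"
print(candidate_hlas(ARCS, patient, "y1"))  # {'h2'}
```

Because every arc runs from the HLA level to the peptide level, the graph is acyclic by construction and no cycle check is needed.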

7 Two approaches
• Bayesian
• Frequentist

8 Bayesian Approach cont. 1(2)
True arc distribution: the Bayesian expectation, given data D, of the number of arcs that are in both G and G’
D: data
G’: proposed model
G: all possible graph structures
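
The equation on this slide did not survive in the transcript. One reading consistent with the legend is a posterior-weighted sum, sum over G of p(G | D) times the number of arcs in both G and G’. The sketch below assumes that reading. The function name `expected_true_arcs`, the three toy graphs, and their posterior weights are all hypothetical, and a real computation would sum over every possible structure rather than three.

```python
def expected_true_arcs(posterior, proposed):
    """Posterior-weighted number of arcs shared by each graph G and the proposed G'."""
    proposed = frozenset(proposed)
    return sum(p * len(graph & proposed) for graph, p in posterior.items())


# Hypothetical posterior p(G | D) over three candidate structures (weights sum to 1).
posterior = {
    frozenset({("h1", "y1"), ("h2", "y1")}): 0.5,
    frozenset({("h1", "y1")}): 0.3,
    frozenset(): 0.2,
}
proposed = {("h1", "y1"), ("h2", "y1")}  # arcs of the proposed model G'

expected = expected_true_arcs(posterior, proposed)
print(round(expected, 3))  # 0.5 * 2 + 0.3 * 1 + 0.2 * 0 = 1.3
```

Dividing the result by the number of arcs in G’ would give the expected fraction of its arcs that are non-spurious.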

9 Bayesian Approach cont. 2(2)
• Exponential complexity!
  Can be improved by limiting the parent set size, |Parent set|
  A limit of 5 gives identical results
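
To see why a cap on the parent set size helps, the following sketch counts the candidate parent sets of a single peptide node with and without a cap. It only illustrates the combinatorics. The helper names and the choice of 20 candidate HLA nodes are assumptions, and the slides do not say how the authors' algorithm applies the limit.

```python
from itertools import combinations
from math import comb


def bounded_parent_sets(candidates, max_parents):
    """Yield every parent set containing at most max_parents candidates."""
    for size in range(max_parents + 1):
        yield from combinations(candidates, size)


def count_parent_sets(n_candidates, max_parents):
    """Number of parent sets of size 0..max_parents drawn from n_candidates nodes."""
    return sum(comb(n_candidates, size) for size in range(max_parents + 1))


print(count_parent_sets(20, 20))  # no effective cap: 2**20 = 1048576
print(count_parent_sets(20, 5))   # cap of 5: 21700
```

With the cap, the count grows polynomially in the number of candidate parents instead of doubling with each added node.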

10 Frequentist Approach
• FDR = false discovery rate
• Given a set of hypotheses
• Hypothesis i has a test score s_i; the scores are assumed to be independent across the given hypotheses

11 FDR cont. 1(4)
E: expected value
F: number of false hypotheses
S: number of hypotheses with s_i > t
t: threshold
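
The formula that this legend annotates is not preserved in the transcript. The standard definition of the false discovery rate matches the symbols listed, so it is given here as a likely reconstruction rather than as the slide's exact equation. It assumes that F counts the false hypotheses among those with s_i > t, and that the ratio is taken as 0 when S(t) = 0.

```latex
\mathrm{FDR}(t) \;=\; E\!\left[\frac{F(t)}{S(t)}\right],
\qquad
\text{often approximated by}\quad
\mathrm{FDR}(t) \;\approx\; \frac{E[F(t)]}{E[S(t)]}
```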

12 FDR cont. 2(4)
Rewrite: [equation not preserved in transcript], where [symbol not preserved] is a structure search algorithm

13 FDR cont. 3(4) – multiple data sets
Q: number of arcs found by applying the structure search algorithm to the real data, D

14 FDR cont. 4(4)
• Standard FDR:
• The average over multiple datasets
• +1 – smooths the estimate
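
The three points on this slide can be combined into a rough sketch. Everything specific in it is an assumption, because the slide's formula is missing from the transcript: the function name `estimate_fdr`, the idea that the averaged quantity is the number of arcs found on null data sets, and above all the placement of the +1 in both numerator and denominator. The arc counts are made-up illustrative numbers.

```python
def estimate_fdr(null_arc_counts, real_arc_count, smooth=1.0):
    """Rough FDR estimate: mean arc count on null data sets over the count on real data.

    ASSUMPTION: the +1 smoothing is added to numerator and denominator;
    the slides do not show where it actually goes.
    """
    if not null_arc_counts:
        raise ValueError("need at least one null data set")
    mean_null = sum(null_arc_counts) / len(null_arc_counts)
    # Cap at 1.0, since a rate above 1 has no meaning.
    return min(1.0, (mean_null + smooth) / (real_arc_count + smooth))


# Hypothetical counts: arcs found on three null data sets and on the real data D.
fdr = estimate_fdr([2, 4, 3], real_arc_count=29)
print(round(fdr, 3))  # (3 + 1) / (29 + 1) = 0.133
```

Averaging over several null data sets reduces the variance of the numerator, and the smoothing keeps the ratio defined when no arcs are found on the real data.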

15 Results
• PPV – positive predictive value
  Frequentist method:
  Bayesian method:

16 Results on non-HIV data

17 Results on non-HIV data

18 Results on synthetic HIV data

19 Results on real HIV data
• 8 results, all matches