Lectures 2 – Oct 3, 2011 CSE 527 Computational Biology, Fall 2011 Instructor: Su-In Lee TA: Christopher Miles Monday & Wednesday 12:00-1:20 Johnson Hall.


1 Introduction to Probabilistic Models for Computational Biology. Location: Johnson Hall (JHN) 022.

Review: Gene Regulation. Genes regulate each other's expression and activity. [Figure: the DNA sequence AGATATGTGGATTGTTAGGATTTATGCGCGTCAGTGACTACGCATGTTACGCACCTACGACTAGGTAATGATTGATC, whose genes are transcribed into RNA (AUGUGGAUUGUU, AUGCGCGUC, AUGUUACGCACCUAC, AUGAUUGAU) and translated into protein (MWIV, MRV, MLRTY, MID). Labels: gene, transcription, translation, RNA degradation, gene regulation, “Gene Expression”, a switch (“transcription factor binding site”), genetic regulatory network.]

Review: Variations in the DNA. Sequence variations perturb the regulatory network. [Figure: the same DNA sequence, RNA (AUGUGGAUUGUU, AUGCGCGUC, AUGUUACGCACCUAC, AUGAUUGAU), and protein (MWIV, MRV, MLRTY, MID) as in the previous slide, with altered positions marked by X. Labels: gene, “Single nucleotide polymorphism (SNP)”, genetic regulatory network.]

4 Outline. Probabilistic models in biology; Model selection problems; Mathematical foundations; Bayesian networks (reference: Probabilistic Graphical Models: Principles and Techniques, Koller & Friedman, The MIT Press); Learning from data; Maximum likelihood estimation; Expectation and maximization.

5 Example 1. How are a nucleotide change in DNA, blood pressure, and heart disease related? There can be several “models”… [Figure: alternative graphs, separated by OR, over the nodes DNA alteration, Blood pressure, and Heart disease.]

6 Example 2. How do genes A, B, and C regulate each other’s expression levels (mRNA levels)? There can be several models… [Figure: alternative graphs, separated by OR, over the nodes A, B, and C.]

7 Are there statistical dependencies between the expression levels of genes A, B, and C? Model selection: argmax_x P(model x is true | Data), where P(model x is true | Data) is the probability that model x is true given the data. Probabilistic graphical models: a graphical representation of statistical dependencies. [Figure: expression data matrix with rows Gene A, Gene B, Gene C and columns Exp 1, Exp 2, …, Exp N (N instances); candidate graphs Model I, Model II, Model III over A, B, and C.]
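
The model-selection criterion on this slide can be expanded with Bayes' rule. The sketch below assumes a prior P(M_x) over the candidate models, which the slide does not specify; M_x and \mathcal{D} are notation introduced here for "model x" and the data. Under a uniform prior, the criterion reduces to picking the model under which the data are most probable.

```latex
\hat{x} \;=\; \arg\max_x P(M_x \mid \mathcal{D})
        \;=\; \arg\max_x \frac{P(\mathcal{D} \mid M_x)\, P(M_x)}{P(\mathcal{D})}
        \;=\; \arg\max_x P(\mathcal{D} \mid M_x)\, P(M_x)
```

The denominator P(\mathcal{D}) is the same for every model, so it can be dropped from the argmax.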

8 Outline. Probabilistic models in biology; Model selection problem; Mathematical foundations; Bayesian networks; Learning from data; Maximum likelihood estimation; Expectation and maximization.

9 Probability Theory Review. Assume random variables A and B with Val(A) = {a1, a2, a3}, Val(B) = {b1, b2}. Conditional probability: definition, chain rule, Bayes’ rule. Probabilistic independence.
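
The equations for this slide did not survive in the transcript. As a reminder, the standard textbook forms of the four items are given below for two random variables A and B; these are the usual definitions rather than a copy of the slide.

```latex
\begin{align*}
\text{Conditional probability:} \quad & P(A \mid B) = \frac{P(A, B)}{P(B)}, \qquad P(B) > 0 \\
\text{Chain rule:} \quad & P(A, B) = P(A)\, P(B \mid A) \\
\text{Bayes' rule:} \quad & P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \\
\text{Independence:} \quad & A \perp B \iff P(A, B) = P(A)\, P(B)
\end{align*}
```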

10 Probabilistic Representation. Joint distribution P over {x_1, …, x_n}, where each x_i is binary: 2^n - 1 entries. If the x's are independent: P(x) = p(x_1) … p(x_n).
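
To make the gap between the two representations concrete, here is a minimal sketch that counts independent parameters for n binary variables. The function names are illustrative only and do not come from the lecture.

```python
def joint_params(n: int) -> int:
    """Independent parameters of a full joint table over n binary variables."""
    # 2^n entries, minus one because the entries must sum to 1.
    return 2 ** n - 1


def independent_params(n: int) -> int:
    """Independent parameters when all n binary variables are independent."""
    # One number p(x_i = 1) per variable.
    return n


for n in (3, 10, 20):
    print(n, joint_params(n), independent_params(n))
```

The full joint grows exponentially in n, while the fully independent model grows linearly; Bayesian networks sit between these two extremes.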

11 Conditional Parameterization. The Diabetes example: Genetic risk (G), Diabetes (D). Val(G) = {g1, g0}, Val(D) = {d1, d0}. P(G,D) = P(G) P(D|G). P(G): prior distribution. P(D|G): conditional probability distribution (CPD). [Figure: network over Genetic risk and Diabetes.]
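
A small sketch of the conditional parameterization P(G,D) = P(G) P(D|G). All probability values below are made up for illustration; the slides give no numbers.

```python
# Hypothetical prior over genetic risk (not from the lecture).
p_g = {"g1": 0.1, "g0": 0.9}

# Hypothetical CPD P(D | G): one distribution over D per value of G.
p_d_given_g = {
    "g1": {"d1": 0.4, "d0": 0.6},
    "g0": {"d1": 0.05, "d0": 0.95},
}

# Joint distribution obtained from the prior and the CPD.
joint = {
    (g, d): p_g[g] * p_d_given_g[g][d]
    for g in p_g
    for d in p_d_given_g[g]
}

total = sum(joint.values())                          # should be 1
p_d1 = joint[("g1", "d1")] + joint[("g0", "d1")]     # marginal P(D = d1)
print(joint, total, p_d1)
```

The prior and the CPD together need 1 + 2 = 3 independent numbers, the same as the 2x2 joint table, so nothing is saved yet; the savings appear once more variables are added.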

12 Naïve Bayes Model - Example. Elaborating the diabetes example: Genetic risk (G), Diabetes (D), Hypertension (H). Val(G) = {g1, g0}, Val(D) = {d1, d0}, Val(H) = {h1, h0}. The joint distribution has 8 entries (7 independent parameters). If D and H are independent given G, P(G,D,H) = P(G) P(D|G) P(H|G), which needs only 5 independent parameters; more compact than the joint. [Figure: network over Genetic risk, Diabetes, and Hypertension.]
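
The following sketch builds the three-variable joint from the factorization P(G) P(D|G) P(H|G) and checks numerically that D and H are independent given G but not marginally independent. The probability values are hypothetical and the helper names are illustrative.

```python
from itertools import product

# Hypothetical parameters (5 numbers in total); 1 = present, 0 = absent.
p_g = {1: 0.1, 0: 0.9}
p_d1_given_g = {1: 0.4, 0: 0.05}   # P(D = 1 | G = g)
p_h1_given_g = {1: 0.5, 0: 0.2}    # P(H = 1 | G = g)


def bern(p1, value):
    """Probability of a binary value given P(value = 1) = p1."""
    return p1 if value == 1 else 1.0 - p1


# Joint over (G, D, H) from the naive Bayes factorization.
joint = {
    (g, d, h): p_g[g] * bern(p_d1_given_g[g], d) * bern(p_h1_given_g[g], h)
    for g, d, h in product((0, 1), repeat=3)
}


def marginal(keep):
    """Sum the joint down to the positions in keep (0 = G, 1 = D, 2 = H)."""
    out = {}
    for assignment, p in joint.items():
        key = tuple(assignment[i] for i in keep)
        out[key] = out.get(key, 0.0) + p
    return out


m_g, m_gd, m_gh = marginal((0,)), marginal((0, 1)), marginal((0, 2))
m_d, m_h, m_dh = marginal((1,)), marginal((2,)), marginal((1, 2))

# P(D, H | G) == P(D | G) P(H | G) for every assignment.
cond_indep = all(
    abs(joint[(g, d, h)] / m_g[(g,)]
        - (m_gd[(g, d)] / m_g[(g,)]) * (m_gh[(g, h)] / m_g[(g,)])) < 1e-12
    for g, d, h in joint
)

# P(D = 1, H = 1) differs from P(D = 1) P(H = 1): dependent once G is unobserved.
marginally_dependent = abs(m_dh[(1, 1)] - m_d[(1,)] * m_h[(1,)]) > 1e-6
print(cond_indep, marginally_dependent)
```

With these numbers, diabetes and hypertension are correlated in the population only through the shared genetic risk.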

13 Naïve Bayes Model. A class C where Val(C) = {c1, …, ck}. Finding variables x_1, …, x_n. Naïve Bayes assumption: the findings are conditionally independent given the individual’s class. The model factorizes as: P(C, x_1, …, x_n) = P(C) ∏_{i=1..n} P(x_i | C). The Diabetes example: class: Genetic risk; findings: Diabetes, Hypertension.

14 Naïve Bayes Model - Example. Medical diagnosis system. Class C: disease. Findings X: symptoms. Computing the confidence in a diagnosis: the posterior P(C | x_1, …, x_n). Drawbacks: strong assumptions.
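
One way the confidence computation for a diagnosis could look is sketched below: the posterior over diseases given observed symptoms under the naïve Bayes assumption. The disease names, symptom names, and probabilities are all invented for illustration, and working in log space is a numerical-stability choice made here, not something stated in the slides.

```python
import math

# Hypothetical class prior and per-class symptom probabilities.
prior = {"flu": 0.1, "healthy": 0.9}
p_symptom = {  # P(symptom present | class)
    "flu": {"fever": 0.8, "cough": 0.7},
    "healthy": {"fever": 0.05, "cough": 0.1},
}


def posterior(findings):
    """Return P(class | findings) for findings = {symptom: True/False}."""
    log_scores = {}
    for c in prior:
        score = math.log(prior[c])
        for name, present in findings.items():
            p = p_symptom[c][name]
            score += math.log(p if present else 1.0 - p)
        log_scores[c] = score
    # Normalize in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}


post = posterior({"fever": True, "cough": True})
print(post)
```

The drawback noted on the slide shows up here directly: each symptom contributes an independent factor, so strongly correlated symptoms are effectively counted twice.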

15 Bayesian Network. Directed acyclic graph (DAG). Node: a random variable. Edge: direct influence of one node on another. The Diabetes example revisited: Genetic risk (G), Diabetes (D), Hypertension (H). Val(G) = {g1, g0}, Val(D) = {d1, d0}, Val(H) = {h1, h0}. [Figure: network over Genetic risk, Diabetes, and Hypertension.]

16 Bayesian Network Semantics. A Bayesian network structure G is a directed acyclic graph whose nodes represent random variables X_1, …, X_n. Pa_{X_i}: parents of X_i in G. NonDescendants_{X_i}: variables in G that are not descendants of X_i. G encodes the following set of conditional independence assumptions, called the local Markov assumptions, and denoted by I_L(G): for each variable X_i, (X_i ⊥ NonDescendants_{X_i} | Pa_{X_i}). [Figure: example DAG over nodes x_1, …, x_11.]
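
The local Markov assumptions can be read off a graph mechanically. The sketch below does this for the three-node diabetes network, assuming the edges G → D and G → H (the naïve Bayes structure from the earlier slides); the function names are illustrative. Parents are left out of the left-hand set because conditioning on them already makes that part of the statement trivial.

```python
# Parent lists for the assumed diabetes network: G -> D, G -> H.
parents = {"G": [], "D": ["G"], "H": ["G"]}


def descendants(node, parents):
    """All nodes reachable from node by following edges forward."""
    children = {n: [c for c, ps in parents.items() if n in ps] for n in parents}
    seen, stack = set(), list(children[node])
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(children[cur])
    return seen


def local_markov(parents):
    """Map each X to (non-descendants excluding X and its parents, parents)."""
    out = {}
    for x in parents:
        nd = set(parents) - descendants(x, parents) - {x} - set(parents[x])
        out[x] = (nd, set(parents[x]))
    return out


for x, (nd, pa) in local_markov(parents).items():
    print(f"({x} independent of {sorted(nd)} given {sorted(pa)})")
```

For D this yields "D independent of H given G", which is exactly the naïve Bayes assumption used earlier.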

17 The Genetics Example. Variables: B: blood type (a phenotype); G: genotype of the gene that encodes a person’s blood type.

18 Bayesian Network Joint Distribution. Let G be a Bayesian network graph over the variables X_1, …, X_n. We say that a distribution P factorizes according to G if P can be expressed as: P(X_1, …, X_n) = ∏_{i=1..n} P(X_i | Pa_{X_i}). A Bayesian network is a pair (G,P) where P factorizes over G, and where P is specified as a set of CPDs associated with G’s nodes.
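
As a sketch of what "P is specified as a set of CPDs" means in practice, the code below stores one CPD per node and evaluates the joint probability of a full assignment as the product of each node's CPD entry given its parents. The network is the assumed diabetes structure (G → D, G → H) with hypothetical numbers; the data layout is just one possible choice.

```python
from itertools import product

# Parents of each node in the assumed network G -> D, G -> H.
parents = {"G": (), "D": ("G",), "H": ("G",)}

# CPDs indexed by the tuple of parent values; numbers are hypothetical.
cpds = {
    "G": {(): {1: 0.1, 0: 0.9}},
    "D": {(1,): {1: 0.4, 0: 0.6}, (0,): {1: 0.05, 0: 0.95}},
    "H": {(1,): {1: 0.5, 0: 0.5}, (0,): {1: 0.2, 0: 0.8}},
}


def joint_probability(assignment):
    """Product over nodes of P(node value | parent values)."""
    p = 1.0
    for var, pa in parents.items():
        pa_values = tuple(assignment[q] for q in pa)
        p *= cpds[var][pa_values][assignment[var]]
    return p


example = joint_probability({"G": 1, "D": 1, "H": 0})

# Summing over all assignments confirms the product defines a distribution.
total = sum(
    joint_probability({"G": g, "D": d, "H": h})
    for g, d, h in product((0, 1), repeat=3)
)
print(example, total)
```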

19 The Student Example. A more complex scenario: Course difficulty (D), quality of the recommendation letter (L), Intelligence (I), SAT (S), Grade (G). Val(D) = {easy, hard}, Val(L) = {strong, weak}, Val(I) = {i1, i0}, Val(S) = {s1, s0}, Val(G) = {g1, g2, g3}. The joint distribution requires 47 independent parameters (2·2·2·2·3 - 1).

20 The Student Bayesian network. Joint distribution: P(I,D,G,S,L) = P(I) P(D) P(G | I,D) P(S | I) P(L | G). [Figure: the student network, from Koller & Friedman.]
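
The saving from the factorization can be counted directly. The sketch below assumes the student network structure from Koller & Friedman (D and I have no parents, G has parents I and D, S has parent I, L has parent G) and compares the parameter count of the full joint table with that of the network.

```python
from math import prod

# Number of values per variable, taken from the student example.
card = {"D": 2, "I": 2, "G": 3, "S": 2, "L": 2}

# Assumed structure, following Koller & Friedman's student network.
parents = {"D": [], "I": [], "G": ["I", "D"], "S": ["I"], "L": ["G"]}

# Full joint: one entry per assignment, minus one for normalization.
full_joint = prod(card.values()) - 1

# Network: each CPD has (|X| - 1) free numbers per parent configuration.
network = sum(
    (card[x] - 1) * prod(card[p] for p in parents[x])
    for x in card
)

print(full_joint, network)
```

The CPD for G dominates the network count because it has two parents; adding more parents to a node multiplies the number of rows in its CPD.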

21 Parameter Estimation. Assumptions: fixed network structure; fully observed instances of the network variables, D = {d[1], …, d[M]}. Maximum likelihood estimation (MLE) of the “parameters” of the Bayesian network. For example, one instance: {i0, d1, g1, l0, s0}. [Figure from Koller & Friedman.]
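
With a fixed structure and fully observed data, the maximum likelihood estimate of a table CPD is the relative frequency of each child value within each parent configuration. The sketch below estimates P(S | I) this way, assuming the edge I → S from the student network; the six data instances are invented and only the two relevant variables are shown.

```python
from collections import Counter

# Hypothetical fully observed instances, restricted to I and S.
data = (
    [{"I": "i0", "S": "s0"}] * 3
    + [{"I": "i0", "S": "s1"}]
    + [{"I": "i1", "S": "s1"}] * 2
)


def mle_cpd(data, child, parents):
    """MLE of P(child | parents) as count(parents, child) / count(parents)."""
    joint_counts = Counter()
    parent_counts = Counter()
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint_counts[(pa, row[child])] += 1
        parent_counts[pa] += 1
    return {
        (pa, value): n / parent_counts[pa]
        for (pa, value), n in joint_counts.items()
    }


cpd_s = mle_cpd(data, "S", ["I"])
print(cpd_s)
```

Child values never observed for a parent configuration get no entry (an implicit estimate of zero), which is one reason smoothing or priors are often added when data are scarce.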

22 Outline. Probabilistic models in biology; Model selection problem; Mathematical foundations; Bayesian networks; Learning from data; Maximum likelihood estimation; Expectation and maximization.

23 Acknowledgement Profs Daphne Koller & Nir Friedman, “Probabilistic Graphical Models”