Human Genome
3 billion bases
– 2% coding, 5-10% regulatory
Organism’s complexity NOT correlated with number of genes!
– Human (20-25k genes) vs. Rice (51k genes)
1 million regulatory elements enable:
– Precise control for turning genes on/off
– Diverse cell types (lung, heart, skin)
Regulatory Elements
~20-25k genes
– Expression modulated by ~1 million cis-regulatory elements
– Enhancers, promoters, silencers
Controlling Gene Expression
Transcription factors (TFs):
– Proteins that recognize sequence motifs in enhancers and promoters
– Combinatorial switches that turn genes on/off
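To make "recognize sequence motifs" concrete, here is a minimal sketch of scanning a DNA sequence with a position weight matrix (PWM). The count matrix `PFM`, the pseudocount, the uniform background and the test sequence are all made-up illustrations, not a real TF motif.

```python
import math

# Hypothetical position frequency matrix for a 4-bp motif (consensus AGCT).
# Illustrative counts only; this is not a real transcription factor motif.
PFM = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 10, 0, 0],
    "T": [1, 0, 1, 8],
}

def build_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2 odds against a uniform background (an assumption)."""
    length = len(next(iter(pfm.values())))
    pwm = []
    for i in range(length):
        total = sum(pfm[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((pfm[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def scan(sequence, pwm):
    """Score every window of the sequence; expects uppercase A/C/G/T only."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, window, score))
    return hits

pwm = build_pwm(PFM)
hits = scan("TTAGCTGG", pwm)
best = max(hits, key=lambda h: h[2])
print(best)
```

A SNP that changes a high-weight position in such a window lowers the match score, which is one simple way to picture how a variant could weaken TF binding.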
Modulating Gene Expression
Expression Quantitative Trait Locus (eQTL):
– A genomic region where different genotypes correlate with changes in gene expression
eQTLs: Correlating Genotype with Expression
GTEx
– Expression: RNA-seq, microarray
– Genotype: SNP array, WGS
Measuring Open Chromatin
[Figure; source: http://hmg.oxfordjournals.org]
Measuring open chromatin – DNase-seq
Sequence open chromatin
– Map enhancers, promoters …
[Figure; source: Wikipedia]
Statistical Overview
Given: Genotype + Expression Matrix
Problem: Determine eQTLs
Possible Solutions:
– Regress expression on genotype (homozygous/heterozygous)
Key Problem:
– Of many linked SNPs, which is the causal variant?
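The regression approach above can be sketched as a simple least-squares fit of expression on allele dosage (0, 1 or 2 copies of the alternate allele). This is a minimal illustration under assumed additive coding; the function name `eqtl_regression` and the toy data are hypothetical, and real analyses would also handle covariates and multiple testing.

```python
def eqtl_regression(dosages, expression):
    """Least-squares fit of expression ~ intercept + slope * dosage.

    dosages: alternate-allele counts per individual (0, 1 or 2).
    expression: expression level of one gene in the same individuals.
    Returns (slope, intercept, r_squared).
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy ** 2) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r_squared

# Toy data (made up): six individuals, two per genotype class.
snp_a = [0, 0, 1, 1, 2, 2]
snp_b = [0, 0, 1, 1, 2, 2]   # a second SNP in perfect linkage with snp_a
gene = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

fit_a = eqtl_regression(snp_a, gene)
fit_b = eqtl_regression(snp_b, gene)
print(fit_a, fit_b)
```

Because `snp_a` and `snp_b` carry identical genotypes, the two fits are identical. Regression alone cannot say which of the linked SNPs is causal, which is the key problem on this slide.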
Outline
1. Basic Gene Regulation
2. Gene Regulation and Human Disease
3. Measurement Technologies
4. Papers
5. Future Trends
PAPER 1: DISSECTING THE REGULATORY ARCHITECTURE OF GENE EXPRESSION QTLS
Overview
HapMap cells + 1000G genotypes
Bayesian Model
– Uncertainty over functional SNP
– Prior: Whether SNP hits a functional element (TFBS, promoter, etc.)
– Upweight effect of SNPs in functional regions
Results:
– eQTLs often in TFBS, open chromatin; not specifically overrepresented in TATA box
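One way to picture "upweighting SNPs in functional regions" is sketched below: each candidate SNP for a gene gets a prior that is larger if it overlaps an annotation, and the prior is combined with per-SNP evidence to give a posterior probability of being the causal variant. This is a simplified illustration that assumes exactly one causal SNP per gene. It is not the paper's actual model, and `posterior_causal`, the Bayes factors and the enrichment value are hypothetical.

```python
import math

def posterior_causal(bayes_factors, in_functional, log_enrichment):
    """Posterior probability that each SNP is the single causal variant.

    bayes_factors: per-SNP evidence for association (assumed given).
    in_functional: True if the SNP overlaps a functional element.
    log_enrichment: log of the prior weight for annotated SNPs relative
        to unannotated ones (assumed known here; it would normally be
        estimated from the data).
    """
    weights = [math.exp(log_enrichment) if f else 1.0 for f in in_functional]
    total_weight = sum(weights)
    priors = [w / total_weight for w in weights]
    unnormalized = [p * bf for p, bf in zip(priors, bayes_factors)]
    z = sum(unnormalized)
    return [u / z for u in unnormalized]

# Three linked SNPs with equal association evidence (made-up numbers);
# only the second one falls in a TF binding site.
post = posterior_causal(
    bayes_factors=[10.0, 10.0, 10.0],
    in_functional=[False, True, False],
    log_enrichment=math.log(4.0),
)
print(post)
```

With equal association evidence, the annotation prior alone breaks the tie in favour of the SNP inside the functional element.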
eQTNs are enriched in enhancers, promoters
[Figure; labels: Inactive, Active, Promoter/Enhancer]
eQTNs are enriched in enhancers, promoters (2)
What is the distribution of eQTNs in regulatory sites?
eQTNs enriched in TF binding sites
What TF families show the highest eQTN enrichments?
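Enrichment of eQTNs (the putative causal variants behind eQTLs) in an annotation such as TF binding sites is commonly summarized as an odds ratio against control SNPs. The sketch below shows that calculation with a Woolf (log-scale) 95% confidence interval. The counts are invented for illustration and are not results from the paper, and `enrichment_odds_ratio` is a hypothetical helper.

```python
import math

def enrichment_odds_ratio(eqtn_in, eqtn_out, control_in, control_out):
    """Odds ratio of annotation overlap, eQTNs vs. control SNPs.

    Returns (odds_ratio, ci_low, ci_high) using the Woolf approximation,
    which needs all four counts to be non-zero.
    """
    if min(eqtn_in, eqtn_out, control_in, control_out) <= 0:
        raise ValueError("all counts must be positive for this approximation")
    odds_ratio = (eqtn_in * control_out) / (eqtn_out * control_in)
    se_log = math.sqrt(
        1 / eqtn_in + 1 / eqtn_out + 1 / control_in + 1 / control_out
    )
    ci_low = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    ci_high = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, ci_low, ci_high

# Made-up counts: 30 of 100 eQTNs and 100 of 1000 control SNPs
# overlap a TF binding site.
odds_ratio, ci_low, ci_high = enrichment_odds_ratio(30, 70, 100, 900)
print(odds_ratio, ci_low, ci_high)
```

Repeating the calculation per TF family, each with its own counts, would give the kind of ranking the slide asks about.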
PAPER 2: DNASE I SENSITIVITY QTLS ARE A MAJOR DETERMINANT OF HUMAN EXPRESSION VARIATION
Overview
If an allele is correlated with changes in open chromatin, how often does it actually modulate gene expression?
dsQTL – DNase I sensitivity QTL
dsQTL vs eQTL
– Functional link between changes in chromatin accessibility and gene expression
DNase Hypersensitive Region
[Figure; source: http://hmg.oxfordjournals.org]
dsQTL – genotype correlates with extent of open chromatin
What does a dsQTL look like?
Future Trends
Denser genotyping + more expression measurements in a variety of cell lines
– Better power to detect eQTLs with more people
eQTLs with small effect sizes that additively disrupt disease pathways
– Common disease, common variant hypothesis
Better annotation + understanding of the genome improves selection of causal eQTNs