Hypergenes European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease.

Slides:



Advertisements
Similar presentations
PERSONALIZED MEDICINE: Planning for the Future You, Your Biomarkers and Your Rights.
Advertisements

Statistical methods for genetic association studies
Analysis of imputed rare variants
Integrating the gender aspects in research and promoting the participation of women in Life Sciences, Genomics and Biotechnology for Health.
What is an association study? Define linkage disequilibrium
Genetic research designs in the real world Vishwajit L Nimgaonkar MD, PhD University of Pittsburgh
SHI Meng. Abstract The genetic basis of gene expression variation has long been studied with the aim to understand the landscape of regulatory variants,
Data Mining Methodology 1. Why have a Methodology  Don’t want to learn things that aren’t true May not represent any underlying reality ○ Spurious correlation.
Genetic Analysis in Human Disease
Wrapup. NHGRI strategic plan What does the NIH think genomics should be for the next 10 years? [Nature, Feb. 2011]
Multiple Comparisons Measures of LD Jess Paulus, ScD January 29, 2013.
Perspectives from Human Studies and Low Density Chip Jeffrey R. O’Connell University of Maryland School of Medicine October 28, 2008.
Ingredients for a successful genome-wide association studies: A statistical view Scott Weiss and Christoph Lange Channing Laboratory Pulmonary and Critical.
Fokkerij in genomics tijdperk Johan van Arendonk Animal Breeding and Genomics Centre Wageningen University.
Genetics: Past, Present, and Future Robert M. Fineman, M.D., Ph.D. Web siteWeb site.
MSc GBE Course: Genes: from sequence to function Genome-wide Association Studies Sven Bergmann Department of Medical Genetics University of Lausanne Rue.
Integrating domain knowledge with statistical and data mining methods for high-density genomic SNP disease association analysis Dinu et al, J. Biomedical.
Something related to genetics? Dr. Lars Eijssen. Bioinformatics to understand studies in genomics – São Paulo – June Image:
Introduction of Cancer Molecular Epidemiology Zuo-Feng Zhang, MD, PhD University of California Los Angeles.
Give me your DNA and I tell you where you come from - and maybe more! Lausanne, Genopode 21 April 2010 Sven Bergmann University of Lausanne & Swiss Institute.
Introduction to Molecular Epidemiology Jan Dorman, PhD University of Pittsburgh School of Nursing
Presented by Karen Xu. Introduction Cancer is commonly referred to as the “disease of the genes” Cancer may be favored by genetic predisposition, but.
P REDICTION M ODELS U SING G ENOMIC P ROFILING H. Zhang E. Warner D. Zhao.
Bioinformatics Jan Taylor. A bit about me Biochemistry and Molecular Biology Computer Science, Computational Biology Multivariate statistics Machine learning.
Genomics Alexandra Hayes. Genomics is the study of all the genes in a person, as well as the interactions of those genes with each other and a person’s.
Understanding Genetics of Schizophrenia
Genetic Analysis in Human Disease. Learning Objectives Describe the differences between a linkage analysis and an association analysis Identify potentially.
Strong Heart Family Study Phase VI Genetics Center Aims October 8, 2009.
Epigenome 1. 2 Background: GWAS Genome-Wide Association Studies 3.
Introduction to BST775: Statistical Methods for Genetic Analysis I Course master: Degui Zhi, Ph.D. Assistant professor Section on Statistical Genetics.
ConceptS and Connections
1 Introduction to Grant Writing Beth Virnig, PhD Haitao Chu, MD, PhD University of Minnesota, School of Public Health December 11, 2013.
Multifactorial Traits
The Center for Medical Genomics facilitates cutting-edge research with state-of-the-art genomic technologies for studying gene expression and genetics,
Case(Control)-Free Multi-SNP Combinations in Case-Control Studies Dumitru Brinza and Alexander Zelikovsky Combinatorial Search (CS) for Disease-Association:
The Complexities of Data Analysis in Human Genetics Marylyn DeRiggi Ritchie, Ph.D. Center for Human Genetics Research Vanderbilt University Nashville,
The medical relevance of genome variability Gabor T. Marth, D.Sc. Department of Biology, Boston College Medical Genomics Course – Debrecen,
©Edited by Mingrui Zhang, CS Department, Winona State University, 2008 Identifying Lung Cancer Risks.
CS177 Lecture 10 SNPs and Human Genetic Variation
Genome-Wide Association Study (GWAS)
Experimental Design and Data Structure Supplement to Lecture 8 Fall
Type 1 Error and Power Calculation for Association Analysis Pak Sham & Shaun Purcell Advanced Workshop Boulder, CO, 2005.
Jianfeng Xu, M.D., Dr.PH Professor of Public Health and Cancer Biology Director, Program for Genetic and Molecular Epidemiology of Cancer Associate Director,
Sampling Design in Regional Fine Mapping of a Quantitative Trait Shelley B. Bull, Lunenfeld-Tanenbaum Research Institute, & Dalla Lana School of Public.
Future Directions Pak Sham, HKU Boulder Genetics of Complex Traits Quantitative GeneticsGene Mapping Functional Genomics.
Computational Approaches for Biomarker Discovery SubbaLakshmiswetha Patchamatla.
The International Consortium. The International HapMap Project.
Association of functional polymorphisms of Bax and Bcl2 genes with schizophrenia Kristina Pirumya, PhD, Laboratory of Human Genomics and Immunomics Institute.
Practical With Merlin Gonçalo Abecasis. MERLIN Website Reference FAQ Source.
Fast test for multiple locus mapping By Yi Wen Nisha Rajagopal.
BIOSTATISTICS Lecture 2. The role of Biostatisticians Biostatisticians play essential roles in designing studies, analyzing data and creating methods.
GENOMICS TO COMBAT RESISTANCE AGAINST ANTIBIOTICS IN COMMUNITY-ACQUIRED LRTI IN EUROPE (GRACE) H. Goossens (Coordinator), K. Loens (Manager), M. Ieven.
Linkage. Announcements Problem set 1 is available for download. Due April 14. class videos are available from a link on the schedule web page, and at.
Analysis of Next Generation Sequence Data BIOST /06/2015.
An atlas of genetic influences on human blood metabolites Nature Genetics 2014 Jun;46(6)
Genome-Wides Association Studies (GWAS) Veryan Codd.
1 Finding disease genes: A challenge for Medicine, Mathematics and Computer Science Andrew Collins, Professor of Genetic Epidemiology and Bioinformatics.
Power and Meta-Analysis Dr Geraldine M. Clarke Wellcome Trust Advanced Courses; Genomic Epidemiology in Africa, 21 st – 26 th June 2015 Africa Centre for.
Genomic Analysis: GWAS
Epidemiology and Genomics Research Program
Genome Wide Association Studies using SNP
Gene Hunting: Design and statistics
Epidemiology 101 Epidemiology is the study of the distribution and determinants of health-related states in populations Study design is a key component.
Beyond GWAS Erik Fransen.
Chapter 7 Multifactorial Traits
Exercise: Effect of the IL6R gene on IL-6R concentration
Perspectives from Human Studies and Low Density Chip
GWAS-eQTL signal colocalisation methods
M-H Pinard-van der Laan
Presentation transcript:

Hypergenes European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model UNIMI partner # 1

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model Relevance to the objectives of the Health Priority: “Translating research for human health”, in particular, the specific topic HEALTH : "Molecular epidemiological studies in existing well characterised European (and/or other) population cohorts". HYPERGENES is centered on the objective of “Integrating biological data and processes: large-scale data gathering” with major focus "on genomics, population genetics, comparative, structural and functional genomics".

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model Relevance to the objectives of the Health Priority: Why Hypertension ? clinical-epidemiological relevance environmental component genetic component

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Clinical domain:definition of the phenotype Prevalence of Hypertension25-35% >50% (older 60) Powerful risk factor for CVD Causally involved in 70% of strokes Causally involved in heart failure Any breakthrough will have far-reaching implications for millions of people.

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Interaction with the environment:the salt saga “The pulse hardens when eating too much salt” Huang Ti Nei Ching, 3000 BC 100 mmol/day Na predict 3-6 lower SBP and 3 mmHg lower DBP INTERSALT study 1988

0 12 weeks HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Interaction with the environment:the salt saga Miller JZ, et al: J Chronic Dis 1987

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Interaction with the environment:the salt saga What is “normal or “low” salt diet?

0 12 weeks HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Interaction with the environment:the salt saga Miller JZ, et al: J Chronic Dis 1987

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Genes & Hypertension Cusi “Eye-fit” quantitative genetics: 50% genes - 50% environment LL Cavalli Sforza est. up to 75% gene (considering dominance) “Eye-fit” quantitative genetics: 50% genes - 50% environment LL Cavalli Sforza est. up to 75% gene (considering dominance)

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi Will this affect the final phenotype (BP or TOD) ? It depends! - Redundancy of the system How many can we find ? Who knows !

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model PROGRESS BEYOND THE STATE OF ART: Cusi with additive or interactive effect

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Cusi Find genes responsible for EH and TOD, using a whole genome association/entropy-based approach Develop an integrated disease model, taking the environment into account, using an advanced bioinformatics approach Test the predictive ability of the model to identify individuals at risk.

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Cusi Find genes responsible for EH and TOD, using a whole genome association/entropy-based approach USING ALREADY EXISTING COHORTS, as specifically required by the call, in order to be able, not only to identify a panel of genes involved in the pathogenesis of hypertension, but also to be able to detect the complexity of gene-environment interaction. 11 Clinical Units

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Cusi Being HYPERTENSION the prototype of a common-complex disorder, the outcome of HYPERGENES will also provide a paradigm for disentangling of complex diseases. Designing a comprehensive genetic epidemiological model of complex traits will also help us to translate genetic findings into improved diagnostic accuracy and new strategies for early detection, prevention and eventually personalised treatment of a complex trait. Needless to say, the ultimate goal will be to promote the quality of life of EU populations.

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Cusi research activities areas: population genetics, molecular epidemiology, clinical sciences, bioinformatics, health information technology predicted outcomes in the fields of: prevention, early clinical diagnosis and treatment, increased knowledge about the aetiology of EH and TOD.

Hypergenes ST Methodology and strategy UNIMI partner # 1

HYPERGENES Overall strategy of the work plan […] HYPERGENES aims at disentangling the complex genetic mechanisms that lead to EH and EH related TOD, integrating a large amount of qualitative and quantitative data collected in large cohorts that have been recruited in collaborative studies over the years. Genetic and environmental data will eventually allow building different disease models of EH ad TOD [….] […] the overall objectives of HYPERGENES are: to assemble a European EH genetic epidemiology task force to validate an original methodology for analysis of genetics of complex diseases to deliver a validated bioinformatics infrastructure for the study of complex diseases to develop machine-learning methods supporting knowledge investigation to understand the pathogenesis of complex traits.

HYPERGENES Scientific and Technological Objectives and expected outcomes: layout of the project

HYPERGENES A simple look into the project ………… On already existing EU cohorts 1.Discover: GWAS on 4,000 CA & CO 2.Confirm SNPs (-> genes..) in 8 / 12,000 CA & CO 3.Make sense of findings: “meaning of genes” -> system biology (re)build mol/bioch pathways … 4.Integrate the information: Modeling GxE Correlation of dimensions 5.Build an etiological model of disease 6.Develop a molecular diagnostic tool

HYPERGENES 1)To identify the common genetic variants relevant for EH and TOD. 2)To design and implement appropriate computational tools. 3)To develop a comprehensive Biomedical Information Infrastructure (BII). 4)To create a “Web-Based Portal” to allow access to the BII in order to allow dissemination of knowledge. 5)To develop new methods, protocols and standards for genomic association analysis, gene annotation and molecular pathways. 6)To develop a set of Decision Support Systems tools combining genetic, clinical and environmental information. Scientific and Technological Objectives and expected outcomes:

HYPERGENES 7)To develop a simple, inexpensive genetic diagnostic chip, that can be validated in our existing well-characterized cohorts. 8)To strengthen the existing clinician-basic scientist collaborative network on the genetic mechanisms of EH. 9)To generate educational tools to support professional training on all aspects of the project, favoring mobility of PhD students and post-docs. 10)To disseminate HYPERGENES achievements through scientific meetings, teaching in tutorial sessions, publication in high-impact scientific journals etc. 11)To exploit the results in a traslational scenario Scientific and Technological Objectives and expected outcomes:

HYPERGENES Pert diagram

HYPERGENES Most Diseases are Complex Multifactorial

HYPERGENES Several decades of linkage and candidate gene studies Mutsuddi, …, Daly, Scolnick, Sklar AJHG 2006 Linkage studies inconclusive Many reported associations but little consistency, e.g. DTNBP1 Meta-analysis of 32 linkage studies in SCZ (3200 pedigrees, 7500 affecteds). Lewis, Ng, and Levinson, October 2007

HYPERGENES Candidate Gene Approach Where to put the light?

HYPERGENES Eight to ten moderate-sized pedigrees are sufficient to perform definitive mapping … …affected sib-pair method will require 25 pairs… Gene finding prediction circa 1987

HYPERGENES genes ????? genes

HYPERGENES AA AC CT GG TT CT.. AG AC CC GG AT TT.. AA AC CC GG AA CT.. GG AA CT AG AT TT.. AG AC CC GG AA CT.. AA CC CC GG AT CC ~ 1M + SNPs → ← ~1000s individuals Single Nucleotide Polymorphisms (WGAS, GWAS) Whole-genome association studies

HYPERGENES Affected Individuals (CASES)Not affected Individuals (CONTROLS)  Comparing the SNP script of the 2 populations Strategy : Comparing affected and not affected individuals

HYPERGENES Bird’s eye view of a WGAS Chromosomes Association, -log10(p)

HYPERGENES Some success stories… 10Mar2005

HYPERGENES QTs increase power and reduce needed sample sizes OR 1.3; MAF.2; QT 10%

HYPERGENES Challenges Genome-Wide Scan How to analyze datasets with 1 million SNPs? How to analyze datasets with 1 million GLMs? Dimensionality: The problem of false positives and false negatives Permutation Testing to correct for false positives

HYPERGENES Gene-based scenarios & Pathways Unit of analysis is now a gene, not an allelic variant –aggregate multiple, semi-independent single variant effects Various methods, probably similar –Sum statistics, combing chi-squares –Truncated rank product, combing p-values –Rank-based approaches, cf. gene-set enrichment analysis –Data-mining approaches to multivariate data Important to correct for LD and gene size Increased susceptibility to stratification & confounding a concern

HYPERGENES GenotypeRisk of disease AA0.001 AG0.001 GG0.95 Disease prevalence ~1 in 1000 Individuals with GG are ~1000 times more likely to get disease Frequency of G in controls~ 5% Frequency of G in cases~ 96% Disease prevalence ~1 in 1000 Individuals with GG are ~1000 times more likely to get disease Frequency of G in controls~ 5% Frequency of G in cases~ 96% Rare disease, major gene effect

HYPERGENES Common disease, polygenic effects GenotypeRisk of disease AA0.01 AG0.012 GG Disease prevalence ~1 in 100 Each extra G allele increases risk by ~1.2 times Frequency of G in controls~ 5% Frequency of G in cases~ 6% Disease prevalence ~1 in 100 Each extra G allele increases risk by ~1.2 times Frequency of G in controls~ 5% Frequency of G in cases~ 6%

HYPERGENES Gene selectionCriteria Association Tissue localization Literature Others Interaction identificationCriteria Genetic interaction Physical interaction Literature Others Cluster identificationCriteria Superposition of interactions Clustering of selected genes Overall score BUILDING THE NETWORK Identifying the pathways

HYPERGENES European Network for Genetic-Epidemiological Studies Building a method to dissect complex genetic traits using essential hypertension as a disease model CONCEPT & OBJECTIVES: Find genes responsible for EH and TOD, using a whole genome association/entropy-based approach USING ALREADY EXISTING COHORTS, as specifically required by the call, in order to be able, not only to identify a panel of genes involved in the pathogenesis of hypertension, but also to be able to detect the complexity of gene-environment interaction. 11 Clinical Units P8 (Imperial)

HYPERGENES Size does matter… Proportional increase in sample size for same power (80%) Number of tests (log scale) Proportional increase in sample size required to maintain 80% power after conservative Bonferroni correction Benefit of increasing sample size beats the cost of increasing testing burden

HYPERGENES Combined analysis and replication DesignThresholdPower (%) “Discovery” 1000/1000 1e-56.1 “Replication” 1000/1000 1e Power to detect and replicate 1e-5×1e-2 =1e Combined analysis1e If data exist, usually more powerful to analyse jointly, versus splitting samples to achieve “replication” –Power calculations for a “typical” common variant –1.2-fold risk variant –40% MAF

HYPERGENES Power to detect at least 1 polygene Note, for a single gene, this level of power (10%) is poor But given >1 gene, power to detect at least 1 rises –i.e. heritable, polygenic disease A study that finds only 1 or 2 new genes of many, still probably a big success…

HYPERGENES Expected (-logP) Observed (-logP) Captures 78.7% of common variation ( MAF>=5%, including multimarker tests ) Q-Q plot for primary BP sample The “million-dollar plot”

HYPERGENES Evaluation of significance Pre-determined genome-wide threshold –1×10 -6 (Risch and Merikangas) –5×10 -8 (HapMap, equivalent to 1 million independent tests) Pre-determined number of SNPs for follow-up –e.g. fixed by replication study genotyping platform Various statistical corrections for actual number of (independent) tests, or alternate metrics –Bonferroni correction not actually that bad, as any one test is uncorrelated with vast majority of other tests –Permutation –False discovery rate –False positive report probability –Bayes factor Incorporation of prior information –From other studies (WGAS, linkage, candidate genes, expression, functional) –From sequence annotation (conserved/genic regions, enhancers, regulators, etc)

HYPERGENES Limitations of Genome-Wide Scan Problem of false positives: –Correction for number tests e.g. Bonferroni Number of SNPs e.g. 100K or 500K Numbers of genes ~26,000 Brain expressed genes ~15,000 –Permutation test on SNP –Confirmation in an independent sample –Biological Plausibility & Implications cannot be avoided regardless of N