Genome Wide Association Study (GWAS) and Personalized Medicine

1 Genome Wide Association Study (GWAS) and Personalized Medicine

2 Outline
Gene discovery and personalized medicine
  Family linkage-based approach
  Candidate gene-based approach
  Whole genome scan (genome-wide association study)
Genome-wide association study (GWAS)
  Objectives and approaches
  Benefits and challenges
  Resources and requirements
  Technologies
A case study: Genome-Wide Study of Exanta Hepatic Adverse Events

3 Human Genome Project – Hunting for disease genes
Draft sequence published February 15 & 16, 2001, in Nature and Science
Implications:
  Scientific advancement
  Enhanced public health
  Potential social issues

4 Relationship between genes and diseases - Single Gene-Driven Diseases
Rare and familial diseases caused by mutations in a single gene (e.g., cystic fibrosis and sickle-cell anemia)

5 Family Linkage-Based Approach
Identify Genetic Profile Through Gene Discovery: Approaches and Technologies
Family linkage-based approach
  Uses the linkage principle to study families in which the disease occurs frequently
  Identifies disease-susceptibility genes in rare familial diseases
  More successful for diseases caused by a single gene (e.g., Huntington’s disease)
  More successful for genes that strongly increase risk
  Needs a well-documented family tree and disease history
  Far less likely to succeed for heritable diseases caused by the interaction of many weak genes

6 Relationship between genes and diseases - Multiple Gene-Driven Diseases
Many genes interact with each other to cause disease
No single gene has a strong effect
Must search for multiple genes functionally involved in putative disease-associated biomedical pathways

7 Candidate Gene-Based Approach
Identify Genetic Profile Through Gene Discovery: Approaches and Technologies (cont.)
Candidate gene-based approach
  Process:
    Select genes from known disease-related pathways
    Search for causative mutations in those genes
    e.g., ACH/Charlotte Hobbs
  Knowledge-based approach
  Drawbacks:
    Constrained by existing knowledge
    Constrained by the genes examined

8 A More Complicated Picture
Genetics loads the gun, but environment pulls the trigger
Interaction between disease genes and patients’ lifestyle and/or environment

9 A Realistic Picture
Same (similar) symptom + one-fits-all treatment = diverse responses to treatment

10 Diverse response to a one-fits-all treatment
Optimal responders
Suboptimal responders
Non-responders
Adverse events

11 From One-Fits-All to Personalized Medicine
Match patients to treatment based on their genetic profile
Optimal responders
Suboptimal responders
Adverse events
Non-responders

12 A New Way to Determine Genetic Profile - Whole Genome Scanning
Search all possible SNPs, not mutations, in all genes. Yeah, right!

13 Genetic Profile – From Mutation to SNPs
Mutations and SNPs are both forms of genetic variation
Mutations: rare variants (<1%), considered harmful and disease related
SNPs: common variants (>1%), the majority of genetic variation, “harmless” and not disease related (personalized)
SNPs can be used as markers for disease genes
GWAS searches for SNPs that mark disease-causing mutations

14 The Era of the Genome Wide Association Study (GWAS)
A brute-force approach: examine the entire genome to identify SNPs that might be disease-causing mutations
Far exceeds the scope of the family linkage and candidate gene approaches
Must obtain a comprehensive picture of all possible genes involved in a disease and how they interact
Objective: identify multiple interacting disease genes and their respective pathways, thus providing a comprehensive understanding of the etiology of disease

15 GWAS Approach
Compare cases with matched/unmatched controls
Association tested for:
  Individual SNPs
  Alleles
  Haplotypes (combinations of SNPs)
Disease-related findings:
  Genes
  Pathways
  Loci

16 Benefits and Challenges
Challenges:
  The uncertain relationship between SNPs and the disease-causing mutation requires a large sample size
  2,000 to 4,000 samples; minimum 1,000
  Unfortunately, most experiments have < 500 samples
Why the enthusiasm about GWAS:
  A comprehensive, unbiased scan of the genome has the potential to identify totally novel disease genes or susceptibility factors
  Potential to identify multiple interacting disease genes and their respective/shared pathways
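
To see why sample size dominates the challenges, here is a rough sketch of the textbook normal-approximation sample-size formula for comparing an allele frequency between cases and controls. Everything specific is an assumption for illustration: the function name, the allele frequencies (0.25 vs 0.20), 80% power, equal group sizes, independent alleles, and a Bonferroni-style threshold of 0.05 divided by 266,722 SNPs (the Phase I GWAS SNP count from the case study).

```python
from math import ceil, sqrt
from statistics import NormalDist


def alleles_per_group(p_case, p_control, alpha, power=0.8):
    """Alleles needed per group to detect a frequency difference (two-sided).

    Normal approximation for two independent proportions, equal group sizes.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_beta = standard_normal.inv_cdf(power)
    p_bar = (p_case + p_control) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_case * (1 - p_case) + p_control * (1 - p_control))
    )
    return ceil((numerator / (p_case - p_control)) ** 2)


# Assumed scenario: minor allele frequency 0.25 in cases vs 0.20 in controls.
single_test = alleles_per_group(0.25, 0.20, alpha=0.05)
genome_wide = alleles_per_group(0.25, 0.20, alpha=0.05 / 266_722)

# Each person carries two alleles, so halve to get individuals per group.
print("single test:", ceil(single_test / 2), "individuals per group")
print("genome-wide:", ceil(genome_wide / 2), "individuals per group")
```

The stricter genome-wide threshold raises the required sample size several-fold for the same effect, which is consistent with the slide's point that studies with fewer than 500 samples are underpowered.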

17 Requirements
Success factors
  Experimental: large sample size
  Platform: accurate genotyping technology
  Analysis: comprehensive SNP maps, rapid algorithms
  IT: sophisticated IT infrastructure, powerful computers
Expertise (NCTR)
  Medical doctors (NA)
  HTP genotyping platforms (NA)
  Population genetics (NA)
  Biostatistics (Yes)
  Bioinformatics (Yes)
  Statistics (Yes)

18 SNP Map
Current technology is not advanced enough to encompass all SNPs; not even close
Select SNPs based on haplotype blocks
Issues related to haplotypes:
  A haplotype is a SNP pattern consistent across a population
  Population-dependent
  Analysis method-dependent
  One of the objectives of HapMap
LD -> haplotype block -> selecting SNPs
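
Selecting SNPs from haplotype blocks rests on linkage disequilibrium (LD): when two SNPs are strongly correlated, genotyping one tells you most of what the other would. A minimal sketch of the standard r² LD measure for two biallelic SNPs is shown below. It assumes phased haplotypes with alleles coded 0/1; the function name and the toy data are hypothetical.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: sequence of (allele_at_snp1, allele_at_snp2), alleles coded 0/1.
    """
    n = len(haplotypes)
    freq_a = sum(h[0] for h in haplotypes) / n          # allele 1 at SNP 1
    freq_b = sum(h[1] for h in haplotypes) / n          # allele 1 at SNP 2
    freq_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n
    d = freq_ab - freq_a * freq_b                       # LD coefficient D
    denominator = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denominator


# Toy data: SNP 2 mostly travels with SNP 1 (illustration only).
toy_haplotypes = [(1, 1)] * 45 + [(0, 0)] * 45 + [(1, 0)] * 5 + [(0, 1)] * 5
print(f"r^2 = {ld_r_squared(toy_haplotypes):.2f}")
```

A SNP-selection step could then keep one representative SNP from each group whose pairwise r² exceeds a chosen cutoff; the cutoff itself is a design choice and, as the slide notes, the resulting blocks depend on the population and the analysis method.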

19 Selection of SNPs for GWAS

20 High-Throughput Genotyping Technology
Several diverse technologies, but the field is moving to array-based approaches
Array-based technologies: Illumina, Affymetrix, Perlegen and NimbleGen
Very similar to the technology used for gene expression microarrays

21
7 positions × 2 alleles × 2 strands × 2 probes (PM/MM) = 56 features in total


23 Downstream Analysis (QC)

24 Current Practice: A Combination of Candidate Gene Approach and GWAS
GWAS
  Data-driven
  Generates new knowledge
  Relies on a SNP map
  Allows systematic scanning
Candidate gene approach
  Hypothesis-driven
  Constrained by knowledge

25 Case Study: Genome-Wide Study of Exanta Hepatic Adverse Events
Ximelagatran, marketed as Exanta™, developed by AstraZeneca (AZ)
Developed/tested for:
  Prevention of stroke in atrial fibrillation
  Treatment of acute venous thromboembolism
Withdrawn from clinical development in 2006 because of ALT elevation:
  Idiosyncratic: ALT > 3 x upper limit of normal (ULN) occurred in 6-7% of patients
  Geography-dependent: higher incidence in Northern Europe than in Asia
Hypothesis: genetic factors could be involved
Approaches: GWAS and candidate gene approaches

26 Samples (Subjects or Patients)
The original set (training set)
  248 subjects from 80 regions in Europe (Denmark, Finland, Germany, Norway, Poland, Sweden and the UK)
  74 cases: ALT elevation > 3 x ULN
  132 controls: ALT elevation < 1 x ULN
  39 intermediate controls: ALT elevation > 1 x ULN and < 3 x ULN
An independent data set that became available later
  10 cases and 16 treated controls

27 Experiment Design and Process
Phase I genotyping
  Candidate gene approach: 690 genes, 26,613 SNPs (SNP/gene = 40)
  GWAS: 266,722 SNPs
Association analysis of SNPs with elevated ALT:
  Matched and unmatched case-control analysis
  Fisher’s exact test, ANOVA, logistic regression analysis
  Multiple testing correction (FDR)
  Haplotype and linkage disequilibrium (LD) analysis
Genes carried forward: 145 (candidate gene approach) and 76 (GWAS)
Phase II genotyping
  42,742 SNPs (SNP/gene = 200)
  28 SNPs representing 20 top-ranked genes
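
The slide names FDR as the multiple testing correction but does not say which procedure was used. As an illustration only, the sketch below implements the common Benjamini-Hochberg step-up procedure; the function name, the FDR level of 0.05, and the p-values are all hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is significant at FDR q.

    Step-up rule: find the largest rank k with p_(k) <= (k / m) * q and
    declare the k smallest p-values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank / m * q:
            largest_rank = rank
    significant = [False] * m
    for index in order[:largest_rank]:
        significant[index] = True
    return significant


# Hypothetical per-SNP p-values (illustration only).
snp_p_values = [0.04, 0.001, 0.5, 0.008]
print(benjamini_hochberg(snp_p_values, q=0.05))
```

Compared with dividing the threshold by the total number of SNPs tested, an FDR procedure like this is less conservative, which matters when hundreds of thousands of SNPs are tested on a few hundred subjects.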

28 Drill-Down and Knowledge-Driven Analysis
Drill-down from the lowest p-value SNPs in the candidate gene approach:
  HLA-DRB1 region
  HLA-DQA1 region
Haplotype analysis: DRB1*07, DQB1*02
Pipeline context:
  Phase I: 690 genes, 26,613 SNPs (SNP/gene = 40)
  Genes carried forward: 145 and 76
  Phase II: 42,742 SNPs (SNP/gene = 200); 28 SNPs

29 Validated by the Test Set
Test set (replication study)
  10 cases and 16 controls
  Both DRB1*07 and DQB1*02 are significant
  Only 2 of 28 SNPs are significant, which might be due to:
    False positives in Phase I
    Lack of power
A note:
  Phases I and II genotyping used the Perlegen technology
  The replication study used the TaqMan assay

30 Summary
Emphasis more on the candidate gene approach; candidate genes were selected from genes that are:
  Involved in the mechanism of action (MOA) of Exanta
  Associated with elevated liver enzymes (e.g., ALT)
  Derived from preclinical studies for Exanta
  Found to be genetically associated with adverse effects
Supported by the findings in Phase I
  Some evidence obtained from the candidate gene approach (145 genes selected from among 690)
  No evidence from GWAS (76 genes were selected)
Reflected in the drill-down approach
  Focused on the gene/region with the lowest p-value SNP from the candidate gene approach; both SNPs identified this way are significant
  2 out of 28 SNPs are significant from GWAS

31 My general impression
This study presents evidence from a comparative analysis of two approaches
  Knowledge-guided vs. high-throughput screening
  Hypothesis-driven vs. data-driven
Less emphasis on GWAS and more reliance on the results from the candidate gene approach
  Due to lack of power
  Multiple testing correction issue
Is GWAS ready for prime time?
  Results from this study are not encouraging
  Further investigation/survey is urgently needed

