Cancer Genomics Lecture Outline


Cancer Genomics Lecture Outline
- How do you do whole genome sequencing? With massively parallel sequencing (slides courtesy of Jason Lieb).
- What has this revealed about cancer genomes? How many mutations, and of what type?
- What are the ethical and clinical implications of whole genome sequencing of patients? What do you tell a patient about their genome?

Lessons from Cancer Genomics
Sequence lots of cancer genomes. Why do this?
- It gives a more precise definition of cancer in general and of YOUR cancer in particular.
- It opens up the potential for personalized or “precision” medicine.
Tumor heterogeneity:
- Results from tumor evolution (genetic changes over time).
- Causes resistance to chemotherapy via natural selection; this is why you relapse.
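
The selection argument above can be illustrated with a toy calculation. This is only a sketch under invented assumptions: the starting cell counts, the 90% kill fraction, and the doubling between treatment cycles are hypothetical numbers chosen for illustration, not values from the lecture.

```python
def resistant_fraction_over_time(sensitive, resistant, cycles,
                                 kill_fraction=0.9, growth=2.0):
    """Track the resistant fraction of a tumor across treatment cycles.

    Hypothetical model: each cycle, chemotherapy kills `kill_fraction` of the
    drug-sensitive cells and none of the resistant cells, then every
    surviving cell population grows by the factor `growth`.
    """
    history = []
    for _ in range(cycles):
        sensitive = sensitive * (1 - kill_fraction) * growth
        resistant = resistant * growth
        history.append(resistant / (sensitive + resistant))
    return history


# A rare resistant subclone (1 in a million cells) in a heterogeneous tumor.
fractions = resistant_fraction_over_time(sensitive=1e9, resistant=1e3, cycles=8)
for cycle, fraction in enumerate(fractions, start=1):
    print(f"cycle {cycle}: resistant fraction = {fraction:.6f}")
```

In this toy model the resistant subclone gains a tenfold relative advantage per cycle, so a clone that starts as one cell in a million dominates the tumor after several cycles.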

Tumor heterogeneity makes cancer treatment more difficult
“The Downside of Diversity,” Science, 29 March 2013, vol. 339, no. 6127, pp. 1543-1545

Genetic heterogeneity among the cells of an individual tumor always exists and can impact the response to therapeutics.
Fig. 6. Four types of genetic heterogeneity in tumors, illustrated by a primary tumor in the pancreas and its metastatic lesions in the liver. Mutations introduced during primary tumor cell growth result in clonal heterogeneity. At the top left, a typical tumor is represented by cells with a large fraction of the total mutations (founder cells) from which subclones are derived. The differently colored regions in the subclones represent stages of evolution within a subclone. (A) Intratumoral: heterogeneity among the cells of the primary tumor. (B) Intermetastatic: heterogeneity among different metastatic lesions in the same patient. In the case illustrated here, each metastasis was derived from a different subclone. (C) Intrametastatic: heterogeneity among the cells of each metastasis develops as the metastases grow. (D) Interpatient: heterogeneity among the tumors of different patients. The mutations in the founder cells of the tumors of these two patients are almost completely distinct (see text).
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.


Why Sequence Whole Genomes?
- To speed characterization of genes mapped by linkage.
- To obtain a "parts list" for what makes up an organism. All of the instructions are there, and the book is in front of us. Now, we just need to figure out what it all means.
- To discover what sets of genes make organisms (and each of us) similar to and different from one another.
- To understand our evolutionary heritage. Our genomes are a reflection of our recent and ancient origins.

How Are Genomes Sequenced?
In the 1980s, all of the tools were in place for genome sequencing to begin:
- Vectors for making genomic libraries
- PCR for amplifying genes
- DNA sequencing machines
But sequencing was too slow and too expensive to go for whole genomes.

Dideoxy nucleotides are used for "Sanger" sequencing
[Figure: structure of a 2’,3’-dideoxynucleotide triphosphate, with the sugar carbons labeled 1’ to 5’]

How it works: Dideoxy nucleotides terminate the DNA chain at positions according to DNA sequence
What happens if we put ddATP into our reaction?
[Figure: a primer carrying a 32P label is extended along the template; ddA-terminated products of 10 nt, 19 nt, and 20 nt are shown]
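
A minimal sketch of the chain-termination idea in code. The primer and the extended sequence below are made up for illustration (they are not the sequence on the slide), and the function names are hypothetical. Each dideoxy reaction yields one labeled product per occurrence of that base beyond the primer, and sorting the products from all four reactions by length reads out the sequence.

```python
def ddntp_fragment_lengths(new_strand, primer_len, dd_base):
    """Lengths (in nt) of products terminated by the given dideoxy base.

    `new_strand` is the full newly synthesized strand, 5'->3', starting with
    the primer. A product ends wherever `dd_base` is incorporated after the
    primer, so its length is the position of that base counted from the 5' end.
    """
    return [i + 1 for i, base in enumerate(new_strand)
            if i >= primer_len and base == dd_base]


def read_ladder(new_strand, primer_len):
    """Run all four reactions and read the sequence from shortest to longest."""
    ladder = {}
    for dd_base in "ACGT":
        for length in ddntp_fragment_lengths(new_strand, primer_len, dd_base):
            ladder[length] = dd_base
    return "".join(ladder[length] for length in sorted(ladder))


# Hypothetical 4-nt primer (GGCC) followed by the extended sequence.
strand = "GGCC" + "TAATGTACG"
print(ddntp_fragment_lengths(strand, 4, "A"))  # products of the ddATP reaction
print(read_ladder(strand, 4))                  # sequence read off the ladder
```

With ddATP alone, only the products ending in A appear; the other three reactions fill in the remaining rungs of the ladder.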

Sanger sequencing
- Single-primer PCR + chain-terminating nucleotides
- Frederick Sanger (1975); 1980 Nobel Prize
- 1 molecule at a time
- First genome sequence (1977): ΦX174 phage, 5,386 bp
- ~1000 bp per run
- Currently $5 / sequence (2 bp / cent)

Modern Sanger sequencing uses fluorescent dyes instead of radioactivity and capillary tubes instead of gels

Sanger Sequencing Output

A strategy called shotgun sequencing, coupled with advances in sequencing technology, robotics, and computers, made "Genomics" possible
1. Blow genome to bits.
2. Sequence the little bits.
3. Put all the little bits back together in the right order based on their sequence.
4. Assemble longer and longer pieces until the genome is complete.
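
Steps 3 and 4 can be sketched as a toy greedy assembler: repeatedly merge the two pieces with the longest suffix-to-prefix overlap until nothing overlaps any more. This is an illustration only, assuming error-free reads from a single strand with no repeats; the reads and the `min_len` threshold are invented, and real assemblers use far more sophisticated methods.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if the best overlap is shorter than `min_len`.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair of pieces until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # anything left unmerged is separated by a gap
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs)
                   if k not in (best_i, best_j)] + [merged]


# Three overlapping "little bits" of a made-up 15-bp genome.
reads = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
print(greedy_assemble(reads))
```

If the reads do not overlap everywhere, the function returns more than one contig, which corresponds to the gaps left after shotgun sequencing.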

There are always some gaps after shotgun sequencing

Paired-end reads can identify clones that span sequence gaps
Note that some gaps may never be closed. For example, it is difficult to assemble repetitive DNA sequence.

Using these strategies, between 1995 and 2003, the genome sequences of over 240 organisms were determined
- 1977: First viral sequence, ΦX174 (5.3 kb)
- 1995: First complete prokaryotic genome (H. influenzae, 1.8 Mb)
- 1996: First eukaryote, S. cerevisiae, 12 Mb
- 1998: First animal, C. elegans, 100 Mb
- 2001: Drosophila, 180 Mb (first genome by shotgun)
- 2001: “Draft” human sequence, 3 Gb
- Book (2008): “Large Sequencing Centers can now assemble a mammalian genome in 1-2 years.”
- 2011: 30X genome in 10 days for ~$10,000-20,000
- 2012: 30X genome in 3 days for ~$3,000

The Sequence Explosion Nature, Feb. 2010

Next-generation Sequencing (Deep Sequencing)
Massively parallel sequencing of short DNA fragments
- It will not replace Sanger sequencing for small sequencing projects.
- It has changed the approach for large-scale sequencing projects and genome research.
Time frames and cost for genome sequencing (Sanger):
- 1997, yeast genome (13 Mb): ~10 years; cost $30M
- 2003, human genome (~3 Gb): 15 years (1988-2003); $2.7 billion in 1991 dollars
Genome sequencing (20x coverage) in September 2008 (2 Illumina GAII):
- Yeast genome (13 Mb): 8 genomes per 6-day run; cost $1,200/genome
- Human genome (~3 Gb): 10-15 weeks; cost ~$150k
Genome sequencing (20x coverage) in March 2009 (3 Illumina GAII systems):
- Human genome (~3 Gb): 7 weeks; cost ~$75k; over 10,000 times cheaper
Genome sequencing (20x coverage) in March 2011 (1 Illumina HiSeq):
- Human genome (~3 Gb): 10 days; cost ~$15k; over 50,000 times cheaper, 500 times faster
- January 2012 update: $3-5k, 2x faster
Genome sequencing (20x coverage) in 2015 (1 Illumina HiSeq 2500):
- Human genome for only $1,000 (60 billion bases)!!
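
The "times cheaper" and "times faster" figures can be checked with simple arithmetic. The sketch below uses only the numbers quoted on this slide and makes no inflation adjustment between the 1991-dollar Human Genome Project cost and later prices, so the ratios are rough; the helper names are hypothetical.

```python
HGP_COST_USD = 2_700_000_000   # Human Genome Project, in 1991 dollars
HGP_DAYS = 15 * 365            # 15 years (1988-2003)


def fold_cheaper(cost_usd):
    """How many times cheaper a genome is than the Human Genome Project."""
    return HGP_COST_USD / cost_usd


def fold_faster(days):
    """How many times faster a genome is than the Human Genome Project."""
    return HGP_DAYS / days


# (label, approximate cost in USD, approximate time in days) from the slide
milestones = [
    ("March 2009, 3 Illumina GAII", 75_000, 7 * 7),
    ("March 2011, 1 Illumina HiSeq", 15_000, 10),
]
for label, cost, days in milestones:
    print(f"{label}: {fold_cheaper(cost):,.0f}x cheaper, "
          f"{fold_faster(days):,.0f}x faster")

# 20x coverage of a ~3 Gb genome is the "60 billion bases" on the slide.
print(20 * 3_000_000_000)
```

By this arithmetic the slide's "over 10,000 times" and "over 50,000 times" are conservative lower bounds, while "500 times faster" is close to the computed ratio for a 10-day genome.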

Some “Next Gen” sequencing platforms
- Roche Genome Sequencer FLX System (454)
- Illumina Genome Analyzer (GAII and HiSeq), aka “Solexa”
- ABI SOLiD System
- HeliScope Single Molecule Sequencer
Shendure J. & Ji H., Nature Biotechnology 2008

Illumina Sequencing
[Figure: workflow from DNA (0.1-1.0 ug) through sample preparation (with barcodes), cluster growth on a single-molecule array, sequencing, image acquisition, and base calling]

Current status
- 2 flow cells per machine
- 8 lanes per flow cell
- 200-300 million molecules sequenced per lane
- 50-250 bp single or paired-end reads
- $800-$2000 per lane
- ~250,000 bp / cent
The High-Throughput Sequencing Facility's sequencing bank is one of the largest in the country, with 10 Illumina HiSeqs, a PacBio RS2, and several Ion Torrents and Protons.
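
The bp-per-cent figure follows from the other numbers on this slide. The sketch below picks one point inside the quoted ranges (250 million reads, 100 bp single-end reads, $1,000 per lane); those particular values are assumptions chosen for illustration, and the function names are hypothetical.

```python
def bases_per_lane(reads, read_len, paired=False):
    """Total bases from one lane; paired-end runs read each molecule twice."""
    return reads * read_len * (2 if paired else 1)


def bp_per_cent(bases, cost_usd):
    """Bases obtained per cent spent."""
    return bases / (cost_usd * 100)


def coverage(bases, genome_size):
    """Average number of times each genome position is sequenced."""
    return bases / genome_size


lane = bases_per_lane(reads=250_000_000, read_len=100)   # assumed mid-range values
print(f"{lane:,} bases per lane")
print(f"{bp_per_cent(lane, cost_usd=1000):,.0f} bp per cent")
print(f"{coverage(lane, genome_size=3_000_000_000):.1f}x human genome coverage per lane")
print(f"{bases_per_lane(250_000_000, 100) * 8 * 2:,} bases per machine run (8 lanes x 2 flow cells)")
```

Compared with the 2 bp / cent quoted for Sanger sequencing, these assumptions give roughly a 100,000-fold difference in cost per base.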


Some thoughts from sequencing cancer genomes
- Gatekeeper or driver mutations
- Passenger mutations
- No metastatic genes

Most human cancers are caused by two to eight sequential alterations that develop over the course of 20 to 30 years. Each of these alterations directly or indirectly increases the ratio of cell birth to cell death; that is, each alteration causes a selective growth advantage to the cell in which it resides.
Fig. 1. Number of somatic mutations in representative human cancers, detected by genome-wide sequencing studies. (A) The genomes of a diverse group of adult (right) and pediatric (left) cancers have been analyzed. Numbers in parentheses indicate the median number of nonsynonymous mutations per tumor. (B) The median number of nonsynonymous mutations per tumor in a variety of tumor types. Horizontal bars indicate the 25 and 75% quartiles. MSI, microsatellite instability; SCLC, small cell lung cancers; NSCLC, non–small cell lung cancers; ESCC, esophageal squamous cell carcinomas; MSS, microsatellite stable; EAC, esophageal adenocarcinomas. The published data on which this figure is based are provided in table S1C.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

There are two types of cancer “driver” genes (e.g. oncogenes and tumor suppressors)
- Mut-driver genes: The evidence to date suggests that there are ~140 genes whose intragenic mutations contribute to cancer.
- Epi-driver genes: Genes that are altered by epigenetic mechanisms and cause a selective growth advantage. The definitive identification of epi-driver genes has been challenging.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

Fig. 5. Number and distribution of driver gene mutations in five tumor types. The total number of driver gene mutations [in oncogenes and tumor suppressor genes (TSGs)] is shown, as well as the number of oncogene mutations alone. The driver genes are listed in tables S2A and S2B. Translocations are not included in this figure, because few studies report translocations along with the other types of genetic alterations on a per-case basis. In the tumor types shown here, translocations affecting driver genes occur in less than 10% of samples. The published data on which this figure is based are provided in table S1E.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

Fig. 3. Total alterations affecting protein-coding genes in selected tumors. Average number and types of genomic alterations per tumor, including single-base substitutions (SBS), small insertions and deletions (indels), amplifications, and homozygous deletions, as determined by genome-wide sequencing studies. For colorectal, breast, and pancreatic ductal cancer, and medulloblastomas, translocations are also included. The published data on which this figure is based are provided in table S1D.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

Fig. 4. Distribution of mutations in two oncogenes (PIK3CA and IDH1) and two tumor suppressor genes (RB1 and VHL). The distribution of missense mutations (red arrowheads) and truncating mutations (blue arrowheads) in representative oncogenes and tumor suppressor genes are shown. The data were collected from genome-wide studies annotated in the COSMIC database (release version 61). For PIK3CA and IDH1, mutations obtained from the COSMIC database were randomized by the Excel RAND function, and the first 50 are shown. For RB1 and VHL, all mutations recorded in COSMIC are plotted. aa, amino acids.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

The Hallmarks of Cancer Hanahan and Weinberg, Cell 144:646 (2011)

The known driver genes function through a dozen signaling pathways that regulate three core cellular processes: cell fate determination, cell survival, and genome maintenance.
Fig. 7. Cancer cell signaling pathways and the cellular processes they regulate. All of the driver genes listed in table S2 can be classified into one or more of 12 pathways (middle ring) that confer a selective growth advantage (inner circle; see main text). These pathways can themselves be further organized into three core cellular processes (outer ring). The publications on which this figure is based are provided in table S5.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

Every individual tumor, even of the same histopathologic subtype as another tumor, is distinct with respect to its genetic alterations, but the pathways affected in different tumors are similar.
Fig. 8. Signal transduction pathways affected by mutations in human cancer. Two representative pathways from Fig. 7 (RAS and PI3K) are illustrated. The signal transducers are color coded: red indicates protein components encoded by the driver genes listed in table S2; yellow balls denote sites of phosphorylation. Examples of therapeutic agents that target some of the signal transducers are shown. RTK, receptor tyrosine kinase; GDP, guanosine diphosphate; MEK, MAPK kinase; ERK, extracellular signal–regulated kinase; NFkB, nuclear factor κB; mTOR, mammalian target of rapamycin.
B Vogelstein et al. Science 2013;339:1546-1558. Published by AAAS.

RESEARCH IN “MODEL” ORGANISMS!
“The vast majority of our knowledge of the function of driver genes has been derived from the study of the pathways through which their homologs work in nonhuman organisms.”
We believe that greater knowledge of these pathways and the ways in which they function is the most pressing need in basic cancer research.
B Vogelstein et al. Science 2013;339:1546-1558


Some Definitions Regarding WGS
- Actionable item: when you can actually do something for the patient.
- Clinical validity: the data supporting a genotype/phenotype relationship are strong.
- Clinical utility: you can address (e.g. treat or advise on) a clinically valid genotype.
- The “incidentalome”: the vast majority of a patient's genetic variants will be unrelated to the presenting symptoms.

PGx: pharmacogenetics
Berg et al. Genetics in Medicine, Volume 13, Number 6, June 2011

Computationally binning human genetic variants
We categorized 2,016 genes linked with Mendelian diseases into “bins” based on clinical utility and validity, and used a computational algorithm to analyze 80 whole-genome sequences in order to explore the use of such an approach in a simulated real world setting.
Berg et al. Genetics in Medicine, Volume 15, Number 1, January 2013
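
The general idea of gene-based binning can be sketched as a lookup from gene to bin. This is not the algorithm from the paper: the bin labels, gene names, and variant records below are all hypothetical placeholders, and the only thing taken from the slide is that genes are pre-assigned to bins according to clinical utility and validity.

```python
from collections import defaultdict

# Hypothetical gene-to-bin table; a real one would be curated by clinicians.
GENE_BINS = {
    "GENE_A": "clinical utility (actionable)",
    "GENE_B": "clinical validity only",
}
DEFAULT_BIN = "unknown clinical implication"


def bin_variants(variants, gene_bins=GENE_BINS):
    """Group variant IDs by the bin of the gene each variant falls in."""
    bins = defaultdict(list)
    for variant in variants:
        bins[gene_bins.get(variant["gene"], DEFAULT_BIN)].append(variant["id"])
    return dict(bins)


# Made-up variants from one simulated genome.
variants = [
    {"id": "var1", "gene": "GENE_A"},
    {"id": "var2", "gene": "GENE_B"},
    {"id": "var3", "gene": "GENE_C"},
    {"id": "var4", "gene": "GENE_A"},
]
for bin_name, ids in bin_variants(variants).items():
    print(f"{bin_name}: {ids}")
```

Assigning bins per gene ahead of time means each new genome can be sorted automatically, leaving only the variants in the reportable bins for expert review.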