Detection of structural variants and copy number alterations in cancer: from computational strategies to the discovery of chromothripsis in neuroblastoma.

Affiliations:
1 Inserm U900, Paris, France
2 Mines ParisTech, Fontainebleau, France
3 Institut Curie, 26, rue d'Ulm, Paris, France
4 Inserm U830, Paris, France

Poster sections: Introduction; CNA & LOH detection (FREEC); Discovery of chromothripsis in neuroblastoma.

Introduction

In many studies that apply deep sequencing to cancer genomes, one has to calculate copy number profiles (CNPs) and predict regions of gain and loss. Two obstacles frequently arise in the analysis of cancer genomes: the absence of an appropriate control sample from normal tissue, and possible polyploidy. We therefore developed Control-FREEC [1,2], which automatically detects copy number alterations (CNAs), with or without a control dataset, as well as loss of heterozygosity (LOH) regions.

CNA & LOH detection (FREEC)

Control-FREEC features:
- Detection of CNA regions
- Detection of LOH regions
- Possibility to work without a control sample
- Possibility to set tumor ploidy
- Automatic window selection
- Use of mappability information
- Evaluation of, and adjustment for, contamination of tumor samples by normal cells
- Possibility to work with exome data
- Possibility to cross the output with the output of SVDetect

GC-content normalization (figure, panels A-D): To find the best-fitting polynomial (shown in black), we first initialize the polynomial's parameters using the median read count (RC) for each GC-content value. We then optimize the parameters iteratively: at each step we select the data points that belong to P-copy regions (regions whose copy number equals the main ploidy P) and make a least-squares fit on them.
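The iterative fit described in the GC-content normalization caption can be sketched as follows. This is a minimal illustration, not the Control-FREEC implementation: the function name `fit_gc_polynomial`, the polynomial degree, the number of GC bins, and in particular the rule used to pick "P-copy" windows (keep windows whose read count lies within a tolerance `tol` of the current curve) are all assumptions made for the example.

```python
import numpy as np


def fit_gc_polynomial(gc, rc, degree=3, tol=0.25, n_iter=10, n_bins=20):
    """Fit read count (RC) as a polynomial of GC-content, iteratively.

    Assumptions (not taken from the poster): degree-3 polynomial, 20 GC bins
    for the initialization, and a +/- tol band around the curve as the
    selection rule for main-ploidy (P-copy) windows.
    """
    gc = np.asarray(gc, dtype=float)
    rc = np.asarray(rc, dtype=float)

    # Initialization: least-squares fit through the median RC of each GC bin.
    edges = np.linspace(gc.min(), gc.max(), n_bins + 1)
    bin_idx = np.clip(np.digitize(gc, edges) - 1, 0, n_bins - 1)
    centers, medians = [], []
    for b in range(n_bins):
        mask = bin_idx == b
        if mask.any():
            centers.append(gc[mask].mean())
            medians.append(np.median(rc[mask]))
    coeffs = np.polyfit(centers, medians, degree)

    # Refinement: keep windows close to the current curve, then refit.
    for _ in range(n_iter):
        expected = np.polyval(coeffs, gc)
        ratio = rc / np.maximum(expected, 1e-9)
        keep = np.abs(ratio - 1.0) < tol
        if keep.sum() <= degree:
            break  # too few points left for a stable fit
        new_coeffs = np.polyfit(gc[keep], rc[keep], degree)
        converged = np.allclose(new_coeffs, coeffs, rtol=1e-6)
        coeffs = new_coeffs
        if converged:
            break
    return coeffs


def normalize_read_counts(gc, rc, coeffs):
    """Divide each window's RC by the value the fitted curve predicts."""
    expected = np.polyval(coeffs, np.asarray(gc, dtype=float))
    return np.asarray(rc, dtype=float) / np.maximum(expected, 1e-9)
```

With this selection rule, windows in gained or lost regions fall outside the band and stop influencing the curve, so the normalized read count of main-ploidy windows ends up close to 1.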
Authors (numbers refer to the affiliations): Valentina Boeva 1,2,3, Bruno Zeitouni 1,2,3, Tatiana Popova 1,2,3, Kevin Bleakley 1,2,3, Andrei Zinovyev 1,2,3, Jean-Philippe Vert 1,2,3, Isabelle Janoueix-Lerosey 3,4, Olivier Delattre 3,4 and Emmanuel Barillot 1,2,3

References
[1] Boeva, V., et al. Control-free calling of copy number alterations in deep-sequencing data using GC-content normalization. Bioinformatics, 2011; 27(2).
[2] Boeva, V., et al. Control-FREEC: a tool for assessing copy number and allelic content using next generation sequencing data. Bioinformatics, 2012; 28(3).
[3] Zeitouni, B., et al. SVDetect - a bioinformatic tool to identify genomic structural variations from paired-end next-generation sequencing data. Bioinformatics.

Combining Control-FREEC with SVDetect

For mate-pair/paired-end mapping (PEM) data, one can complement the information about CNAs (i.e., the output of Control-FREEC) with the predictions of structural variants (SVs) made by another tool that we developed, SVDetect [3]. Here we applied a combination of Control-FREEC and SVDetect to neuroblastoma samples. Automatic intersection of the Control-FREEC and SVDetect outputs allows one to:
1. Refine the coordinates of CNAs using PEM data.
2. Filter out false predictions of SVDetect, which often fall in ambiguous satellite/repetitive regions, and so improve confidence in calling true positive rearrangements.

Window size selection

$W = \frac{L}{T \cdot CV^2}$, where L is the genome length, T is the total number of reads, and CV is the user-defined coefficient of variation.

Calculation of the dependency function

"RC vs GC-content" or "RC sample vs RC control".

Segmentation

Segmentation is done by the LASSO-based algorithm suggested by Harchaoui and Lévy-Leduc (2008).
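As a quick check of the window-size rule W = L / (T · CV²) from the window size selection step, the sketch below evaluates it for made-up inputs. The function names and the example values of L, T and CV are hypothetical. The interpretation in the comments, that roughly Poisson-distributed counts with mean n have a coefficient of variation of 1/sqrt(n), is offered as a plausible reading of the formula and is not stated in the poster.

```python
import math


def window_size(genome_length, total_reads, cv):
    """Window length W = L / (T * CV^2), in base pairs."""
    if genome_length <= 0 or total_reads <= 0 or cv <= 0:
        raise ValueError("genome_length, total_reads and cv must be positive")
    return genome_length / (total_reads * cv ** 2)


def expected_reads_per_window(genome_length, total_reads, window):
    """Average number of reads falling in a window of the given length."""
    return total_reads * window / genome_length


if __name__ == "__main__":
    L = 3.0e9   # hypothetical genome length in bp
    T = 3.0e7   # hypothetical total number of reads
    CV = 0.05   # hypothetical user-defined coefficient of variation

    W = window_size(L, T, CV)
    n = expected_reads_per_window(L, T, W)
    # If counts are roughly Poisson with mean n, their CV is 1 / sqrt(n),
    # so choosing W this way gives about 1 / CV^2 reads per window.
    print(f"W = {W:.0f} bp, reads per window = {n:.0f}, "
          f"1/sqrt(n) = {1 / math.sqrt(n):.3f}")
```

For these inputs the rule gives a window of about 40 kb holding about 400 reads on average. A smaller CV therefore means larger windows and lower resolution, while more reads allow smaller windows at the same CV.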
Adjustment for a possible contamination by normal cells

Control-FREEC uses the following formula to evaluate the fraction of contaminating normal cells p, and then to correct the copy number profiles:

$NRC_i \approx E_i + (1 - E_i)\,p$,

where $NRC_i$ is the normalized read count in window i and $E_i$ is the expected ratio in window i.

Results and graphical visualization
1. List of gains and losses with assigned copy numbers
2. Visualization in R
3. Creation of output files in different formats for graphical visualization: Circos, UCSC Genome Browser (BedGraph)

Calculation of BAF profiles

B allele frequency (BAF) profiles are annotated using a Gaussian mixture model fit.
[Figure: normalized copy number and B allele frequency profiles.]

Detection of SVs (SVDetect)

SVDetect [3] is a tool that allows the user to:
- identify candidate SVs by clustering discordant PEMs;
- predict the type of an SV from its PEM signature;
- filter out PEMs inconsistent with the main signature of the predicted SV;
- compare SVs predicted for different samples;
- create output files in different formats for graphical visualization of predicted SVs.

[Figure: illustrations of read signatures for SV type prediction (implemented in SVDetect [3]). Panels: intra-chromosomal SVs; inter-chromosomal SVs.]

We investigated somatic rearrangements in two neuroblastoma cell lines and two primary tumors using paired-end sequencing of mate-pair libraries.

[Figure panels: primary neuroblastoma tumors with chromothripsis; neuroblastoma cell lines CLB-GA and CLB-RE.]

[Figure: Circos representation of SVs predicted by SVDetect and confirmed by the CNAs identified by Control-FREEC. (A-C) NB1141, (D-F) NB1142. (A, D) whole-genome view; (B, E) zoom on chromothripsis; (C, F) copy number profile for chr1 of NB1141 and chr6 of NB1142.]
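The contamination formula NRC_i ≈ E_i + (1 - E_i) p from the section on adjustment for contamination by normal cells can be rearranged in two ways: solved for p when the expected ratios are known, and solved for E_i once p has been estimated. The sketch below shows both steps. It is an illustration under stated assumptions, not the Control-FREEC procedure: the function names, the use of the median across windows as the estimator of p, and the `min_gap` threshold that skips windows with E_i close to 1 (where the formula carries no information about p) are all choices made for this example.

```python
import numpy as np


def estimate_contamination(nrc, expected, min_gap=0.1):
    """Estimate the normal-cell fraction p from NRC_i = E_i + (1 - E_i) * p.

    Each window with E_i far enough from 1 gives p_i = (NRC_i - E_i) / (1 - E_i).
    Taking the median of the p_i is an assumption made for this sketch.
    """
    nrc = np.asarray(nrc, dtype=float)
    expected = np.asarray(expected, dtype=float)
    informative = np.abs(1.0 - expected) >= min_gap
    if not informative.any():
        raise ValueError("no window with an expected ratio different from 1")
    p_i = (nrc[informative] - expected[informative]) / (1.0 - expected[informative])
    return float(np.clip(np.median(p_i), 0.0, 1.0))


def correct_ratio(nrc, p):
    """Invert the same formula: E_i = (NRC_i - p) / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return (np.asarray(nrc, dtype=float) - p) / (1.0 - p)


if __name__ == "__main__":
    # Hypothetical expected ratios: loss, neutral, gain, loss.
    expected = np.array([0.5, 1.0, 1.5, 0.5])
    observed = expected + (1.0 - expected) * 0.3  # simulated 30% contamination
    p_hat = estimate_contamination(observed, expected)
    print(p_hat, correct_ratio(observed, p_hat))
```

Contamination pulls every observed ratio toward 1, which is why neutral windows are uninformative and why the correction stretches gains and losses back away from 1.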