Genome-wide DNA methylation analysis


Genome-wide DNA methylation analysis Bi-Qing Li Key Laboratory of Systems biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences

Outline
- Background
- Method to distinguish 5mC
- Array based genome-wide DNA methylation analysis
- NGS based genome-wide DNA methylation analysis
- Third generation sequencing based genome-wide DNA methylation analysis
- Illumina BS-seq data manipulation


Background
DNA methylation is the main covalent chemical modification of DNA and is involved in a variety of biological processes, including embryogenesis and development, silencing of transposable elements, regulation of gene transcription, and tumorigenesis and tumor progression. The methylation pattern of DNA is highly variable among cell types and developmental stages and is influenced by disease processes and genetic factors, which poses considerable theoretical and technological challenges for its comprehensive analysis. Recently, various high-throughput approaches have been developed and applied to the genome-wide analysis of DNA methylation, providing quantitative DNA methylation data at single-base-pair resolution with genome-wide coverage.
Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085


Method to distinguish 5mC Biotechniques. 2010 Oct;49(4):iii-xi

Restriction endonuclease-based analysis
- Isoschizomer pair: one enzyme cuts unmethylated DNA only, the other cuts regardless of methylation
- Neoschizomer pair: one enzyme cuts unmethylated DNA only, the other is partially affected by CpG methylation
- Methylation-dependent enzyme: cuts methylated DNA. In its recognition half-sites, Pu is A or G and mC is 5-methylcytosine, 5-hydroxymethylcytosine or N4-methylcytosine. These half-sites can be separated by up to 3 kb, but the optimal separation is 55-103 base pairs
Biotechniques. 2010 Oct;49(4):iii-xi

Restriction endonuclease-based analysis
- Methylation-sensitive restriction digestion followed by PCR across the restriction site is a very sensitive technique that is still used in some applications today.
- It remains applicable to locus-specific studies that require linkage of DNA methylation information across multiple kilobases, either between CpGs or between a CpG and a genetic polymorphism.
- It provides methylation data only at restriction enzyme recognition sites or adjacent regions.
- It is extremely prone to false-positive results caused by incomplete digestion for reasons other than DNA methylation.
Nat Rev Genet. 2010 Feb 2;11(3):191-203

Bisulfite conversion of DNA PCR Proc Natl Acad Sci U S A. 1992 Mar 1;89(5):1827-31.

Bisulfite conversion of DNA
Advantages:
- Single base pair resolution, no bias
Limitations:
- DNA degradation caused by high temperature and low pH
- Incomplete conversion of unmethylated cytosine in high GC density regions, regions protected by histones, and stable secondary structure elements
- Reduced complexity of the genome: greater sequence redundancy and decreased hybridization specificity
- Difficult mapping (repetitive regions)
Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085
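The effect of bisulfite conversion followed by PCR can be illustrated in silico. The sketch below is only an illustration, assuming complete conversion and a single strand; the function name `bisulfite_convert` and the example sequence are hypothetical and not part of the original slides.

```python
def bisulfite_convert(seq, methylated_positions=frozenset()):
    """Return the sequence as read after bisulfite treatment and PCR.

    Unmethylated C is deaminated to U and read as T after PCR.
    Cytosines at the given 0-based positions are treated as 5mC
    and stay C. Complete conversion is assumed.
    """
    converted = []
    for index, base in enumerate(seq.upper()):
        if base == "C" and index not in methylated_positions:
            converted.append("T")
        else:
            converted.append(base)
    return "".join(converted)


original = "ACGTCCGA"
# The C at position 1 is methylated; the Cs at positions 4 and 5 are not.
converted = bisulfite_convert(original, methylated_positions={1})
print(original)
print(converted)
```

Comparing the converted read with the original sequence position by position shows which cytosines were protected, which is the basis of the single base pair resolution listed above.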

Immunoprecipitation-based methods
Methylated DNA immunoprecipitation (MeDIP-seq)
- An antibody recognizes 5mC and pulls down the methylated fraction of the genome
- More sensitive to highly methylated, intermediate-CpG-density regions
Methyl-binding domain protein (MBD-seq)
- Uses the affinity of the methyl-binding proteins MeCP2 or MBD2 for methylated CpGs
- More sensitive to highly methylated, high-CpG-density regions
Methods. 2010 Nov;52(3):203-12

Immunoprecipitation-based methods
- Straightforward, and the data are relatively easy to analyze
- Bias associated with CpG density requires adjustment: regions of high (MBD) or intermediate (MeDIP) CpG density will be interpreted as "more methylated" than equally methylated low-CpG-density regions
- Low resolution: no information on individual CpG dinucleotides
Methods. 2010 Nov;52(3):203-12


Array-based genome wide DNA methylation analysis & restriction endonuclease
- One pool of genomic DNA is digested with a methylation-sensitive restriction enzyme and another pool is mock-digested, or the two pools are digested with two different enzymes
- The two DNA pools are amplified and labelled with different fluorescent dyes for two-color array hybridization
Nat Rev Genet. 2010 Feb 2;11(3):191-203

Array-based genome wide DNA methylation analysis & restriction endonuclease
Comprehensive high-throughput arrays for relative methylation (CHARM)
- McrBC, which cuts methylated DNA, is used to fractionate unmethylated DNA
- Methyl-depleted DNA is labelled with Cy5 and total DNA with Cy3
- Both are hybridized on high-density arrays
Genome Res. 2008 May;18(5):780-90

Array-based genome wide DNA methylation analysis & restriction endonuclease
HpaII tiny fragment enrichment by ligation-mediated PCR (HELP)
- Genomic DNA is digested with HpaII (cuts unmethylated DNA only) and MspI (cuts regardless of methylation)
- HpaII and MspI genomic restriction fragments are amplified by ligation-mediated PCR
- HpaII-amplified DNA is labelled with Cy5 and MspI-amplified DNA with Cy3
- Array hybridization
Genome Res. 2006 Aug;16(8):1046-55

Array-based genome wide DNA methylation analysis & methylation immunoprecipitation
- Methylated fragments are enriched using a 5mC antibody or the affinity of methyl-binding proteins
- Input DNA and enriched DNA are labeled with different fluorescent dyes
- Array hybridization
Nat Rev Genet. 2010 Feb 2;11(3):191-203

Array-based genome wide DNA methylation analysis & methylation immunoprecipitation
Figure source: "Methylated DNA immunoprecipitation", Wikipedia, the free encyclopedia

Array-based genome wide DNA methylation analysis & bisulfite conversion ILLUMINA® EPIGENETIC ANALYSIS

Array-based genome wide DNA methylation analysis & bisulfite conversion
27,578 CpG sites covering 14,495 protein-coding gene promoters and 110 microRNA gene promoters
Nat Rev Genet. 2010 Feb 2;11(3):191-203

Array-based genome wide DNA methylation analysis & bisulfite conversion Genome Res. 2006 Mar;16(3):383-93

Array-based genome wide DNA methylation analysis & bisulfite conversion
- GoldenGate BeadArray: 1,536 specific CpG sites in 371 genes
- GoldenGate Methylation Cancer Panel I: 1,505 CpG sites selected from 807 genes
Illumina® Epigenetics Analysis; Nat Rev Genet. 2010 Feb 2;11(3):191-203

Array-based genome wide DNA methylation analysis
- Experiments are easy to perform
- Data are easy to interpret with many well-characterized software programs
- Low resolution
- Hybridization-based methods cannot easily distinguish one repetitive element from another
- Not truly genome-wide


NGS based genome-wide DNA methylation analysis Biotechniques. 2010 Oct;49(4):iii-xi

NGS based genome-wide DNA methylation analysis-ROCHE 454
Roche/454 massively parallel bisulfite pyrosequencing
- Reads include more CpG sites, facilitating research on complex methylation patterns
- Reads align to the reference more easily and more accurately, especially in repetitive regions
- Greater chance of covering genotype information (SNPs) adjacent to cytosines
- Relatively high sequencing cost
- Higher error rates in calling runs of identical bases
Genes 2010, 1(1), 85-101; doi:10.3390/genes1010085

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA
Methyl-seq
- Digestion with one enzyme that cuts unmethylated DNA and one that cuts regardless of methylation
- Fragments of ~100-350 bp
- Sequencing on the Illumina Genome Analyzer II
Genome Res. 2009 Jun;19(6):1044-56

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA
Methyl-sensitive cut counting (MSCC)
The method is similar to Methyl-seq; however, sequencing of MspI libraries was reported to have little effect on the measurement of methylation and was dropped to reduce costs.
Genome Med. 2009 Nov 16;1(11):106; Nat Biotechnol. 2009 Apr;27(4):361-8

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA methyl-DNA immunoprecipitation (MeDIP) seq Methods. 2009 Mar;47(3):142-50

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA
Reduced representation bisulfite sequencing (RRBS), Illumina Genome Analyzer
Nucleic Acids Research, 2005, Vol. 33, No. 18; Nature. 2008 Aug 7;454(7205):766-70; Nat Methods. 2010 Feb;7(2):133-6

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA Bisulfite padlock probes (BSPPs) Nat Biotechnol. 2009 Apr;27(4):353-60

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA Bisulfite sequencing (BS-seq) Nature. 2008 Mar 13;452(7184):215-9

NGS based genome-wide DNA methylation analysis-Illumina/SOLEXA Cytosine methylome sequencing (MethylC-seq) Cell. 2008 May 2;133(3):523-36 Nature. 2009 Nov 19;462(7271):315-22 Nature. 2011 Mar 3;471(7336):68-73


Third generation sequencing based genome-wide DNA methylation analysis-PacBio single-molecule, real-time sequencing (SMRT) ZMW: zero mode waveguide Nat Biotechnol. 2010 May;28(5):426-8

Third generation sequencing based genome-wide DNA methylation analysis-PacBio single-molecule, real-time sequencing (SMRT) Nat Methods. 2010 Jun;7(6):461-5 Nat Methods. 2010 Jun;7(6):435-7

Third generation sequencing based genome-wide DNA methylation analysis-Oxford Nanopore Oxford Nanopore Technologies Nat Biotechnol. 2010 May;28(5):426-8


Illumina BS-seq data manipulation
- FASTQ file format and PHRED score
- Adaptor trimming with FASTX
- Quality control with FastQC
- Reads filter and trimming with FASTX
- Reads mapping with Bismark
- Basic analysis
- Advanced analysis and application


Illumina BS-seq data manipulation
FASTQ file format
FASTQ has emerged as a common file format for sharing sequencing read data, combining both the sequence and an associated per-base quality score.
Nucleic Acids Research, 2010, Vol. 38, No. 6 1767–1771
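To make the format concrete, here is a minimal reader sketch, assuming the common four-lines-per-record layout (header starting with `@`, sequence, separator starting with `+`, quality string). The function name `read_fastq` and the two example reads are hypothetical; multi-line FASTQ records are not handled.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality_string) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record: " + header)
        if len(seq) != len(qual):
            raise ValueError("sequence/quality length mismatch: " + header)
        yield header[1:], seq, qual


# Two made-up reads, each with one quality character per base.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\n!!5I\n")
records = list(read_fastq(example))
for name, seq, qual in records:
    print(name, seq, qual)
```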

Illumina BS-seq data manipulation PHRED score Nature. 2009 Nov 19;462(7271):315-22 Nucleic Acids Research, 2010, Vol. 38, No. 6 1767–1771

Illumina BS-seq data manipulation PHRED score http://en.wikipedia.org/wiki/FASTQ_format#cite_note-Illumina_User_Guide_1.5-2
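A PHRED score Q relates to the estimated base-calling error probability P by Q = -10 log10(P), and FASTQ stores each score as one ASCII character with a fixed offset. The sketch below assumes the offset of 33 that the `-Q 33` option in the FASTX examples refers to; older Illumina pipelines used an offset of 64. The function names are hypothetical.

```python
import math


def phred_to_error_prob(q):
    """Error probability P for a PHRED score Q, from Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """PHRED score Q for an error probability P."""
    return -10 * math.log10(p)


def decode_qualities(qual_string, offset=33):
    """Decode a FASTQ quality string into integer PHRED scores.

    offset=33 is assumed here; pass offset=64 for older Illumina encodings.
    """
    return [ord(ch) - offset for ch in qual_string]


scores = decode_qualities("!5I")  # ASCII codes 33, 53, 73
print(scores)
print([phred_to_error_prob(q) for q in scores])
```

With this encoding, Q20 corresponds to a 1% chance that the base call is wrong and Q30 to a 0.1% chance.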


Illumina BS-seq data manipulation adaptor trimming with FASTX Nature. 2009 Nov 19;462(7271):315-22

Illumina BS-seq data manipulation adaptor trimming with FASTX http://hannonlab.cshl.edu/fastx_toolkit/index.html

Illumina BS-seq data manipulation adaptor trimming with FASTX http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastx_clipper_usage


Illumina BS-seq data manipulation Quality control with FastQC http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/



Illumina BS-seq data manipulation
Reads filter and trimming with FASTX
e.g. 1
    fastq_quality_filter -Q 33 -q 20 -p 100 -v -i input -o output
e.g. 2
    fastq_quality_filter -q 10 -p 100 -i /usr/local/data/GBS/OWB-RAD1.fastq -Q 33 | fastq_quality_filter -Q 33 -q 20 -p 80 -o OWB1-filt.fastq
http://hannonlab.cshl.edu/fastx_toolkit/commandline.html#fastq_quality_filter_usage
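The filtering rule behind these commands can be sketched in a few lines: a read is kept when at least `-p` percent of its bases have a quality of at least `-q`. The function below is an illustrative re-implementation under that reading of the options, not the FASTX source code; the name `passes_quality_filter` is hypothetical and an ASCII offset of 33 is assumed.

```python
def passes_quality_filter(qual_string, min_quality=20, min_percent=100, offset=33):
    """Return True if enough bases reach the minimum quality.

    Mirrors the intent of `fastq_quality_filter -q min_quality -p min_percent`
    (a sketch, not the tool itself).
    """
    scores = [ord(ch) - offset for ch in qual_string]
    if not scores:
        return False  # assumption: empty reads are dropped
    good = sum(1 for score in scores if score >= min_quality)
    return 100.0 * good / len(scores) >= min_percent


# 'I' encodes Q40, '5' encodes Q20, '4' encodes Q19 at offset 33.
print(passes_quality_filter("IIII"))                  # all bases at Q40
print(passes_quality_filter("III4"))                  # one base below Q20
print(passes_quality_filter("III4", min_percent=75))  # 3 of 4 bases pass
```

Chaining two filters, as in the second example, keeps only reads that pass both criteria: every base at Q10 or better and at least 80% of bases at Q20 or better.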

Illumina BS-seq data manipulation
Reads filter and trimming with FASTX
FASTQ quality trimmer
e.g. 1
    fastq_quality_trimmer -t 20 -l 35 -v -i input -o output
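As an illustration of what the trimmer options mean, the sketch below removes bases from the end of a read while their quality is below the threshold (`-t`) and discards reads that end up shorter than the minimum length (`-l`). It is a simplified re-implementation under that reading of the options, not the FASTX code; `trim_read` is a hypothetical name and an ASCII offset of 33 is assumed.

```python
def trim_read(seq, qual_string, threshold=20, min_length=35, offset=33):
    """Trim low-quality bases from the 3' end of a read.

    Returns (trimmed_seq, trimmed_qual), or None if the trimmed read
    is shorter than min_length.
    """
    end = len(seq)
    # Walk back from the 3' end while the base quality is below the threshold.
    while end > 0 and ord(qual_string[end - 1]) - offset < threshold:
        end -= 1
    if end < min_length:
        return None
    return seq[:end], qual_string[:end]


# '#' encodes Q2 at offset 33, so the last two bases are trimmed.
print(trim_read("ACGTAC", "IIII##", threshold=20, min_length=3))
print(trim_read("ACGTAC", "IIII##", threshold=20, min_length=5))
```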



Illumina BS-seq data manipulation Reads mapping with Bismark Bioinformatics. 2011 Jun 1;27(11):1571-2.

Illumina BS-seq data manipulation Reads mapping with Bismark Two computationally converted references (C-to-T and G-to-A) Bioinformatics. 2011 Jun 1;27(11):1571-2.
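The idea of aligning against computationally converted references can be sketched as follows: the reference is converted once with every C replaced by T and once with every G replaced by A, and reads are converted the same way before alignment, so that a match no longer depends on methylation state. This is a toy illustration of the principle, not Bismark's implementation; the function names and the example sequences are hypothetical.

```python
def convert_reference(seq):
    """Return the (C-to-T, G-to-A) converted versions of a reference sequence."""
    seq = seq.upper()
    return seq.replace("C", "T"), seq.replace("G", "A")


def convert_read_c_to_t(read):
    """Fully convert a read C-to-T, as done before aligning it to the C-to-T reference."""
    return read.upper().replace("C", "T")


genome = "ACGTCCGA"
c_to_t_genome, g_to_a_genome = convert_reference(genome)

# A bisulfite read from this locus in which only the first C was methylated.
read = "ACGTTTGA"
print(convert_read_c_to_t(read) == c_to_t_genome)

# After alignment, the original read is compared with the original genome:
# a C in the read at a genomic C is a methylated call, a T is unmethylated.
calls = [
    (pos, "methylated" if r == "C" else "unmethylated")
    for pos, (g, r) in enumerate(zip(genome, read))
    if g == "C"
]
print(calls)
```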


Illumina BS-seq data manipulation
Reads mapping with Bismark
Table: per-cytosine methylation calls. Columns: chromosome, position, strand, context (CG, CHG or CHH; H = A, C or T), mC, All C.
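The context column (CG, CHG or CHH, with H = A, C or T) depends only on the two bases that follow the cytosine on the same strand. A minimal sketch of that classification, with a hypothetical function name and the assumption that the sequence passed in is the strand carrying the cytosine, read 5' to 3':

```python
def cytosine_context(seq, pos):
    """Classify the cytosine at 0-based position pos as CG, CHG or CHH.

    Returns None when the context cannot be determined (too close to the
    end of the sequence, or an ambiguous base such as N follows).
    """
    seq = seq.upper()
    if seq[pos] != "C":
        raise ValueError("position does not hold a cytosine")
    following = seq[pos + 1:pos + 3]
    if following[:1] == "G":
        return "CG"
    if len(following) < 2 or any(base not in "ACGT" for base in following):
        return None
    # The first following base is H (A, C or T) at this point.
    return "CHG" if following[1] == "G" else "CHH"


print(cytosine_context("ACGT", 1))
print(cytosine_context("CAGT", 0))
print(cytosine_context("CTAG", 0))
```

For a cytosine on the minus strand, the same function can be applied to the reverse complement of the reference.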


Illumina BS-seq data manipulation Basic analysis-Reads coverage

Illumina BS-seq data manipulation Basic analysis-Reads depth

Illumina BS-seq data manipulation Basic analysis-Reads depth percentage

Illumina BS-seq data manipulation
Basic analysis - Methylation level
Table: per-cytosine methylation level. Columns: chromosome, position, strand, context (CG, CHG or CHH; H = A, C or T), mC, All C, methylation level.
Methylation level = mC / All C. For example, mC = 5 and All C = 6 give 83.3%, and mC = 2 and All C = 12 give 16.7%.
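The methylation level column is the mC count divided by the All C count at that position, expressed as a percentage. A small sketch reproducing two of the values shown on the slide (5 of 6 and 2 of 12); the function name is hypothetical, and returning None for uncovered positions is an assumption:

```python
def methylation_level(methylated, total):
    """Methylation level in percent: mC count divided by all C count."""
    if total == 0:
        return None  # assumption: no coverage means no estimate
    return 100.0 * methylated / total


for methylated, total in [(5, 6), (2, 12)]:
    level = methylation_level(methylated, total)
    print(f"{methylated}/{total} = {level:.1f}%")
```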

Illumina BS-seq data manipulation Basic analysis-Methylation density H=A, C or T


Illumina BS-seq data manipulation
Advanced analysis and application
DNA methylation and gene expression
DNA methylation is linked to gene silencing and is considered an important mechanism in the regulation of gene expression.
Gene expression can be measured with:
- Gene expression microarray
- RNA-seq

Illumina BS-seq data manipulation
Advanced analysis and application
DNA methylation and gene expression
- Proximal TSS: -150 bp to +150 bp around the TSS
- Promoter: 1.5 kb upstream of the TSS
Nature. 2009 Nov 19;462(7271):315-22
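The two regions can be derived from a gene's TSS position and strand. The sketch below assumes 1-based inclusive coordinates and, as a further assumption, a promoter window that ends just before the TSS; "upstream" means lower coordinates on the plus strand and higher coordinates on the minus strand. The function name `tss_regions` and the example TSS are hypothetical.

```python
def tss_regions(tss, strand, flank=150, upstream=1500):
    """Return proximal-TSS and promoter windows as (start, end), 1-based inclusive.

    proximal TSS: tss - flank .. tss + flank
    promoter:     the `upstream` bases preceding the TSS on the gene's strand
    """
    proximal = (max(1, tss - flank), tss + flank)
    if strand == "+":
        promoter = (max(1, tss - upstream), tss - 1)
    elif strand == "-":
        promoter = (tss + 1, tss + upstream)
    else:
        raise ValueError("strand must be '+' or '-'")
    return {"proximal_tss": proximal, "promoter": promoter}


print(tss_regions(10000, "+"))
print(tss_regions(10000, "-"))
```

Averaging the per-cytosine methylation levels that fall inside each window gives one value per gene and region, which can then be compared with that gene's expression.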

Illumina BS-seq data manipulation Advanced analysis and application DNA methylation and gene expression Genome Res. 2010 Mar;20(3):320-31.

Illumina BS-seq data manipulation
Advanced analysis and application
- Differentially methylated regions (DMRs) and gene expression
- DNA methylation at DNA–protein interaction sites
- DNA methylation, miRNA, and histone modification
- ……
Genome Res. 2010 Mar;20(3):320-31; Nature. 2009 Nov 19;462(7271):315-22

Thank you!