BS222 – Genome Science Lecture 8


1 BS222 – Genome Science Lecture 8
NGS applications. Part 1 Vladimir Teif

2 Module structure
Genomes, sequencing projects and genomic databases (VT) (Oct 9, 2018)
Sequencing technologies (VT) (Oct 11, 2018)
Genome architecture I: protein coding genes (VT) (Oct 16, 2018)
Genome architecture II: transcription regulation (VT) (Oct 18, 2018)
Genome architecture III: 3D chromatin organisation (VT) (Oct 23, 2018)
Epigenetics overview (PVW) (Oct 25, 2018)
DNA methylation and other DNA modifications (VT) (Oct 30, 2018)
NGS applications I: Experiments and basic analysis (VT) (Nov 1, 2018)
NGS applications II: Data integration (VT) (Nov 8, 2018)
Comparative genomics (JP, guest lecture) (Nov 13, 2018)
SNPs, CNVs, population genomics (LS, guest lecture) (Nov 15, 2018)
Histone modifications (PVW) (Nov 20, 2018)
Non-coding RNAs (PVW) (Nov 22, 2018)
Genome Stability (PVW) (Nov 27, 2018)
Transcriptomics (PVW) (Nov 29, 2018)
Year's best paper (PVW) (Dec 6, 2018)
Revision lecture (all lecturers; spring term)

3 NGS techniques vs NGS applications
NGS techniques: how to sequence DNA (or RNA) (covered in lecture 2; funny recap in this video)
NGS applications: how to design experiments in order to answer a specific biological question

4 Examples of NGS applications
Chromatin domains Hi-C Figure adapted from

5 Types of NGS applications
RNA-seq, GRO-seq, CAGE, SAGE, CLIP-seq, Drop-seq: gene expression; non-coding RNA
ChIP-seq, MNase-seq, DNase-seq, ATAC-seq, etc.: protein binding; histone modifications; chromatin accessibility; nucleosome positioning
Bisulfite sequencing: DNA methylation
Hi-C, 3C, 4C, ChIA-PET, etc.: chromatin loops
Amplicon sequencing: targeted regions; phylogenomics; metagenomics
Whole Genome Sequencing (WGS): de novo assembly (new species or new analyses)
Curated bibliography of *seq methods (~100 methods) can be found at

6

7 RNA-seq (RNA sequencing)

8 ChIP-seq (Chromatin Immunoprecipitation followed by sequencing)
1. Crosslink Protein-DNA complexes in situ
2. Isolate nuclei and fragment DNA (sonication or digestion)
3. Immunoprecipitate with antibody against target nuclear protein and reverse crosslinks
4. Release DNA and submit for sequencing
Adapted from

9 MNase-seq (Micrococcal Nuclease digestion followed by sequencing)
MNase = Micrococcal Nuclease (enzyme that cuts DNA between nucleosomes)
Teif et al. (2012), Methods, 62, 26-38

10 FAIRE-seq (Formaldehyde-Assisted Isolation of Regulatory Elements followed by sequencing)
Giresi et al. (2007), Genome Res. 17, 877–885

11 DNase-seq (DNase I digestion followed by sequencing)
Wang et al. (2012), PLoS ONE 7, e42414

12 ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing)
How transposase works: Buenrostro et al. (2013) Nat Methods. 10,

13 Methods for 1D genome mapping
Meyer & Liu, Nature Reviews Genetics 15, 709–721 (2014)

14 Methods for 1D genome mapping
Tsompana and Buck, Epigenetics & Chromatin 7:33 (2014)

15 NGS methods for DNA methylation
Bisulfite sequencing Affinity purification (e.g. MeDIP)

16 Chromatin Conformation Capture methods to map locations of DNA-DNA loops
Rao et al., Cell 159, 1665–1680 (2014)

17 Rivera and Ren (2013), Cell, 155, 39-55
Since 2017 DNA loops can be measured with 100-bp resolution (Bonev et al., Cell, 2017)

18 Timeline of NGS methods
Bulk methods that require many cells: Rivera and Ren (2013), Cell, 155, 39-55
Single-cell methods: Hu et al., Front. Cell Dev. Biol., 2018

19 Where to get NGS data?
Do your own experiment
Gene Expression Omnibus (GEO)
Sequence Read Archive (SRA)
European Nucleotide Archive
The Cancer Genome Atlas (TCGA)
Exome Aggregation Consortium (ExAC)
You also have to upload your data!

20 Next generation sequencing analysis

21 How to analyze NGS data?
Ask a bioinformatician: you need to explain what you want, and for that you need to understand what can be done and how
Do it yourself:
Command line -> become a bioinformatician
Online wrappers -> simpler, but file size limits
Example of a convenient online tool: Galaxy

22 ChIP-seq (Chromatin ImmunoPrecipitation followed by sequencing)
1. Crosslink Protein-DNA complexes in situ
2. Isolate nuclei and fragment DNA (sonication or digestion)
3. Immunoprecipitate with antibody against target nuclear protein and reverse crosslinks
4. Release DNA and submit for sequencing
Adapted from

23 Experiment Data analysis

24 ChIP-seq data analysis

25 Unmapped sequenced reads (this is “raw”, primary data):

26 Mapped reads are characterised by their locations in the genome
Bowtie, BWA, ELAND, Novoalign, BLAST, ClustalW
TopHat (for RNA-seq)

27 Reads can align to overlapping locations
We need to count all reads at each base pair
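
Counting reads at each base pair can be sketched as follows. This is a minimal illustration, not the method of any particular tool; the function name `per_base_coverage` and the convention that reads are given as 0-based, end-exclusive `(start, end)` pairs are assumptions made for the example.

```python
def per_base_coverage(reads, region_start, region_end):
    """Count how many reads cover each base in [region_start, region_end).

    `reads` is a list of (start, end) pairs, 0-based and end-exclusive
    (an assumed convention for this sketch).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (region_end - region_start + 1)
    for start, end in reads:
        s = max(start, region_start)  # clip the read to the region
        e = min(end, region_end)
        if s < e:
            diff[s - region_start] += 1
            diff[e - region_start] -= 1

    # A running sum of the difference array gives the coverage per base.
    coverage = []
    running = 0
    for delta in diff[:-1]:
        running += delta
        coverage.append(running)
    return coverage


# Three overlapping reads in a 10-bp region.
reads = [(0, 5), (3, 8), (4, 6)]
print(per_base_coverage(reads, 0, 10))  # [1, 1, 1, 2, 3, 2, 1, 1, 0, 0]
```

The resulting list of counts is the kind of signal track that a genome browser displays as a landscape.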

28 ChIP-seq landscapes depend on the protein
Park P. J., Nature Reviews Genetics, 2009

29 We can compare different experimental datasets for the same genomic region
5mC
Gifford et al., Cell 2013

30 We can compare different experimental conditions in a genome browser
Jung et al., NAR 2014
UCSC Genome Browser (online)
IGV (install on a local computer)

31 Systematic analysis requires identifying all peaks in all datasets and comparing the differences
Bardet et al. (2012) Nature Protocols, 7, 45-61

32 Peak calling is a method to identify areas in a genome enriched with aligned reads
Wilbanks EG (2010) PLoS ONE 5, e11471.

33 Peak calling: finding the peaks
Input: a sample prepared in the same way as the ChIP-seq sample but without the antibody, so it has no specific enrichment of our protein of interest
Pepke et al. (2009), Nature Methods, 6, S22–S32

34 Peak calling: defining statistical significance

35 Peak calling: defining statistical significance
MACS (good for TFs)
SICER (histones, etc)
HOMER (universal)
PeakSeq
edgeR
CisGenome
Is this peak statistically significant?
Park P. J., Nature Reviews Genetics, 2009
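
One simple way to ask whether a peak is statistically significant is to compare the ChIP read count in a window with the count expected from the Input control under a Poisson model. The sketch below illustrates that idea only; it is not the algorithm of any tool listed above, and the function names, the depth scaling and the pseudocount of 1 are assumptions.

```python
import math


def poisson_sf(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    # Floating-point rounding limits precision for very small p-values.
    return max(0.0, 1.0 - cdf)


def window_p_value(chip_count, input_count, chip_total, input_total):
    """P-value for seeing at least `chip_count` reads given the Input background."""
    # Assumption: scale the Input count to the ChIP sequencing depth and use
    # a pseudocount of 1 so that the expected background is never zero.
    lam = max(input_count, 1) * chip_total / input_total
    return poisson_sf(chip_count, lam)


# A window with 50 ChIP reads against 10 Input reads at equal depth.
print(window_p_value(50, 10, 1_000_000, 1_000_000))
# A window where ChIP and Input counts are the same.
print(window_p_value(10, 10, 1_000_000, 1_000_000))
```

The first window gives a very small p-value and the second does not, which is the distinction a peak caller has to make across the whole genome, usually followed by a correction for multiple testing.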

36

37 Important: peaks are just genomic regions

38 Genes are also genomic regions
DESeq, edgeR, Cuffdiff

39 DNA methylation: also genomic regions
Individual CpGs
Differentially methylated regions
DMRcaller
BISMARK

40 Any genomic regions can be intersected
BedTools (command line) Galaxy (online)
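
Intersecting two sets of genomic regions can be sketched in a few lines. This is an illustration of the idea rather than how BedTools is implemented; the `intersect` function and the example coordinates are hypothetical, and regions are assumed to be `(chrom, start, end)` tuples with 0-based, end-exclusive coordinates as in BED files.

```python
def intersect(regions_a, regions_b):
    """Return the overlapping parts of two lists of (chrom, start, end) regions."""
    # Group the second set by chromosome so that only regions on the
    # same chromosome are compared.
    by_chrom = {}
    for chrom, start, end in regions_b:
        by_chrom.setdefault(chrom, []).append((start, end))

    overlaps = []
    for chrom, a_start, a_end in regions_a:
        for b_start, b_end in by_chrom.get(chrom, []):
            start = max(a_start, b_start)
            end = min(a_end, b_end)
            if start < end:  # non-empty overlap
                overlaps.append((chrom, start, end))
    return overlaps


# Hypothetical peaks and genes.
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
genes = [("chr1", 150, 550), ("chr2", 200, 300)]
print(intersect(peaks, genes))  # [('chr1', 150, 200), ('chr1', 500, 550)]
```

This pairwise comparison is fine for small examples; for genome-scale data, dedicated tools that work on sorted intervals are much faster.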

41 We can calculate distribution of TF binding sites among different genomic features
Toropainen et al. (2016) Scientific Reports, 6, 33510

42 We can also calculate enrichments of binding sites of our TF in different genomic regions
Mattout et al., Genome Biology, 2015
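
One common way to express such an enrichment is the ratio between the observed fraction of peaks falling in a feature and the fraction expected if peaks were spread uniformly over the genome. The sketch below assumes this definition; the function name and all numbers are made up for illustration and do not come from the cited paper.

```python
def fold_enrichment(peaks_in_feature, total_peaks, feature_bp, genome_bp):
    """Observed fraction of peaks in a feature divided by the expected fraction."""
    observed = peaks_in_feature / total_peaks
    # Expected fraction if peaks were placed uniformly at random:
    # the share of the genome that the feature occupies.
    expected = feature_bp / genome_bp
    return observed / expected


# Made-up example: 300 of 1000 peaks fall in promoters, and promoters
# cover 30 Mb of a 3 Gb genome (1%).
print(round(fold_enrichment(300, 1000, 30_000_000, 3_000_000_000), 2))
```

A value above 1 means the feature contains more peaks than expected by chance, and a value below 1 means it is depleted.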

43 …Or study the DNA sequence inside the peaks to find some common motifs
HOMER, MEME Massie et al., EMBO J. (2011) 30, 2719–2733
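
The simplest version of looking for common motifs is to count short subsequences (k-mers) that recur across peak sequences. The sketch below shows only this counting step; HOMER and MEME use far more elaborate statistical models, and the function name `top_kmers` and the example sequences are hypothetical.

```python
from collections import Counter


def top_kmers(sequences, k, n=3):
    """Return the n k-mers that occur in the largest number of sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # Collect each distinct k-mer of this sequence once, so that a
        # repeat within a single peak is not counted several times.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts.most_common(n)


# Hypothetical peak sequences that share the 5-mer GGTCA.
peak_sequences = ["AAGGTCACC", "TTGGTCATT", "CCGGTCAGG"]
print(top_kmers(peak_sequences, k=5, n=1))  # [('GGTCA', 3)]
```

A real analysis would also compare these counts with background sequences, since some k-mers are frequent everywhere in the genome.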

44 What else can we do with peaks?
Compare two experimental conditions to see which peaks appear/disappear (e.g. protein binding gained/lost)
Compute associations of our protein with different genes (e.g. define which genes are regulated by this protein)
Study the DNA sequence inside the peaks (e.g. to find which other TFs co-bind with our protein of interest)
Look how our peaks are arranged with respect to other peaks (e.g. to check for interactions with other proteins)
etc

45 Take home message NGS data structure
NGS data are very large text files
NGS analysis needs “large” computers
MUST KNOW: NGS data structure
~100s of types of NGS experiments; we focus on ChIP-seq here
Where are NGS data stored? (GEO, etc)
RAW DATA; MAPPED READS; REGIONS; SITES
GENOME BROWSERS. PEAKS. PEAK CALLING
Optional video:

