Functional Genomics with Next-Generation Sequencing Jen Taylor Bioinformatics Team CSIRO Plant Industry.


1 Functional Genomics with Next-Generation Sequencing Jen Taylor Bioinformatics Team CSIRO Plant Industry

2 CSIRO. INI Meeting July Tutorial - Applications Capacity and Resolution Next generation sequencing Increasing capacity leads to increased resolution Eric Lander, Broad Institute

3 How a Genome Works? Parts: description, function?, interconnectedness? Comparisons: population-level, between genomes

4 Application domains: Reference genome; No reference genome: Partially sequenced or UNsequenced (“PUN genomes”)

5 Impact of a Reference Genome Sequence Data Alignment Read Density Characterisation Genome Assembly Contigs

6 Applications of Next Generation Sequencing. Profiling of variation: genetic variation, transcript variation, epigenetic variation, metagenomic variation. Discovery: novel genomes, novel genes, novel transcripts, small / long non-coding RNA. RNA Sequencing (RNASeq): coding and non-coding transcript profiling; dynamic and context dependent. Epigenomics: genome-wide protein-DNA interactions, DNA modifications; heritable and reversible regulation of gene expression. Today

7 RNASeq. Qualitative – transcript diversity. Quantitative – transcript abundance. Impact of NGS: observation of transcript complexity, transcript discovery, small / long non-coding RNA. Analytical challenges: transcript complexity, compositional properties.

8 RNASeq. Sample: total RNA, polyA RNA, small RNA. Library construction. Sequencing. Base calling & QC. Reference: mapping to genome. PUN: assembly to contigs. Analysis: digital “counts”, reads per kilobase per million mapped reads (RPKM), transcript structure, secondary structure, targets or products.
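
The RPKM measure named on this slide can be written out as a short calculation. This is a minimal sketch using the standard definition (read count divided by transcript length in kilobases and by total mapped reads in millions); the function name and the example numbers are illustrative and are not taken from the presentation.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical transcript: 500 reads on a 2 kb transcript,
# in a library with 10 million mapped reads.
example = rpkm(500, 2_000, 10_000_000)
print(example)
```

Dividing by length and library size makes counts comparable across transcripts and across runs, but it does not remove the compositional effects discussed on the later slides.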

9 RNASeq – Transcript Complexity. Mapping: reads with multiple locations (conserved domains? sequencing error?); reads spanning exons (gapped alignments? sequencing error?). ERANGE pipeline: Mortazavi et al., Nature Methods, vol. 5, no. 7, July 2008.

10 RNASeq – Compositional properties. Depth of sequence: sequence count ≈ transcript abundance. The majority of the data can be dominated by a small number of highly abundant transcripts. The ability to observe transcripts of lower abundance depends on sequence depth.
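
The dependence of detection on depth can be made concrete with a small calculation. This sketch assumes, as a simplification not stated on the slide, that each read is drawn independently and lands on a given transcript with probability equal to that transcript's share of the library; the function name is hypothetical.

```python
import math


def detection_probability(abundance_fraction, total_reads):
    """Probability of seeing at least one read from a transcript.

    Computes 1 - (1 - f)^N in a numerically stable way, assuming
    independent reads (a simplifying assumption for illustration).
    """
    return -math.expm1(total_reads * math.log1p(-abundance_fraction))


# A transcript making up one millionth of the library.
for depth in (100_000, 1_000_000, 10_000_000):
    print(depth, round(detection_probability(1e-6, depth), 3))
```

Under this model a rare transcript is more likely missed than seen at shallow depth and becomes almost certain to appear only once the depth is several times the reciprocal of its abundance fraction.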

11 RNASeq – Compositional properties. Composition: sequence counts are a composition of a fixed number of total sequence reads. Therefore they are sum-constrained and not independent. Large variations in component numbers and sizes can produce artefacts. Figure panels: True, Reads, RPKM.
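
A toy example shows how the sum constraint produces an artefact. The gene names, abundances and read total below are invented for illustration, and reads are allocated in exact proportion to abundance, with no sampling noise.

```python
def expected_reads(true_abundance, total_reads):
    """Share a fixed number of reads in proportion to true abundance."""
    total = sum(true_abundance.values())
    return {gene: total_reads * a / total for gene, a in true_abundance.items()}


# Hypothetical samples: only gene C changes (10-fold up) between conditions.
condition_1 = {"A": 100, "B": 100, "C": 100}
condition_2 = {"A": 100, "B": 100, "C": 1000}

reads_1 = expected_reads(condition_1, 1_200_000)
reads_2 = expected_reads(condition_2, 1_200_000)

# Gene A is unchanged in truth, yet its read count falls.
apparent_change_a = reads_2["A"] / reads_1["A"]
print(apparent_change_a)
```

Because the read total is fixed, the increase in C takes reads away from A and B, so both appear to be down-regulated although their true abundance did not change.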

12 RNASeq – Correspondence. Good correspondence with: expression arrays, tiling arrays, qRT-PCR. Range of up to 5 orders of magnitude. Better detection of low abundance transcripts. Greater power to detect: transcript sequence polymorphism, novel trans-splicing, paralogous genes, individual cell type expression.

13 Reference Genome - RNASeq

14 Reference Genome - RNASeq. Human Exome. Number of exons targeted: ~180,000 (CCDS database) plus 700+ miRNA (Sanger v13), 300+ ncRNA

15 Epigenome. Protein-DNA interactions [ChIPSeq]: nucleosome positioning, histone modification, transcription factor interactions. Methylation [MethylSeq]. Impact of NextGen: whole genome profiling, resolution. Analytical challenges: systematic bias, unambiguous mapping, robust event calling. Image: ClearScience

16 ChIPSeq MNase Linker Digest Sequence & Align Remove Nucleosomes

17 ChIPSeq MNase Digest Sequence & Align Remove Nucleosomes

18 ChIPSeq methods (Pepke et al., 2009): CisGenome, ERANGE, FindPeaks, F-Seq, GLITR, MACS, PeakSeq, QuEST

19 MethylSeq using Bisulfite conversion. Cytosine → bisulfite conversion → Uracil → PCR → Thymine. 5-methylcytosine → bisulfite conversion → unchanged → PCR → Cytosine.

20 Limited publications from BS-Seq. Mammals: methylation predominantly occurs at CpG sites; several publications in human; one publication in mouse. Plants: methylation occurs at CG, CHH and CHG sites (H = A, C or T); two publications in Arabidopsis.
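
The three sequence contexts can be told apart by looking at the two bases that follow each cytosine, with H standing for any base other than G. The sketch below classifies cytosines on one strand only; the function name and the example sequence handling are illustrative, and cytosines too close to the end of the sequence to have a full context are left unclassified.

```python
def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the cytosine at index i, else None."""
    if seq[i] != "C" or i + 1 >= len(seq):
        return None
    if seq[i + 1] == "G":
        return "CG"
    if i + 2 >= len(seq):
        return None  # not enough sequence to separate CHG from CHH
    return "CHG" if seq[i + 2] == "G" else "CHH"


watson = "ACGTTCTCCAGTC"
contexts = {i: cytosine_context(watson, i)
            for i, base in enumerate(watson) if base == "C"}
print(contexts)
```

The other strand would be handled the same way after taking the reverse complement.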

21 Problems of mapping BS-seq reads: reduced sequence complexity. Cm = methylated C; C = unmethylated C. Watson >>A Cm G T T C T C C A G T C>> Bisulfite conversion >>A Cm G T T T T T T A G T T>> >>A C G T T T T T T A G T T>>
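
The conversion on this slide can be reproduced in silico. This is a minimal sketch: the function name is hypothetical, methylated positions are passed as 0-based indices, and conversion is assumed to be complete, which real experiments only approximate.

```python
from collections import Counter


def bisulfite_convert(seq, methylated_positions=()):
    """Turn every unmethylated C into T; methylated Cs are read as C."""
    protected = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq)
    )


watson = "ACGTTCTCCAGTC"
converted = bisulfite_convert(watson, methylated_positions=[1])
print(converted)
print(Counter(watson), Counter(converted))
```

The base counts show the loss of complexity: the converted read is dominated by T, so it matches many more places in a converted genome than the original sequence would in the unconverted one.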

22 Problems of mapping BS-seq reads Increased search space Watson >> A C m G T T C T C C A G T C >> Crick << T G C m A A G A G G T C A G << BSW >> AC m GTTTTTTAGTT >> BSC << TGC m AAGAGGTTAG << Bisulfite conversion BSW >> AC m GTTTTTTAGTT >> BSWR > BSCR >> ACG TTCTCCAAGA >> BSC << TGC m AAGAGGTTAG << PCR
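
The four read populations named on this slide (BSW, BSC and their PCR complements BSWR, BSCR) can be derived from the Watson example. This is a sketch under stated assumptions: all strings are written left to right in the same orientation as the Watson strand, as on the slide; the CpG at index 1-2 is taken to be methylated on both strands; and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Base-by-base complement, kept in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


def bisulfite_convert(seq, methylated_positions=()):
    """Turn every unmethylated C into T; methylated Cs are read as C."""
    protected = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq)
    )


watson = "ACGTTCTCCAGTC"
crick = complement(watson)

bsw = bisulfite_convert(watson, methylated_positions=[1])  # methylated C at index 1
bsc = bisulfite_convert(crick, methylated_positions=[2])   # methylated C at index 2
bswr = complement(bsw)  # PCR copy of the converted Watson strand
bscr = complement(bsc)  # PCR copy of the converted Crick strand

for name, seq in [("BSW", bsw), ("BSC", bsc), ("BSWR", bswr), ("BSCR", bscr)]:
    print(name, seq)
```

After conversion the two strands are no longer complementary to each other, so a read may originate from any of four distinct sequences per locus, which is the increase in search space the slide refers to.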

23 ELAND. Mapping reads to genome sequences: reads are mapped to two converted genome sequences; reads mapping to multiple positions in the converted genomes are cross-matched; mapping results are combined to generate methylation information. Eland only allows 2 mismatches. Lister et al. Cell (2008)
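
The idea of mapping against a converted genome and then recovering methylation calls can be sketched with exact string matching. This is a toy illustration for one strand only, not ELAND and not the pipeline of Lister et al.: the function names are hypothetical, no mismatches are tolerated, and reads with more than one hit are simply discarded.

```python
def c_to_t(seq):
    """Fully convert a sequence by replacing every C with T."""
    return seq.replace("C", "T")


def map_and_call(read, genome):
    """Map a bisulfite read to the C->T genome and call methylation.

    Returns (position, calls) for a uniquely mapping read, where calls
    maps each genomic C covered by the read to True (read shows C, so
    methylated) or False (read shows T, so unmethylated). Returns None
    if the read maps nowhere or to more than one place.
    """
    converted_genome = c_to_t(genome)
    converted_read = c_to_t(read)
    hits = []
    start = converted_genome.find(converted_read)
    while start != -1:
        hits.append(start)
        start = converted_genome.find(converted_read, start + 1)
    if len(hits) != 1:
        return None
    pos = hits[0]
    calls = {
        pos + i: read_base == "C"
        for i, (read_base, ref_base) in enumerate(zip(read, genome[pos:]))
        if ref_base == "C"
    }
    return pos, calls


genome = "GGAACGTTCTCCAGTCAAG"
result = map_and_call("ACGTTTTTTAGTT", genome)
print(result)
```

Converting both read and genome removes the C/T difference during the search; the original, unconverted read is then compared with the original genome to decide which cytosines were protected.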

24 BSMAP. Based on a hash table seeding algorithm. Xi and Li, BMC Bioinformatics (2009)
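
Hash table seeding in general works by indexing every k-mer of the reference, looking up a short seed from each read, and verifying only the candidate positions. The sketch below is a generic seed-and-verify illustration adapted to bisulfite reads; it is not BSMAP's actual implementation, and the function names, the seed length and the mismatch limit are all assumptions.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Hash table from each k-mer of the C->T converted reference to its positions."""
    converted = reference.replace("C", "T")
    index = defaultdict(list)
    for i in range(len(converted) - k + 1):
        index[converted[i:i + k]].append(i)
    return index


def bs_mismatches(read, ref_segment):
    """Count mismatches, treating a read T over a reference C as a match."""
    return sum(
        1 for r, g in zip(read, ref_segment)
        if r != g and not (r == "T" and g == "C")
    )


def seed_and_verify(read, reference, index, k, max_mismatches=2):
    """Look up the read's first k bases, then verify each candidate position."""
    seed = read[:k].replace("C", "T")
    hits = []
    for pos in index.get(seed, []):
        segment = reference[pos:pos + len(read)]
        if len(segment) == len(read) and bs_mismatches(read, segment) <= max_mismatches:
            hits.append(pos)
    return hits


reference = "GGAACGTTCTCCAGTCAAG"
index = build_seed_index(reference, k=4)
# Read with one genuine mismatch at its last base.
hits = seed_and_verify("ACGTTTTTTAGTA", reference, index, k=4)
print(hits)
```

The verification step compares the read with the unconverted reference, so unconverted cytosines in the read must still match a reference C, while converted ones are not penalised.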

25 Re-mapping of Lister’s data using BSMAP. Lister et al. Cell (2008). Raw reads: 144,704,372
Method | Uniquely mapped reads | Unique and nonclonal reads | Unique and nonclonal reads %
Eland | 55,805,931 | 39,113, | %
BSMAP | 67,975,425 | 48,498, | %
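
The uniquely mapped read counts on this slide can be turned into mapping rates. The sketch uses only the raw read total and the two uniquely mapped counts; the nonclonal counts and percentages are cut off in the transcript and are not reconstructed here.

```python
raw_reads = 144_704_372
uniquely_mapped = {"Eland": 55_805_931, "BSMAP": 67_975_425}

# Percentage of raw reads that each method maps to a unique position.
rates = {method: 100 * n / raw_reads for method, n in uniquely_mapped.items()}
gain = uniquely_mapped["BSMAP"] - uniquely_mapped["Eland"]

for method, rate in rates.items():
    print(f"{method}: {rate:.1f}% uniquely mapped")
print(f"Additional uniquely mapped reads with BSMAP: {gain:,}")
```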

26 Methylation pattern throughout chromosomes. Figure: Arabidopsis chromosome 3; panels CG, CHG and CHH, each with Watson and Crick strands; x-axis: position; y-axis: methylation level / 50 kb.
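
A per-window methylation level such as the one plotted here can be computed by binning cytosine calls along the chromosome. The slide does not define "methylation level", so this sketch assumes it is the fraction of reads supporting methylation among all reads covering cytosines in the window; the function name and the input tuples are hypothetical.

```python
from collections import defaultdict


def windowed_methylation(calls, window=50_000):
    """Methylation level per window, keyed by window start position.

    calls is an iterable of (position, methylated_reads, covering_reads).
    """
    methylated = defaultdict(int)
    covered = defaultdict(int)
    for position, methylated_reads, covering_reads in calls:
        start = (position // window) * window
        methylated[start] += methylated_reads
        covered[start] += covering_reads
    return {
        start: methylated[start] / covered[start]
        for start in sorted(covered)
        if covered[start] > 0
    }


# Invented calls for three cytosines.
calls = [(10, 8, 10), (40_000, 2, 10), (60_000, 5, 10)]
levels = windowed_methylation(calls)
print(levels)
```

Running this separately for CG, CHG and CHH cytosines on each strand would give one track per panel of the figure.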

27 Partially / Unsequenced Genomes. Options for dealing with partial or unsequenced genomes: wait for or generate the genome sequence; ‘borrow’ a reference genome from a phylogenetic neighbour; take a deep breath and ‘do denovo’. Denovo Genome Denovo Transcriptome DNA or RNA Sequence Data Partial Sequence Database Partial Assembly Gene Annotation Genetic Variation Non-coding RNA Transcript Variation

28 Plant Genomes – Haploid Size. Human, Arabidopsis, Rice, Potato, Sugarcane, Cotton, Barley, Wheat. Diameter proportional to haploid genome size

29 Plant Genomes – Total Size. Human, Cotton, Barley, Sugarcane, Wheat

30 Denovo RNA Seq. Why transcriptome? Large genomes with high repeat content are difficult to assemble; transcriptomes are more constant in size; enriched for functional content. Aims: transcript discovery, small / long non-coding RNA profiling. Analytical challenges: assembly (ABySS, Velvet, Euler-SR); comparisons between non-discrete, overlapping transcripts; annotation; ploidy.

31 Summary – Impacts and Challenges. RNASeq: increased resolution; increased power for transcript complexity and variation; analytical challenges – transcript complexity, compositional bias; large gains in small and long non-coding RNA profiling. Epigenomics (ChIPSeq and MethylSeq): genome-wide with resolution; robust event calling is challenging. Denovo transcriptomics: attractive option for large, repeat-rich genomes.

32 Acknowledgements. CSIRO PI Bioinformatics Team: Andrew Spriggs, Stuart Stephen, Emily Ying, Jose Robles, Michael James. CSIRO Biostatistics: David Lovell

