User-friendly Galaxy interface and analysis workflows for deep sequencing data Oskari Timonen and Petri Pölönen.


1 User-friendly Galaxy interface and analysis workflows for deep sequencing data Oskari Timonen and Petri Pölönen

2 Structure of the hands-on session
Introduction (Petri)
Demo: Introduction to Galaxy (Oskari)
Demo: FastQC, read trimming (Petri)
Hands-on: FastQC, read trimming
–Online tutorial
Demo: Alignment (Oskari)
Hands-on: Alignment
–Online tutorial

3 Introduction
The idea of this tutorial is to:
–Get familiar with Galaxy
–Understand the main steps of ChIP-seq analysis
–Do each step manually in Galaxy
–Motivate you to do NGS analysis yourself
…but not to:
–Go through all the theory behind sequencing technology
–Explain all NGS analysis terms
–Go in depth into data analysis

4 NGS, reads, deep sequencing..?
Next-generation sequencing (NGS)
–Sequencing of DNA or RNA molecules
Read:
–A sequenced fragment of DNA or RNA, typically 30-60 nucleotides
Depth:
–The number of times a nucleotide is read during the sequencing process
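The definition of depth above can be made concrete with a small calculation. This is a minimal sketch, assuming the usual back-of-the-envelope estimate of average depth (number of reads times read length, divided by the length of the sequenced region). The function name `mean_depth` and the numbers are illustrative and not from the slides.

```python
def mean_depth(n_reads: int, read_length: int, region_length: int) -> float:
    """Average number of times each base in the region is read.

    Assumes reads are spread evenly over the region, which real data
    (especially ChIP-seq) does not do; this is only a rough estimate.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")
    return n_reads * read_length / region_length


# Hypothetical numbers: 2 million reads of 50 nt over a 100 Mb region.
depth = mean_depth(n_reads=2_000_000, read_length=50, region_length=100_000_000)
print(depth)  # 1.0, i.e. each base is read once on average
```

In ChIP-seq the reads pile up at binding sites, so the local depth at a peak is far above this average, which is what peak calling later exploits.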

5 Gene regulation and genome-wide data
[Figure; labels: Reference genome, GRO-seq, RNA-seq, TF ChIP-seq, TF motif, SNP from exome sequencing, Histone marker ChIP-seq, DNase-seq, Input ChIP-seq. TF = transcription factor]

6 NGS analysis pipeline:
ChIP-seq
–Peak calling (Sami Heikkinen)
–Motif detection and functional enrichment (Minna Kaikkonen)
RNA-seq (Eija Korpelainen)
–Isoforms and differential expression
–Functional enrichment
Exome-seq (Patrick May)
–Variant calling and visual exploration
–Variant annotation
–Variant prioritization
Common steps:
–FASTQ data and quality control
–Read trimming and filtering
–Alignment and visualization
(visualization, integrative analysis, machine learning, clustering etc.)

7 How did we get the data?
GEO accession: GSE31477, Homo sapiens
We used the SRA database to download ENCODE data:
Command line tool:
prefetch SRR353507 (input)
prefetch SRR340079 (TCF7L2 ChIP-seq)
… or manually from the SRA database:
SRR353507 (http://www.ncbi.nlm.nih.gov/sra/?term=SRR353507)
SRR340079 (http://www.ncbi.nlm.nih.gov/sra/?term=SRR340079)
Converted SRA to FASTQ format (SRA Toolkit, fastq-dump tool):
fastq-dump path-to-file/file1-2
Extracted only chr3 results to make the analysis steps faster
→ You will start with TCF7L2 ChIP-seq and input ChIP-seq chr3 data in FASTQ format
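The download and conversion steps above could be scripted roughly as follows. This is a sketch under assumptions: it passes the accession directly to `fastq-dump` instead of the file path used on the slide, it expects the SRA Toolkit to be on the PATH, and the helper names `build_commands` and `fetch_all` are made up for illustration. By default it only prints the commands (dry run), and it does not cover the chr3 extraction step, which the slides do not describe.

```python
import subprocess

# Accessions from the slide: input control and TCF7L2 ChIP-seq.
ACCESSIONS = {
    "SRR353507": "input",
    "SRR340079": "TCF7L2 ChIP-seq",
}


def build_commands(accession: str) -> list:
    """Return the two SRA Toolkit commands for one accession."""
    return [
        ["prefetch", accession],    # download the .sra file
        ["fastq-dump", accession],  # convert it to FASTQ
    ]


def fetch_all(dry_run: bool = True) -> list:
    """Build (and optionally run) the commands for every accession."""
    commands = []
    for accession in ACCESSIONS:
        for cmd in build_commands(accession):
            commands.append(cmd)
            if not dry_run:
                # Requires the SRA Toolkit to be installed and on the PATH.
                subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in fetch_all(dry_run=True):
        print(" ".join(cmd))
```

Calling `fetch_all(dry_run=False)` would actually run the tools; the full runs are large, which is why the slides restrict the hands-on data to chr3.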

8 What is Galaxy? http://galaxyproject.org/

9 Today’s hands-on session – Galaxy
Log on to a computer with your UEFAD login credentials (or the GUEST credentials you have received)
Start a browser
Go to http://galaxy.uef.fi
Log on with your UEFAD credentials

10 Today’s hands-on session – Galaxy
Tutorial: http://scriptogr.am/ohofmann/chip-seq
Tools:
Quality check
1. NGS: QC and manipulation
–FastQC: Read QC
–Tips and what ”bad” data looks like: http://bioinfo-core.org/index.php/9th_Discussion-28_October_2010
Clean-up steps if required (read trimming and filtering)
Mapping to genome
2. NGS: Mapping
–Map with Bowtie for Illumina
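To illustrate what the clean-up step (read trimming and filtering) does to a read, here is a minimal sketch. It is not the algorithm used by FastQC or by any particular Galaxy tool. It assumes Phred+33 quality encoding, and the thresholds (`min_q=20`, `min_len=20`) and function names are illustrative choices.

```python
def phred_scores(quality: str, offset: int = 33) -> list:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_3prime(seq: str, quality: str, min_q: int = 20):
    """Remove bases from the 3' end until one has quality >= min_q."""
    scores = phred_scores(quality)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


def keep_read(seq: str, min_len: int = 20) -> bool:
    """Filter step: discard reads that became too short after trimming."""
    return len(seq) >= min_len


# Made-up read: eight good bases ('I' = Q40), then two poor ones ('#' = Q2, '!' = Q0).
trimmed_seq, trimmed_qual = trim_3prime("ACGTACGTAC", "IIIIIIII#!")
print(trimmed_seq, trimmed_qual)  # ACGTACGT IIIIIIII
```

Quality usually drops toward the 3' end of a read, which is why this sketch trims only from that side; real trimming tools offer more options, such as sliding windows and adapter removal.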

11 Today’s hands-on session – Galaxy
Tools:
Extra: visualization of genome mapping
Peak calling
3. NGS: Peaks Calling
–MACS
Workflow
4. Workflow

