User-friendly Galaxy interface and analysis workflows for deep sequencing data
Oskari Timonen and Petri Pölönen



Structure of the hands-on session
Introduction (Petri)
Demo: Introduction to Galaxy (Oskari)
Demo: FastQC, read trimming (Petri)
Hands-on: FastQC, read trimming
  – Online tutorial
Demo: Alignment (Oskari)
Hands-on: Alignment
  – Online tutorial

Introduction
The idea of this tutorial is to:
  – Get familiar with Galaxy
  – Understand the main steps of ChIP-seq analysis
  – Do each step manually in Galaxy
  – Motivate you to do NGS analysis yourself
…but not to:
  – Go through all the theory behind sequencing technology
  – Explain all NGS analysis terms
  – Go in depth into data analysis

NGS, reads, deep sequencing..?
Next-generation sequencing (NGS):
  – Sequencing of DNA or RNA molecules
Read:
  – Fragment of DNA or RNA that is sequenced, typically nucleotides
Depth:
  – The number of times a nucleotide is read during the sequencing process
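The definition of depth above can be made concrete with a small sketch. It assumes reads are already placed on a region as (start, length) pairs with 0-based starts; the function names and the input shape are illustrative, not part of the tutorial.

```python
def per_base_depth(reads, region_length):
    """Return a list where item i is the number of reads covering position i.

    `reads` is an iterable of (start, length) pairs with 0-based starts.
    This input shape is an assumption made for the example.
    """
    depth = [0] * region_length
    for start, length in reads:
        # Clip each read to the region so partial overlaps still count.
        for pos in range(max(start, 0), min(start + length, region_length)):
            depth[pos] += 1
    return depth


def mean_depth(reads, region_length):
    """Average depth: covered bases summed over the region / region length."""
    return sum(per_base_depth(reads, region_length)) / region_length


# Three 4-nucleotide reads placed on a 10-nucleotide region.
example_reads = [(0, 4), (2, 4), (3, 4)]
example_depth = per_base_depth(example_reads, 10)
print(example_depth)
print(mean_depth(example_reads, 10))
```

Position 3 is covered by all three reads, so its depth is 3, while the last three positions are not sequenced at all.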

Gene regulation and genome-wide data
Data shown: reference genome, GRO-seq, RNA-seq, TF ChIP-seq, TF motif, SNP from exome sequencing, histone marker ChIP-seq, DNase-seq, input ChIP-seq
TF = transcription factor

NGS analysis pipeline:
ChIP-seq
  – Peak calling (Sami Heikkinen)
  – Motif detection and functional enrichment (Minna Kaikkonen)
RNA-seq (Eija Korpelainen)
  – Isoforms and differential expression
  – Functional enrichment
Exome-seq (Patrick May)
  – Variant calling and visual exploration
  – Variant annotation
  – Variant prioritization
Common steps:
  – FASTQ data and quality control
  – Read trimming and filtering
  – Alignment and visualization (visualization, integrative analysis, machine learning, clustering, etc.)

How did we get the data?
GEO accession: GSE31477, Homo sapiens
We used the SRA database to download ENCODE data:
  – Command-line tool:
      prefetch SRR (input)
      prefetch SRR (TCF7L2 ChIP-seq)
  – … or manually from the SRA database: SRR353507( SRR ) SRR ( SRR340079)
Converted SRA to FASTQ format (SRA Toolkit, fastq-dump tool):
      fastq-dump path-to-file/file1-2
Extracted only chr3 results to make the analysis steps faster
→ You will start with TCF7L2 ChIP-seq and INPUT ChIP-seq chr3 data in FASTQ format
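The slide does not say how the chr3 subset was produced. One plausible route, shown here only as a sketch under that assumption, is to align the reads first and then keep the alignments whose reference name is chr3, writing them back out as FASTQ. The code relies on the standard SAM column layout (column 1 read name, column 2 flag, column 3 reference name, columns 10 and 11 sequence and quality); the function names and example lines are made up for illustration.

```python
# Translation table for complementing a DNA sequence.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def sam_line_to_fastq(line, chrom="chr3"):
    """Return a 4-line FASTQ record if the alignment is on `chrom`, else None."""
    if line.startswith("@"):
        return None  # SAM header line, not an alignment
    fields = line.rstrip("\n").split("\t")
    qname, flag, rname = fields[0], int(fields[1]), fields[2]
    seq, qual = fields[9], fields[10]
    if rname != chrom:
        return None
    if flag & 0x10:
        # Reverse-strand alignments are stored reverse-complemented in SAM,
        # so restore the orientation in which the read was sequenced.
        seq = seq.translate(COMPLEMENT)[::-1]
        qual = qual[::-1]
    return f"@{qname}\n{seq}\n+\n{qual}\n"


def extract_chromosome(sam_lines, chrom="chr3"):
    """Collect FASTQ records for all alignments on `chrom`."""
    records = (sam_line_to_fastq(line, chrom) for line in sam_lines)
    return [record for record in records if record is not None]


# Made-up SAM lines: one header, two chr3 alignments, one chr1 alignment.
example_sam = [
    "@HD\tVN:1.6",
    "read1\t0\tchr3\t100\t30\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tchr3\t200\t30\t4M\t*\t0\t0\tAACG\tIIH#",
    "read3\t0\tchr1\t300\t30\t4M\t*\t0\t0\tTTTT\tIIII",
]
chr3_records = extract_chromosome(example_sam)
print("".join(chr3_records), end="")
```

The sketch ignores secondary and supplementary alignments, which a real extraction would need to handle so that a read is not written twice.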

What is Galaxy?

Today’s hands-on session – Galaxy
Log on to a computer with your UEFAD login credentials (or the GUEST credentials you have received)
Start a browser
Go to the Galaxy site
Log on with your UEFAD credentials

Today’s hands-on session – Galaxy
Tutorial:
Tools:
Quality check
  1. NGS: QC and manipulation
    – FastQC: Read QC
    – Tips and what "bad" data looks like
Clean-up steps if required (read trimming and filtering)
Mapping to genome
  2. NGS: Mapping
    – Map with Bowtie for Illumina
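To show what the clean-up step does to a read, here is a minimal sketch of 3' quality trimming followed by a length filter. It assumes Phred+33 quality encoding (the usual encoding in current Illumina FASTQ files); the thresholds and function names are illustrative and are not the settings of any Galaxy tool.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores."""
    return [ord(ch) - offset for ch in quality_string]


def trim_3prime(seq, qual, min_quality=20):
    """Remove bases from the 3' end until one has quality >= min_quality."""
    scores = phred_scores(qual)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def trim_and_filter(records, min_quality=20, min_length=20):
    """Trim each (name, seq, qual) record and drop reads that end up too short."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_quality)
        if len(seq) >= min_length:
            kept.append((name, seq, qual))
    return kept


# 'I' decodes to Phred 40, '#' to 2 and '!' to 0 with the +33 offset.
example_records = [
    ("read1", "ACGTACGT", "IIIII##!"),
    ("read2", "ACGT", "####"),
]
trimmed = trim_and_filter(example_records, min_quality=20, min_length=4)
print(trimmed)
```

The first read loses its three low-quality 3' bases, and the second read is discarded because nothing of sufficient quality remains.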

Today’s hands-on session – Galaxy
Tools:
Extra visualization of genome mapping
Peak calling
  3. NGS: Peaks Calling
    – MACS
Workflow
  4. Workflow
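The intuition behind peak calling, comparing the ChIP sample against the input control, can be sketched with a toy example. This is not the MACS algorithm, which uses a statistical model of the background; it is a deliberately simplified fixed-window fold-enrichment calculation, and every name, threshold and coordinate below is hypothetical.

```python
from collections import Counter


def window_counts(read_starts, window_size):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(start // window_size for start in read_starts)


def enriched_windows(chip_starts, input_starts, window_size=200,
                     min_fold=4.0, pseudocount=1.0):
    """Return (start, end, fold) for windows where ChIP exceeds scaled input."""
    chip = window_counts(chip_starts, window_size)
    control = window_counts(input_starts, window_size)
    # Scale the control so both samples have the same total read count.
    scale = len(chip_starts) / len(input_starts) if input_starts else 1.0
    hits = []
    for window, n_chip in sorted(chip.items()):
        expected = control.get(window, 0) * scale + pseudocount
        fold = (n_chip + pseudocount) / expected
        if fold >= min_fold:
            hits.append((window * window_size, (window + 1) * window_size, fold))
    return hits


# Made-up read start positions on one chromosome.
chip_starts = [1010, 1020, 1050, 1100, 1150, 1180, 1190, 5000]
input_starts = [100, 1100, 3000, 5000]
hits = enriched_windows(chip_starts, input_starts, window_size=200, min_fold=2.0)
print(hits)
```

Only the window from 1000 to 1200 passes the threshold here, because the ChIP reads pile up there while the input control stays flat.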