Regulatory Genomics Lab

Presentation transcript:

Regulatory Genomics Lab
Saurabh Sinha. PowerPoint by Saba Ghaffari.
Regulatory Genomics | Saurabh Sinha | 2019

Exercise
In this exercise, we will do the following: Use Galaxy to manipulate a ChIP track for BIN in D. melanogaster. Subject the peak sets to the MEME suite. Compare the MEME motifs with the Fly Factor Survey motifs for BIN. Subject the peak set to a gene set enrichment test.

Step 0A: Local Files
To view and manipulate the files needed for this laboratory exercise, insert your flash drive. We denote the path to the flash drive as [course_directory]. We will use the files found in: [course_directory]/06_Regulatory_Genomics/data/

Step 0B: Logging into Galaxy
Go to: http://compgen.knoweng.org/galaxy/ Click Enter. Click Login. Input your login credentials. Click Login.

Computational Prediction of Motifs
In this exercise, we will upload a ChIP track of the transcription factor BIN in Drosophila melanogaster to Galaxy. After performing various file manipulations, we will use the MEME suite to identify a motif from the top 100 ChIP regions. We will then compare our predicted motif with the experimentally validated motif for BIN at Fly Factor Survey.

Step 1: Accessing Input Files
At the top of the page, click Shared Data. Then click Histories.

Step 2: Accessing Input Files
Click sb_regulatorygenomics. You should see this page. Click Import History.

Step 5: Sort ChIP Track By Score
Click Filter and Sort, then Sort. Under Sort Dataset, select our ChIP track. Under on column, select column: 6. Under with flavor, select Numerical sort. Under everything in, select Descending order. Click Execute.

Step 6: Obtain Top 100 ChIP Regions
Click Text Manipulation, then Select first. Under Select first, enter 100 lines. Under from, select our sorted ChIP data. Click Execute.
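Outside Galaxy, Steps 5 and 6 amount to a numeric descending sort on column 6 followed by keeping the first 100 lines. The following is a minimal sketch, not the Galaxy implementation; the function name `top_regions` and the toy rows are made up for illustration, and the only thing taken from the slides is that the score sits in column 6.

```python
def top_regions(lines, score_column=6, n=100):
    """Sort tab-delimited lines by a 1-based numeric column (descending) and keep the first n."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    rows.sort(key=lambda row: float(row[score_column - 1]), reverse=True)
    return ["\t".join(row) for row in rows[:n]]


# Hypothetical rows (not real ChIP data); column 6 plays the role of the score.
toy_track = [
    "chrToy\t100\t200\tpeak_a\t.\t3.5\n",
    "chrToy\t300\t400\tpeak_b\t.\t12.0\n",
    "chrToy\t500\t600\tpeak_c\t.\t7.25\n",
]
toy_top = top_regions(toy_track, score_column=6, n=2)
print("\n".join(toy_top))
```

For the real track, the lines would be read from the ChIP file and `n` left at 100.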

Step 7: Extract DNA of Top 100 ChIP Regions
Click Fetch Alignment/Sequences. Click Extract Genomic DNA. Under Fetch sequences for intervals in, select our top 100 ChIP regions. Set Interpret features when possible to No. Set Source for Genomic Data to History and use the dm3.fasta file as the reference. Set Output data type to FASTA. Click Execute.
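Conceptually, Step 7 slices the reference sequence at each interval and writes the pieces as FASTA records. The sketch below illustrates that idea under stated assumptions: intervals are 0-based and half-open (as in BED), strand is ignored, and the `>chrom:start-end` header is an invented convention, not necessarily what Galaxy writes. The genome and intervals are toy values.

```python
def read_fasta(text):
    """Parse FASTA text into a dict mapping sequence name to sequence string."""
    chunks = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def extract_intervals(genome, intervals):
    """Return FASTA text with one record per (chrom, start, end) interval."""
    records = []
    for chrom, start, end in intervals:
        sequence = genome[chrom][start:end]  # 0-based, half-open slice
        records.append(f">{chrom}:{start}-{end}\n{sequence}")
    return "\n".join(records)


# Toy reference and intervals, for illustration only.
toy_genome = read_fasta(">chrToy\nACGTACGT\nTTGGCCAA\n")
toy_fasta = extract_intervals(toy_genome, [("chrToy", 2, 6), ("chrToy", 8, 12)])
print(toy_fasta)
```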

Step 8: Download The Data
When finished, click the download icon to save the file to the desktop. This has already been done for you. The resulting sequences are in the following file: [course_directory]/06_Regulatory_Genomics/data/BIN_top_100.fasta

Step 9: Submit to MEME
DO NOT RUN THIS NOW. MEME TAKES A VERY LONG TIME.
In this step, we will submit the sequences to MEME. Go to the following address: http://meme-suite.org/tools/meme Upload your sequences file. Enter your email address. Leave the other parameters at their defaults. Click “Start Search”.

Step 9A: Analyzing MEME Results
Go to the results web address (you will receive a notification email from MEME). The webpage contains a summary of MEME’s findings. It is also available in the results directory: [course_directory]/06_Regulatory_Genomics/results/MEME.html Let’s investigate the top hit.

Step 9B: Analyzing MEME Results
To the right is a LOGO of our predicted motif, showing the per-position relative abundance of each nucleotide. At the bottom are the aligned regions in each of our sequences that helped produce this motif. As the p-value increases (becomes less significant), matches show greater divergence from our LOGO.
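The per-position relative abundance behind a LOGO can be computed directly from the aligned sites. Here is a minimal sketch with made-up sites (not MEME output); `position_frequencies` is a hypothetical helper name. Real sequence logos additionally scale each column by its information content, which this sketch does not do.

```python
from collections import Counter


def position_frequencies(sites):
    """Return, for each alignment column, the relative frequency of A, C, G and T."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    matrix = []
    for position in range(length):
        counts = Counter(site[position] for site in sites)
        matrix.append({base: counts[base] / len(sites) for base in "ACGT"})
    return matrix


# Made-up aligned sites, for illustration only.
toy_sites = ["ACGT", "ACGA", "ACTT", "AGGT"]
toy_matrix = position_frequencies(toy_sites)
for column in toy_matrix:
    print(column)
```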

Step 9C: Analyzing MEME Results
Other predicted motifs do not seem as plausible.

Step 10A: Comparison with Experimentally Validated Motif for BIN
FlyFactorSurvey is a database of TF motifs in Drosophila melanogaster. Go to the following link to view the motif for BIN: http://pgfe.umassmed.edu/ffs/TFdetails.php?FlybaseID=FBgn0045759

Step 10B: Comparison with Experimentally Validated Motif for BIN
[Figure panels: Best MEME Motif Reverse Complemented; Actual BIN Motif; Best MEME Motif]
There is strong agreement between the actual motif and the reverse complement of MEME’s best motif. This indicates that MEME was able to find the motif from the top 100 ChIP regions for this TF.
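A motif found on one strand appears reverse complemented on the other, which is why the comparison above uses the reverse complement of the MEME motif. The sketch below shows the operation for a sequence and for a frequency matrix; the inputs are toy values, not the BIN motif, and both helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")
BASE_SWAP = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Complement every base, then reverse the string."""
    return sequence.translate(COMPLEMENT)[::-1]


def reverse_complement_matrix(matrix):
    """Reverse the column order and swap A<->T and C<->G within each column."""
    return [
        {BASE_SWAP[base]: value for base, value in column.items()}
        for column in reversed(matrix)
    ]


# Toy inputs, for illustration only.
toy_rc = reverse_complement("AACGT")
toy_rc_matrix = reverse_complement_matrix([
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.75, "G": 0.25, "T": 0.0},
])
print(toy_rc)
print(toy_rc_matrix)
```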

Gene Set Enrichment Analysis
In this exercise, we will extract the nearby genes for each of the ChIP peaks for BIN. We will then use DAVID to test these genes for enrichment in various Gene Ontology gene sets.

Step 11A: Acquire Nearby Genes
In this step, we will acquire all genes in Drosophila melanogaster using the UCSC Main Table Browser: https://genome.ucsc.edu/

Step 11B: Acquire Nearby Genes
Ensure the following settings are configured. Click get output and then get BED.

Step 11C: Acquire Nearby Genes
Go back to the Galaxy server. Click Get Data and then Upload File. Click Choose local file and then upload our gene file: [course_directory]/06_Regulatory_Genomics/results/flygenes.bed Set the Type to bed. Set Genome to dm3. Click Start.

Step 11D: Acquire Nearby Genes
Select Operate on Genomic Intervals. Then select Fetch Closest non-overlapping interval feature.

Step 11E: Acquire Nearby Genes
Under For every interval feature in, select our original ChIP track. Under Fetch closest features from, select the UCSC genes track we just downloaded. Click Execute.
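The idea behind Steps 11D and 11E is, for each peak, to find the nearest gene on the same chromosome that does not overlap it. The sketch below is an illustration rather than Galaxy's algorithm: it assumes half-open BED coordinates, measures distance as the gap between the two intervals, and keeps the first gene on ties. All coordinates and gene names are invented.

```python
def closest_non_overlapping(peak, genes):
    """Return the nearest gene that shares the peak's chromosome and does not overlap it."""
    chrom, start, end = peak
    best_gene = None
    best_distance = None
    for gene in genes:
        gene_chrom, gene_start, gene_end, _name = gene
        if gene_chrom != chrom:
            continue
        if gene_end <= start:        # gene lies entirely upstream of the peak
            distance = start - gene_end
        elif gene_start >= end:      # gene lies entirely downstream of the peak
            distance = gene_start - end
        else:                        # overlapping genes are skipped
            continue
        if best_distance is None or distance < best_distance:
            best_gene, best_distance = gene, distance
    return best_gene


# Invented genes and peaks, for illustration only.
toy_genes = [
    ("chrToy", 0, 50, "geneA"),
    ("chrToy", 120, 180, "geneB"),
    ("chrToy", 400, 500, "geneC"),
    ("chrOther", 151, 160, "geneD"),
]
nearest_first = closest_non_overlapping(("chrToy", 100, 150), toy_genes)
nearest_second = closest_non_overlapping(("chrToy", 300, 350), toy_genes)
print(nearest_first, nearest_second)
```

In the first example geneB overlaps the peak, so the more distant geneA is reported instead.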

Step 12A: Cut Out Genes
The resulting file has the list of nearby genes in CG format in the 12th column. We are only interested in the genes, so we need to cut them out using the Cut tool. Under Text Manipulation, click Cut.

Step 12B: Cut Out Genes
For Cut Columns, type c12 to denote column 12. Under Delimited By, select Tab. Under From, select the track we just generated: the intersection of the ChIP peaks and FlyBase genes. Click Execute.
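The Cut step keeps a single tab-delimited column, here column 12. A minimal sketch of the same operation follows; `cut_column` is a hypothetical helper, and the rows and identifiers are placeholders rather than real FlyBase annotations.

```python
def cut_column(lines, column=12, delimiter="\t"):
    """Return the 1-based column from each non-empty delimited line."""
    return [
        line.rstrip("\n").split(delimiter)[column - 1]
        for line in lines
        if line.strip()
    ]


# Placeholder rows with 12 tab-separated fields each; the last field stands in for the gene ID.
toy_rows = [
    "\t".join([f"field{i}" for i in range(1, 12)] + ["toy_id_1"]) + "\n",
    "\t".join([f"field{i}" for i in range(1, 12)] + ["toy_id_2"]) + "\n",
]
toy_ids = cut_column(toy_rows, column=12)
print(toy_ids)
```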

Step 12C: Download The Data
When finished, click the download icon to save the file to the desktop. This has already been done for you. The resulting list is in the following file: [course_directory]/06_Regulatory_Genomics/results/cg_transcript.txt

Step 13A: Convert IDs
The enrichment tool we will use doesn’t accept genes in this format. We will use the FlyBase ID converter to convert these transcript IDs into FlyBase transcript IDs.

Step 13B: Convert IDs
Go to http://flybase.org/static_pages/downloads/IDConv.html Upload our cg_transcript.txt file and hit Go. On the next page, click file, uniq IDs only to download the file of converted IDs.

Step 14A: Gene Set Enrichment - DAVID
Move the resulting file from the previous analysis to the course directory and rename it: [course_directory]/06_Regulatory_Genomics/results/fb_transcripts.txt (If the file already exists in the folder, replace it with the new file.) With the correct IDs for the transcripts of genes near ChIP peaks, we now wish to perform a gene set enrichment analysis on various gene sets. DAVID is a tool that lets us do this from a web interface. It is located at the following address: https://david-d.ncifcrf.gov/summary.jsp

Step 14B: Gene Set Enrichment - DAVID
We will perform a Gene Set Enrichment Analysis on our transcript list (gene list) and see which GO categories are significantly enriched. Analyze the gene list with the Functional Annotation Tool. Click Choose File and select our fb_transcripts.txt file. Under Select Identifier, select FLYBASE_TRANSCRIPT_ID. Under Step 3: List Type, check Gene List. Click Submit List.

Step 14C: Gene Set Enrichment - DAVID
On the next page, select Functional Annotation Chart. Our gene set seems to be enriched in the BP_FAT GO category! This is consistent with the activity of the BIN transcription factor in the literature.
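The intuition behind a gene set enrichment test can be sketched with a plain hypergeometric upper-tail probability: how likely is it to see at least this many annotated genes in our list by chance? This is only an illustration; DAVID reports a modified Fisher exact p-value (the EASE score), which is more conservative than the calculation below. The function name and all counts are invented.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` belong to the gene set."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Invented counts: 10 genes in total, 5 in the gene set, 5 in our list, all 5 annotated.
toy_p = hypergeom_enrichment_p(population=10, annotated=5, sample=5, overlap=5)
print(toy_p)
```

A small probability suggests that the overlap between the gene list and the gene set is larger than expected by chance.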