Condor: BLAST Monday, 3:30pm Alain Roy OSG Software Coordinator University of Wisconsin-Madison.

Presentation transcript:

Condor: BLAST Monday, 3:30pm Alain Roy OSG Software Coordinator University of Wisconsin-Madison

OSG Summer School 2012
Before we begin…
Any questions on the lectures or exercises up to this point?

I hope you’re not getting too tired

BLAST
Up to now, you’ve done toy examples
  - Simple, easy to use
  - Illustrate basics of what you need to know
  - The Mandelbrot set is cool… but a toy
Let’s try out a real application: BLAST
  - More complex, not so easy to use

First, some honesty
I am a computer scientist
I am not a biologist
My knowledge of BLAST is shallow
But it’s a way cooler application than what we’ve done so far!

BLAST Description
From the BLAST web page:
  “The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.”

BLAST Description (my understanding)
Biologists have sequences:
  - Nucleotides in DNA: ACGTTGCA…
  - Amino acids in proteins: GECVASR…
They also have databases of lots of sequences
  - From lots of organisms, from tiny bacteria to humans
BLAST helps them answer questions:
  - Which bacterial species have a protein that is related in lineage to another protein?
  - What other genes encode proteins that exhibit structures or motifs such as ones that have just been determined?
  - …
BLAST is widely used and considered important

Is this just string comparison?
It’s harder than just comparing two strings: Is “GCTA == GCTA”?
BLAST can find “similar” sequences, based on metrics that biologists determine.
  - Finding “similar” sequences is more computationally expensive than plain string comparison
BLAST is a very popular program for asking these questions
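To make the difference between exact comparison and similarity concrete, here is a minimal sketch in Python. It is not BLAST's algorithm: BLAST uses faster heuristics and statistical significance estimates. Instead, the sketch uses the classic Smith-Waterman local alignment score, and the function name and the match, mismatch, and gap values are arbitrary choices for illustration, not metrics taken from the slides.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score of a and b.

    The scoring values are illustrative assumptions; real tools use
    substitution matrices and gap penalties chosen by biologists.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Extend the alignment diagonally (match or mismatch) ...
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # ... or open a gap in either sequence, never dropping below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Exact comparison only answers yes or no.
exact = "GCTA" == "GCAA"

# A similarity score still gives partial credit to a near match.
identical = local_alignment_score("GCTA", "GCTA")
near_match = local_alignment_score("GCTA", "GCAA")
unrelated = local_alignment_score("AAAA", "TTTT")
```

Exact comparison inspects each character once, while this table-based score does work proportional to the product of the two sequence lengths. That difference is one reason searching a large database for "similar" sequences is expensive.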

BLAST exercise
The final set of exercises has you run queries with BLAST
They are a bit arbitrary, because I know less about the underlying biology
But it’s a real application with real data!
Your challenge: run a bunch of BLAST queries and summarize the results. Do it all within a DAG
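One possible shape for the challenge is a DAG with one node per BLAST query and a final node that summarizes the results. The Python sketch below only generates the text of a DAGMan input file. The submit file names (blast.sub, summarize.sub), the node names, the query file names, and the `query` macro are all hypothetical and are not taken from the exercise materials.

```python
def build_blast_dag(query_files, blast_submit="blast.sub",
                    summary_submit="summarize.sub"):
    """Return DAGMan input text: one BLAST node per query, then a summary.

    All file and node names are placeholders for illustration.
    """
    lines = []
    node_names = []
    for index, query in enumerate(query_files):
        node = f"blast{index}"
        node_names.append(node)
        # Every query node shares one submit file ...
        lines.append(f"JOB {node} {blast_submit}")
        # ... and gets its own input through a per-node macro, which the
        # submit file would reference as $(query).
        lines.append(f'VARS {node} query="{query}"')
    lines.append(f"JOB summary {summary_submit}")
    # The summary node runs only after every BLAST node has finished.
    lines.append(f"PARENT {' '.join(node_names)} CHILD summary")
    return "\n".join(lines) + "\n"


dag_text = build_blast_dag(["query1.fa", "query2.fa", "query3.fa"])
print(dag_text)
```

Writing `dag_text` to a file and handing that file to `condor_submit_dag` would be the usual next step, assuming the two submit files exist and describe how to run BLAST and the summary script.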

Time to try it out!

Questions?
Questions? Comments?
Feel free to ask me questions later: Alain Roy
Upcoming sessions
  - Now – 5:00pm
      - Hands-on exercises
      - Finish up earlier exercises
      - Try out BLAST
  - 5:00 – 7:00: Dinner, on your own
  - 7:00 – 9:00: Optional evening work session
      - I’ll be there, with my laptop
      - Come and finish up any exercises, try the challenges, ask me hard questions
      - Or skip it and get a drink: it’s your choice