Condor: BLAST Tuesday, Dec 7th, 10:45am

Presentation transcript:

Condor: BLAST Tuesday, Dec 7th, 10:45am Zach Miller <zmiller@cs.wisc.edu> University of Wisconsin-Madison

Before we begin…
Any questions on the lectures or exercises up to this point?

I hope you’re not getting too tired.

BLAST
Up to now, you’ve done toy examples: simple, easy to use, and illustrating the basics of what you need to know, but not a “real” application.
Let’s try out a real application, BLAST: more complex, not so easy to use, but a real application.

First, some honesty
I am a computer scientist. I am not a biologist. My knowledge of BLAST is shallow. But it’s a way cooler application than what we’ve done so far!

BLAST Description
From the BLAST web page: “The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.”

BLAST Description (My understanding)
Biologists have sequences: nucleotides in DNA (ACGTTGCA…) and amino acids in proteins (GECVASR…).
They also have databases of lots of sequences, from lots of organisms, from tiny bacteria to humans.
BLAST helps them answer questions: Which bacterial species have a protein that is related in lineage to another protein? What other genes encode proteins that exhibit structures or motifs such as ones that have just been determined? …
BLAST is widely used and considered important.

Is this just string comparison?
It’s harder than just comparing two strings (is “GCTA” == “GCTA”?). BLAST can find “similar” sequences, based on metrics that biologists determine. Searching for “similar” rather than identical sequences makes this more computationally expensive than plain string comparison. BLAST is a very popular program for asking these questions.
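To make the difference concrete, here is a minimal Python sketch that contrasts an exact string comparison with a simple local alignment score in the Smith-Waterman style. This is not BLAST's algorithm: BLAST uses a faster heuristic, and real searches use scoring matrices chosen by biologists. The function names and the match, mismatch, and gap values are illustrative assumptions, not taken from the slides.

```python
def exact_match(a: str, b: str) -> bool:
    """Plain string comparison: identical or not, nothing in between."""
    return a == b


def local_alignment_score(a: str, b: str,
                          match: int = 2,
                          mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best local alignment score between a and b (Smith-Waterman style).

    The scoring values are assumed for illustration only.
    Runs in O(len(a) * len(b)) time, versus O(n) for exact comparison.
    """
    best = 0
    prev = [0] * (len(b) + 1)  # scores for the previous row of the table
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            step = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + step, # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(exact_match("GCTA", "GCAA"))            # False: not identical
    print(local_alignment_score("GCTA", "GCTA"))  # high score: identical
    print(local_alignment_score("GCTA", "GCAA"))  # lower, but still similar
```

Exact comparison only answers yes or no, while the score grades how similar two sequences are. The cost is a table with one cell per pair of positions, which is why searching a large database this way is expensive.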

BLAST exercise
The final set of exercises has you run queries with BLAST. They are a bit arbitrary, because we know less about the underlying biology. But it’s a real application with real data!
Your challenge: run a bunch of BLAST queries and summarize the results. Do it all within a DAG.
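One possible shape for such a DAG is several independent query jobs followed by a single summary job that runs only after all of them finish. The Python sketch below generates a DAGMan input file with that shape, using the standard JOB and PARENT ... CHILD statements. The job names, the submit file names, and the output file name are hypothetical placeholders; the slides do not give the actual files used in the exercise.

```python
def build_dag(query_jobs, summary_job="summarize"):
    """Return DAGMan input text: every query job is a parent of one summary job.

    Assumes each job NAME has a submit file called NAME.sub (hypothetical).
    """
    lines = [f"JOB {name} {name}.sub" for name in query_jobs]
    lines.append(f"JOB {summary_job} {summary_job}.sub")
    # The summary job starts only after every query job has completed.
    lines.append(f"PARENT {' '.join(query_jobs)} CHILD {summary_job}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job names, for illustration only.
    dag_text = build_dag(["blast1", "blast2", "blast3"])
    with open("blast.dag", "w") as handle:
        handle.write(dag_text)
    print(dag_text)
```

The resulting file would then be submitted with condor_submit_dag. The query jobs have no dependencies on each other, so they are free to run in parallel.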

Time to try it out!

Questions? Comments?
Feel free to ask me questions later!