Application to find Eukaryotic Open reading frames. Lab.



Introduction
Finding ORFs in prokaryotic DNA.
Finding the CDS in non-intronic eukaryotic “genes”.
  – Use Perl scripts (from the assignment or elsewhere).
Finding ORFs in intronic eukaryotic genes.
  – Compare and contrast the results of your program with the actual results (GenBank or EBI records).
Test an online gene-finding program.

Using your assignment code
Open the file: ORF pal gene.fasta
Find all open reading frames. (This time you must modify your code to translate each codon; you can use convertor_hashtable.txt.)
Compare to the file: pal protein sequence.fasta
  – Visually inspect the files for start and stop codons and ORFs: in which reading frame is the TRUE ORF?
Compare your results to the results in the GenBank record. Annotation and analysis site: Pal gene
What conclusions can you draw?
Note that the file ORF pal gene.fasta has extra bases on either side of the ORF; take this into account in your analysis.
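The ORF search and codon translation described on this slide can be sketched as follows. The lab itself asks for Perl; this is a Python illustration of the logic only. It assumes the standard genetic code built in-line (the format of convertor_hashtable.txt is not shown, so it is not read here), scans only the three forward reading frames, and takes the first in-frame ATG as the start. The name `find_orfs` is hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: stands in for convertor_hashtable.txt, whose format is not shown.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=1):
    """Return (frame, start, end, protein) for each ATG...stop ORF.

    Only the forward strand is scanned. frame is 1-3; start and end are
    1-based positions, with end being the last base of the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens the ORF
            elif start is not None and codon in STOPS:
                # Translate from the ATG up to (not including) the stop codon;
                # codons with ambiguous bases are reported as "X".
                protein = "".join(
                    CODON_TABLE.get(seq[j:j + 3], "X") for j in range(start, i, 3)
                )
                if len(protein) >= min_codons:
                    orfs.append((frame + 1, start + 1, i + 3, protein))
                start = None
    return orfs
```

For a file with extra bases on either side of the ORF, the true ORF may not be in frame 1, so the reported frame and start position are worth comparing against the GenBank record; a larger `min_codons` filters out short spurious ORFs.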

Predictive translation effect: exon/intron length (Example 1)
Consider the following: we have an mRNA CDS of 60 bp in length (start…stop).
Let us assume that intron 1:
  – is at the end of codon 3 (position 9);
  – is 30 bp in length.
Intron 2:
  – occurs at the end of codon 10 (position 30);
  – is 45 bp in length.
What is the effect on the translation of Exon A and Exon B?
[Figure: DNA strand from ATG to TAA, with Exon A and Exon B marked: Exon 9 bp, Intron 30 bp, Exon 21 bp, Intron 45 bp, Exon 30 bp]

Predictive translation effect: exon/intron length (Example 2)
Consider the following: we have an mRNA CDS of 60 bp in length (start…stop).
Let us assume that intron 1:
  – is at the end of codon 3 (position 9);
  – is 30 bp in length.
Intron 2:
  – occurs after position 29 (just before the 3rd bp of codon 10);
  – is 45 bp in length.
What is the effect on the translation of Exon A and Exon B?
[Figure: DNA strand from ATG to TAA, with Exon A and Exon B marked: Exon 9 bp, Intron 30 bp, Exon 20 bp, Intron 45 bp, Exon 31 bp]

Predictive translation effect: exon/intron length (Example 3)
Consider the following: we have an mRNA CDS of 60 bp in length (start…stop).
Let us assume that intron 1:
  – is at the end of codon 3 (position 9);
  – is 30 bp in length.
Intron 2:
  – occurs at position 30 (the end of codon 10);
  – is 43 bp in length.
What is the effect on the translation of Exon A and Exon B?
[Figure: DNA strand from ATG to TAA, with Exon A and Exon B marked: Exon 9 bp, Intron 30 bp, Exon 21 bp, Intron 43 bp, Exon 30 bp]

Effect on translation
Example 1: no effect; all lengths are multiples of 3.
Example 2: the last residue of exon 2 is incorrect. The residues for exon 3 are correct, but the first complete codon starts at bp 2 of the exon (bp 1 completes codon 10).
Example 3: the last exon is in a different reading frame.
Refer to Incorrect_translation_examples.rar
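The arithmetic behind the three examples can be checked with a short sketch (Python for illustration; the lab itself uses Perl). It assumes the genomic sequence starts at the ATG (position 1) and reports, for each exon, its genomic start, how many leading bases belong to a codon split by the preceding intron, and the genomic reading frame (1-3) in which the exon's complete codons fall. The name `exon_frames` is hypothetical.

```python
def exon_frames(segments):
    """segments: list of ("exon" | "intron", length) in genomic order from the ATG.

    Returns one (genomic_start, leading_bases, frame) tuple per exon, where
    leading_bases is the number of bases at the start of the exon that
    complete a codon split by the preceding intron, and frame is the genomic
    frame (1-3, relative to the ATG) of the exon's first complete codon.
    """
    pos = 1      # 1-based genomic position of the next segment
    coding = 0   # coding bases seen so far
    result = []
    for kind, length in segments:
        if kind == "exon":
            leading = (3 - coding % 3) % 3
            frame = (pos + leading - 1) % 3 + 1
            result.append((pos, leading, frame))
            coding += length
        pos += length
    return result


example1 = [("exon", 9), ("intron", 30), ("exon", 21), ("intron", 45), ("exon", 30)]
example2 = [("exon", 9), ("intron", 30), ("exon", 20), ("intron", 45), ("exon", 31)]
example3 = [("exon", 9), ("intron", 30), ("exon", 21), ("intron", 43), ("exon", 30)]
```

Under these assumptions, every exon in Example 1 stays in frame 1; in Example 2 the last exon is still read in frame 1 but begins with one base left over from the split codon 10; in Example 3 the last exon moves to frame 2 because the 43 bp intron is not a multiple of 3.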

Finding start and stop codons and potential ORFs
Download the file (TUBAC3 gene complete sequence; see last lecture).
Using “your” Perl module and Perl script, determine the locations of the start and stop codons.
Write a script that will “attempt” to find at least one ORF. Clearly state how, using your background knowledge of gene expression, you are going to try to resolve this problem.
Using your program, determine the position of the ORF(s) for this gene.
Discuss how your findings compare with the results in the GenBank record file. You can use the graphics image
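Locating every start and stop codon, as the slide asks, can be sketched like this (Python for illustration; the lab expects a Perl module and script). It assumes the forward strand only and uses a zero-width lookahead so that overlapping codons in different frames are all reported. The name `codon_positions` is hypothetical.

```python
import re


def codon_positions(seq):
    """Return 1-based (position, frame) pairs for every ATG and stop codon.

    The lookahead pattern matches at every offset, so codons that overlap
    each other in different reading frames are all found.
    """
    seq = seq.upper()
    found = {"start": [], "stop": []}
    for match in re.finditer(r"(?=(ATG|TAA|TAG|TGA))", seq):
        kind = "start" if match.group(1) == "ATG" else "stop"
        found[kind].append((match.start() + 1, match.start() % 3 + 1))
    return found
```

Grouping the hits by frame shows which ATG and stop pairs could bound an ORF; in an intron-containing gene, stops that fall inside introns will interrupt any naive single-frame ORF, which is the problem the script has to work around.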

Intronic DNA gene
Go to the Understanding Bioinformatics student resources and download the PRISSEL (CDK10) FASTA file. It can also be downloaded from the sample files in the lecture on finding eukaryotic genes.
Run your application to find the ORF. Compare this to the actual results.
Discuss your results in relation to the known experimental results; you can find the actual results in Understanding Bioinformatics, or refer to the slide in the lecture on finding eukaryotic genes.
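One way to compare a program's output with a database record is to splice the annotated exons together and check that the result is a clean CDS. The sketch below (Python for illustration, not the lab's Perl) assumes the exon coordinates are given in the GenBank feature-table form `join(a..b,c..d,...)` on the forward strand; it does not handle `complement(...)` or partial-location markers. The name `splice_cds` and the coordinates in the usage note are made up for illustration.

```python
import re


def splice_cds(genomic, location):
    """Concatenate the exons named in a GenBank-style join(...) location.

    Coordinates are 1-based and inclusive, as in GenBank feature tables.
    Assumption: forward strand only, no complement() or '<' / '>' markers.
    """
    exons = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    return "".join(genomic[a - 1:b] for a, b in exons)
```

For example, with a toy sequence `"ATGAAA" + "gtccag" + "TTTTAG"` and the location `join(1..6,13..18)`, the spliced CDS is `ATGAAATTTTAG`: it starts with ATG, ends with a stop codon, and its length is a multiple of 3. The same three checks are a quick sanity test on exon coordinates predicted by your own program.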

Online gene-finding program
In the student resources you should find links to various online gene-prediction applications. Use one to analyse exon 1 and exon 2 of the ALDH10 gene.