Gene, Proteins, and Genetic Code. Protein Synthesis in a Cell.


Gene, Proteins, and Genetic Code

Protein Synthesis in a Cell

A protein sequence
>gi| |dbj|BAA | EST AU055734(S20025) corresponds to a region …
MCSYIRYDTPKLFTHVTKTPPKNQVSNSINDVGSRRATDRSVASCSSEKSVGTMSVKNASSISFEDIEKSISNWKIPKVN
IKEIYHVDTDIHKVLTLNLQTSGYELELGSENISVTYRVYYKAMTTLAPCAKHYTPKGLTTLLQTNPNNRCTTPKTLKWD
EITLPEKWVLSQAVEPKSMDQSEVESLIETPDGDVEITFASKQKAFLQSRPSVSLDSRPRTKPQNVVYATYEDNSDEPSI
SDFDINVIELDVGFVIAIEEDEFEIDKDLLKKELRLQKNRPKMKRYFERVDEPFRLKIRELWHKEMREQRKNIFFFDWYE
SSQVRHFEEFFKGKNMMKKEQKSEAEDLTVIKKVSTEWETTSGNKSSSSQSVSPMFVPTIDPNIKLGKQKAFGPAISEEL
VSELALKLNNLKVNKNINEISDNEKYDMVNKIFKPSTLTSTTRNYYPRPTYADLQFEEMPQIQNMTYYNGKEIVEWNLDG
FTEYQIFTLCHQMIMYANACIANGNKEREAANMIVIGFSGQLKGWWNNYLNETQRQEILCAVKRDDQGRPLPDRDGNGNP
TELKEGFHMEEKDEPIQEDDQVVGTIQKYTKQKWYAEVMYRFIDGSYFQHITLIDSGADVNCIREDEILDQLVQTKREQV
VNSIYLHDNSFPKSMDLPDQKITEKRAKLQDIPHHEERLLDYREKKSRDGQDKLPMEVEQSMATNKNTKILLRAWLLST
A protein sequence may have from a few hundred to several thousand amino acids.

Protein synthesis

Genetic code
DNA:     ..ATT CAC AGT GGA..
Protein:    I   H   S   G
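The codon-by-codon reading on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the names `CODON_TABLE` and `translate` are hypothetical, the table is the standard genetic code written for DNA codons (T instead of U), and `*` marks a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in the order TTT, TTC, TTA, TTG, TCT, ...
# ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string codon by codon, starting at offset `frame`.

    Codons containing characters other than A, C, G, T are reported as 'X'.
    Trailing bases that do not fill a complete codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


print(translate("ATTCACAGTGGA"))  # the slide's example: IHSG
```

Changing `frame` to 1 or 2 reads the same bases in the other two reading frames of this strand.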

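Notes on translation
Three reading frames per strand.
The third base of a codon often does not change the amino acid.
Reading direction is 5' -> 3'.
Start and end (stop) codons delimit the coding sequence.
Open Reading Frame (ORF): a stretch from a start codon to the next in-frame stop codon.
Each gene is an ORF, but not all ORFs are genes.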
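As a rough illustration of these notes, the sketch below scans the three reading frames of one strand for open reading frames. It assumes ATG as the only start codon and TAA, TAG, TGA as stop codons. The function name `find_orfs` and the `min_codons` parameter are hypothetical, and the reverse strand is not scanned.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=1):
    """Return (frame, start, end) for every ORF on the given strand.

    An ORF runs from the first ATG seen in a frame to the next in-frame
    stop codon. `start` is the index of the A in ATG; `end` is exclusive
    and includes the stop codon. `min_codons` counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((frame, start, i + 3))
                start = None
    return orfs


print(find_orfs("CCATGAAATGACC"))
```

Raising `min_codons` filters out very short ORFs, most of which are not genes; the price is that genuinely short genes are filtered out too.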

The Central Dogma of Molecular Biology
DNA -> RNA -> Protein (transcription, then translation); DNA is copied by replication.
DNA is the genotype; protein gives the phenotype.

Exception: retroviruses
In retroviruses, RNA is reverse-transcribed into DNA, so information also flows from RNA back to DNA.

Biology: DNA (genotype) -> Protein -> Phenotype

Genes
One gene encodes one protein (or sometimes an RNA).
Like a program, a gene starts with a start codon (e.g. ATG); each following group of three bases then codes for one amino acid, and a stop codon (e.g. TGA) marks the end of the gene.
Genes are dense in prokaryotes and sparse in eukaryotes.
In the middle of a eukaryotic gene there are introns, which are spliced out (as junk) after transcription. The parts that are kept are called exons.
Locating genes in a genome, including their exons and introns, is the task of gene finding.

Gene related diseases
Hemophilia: gene on the X chromosome.
Sickle-cell anemia: a single-nucleotide mutation in the first exon of the beta-globin gene (it removes a restriction cutting site). 1 in 12 African Americans are carriers; homozygotes are affected.
BRCA1 gene (chr. 17q): responsible for 1/2 of inherited breast cancer (10% of breast cancer).
Fragile X syndrome (intellectual disability): 1 in 1250 males, 1 in 2500 females (dominant, but females have a partially expressed good copy of the gene). FMR-1 gene: more than 200 trinucleotide repeats cause the disease.
P53 gene: chr. 17p, encodes a tumor suppressor protein.

Gene Prediction and Annotation
Prokaryotes
1. Start/stop codon (ORF)
2. Promoters
3. Content
4. Sequence similarity

Start Codon
Limitations of prediction from start/stop codons alone:
It may miss short genes.
It is unclear which of several candidate start codons to use.
ORFs in different reading frames can overlap.

Promoters
5'-XXXXPPPPPPXXXXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
(G: gene to be transcribed)

Consensus base and its frequency at each position:
-10:  T 77%   A 76%   T 60%   A 61%   A 56%   T 82%   (Pribnow box)
-35:  T 69%   T 79%   G 61%   A 56%   C 54%   A 54%

In prokaryotes, the promoter consists of two short sequences at the -10 and -35 positions upstream of the gene, that is, before the gene in the direction of transcription.
The sequence at -10 is called the Pribnow box and usually consists of the six nucleotides TATAAT. The Pribnow box is absolutely essential to start transcription in prokaryotes.
The other sequence, at -35, usually consists of the six nucleotides TTGACA. Its presence allows a very high transcription rate.
These rules are only approximately correct.
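Because the rules are only approximate, a promoter search usually tolerates mismatches rather than demanding the exact consensus. The sketch below is one hypothetical way to do that: it looks for a -35 hexamer close to TTGACA followed, after a spacer, by a -10 hexamer close to TATAAT. The function names, the `max_mismatch` threshold, and the default spacer range of 15 to 19 bases are assumptions made for illustration; the slide does not give a spacer length.

```python
def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(dna, max_mismatch=1, spacer=range(15, 20)):
    """Return (pos35, pos10) index pairs of candidate promoters.

    pos35 is the start of a hexamer within `max_mismatch` of TTGACA and
    pos10 the start of a hexamer within `max_mismatch` of TATAAT, with a
    gap between the two hexamers whose length lies in `spacer`
    (an assumed range, not taken from the slides).
    """
    dna = dna.upper()
    hits = []
    for i in range(len(dna) - 5):
        if mismatches(dna[i:i + 6], "TTGACA") > max_mismatch:
            continue
        for gap in spacer:
            j = i + 6 + gap
            box = dna[j:j + 6]
            if len(box) == 6 and mismatches(box, "TATAAT") <= max_mismatch:
                hits.append((i, j))
    return hits


seq = "GG" + "TTGACA" + "C" * 17 + "TATAAT" + "GGGG"
print(find_promoters(seq))
```

A mismatch count treats all six positions alike, whereas the frequency table shows that some positions are conserved more strongly than others.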

Scoring a 6-mer as Pribnow box
We need a "score function" to measure the likelihood that a 6-mer is a Pribnow box.

An example function for Pribnow box fitness evaluation log()
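The formula on this slide did not survive in the transcript, so the following is only one plausible log-based score, not the author's function. It is a log-odds sum built from the -10 consensus frequencies on the Promoters slide, under two assumptions that are not in the source: the three non-consensus bases at each position share the remaining probability equally, and the background probability of each base is 0.25. The name `pribnow_score` is hypothetical.

```python
import math

CONSENSUS = "TATAAT"
# Frequency of the consensus base at each -10 position (from the Promoters slide).
CONSENSUS_FREQ = [0.77, 0.76, 0.60, 0.61, 0.56, 0.82]


def pribnow_score(hexamer, background=0.25):
    """Log-odds score of a 6-mer against the Pribnow box consensus.

    Assumption: at each position, the non-consensus bases split the
    remaining probability mass equally. Positive scores mean the 6-mer
    looks more like a Pribnow box than like background sequence.
    """
    hexamer = hexamer.upper()
    if len(hexamer) != 6:
        raise ValueError("expected a 6-mer")
    score = 0.0
    for base, cons, freq in zip(hexamer, CONSENSUS, CONSENSUS_FREQ):
        p = freq if base == cons else (1.0 - freq) / 3.0
        score += math.log(p / background)
    return score


for candidate in ("TATAAT", "TATAAC", "GCGCGC"):
    print(candidate, round(pribnow_score(candidate), 3))
```

With a real position-specific frequency matrix, the equal-split assumption would be replaced by the observed frequency of every base at every position.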

Content I: codon bias
A codon XYZ occurs with different frequencies in coding regions and non-coding regions.
In coding regions, different amino acids have different frequencies, and different codons for the same amino acid also have different frequencies.
In non-coding regions, the frequency of XYZ is approximately p(X)*p(Y)*p(Z).

Codon bias
First, use many known genes of the organism (or of similar organisms) to train a codon frequency table: each codon c_i has a frequency f(c_i).
Second, compute the background frequency bf(X) of each base X = A, C, G, T.
The "significance" of a codon c = XYZ is then log( f(c) / (bf(X)*bf(Y)*bf(Z)) ).
A high average significance in a region is an indication of a gene.
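The two training steps and the significance score can be sketched as follows. The score is read here as log( f(c) / (bf(X)*bf(Y)*bf(Z)) ), without a leading minus sign, so that codons over-represented in genes score positive and a high average points to a gene. All function names are hypothetical, and the small `floor` value used for codons never seen in training is an assumption added to avoid taking the log of zero.

```python
import math
from collections import Counter


def codon_frequencies(genes):
    """Train f(c): relative frequency of each codon over known genes (read in frame 0)."""
    counts = Counter()
    for gene in genes:
        gene = gene.upper()
        for i in range(0, len(gene) - 2, 3):
            counts[gene[i:i + 3]] += 1
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def base_frequencies(dna):
    """Compute bf(X): background frequency of each base A, C, G, T."""
    counts = Counter(b for b in dna.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def codon_significance(codon, f, bf, floor=1e-6):
    """log( f(c) / (bf(X)*bf(Y)*bf(Z)) ); unseen codons get the assumed `floor` frequency."""
    x, y, z = codon
    expected = bf[x] * bf[y] * bf[z]
    return math.log(max(f.get(codon, 0.0), floor) / expected)


def average_significance(region, f, bf):
    """Mean codon significance over a region read in frame 0."""
    region = region.upper()
    codons = [region[i:i + 3] for i in range(0, len(region) - 2, 3)]
    if not codons:
        raise ValueError("region is shorter than one codon")
    return sum(codon_significance(c, f, bf) for c in codons) / len(codons)


f = codon_frequencies(["ATGGCTGCTTAA"])  # toy training set of one "gene"
bf = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # toy uniform background
print(average_significance("GCTGCT", f, bf), average_significance("CCCCCC", f, bf))
```

In practice the average would be computed over a sliding window in each reading frame, and windows with a high average would be flagged as candidate coding regions.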

Content II - Hidden Markov Model (HMM)

Eukaryotes
The basic idea is similar to prokaryotes.
Difference:

DNA-specific transcription factors
These are the basis of gene-regulatory networks, another hot area in bioinformatics.

Splicing
Consensus sequences have been identified as necessary but not sufficient for splicing. In vertebrates, these sequences are (the slash marks the exon-intron or intron-exon junction):
"Donor" splice site: C(or A)AG/GTA(or G)AGT
"Acceptor" splice site: T(or C)nNC(or T)AG/G
A third sequence, which in yeast is TACTAAC, is necessary within the intron sequence.
These rules are only approximately correct.
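The two vertebrate consensus sequences can be written as regular expressions to list candidate splice sites. The sketch below is illustrative only: the function names are hypothetical, and since the slide leaves the pyrimidine run length n unspecified, the default `min_pyrimidines=6` is an assumption. Because the consensus is necessary but not sufficient, every hit is a candidate rather than a confirmed site.

```python
import re


def donor_sites(dna):
    """Indices of the first intron base at matches of [CA]AG / GT[AG]AGT.

    A lookahead is used so that overlapping matches are all reported.
    """
    pattern = r"(?=[CA]AGGT[AG]AGT)"
    return [m.start() + 3 for m in re.finditer(pattern, dna.upper())]


def acceptor_sites(dna, min_pyrimidines=6):
    """Indices of the first exon base at matches of [TC]n N [CT]AG / G.

    `min_pyrimidines` is the assumed length n of the pyrimidine run;
    the junction lies n + 4 bases after the start of the match.
    """
    pattern = r"(?=[TC]{%d}[ACGT][CT]AGG)" % min_pyrimidines
    return [
        m.start() + min_pyrimidines + 4
        for m in re.finditer(pattern, dna.upper())
    ]


print(donor_sites("TTCAGGTAAGTCC"))
print(acceptor_sites("AATTTTCCTCAGGTT"))
```

A candidate intron is then any donor position followed downstream by an acceptor position; further evidence such as content scores is needed to decide which pairs are real.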