
BioInformatics (1). What is Life All About : Self-compiling & self-assembling Complementary surfaces Watson-Crick base pair (Nature April 25, 1953)



Presentation transcript:

1 BioInformatics (1)

2 What is Life All About : Self-compiling & self-assembling Complementary surfaces Watson-Crick base pair (Nature April 25, 1953)

3 Life Science vs Computing
Where do parasites come from? (computer & biological viral codes)
Over $12 billion/year on computer viruses
LoveBug:
Set dirtemp = fso.GetSpecialFolder(2)
Set c = fso.GetFile(WScript.ScriptFullName)
c.Copy(dirsystem&"\MSKernel32.vbs")
c.Copy(dirwin&"\Win32DLL.vbs")
c.Copy(dirsystem&"\LOVE-LETTER-FOR-YOU.TXT.vbs")
regruns()
html()
spreadtoemail()
listadriv()
20 M dead (worse than black plague & 1918 Flu)
AIDS - HIV-1 Polymerase drug resistance mutations: M41L, D67N, T69D, L210W, T215Y, H208Y
PISPIETVPV KLKPGMDGPK VKQWPLTEEK IKALIEICAE LEKDGKISKI GPVNPYDTPV FAIKKKNSDK WRKLVDFREL NKRTQDFCEV
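The mutation labels on this slide follow the usual pattern of wild-type residue, position, mutant residue (M41L: M at position 41 replaced by L). A minimal Python sketch, assuming the displayed fragment is numbered from residue 1 of the polymerase (the slide does not say so), checks which of the listed mutations the fragment carries. The helper names `parse_mutation` and `mutation_status` are hypothetical.

```python
import re

# Fragment copied from the slide. ASSUMPTION: its first residue is position 1.
FRAGMENT = (
    "PISPIETVPV" "KLKPGMDGPK" "VKQWPLTEEK" "IKALIEICAE" "LEKDGKISKI"
    "GPVNPYDTPV" "FAIKKKNSDK" "WRKLVDFREL" "NKRTQDFCEV"
)
MUTATIONS = ["M41L", "D67N", "T69D", "L210W", "T215Y", "H208Y"]


def parse_mutation(label):
    """Split a label such as 'M41L' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def mutation_status(sequence, label):
    """Report which residue the sequence shows at the labelled position."""
    wild, position, mutant = parse_mutation(label)
    if position > len(sequence):
        return "outside fragment"
    residue = sequence[position - 1]  # labels are 1-based
    if residue == mutant:
        return "mutant"
    if residue == wild:
        return "wild type"
    return "other"


if __name__ == "__main__":
    for label in MUTATIONS:
        print(label, mutation_status(FRAGMENT, label))
```

Under that numbering assumption, the fragment shows the mutant residue at positions 41, 67 and 69, while positions 208, 210 and 215 lie beyond the 90 residues shown.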

4 Concept         Computers         Organisms
Instructions      Program           Genome
Bits              0,1               a,c,g,t
Stable memory     ROM, Disk, tape   DNA
Active memory     RAM               RNA
Processing        CPU/Compiler      enzyme/Ribosome
Editing           Editor            tRNA
Environment       Sockets, people   Water, salts, heat
I/O               AD/DA             proteins
Monomer           Minerals          Nucleotide
Polymer           chip              DNA, RNA, protein
Replication       Cut/Paste         DNA replication
Sensor/In         scanner           Chem/photo receptor
Exciting Life ??

5 Elements of RNA-based life: C, H, N, O, P. Useful for many species: Na, K, Fe, Cl, Ca, Mg, Mo, Mn, S, Se, Cu, Ni, Co, Si

6

7 The Four Nucleosides of DNA: dA, dG, dC, dT. A nucleoside is a sugar, here deoxyribose, plus a base; dA = deoxyadenosine, etc. Purines: dA, dG. Pyrimidines: dC, dT.

8 Bases: Adenine, Guanine, Thymine, Cytosine, Uracil

9 Base Pairing

10 A nucleotide is a phosphate, a sugar, and a purine or a pyrimidine base. The monomeric units of nucleic acids are called nucleotides.
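Watson-Crick base pairing (A with T, G with C) means one DNA strand determines the other. A minimal Python sketch of that rule follows; the function names are illustrative and the test sequence is simply the first ten bases of the GenBank example later in this deck.

```python
# Watson-Crick pairing for DNA: A<->T, G<->C (case preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(strand):
    """Return the base-by-base complement of a DNA strand."""
    return strand.translate(COMPLEMENT)


def reverse_complement(strand):
    """Return the partner strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


if __name__ == "__main__":
    print(reverse_complement("GTCGTTCAAC"))  # GTTGAACGAC
```

Reversing is needed because the two strands of a double helix run antiparallel.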

11

12 Chromosomes

13 Genome and gene

14 Nucleic acid and proteins

15 Nucleotide codes

16 Amino acid codes

17 Standard Genetic Code
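As an illustration of how the standard genetic code maps codons to amino acids, here is a minimal Python sketch. The table is built from the conventional T, C, A, G ordering of the standard code; the `translate` helper and the short input sequence are hypothetical and not taken from the slides.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence from its first base up to the first stop codon."""
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGCGTGCATGGTAA"))  # MRAW
```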

18 Schematic illustration of a plant cell (Home for DNA)

19

20 History of structure determination for nucleic acids and proteins

21 Human chromosomes: idiograms

22 X-linked recessive disorder. The inheritance pattern is shown for a recessive gene on the chromosome X, designated in bold. Parents: Male XY (normal), Female XX (normal). Offspring: Female XX (normal), Female XX (normal), Male XY (normal), Male XY (affected)
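The pattern on this slide can be reproduced with a small Punnett-square sketch in Python. It assumes the mother is an unaffected carrier (one X carrying the recessive allele, written `x` here), which is the usual reading of such a diagram but is not stated in the transcript; the function names are illustrative.

```python
from collections import Counter
from itertools import product


def describe(child):
    """Label a child, given one chromosome from each parent."""
    sex = "Male" if "Y" in child else "Female"
    x_alleles = [c for c in child if c != "Y"]
    # Recessive: affected only if every X carries the recessive allele 'x'.
    affected = all(allele == "x" for allele in x_alleles)
    return f"{sex} ({'affected' if affected else 'normal'})"


def cross(mother, father):
    """Enumerate the four equally likely chromosome combinations."""
    return [describe(pair) for pair in product(mother, father)]


if __name__ == "__main__":
    # ASSUMPTION: carrier mother (X, x) and unaffected father (X, Y).
    for label, count in Counter(cross(("X", "x"), ("X", "Y"))).items():
        print(label, count)
```

With those parents the sketch yields two unaffected daughters (one of them a carrier), one unaffected son and one affected son, matching the four offspring listed on the slide.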

23

24 Reductionistic and synthetic approaches in biology
Reductionistic Approach (Experiments): Biological System (Organism) → Building Blocks (Genes/Molecules)
Synthetic Approach (Bioinformatics): Building Blocks (Genes/Molecules) → Biological System (Organism)

25 Basic principles in physics, chemistry and biology
Physics: Matter, Elementary Particles. Principles known? Yes
Chemistry: Compound, Elements. Principles known? Yes
Biology: Organism, Genes. Principles known? No

26

27

28

29 The addresses for the major databases

30 New generation of molecular biology databases

31 Example of sequence database entry for GenBank
LOCUS       DRODPPC      4001 bp            INV       15-MAR-1990
DEFINITION  D.melanogaster decapentaplegic gene complex (DPP-C), complete cds.
ACCESSION   M30116
KEYWORDS    .
SOURCE      D.melanogaster, cDNA to mRNA.
  ORGANISM  Drosophila melanogaster
            Eukaryote; mitochondrial eukaryotes; Metazoa; Arthropoda;
            Tracheata; Insecta; Pterygota; Diptera; Brachycera; Muscomorpha;
            Ephydroidea; Drosophilidae; Drosophila.
REFERENCE   1  (bases 1 to 4001)
  AUTHORS   Padgett, R.W., St Johnston, R.D. and Gelbart, W.M.
  TITLE     A transcript from a Drosophila pattern gene predicts a protein
            homologous to the transforming growth factor-beta family
  JOURNAL   Nature 325, 81-84 (1987)
  MEDLINE   87090408
COMMENT     The initiation codon could be at either 1188-1190 or 1587-1589
FEATURES             Location/Qualifiers
     source          1..4001
                     /organism="Drosophila melanogaster"
                     /db_xref="taxon:7227"
     mRNA            <1..3918
                     /gene="dpp"
                     /note="decapentaplegic protein mRNA"
                     /db_xref="FlyBase:FBgn0000490"
     gene            1..4001
                     /note="decapentaplegic"
                     /gene="dpp"
                     /allele=""
                     /db_xref="FlyBase:FBgn0000490"
     CDS             1188..2954
                     /gene="dpp"
                     /note="decapentaplegic protein (1188 could be 1587)"
                     /codon_start=1
                     /db_xref="FlyBase:FBgn0000490"
                     /db_xref="PID:g157292"
                     /translation="MRAWLLLLAVLATFQTIVRVASTEDISQRFIAAIAPVAAHIPLA
                     SASGSGSGRSGSRSVGASTSTALAKAFNPFSEPASFSDSDKSHRSKTNKKPSKSDANR
                     ……………………
                     LGYDAYYCHGKCPFPLADHFNSTNAVVQTLVNNMNPGKVPKACCVPTQLDSVAMLYL
                     NDQSTBVVLKNYQEMTBBGCGCR"
BASE COUNT     1170 a   1078 c    956 g    797 t
ORIGIN
        1 gtcgttcaac agcgctgatc gagtttaaat ctataccgaa atgagcggcg gaaagtgagc
       61 cacttggcgt gaacccaaag ctttcgagga aaattctcgg acccccatat acaaatatcg
      121 gaaaaagtat cgaacagttt cgcgacgcga agcgttaaga tcgcccaaag atctccgtgc
      181 ggaaacaaag aaattgaggc actattaaga gattgttgtt gtgcgcgagt gtgtgtcttc
      241 agctgggtgt gtggaatgtc aactgacggg ttgtaaaggg aaaccctgaa atccgaacgg
      301 ccagccaaag caaataaagc tgtgaatacg aattaagtac aacaaacagt tactgaaaca
      361 gatacagatt cggattcgaa tagagaaaca gatactggag atgcccccag aaacaattca
      421 attgcaaata tagtgcgttg cgcgagtgcc agtggaaaaa tatgtggatt acctgcgaac
      481 cgtccgccca aggagccgcc gggtgacagg tgtatccccc aggataccaa cccgagccca
      541 gaccgagatc cacatccaga tcccgaccgc agggtgccag tgtgtcatgt gccgcggcat
      601 accgaccgca gccacatcta ccgaccaggt gcgcctcgaa tgcggcaaca caattttcaa
      ………………………….
     3841 aactgtataa acaaaacgta tgccctataa atatatgaat aactatctac atcgttatgc
     3901 gttctaagct aagctcgaat aaatccgtac acgttaatta atctagaatc gtaagaccta
     3961 acgcgtaagc tcagcatgtt ggataaatta atagaaacga g
//
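The numbers in this entry can be cross-checked with a little arithmetic. The Python sketch below, with a hypothetical helper `location_length`, works out how many residues the CDS location 1188..2954 encodes and whether the base counts add up to the 4001 bp stated in the LOCUS line. It handles only simple `start..end` locations, not joins or complements.

```python
import re


def location_length(location):
    """Length in bases of a simple GenBank location such as '1188..2954'."""
    match = re.fullmatch(r"<?(\d+)\.\.>?(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    start, end = int(match.group(1)), int(match.group(2))
    return end - start + 1  # coordinates are 1-based and inclusive


# Values copied from the entry above.
cds_bases = location_length("1188..2954")
codons = cds_bases // 3
residues = codons - 1  # the final codon is the stop codon
base_count_total = 1170 + 1078 + 956 + 797

if __name__ == "__main__":
    print(cds_bases, codons, residues)  # 1767 589 588
    print(base_count_total)             # 4001
```

The 588 residues agree with the 588 AA length given in the SWISS-PROT entry for the same protein, and the alternative start at 1587 is in the same reading frame because 1587 - 1188 = 399 is a multiple of three.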

32 Example of sequence database entry for SWISS-PROT
ID   DECA_DROME     STANDARD;      PRT;   588 AA.
AC   P07713;
DT   01-APR-1988 (REL. 07, CREATED)
DT   01-APR-1988 (REL. 07, LAST SEQUENCE UPDATE)
DT   01-FEB-1995 (REL. 31, LAST ANNOTATION UPDATE)
DE   DECAPENTAPLEGIC PROTEIN PRECURSOR (DPP-C PROTEIN).
GN   DPP.
OS   DROSOPHILA MELANOGASTER (FRUIT FLY).
OC   EUKARYOTA; METAZOA; ARTHROPODA; INSECTA; DIPTERA.
RN   [1]
RP   SEQUENCE FROM N.A.
RM   87090408
RA   PADGETT R.W., ST JOHNSTON R.D., GELBART W.M.;
RL   NATURE 325:81-84 (1987)
RN   [2]
RP   CHARACTERIZATION, AND SEQUENCE OF 457-476.
RM   90258853
RA   PANGANIBAN G.E.F., RASHKA K.E., NEITZEL M.D., HOFFMANN F.M.;
RL   MOL. CELL. BIOL. 10:2669-2677(1990).
CC   -!- FUNCTION: DPP IS REQUIRED FOR THE PROPER DEVELOPMENT OF THE
CC       EMBRYONIC DORSAL HYPODERM, FOR VIABILITY OF LARVAE AND FOR CELL
CC       VIABILITY OF THE EPITHELIAL CELLS IN THE IMAGINAL DISKS.
CC   -!- SUBUNIT: HOMODIMER, DISULFIDE-LINKED.
CC   -!- SIMILARITY: TO OTHER GROWTH FACTORS OF THE TGF-BETA FAMILY.
DR   EMBL; M30116; DMDPPC.
DR   PIR; A26158; A26158.
DR   HSSP; P08112; 1TFG.
DR   FLYBASE; FBGN0000490; DPP.
DR   PROSITE; PS00250; TGF_BETA.
KW   GROWTH FACTOR; DIFFERENTIATION; SIGNAL.
FT   SIGNAL        1      ?       POTENTIAL.
FT   PROPEP        ?    456
FT   CHAIN       457    588       DECAPENTAPLEGIC PROTEIN.
FT   DISULFID    487    553       BY SIMILARITY.
FT   DISULFID    516    585       BY SIMILARITY.
FT   DISULFID    520    587       BY SIMILARITY.
FT   DISULFID    552    552       INTERCHAIN (BY SIMILARITY).
FT   CARBOHYD    120    120       POTENTIAL.
FT   CARBOHYD    342    342       POTENTIAL.
FT   CARBOHYD    377    377       POTENTIAL.
FT   CARBOHYD    529    529       POTENTIAL.
SQ   SEQUENCE   588 AA;  65850 MW;  1768420 CN;
     MRAWLLLLAV LATFQTIVRV ASTEDISQRF IAAIAPVAAH IPLASASGSG SGRSGSRSVG
     ASTSTAGAKA FNRFSEPASF SDSDKSHRSK TNKKPSKSDA NRQFNEVHKP RTDQLENSKN
     KSKQLVNKPN HNKMAVKEQR SHHKKSHHHR SHQPKQASAS TESHQSSSIE SIFVEEPTLV
     LDREVASINV PANAKAIIAE QGPSTYSKEA LIKDKLKPDP STYLVEIKSL LSLFNMKRPP
     KIDRSKIIIP EPMKKLYAEI MGHELDSVNI PKPGLLTKSA NTVRSFTHKD SKIDDRFPHH
     HRFRLHFDVK SIPADEKLKA AELQLTRDAL SQQVVASRSS ANRTRYQBLV YDITRVGVRG
     QREPSYLLLD TKTBRLNSTD TVSLDVQPAV DRWLASPQRN YGLLVEVRTV RSLKPAPHHH
     VRLRRSADEA HERWQHKQPL LFTYTDDGRH DARSIRDVSG GEGGGKGGRN KRHARRPTRR
     KNHDDTCRRH SLYVDFSDVG WDDWIVAPLG YDAYYCHGKC PFPLADHRNS TNHAVVQTLV
     NNMNPGKBPK ACCBPTQLDS VAMLYLNDQS TVVLKNYQEM TVVGCGCR
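SWISS-PROT entries are line-oriented: each line begins with a two-letter code (ID, AC, DE, DR, ...) followed by its content. The Python sketch below groups lines by that code and splits the DR cross-references. The sample lines repeat values from the entry above, but their column spacing is an assumption, and `group_by_line_code` and `cross_references` are hypothetical helpers rather than part of any library.

```python
from collections import defaultdict

# A few lines from the entry above; the spacing after the code is assumed.
SAMPLE = """\
ID   DECA_DROME     STANDARD;      PRT;   588 AA.
AC   P07713;
DE   DECAPENTAPLEGIC PROTEIN PRECURSOR (DPP-C PROTEIN).
GN   DPP.
DR   EMBL; M30116; DMDPPC.
DR   PROSITE; PS00250; TGF_BETA.
"""


def group_by_line_code(text):
    """Map each two-letter line code to the list of its line contents."""
    records = defaultdict(list)
    for line in text.splitlines():
        if len(line) < 2 or line.startswith("//"):  # skip blanks, terminator
            continue
        records[line[:2]].append(line[2:].strip())
    return dict(records)


def cross_references(records):
    """Split DR lines into (database, primary id, secondary id) tuples."""
    return [
        tuple(part.strip() for part in value.rstrip(".").split(";"))
        for value in records.get("DR", [])
    ]


if __name__ == "__main__":
    entry = group_by_line_code(SAMPLE)
    print(entry["AC"])
    print(cross_references(entry))
```

The EMBL cross-reference M30116 is the accession number of the GenBank example entry, which is how the two records link to each other.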

33 Functional classification of E. coli genes according to Monica Riley

34 The Protein Folding Problem

35 Protein Folding Problem (Sequence → 3D Structure)
1. Protein folding is thermodynamically determined (Anfinsen's thermodynamic principle): Protein + Environment
2. Protein folding is a reaction involving other interacting molecules (Principle of molecular interactions): Protein + Chaperonins + ….

36 Central Paradigm

37 Bioinformatics: A Long Journey (How far are we away from knowing the God ??)
Sequence to exon: 80% [Laub 98]
Exons to gene (without cDNA or homolog): ~30% [Laub 98]
Gene to regulation: ~10% [Hughes 00]
Regulated gene to protein sequence: 98% [Gesteland ]
Sequence to secondary structure (α, β, c): 77% [CASP]
Secondary structure to 3D structure: 25% [CASP]
3D structure to ligand specificity: ~10% [Johnson 99]
Expected accuracy overall ≈ 0.8 × 0.3 × 0.1 × 0.98 × 0.77 × 0.25 × 0.1 ≈ 0.0005 ?
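The overall figure on this slide is the product of the per-step accuracies, which treats the steps as independent. A short Python sketch of that arithmetic, with step names abbreviated from the slide:

```python
import math

# Per-step accuracies as quoted on the slide (approximate values taken as exact).
STEP_ACCURACY = [
    ("sequence to exon", 0.80),
    ("exons to gene", 0.30),
    ("gene to regulation", 0.10),
    ("regulated gene to protein sequence", 0.98),
    ("sequence to secondary structure", 0.77),
    ("secondary structure to 3D structure", 0.25),
    ("3D structure to ligand specificity", 0.10),
]

# Chaining the steps multiplies their accuracies (independence assumed).
overall = math.prod(accuracy for _, accuracy in STEP_ACCURACY)

if __name__ == "__main__":
    print(f"{overall:.6f}")  # 0.000453
```

The product is about 0.00045, which the slide rounds to 0.0005. The three steps at roughly 10% to 30% dominate the result.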

38 Our Focus in Bioinformatics Perturbation Environment Medication Genetic Engineering Dynamic Response Gene Expression Protein Expression BioChip DataBase Genotype/Phenotype Symbolic Algorithms/ Computing Analysis Biology Molecular Biology Bio Chemistry Genetics Virtual Cell Genome Sequencing

