Machine Learning & Bioinformatics 1 Tien-Hao Chang (Darby Chang)


1 Machine Learning & Bioinformatics
Tien-Hao Chang (Darby Chang)

2 Molecular biology
Nucleic acid
–DNA
–RNA
Central dogma
–Transcription
–Translation
Protein
–Amino acid
–Primary structure
–Secondary structure
–Tertiary structure

3 Nucleic acid
A nucleic acid is a macromolecule composed of chains of nucleotide monomers.
In biochemistry, these molecules carry genetic information or form structures within cells.
The most common nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA).


5 Nucleic acid components: Sugar

6 Nucleic acid components: Base
Purine
–Adenine (A) and guanine (G)
Pyrimidine
–Thymine (T), cytosine (C)
–Uracil (U, only in RNA)



9 DNA
Chemically, DNA is a long polymer of simple units called nucleotides, with a backbone made of sugars and phosphate groups joined by phosphodiester bonds.
Attached to each sugar is one of four types of molecules called bases.
It is the sequence of these four bases along the backbone that encodes information.

10 DNA: Base pairing
Each type of base on one strand forms a bond with just one type of base on the other strand.
Purines form hydrogen bonds with pyrimidines: A bonds only with T, and C bonds only with G.
DNA sequence
–5’CpGpCpApApTpT 3’TpTpApApCpGpC
–CGCGAATT
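The pairing rule above (A with T, C with G) can be sketched in a few lines of Python. This is only an illustration: the function names `complement` and `reverse_complement` are hypothetical and not from the slides, and the input is assumed to contain only the letters A, C, G, and T.

```python
# Watson-Crick pairing rules stated on the slide: A pairs with T, C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, written in the same left-to-right order."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own 5'->3' direction."""
    return complement(strand)[::-1]


example = "CGCGAATT"  # the short sequence shown on the slide
print(complement(example))
print(reverse_complement(example))
```

Reversing the complement reflects the fact that the two strands of a duplex run in opposite directions, so the partner strand is conventionally written 5' to 3' as well.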



13 Hydrogen bond
A hydrogen bond exists between an electronegative atom and a hydrogen atom bonded to another electronegative atom.
This type of force always involves a hydrogen atom, hence the name hydrogen bonding.
The strongest hydrogen bonds approach the energy of weak covalent bonds (about 155 kJ/mol), but most are considerably weaker.
Biological functions
–DNA/RNA base pairing
–protein secondary/tertiary structure formation
–some properties of water
–antibody-antigen (and other protein-protein) binding



16 DNA structure
k5iS1f0&NR=1

17 Any questions about DNA?


19 Central dogma
The process by which information is extracted from the nucleotide sequence of a gene and then used to make a protein is essentially the same for all living things on Earth. It is described by the grandly named central dogma of molecular biology.
Information in cells passes from DNA to RNA to proteins.

20 RNA
Information stored in DNA is used to make a more transient, single-stranded polynucleotide called RNA (ribonucleic acid).
RNA is very similar to DNA, but differs in a few important structural details
–in the cell, RNA is usually single-stranded, while DNA is usually double-stranded
–RNA nucleotides contain ribose, while DNA nucleotides contain deoxyribose (a type of ribose that lacks one oxygen atom)
–in RNA, the base uracil takes the place of thymine, which is present in DNA


22 Central dogma: Transcription
Transcription is the synthesis of RNA under the direction of DNA.
Both nucleic acid sequences use the same language, and the information is simply transcribed, or copied.
The DNA sequence is copied by RNA polymerase to produce a complementary RNA strand, called messenger RNA (mRNA).
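As a rough sketch of the copying step described above, the Python below builds an mRNA string from a DNA string. The function names are hypothetical, and the model is deliberately simplified: it ignores promoters, RNA processing, and everything else RNA polymerase actually needs. The template strand is assumed to be written 3' to 5', so the resulting mRNA reads 5' to 3'.

```python
# Pairing used when RNA is synthesized on a DNA template:
# uracil (U) is incorporated opposite adenine (A) instead of thymine (T).
DNA_TO_RNA_PAIRS = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_template(template_3_to_5: str) -> str:
    """Build the mRNA complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA_PAIRS[base] for base in template_3_to_5.upper())


def transcribe_coding(coding_5_to_3: str) -> str:
    """Shortcut: the mRNA matches the coding strand with T replaced by U."""
    return coding_5_to_3.upper().replace("T", "U")


print(transcribe_template("TACGGT"))
print(transcribe_coding("ATGCCA"))
```

Both calls describe the same short duplex from its two strands, so they produce the same mRNA.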

23 DNA transcription
Z3DsntU

24 Transcription detail
http://www- imation/m_animations/gene2.swf

25 RNA: Various types
mRNA
–messenger RNA (mRNA) is the RNA that carries information from DNA to the ribosome
–the coding sequence of the mRNA determines the amino acid sequence of the protein that is produced
Non-coding RNA

26 Various RNA types: Non-coding RNA
Many RNAs do not code for protein.
These ncRNAs are encoded by their own genes (RNA genes) or derived from mRNA introns.
The most common ncRNAs are transfer RNA (tRNA) and ribosomal RNA (rRNA).
Other ncRNAs, such as microRNA (miRNA), are involved in post-transcriptional gene regulation.


28 Central dogma: Translation
Translation is the second stage of protein biosynthesis.
Translation occurs in the cytoplasm, where the ribosomes are located.
In translation, mRNA is decoded to produce a specific polypeptide according to the rules specified by the genetic code.
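Decoding by the genetic code can be illustrated with a small Python sketch. The names `CODON_TABLE` and `translate` are hypothetical. The table is the standard genetic code in one-letter amino acid symbols, with `*` marking stop codons. The sketch assumes translation starts at the first AUG and stops at the first in-frame stop codon, which leaves out much of what a real ribosome does.

```python
from itertools import product

# Standard genetic code, listed with bases in the order U, C, A, G
# for the first, second, and third codon positions. "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon found
    residues = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


print(translate("AUGGCCAAGUAA"))
```

The example mRNA reads as the codons AUG, GCC, AAG, UAA, that is, methionine, alanine, lysine, and then a stop.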

29 From RNA to protein synthesis
gkPEAo

30 Protein translation
lonmA0


32 Any questions about the central dogma?

33 Protein

34 Protein
Proteins are large organic compounds made of amino acids arranged in a linear chain and joined together by peptide bonds between the carboxyl and amino groups of adjacent amino acid residues.
Proteins can also work together to achieve a particular function, and they often associate to form stable complexes.

35 Protein: Amino acid
In chemistry, an amino acid is a molecule that contains both amine and carboxyl functional groups.
In biochemistry, this term refers to alpha-amino acids with the general formula H2NCHRCOOH, where R is an organic substituent.


37 Amino acid: Various side chains
The various alpha-amino acids differ in which side chain (R group) is attached to their alpha carbon.
Side chains vary in size from just a hydrogen atom in glycine, through a methyl group in alanine, to a large heterocyclic group in tryptophan.





42 Amino acid: The building blocks of proteins
Amino acids combine in a condensation reaction, and the resulting amino acid residues are held together by peptide bonds.
Proteins are defined by their unique sequence of residues (primary structure).
Just as letters form a variety of words, amino acids form a vast variety of sequences/proteins.
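The letters-and-words analogy can be made concrete with a short count. The sketch below assumes an alphabet of the 20 standard amino acids (a number the slide does not state) and uses a hypothetical helper, `count_sequences`, to show how quickly the number of possible sequences grows with chain length.

```python
# Assumption for illustration: an alphabet of the 20 standard amino acids.
NUM_STANDARD_AMINO_ACIDS = 20


def count_sequences(length: int) -> int:
    """Number of distinct residue sequences of the given length."""
    return NUM_STANDARD_AMINO_ACIDS ** length


for length in (2, 10, 100):
    print(length, count_sequences(length))
```

Each position can hold any of the 20 residues independently, so the count is 20 raised to the chain length.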




46 Protein: After knowing amino acids
Amino acids form short polymer chains called peptides, or longer chains called either polypeptides or proteins.
The formation of such chains from an mRNA template (obeying the genetic code) is known as translation, which is part of protein biosynthesis.

47 Protein structure hierarchy





52 Protein structure hierarchy: Secondary structure
In biochemistry and structural biology, secondary structure is the general three-dimensional form of local segments of biopolymers such as proteins and nucleic acids.
It does not, however, describe specific atomic positions in three-dimensional space, which are considered to be tertiary structure.


54 Protein structure hierarchy: Tertiary structure
The three-dimensional structure of a protein or any other macromolecule, as defined by the atomic coordinates.
It describes the spatial relations among the protein's secondary structures.
Tertiary structure is considered to be largely determined by the protein’s primary sequence.

55 Protein tertiary structure: Experimental techniques
The majority of protein structures have been solved with X-ray crystallography.
The second most common method is NMR (nuclear magnetic resonance)
–lower resolution
–limited to small proteins
–provides time-dependent information in solution


57 Protein structure hierarchy: Quaternary structure
Many proteins are actually assemblies of more than one polypeptide chain, which in the context of the larger assemblage are known as protein subunits.
In addition to the tertiary structure of the subunits, multiple-subunit proteins possess a quaternary structure, which is the arrangement into which the subunits assemble.

58 Protein sub-structure

59 Protein sub-structure: Domain
A part of a protein's sequence and structure that can evolve, function, and exist independently.
About 25–500 amino acids (aa) long.
Domains often form functional units.


61 Protein sub-structure: Motif
A sequence motif is a nucleotide or amino-acid sequence pattern that is widespread and often has biological significance.
For proteins, a sequence motif is distinguished from a structural motif, which is formed by the three-dimensional arrangement of amino acids that may not be adjacent in the sequence.
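A sequence motif of this kind is often written as a pattern and searched for with a regular expression. The sketch below uses the N-glycosylation site pattern N-{P}-[ST]-{P} (PROSITE entry PS00001) purely as an example; it is not taken from the slides, and the helper name `find_motif` and the test sequence are hypothetical.

```python
import re

# N-glycosylation site pattern N-{P}-[ST]-{P}: asparagine, any residue except
# proline, serine or threonine, any residue except proline.
# The lookahead (?=...) lets overlapping occurrences be reported as well.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_motif(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched residues) for every motif occurrence."""
    return [
        (match.start() + 1, match.group(1))
        for match in N_GLYCOSYLATION.finditer(sequence.upper())
    ]


print(find_motif("MKNGSAANPTL"))
```

In the example sequence the stretch NGSA matches, while NPTL does not because proline follows the asparagine.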

62 Protein sub-structure: Structural motif
A 3D structural element or fold that also appears in a variety of other molecules.
In the context of proteins, the term is sometimes used interchangeably with “structural domain,” although a domain need not be a motif, and a domain that contains a motif need not be made up of only one.




66 Molecular biology: Reference
Website of Prof. 莊榮輝 (National Taiwan University) –
Molecular biology website of National Chiao Tung University –

67 Any questions about molecular biology?
