Data-intensive Computing: Case Study Area 1: Bioinformatics. B. Ramamurthy, 6/17/2015



Human genetics, genomics, Human Genome Project, proteomics, diseasome, Tree of Life project, phylogenetics

Human cell
- Base pairs of DNA: C-G and A-T (C: cytosine, G: guanine, A: adenine, T: thymine).
- Each human cell contains approximately 3 billion base pairs.
- The DNA of a single cell contains so much information that, if it were printed, simply listing the first letter of each base would require over 1.5 million pages of text.
- If laid end to end, the DNA strand measures about 2 to 3 meters.
- DNA is a large molecule located in the nucleus of the cell.
- It is coiled into a double helix.
- Each strand of the DNA molecule is made of A, C, G and T. Example: AAAGTTCTTAATTA, which is matched on the other strand by the complementary bases TTTCAAGAATTAAT.
- These strings of letters contain …
- Ref: Bioinformatics: Databases, Tools and Algorithms, by O. Bosu and S. K. Thukral
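
The base-pairing rule on this slide (A with T, C with G) can be sketched in a few lines of Python. This is only an illustration: the names `PAIRS` and `complement` are hypothetical, and the sketch ignores strand direction.

```python
# Hypothetical helper names; only the pairing rule (A-T, C-G) comes from the slide.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


if __name__ == "__main__":
    # The slide's example strand and its matching strand.
    print(complement("AAAGTTCTTAATTA"))  # TTTCAAGAATTAAT
```

Applying `complement` twice returns the original strand, which is a quick sanity check on the pairing table.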

More details
- Sequences of base pairs are grouped into meaningful units: genes.
- When a gene needs to be activated, the DNA molecule in the cell nucleus uncoils and unfurls just enough to expose that gene.
- From the exposed DNA an RNA is formed: mRNA, or messenger RNA, which carries with it the “print” of the open DNA section. (Map process?)
- In its base alphabet, RNA differs from DNA in one respect: RNA contains uracil (U) in place of thymine (T).
- RNA is short-lived (like intermediate data in MapReduce).
- Once the mRNA is formed, the open sections of the DNA close off.
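
As a rough sketch of the "print" idea, the mRNA can be modeled as a string derived from the exposed DNA. The function names below are hypothetical, and the sketch makes two simplifying assumptions: strand direction is ignored, and the whole input is treated as the exposed gene.

```python
# Assumption: the mRNA is the complement of the DNA strand it is copied from
# (the template), written with U wherever DNA would have T.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe_from_template(template: str) -> str:
    """Build the mRNA that pairs base-by-base with the template strand."""
    return "".join(DNA_TO_RNA[base] for base in template.upper())


def transcribe_from_coding(coding: str) -> str:
    """Equivalent shortcut: copy the opposite (coding) strand, swapping T for U."""
    return coding.upper().replace("T", "U")


if __name__ == "__main__":
    print(transcribe_from_template("TTTCAAGAATTAAT"))
    print(transcribe_from_coding("AAAGTTCTTAATTA"))
```

Both calls yield the same mRNA string, because the two input strands are complements of each other.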

Protein formation
- mRNA travels to the cytoplasm, where it meets the ribosome (rRNA).
- The ribosome reads the code in the mRNA codon by codon (a codon is a group of three bases) and selects the corresponding amino acid for each.
- Twenty amino acids are prevalent in human cells. Example: the codons GCU, GCC and GCA all correspond to alanine.
- In effect, the ribosome is a process-control computer that takes codons as input and produces amino acids as output.
- Amino acids polymerize to form polypeptide chains called proteins.
- Proteins fold and form basic structures such as skin and hair.
- Even though the brain controls major human functions, at the cell level it is the DNA that has command and control.
- DNA is fixed code for a given human (WORM, write-once-read-many, characteristics).
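
The "ribosome as a process-control computer" analogy can be sketched as a table lookup over codons. The table below is deliberately partial: it holds only the alanine codons (the slide's example plus GCG), the start codon AUG and the three stop codons. The names `CODON_TABLE` and `translate` are hypothetical.

```python
# Partial codon table for illustration only; a real table has 64 entries.
CODON_TABLE = {
    "GCU": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",  # alanine
    "AUG": "Met",                                             # methionine (start)
    "UAA": None, "UAG": None, "UGA": None,                    # stop codons
}


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time and emit one amino acid per codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon not in CODON_TABLE:
            raise KeyError(f"codon {codon!r} is not in this partial table")
        amino_acid = CODON_TABLE[codon]
        if amino_acid is None:  # a stop codon ends the chain
            break
        peptide.append(amino_acid)
    return peptide


if __name__ == "__main__":
    print(translate("AUGGCUGCCGCAUAA"))  # ['Met', 'Ala', 'Ala', 'Ala']
```

Codons outside the partial table raise an error rather than being silently skipped, so the sketch never guesses an amino acid.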

Life’s processes
- DNA is the “program” that controls the functions, operations and structure of a cell and, in turn, our life processes.
- Life processes in fact depend on the program in the DNA and on the hundreds of millions of ribosomes.
- Life in this context appears as an immense distributed system.

Bioinformatics
- Can we study, understand and analyze this immensely complex system, its structure and its programs?
- University of Arizona’s Tree of Life project (ToL)
- Human Genome Project (NIH and DOE): identifying the approximately 30,000 genes in human DNA and determining the sequence of the three billion bases that make up human DNA.
- We do not know the functions of more than 50% of these genes.
- 99.9% of the nucleotide sequence is the same for all of us; the remaining 0.1% accounts for individual differences such as race, skin color and predisposition to diseases.
- High-throughput sequencing is generating ultra-scale biological data. How do we analyze this data? That is a data-intensive problem.
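
A back-of-the-envelope calculation makes these figures concrete. The three billion bases and the 0.1% figure come from the slide. The 2-bits-per-base encoding is an assumption added here for illustration (four letters fit in two bits), and the function name is hypothetical.

```python
GENOME_BASES = 3_000_000_000  # from the slide: about three billion bases
BITS_PER_BASE = 2             # assumption: A, C, G, T packed into 2 bits each


def genome_estimates(bases: int = GENOME_BASES) -> dict[str, int]:
    """Estimate the per-person variable bases and the packed genome size."""
    return {
        # 0.1% of the sequence differs between individuals (1 base in 1000).
        "variable_bases": bases // 1000,
        # Packed size in bytes under the assumed 2-bit encoding.
        "packed_bytes": bases * BITS_PER_BASE // 8,
    }


if __name__ == "__main__":
    for name, value in genome_estimates().items():
        print(f"{name}: {value:,}")
```

Under these assumptions, one genome packs into roughly 750 MB and the individual differences span about 3 million bases. Raw sequencing output is far larger than the packed genome, since it carries many overlapping reads plus quality data.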

Existing solutions?
- Traditional databases: store, retrieve, analyze and/or predict huge biological data.
- Software tools for implementing algorithms and developing applications for in-silico experiments.
- Visualization tools, user interfaces and web accessibility for searching through data.
- Machine learning and data mining methodologies.

Databases
- Taxonomy DB
- Genomics
- Sequence DB
- Structure DB
- Proteomic database (PDB)
- Micro-array DB
- Expression DB
- Enzyme DB
- Disease DB
- Molecular biology DB

Tools
- Data analysis tools
  - MySQL
  - Perl
- Prediction tools
  - Clustering
- Modeling tools
  - Surface prediction, predicting area of interest, protein-protein interaction
- Alignment tools

How can we help?
- How can we leverage our knowledge of large-scale data management to address bioinformatics problems? DC methods.
- Large number of tools and data: how do we standardize the efforts so that they are complementary rather than repetitive?
- Cloud computing.

Text Mining vs Genetic Sequence Mining (Dot plot)
[Figure: dot plots comparing the words CORRELATIONS and RELATIONSHIP, and the sequences ACTCTAGGAGTC and GATAATTCGATC]
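
A dot plot puts one sequence along each axis and marks every cell where the two symbols match. Shared substrings then show up as diagonal runs of marks. The sketch below is a minimal version with hypothetical function names. It works the same way on words and on DNA strings, which is the point of the text-versus-sequence comparison.

```python
def dot_plot(a: str, b: str) -> list[list[bool]]:
    """Return a matrix m where m[i][j] is True when a[i] == b[j]."""
    return [[x == y for y in b] for x in a]


def longest_diagonal(a: str, b: str) -> int:
    """Length of the longest diagonal run of matches (longest shared substring)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1  # extend the run from the upper-left cell
                best = max(best, cur[j])
        prev = cur
    return best


def render(a: str, b: str) -> str:
    """Draw the plot as text: '*' for a match, '.' otherwise."""
    rows = ["  " + " ".join(b)]
    for x, row in zip(a, dot_plot(a, b)):
        rows.append(x + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(rows)


if __name__ == "__main__":
    print(render("CORRELATIONS", "RELATIONSHIP"))
    print(longest_diagonal("CORRELATIONS", "RELATIONSHIP"))  # 9, for "RELATIONS"
    print(render("ACTCTAGGAGTC", "GATAATTCGATC"))
```

The plot costs time and memory proportional to the product of the two lengths, which is one reason genome-scale comparison becomes a data-intensive problem.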