Watermarks.  Four sequences, 1000 bp each  Inserted into noncoding regions of genome  Translated into English using secret triplet nucleotide to character.

Watermarks

• Four sequences, 1000 bp each
• Inserted into noncoding regions of the genome
• Translated into English using a secret triplet-nucleotide-to-character code
• Names of scientists
• "To live, to err, to fall, to triumph, to recreate life out of life."
• "See things not as they are, but as they might be."
• "What I cannot build, I cannot understand."
• Address to send decoded sequences
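A triplet-to-character code of this kind can be sketched as follows. The real watermark table is secret, so the 64-character alphabet and the codon ordering below are purely hypothetical; only the idea (one character per three nucleotides, so roughly 333 characters per 1000 bp sequence) comes from the slide.

```python
from itertools import product
import string

# HYPOTHETICAL table: the real watermark code is secret. Here the 64 codons,
# in alphabetical order, are paired with 64 arbitrarily chosen characters.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + " .,'\"-:;@!?("
CODONS = ["".join(p) for p in product("ACGT", repeat=3)]
CODON_TO_CHAR = dict(zip(CODONS, ALPHABET))
CHAR_TO_CODON = {ch: codon for codon, ch in CODON_TO_CHAR.items()}


def encode_watermark(text: str) -> str:
    """Write each character as its three-nucleotide codon."""
    return "".join(CHAR_TO_CODON[ch] for ch in text)


def decode_watermark(dna: str) -> str:
    """Read the sequence three nucleotides at a time and look up each codon."""
    if len(dna) % 3 != 0:
        raise ValueError("watermark length must be a multiple of 3")
    return "".join(CODON_TO_CHAR[dna[i:i + 3]] for i in range(0, len(dna), 3))


quote = "What I cannot build, I cannot understand."
dna = encode_watermark(quote)
print(len(dna), decode_watermark(dna) == quote)
```

Without the table, a reader sees only noncoding sequence; with it, decoding is a simple lookup.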

• Each gene >500 bp was given a PCR tag
  - Use the GeneDesign program to recode a portion of the gene to maximize the difference from the original sequence (avoid the first 100 bases of each gene)
  - At least 33% of nucleotides recoded (target tags to regions where amino acids can vary at >1 nucleotide)
  - First and last nucleotides correspond to a variable position
  - Melting temperature between 58 and 60 °C
  - Amplifies a [...] bp fragment
  - Primers must not amplify any other genome sequence <1000 nucleotides
• 5-10% error rate
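Two of the tag criteria above (same amino acids, at least 33% of nucleotides changed) are easy to check in code. This is a minimal sketch, not GeneDesign itself: the function names and the example sequences are made up for illustration, and the melting-temperature and specificity checks are left out.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons; a trailing partial codon is ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def recoded_fraction(original: str, recoded: str) -> float:
    """Fraction of positions at which the two sequences differ."""
    if len(original) != len(recoded):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(original, recoded)) / len(original)


def is_valid_tag(original: str, recoded: str, min_fraction: float = 0.33) -> bool:
    """Hypothetical check: protein unchanged and enough nucleotides recoded."""
    return (translate(original) == translate(recoded)
            and recoded_fraction(original, recoded) >= min_fraction)


# Made-up example: both sequences encode Ser-Leu-Arg-Ser.
original = "AGCCTGAGATCT"
recoded = "TCTTTACGTAGC"
print(translate(original), translate(recoded), is_valid_tag(original, recoded))
```

Serine, leucine and arginine each have six codons, which is why such regions can vary at more than one nucleotide per codon.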

• Create a codon usage table and convert it to binary
• Convert the watermark from English to binary
• Change the codons of your gene so that the binary watermark is encoded in the DNA (this will change the rankings of your codons)
• This method takes into account the frequency of the different codons, which will vary for each species
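The steps above can be sketched in a much simplified form. The assumptions here are mine, not the paper's: each codon whose amino acid has at least two synonymous codons carries one bit (0 = most frequent synonym, 1 = second most frequent), and the usage table is a fixed reference, which sidesteps the ranking shift the slide mentions. The usage counts are invented.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


def text_to_bits(text: str) -> str:
    """English to binary: 8 bits per ASCII character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def ranked_synonyms(usage: dict) -> dict:
    """For each amino acid, its codons sorted from most to least used."""
    by_aa = {}
    for codon, aa in CODON_TABLE.items():
        by_aa.setdefault(aa, []).append(codon)
    return {aa: sorted(cs, key=lambda c: (-usage.get(c, 0), c)) for aa, cs in by_aa.items()}


def embed(gene: str, bits: str, usage: dict) -> str:
    """Rewrite codons so that each carrier codon encodes one watermark bit."""
    ranks, queue, out = ranked_synonyms(usage), list(bits), []
    for i in range(0, len(gene) - len(gene) % 3, 3):
        codon = gene[i:i + 3]
        choices = ranks[CODON_TABLE[codon]]
        if len(choices) >= 2 and queue:
            codon = choices[int(queue.pop(0))]
        out.append(codon)
    if queue:
        raise ValueError("gene too short for this watermark")
    return "".join(out)


def extract(gene: str, n_bits: int, usage: dict) -> str:
    """Read the bits back from the rank of each carrier codon."""
    ranks, bits = ranked_synonyms(usage), []
    for i in range(0, len(gene) - len(gene) % 3, 3):
        if len(bits) == n_bits:
            break
        codon = gene[i:i + 3]
        choices = ranks[CODON_TABLE[codon]]
        if len(choices) >= 2:
            bits.append(str(choices.index(codon)))
    return "".join(bits)


# Invented usage counts and a made-up gene (Met-Ala-Leu-Ser-Gly-Lys).
usage = {"GCC": 30, "GCT": 20, "CTG": 50, "CTC": 20, "AGC": 25,
         "TCC": 22, "GGC": 35, "GGA": 25, "AAG": 40, "AAA": 30}
gene = "ATGGCTCTGAGCGGTAAA"
marked = embed(gene, "10110", usage)
print(marked, extract(marked, 5, usage), translate(marked) == translate(gene))
```

Because the usage table differs between species, the same bit string produces a different marked sequence in each host.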

NONCODING REGIONS
• Assign a 2-bit sequence to each base
• Avoids introducing cryptic start codons (ATG, CTG, TTG) or their reverse complements (CAT, CAG, CAA)
• Examines the dinucleotides AT, CT, TT, CA and restricts the subsequent dinucleotide

PROTEIN-CODING REGIONS
• Like the previous paper, changes the codons but retains the amino acid sequence
• Not only takes the frequency of codons into account, but also preserves the count of each codon (if a codon is used X times in the gene, then once the recoded gene has used it X times, that codon can no longer be used)
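Both halves of this slide can be illustrated with a short sketch. The 2-bit assignment (00=A, 01=C, 10=G, 11=T) is an arbitrary assumption, and the code only detects cryptic start codons rather than restricting the next dinucleotide as the method does. The last function checks the codon-count rule for coding regions.

```python
from collections import Counter

# ASSUMED mapping; any one-to-one assignment of 2-bit values to bases works.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

# Start codons and their reverse complements, as listed on the slide.
CRYPTIC_STARTS = {"ATG", "CTG", "TTG", "CAT", "CAG", "CAA"}


def bits_to_dna(bits: str) -> str:
    """Encode a bit string in a noncoding region, two bits per base."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_to_bits(dna: str) -> str:
    return "".join(BASE_TO_BITS[base] for base in dna)


def find_cryptic_starts(dna: str) -> list:
    """Return (position, triplet) for every forbidden triplet in any frame."""
    return [(i, dna[i:i + 3]) for i in range(len(dna) - 2)
            if dna[i:i + 3] in CRYPTIC_STARTS]


def same_codon_counts(original: str, recoded: str) -> bool:
    """Coding-region rule: each codon is used the same number of times."""
    def codons(seq):
        return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    return Counter(codons(original)) == Counter(codons(recoded))


dna = bits_to_dna("00111001")
print(dna, find_cryptic_starts(dna), dna_to_bits(dna))
```

A naive encoder like this one can produce ATG by accident, which is the situation the dinucleotide restriction is meant to prevent.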

N Goldman et al. Nature 000, 1-4 (2013) doi: /nature11875

The five files comprised all 154 of Shakespeare's sonnets (ASCII text), a classic scientific paper [18] (PDF format), a medium-resolution colour photograph of the European Bioinformatics Institute (JPEG 2000 format), a 26-s excerpt from Martin Luther King's 1963 'I have a dream' speech (MP3 format) and a Huffman code [10] used in this study to convert bytes to base-3 digits (ASCII text), giving a total of 757,051 bytes or a Shannon information [10] of 5.2 × 10^6 bits.
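A Huffman code that converts bytes to base-3 digits can be sketched as below. This is a generic ternary Huffman construction built from the byte frequencies of the input, not the specific code table used in the study; the function names are illustrative.

```python
import heapq
from collections import Counter
from itertools import count


def ternary_huffman(data: bytes) -> dict[int, str]:
    """Map each byte value to a prefix-free string of the digits 0, 1, 2."""
    freq = Counter(data)
    if len(freq) == 1:                      # degenerate case: one symbol
        return {next(iter(freq)): "0"}
    tie = count()                           # tie-breaker so dicts are never compared
    heap = [(weight, next(tie), {sym: ""}) for sym, weight in freq.items()]
    # A full ternary tree needs an odd number of leaves; pad with a dummy.
    while (len(heap) - 1) % 2 != 0:
        heap.append((0, next(tie), {}))
    heapq.heapify(heap)
    while len(heap) > 1:
        merged, total = {}, 0
        for digit in "012":                 # merge the three lightest subtrees
            weight, _, codes = heapq.heappop(heap)
            total += weight
            for sym, code in codes.items():
                merged[sym] = digit + code
        heapq.heappush(heap, (total, next(tie), merged))
    return heap[0][2]


def encode_trits(data: bytes, codes: dict[int, str]) -> str:
    return "".join(codes[byte] for byte in data)


def decode_trits(trits: str, codes: dict[int, str]) -> bytes:
    inverse = {code: sym for sym, code in codes.items()}
    out, current = bytearray(), ""
    for trit in trits:
        current += trit
        if current in inverse:              # prefix-free, so first match is right
            out.append(inverse[current])
            current = ""
    return bytes(out)


data = b"to be or not to be"
codes = ternary_huffman(data)
trits = encode_trits(data, codes)
print(len(trits), decode_trits(trits, codes) == data)
```

For scale, 757,051 bytes is 6,056,408 bits of raw storage, somewhat more than the 5.2 × 10^6 bits of Shannon information quoted in the caption.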