Vladimir V. Ufimtsev Adviser: Dr. V. Rykov A Mathematical Theory of Communication C.E. Shannon Main result: Entropy function - average value of information.


1

2 Vladimir V. Ufimtsev Adviser: Dr. V. Rykov

3 A Mathematical Theory of Communication (C.E. Shannon). Main result: entropy function - average value of information obtained from a channel.
Error Detecting and Error Correcting Codes (R.W. Hamming). Main result: matrices that can be used to encode messages and provide more reliable transmission across a channel.
A Structure for Deoxyribose Nucleic Acid (J.D. Watson, F.H.C. Crick, M.H.F. Wilkins, R.E. Franklin). Main result: structure found for the building block of life.
There's Plenty of Room at the Bottom (R.P. Feynman). Main result: anticipated science at the nanoscale ($10^{-9}$ meters).

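4 Let $A_q^n$ denote the set of all vectors (codewords) of length n built over the alphabet $A_q$, i.e. $A_q = \{0, 1, \dots, q-1\}$. Let $d : A_q^n \times A_q^n \to \mathbb{R}$ be such that, for all x, y, z: 1) $d(x, y) \ge 0$, with equality if and only if $x = y$; 2) $d(x, y) = d(y, x)$; 3) $d(x, z) \le d(x, y) + d(y, z)$. Let $C \subseteq A_q^n$ be such that $|C| = M$ and $d(x, y) \ge d$ for all distinct $x, y \in C$. Then C is referred to as a code of length n, size M, and minimum distance d.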
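As a small illustration of the three code parameters, the sketch below computes (n, M, d) for a toy code. It assumes the Hamming distance as the metric purely for simplicity; the function names and the example code are hypothetical and are not taken from the slides.

```python
from itertools import combinations


def hamming(x, y):
    """Number of positions in which two equal-length words differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


def code_parameters(code, dist=hamming):
    """Return (n, M, d): length, size and minimum pairwise distance of a code.

    The code must contain at least two words, all of the same length.
    """
    words = list(code)
    n = len(words[0])
    M = len(words)
    d = min(dist(x, y) for x, y in combinations(words, 2))
    return n, M, d


# Hypothetical toy code over the DNA alphabet {A, C, G, T}.
example = ["AAAA", "CCCC", "GGGG", "TTTT", "ACGT"]
print(code_parameters(example))
```

Any other function satisfying the three metric conditions can be passed as `dist` without changing the rest of the sketch.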

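5 A sphere centered at x having radius d is the set of all vectors y of the space with $d(x, y) \le d$: $S_d(x) = \{ y : d(x, y) \le d \}$. Volume of the sphere around x, of radius d: $V_d(x) = |S_d(x)|$. A space is HOMOGENEOUS when the volume of a sphere does not depend on where it is centered, i.e. $V_d(x) = V_d(y)$ for all x and y. A space is NON-HOMOGENEOUS when the volume of a sphere does depend on where it is centered.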
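The difference between a homogeneous and a non-homogeneous space can be checked by brute force on a very small example. The sketch below is only an illustration: it assumes the 4-letter DNA alphabet, words of length 3, and compares the Hamming distance with the distance n minus the length of a longest common subsequence. All function names are hypothetical.

```python
from itertools import product


def hamming(x, y):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def lcs_length(x, y):
    """Length of a longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_distance(x, y):
    """Assumed distance for equal-length words: n minus the LCS length."""
    return len(x) - lcs_length(x, y)


def sphere_volumes(alphabet, n, radius, dist):
    """Map every word x of length n to the number of words y with dist(x, y) <= radius."""
    words = ["".join(w) for w in product(alphabet, repeat=n)]
    return {x: sum(dist(x, y) <= radius for y in words) for x in words}


def is_homogeneous(volumes):
    """True when every sphere has the same volume."""
    return len(set(volumes.values())) == 1


ham = sphere_volumes("ACGT", 3, 1, hamming)
lcs = sphere_volumes("ACGT", 3, 1, lcs_distance)
print(is_homogeneous(ham), is_homogeneous(lcs))
print(lcs["AAA"], lcs["ACG"])
```

Under the Hamming distance every sphere has the same volume, while under the LCS-based distance the volume changes with the center, which is the situation the later slides deal with.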

6 For any code there are 3 conflicting parameters:
Length: n
Size: M
Minimum distance: d
The aim of coding theory is: given any 2 parameters, find the optimal value for the third. We need small n for fast transmission, large M so that as much information as possible can be encoded, and large d so that we can detect and correct many errors.

7 Exact formulas for sphere volumes and code sizes are often extremely difficult to obtain. In most cases only upper and lower bounds can be obtained for these parameters. We will be working in a NON-HOMOGENEOUS space, which makes exact formulas for sphere volumes and code sizes VERY HARD to obtain. Hamming upper bound on code size, with any metric: [formula not preserved] Varshamov-Gilbert lower bound on code size, with any metric: [formula not preserved]

8 CLIQUES: Let G be a simple graph on [...] vertices and e edges. G contains an M-clique if: [condition not preserved]

9 If: [condition not preserved] Then there exists a code of size M.

10 Let [...] Then: [formula not preserved] Hence there exists a code of size M, and so: [formula not preserved]
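The clique argument can be made concrete on a tiny space: join two words by an edge when their distance is at least d, so that any clique is a code with minimum distance d. The sketch below is an illustration under stated assumptions, not the construction from the slides. It uses the Hamming distance on words of length 4, builds a clique greedily, and compares its size with the space size divided by the average sphere volume of radius d - 1, which is one commonly stated form of the generalized Varshamov-Gilbert bound. All names are hypothetical.

```python
from itertools import combinations, product


def hamming(x, y):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(x, y))


def greedy_code(words, d, dist):
    """Scan the words in order and keep each one that is at distance >= d
    from every word kept so far. The result is a clique in the graph whose
    edges join words at distance >= d, i.e. a code with minimum distance d."""
    code = []
    for w in words:
        if all(dist(w, c) >= d for c in code):
            code.append(w)
    return code


def average_sphere_volume(words, radius, dist):
    """Average over all centers x of the number of words within the given radius."""
    total = sum(dist(x, y) <= radius for x in words for y in words)
    return total / len(words)


words = ["".join(w) for w in product("ACGT", repeat=4)]
d = 3
code = greedy_code(words, d, hamming)
# Assumed form of the bound: space size / average volume of a sphere of radius d - 1.
bound = len(words) / average_sphere_volume(words, d - 1, hamming)
print(len(code), bound)
```

A greedy scan is only guaranteed to reach the space size divided by the largest sphere volume; in this homogeneous example the largest and the average volume coincide, so the two guarantees agree.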

11 The rules of base pairing (nucleotide pairing): A - T: adenine (A) always pairs with thymine (T). C - G: cytosine (C) always pairs with guanine (G).

12 Each base has a bonding surface. The bonding surface of A is complementary to that of T (2 bonds). The bonding surface of G is complementary to that of C (3 bonds). Hybridization is a process that joins two complementary, opposite-polarity single strands into a double strand through hydrogen bonds.

13 Orientation of single DNA strands is important for hybridization.
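The pairing rules and the role of orientation can be sketched in a few lines. The example assumes that both strands are written in the same reading direction (conventionally 5' to 3'), so that a perfect partner of a strand is its reverse complement. The function names are hypothetical.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hydrogen bonds per base pair: 2 for A-T, 3 for C-G.
HYDROGEN_BONDS = {"A": 2, "T": 2, "C": 3, "G": 3}


def reverse_complement(strand):
    """Complement every base and reverse the order, because the two
    strands of a duplex run in opposite directions."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def hybridizes_perfectly(strand_a, strand_b):
    """True when strand_b is the exact reverse complement of strand_a.
    Both strands are assumed to be written in the same reading direction."""
    return strand_b == reverse_complement(strand_a)


def hydrogen_bonds(strand):
    """Hydrogen bonds in the perfect duplex of a strand with its reverse complement."""
    return sum(HYDROGEN_BONDS[base] for base in strand)


print(reverse_complement("ATCG"))
print(hybridizes_perfectly("ATCG", "CGAT"))
print(hybridizes_perfectly("ATCG", "TAGC"))
print(hydrogen_bonds("ATCG"))
```

The third call shows why orientation matters: `TAGC` is the base-by-base complement of `ATCG`, but read in the same direction it is not the reverse complement.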

14 [Figure panels: Direct, Shifted, Folded, Loop]

15 Interest in DNA computing was sparked in 1994 by Len Adleman, who showed how DNA molecules can be used to solve a mathematical problem (the Hamiltonian path problem). DNA computing relies on the fact that DNA strands can be represented as sequences of bases (4-ary sequences) and on the property of hybridization. In hybridization, errors can occur. Thus, error-correcting codes are required for efficient synthesis of DNA strands to be used in computing.

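16 Sequence $z = (z_1, \dots, z_k)$ is a subsequence of $x = (x_1, \dots, x_n)$ if and only if there exists a strictly increasing sequence of indices $1 \le i_1 < i_2 < \dots < i_k \le n$ such that $z_j = x_{i_j}$ for $j = 1, \dots, k$. $L(x, y)$ is defined to be the set of longest common subsequences of x and y. $LCS(x, y)$ is defined to be the length of the longest common subsequence of x and y.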

17 X = (A T C T G A T), Z = (T C G T): Z is a subsequence of X.
X = (A T C T G A T), Y = (T G C A T A): (T C A T) belongs to L(X, Y), and LCS(X, Y) = 4.
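The example can be reproduced with the standard dynamic-programming computation of a longest common subsequence. This is a minimal sketch with hypothetical function names; it returns one longest common subsequence, and which one it returns depends on how ties are broken during backtracking.

```python
def is_subsequence(z, x):
    """True when z can be obtained from x by deleting zero or more symbols."""
    remaining = iter(x)
    # Each membership test consumes the iterator up to the matched symbol,
    # which enforces strictly increasing indices.
    return all(symbol in remaining for symbol in z)


def longest_common_subsequence(x, y):
    """Return one longest common subsequence of x and y."""
    m, n = len(x), len(y)
    # table[i][j] = LCS length of the prefixes x[:i] and y[:j]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Walk back through the table to recover one optimal subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            out.append(x[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


X = "ATCTGAT"
Y = "TGCATA"
common = longest_common_subsequence(X, Y)
print(is_subsequence("TCGT", X))
print(common, len(common))
```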

18 The original insertion-deletion metric (Levenshtein 1966) results from the number of deletions and insertions that need to be made to obtain y from x. For vectors x and y that have the same length n, the number of deletions that will be made is $n - LCS(x, y)$; likewise, the number of insertions that will be made is $n - LCS(x, y)$.
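The counts of deletions and insertions follow directly from the length of a longest common subsequence: everything in x outside a longest common subsequence is deleted, and everything in y outside it is inserted. The sketch below uses hypothetical function names and only reports the two counts; whether the metric is taken as their sum or as the single value n minus the LCS length is a normalization choice that the sketch does not fix.

```python
def lcs_length(x, y):
    """Length of a longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def indel_counts(x, y):
    """Return (deletions, insertions) needed to turn x into y
    when only deletions and insertions are allowed."""
    common = lcs_length(x, y)
    return len(x) - common, len(y) - common


# Equal lengths: the two counts coincide.
print(indel_counts("ACGT", "CGTA"))
# Unequal lengths are handled by the same formula.
print(indel_counts("ATCTGAT", "TGCATA"))
```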

19 A common subsequence [...] is called a common stacked pair subsequence of length [...] between x and y if two elements [...] are consecutive in x and consecutive in y, or, if they are non-consecutive in x and/or non-consecutive in y, then [...] and [...] are consecutive in x and y. Let [...] denote the length of the longest sequence occurring as a common stacked pair subsequence z between sequences x and y. This number is called a similarity of blocks between x and y. The metric is defined to be: [formula not preserved]

20 The upper bound for the average sphere volume in this metric will be: [formula not preserved] The Varshamov-Gilbert bound becomes: [formula not preserved]

21 Thermodynamic weight of virtual stacked pairs:
      A     C     G     T
A   1.00  1.44  1.28  0.88
C   1.45  1.84  2.17  1.28
G   1.30  2.24  1.84  1.44
T   0.58  1.30  1.45  1.00
The sphere volume can be estimated statistically.
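One simple way to estimate a sphere volume statistically is Monte Carlo sampling: draw random words, count how many fall inside the sphere, and scale by the size of the space. The sketch below is only an illustration of that idea. It does not implement the weighted stacked-pair metric; as a stand-in it assumes the distance n minus the LCS length, and the function names, sample size and seed are hypothetical.

```python
import random


def lcs_length(x, y):
    """Length of a longest common subsequence (row-by-row dynamic programming)."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, 1):
            cur.append(prev[j - 1] + 1 if a == b else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def lcs_distance(x, y):
    """Stand-in distance (an assumption): n minus the LCS length."""
    return len(x) - lcs_length(x, y)


def estimate_sphere_volume(center, radius, dist, alphabet="ACGT",
                           samples=2000, seed=0):
    """Estimate the number of words within the given radius of the center
    by sampling words uniformly at random from the whole space."""
    rng = random.Random(seed)
    n = len(center)
    hits = 0
    for _ in range(samples):
        word = "".join(rng.choice(alphabet) for _ in range(n))
        if dist(center, word) <= radius:
            hits += 1
    return len(alphabet) ** n * hits / samples


center = "ACGTACGT"
print(estimate_sphere_volume(center, 3, lcs_distance))
```

The accuracy depends on the number of samples, and small spheres in a large space need many samples before any hits are observed at all.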

22 There are many possibilities for metrics on the space of DNA sequences. All discussed metrics are non-homogeneous, i.e. the sizes of the spheres in the metric spaces depend on the location of their centers. A universal method for calculating lower bounds on optimal code sizes was given.

23 Minimum distance (d) = 6
Length (n)   Min. size
15           8
16           15
17           28
18           53
19           107
20           223
21           479
22           1055
23           2386
24           5524
25           13068
26           31545
27           77600
28           1943016
29           494758
30           1279652

24 Minimum distance (d) = 7
Length (n)   Min. size
15           2
16           3
17           5
18           8
19           13
20           24
21           46
22           90
23           183
24           381
25           815
26           1783
27           3988
28           9102
29           21174
30           50155

25 Minimum distance (d) = 8
Length (n)   Min. size
20           4
21           7
22           12
23           21
24           39
25           75
26           149
27           304
28           635
29           1354
30           2946

26 Minimum distance (d) = 9
Length (n)   Min. size
20           1
21           2
22           2
23           4
24           6
25           10
26           18
27           33
28           62
29           121
30           243

27 Minimum distance (d) = 10
Length (n)   Min. size
25           2
26           3
27           5
28           8
29           15
30           27

28
Length   LCS   Min dist.   Size   V-G bound
10       8     2           -      4365
14       12    2           -      580715
12       8     4           482    25
14       10    4           2683   151
16       12    4           -      1042
18       14    4           -      7989
20       16    4           -      66413
22       18    4           -      588872
24       20    4           -      5504930
14       8     6           66     1
16       10    6           204    3
18       12    6           767    13
20       14    6           2843   65
22       16    6           -      364
24       18    6           -      2279
16       8     8           28     1
18       10    8           50     1
20       12    8           122    1
22       14    8           345    2
24       16    8           1084   7
22       12    10          45     1
24       14    10          86     1

