# MAT 4830 Mathematical Modeling 4.1 Background on DNA

MAT 4830 Mathematical Modeling 4.1 Background on DNA http://myhome.spu.edu/lauw

HW Pick up your last HW in my office.

PNW MAA Meeting The 7th annual Northwest Undergraduate Mathematics Symposium (NUMS) will be held in conjunction with the 2015 Spring meeting of the PNW MAA Section at the University of Washington Tacoma on April 10-11. http://www.tacoma.uw.edu/maa-nums

Remarks No handouts. Need to read the textbook for more info. All HW for this chapter (4.1-4.6) is individual. Techniques learned can be applied to other applications.

Disclaimer This is not a biology class! I do not know too much biology. We will ignore all possible theological questions and implications.

Our Learning Philosophy Acquire the minimum background needed to start the analysis/modeling. Ignore the complexity of the biochemical process.

Our Learning Philosophy Concentrate on certain mathematical problems. Very interesting problems once we get through the terminology.

DNA Genetic info is encoded by DNA molecules, which are passed from parent to offspring.

Bases 4 types of smaller molecules: Adenine (A) and Guanine (G) are purines; Cytosine (C) and Thymine (T) are pyrimidines.

Bases A always pairs with T; G always pairs with C. Sequence: AGCGCT. Complementary sequence: TCGCGA.

Bases To describe a DNA molecule, it suffices to list the bases in one strand.
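Because each base determines its partner, the second strand can be computed from the first. A minimal Python sketch of the pairing rule follows; the function name `complement` is an illustrative choice, and the pairing is applied position by position, as in the AGCGCT example.

```python
# Base-pairing rule from the slides: A pairs with T, G pairs with C.
PAIRING = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the position-by-position complement of a DNA strand."""
    return strand.upper().translate(PAIRING)


print(complement("AGCGCT"))  # TCGCGA
```

Applying `complement` twice returns the original strand, which is why listing one strand is enough.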

Mutations Mutations of DNA occur (randomly) from parent to offspring.

Base Substitution A common form of mutation. A base is replaced by another base.

Base Substitution Transition: a purine is replaced by a purine, or a pyrimidine by a pyrimidine. Transversion: a purine is replaced by a pyrimidine, or a pyrimidine by a purine.
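The two kinds of substitution can be told apart by checking whether the old and new bases belong to the same class. A minimal Python sketch, where the function name `substitution_type` is an illustrative choice:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(old: str, new: str) -> str:
    """Classify a base substitution as a transition or a transversion."""
    valid = PURINES | PYRIMIDINES
    if old not in valid or new not in valid:
        raise ValueError("bases must be one of A, C, G, T")
    if old == new:
        raise ValueError("not a substitution: the base did not change")
    # Same class (both purines or both pyrimidines) means a transition.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


print(substitution_type("A", "G"))  # transition
print(substitution_type("A", "C"))  # transversion
```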

Basic Question How can we deduce the number of mutations that occurred during the descent of DNA sequences?

Example S0: Ancestral sequence. S1: Descendant of S0. S2: Descendant of S1. Observed mutations: 2. Actual mutations: 5 (some are hidden mutations).
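To see how mutations can be hidden, compare the number of sites that differ between the first and last sequences with the number of changes counted one generation at a time. The sketch below uses short sequences invented for illustration (they are not the sequences from the slide), and `count_differences` is a hypothetical helper name.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned and of equal length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


# Hypothetical sequences, invented for illustration (not the slide's data).
s0 = "AGCT"  # ancestral sequence
s1 = "GGTT"  # descendant of s0: sites 1 and 3 changed
s2 = "AGTA"  # descendant of s1: site 1 changed back, site 4 changed

observed = count_differences(s0, s2)
step_by_step = count_differences(s0, s1) + count_differences(s1, s2)
print(observed, step_by_step)  # 2 4
```

Site 1 mutated twice (A to G, then back to A), so comparing only `s0` with `s2` misses both changes. Even the step-by-step count is only a lower bound, since a site could change more than once within a single generation.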

What Do We Want? Compare the initial and final DNA sequences. Develop mathematical models to reconstruct the number of mutations likely to have occurred.

Reality… Seldom do we actually have an ancestral DNA sequence, much less several from different times along a line of descent. Instead, we have sequences from several currently living descendants, but no direct information about any of their ancestors.

Reality… When we compare two sequences, and imagine the mutation process that produced them, the sequence of their most recent common ancestor, from which they both evolved, is unknown.

Orthologous Sequences Given a DNA sequence from some organism, there are good search algorithms to find similar sequences for other organisms in DNA databases. If a gene has been identified for one organism, we can quickly locate likely candidate sequences for similar genes in related organisms.

Orthologous Sequences If the genes have similar functions, we can reasonably assume the sequences are descended from a common ancestral sequence (orthologous).

Assumption All sequences in our discussions are aligned orthologous DNA sequences

4.2 An Introduction to Probability Read Section 4.2 to “review”.

4.3 Conditional Probability Read Section 4.3 to “review”

Definition
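For reference, the standard definition of conditional probability, assumed here to be the one intended, for events $A$ and $B$ with $P(B) > 0$:

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}
```

Here $P(A \cap B)$ is the probability that both events occur, and $P(A \mid B)$ is read as "the probability of $A$ given $B$".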

Example Suppose the 40-base ancestral and descendant DNA sequences are

Example Count the frequency of base substitutions.

Example We can estimate

Example Q1: What is the sum of the 16 numbers in the table? Why?

Example Q2: What is the meaning of a row sum in the table?
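The counting and estimation steps can be sketched in Python. The two sequences below are short and invented for illustration (they are not the 40-base example), the name `count_table` is hypothetical, and the layout (rows for the descendant base, columns for the ancestral base) is an assumption chosen to agree with dividing by column sums later in the example.

```python
BASES = "AGCT"


def count_table(ancestor: str, descendant: str) -> list[list[int]]:
    """table[i][j] = number of sites where the descendant has BASES[i]
    and the ancestor has BASES[j] (assumed layout: rows = descendant)."""
    if len(ancestor) != len(descendant):
        raise ValueError("sequences must be aligned and of equal length")
    index = {base: k for k, base in enumerate(BASES)}
    table = [[0] * len(BASES) for _ in BASES]
    for a, d in zip(ancestor, descendant):
        table[index[d]][index[a]] += 1
    return table


# Hypothetical aligned sequences, invented for illustration.
s0 = "ACGTTGCAAC"  # ancestral
s1 = "ACGTCGCAGC"  # descendant

counts = count_table(s0, s1)
n = len(s0)

# Estimated joint probabilities: each count divided by the number of sites.
joint = [[c / n for c in row] for row in counts]

print(sum(map(sum, counts)))  # 10, one count per site
```

Every site contributes to exactly one cell, so the 16 counts add up to the sequence length and the 16 estimated probabilities add up to 1. In this layout a row sum is the number of sites at which the descendant has that row's base.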

Example We can form a table of conditional probabilities

Example Q3: What is the sum of the entries in any column of this new table? Why?

Example Q4: If instead of dividing by column sums, you divided by row sums, would you get the same results? What conditional probabilities would you be calculating?
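The two normalizations in Q3 and Q4 can be compared directly. The sketch below starts from a count table invented for illustration (not the table from the slides), assumes rows index the descendant base and columns the ancestral base in the order A, G, C, T, and uses hypothetical function names. It also assumes every row and column sum is nonzero.

```python
# Hypothetical 4x4 count table (rows: descendant base, columns: ancestral
# base, order A, G, C, T). Invented for illustration only.
counts = [
    [7, 0, 1, 1],
    [1, 9, 2, 0],
    [0, 2, 7, 2],
    [1, 0, 1, 6],
]


def normalize_columns(table):
    """Divide each entry by its column sum (assumes no column sum is zero)."""
    col_sums = [sum(col) for col in zip(*table)]
    return [[x / s for x, s in zip(row, col_sums)] for row in table]


def normalize_rows(table):
    """Divide each entry by its row sum (assumes no row sum is zero)."""
    return [[x / sum(row) for x in row] for row in table]


by_column = normalize_columns(counts)  # estimates P(descendant = i | ancestor = j)
by_row = normalize_rows(counts)        # estimates P(ancestor = j | descendant = i)

# Each column of by_column, and each row of by_row, sums to 1.
print([round(sum(col), 6) for col in zip(*by_column)])
print([round(sum(row), 6) for row in by_row])
```

Under the assumed layout, dividing by column sums conditions on the ancestral base, while dividing by row sums conditions on the descendant base, so the two tables generally differ.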