Sequence comparisons June 23, 2009 Learning objectives-Understand the concept of sliding window programs. Understand difference between identity, similarity.


1 Sequence comparisons, June 23, 2009
Learning objectives: Understand the concept of sliding window programs. Understand the difference between identity, similarity and homology. Appreciate that proteins can be modular.
Workshop: Perform a sliding window to compute %G+C as a function of position in sequence. Compute hydrophobicity as a function of position in sequence. Become familiar with the Dotter program.

2 Sliding window
A sliding window gathers information about properties of nucleotides or amino acids. A simple example is to calculate the %G+C content within a window, then move the window one nucleotide and repeat the calculation.
Example sequence: GCATATGCGCATATCCCGTCAATACCA (values shown for successive windows on the slide: 4, 5, 6)
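The calculation described on this slide can be sketched in Python. This is a minimal illustration rather than the workshop's own program: the function name `gc_percent_windows` and the window size of 9 are assumptions chosen for the example, and the slide does not state which window size it used.

```python
def gc_percent_windows(seq, window=9):
    """Return a list of (start, %G+C) pairs, one per window position.

    `start` is the 1-based position of the first nucleotide in the window.
    The window size of 9 is an arbitrary choice for illustration.
    """
    seq = seq.upper()
    results = []
    # Move the window one nucleotide at a time along the sequence.
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        results.append((start + 1, 100.0 * gc / window))
    return results


if __name__ == "__main__":
    seq = "GCATATGCGCATATCCCGTCAATACCA"  # sequence shown on the slide
    for start, pct in gc_percent_windows(seq, window=9):
        print(f"{start:2d} {pct:5.1f}")
```

Calling the function with several window sizes on the same sequence is a quick way to see the trade-off between a noisy profile (small window) and an over-smoothed one (large window).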

3 Sliding window
If the window is too small, it is difficult to detect the trend of the measurement. If it is too large, you could miss meaningful data.
[Figure: %G+C plotted against sequence number for a large and a small window size.]

4 Sliding window Adapted from Zhao et al, BMC Genomics. 2007 Nov 7;8:403.

5 Amino acid characteristics

6

7 Four levels of protein structure
1) Primary: linear sequence, e.g. AGHIPLLQ
2) Secondary: initial folding patterns, e.g. AGHIPLLQ with a per-residue structure annotation (TTT)
3) Tertiary: complex folding patterns
4) Quaternary: interactions between polypeptides

8

9 Kyte-Doolittle Hydropathy Plot
Another sliding window routine [J. Mol. Biol. 157:105-132 (1982)]. Kyte and Doolittle assigned each amino acid a value on a "hydropathy scale" based on its chemical properties.
[Figure labels: 1 2 3 4 5 6 7]
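A hydropathy profile can be sketched with the same sliding window idea. The values below are the published Kyte-Doolittle scale; the function name `hydropathy_profile`, the window size of 7, and the choice to report each average at the window's central residue are assumptions made for this example, not details taken from the slide.

```python
# Kyte-Doolittle hydropathy values (J. Mol. Biol. 157:105-132, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=7):
    """Return (center_position, mean_hydropathy) for every window.

    Positions are 1-based. Raises KeyError for letters outside the
    20 standard amino acids.
    """
    protein = protein.upper()
    profile = []
    for start in range(len(protein) - window + 1):
        chunk = protein[start:start + window]
        mean = sum(KD[aa] for aa in chunk) / window
        center = start + window // 2 + 1  # middle residue of the window
        profile.append((center, mean))
    return profile


if __name__ == "__main__":
    for center, mean in hydropathy_profile("AGHIPLLQ", window=7):
        print(center, round(mean, 2))
```

Positive averages indicate hydrophobic stretches and negative averages indicate hydrophilic ones.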

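10 Dot Plot with window = 1
  A T G C C T A G
A * . . . . . * .
T . * . . . * . .
G . . * . . . . *
C . . . * * . . .
C . . . * * . . .
T . * . . . * . .
A * . . . . . * .
G . . * . . . . *
Window = 1. Note that 25% of the table will be filled due to random chance: there is a 1 in 4 chance of a match at each position.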

11 Dot Plot with window = 3
[Dot plot of ATGCCTAG against ATGCCTAG with window = 3: six dots, all on the diagonal.]
The larger the window, the more noise can be filtered out. What is the percent chance of getting a match at random? One in 4^3: (1/4)^3 × 100 ≈ 1.56%.
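Both dot plots, and the random-match estimate, can be reproduced with a short Python sketch. It assumes exact matching of whole windows and places each dot at the window's starting position; the function names are chosen for this example.

```python
def dot_plot(seq_a, seq_b, window=1):
    """Return the set of (i, j) pairs (0-based window starts) at which
    the windows seq_a[i:i+window] and seq_b[j:j+window] match exactly."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            if seq_a[i:i + window] == seq_b[j:j + window]:
                dots.add((i, j))
    return dots


def random_match_percent(window, alphabet_size=4):
    """Percent chance that two random windows match at every position."""
    return 100.0 * (1.0 / alphabet_size) ** window


def render(seq_a, seq_b, dots):
    """Draw the plot as text: seq_b across the top, seq_a down the side."""
    lines = ["  " + " ".join(seq_b)]
    for i, base in enumerate(seq_a):
        cells = ["*" if (i, j) in dots else "." for j in range(len(seq_b))]
        lines.append(base + " " + " ".join(cells))
    return "\n".join(lines)


if __name__ == "__main__":
    seq = "ATGCCTAG"
    for w in (1, 3):
        dots = dot_plot(seq, seq, window=w)
        print(f"window={w}: {len(dots)} dots, "
              f"random chance {random_match_percent(w):.2f}%")
        print(render(seq, seq, dots))
```

For ATGCCTAG compared with itself, window = 1 gives 16 dots (25% of the 64 cells) and window = 3 leaves only the 6 diagonal dots.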

12 Evolutionary Basis of Sequence Alignment
1. Identity: a quantity that describes how much two sequences are alike in the strictest terms.
2. Similarity: a quantity that relates how much two amino acid sequences are alike.
3. Homology: a conclusion drawn from data suggesting that two genes share a common evolutionary history.

13 Purpose of finding differences and similarities of amino acids in two proteins:
Infer structural information
Infer functional information
Infer evolutionary relationships

14 One is mouse trypsin and the other is crayfish trypsin. They are homologous proteins. The sequences share 41% identity.

15

16 Modular nature of proteins Proteins possess local regions of similarity. Proteins can be thought of as assemblies of modular domains.

17 Identity Matrix
Simplest type of scoring matrix
  L I C A
L 1 0 0 0
I   1 0 0
C     1 0
A       1
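As a sketch of how an identity matrix can be used, the Python below builds one for a small alphabet and scores two sequences position by position. It assumes the sequences are already aligned, of equal length, and free of gaps; both function names are chosen for this example.

```python
def identity_matrix(alphabet):
    """Score 1 for identical residues and 0 for every other pair."""
    return {(a, b): 1 if a == b else 0 for a in alphabet for b in alphabet}


def percent_identity(aligned_a, aligned_b, matrix):
    """Percent of aligned positions that score as identical.

    Assumes two pre-aligned, gap-free sequences of equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must have the same length")
    score = sum(matrix[(a, b)] for a, b in zip(aligned_a, aligned_b))
    return 100.0 * score / len(aligned_a)


if __name__ == "__main__":
    m = identity_matrix("LICA")
    print(percent_identity("LICA", "LACA", m))  # 3 of 4 positions match
```

Because the matrix is symmetric, only the upper triangle needs to be written out on the slide; the dictionary here stores both orders for convenient lookup.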

18 Similarity
It is easy to score whether one amino acid is identical to another (the score is 1 if identical and 0 if not). However, it is not easy to give a score for amino acids that are somewhat similar.
[Figure: chemical structures of leucine and isoleucine.]
Should they get a 0 (non-identical), a 1 (identical), or something in between?

19

20 Two proteins that are similar in certain regions (domains): tissue plasminogen activator (PLAT) and coagulation factor 12 (F12).

21 The Dotter Program
The program consists of three components:
A sliding window
A table that gives a score for each amino acid match
A graph that converts the score to a dot of a certain density (the higher the score, the higher the dot density)
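The three components can be illustrated with a toy Python sketch. This is not Dotter's code or its scoring table: the `score` function below is a hypothetical table (1 for identical residues, 0.5 for the leucine/isoleucine pair, 0 otherwise), and the window size and the characters used for dot density are also assumptions made for the example.

```python
def score(a, b):
    """Hypothetical toy score table, for illustration only."""
    if a == b:
        return 1.0
    if {a, b} == {"L", "I"}:
        return 0.5  # assumed "in between" value for a similar pair
    return 0.0


def window_scores(seq_a, seq_b, window=3):
    """Component 1 and 2: average the table scores over each window pair."""
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = []
        for j in range(len(seq_b) - window + 1):
            total = sum(score(seq_a[i + k], seq_b[j + k]) for k in range(window))
            row.append(total / window)
        rows.append(row)
    return rows


SHADES = " .:*#"  # low score (blank) to high score (dense)


def to_density(value):
    """Component 3: map a score in [0, 1] to a character of matching density."""
    index = min(int(value * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


def render(seq_a, seq_b, window=3):
    rows = window_scores(seq_a, seq_b, window)
    return "\n".join("".join(to_density(v) for v in row) for row in rows)


if __name__ == "__main__":
    print(render("AGHIPLLQ", "AGHLPILQ", window=3))
```

Regions of similarity show up as diagonal runs of dense characters, and weaker matches appear as lighter characters instead of being dropped.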

22

23 Region of similarity
A single region on F12 is similar to two regions on PLAT.

