Presentation is loading. Please wait.

Presentation is loading. Please wait.

 CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA.

Similar presentations


Presentation on theme: " CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA."— Presentation transcript:

1  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure Martin Ebeling Bioinformatics, PRBI-B F. Hoffmann-La Roche, Basel

2  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure

3  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG E I E I E I E I E I E I E I E Exploring eukaryotic gene structure

4  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG I E begend Exploring eukaryotic gene structure

5  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG i e begend Exploring eukaryotic gene structure

6  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure i e begend

7  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend Intron model Exploring eukaryotic gene structure

8  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend GTAG i Exploring eukaryotic gene structure

9  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend GT AG i GC Exploring eukaryotic gene structure

10  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend AG GT GC i i What is an intron?

11  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Just an intron...

12  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 0 intron...

13  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 1 intron...

14  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCATGGGGCCCC AGATGGGAGATGGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 2 intron...

15  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend AG GT GC i i AG GT GC i i AG GT GC i i 1bp 2bp 1bp 2bp Intron model 1 phase 0 phase 1 phase 2

16  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e begend AG GT GC central Py tract spacer Intron model 2

17  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e beg end AG GT GC central Py tract spacer promoter model poly-A site Gene model 1

18  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e beg end AG GT GC central Py tract spacer promoter model poly-A site 5’ UTR 3’ UTR Gene model 2

19  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e What is an exon?

20  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG AAAAAA TTTTTT 4 = Hexamer fingerprints

21  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Hexamer fingerprints GC rich GC poor

22  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Hexamer fingerprints predict coding regions

23  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG intergenic region AG GT GC promoter model poly-A site 5’ UTR 3’ UTR Gene model 3GenScan

24  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG e What is an exon?

25  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG AAGGCTGAGAACGGGAAGCTTGTCATCAATGGAAATCCCATCACCATCTTC |||||||||...|||||| ||||||...||| |||||||||||| LysAlaGluAspGlyLys...ValIleAspGlyLysAlaIleThrIlePhe.... CAGGAGCGAGATCCCTCCAAAATCAAGTGGGGCGATGCTGGCGCTGAGTAC ||||||||||||||| ||||||||||||||||||||| ||| GlnGluArgAspProGluAsnIleLysTrpGlyAspAlaGlyThrAlaTyr.... GTCGT.GAGTCCACTGGCGTCTCCACCACC...GAGAAGGCTGGGGCTCAT ||| ||||||||||||||| |||||| |||||||||||| ||| ValValGluSerThrGlyValPheThrThrMetGluLysAlaGly...His.... CATTTGCAGGGGGGAGCCAAAAGGGTCATCATCTCTGCCCCCTCTGCTGAT ||||||...|||||||||||||||::::::||||||||||||||||||||| HisLeuLysGlyGlyAlaLysArgIleValIleSerAlaProSerAlaAsp Coding sequences match proteins - more or less...

26  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exon model 2 match deleteinsert

27  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG begend AG GT GC central Py tract spacer match deleteinsert Gene model 4GeneWise

28  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG GENSCAN 1.0Date run: 6-Oct-103Time: 08:06:38 Sequence genomic : 9529 bp : 58.75% C+G : Isochore 4 ( C+G%) Parameter matrix: HumanIso.smat Predicted genes/exons: Gn.Ex Type S.Begin...End.Len Fr Ph I/Ac Do/T CodRg P.... Tscr Intr Term PlyA Prom Init Intr Intr Intr Intr Intr Intr Term PlyA PlyA Term Init Using… GenScan

29  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Explanation Gn.Ex : gene number, exon number (for reference) Type : Init = Initial exon (ATG to 5' splice site) Intr = Internal exon (3' splice site to 5' splice site) Term = Terminal exon (3' splice site to stop codon) Sngl = Single-exon gene (ATG to stop) Prom = Promoter (TATA box / initation site) PlyA = poly-A signal (consensus: AATAAA) S : DNA strand (+ = input strand; - = opposite strand) Begin : beginning of exon or signal (numbered on input strand) End : end point of exon or signal (numbered on input strand) Len : length of exon or signal (bp) Fr : reading frame (a forward strand codon ending at x has frame x mod 3) Ph : net phase of exon (exon length modulo 3) I/Ac : initiation signal or 3' splice site score (tenth bit units) Do/T : 5' splice site or termination signal score (tenth bit units) CodRg : coding region score (tenth bit units) P : probability of exon (sum over all parses containing exon) Tscr : exon score (depends on length, I/Ac, Do/T and CodRg scores) Using...GenScan

30  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG

31  % 0.2% Using...GenScan:exon P values

32  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG chr12-cnd1 63 IYSIL HFRSIDPGLKEDTL IYSIL HFRSIDPGLKEDTL IYSIL H:H[cat] HFRSIDPGLKEDTL CHR12-region ataatCAGTAAGTG Intron 2 TAGTctcaagcgcaggac tagtt atggtacgtaaact ccctg ctatattccaattg chr12-cnd1 83 QFLIK VSRHSQELPAILDD QFLIK VSRHSQELPAILDD QFLIK V:V[gtg] VSRHSQELPAILDD CHR12-region ctcaaGGTGTTTA Intron 3 CAGTGgtcctcgccgacgg attta tcgacaatccttaa acgaa accccggtatcgtt chr12-cnd1 103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK CHR12-region aatagtgaagccagcaaattgcacccgttgaagacaacggcgcgga cctggcagacatactatagacttgttactactcgacattatatgga atgtaatacctatccagctttgatcgactgcgccgactgcgcttgg... Using...GeneWise:perfect output match deleteinsert

33  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Gene Exon Exon Exon Exon Exon Exon Exon Exon // >CHR SEQSEGMENT.seq.[1280:4611].sp ATGAAGGTGAAGGTCGGAGTCAACGGATTTGGTCGTATTGGGCGCCTGGTCACCAGGGCT GCTTTTAACTCTGGTAAAGTGGATATTGTTGCCATCAATGACCCCTTCATTGACCTCAAC TACATGGTTTACATGTTCCAATATGATTCCACCCATGGCAAATTCCATGGCACCGTCAAG GCTGAGAACGGGAAGCTTGTCATCAATGGAAATCCCATCACCATCTTCCAGGAGCGAGAT CCCTCCAAAATCAAGTGGGGCGATGCTGGCGCTGAGTACGTCGTGGAGTCCACTGGCGTC TTCACCACCATGGAGAAGGCTGGGGCTCATTTGCAGGGGGGAGCCAAAAGGGTCATCATC TCTGCCCCCTCTGCTGATGCCCCCATGTTCGTCATGGGTGTGAACCATGAGAAGTATGAC AACAGCCTCAAGATCATCAGCAATGCCTCCTGCACCACCAACTGCTTAGCACCCCTGGCC AAGGTCATCCATGACAACTTTGGTATCGTGGAAGGACTCATGACCACAGTCCATGCCATC ACTGCCACCCAGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGG GCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATC CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA GTGGTGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTG GTGAAGCAGGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTG GTCTCCTCTGACTTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATT GCCCTCAACGACCACTTTGTCAAGCTCATTTCCTGGTATGACAACGAATTTGGCTACAGC AACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCAAGGAG // Using...GeneWise:perfect output match deleteinsert

34  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG chr12-cnd1 63 IYSIL HFRSIDPGLKEDTL IYSIL HFRSI PGLKEDTL IYSIL H:H[cat] HFRSIGPGLKEDTL CHR12-region ataatCAGTAAGTG Intron 2 TAGTctcaagcgcaggac tagtt atggtgcgtaaact ccctg ccatattccaattg chr12-cnd1 83 QFLIK VSRHSQELPAILDD QFL + VSRHSQELPAILDD QFLTE V:V[gtg] VSRHSQELPAILDD CHR12-region ctcagGGTGTTTA Intron 3 CAGTGgtcctcgccgacgg attca tcgacaatccttaa atgaa accccggtatcgtt chr12-cnd1 103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK CHR12-region aatagtgaagccagcaaattgcacccgttgaagacaacggcgcgga cctggcagacatactatagacttgttactactcgacattatatgga atgtaatacctatccagctttgatcgactgcgccgactgcgcttgg... Using...GeneWise:mismatches match deleteinsert

35  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG chr12-cnd1 63 IYSIL HFRSIDPGLKEDTL IYSIL HFRSI PG KEDTL IYSIL H:H[cat] HFRSIGPG!KEDTL CHR12-region ataatCAGTAAGTG Intron 2 TAGTctcaagcg4aggac tagtt atggtgcg aaact ccctg ccatattg aattg chr12-cnd1 83 QFLIK VSRHSQELPAILDD QFL + VSRHSQELPAILDD QFLTE V:V[gtg] VSRHSQELPAILDD CHR12-region ctcagGGTGTTTA Intron 3 CAGTGgtcctcgccgacgg attca tcgacaatccttaa atgaa accccggtatcgtt chr12-cnd1 103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK T LSGSDRNAHL ALKMNCYALIRLLESFETMASQTNLVDLDLGGK T!LSGSDRNAHL!ALKMNCYALIRLLESFETMASQTNLVDLDLGGK CHR12-region a1tagtgaagcc5gcaaattgcacccgttgaagacaacggcgcgga c tggcagacat ctatagacttgttactactcgacattatatgga a gtaataccta ccagctttgatcgactgcgccgactgcgcttgg... Using...GeneWise:frame shifts match deleteinsert

36  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG // Gene 1 Gene Exon Exon Exon Gene 2 Gene Exon Exon Gene 3 Gene Exon Gene 4 Gene Exon // Using...GeneWise:frame shifts match deleteinsert

37  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Using...GeneWise:“trimming”

38  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Score = 272 bits (137), Expect = 1e-69 Identities = 137/137 (100%) Strand = Plus / Minus Query: 172 cttttcctagcgcagataaagtgagcccggaaagggaaggagggggcggggacaccattg 231 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: cttttcctagcgcagataaagtgagcccggaaagggaaggagggggcggggacaccattg Query: 232 ccctgaaagaataaataagtaaataaacaaactggctcctcgccgcagctggacgcggtc 291 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: ccctgaaagaataaataagtaaataaacaaactggctcctcgccgcagctggacgcggtc Query: 292 ggttgagtccaggttgg 308 ||||||||||||||||| Sbjct: ggttgagtccaggttgg Score = 168 bits (85), Expect = 1e-38 Identities = 85/85 (100%) Strand = Plus / Minus Query: 883 ggaaatgatctgctggaggattccccatatgaaccagttaacagcagattgtcagatata 942 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: ggaaatgatctgctggaggattccccatatgaaccagttaacagcagattgtcagatata Query: 943 ttccgggtggtcccattcatatcag 967 ||||||||||||||||||||||||| Sbjct: ttccgggtggtcccattcatatcag Score = 129 bits (65), Expect = 1e-26 Identities = 65/65 (100%) Strand = Plus / Minus Query: 525 cggggaccagccccgcgccggcaccatgttcctggcgaccctgtacttcgcgctgccgct 584 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| Sbjct: cggggaccagccccgcgccggcaccatgttcctggcgaccctgtacttcgcgctgccgct Query: 585 cttgg 589 ||||| Sbjct: cttgg Mapping cDNAs: sim4

39  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG 900. :. :. :. :. : 6326 ACTGTCCATCAAACATCTTCCACCACAGCTTAGAGGTA...TAGCTTTTC |||||||||||||||||||||||||||||||||||>>>...>>>|||||| 892 ACTGTCCATCAAACATCTTCCACCACAGCTTAGAG CTTTTC 950. :. :. :. :. : AGGCTGCCTTTCGAGCTCAGGGGCCCCTGGCTATGCTGCAGCACTTTGAT |||||||||||||||||||||||||||||||||||||||||||||||||| 933 AGGCTGCCTTTCGAGCTCAGGGGCCCCTGGCTATGCTGCAGCACTTTGAT :. :. :. :. : ACTATCTACAGCATTTTGCAGTA...TAGTCACTTTCGAAGTATAGATCC ||||||||||||||||||||>>>...>>>||||||||||||||||||||| 983 ACTATCTACAGCATTTTGCA TCACTTTCGAAGTATAGATCC :. :. :. :. : TGGCCTCAAAGAAGATACTCTGCAATTCCTGATAAAAGGTG...CAGTGG |||||||||||||||||||||| |||||||||||||||>>>...>>>||| 1024 TGGCCTCAAAGAAGATACTCTGGAATTCCTGATAAAAG TGG :. :. :. :. : TATCCCGCCACTCCCAGGAGCTTCCAGCTATCCTGGATGATACAACTTTG |||||||||||||||||||||||||||||||||||||||||||||||||| 1065 TATCCCGCCACTCCCAGGAGCTTCCAGCTATCCTGGATGATACAACTTTG... Using...sim4:perfect output

40  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG (1- 776) 99% -> ( ) 100% -> ( ) 100% -> ( ) 98% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 99% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 99% -> ( ) 99% -> ( ) 100% -> ( ) 98% -> ( ) 100% -> ( ) 100% -> ( ) 100% -> ( ) 100% Using...sim4: (nearly) perfect output

41  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure

42  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATHANKAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGYOUCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure


Download ppt " CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA."

Similar presentations


Ads by Google