1  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure Martin Ebeling Bioinformatics, PRBI-B F. Hoffmann-La Roche, Basel

2  Exploring eukaryotic gene structure

3  Exploring eukaryotic gene structure. Diagram labels (E = exon, I = intron): E I E I E I E I E I E I E I E

4  Exploring eukaryotic gene structure. Diagram labels: I, E, beg, end

5  Exploring eukaryotic gene structure. Diagram labels: i, e, beg, end

6  Exploring eukaryotic gene structure. Diagram labels: i, e, beg, end

7  Exploring eukaryotic gene structure. Diagram labels: e, beg, end, Intron model

8  Exploring eukaryotic gene structure. Diagram labels: e, beg, end, GT, AG, i

9  Exploring eukaryotic gene structure. Diagram labels: e, beg, end, GT, AG, i, GC

10  What is an intron? Diagram labels: e, beg, end, AG, GT, GC, i, i
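The GT/GC and AG labels on this slide correspond to the usual boundary rule: an intron starts with a donor dinucleotide (GT, or less often GC) and ends with the acceptor AG. A minimal sketch of that rule follows. The function name `candidate_introns`, the `min_len` cutoff, and the demo sequence are illustrative assumptions, not part of the talk; a real gene finder would score sites instead of accepting every match.

```python
def candidate_introns(seq, min_len=20, donors=("GT", "GC"), acceptor="AG"):
    """Return half-open (start, end) spans that begin with a donor
    dinucleotide and end with the acceptor dinucleotide."""
    seq = seq.upper()
    donor_starts = [i for i in range(len(seq) - 1) if seq[i:i + 2] in donors]
    acceptor_ends = [j + 2 for j in range(len(seq) - 1) if seq[j:j + 2] == acceptor]
    return [(s, e) for s in donor_starts for e in acceptor_ends if e - s >= min_len]


# Hypothetical toy sequence: exon, intron (GTAAGT ... TTTCAG), exon.
demo = "CCATG" + "GTAAGT" + "C" * 20 + "TTTCAG" + "GATCC"
spans = candidate_introns(demo, min_len=20)
for start, end in spans:
    print(start, end, demo[start:start + 2], demo[end - 2:end])
```

Every span is only a candidate: the dinucleotides occur by chance far more often than real splice sites, which is why the later slides add more states to the intron model.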

11  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Just an intron...

12  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 0 intron...

13  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCCTGGGGCCCC AGATGGGATGGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 1 intron...

14  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATGCTGTCCTCCGCATGGGGCCCC AGATGGGAGATGGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATGGGGAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGGCCCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG A phase 2 intron...

15  Intron model 1. Diagram labels: e, beg, end; three intron branches (phase 0, phase 1, phase 2), each labeled AG, GT, GC, i, i; 1bp, 2bp, 1bp, 2bp
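Intron phase follows from how many coding bases precede the intron: phase 0 falls between two codons, phase 1 after the first base of a codon, phase 2 after the second. The sketch below computes it from exon lengths. The function name `intron_phases` and the example lengths are assumptions for illustration, and the lengths are taken to count coding bases only.

```python
def intron_phases(coding_exon_lengths):
    """Phase of the intron that follows each exon except the last one.

    The phase is the number of coding bases seen so far, modulo 3.
    """
    phases = []
    coding_bases = 0
    for length in coding_exon_lengths[:-1]:
        coding_bases += length
        phases.append(coding_bases % 3)
    return phases


# Hypothetical gene with four coding exons (120 coding bases in total).
example_lengths = [30, 31, 31, 28]
print(intron_phases(example_lengths))
```

With these lengths the three introns land in phase 0, phase 1, and phase 2 respectively, one of each kind shown on the slide.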

16  Intron model 2. Diagram labels: e, beg, end, AG, GT, GC, central, Py tract, spacer

17  Gene model 1. Diagram labels: e, beg, end, AG, GT, GC, central, Py tract, spacer, promoter model, poly-A site

18  Gene model 2. Diagram labels: e, beg, end, AG, GT, GC, central, Py tract, spacer, promoter model, poly-A site, 5’ UTR, 3’ UTR

19  What is an exon? Diagram label: e

20  Hexamer fingerprints: AAAAAA ... TTTTTT, 4^6 = 4096 possible hexamers
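The count of 4096 is simply every six-letter word over the four bases. A short sketch of how such a fingerprint could be tallied is shown here; the names `ALL_HEXAMERS` and `hexamer_counts` are illustrative, and overlapping windows are an assumption about how the fingerprint is built.

```python
from collections import Counter
from itertools import product

# Every possible hexamer over A, C, G, T: 4**6 = 4096 words.
ALL_HEXAMERS = ["".join(p) for p in product("ACGT", repeat=6)]


def hexamer_counts(seq):
    """Count overlapping hexamers in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + 6] for i in range(len(seq) - 5))


print(len(ALL_HEXAMERS), ALL_HEXAMERS[0], ALL_HEXAMERS[-1])
print(hexamer_counts("AAAAAAA"))
```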

21  Hexamer fingerprints: GC rich, GC poor

22  Hexamer fingerprints predict coding regions
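One common way to turn hexamer fingerprints into a coding predictor is a log-odds score: compare how often each hexamer occurs in known coding sequence with how often it occurs in non-coding sequence. The sketch below is an assumption about the general idea, not the method of any specific program. The training strings, the pseudocount, and the function names are hypothetical, and real tools typically also track the reading frame.

```python
import math
from collections import Counter


def hexamer_model(training_seqs, pseudocount=1.0):
    """Return a function giving the smoothed frequency of a hexamer."""
    counts = Counter()
    for seq in training_seqs:
        seq = seq.upper()
        counts.update(seq[i:i + 6] for i in range(len(seq) - 5))
    total = sum(counts.values()) + pseudocount * 4 ** 6
    return lambda hexamer: (counts[hexamer] + pseudocount) / total


def coding_score(seq, coding_freq, noncoding_freq):
    """Sum of log2 odds (coding vs. non-coding) over all hexamers in seq."""
    seq = seq.upper()
    return sum(
        math.log2(coding_freq(seq[i:i + 6]) / noncoding_freq(seq[i:i + 6]))
        for i in range(len(seq) - 5)
    )


# Hypothetical toy training data; real models use many known genes.
coding = hexamer_model(["ATGGCCGCCGCCGCC"])
noncoding = hexamer_model(["ATATATATATATATA"])
print(coding_score("GCCGCCGCC", coding, noncoding))
print(coding_score("ATATATATA", coding, noncoding))
```

A positive score means the window looks more like the coding training set, a negative score more like the non-coding set.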

23  Gene model 3: GenScan. Diagram labels: intergenic region, AG, GT, GC, promoter model, poly-A site, 5’ UTR, 3’ UTR

24  What is an exon? Diagram label: e

25  Coding sequences match proteins - more or less...

AAGGCTGAGAACGGGAAGCTTGTCATCAATGGAAATCCCATCACCATCTTC
|||||||||...||||||   ||||||...|||      ||||||||||||
LysAlaGluAspGlyLys...ValIleAspGlyLysAlaIleThrIlePhe....

CAGGAGCGAGATCCCTCCAAAATCAAGTGGGGCGATGCTGGCGCTGAGTAC
|||||||||||||||      |||||||||||||||||||||      |||
GlnGluArgAspProGluAsnIleLysTrpGlyAspAlaGlyThrAlaTyr....

GTCGT.GAGTCCACTGGCGTCTCCACCACC...GAGAAGGCTGGGGCTCAT
|||   |||||||||||||||   ||||||   ||||||||||||   |||
ValValGluSerThrGlyValPheThrThrMetGluLysAlaGly...His....

CATTTGCAGGGGGGAGCCAAAAGGGTCATCATCTCTGCCCCCTCTGCTGAT
||||||...|||||||||||||||::::::|||||||||||||||||||||
HisLeuLysGlyGlyAlaLysArgIleValIleSerAlaProSerAlaAsp
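The "more or less" match can be checked by translating the DNA and comparing it with the protein residue by residue. The sketch below uses the standard genetic code and the last DNA row of this slide. The one-letter protein string is a transcription of the slide's three-letter residues, and the function names are illustrative assumptions.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 0; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3))


def identities(a, b):
    """Number of positions where two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


dna_row = "CATTTGCAGGGGGGAGCCAAAAGGGTCATCATCTCTGCCCCCTCTGCTGAT"
protein_row = "HLKGGAKRIVISAPSAD"  # HisLeuLysGlyGlyAlaLysArgIleValIleSerAlaProSerAlaAsp
translated = translate(dna_row)
print(translated)
print(identities(translated, protein_row), "of", len(protein_row))
```

Fourteen of the seventeen residues are identical; the three differences are the Gln/Lys mismatch and the Val-Ile/Ile-Val swap marked with `...` and `::::::` on the slide.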

26  Exon model 2. Diagram labels: match, delete, insert

27  Gene model 4: GeneWise. Diagram labels: beg, end, AG, GT, GC, central, Py tract, spacer, match, delete, insert

28  Using... GenScan

GENSCAN 1.0    Date run: 6-Oct-103    Time: 08:06:38
Sequence genomic : 9529 bp : 58.75% C+G : Isochore 4 (57 - 100 C+G%)
Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------
 1.01 Intr +    124    238  115  1  1   76   43    65 0.719   2.42
 1.02 Term +    292    378   87  0  0   72   48    95 0.768   2.48
 1.03 PlyA +   1978   1983    6                               -0.45
 2.00 Prom +   2122   2161   40                               -2.26
 2.01 Init +   2471   2499   29  1  2  107  100    32 0.705   5.09
 2.02 Intr +   4132   4231  100  1  1  121  101   165 0.999  22.38
 2.03 Intr +   4322   4428  107  1  2   92   64   159 0.999  14.08
 2.04 Intr +   4558   4648   91  1  1   96   84   158 0.999  17.36
 2.05 Intr +   4739   4854  116  1  2   99   94   138 0.762  15.92
 2.06 Intr +   4947   5028   82  0  1  119   60    81 0.998   9.19
 2.07 Intr +   5222   5634  413  1  2   70   84   680 0.985  60.11
 2.08 Term +   5739   5808   70  0  1  108   47   141 0.966  10.32
 2.09 PlyA +   5986   5991    6                                1.05
 3.03 PlyA -   6649   6644    6                               -3.24
 3.02 Term -   8226   8121  106  2  1   82   44    45 0.501  -1.69
 3.01 Init -   9400   9150  251  1  2   66   82   375 0.600  29.82
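Output like this is easy to post-process once each row is split on whitespace: exon rows carry 13 fields, while promoter and poly-A rows carry only 7. The following is a minimal sketch of such a parser, assuming exactly the column layout shown on this slide. The `Feature` class and `parse_row` function are hypothetical names and are not part of GENSCAN.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    gene: int
    exon: int
    kind: str            # Init, Intr, Term, Sngl, Prom or PlyA
    strand: str
    begin: int
    end: int
    length: int
    prob: Optional[float]  # only exon rows report P
    score: float


def parse_row(row):
    """Parse one whitespace-separated row of the exon table."""
    f = row.split()
    gene, exon = f[0].split(".")
    prob = float(f[11]) if len(f) == 13 else None
    return Feature(int(gene), int(exon), f[1], f[2],
                   int(f[3]), int(f[4]), int(f[5]), prob, float(f[-1]))


# Three rows copied from the slide.
rows = [
    "2.02 Intr + 4132 4231 100 1 1 121 101 165 0.999 22.38",
    "2.09 PlyA + 5986 5991 6 1.05",
    "3.02 Term - 8226 8121 106 2 1 82 44 45 0.501 -1.69",
]
features = [parse_row(r) for r in rows]
confident = [f for f in features if f.prob is not None and f.prob >= 0.99]
print(confident)
```

Filtering on the P column, as in the last lines, is one simple way to keep only the exons the model is most sure about.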

29  Using... GenScan

Explanation
Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
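The Ph definition can be checked directly against the GenScan table from the previous slide: for every exon row, Ph should equal the exon length modulo 3. The sketch below does that. The (Len, Ph) pairs are copied from that table, and the helper name `net_phase` is an illustrative assumption.

```python
def net_phase(length):
    """Net phase of an exon as defined in the legend: length modulo 3."""
    return length % 3


# (Len, Ph) pairs for the twelve exon rows of the GenScan output.
len_ph_pairs = [
    (115, 1), (87, 0), (29, 2), (100, 1), (107, 2), (91, 1),
    (116, 2), (82, 1), (413, 2), (70, 1), (106, 1), (251, 2),
]
mismatches = [(length, ph) for length, ph in len_ph_pairs if net_phase(length) != ph]
print(mismatches)
```

An empty list means every reported phase agrees with the definition.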

31 ... 99.8% 0.2% Using...GenScan:exon P values

32  Using... GeneWise: perfect output. Diagram labels: match, delete, insert

chr12-cnd1      63 IYSIL                         HFRSIDPGLKEDTL
                   IYSIL                         HFRSIDPGLKEDTL
                   IYSIL H:H[cat]                HFRSIDPGLKEDTL
CHR12-region 20911 ataatCAGTAAGTG  Intron 2  TAGTctcaagcgcaggac
                   tagtt                         atggtacgtaaact
                   ccctg                         ctatattccaattg

chr12-cnd1      83 QFLIK                         VSRHSQELPAILDD
                   QFLIK                         VSRHSQELPAILDD
                   QFLIK V:V[gtg]                VSRHSQELPAILDD
CHR12-region 21253 ctcaaGGTGTTTA  Intron 3  CAGTGgtcctcgccgacgg
                   attta                         tcgacaatccttaa
                   acgaa                         accccggtatcgtt

chr12-cnd1     103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK
                   TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK
CHR12-region 21808 aatagtgaagccagcaaattgcacccgttgaagacaacggcgcgga
                   cctggcagacatactatagacttgttactactcgacattatatgga
                   atgtaatacctatccagctttgatcgactgcgccgactgcgcttgg...

33 Using... GeneWise: perfect output (figure labels: match, delete, insert)

Gene 1280 4611
  Exon 1280 1305
  Exon 2938 3037
  Exon 3128 3234
  Exon 3364 3454
  Exon 3545 3660
  Exon 3753 3834
  Exon 4028 4440
  Exon 4545 4611
//
>CHR12-6512984-6518250-SEQSEGMENT.seq.[1280:4611].sp
ATGAAGGTGAAGGTCGGAGTCAACGGATTTGGTCGTATTGGGCGCCTGGTCACCAGGGCT
GCTTTTAACTCTGGTAAAGTGGATATTGTTGCCATCAATGACCCCTTCATTGACCTCAAC
TACATGGTTTACATGTTCCAATATGATTCCACCCATGGCAAATTCCATGGCACCGTCAAG
GCTGAGAACGGGAAGCTTGTCATCAATGGAAATCCCATCACCATCTTCCAGGAGCGAGAT
CCCTCCAAAATCAAGTGGGGCGATGCTGGCGCTGAGTACGTCGTGGAGTCCACTGGCGTC
TTCACCACCATGGAGAAGGCTGGGGCTCATTTGCAGGGGGGAGCCAAAAGGGTCATCATC
TCTGCCCCCTCTGCTGATGCCCCCATGTTCGTCATGGGTGTGAACCATGAGAAGTATGAC
AACAGCCTCAAGATCATCAGCAATGCCTCCTGCACCACCAACTGCTTAGCACCCCTGGCC
AAGGTCATCCATGACAACTTTGGTATCGTGGAAGGACTCATGACCACAGTCCATGCCATC
ACTGCCACCCAGAAGACTGTGGATGGCCCCTCCGGGAAACTGTGGCGTGATGGCCGCGGG
GCTCTCCAGAACATCATCCCTGCCTCTACTGGCGCTGCCAAGGCTGTGGGCAAGGTCATC
CCTGAGCTGAACGGGAAGCTCACTGGCATGGCCTTCCGTGTCCCCACTGCCAACGTGTCA
GTGGTGGACCTGACCTGCCGTCTAGAAAAACCTGCCAAATATGATGACATCAAGAAGGTG
GTGAAGCAGGCGTCGGAGGGCCCCCTCAAGGGCATCCTGGGCTACACTGAGCACCAGGTG
GTCTCCTCTGACTTCAACAGCGACACCCACTCCTCCACCTTTGACGCTGGGGCTGGCATT
GCCCTCAACGACCACTTTGTCAAGCTCATTTCCTGGTATGACAACGAATTTGGCTACAGC
AACAGGGTGGTGGACCTCATGGCCCACATGGCCTCCAAGGAG
//
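The gene structure above can be checked with a little arithmetic: the exon lengths should sum to a multiple of three if the prediction is an intact coding sequence. This is a minimal Python sketch, assuming the coordinates are 1-based and inclusive. The function names are illustrative only.

```python
# Exon coordinates (assumed 1-based, inclusive) copied from the GeneWise
# gene structure for CHR12-6512984-6518250-SEQSEGMENT.
EXONS = [(1280, 1305), (2938, 3037), (3128, 3234), (3364, 3454),
         (3545, 3660), (3753, 3834), (4028, 4440), (4545, 4611)]


def exon_lengths(exons):
    """Length of each exon in bases."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length of each intron lying between consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


cds_length = sum(exon_lengths(EXONS))
print(cds_length, cds_length % 3)  # 1002 0
```

A total of 1002 bases corresponds to 334 codons, so the eight exons join in frame.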

34 Using... GeneWise: mismatches (figure labels: match, delete, insert)

chr12-cnd1      63 IYSIL                       HFRSIDPGLKEDTL
                   IYSIL                       HFRSI PGLKEDTL
                   IYSIL H:H[cat]              HFRSIGPGLKEDTL
CHR12-region 20911 ataatCAGTAAGTG Intron 2 TAGTctcaagcgcaggac
                   tagtt                       atggtgcgtaaact
                   ccctg                       ccatattccaattg

chr12-cnd1      83 QFLIK                       VSRHSQELPAILDD
                   QFL +                       VSRHSQELPAILDD
                   QFLTE V:V[gtg]              VSRHSQELPAILDD
CHR12-region 21253 ctcagGGTGTTTA Intron 3 CAGTGgtcctcgccgacgg
                   attca                       tcgacaatccttaa
                   atgaa                       accccggtatcgtt

chr12-cnd1     103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK
                   TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK
CHR12-region 21808 aatagtgaagccagcaaattgcacccgttgaagacaacggcgcgga
                   cctggcagacatactatagacttgttactactcgacattatatgga
                   atgtaatacctatccagctttgatcgactgcgccgactgcgcttgg...

35 Using... GeneWise: frame shifts (figure labels: match, delete, insert)

chr12-cnd1      63 IYSIL                       HFRSIDPGLKEDTL
                   IYSIL                       HFRSI PG KEDTL
                   IYSIL H:H[cat]              HFRSIGPG!KEDTL
CHR12-region 20911 ataatCAGTAAGTG Intron 2 TAGTctcaagcg4aggac
                   tagtt                       atggtgcg aaact
                   ccctg                       ccatattg aattg

chr12-cnd1      83 QFLIK                       VSRHSQELPAILDD
                   QFL +                       VSRHSQELPAILDD
                   QFLTE V:V[gtg]              VSRHSQELPAILDD
CHR12-region 21254 ctcagGGTGTTTA Intron 3 CAGTGgtcctcgccgacgg
                   attca                       tcgacaatccttaa
                   atgaa                       accccggtatcgtt

chr12-cnd1     103 TTLSGSDRNAHLNALKMNCYALIRLLESFETMASQTNLVDLDLGGK
                   T LSGSDRNAHL ALKMNCYALIRLLESFETMASQTNLVDLDLGGK
                   T!LSGSDRNAHL!ALKMNCYALIRLLESFETMASQTNLVDLDLGGK
CHR12-region 21809 a1tagtgaagcc5gcaaattgcacccgttgaagacaacggcgcgga
                   c tggcagacat ctatagacttgttactactcgacattatatgga
                   a gtaataccta ccagctttgatcgactgcgccgactgcgcttgg...

36 Using... GeneWise: frame shifts (figure labels: match, delete, insert)

//
Gene 1
Gene 6234 21234
  Exon 6234 6360
  Exon 20852 20927
  Exon 21210 21234
Gene 2
Gene 21239 21811
  Exon 21239 21269
  Exon 21765 21811
Gene 3
Gene 21813 21842
  Exon 21813 21842
Gene 4
Gene 21847 21945
  Exon 21847 21945
//
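With frame shifts in the genomic sequence, the gene structure above is reported as four separate "genes" rather than one. A simple way to spot this kind of fragmentation is to look at the distance between consecutive predictions. This is a minimal Python sketch, assuming 1-based inclusive coordinates. The function name is hypothetical.

```python
# Gene spans (assumed 1-based, inclusive) from the fragmented GeneWise output.
GENES = [(6234, 21234), (21239, 21811), (21813, 21842), (21847, 21945)]


def gaps_between(genes):
    """Number of unassigned bases between consecutive gene predictions."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(genes, genes[1:])]


print(gaps_between(GENES))  # [4, 1, 4]
```

Predictions on the same strand that are only a handful of bases apart are candidates for a single gene that was interrupted by sequencing errors.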

37 Using... GeneWise: “trimming”

38 Mapping cDNAs: sim4

Score = 272 bits (137), Expect = 1e-69
Identities = 137/137 (100%)
Strand = Plus / Minus

Query: 172   cttttcctagcgcagataaagtgagcccggaaagggaaggagggggcggggacaccattg 231
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 97393 cttttcctagcgcagataaagtgagcccggaaagggaaggagggggcggggacaccattg 97334

Query: 232   ccctgaaagaataaataagtaaataaacaaactggctcctcgccgcagctggacgcggtc 291
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 97333 ccctgaaagaataaataagtaaataaacaaactggctcctcgccgcagctggacgcggtc 97274

Query: 292   ggttgagtccaggttgg 308
             |||||||||||||||||
Sbjct: 97273 ggttgagtccaggttgg 97257

Score = 168 bits (85), Expect = 1e-38
Identities = 85/85 (100%)
Strand = Plus / Minus

Query: 883   ggaaatgatctgctggaggattccccatatgaaccagttaacagcagattgtcagatata 942
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 93686 ggaaatgatctgctggaggattccccatatgaaccagttaacagcagattgtcagatata 93627

Query: 943   ttccgggtggtcccattcatatcag 967
             |||||||||||||||||||||||||
Sbjct: 93626 ttccgggtggtcccattcatatcag 93602

Score = 129 bits (65), Expect = 1e-26
Identities = 65/65 (100%)
Strand = Plus / Minus

Query: 525   cggggaccagccccgcgccggcaccatgttcctggcgaccctgtacttcgcgctgccgct 584
             ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Sbjct: 96153 cggggaccagccccgcgccggcaccatgttcctggcgaccctgtacttcgcgctgccgct 96094

Query: 585   cttgg 589
             |||||
Sbjct: 96093 cttgg 96089

39 Using... sim4: perfect output

    900     .    :    .    :    .    :    .    :    .    :
   6326 ACTGTCCATCAAACATCTTCCACCACAGCTTAGAGGTA...TAGCTTTTC
        |||||||||||||||||||||||||||||||||||>>>...>>>||||||
    892 ACTGTCCATCAAACATCTTCCACCACAGCTTAGAG         CTTTTC

    950     .    :    .    :    .    :    .    :    .    :
  20858 AGGCTGCCTTTCGAGCTCAGGGGCCCCTGGCTATGCTGCAGCACTTTGAT
        ||||||||||||||||||||||||||||||||||||||||||||||||||
    933 AGGCTGCCTTTCGAGCTCAGGGGCCCCTGGCTATGCTGCAGCACTTTGAT

   1000     .    :    .    :    .    :    .    :    .    :
  20908 ACTATCTACAGCATTTTGCAGTA...TAGTCACTTTCGAAGTATAGATCC
        ||||||||||||||||||||>>>...>>>|||||||||||||||||||||
    983 ACTATCTACAGCATTTTGCA         TCACTTTCGAAGTATAGATCC

   1050     .    :    .    :    .    :    .    :    .    :
  21231 TGGCCTCAAAGAAGATACTCTGCAATTCCTGATAAAAGGTG...CAGTGG
        |||||||||||||||||||||| |||||||||||||||>>>...>>>|||
   1024 TGGCCTCAAAGAAGATACTCTGGAATTCCTGATAAAAG         TGG

   1100     .    :    .    :    .    :    .    :    .    :
  21767 TATCCCGCCACTCCCAGGAGCTTCCAGCTATCCTGGATGATACAACTTTG
        ||||||||||||||||||||||||||||||||||||||||||||||||||
   1065 TATCCCGCCACTCCCAGGAGCTTCCAGCTATCCTGGATGATACAACTTTG
...
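In the sim4 alignment above, introns are abbreviated in the genomic line and marked with `>>>...>>>` in the match line, leaving three intron bases visible on each side. The following minimal Python sketch pulls the donor and acceptor dinucleotides out of one alignment block. The function name `splice_sites` is hypothetical, and the block is assumed to contain at most one intron.

```python
INTRON_MARK = ">>>...>>>"


def splice_sites(genomic: str, marks: str):
    """Return (donor, acceptor) dinucleotides for the intron in one block.

    `genomic` and `marks` are the genomic and match lines of a single
    alignment block, column for column. Returns None if there is no intron.
    """
    idx = marks.find(INTRON_MARK)
    if idx == -1:
        return None
    end = idx + len(INTRON_MARK)
    donor = genomic[idx:idx + 2]      # first two intron bases
    acceptor = genomic[end - 2:end]   # last two intron bases
    return donor, acceptor


# First block of the alignment (genomic position 6326).
genomic = "ACTGTCCATCAAACATCTTCCACCACAGCTTAGAGGTA...TAGCTTTTC"
marks = "|" * 35 + INTRON_MARK + "|" * 6
print(splice_sites(genomic, marks))  # ('GT', 'AG')
```

All three introns visible on the slide (`GTA...TAG`, `GTA...TAG`, `GTG...CAG`) show the canonical GT...AG boundaries.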

40 Using... sim4: (nearly) perfect output

4514- 5289 (1- 776) 99% ->
6211- 6360 (777- 926) 100% ->
20852-20927 (927-1002) 100% ->
21210-21268 (1003-1061) 98% ->
21764-21945 (1062-1243) 100% ->
22245-22387 (1244-1386) 100% ->
25400-25527 (1387-1514) 100% ->
25599-25722 (1515-1638) 100% ->
25908-26055 (1639-1786) 100% ->
27963-28160 (1787-1984) 100% ->
28500-28634 (1985-2119) 100% ->
28731-28818 (2120-2207) 100% ->
28914-29094 (2208-2388) 100% ->
32121-32245 (2389-2513) 99% ->
32933-33172 (2514-2753) 100% ->
33963-34137 (2754-2928) 100% ->
34397-34481 (2929-3013) 100% ->
36718-36851 (3014-3147) 100% ->
37203-37335 (3148-3280) 100% ->
37422-37506 (3281-3365) 100% ->
37583-37750 (3366-3533) 100% ->
38026-38198 (3534-3706) 100% ->
38912-39023 (3707-3818) 100% ->
39103-39226 (3819-3942) 100% ->
39308-39463 (3943-4098) 99% ->
39814-39991 (4099-4276) 99% ->
40080-40174 (4277-4371) 100% ->
40648-40728 (4372-4452) 98% ->
40910-41093 (4453-4636) 100% ->
41826-41952 (4637-4763) 100% ->
42056-42211 (4764-4919) 100% ->
42462-43089 (4920-5547) 100%
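Each line of the sim4 exon list above gives a genomic interval, the matching cDNA interval in parentheses, and the percent identity. The following minimal Python sketch parses such lines, derives the intron lengths, and checks that the cDNA is covered without gaps. Only the first five lines are used as sample data, and the regular expression is an assumption based on the layout shown on the slide, not an official sim4 format definition.

```python
import re

# First five lines of the exon list: genomic start-end, (cDNA start-end), identity.
SIM4_LINES = """\
4514- 5289 (1- 776) 99% ->
6211- 6360 (777- 926) 100% ->
20852-20927 (927-1002) 100% ->
21210-21268 (1003-1061) 98% ->
21764-21945 (1062-1243) 100% ->
"""

# Assumed layout: "gstart- gend (cstart- cend) NN%"; spaces after "-" are optional.
EXON_RE = re.compile(r"(\d+)-\s*(\d+)\s+\(\s*(\d+)-\s*(\d+)\)\s+(\d+)%")


def parse_exons(text):
    """Parse exon lines into (g_start, g_end, c_start, c_end, identity) tuples."""
    return [tuple(int(x) for x in m.groups()) for m in EXON_RE.finditer(text)]


exons = parse_exons(SIM4_LINES)
# Intron length = genomic gap between consecutive exons.
introns = [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]
# The cDNA is fully covered if each exon starts right after the previous one.
cdna_contiguous = all(nxt[2] == cur[3] + 1 for cur, nxt in zip(exons, exons[1:]))
print(introns, cdna_contiguous)  # [921, 14491, 282, 495] True
```

Contiguous cDNA coordinates together with high identity for every exon are what make the output "(nearly) perfect": every base of the cDNA is placed on the genomic sequence.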

41 Exploring eukaryotic gene structure

42  CTCTTCTTGACTCACCCTGCCCTCAATATCCCCCGGCGCAGCCAGTGAAAG GGAGTCCCTGGCTCCTGGCTCGCCTGCACGTCCCAGGGCGGGGAGGGACTT CCGCCCTCACGTCCCGCTCTTCGCCCCAGGCTGGATGGAATGAAAGGCACA CTGTCTCTCTCCCTAGGCAGCACAGCCCACAGGTTTCCAGGAGTGCCTTTG TGGGAGGCCTCTGGGCCCCCACCAGCCATCCTGTCCTCCGCCTGGGGCCCC AGCCCGGAGAGAGCCGCTGGTGCACACAGGGCCGGGATTGTCTGCCCTAAT TATCAGGTCCAGGCTACAGGGCTGCAGGACATCGTGACCTTCCGTGCAGAA ACCTCCCCCTCCCCCTCAAGCCGCCTCCCGAGCCTCCTTCCTCTCCAGGCC CCCAGTGCCCAGTGCCCAGTGCCCAGCCCAGGCCTCGGTCCCAGAGATGCC AGGAGCCAGGAGATHANKAGGGGGAAGTGGGGGCTGGGAAGGAACCACGGG CCCCCGCCCGAGGCCCATGGYOUCCTCCTAGGCCTTTGCCTGAGCAGTCCG GTGTCACTACCGCAGAGCCTCGAGGAGAAGTTCCCCAACTTTCCCGCCTCT CAGCCTTTGAAAGAAAGAAAGGGGAGGGGGCAGGCCGCGTGCAGCCGCGAG CGGTGCTGGGCTCCGGCTCCAATTCCCCATCTCAGTCGTTCCCAAAGTCCT CCTGTTTCATCCAAGCGTGTAAGGGTCCCCGTCCTTGACTCCCTAGTGTCC TGCTGCCCACAGTCCAGTCCTGGGAACCAGCACCGATCACCTCCCATCGGG Exploring eukaryotic gene structure

