REPEATED SEQUENCES.


1 REPEATED SEQUENCES

2 REPEATED SEQUENCES OCCUPY A LARGE PART OF THE HUMAN GENOME
Figure labels: TRANSPOSONS (retroviral-like elements, DNA transposon 'fossils'); REPEATED SEQUENCES (simple sequence repeats, segmental duplications); UNIQUE SEQUENCES; GENES (introns, protein-coding regions); other DNA sequences; axis: percentage.
Figure Representation of the nucleotide sequence content of the human genome. LINES, SINES, retroviral-like elements, and DNA-only transposons are all mobile genetic elements that have multiplied in our genome by replicating themselves and inserting the new copies in different positions. Mobile genetic elements are discussed in Chapter 5. Simple sequence repeats are short nucleotide sequences (less than 14 nucleotide pairs) that are repeated again and again for long stretches. Segmental duplications are large blocks of the genome ( ,000 nucleotide pairs) that are present at two or more locations in the genome. Over half of the unique sequence consists of genes and the remainder is probably regulatory DNA. Most of the DNA present in heterochromatin, a specialized type of chromatin (discussed later in this chapter) that contains relatively few genes, has not yet been sequenced. (Adapted from Unveiling the Human Genome, Supplement to the Wellcome Trust Newsletter. London: Wellcome Trust, February 2001.)

3 THERE MAY BE FAR MORE REPEATED DNA SEQUENCES
Figure 2. P-clouds and RepeatMasker annotation of the repeat structure of the human genome. Results are displayed as a percentage of the ungapped genome assembly length. A) Consensus results prior to this study indicate that ~50% of the genome is repetitive (RepeatMasker). B) Analysis using P-clouds suggests more than two-thirds of the genome is repetitive or repeat-derived. doi: /journal.pgen g002
Applying a new DNA sequence identification method (P-clouds) raises the previously accepted share of repeated sequences in the human genome (~50%) to about 70%. PLOS Genet. 7, No. 12, e (2011)

4 REPEATED SEQUENCES
More than 50% of the human genome consists of repeated sequences.
Tandem repeats:
  simple tandem repeats: (A)n, (CA)n or (CGG)n
  tandem repeat blocks: centromeres, telomeres, ribosomal RNA gene clusters
Interspersed repeats:
  transposons, including retrotransposons
  segmental duplications
  processed pseudogenes (retropseudogenes): inactive copies that arose from genes by retrotransposition
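To make the notation (A)n, (CA)n, (CGG)n concrete, here is a minimal Python sketch that scans a DNA string for perfect simple tandem repeats. The function name, the unit-length limit of 6 bp and the minimum of 5 copies are illustrative assumptions and do not come from the slides; real annotation tools also tolerate imperfect repeats.

```python
import re

def find_simple_tandem_repeats(seq, max_unit=6, min_copies=5):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    max_unit and min_copies are illustrative thresholds, not values
    taken from the presentation.
    """
    # A short unit (captured lazily) followed by at least min_copies - 1
    # further exact copies of the same unit.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits

# Hypothetical test sequence: a (CA)8 run and a (CGG)6 run in unique flanks.
example = "GGT" + "CA" * 8 + "TTA" + "CGG" * 6 + "AT"
repeats = find_simple_tandem_repeats(example)
print(repeats)
```

The lazy quantifier makes the regex report the shortest repeating unit, so a run of A is reported as (A)n and not as (AA)n.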

5 PROCESSED PSEUDOGENES
FIGURE 7-5 Processed pseudogenes arise from integration of reverse transcribed mRNAs. When reverse transcriptase is present in a cell, mRNA molecules can be copied into dsDNA. In rare instances, these DNA molecules can integrate into the genome creating pseudogenes. Because introns are rapidly removed from newly transcribed RNAs, these pseudogenes have the common characteristic of lacking introns. This distinguishes the pseudogene from the copy of the gene from which it was derived. In addition, pseudogenes lack the appropriate promoter sequences to direct the transcription as these are not part of the mRNA from which they are derived. Molecular Biology of the Cell, 2008
To date, more than 8000 processed pseudogenes have been identified in the human genome. They derive from ~2500 distinct functional genes, and 10% of human genes have at least one corresponding pseudogene copy. Together they occupy about 0.5% of the genome.

6 LOCATION OF TANDEM REPEATS IN THE HUMAN CHROMOSOME
Figure labels:
Telomere (tandem repeats of TTAGGG minisatellite). Length - several kb
Centromere (various satellite DNA components). Length - several Mb. See (B) for examples: alphoid satellite, β satellite, satellites 2 and 3, plus satellite 1 and other repeats
Microsatellites (widely dispersed over chromosomes)
rDNA
Hypervariable minisatellite DNA (preferentially in regions close to telomeres)
Jonathan Wolfe, Dell Corp.

7 TRANSPOSONS
Transposition rates can be high: 10^-3 to 10^-5 (per transposon per generation), whereas base substitution occurs at only 10^-8 to 10^-9 (per base per generation). In Drosophila, 50–80% of all mutations are caused by transposon activity, although transposons make up only 15–22% of the genome. In the human genome, by contrast, transposons make up 45% of the genome but account for only 0.1–1% of all mutations. The reason is the huge difference in transposon activity between the two organisms. Nature 443, 521 (2006)

8 TYPES OF TRANSPOSONS Retrotransposons

9 STRUCTURE OF RETROTRANSPOSONS
RNA binding protein Reverse transcriptase + endonuclease 13 (2003)

10 STRUCTURE AND CLASSIFICATION OF TRANSPOSONS Evolutionary aspects
(*) Box 2. Structure and nomenclature of transposable elements There is no officially agreed system for classifying transposable elements, but here is a simple system based on their evolution (phylogeny) and the genetic modules they contain (modified from ref. 30). The term transposon is often used as a generic term instead of transposable element, but it was originally coined to name the first characterized TE. This first TE moves about the genome through a DNA intermediate, using the transposase (Trp) enzyme to splice itself in and out of the DNA. The term DNA transposon (class II elements) is often used to distinguish this kind of TE from LTR (long terminal repeat) and non-LTR retrotransposons (class I elements) that move by means of an RNA intermediate and use the reverse transcriptase (RT) enzyme. Inverted terminal repeats (ITRs) are needed for the movement of DNA transposons. The gag gene specifies the components of the molecular complex that is associated with the RNA transposition intermediate of retrotransposons. Retrotransposons also encode the RT enzyme, which synthesizes a complementary single-stranded RNA from the inserted DNA of the TE, and converts it to a double-stranded DNA that will be integrated in the genome elsewhere. They also encode the enzyme ribonuclease H (RH), which degrades the DNA–RNA hybrids obtained during transposition. Retrotransposons have genes encoding the integrase enzyme (INT), which splices the double-stranded DNA into a new spot in the host genome, and an enzyme (a protease; PR) that cuts up precursor proteins and is involved in particle assembly. Some retrotransposons have gained the envelope gene (env), which encodes surface proteins that interact with the host cell membrane, conferring infectious characteristics on the elements. The human genome has about half a million copies of long interspersed nuclear elements (LINEs), of which 50–100 are still active. 
And there are more than a million copies of a short interspersed nuclear element (SINE) called Alu in the human genome. Alu elements are also a major source of genomic diversity in dogs. These SINEs have a particular structure and depend on LINEs for their transposition. Miniature inverted-repeat transposable elements (MITEs) are an ancient TE family that is characterized by short-sequence, terminal or subterminal inverted repeats flanked by short direct repeats, and they have no coding potential. They are distributed ubiquitously and seem to originate from DNA transposons. Nature 443,521(2006) (*) Miniature inverted-repeat transposable elements (MITEs)

11 POLY(A)-TYPE RETROTRANSPOSONS
RNA binding protein
In the human genome, this class of retrotransposons is represented by LINE and SINE elements.

12 TYPES OF INTERSPERSED REPEATS IN THE HUMAN GENOME
Figure labels: poly-A elements; endogenous retroviruses; LTR elements.
Figure 17 Almost all transposable elements in mammals fall into one of four classes. Nature 409, 860 (2001)
LINE – Long Interspersed Nuclear Element
SINE – Short Interspersed Nuclear Element
LTR – Long Terminal Repeat
P – promoter
ORF1 (open reading frame 1) – RNA-binding protein gene
ORF2 – reverse transcriptase and endonuclease gene
gag – structural protein gene
pol – reverse transcriptase, RNase H and integrase gene
(env) – envelope protein gene (absent from endogenous retroviruses, present in exogenous ones)

13 INTERSPERSED REPEATS (IRs) IN THE HUMAN GENOME
FIGURE 1 | The transposable element content of the human genome a | Approximately 45% of the human genome can currently be recognized as being derived from transposable elements, the majority of which are non-long terminal repeat (LTR) retrotransposons, such as LINE-1 (L1), Alu and SVA elements. b | The canonical L1 element consists of two open reading frames (ORF1 and ORF2) flanked by 5' and 3' UTRs. The 5' UTR possesses an internal RNA polymerase II promoter (blue box). The element ends with an oligo(dA)-rich tail (AAA) preceded by a polyadenylation signal (pA). The canonical Alu element consists of two related monomers separated by an A-rich linker region (with consensus sequence A5TACA6). The left monomer contains A and B boxes (blue boxes), which are transcriptional promoters for RNA polymerase III. The element ends with an oligo(dA)-rich tail (AAA) that can be up to 100 bp long. The canonical SVA element has a composite structure consisting of (from the 5' end to 3' end): a (CCCTCT)n hexamer repeat region; an Alu-like region consisting of two antisense Alu fragments and an additional sequence of unknown origin; a variable number of tandem repeats (VNTR) region made of units 35–50 bp in length; and a region derived from the envelope polyprotein (env) gene and the 3' LTR of human endogenous retrovirus (HERV)-K10. The element ends with an oligo(dA)-rich tail preceded by a polyadenylation signal. L1, Alu and SVA elements are typically flanked by target site duplications (black arrows) that are generated upon integration. Elements are not drawn to scale. SVA: SINE+VNTR+Alu Nature Rev.Genetics 10,691(2009)

14 INTERSPERSED REPEATS (IRs) IN THE HUMAN GENOME
Box 1 | Repetitive DNA in the human genome Approximately 50% of the human genome is comprised of repeats. The table in panel a shows various named classes of repeat in the human genome, along with their pattern of occurrence (shown as 'repeat type' in the table; this is taken from the RepeatMasker annotation). The number of repeats for each class found in the human genome, along with the percentage of the genome that is covered by the repeat class (Cvg) and the approximate upper and lower bounds on the repeat length (bp). The graph in panel b shows the percentage of each chromosome, based on release hg19 of the genome, covered by repetitive DNA as reported by RepeatMasker. The colours of the graph in panel b correspond to the colours of the repeat class in the table in panel a. Microsatellites constitute a class of repetitive DNA comprising tandem repeats that are 2–10 bp in length, whereas minisatellites are 10–60 bp in length, and satellites are up to 100 bp in length and are often associated with centromeric or pericentromeric regions of the genome. DNA transposons are full-length autonomous elements that encode a protein, transposase, by which an element can be removed from one position and inserted at another. Transposons typically have short inverted repeats at each end. Long terminal repeat (LTR) elements (which are often referred to as retrovirus-like elements) are characterized by the LTRs (200–5000 bp) that are harboured at each end of the retrotransposon. LINE, long interspersed nuclear element; rDNA, ribosomal DNA; SINE, short interspersed nuclear element. Nature Rev.Genetics 13,36(2012)

15 THERE MAY BE EVEN MORE INTERSPERSED REPEATS IN THE HUMAN GENOME
(using the new sequence analysis method P-clouds) Figure 3. Percentage of previously-identified transposable elements annotated by P-clouds. B) The number of nucleotides annotated or missed. doi: /journal.pgen g003 PLOS Genet. 7, No. 12, e (2011)

16 LINE AND SINE ELEMENTS
Full-length LINEs in the human genome are 6–8 kb long. They contain a pol II promoter and two ORFs: one encodes reverse transcriptase (RT) and endonuclease, the other an RNA-binding protein. Almost all LINE elements survive only as short deletion variants (under 1 kb) and are inactive. The most common of the three elements (LINE1, 2 and 3) is LINE1, the only LINE still active in the human genome. Of the total number of LINE1 copies (about ), only some 50–100 have retained activity. In the mouse genome, about 3000 LINE1 copies may be active.
SINEs are short (100–400 bp) repeats with a pol III promoter but no ORF. Their mobility is provided by the LINE machinery. The main SINE in the human genome is the Alu element (full length ~280 bp), the only one that still shows activity. The other SINEs, MIR and MIR3, are no longer active. SINEs also occur as deletion variants.
Alu is the main interspersed repeat that has retained activity in the human genome. According to Genome Res. 19, 1516 (2009), the retrotransposition frequency is high: one Alu insertion per 21 births, and one insertion per 212 and 916 births for LINE1 and SVA, respectively.
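As a rough worked example, the per-birth frequencies quoted from Genome Res. 19, 1516 (2009) can be combined into an overall insertion rate. The sketch below assumes the three element families insert independently, so their per-birth rates simply add. That assumption and the variable names are illustrative and not part of the slide.

```python
# Births per new insertion, as quoted on the slide.
births_per_insertion = {"Alu": 21, "L1": 212, "SVA": 916}

# Convert to expected insertions per birth for each family.
rates = {name: 1.0 / births for name, births in births_per_insertion.items()}

# Assumption: families act independently, so the rates are additive.
total_rate = sum(rates.values())
births_per_any_insertion = 1.0 / total_rate
alu_share = rates["Alu"] / total_rate

print(f"insertions per birth: {total_rate:.4f}")
print(f"about one new insertion per {births_per_any_insertion:.1f} births")
print(f"share contributed by Alu: {alu_share:.0%}")
```

Under these assumptions Alu accounts for most new insertions, which agrees with the slide's statement that Alu is the main active interspersed repeat.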

17 YOUNG Alu ELEMENTS ARE THE MOST ACTIVE INTERSPERSED REPEATS IN THE HUMAN GENOME
Alu activity
Figure 1. A genome-wide view of human Alu activity. A total of 850,044 full-length (>268 bp) genomic Alus were identified in hg18 of the reference human genome sequence and assigned to known Alu subfamilies. Alu elements frequently have sequence changes relative to consensus sequences. The number of changes for each full-length copy is indicated on the x-axis; the copy number for a given level of sequence variation is indicated on the left y-axis. Pink data points mark the mobilization activities of the 89 Alu copies that were examined in this study (labeled on the right y-axis). In sum, 8 AluJ, 27 AluS, and 54 AluY copies were tested, spanning a range of subfamilies and variation levels. Note that elements with fewer changes relative to consensus sequences (zero changes) generally had the highest levels of activity; no elements below 10% variation (28 changes) were active. Please see Supplemental Table 1 for additional details and error measurements.
Figure labels: older Alu; younger Alu.
Only young Alu elements in the human genome show the ability to transpose. Several thousand Alu elements may have retained this activity. Activity is tested by the ability to move the neo marker gene from a cloned plasmid into the genome of HeLa cells. Genome Res. 18, 1875 (2008)

18 LINE ELEMENT INSERTION SCHEME
LINE elements transpose only in the nucleus, by the target-primed reverse transcription (TPRT) scheme. This is carried out by the RT bound to the LINE RNA.
The retrotransposition cycle
The increase in copy numbers of non-long terminal repeat (LTR) retrotransposons occurs through an RNA-based duplication process termed retrotransposition. The first step in LINE-1 (L1) retrotransposition involves RNA polymerase II-mediated transcription of a genomic L1 locus from an internal promoter that directs transcription initiation at the 5' boundary of the L1 element; an internal promoter allows a retrotransposon to generate autonomous duplicate copies at multiple locations in the genome. The L1 RNA is exported to the cytoplasm, in which ORF1 (which encodes an RNA-binding protein) and ORF2 (which encodes a protein with endonuclease and reverse-transcriptase activities) are translated. Both proteins show a strong cis-preference; consequently, they preferentially associate with the L1 RNA transcript that encoded them to produce a ribonucleoprotein (RNP) particle. The RNP is then transported back into the nucleus by a mechanism that is poorly understood. The integration of the L1 element into the genome is likely to occur through a process termed target-primed reverse transcription (TPRT), which was originally described for the R2 non-LTR retrotransposon of the silkworm Bombyx mori. During TPRT, it is thought that the L1 endonuclease cleaves the first strand of target DNA, generally between T and A at 5'-TTTTAA-3' consensus sites (see the figure, part a). The free 3' hydroxyl (OH) generated by the nick is then used to prime reverse transcription of L1 RNA (red) by the L1 reverse transcriptase (b). The second strand of the target DNA is cleaved (c) and used to prime second-strand synthesis (d) through poorly understood mechanisms. Hallmarks of the integration process include frequent 5' truncations, the presence of an oligo(dA)-rich tail at the 3' end and target site duplications (TSDs) of between 2 and 20 base pairs in length (e). Alu and SVA retrotransposition is also likely to occur through TPRT using the L1 retrotransposition machinery. The mechanism of Alu and SVA trans-mobilization by L1 proteins remains elusive. RNA polymerase III-mediated Alu transcripts are exported to the cytoplasm and bound to signal recognition particle 9 kDa protein (SRP9) or SRP14 to form stable RNPs. It has been suggested that Alu RNPs interact with ribosomes, thereby positioning Alu transcripts in close proximity to nascent L1 ORF2 proteins (the ORF1 protein enhances, but is not strictly required for, Alu retrotransposition). However, it remains unclear whether Alu RNPs gain access to the L1 retrotransposition machinery in the cytoplasm or in the nucleus, as Alu RNPs might recruit L1 ORF2 proteins in the nucleus and immediately proceed with TPRT. Nature Rev. Genetics 10, 691 (2009)
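The first TPRT step, nicking of the target DNA between T and A at a 5'-TTTTAA-3' consensus site, can be illustrated with a small Python sketch that lists candidate nick positions on both strands of a sequence. The function names and the coordinate convention (0-based index of the base just 3' of the nick on the forward strand) are assumptions made for illustration. Real L1 endonuclease cleavage is more degenerate than an exact motif match.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_consensus_nick_sites(seq, motif="TTTTAA", cut_offset=4):
    """List (strand, position) for exact matches to the consensus site.

    cut_offset=4 places the nick between the fourth T and the first A
    (TTTT|AA). Positions are forward-strand, 0-based boundaries; this
    convention is an illustrative assumption.
    """
    seq = seq.upper()
    rc_motif = reverse_complement(motif)  # TTAAAA on the forward strand
    sites = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            sites.append(("+", i + cut_offset))
        if window == rc_motif:
            # The motif lies on the reverse strand; its T|A boundary maps
            # to this forward-strand coordinate.
            sites.append(("-", i + len(motif) - cut_offset))
    return sites

# Hypothetical target sequence with one site on each strand.
target = "GCTTTTAAGC" + "GCTTAAAAGC"
nick_sites = find_consensus_nick_sites(target)
print(nick_sites)
```

Each reported position marks where the free 3' OH would be generated and used to prime reverse transcription in the scheme described above.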

19 LINE ELEMENT INSERTION SCHEME
Integration. L1-encoded RNA and proteins assemble into ribonucleoprotein particles. Reverse transcription of L1 RNA is coupled to insertion into DNA at random sites throughout the genome. Science 346,1187(2014)

20 The blockade of retrotransposon activity weakens with time
Aging guards invite a jailbreak. With aging, increased stress, DNA damage, and telomere shortening weaken the multiple systems that keep retrotransposons in check. Aged cells lose repressive heterochromatin, SIRT6 relocalizes away from L1 promoters, and autophagy becomes less efficient. Other defense pathways (see box) may also lose their effectiveness. The consequent unleashing of L1 elements could lead to profound somatic damage, driving age-associated cell and tissue dysfunction. Science 346, 1187 (2014)

21 MOBILITY OF INTERSPERSED REPEATS
LINE elements persist in the genome more easily because LINE proteins show a strong cis-preference when binding LINE RNA: the proteins bind mainly the RNA from which they were translated. As a result, almost only fully functional LINE elements transpose. Sometimes, however, LINE activity does not require a full-length element: with other LINE elements acting as helpers, some LINE deletion variants can also be mobilized.
The Alu transcript is functionally similar to the LINE RNA on which LINE proteins are synthesized in the cytoplasm. As a result, Alu RNA associates with the ribosome that is synthesizing LINE proteins and captures the newly made LINE RT. Binding to RT enables reverse transcription of the Alu RNA and its subsequent integration into the genome.

22 Alu MOBILIZATION BY A LINE ELEMENT
ORF 2 protein CBP – Cap binding protein ORF 1 – RNA binding protein ORF 2 – Reverse transcriptase / endonuclease Nature Genetics 35, 15 (2003)

23 AGE OF INTERSPERSED REPEATS
The age of interspersed repeats (IRs) is judged by the number of mutations accumulated in them (under non-selective conditions), that is, by how far each individual IR copy differs from the consensus sequence of its IR class. Each IR copy arose from the transposition of a once-active IR. As transposition activity declines, the IR copies diverge rapidly from the original IR sequence through independent mutations.
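The dating idea above can be sketched in a few lines of Python: compute the fraction of mismatched positions between each repeat copy and the class consensus, then order the copies by that divergence. The function names and the toy sequences are hypothetical, and the sketch assumes the copies have already been aligned to the consensus (equal length, "-" for gaps).

```python
def mismatch_fraction(copy, consensus):
    """Fraction of aligned, non-gap positions where copy differs from consensus."""
    if len(copy) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(copy.upper(), consensus.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared

def rank_copies_by_divergence(copies, consensus):
    """Return copy names ordered from least to most diverged (youngest first)."""
    return sorted(copies, key=lambda name: mismatch_fraction(copies[name], consensus))

# Toy, hypothetical data: a 10-bp consensus and three aligned copies.
consensus = "ACGTACGTAC"
copies = {
    "copy_old": "ACGAACGTTC",    # 2 mismatches
    "copy_young": "ACGTACGTAC",  # identical to consensus
    "copy_mid": "ACGTACCTAC",    # 1 mismatch
}
order = rank_copies_by_divergence(copies, consensus)
print(order)
```

The ordering is only a relative age ranking. Converting divergence to absolute time needs a substitution-rate calibration, which this sketch does not attempt.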

24 ORIGIN OF THE HUMAN Alu AND MOUSE B1 SINE ELEMENTS
Human Molecular Genetics, 2nd Edition
Alu (and B1) contain an internal RNA Pol III promoter (elements a and b), from which they are transcribed. Molecular Biology of the Cell, 2008
Figure The human Alu repeat and the mouse B1 repeat evolved from processed copies of the 7SL RNA gene. Extensive homology of the Alu repeat sequences to the ends of the 7SL RNA sequence suggests that a polyadenylated copy of the 7SL RNA gene integrated elsewhere in the genome by a retrotransposition event (see Figure 8.7). In some cases, the integrated copies were able to produce RNA transcripts of their own, using the internal promoter of the 7SL RNA gene. At a very early stage, an internal segment (between c and d) was lost. Subsequently, a 32-bp central segment containing regions flanking the original deletion (c + d) was deleted to give a related repeat unit. Fusion of the two types of unit resulted in the classical Alu dimeric repeat, with the left (5') monomer lacking a 32-bp sequence and the right (3') monomer containing the 32-bp sequence. Note that in the human genome there are also multiple copies of a free left Alu monomer (FLAM) and a free right Alu monomer (FRAM). In the mouse, a similar process of copying from the 7SL RNA gene appears to have occurred with subsequent deletion of a large internal unit (between a and b), followed by tandem duplication of flanking regions (a + b).
Figure The proposed pattern of expansion of the abundant Alu and B1 sequences found in the human and mouse genomes, respectively. Both of these transposable DNA sequences are thought to have evolved from the essential 7SL RNA gene which encodes the SRP RNA (see Figure 12-41). On the basis of the species distribution and sequence similarity of these highly repeated elements, the major expansion in copy numbers seems to have occurred independently in mice and humans (see Figure 5-78). (Adapted from P.L. Deininger and G.R. Daniels, Trends Genet. 2:76–80, 1986 and International Human Genome Sequencing Consortium, Nature 409:860–921, 2001.)
FIGURE Pol III core promoter. Shown here is the promoter for a yeast tRNA gene. The order of events leading to transcription initiation is described in the text.
Compared with the mouse genome, human interspersed repeats (IRs) are old, and the number of functionally active IRs is tiny. The activity of all IR types except LINE1 has declined sharply over the last millions of years. Apart from a strong peak of Alu activity about 40–50 million years ago, the activity of all other IRs has declined since the divergence of the mammalian lineage.

25 AGE OF INTERSPERSED REPEATS IN THE HUMAN GENOME Nature 409, 860 (2001)
Figure 18 Age distribution of interspersed repeats in the human and mouse genomes. Bases covered by interspersed repeats were sorted by their divergence from their consensus sequence (which approximates the repeat's original sequence at the time of insertion). The average number of substitutions per 100 bp (substitution level, K) was calculated from the mismatch level p assuming equal frequency of all substitutions (the one-parameter Jukes–Cantor model, K = -(3/4) ln(1 - (4/3)p)). This model tends to underestimate higher substitution levels. CpG dinucleotides in the consensus were excluded from the substitution level calculations because the C→T transition rate in CpG pairs is about tenfold higher than other transitions and causes distortions in comparing transposable elements with high and low CpG content. a, The distribution, for the human genome, in bins corresponding to 1% increments in substitution levels. b, The data grouped into bins representing roughly equal time periods of 25 Myr. c,d, Equivalent data for available mouse genomic sequence. There is a different correspondence between substitution levels and time periods owing to different rates of nucleotide substitution in the two species. The correspondence between substitution levels and time periods was largely derived from three-way species comparisons (relative rate test) with the age estimates based on fossil data. Human divergence from gibbon 20–30 Myr; old world monkey 25–35 Myr; prosimians 55–80 Myr; eutherian mammalian radiation 100 Myr.
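The Jukes–Cantor correction quoted in the caption, K = -(3/4) ln(1 - (4/3)p), is easy to evaluate directly. The short Python sketch below takes the mismatch level p as a fraction between 0 and 0.75 and returns K as substitutions per site. The function name and the input check are illustrative assumptions.

```python
import math

def jukes_cantor_k(p):
    """Substitution level K from mismatch level p (both as fractions per site).

    Implements K = -(3/4) * ln(1 - (4/3) * p), the one-parameter
    Jukes-Cantor model named in the figure caption.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("p must satisfy 0 <= p < 0.75")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

# Example: observed mismatch levels of 1%, 10% and 30%.
for p in (0.01, 0.10, 0.30):
    k = jukes_cantor_k(p)
    print(f"p = {p:.2f}  ->  K = {k:.4f}  ({100 * k:.1f} substitutions per 100 bp)")
```

The correction is small for young repeats and grows with divergence, because at higher p more sites have been hit more than once.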

26 COMPARISON OF INTERSPERSED REPEATS
Human euchromatin has the highest density of interspersed repeats (IRs) among the genomes sequenced so far (D. melanogaster, C. elegans, A. thaliana). The human genome is crowded with old copies of once-active IRs. It has no efficient mechanism for clearing them out; this is a general problem in mammals. In the human genome, LINE1 + Alu make up 60% of all IRs. Other organisms have no dominant IR class, nor so few DNA transposons.
Figure 20 Comparison of the age of interspersed repeats in eukaryotic genomes. The copies of repeats were pooled by their nucleotide substitution level from the consensus. Axis label: nucleotide substitution level from consensus. Nature 409, 860 (2001)

27 EVOLUTION OF INTERSPERSED REPEATS IN PRIMATES
Alu repeats stand out in particular: their number, and the rearrangements that follow, increase dramatically in the chimpanzee and human genomes. Alu in particular is credited with an important role in transposition and other genome rearrangements, although LINE1 elements play such a role as well.
Alu quiescence in the orang-utan lineage We identified only ~250 lineage-specific Alu retroposition events in the orang-utan genome, a dramatically lower number than that of other sequenced primates, including humans. The total number of lineage-specific L1, SVA and Alu insertions is shown (pie chart) at the terminus of each branch of the phylogeny of sequenced great apes shown in grey at left, along with the rate of insertion events per element type (bar graph). Reduced Alu retroposition potentially limited the effect of a wide variety of repeat-driven mutational mechanisms in the orang-utan lineage that played a major role in restructuring other primate genomes. Nature 469, 529 (2011)

28 LOCATION OF INTERSPERSED REPEATS
Repeats are not distributed evenly across the genome. An extreme case in the human genome is a 525 kb region at Xp11, where the density of interspersed repeats (IRs) is 89%. Some regions consist 60% of Alu, others 89% of LINE1. There are also regions free of IRs, such as the homeobox gene clusters HoxA, B, C and D, where the IR content of a 100 kb region is below 2%. This may be related to the complex regulatory system of the Hox genes, which the presence of IRs could disrupt.
Figure 21 Two regions of about 1 Mb on chromosomes 2 and 22. Red bars, interspersed repeats; blue bars, exons of known genes. Note the deficit of repeats in the HoxD cluster, which contains a collection of genes with complex, interrelated regulation. blue – exons; red – interspersed repeats. Nature 409, 860 (2001)

29 GENOME REARRANGEMENTS INDUCED BY LINE AND Alu
13, xxx (2003)

30 SEGMENTAL DUPLICATIONS

31 SEGMENTAL DUPLICATIONS
Segmental duplications (SDs), unlike copy number variants (CNVs), also called structural variants (SVs), have become fixed in the population and differ little between individuals. Like CNVs, SDs arose through duplication and transposition of particular genome regions. SDs are DNA regions of defined sequence (1–200 kb and larger) that are repeated several times and dispersed across the genome. Sequence identity exceeds 90% when duplicons of the same type are compared with one another. In the human genome, SDs are relatively young and show high sequence homology: 95–97%, up to 99%. Nature Rev. Genetics 7, 552 (2006)

32 SEGMENTAL DUPLICATIONS
Two kinds of segmental duplication (SD) are distinguished: interchromosomal and intrachromosomal. SD density varies strongly between chromosomes, and SDs are distributed very unevenly. In the human genome, 48% of SD pairs lie on different chromosomes, compared with only 13% in the mouse genome. SD structure and distribution differ markedly between genome regions: euchromatin, pericentromeres and subtelomeres. Many SDs lie in pericentromeric and subtelomeric regions. The duplication content of a pericentromeric region is 6–7 times higher than elsewhere on the chromosome. The closer to the centromeric α-satellite DNA, the more duplications are seen, which points to a role for α-satellites in duplication. Duplicated regions within one genome are called paralogs, in contrast to orthologs (regions of similar structure and function that are identified by comparing different organisms). Nature Rev. Genetics 7, 552 (2006)
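The distinction between intrachromosomal and interchromosomal duplications can be illustrated with a small Python sketch that filters pairwise alignments by length and identity and then reports the share of pairs lying on different chromosomes. The record layout, the function names, the thresholds (at least 1 kb and at least 90% identity) and the example records are all hypothetical, chosen only to mirror the figures quoted on the slides.

```python
from typing import List, NamedTuple

class PairwiseAlignment(NamedTuple):
    chrom_a: str         # chromosome of the first copy
    chrom_b: str         # chromosome of the second copy
    length_bp: int       # aligned length in base pairs
    identity_pct: float  # sequence identity in percent

def select_segmental_duplications(alignments: List[PairwiseAlignment],
                                  min_length_bp: int = 1000,
                                  min_identity_pct: float = 90.0):
    """Keep alignments long and similar enough to count as SD pairs.

    The thresholds are illustrative assumptions, not a published standard.
    """
    return [a for a in alignments
            if a.length_bp >= min_length_bp and a.identity_pct >= min_identity_pct]

def interchromosomal_fraction(sd_pairs: List[PairwiseAlignment]) -> float:
    """Fraction of SD pairs whose two copies lie on different chromosomes."""
    if not sd_pairs:
        raise ValueError("no SD pairs given")
    inter = sum(1 for a in sd_pairs if a.chrom_a != a.chrom_b)
    return inter / len(sd_pairs)

# Made-up example records, for illustration only.
records = [
    PairwiseAlignment("chr1", "chr1", 5000, 95.0),   # intrachromosomal SD
    PairwiseAlignment("chr1", "chr7", 20000, 98.5),  # interchromosomal SD
    PairwiseAlignment("chr2", "chr9", 800, 99.0),    # too short
    PairwiseAlignment("chr3", "chr3", 15000, 85.0),  # identity too low
]
sd_pairs = select_segmental_duplications(records)
inter_share = interchromosomal_fraction(sd_pairs)
print(len(sd_pairs), inter_share)
```

On real data, the same calculation would yield the kind of figure quoted above (48% of human SD pairs on different chromosomes), provided the alignments are derived in a comparable way.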

33 STRUCTURE OF PERICENTROMERIC REGIONS
Figure 1 Models of centromeric transition regions. a, Pre-genome sequence model of pericentromeric organization: tandem reiterations of higher-order alpha-satellite DNA constitute larger array structures whose precise composition is diagnostic for a particular chromosome. Blocks of alpha-satellite DNA lacking higher-order structure as well as other pericentromeric satellite DNA sequences map to the periphery. In some cases, such as 9q12, 16q12 and 1q12, these peripheral satellite DNAs became sufficiently large to warrant their own cytological designations known as a secondary constriction. b, Models of pericentromeric organization based on three sequenced chromosomes showing various degrees of duplication content and interstitial satellite content. Chromosome 7q represents a high level of segmental duplication whereas 19p represents an intermediate level and Xp a low level of segmental duplication. Nature 430, 857 (2004)

34 SEGMENTAL DUPLICATIONS Chr22
In euchromatin the repeats are mostly intrachromosomal, whereas in pericentromeric and subtelomeric regions interchromosomal repeats dominate, and these derive from a variety of chromosomes.
Figure labels: euchromatin; inter-chromosomal; intra-chromosomal.
FIGURE 1 | The distribution of segmental duplications (SDs) in the human genome. a | The q-arm of chromosome 22 is used to show the distribution of the three classes of SD: pericentromeric, interstitial and subtelomeric. The overview (top panel) shows the position of SDs that are ≥10 kb in length and ≥90% identity. Interchromosomal and intrachromosomal pairwise SDs are shown in red and blue, respectively, with light blue lines joining homologous SDs. The expanded views of the main duplicated regions show their SD structures in terms of % identity (vertical scales) and the chromosomal location of the other pairwise copy (blue represents chromosome 22, other colours represent other individual chromosomes). The pericentromeric region consists of many juxtaposed duplicons that originate from diverse ancestral regions. Secondary duplications of larger segments consisting of multiple duplicons (duplication blocks) are distributed among non-homologous pericentromeric regions. The interstitial region shows examples of interspersed (*) and tandem (**) intrachromosomal duplications. The expanded view for this region delineates an interspersed cluster that spans 3.5 Mb and contains three large duplication blocks in which the average sequence identity is 98–99%. The subtelomeric region contains approximately 100 kb of interchromosomal SD sequence, which is shared with up to three other non-homologous subtelomeric regions. Nature Rev. Genetics 7, 552 (2006)

35 PERICENTROMERIC REGIONS AND GENOME REARRANGEMENTS
Pericentromeres and subtelomeres, heterochromatic regions with satellite DNA structure, are especially active in recombination. Gene copies that land in these pericentromeric or subtelomeric regions undergo active remodelling: splitting into separate fragments, inversions, transpositions and so on. Non-allelic recombination between repeated sequences plays a major role here.
Duplications allow new genes to form free of selection pressure, because at least one unchanged copy of the gene is retained outside the pericentromere and performs its function while the other copies are free to change. Gene duplication followed by modification of the copies may be a general mechanism for the formation of new genes.
Segmental duplications in euchromatic regions arose mostly through non-homologous recombination, in which interspersed repeats, including Alu, play the main role in translocations. Alu sequences are found at the ends of 29% of segmental duplications.

36 DUPLICATIONS IN THE 2p11 PERICENTROMERE
Over 35–40 million years, the 2p11 pericentromeric region accumulated segments transferred from euchromatin, the so-called ancestral duplications. This is followed by much faster remodelling of these duplications in chromosome pericentromeres, producing new combinations of genome segments that can eventually lead to the formation of a new gene. Creating a new gene from fragments is not a fast process, however: perhaps one gene per 1 million years.
Figure 5 Ancestral duplicons within 2p11. The modular organization of a pericentromeric region, 2p11, is depicted based on the classification of the underlying pairwise alignments. Duplicated segments that originate outside of the pericentromeric region, termed ancestral duplicons, are shown in colour (ancestral cytogenetic band locations are delineated). Unshaded regions correspond to regions where no underlying duplication could be detected. The minimal tiling path of large-insert BAC clones is drawn to scale below each line. A 737-kb validated sequence contig that provides the first autosomal transition into higher-order alpha-satellite repeat DNA is shown here. Approximately 98% of this region is composed of duplicated material of which 57% can be traced back to non-pericentromeric regions of the genome. These correspond to 13 ancestral duplicons of which 9 were experimentally confirmed by non-human primate analyses. Nature 430, 857 (2004)

37 ACQUISITION OF DUPLICATIONS IN THE 2p11 PERICENTROMERE AND THEIR REDISTRIBUTION TO OTHER CHROMOSOMES
Figure labels: 10–20 Myr; 4–8 Myr. Figure 5. A model for the acquisition and dispersal of 2p11 duplicons. An expanded two-step model is shown to explain the current organization of 2p11. First, a burst of DNA duplicative transposition events occurs in the common ancestor of humans and apes (10–20 Myr), creating a large mosaic region consisting of at least 14 duplicons. During the radiation of humans and African great-apes (4–8 Mya), a series of secondary duplications disperse larger cassettes to other pericentromeric regions, leading to quantitative and qualitative differences of each larger block within different lineages. More recent transposition events suddenly cease or are no longer fixed during this second phase. Genome Research 15, 914 (2005)

38 THE SEGMENTED GENOME: REGIONS ACTIVE IN GENOME CHANGE ARE UNEVENLY DISTRIBUTED
Fig. 3. Genomic locations of segments belonging to the divergence states of the six-state AR (ancestral repeats) model. Bars represent human chromosomes, reported in scale and with positions indicated on the vertical axis. Gray regions correspond to windows excluded from the analysis due to assembly gaps or data preprocessing filters. PNAS 110,14699(2013)

39 CENTROMERE TRANSPOSITIONS
During genome evolution centromeres can move and disappear, forming new chromosome structures. This process is linked to the formation and loss of heterochromatin (α-satellite) structure in the regions concerned. The centromere is not active in genetic recombination, despite its overabundance of tandem repeats. Figure labels: duplicated E; centromere; α-satellites; repositioning. AC: ancestral centromere; NC: neocentromere. 13, xxx (2003)

40 THE CENTROMERE AND SEGMENTAL DUPLICATION
When a centromere loses its activity, its α-satellite ultrastructure breaks down and the previously silent tandem repeats become active in recombination. As a result, non-homologous recombination and gene conversion duplicate the centromeric and pericentromeric repeated sequences and also disperse them into the euchromatic part of the genome. Centromeres and pericentromeric regions can be regarded as generators of segmental duplications and of their further transformation, and also as factories of new genes. Figure labels: inactivated centromere; chromatin relaxation; loss of satellites; transcription is activated; transcription is activated and recombination increases; segmental duplications in the pericentromere; transcription sites. 13 (2003)

41 SCHEMES OF GENOME STRUCTURAL REARRANGEMENT
Four types of common Mobile element Associated Structural Variants (MASVs) in the HuRef genome. (A) Classical retrotransposon insertion; (B) nonclassical insertions; (C) nonallelic homologous recombination-mediated insertion/deletion; (D) nonhomologous end-joining-mediated deletion. (TTAAAA) Standard L1 cleavage site for classical retrotransposition; (black lines) flanking regions, (gray lines) intervening regions, (dotted circles) homologous recombining regions, (red boxes) microhomology regions, (red arrow boxes) TSDs of each element. Genome Res. 19,1516(2009)

42 SEGMENTAL DUPLICATIONS AND GENE EVOLUTION
Two examples of new primate genes formed by fusions mediated by SDs. FIGURE 6 | Gene innovation in segmental duplications (SDs). Two examples of 'novel' primate-specific genes that have been created by SDs are shown. a | The TRE2 oncogene (also known as ubiquitin-specific protease 6 (USP6)) is a hominoid-specific gene that is located at 17p13.2 in humans. This gene was formed from the fusion of two SDs, each carrying sequence from genes that are located on the q-arm of human chromosome 17, at a distance from the duplication site (TBC1 domain family member 3 (TBC1D3) and USP32). TRE2 has derived exons 1–14 (red) from TBC1D3 and exons 15–29 (green) from USP32. b | The RanBP2-like GRIP-domain-containing protein (RGP) gene family formed from the fusion of SDs of the genes RANBP2 and GCC2, which are located on human chromosome 2q13. As shown in the top panel, a prototypical RGP is composed of the first 20 exons of RANBP2 (red) and the last three exons of GCC2 (yellow). This fusion sequence has been extensively duplicated as part of duplication hubs on chromosome 2, on both the q- and p-arms. Below this is a detailed view of the duplicon structure of the RGP-containing regions involved, illustrating the complex mosaic pattern that has arisen during the formation of this gene family. GCC2 and RANBP2 regions are shown in yellow and red, respectively; regions of other genes are shown in various colours. These duplication hubs show evidence for multiple functional copies under extensive positive selection (ref. 21). Modified with permission from Ref. 21 © (2005) Cold Spring Harbor Laboratory Press. Nature Rev.Genetics 7,552(2006)

43 SEGMENTAL DUPLICATION AND ADAPTIVE EVOLUTION
LCR16a (LCR: low copy number repeats), one of the genes of a not yet characterized gene family (morpheus) on human chr16, has been transferred to the short arm of the same chromosome as 14 copies that are highly homologous to one another (each copy is about 20 kb). Five of the LCR16a copies are transcriptionally active and have a typical eukaryotic gene structure. Among the duplicated LCR16a gene structures, the noncoding part shows 98.1% homology, whereas the exons show only 81%. This indicates positive selection for amino acid changes in the LCR16a protein during evolution. Figure 1. Sequence properties of the LCR16a duplication. a, Schematic display of the distribution pattern (red bars) of LCR16a duplications relative to a human chromosome 16 ideogram. The analysis is based on the published human genome project assembly (refs 4, 5) and shows the clustering of duplications on the short arm of chromosome 16. The gene structure of one member of the gene family (AF132984) is shown (green bars) compared with the 20-kb LCR16a segment from its corresponding genomic locus (AC002045). The analysis indicates eight exons, two strong polyadenylation signals within the 3' untranslated region, and a putative promoter region overlapping the first exon. Nature 413, 514 (2001)

44 SEGMENTAL DUPLICATION AND ADAPTIVE EVOLUTION
Different LCR16a copies were sequenced in chimpanzee, gorilla, orangutan, and various Old World monkeys, and the ratio Ka : Ks (nonsynonymous to synonymous substitutions) was compared between species. The extreme value is between human and the other primates, Ka : Ks = 13.0; between chimpanzee and the remaining primates Ka : Ks = 11.8; human and chimpanzee also differ, Ka : Ks = 5.0; by contrast, between gibbon and orangutan Ka : Ks = 1.0. The amino acid sequence of the LCR16a protein has diverged 20 times faster than calculated when the neutral mutation rate value x10^-9 is used in the calculation. The LCR16a duplication and the positive selection that followed took place after the chimpanzee-human lineage separated from the orangutan lineage 12 million years ago, and continued after the human and chimpanzee lineages separated. Nature 413, 514 (2001)
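The Ka : Ks comparison on this slide can be illustrated with a small calculation. The sketch below assumes the simplest uncorrected estimate, in which Ka and Ks are the fractions of nonsynonymous and synonymous sites that carry a substitution. The function names, the tolerance, and any counts passed in are hypothetical and are not taken from Nature 413, 514 (2001); published analyses usually apply multiple-hit corrections that are omitted here.

```python
def ka_ks_ratio(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Uncorrected Ka/Ks from substitution and site counts (illustrative only)."""
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    ka = nonsyn_subs / nonsyn_sites  # nonsynonymous substitutions per nonsynonymous site
    ks = syn_subs / syn_sites        # synonymous substitutions per synonymous site
    if ks == 0:
        raise ValueError("Ks is zero, so the ratio is undefined")
    return ka / ks


def interpret(ratio, tolerance=0.1):
    """Rough reading of a Ka/Ks value; the tolerance band is an arbitrary assumption."""
    if ratio > 1 + tolerance:
        return "positive selection"
    if ratio < 1 - tolerance:
        return "purifying selection"
    return "neutral"
```

With this reading, the reported ratios of 13.0, 11.8, and 5.0 fall in the positive-selection range, while the gibbon-orangutan value of 1.0 is what neutral change would give.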

45 THE PLACE OF TRANSPOSONS IN GENOME REMODELING AND ACTIVITY
Nature 443,521(2006)

46 TRANSPOSABLE ELEMENTS AS MOBILE GENE REGULATORS
DNA hypomethylation marks the presence of enhancers in the genome. Figure 1: Possible evolutionary scenarios leading to TE functionalization and DNA hypomethylation. A TE whose insertion creates deleterious enhancer activity by driving expression of a target gene in an inappropriate tissue will be removed from the population by purifying selection (left). High levels of CpG methylation within TEs (5-methylcytosine, 5mC) may buffer against deleterious effects by obscuring the regulatory potential of the TE. This buffering would allow further adaptation to occur toward tissue-specific enhancer activation and DNA hypomethylation (right). TF, transcription factor. Nature Genetics 45,717(2013)

47 HYPOMETHYLATED TRANSPOSABLE ELEMENT REGIONS AS REGULATORY ELEMENTS OF GENE ACTIVITY
Axis label: degree of DNA methylation. Figure 3: Tissue-specific hypomethylated TEs correlate with gene expression. (d) UCSC Genome Browser view of an LFSINE element upstream of the GFRA1 gene. Displayed tracks include DNA methylation (MeDIP-seq) for human ESC H1, breast, brain and blood samples; histone modification (H3K4me3 and H3K4me1) tracks for a fetal brain sample and a naive CD8+ T cell sample; gene annotation; and repetitive elements shown as black boxes annotated by RepeatMasker. The dashed box highlights the LFSINE element. (e) Bisulfite sequencing validation of the DNA methylation status of the LFSINE element (four CpG sites) in human ESC H1, breast, brain and blood samples. (f) Box plots of the expression levels of GFRA1 in four different tissues. The degree of DNA methylation (and of H3K4 methylation) in different tissues is analyzed in the region of the GFRA1 gene. The lack of DNA methylation that coincides with the LFSINE insertion site correlates with GFRA1 expression. Nature Genetics 45,836(2013)

48 GENE EVOLUTION

49 TWO PATHWAYS OF GENE EVOLUTION
New genes arise mainly by two routes: (1) duplication of individual genome regions (segmental duplications or duplications caused by transposable elements), followed by divergence of the resulting copies; (2) assembly of genes from individual structural domains, including the addition of new exons to existing genes by means of alternative splicing. Comparison of the human and mouse genomes (proteomes) shows a large predominance of orthologous genes (80%) over paralogous genes, which points to the second route as the main one in gene evolution. However, gene duplication, that is, the formation of paralogs, gives rise to gene families, and the first mechanism dominates their evolution.

50 GENE EVOLUTION IS STILL UNDER WAY TODAY
Evolution also shows controlled gene inactivation (gene death): the human genome contains unprocessed pseudogenes that arose quite recently and differ from active genes by only a few mutations. This indicates that gene inactivation is still occurring in the course of evolution. Of 32 such recently arisen pseudogenes (each has on average 0.8 stop codons and 1.6 frameshifts), 10 are derived from olfactory receptor genes. The inactivation of olfactory receptor genes, together with the appearance of new variants, indicates ongoing change in this large gene family.

51 DUPLICATIONS AS A MECHANISM OF GENE EVOLUTION
The cytochrome P450 gene superfamily in human, mouse, and fish: the human genome contains 63 genes of this family, whereas the mouse has considerably more, 84 genes. Large differences between human and mouse genes are found in Cyp2b, 2c, 2d, and 4a. Nature 420, 520 (2002)

52 GENE DUPLICATIONS AS A SOURCE OF NEW GENES
After amplification, copies of gene A are released from selection pressure and can freely accumulate mutations and any rearrangements. Most of these are either deleterious or neutral, which can lead to loss of the gene copies before any beneficial change can arise in them and become fixed in the population. Such a process is therefore more likely when the amplified gene A has, besides its main function, a minor side function b that is worth increasing under certain physiological conditions. Amplification of gene A+b is then under positive selection, and a copy of A+b protected in this way can, through further changes (divergence), acquire a new function B that becomes fixed in the genome. After genetic segregation, a new gene B has arisen in addition to gene A. Real-Time Evolution of New Genes by Innovation, Amplification, and Divergence. Fig. 1 (A) The IAD model. Innovation occurs when the ancestral gene (green) encodes a protein with the main function “A” and a minor activity “b.” Amplification occurs when an environmental change makes the b activity beneficial and selection favors variants with increased b activity. Divergence may occur in any one of the amplified gene copies that acquires a beneficial mutation that increases “B” activity (blue gene copy). After a B mutation, selection for the amplified array is relaxed, and segregation occurs to leave alleles with original A activity and the evolved B activity. Science 338,384(2012)

53 FORMATION OF NEW GENES
All of the schemes of gene evolution mentioned above facilitate the formation of new genes because they reduce negative selection pressure on an already functioning gene, creating new genetic structures without destroying the old ones. Transposable elements can help new exons to form. Alu elements, which contain several potential splice sites in both orientations, are found near many genes. When they integrate into an intron or another untranslated part of a gene, a new exon can form and be used to synthesize a new protein variant through alternative splicing. An example of such Alu-induced exon formation is the tumor necrosis factor receptor (p75TNFR) gene, which acquired an additional first exon in the course of evolution. Nature Rev. Genetics 7,499(2006)

54 AN ALTERNATIVE EXON DERIVED FROM AN Alu ELEMENT
Changes in the structure of exon 1a: A→G (initiation AUG codon); 7-bp deletion (translation frame); C→T (splice site). FIGURE 3 | Creation of a new functional alternative exon of p75TNFR from an Alu element. a | Gene structure and alternative splicing of p75TNFR. The exon 1a (red) is an alternative first exon that originates from an Alu element of the Alu Jo subfamily. b | Pairwise alignment of p75TNFR exon 1a locus to the Alu Jo subfamily consensus sequence. Assuming the Alu Jo consensus sequence is the origin for the exon 1a locus, the alignment indicates an A-to-G substitution which creates the start codon on exon 1a; a C-to-T substitution which creates a splice donor site; and a 7-bp deletion in exon 1a which creates a full-length ORF. The red box delineates the boundaries of exon 1a. Modified with permission from Ref. 37. Nature Rev.Genetics 7,499(2006); J.Mol.Biol. 341,883(2004)

55 AN ALTERNATIVE EXON DERIVED FROM AN Alu ELEMENT
FIGURE 3 | Creation of a new functional alternative exon of p75TNFR from an Alu element. c | Phylogenetic analyses of the p75TNFR exon 1a locus in primates. The Alu insertion occurred 40–58 million years ago (mya), followed shortly by the A-to-G substitution (which creates the start codon on exon 1a). The C-to-T substitution (which creates a splice donor site) and the 7-bp deletion (which creates a full-length ORF) occurred 25–40 million years ago. Modified with permission from Ref. 37. Nature Rev.Genetics 7,499(2006); J.Mol.Biol. 341,883(2004)

56 DE NOVO GENE CREATION
A new human gene, CLLU1 (chronic lymphocytic leukemia up-regulated gene 1), arose from a region that is noncoding in the chimpanzee genome. Figure labels: verified part of the protein; critical deletion. Sequence changes in the origin of CLLU1 from noncoding DNA. (A) Region of conserved synteny between human and chimp chromosomes 12. Genes are indicated by rectangular boxes and the region of chromosome is indicated by a horizontal line. Unambiguous 1:1 orthologs that were used to infer the synteny block are shown in red. One gene in this region, chronic lymphocytic leukemia up-regulated gene 1 (CLLU1), had no BLASTP hits in any other genome and is shown in green. (B) Multiple sequence alignment of the gene sequence of the human gene CLLU1 and similar nucleotide sequences from the syntenic location in chimp and macaque. The start codon is located immediately following the first alignment gap, which was inserted for clarity. Stop codons are indicated by red boxes. The sequenced peptide identified from this locus is indicated in orange. The critical mutation that allows the production of a protein is the deletion of an A nucleotide, which is present in both chimp and macaque (indicated by an arrow). This causes a frameshift in human that results in a much longer ORF capable of producing a 121-amino-acid protein. Both the chimp and macaque sequences have a stop codon after only 42 potential codons. (C) Alignment of the region around the critical human enabler mutation with similar nucleotide sequences from the syntenic regions in chimp and macaque, and sequence traces from gorilla, gibbon, and orangutan. For gorilla, gibbon, and orangutan the trace database accession number is shown on the right. The disabler is also shared by gorilla and gibbon, indicating it is ancestral. Genome Res.19,1752(2009)

57 ALTERNATIVE SPLICING AND GENE EVOLUTION
Alternative splicing is an important mechanism in the creation of new genes. When a new alternative exon is introduced and used in a minor splice form, the original splicing of the gene is not disturbed and negative selection pressure is avoided. Substantial genomic change can thus be achieved along a nearly neutral evolutionary path. Gene duplication and alternative splicing are complementary processes in gene evolution. Alternative splicing is weak in multigene families, whereas unique genes show active alternative splicing. FIGURE 5 | Alternative splicing opens neutral paths for an accelerated rate of new exon creation. Because of frameshift, in-frame premature stop codons or disruption of functional and/or structural elements, the evolutionary landscape of a new constitutive exon in an existing gene is largely negative. The insertion of a new constitutive exon is likely to cause a significant reduction in fitness and is therefore likely to be prevented by strong negative selection. Insertion of a new exon as a minor-form exon through alternative splicing neutralizes regions of negative selection in the fitness landscape of the new exon. Therefore, alternative splicing opens neutral paths for an accelerated rate of new exon evolution. The new minor-form alternative exon can evolve rapidly, possibly to a new, adaptive function. The red and blue lines represent fitness landscapes for each of the two scenarios discussed. Nature Rev.Genetics 7,499(2006) Not every splice form can be assigned a role in the organism, and most of them may be undesirable. Without a specific function they can persist only as a minor form until a more optimal variant arises. Proc.Natl.Acad.Sci. 104,5495(2007)

58 CHANGES IN THE HUMAN GENOME ARE SELECTIVE

59 DELETIONS OF SEQUENCES CONSERVED IN OTHER ANIMALS (hCONDELs) IN THE HUMAN GENOME
Figure 1: Hundreds of sequences highly conserved between chimpanzee and other species are deleted in humans. Computational approach used to discover human-specific deletions of functional DNA: identification of ancestral chimpanzee genomic sequences deleted in human; discovery of chimpanzee genomic sequences highly conserved in other species; and detection of human-specific deletions that remove one or more chimpanzee conserved sequences. Top: Human genomic locations of the 583 hCONDELs. hCONDELs are displayed as blue ticks above the many locations where they are missing. Bottom: Size distribution of hCONDELs. More than 500 deletions of various lengths that are specific to the human genome have occurred, mostly in noncoding regions. Nature 471,216(2011)

60 HUMAN ACCELERATED REGIONS (HAR)
In genome evolution most changes are neutral mutations that are freely inherited; part of the genome stays unchanged or changes very little because negative selection eliminates mutations. There are, however, regions characterized by increased variability, which reflects positive selection. It is these regions that attract particular attention from researchers, especially with regard to the evolution of the human brain. Regions that differ substantially from the genomes of other primates and other mammals have been found in the human genome: HAR1 to HAR49 (human accelerated regions). In other animals the regions corresponding to HARs are evolutionarily conserved, but they have changed rapidly in the human genome. They are mainly associated with genes that regulate transcription and neuronal development. Differences in gene activity in the brain between humans and other primates may point to substantial changes in genome expression that took place as the human lineage split from that of the chimpanzee. This is why research is concentrated there.

61 HUMAN ACCELERATED REGION (HAR1)
One such HAR in the human genome is HAR1, which is part of a distinctive RNA-coding gene, lying within the HAR1R and HAR1F transcripts, and is involved in the early development of the cerebral cortex (Nature 443,167(2006)). In this HAR1 region, only 118 bp long, 18 substitutions have occurred since the human and chimpanzee lineages separated (against only 0.27 expected, because this region has changed very little over the past 300–400 million years). Figure labels: HAR1 region; exon; intron; exon. FIGURE 1. HAR1-associated transcripts in genomic context. Schematic drawing showing the genomic context on chromosome 20q13.33 of the HAR1-associated transcripts HAR1F and HAR1R (black, with a chevroned line indicating introns), and the predicted RNA structure (green) based on the May 2004 human assembly in the UCSC Genome Browser (ref. 41). The level of conservation in the orthologous region in other vertebrate species (blue) is plotted for this region using the PhastCons program (ref. 16). Both the common and testes-specific splice sites are conserved (data not shown). Nature 443,167(2006)
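To give a feel for how unusual 18 substitutions are when only 0.27 are expected, the sketch below computes an upper-tail probability. It assumes that substitutions in the 118-bp region follow a Poisson distribution with mean 0.27; that model and the function names are illustrative assumptions, not the statistical test used in Nature 443,167(2006).

```python
import math


def poisson_tail(k, lam, max_terms=200):
    """P(X >= k) for X ~ Poisson(lam), summed term by term.

    Direct summation avoids the cancellation in 1 - CDF when the tail is tiny.
    Intended for small lam, where 200 terms are far more than enough.
    """
    term = math.exp(-lam) * lam ** k / math.factorial(k)  # P(X = k)
    total = 0.0
    i = k
    for _ in range(max_terms):
        total += term
        i += 1
        term *= lam / i  # P(X = i) from P(X = i - 1)
    return total


def fold_excess(observed, expected):
    """How many times the observed count exceeds the expected count."""
    return observed / expected


if __name__ == "__main__":
    print(poisson_tail(18, 0.27))
    print(fold_excess(18, 0.27))
```

Under this assumed model the observed count is roughly 67 times the expectation, and the tail probability is vanishingly small, which is why HAR1 stands out as accelerated.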

62 HUMAN ACCELERATED REGION (HAR1)
Homology of the human HAR1 region with orthologous sequences of other animals (a) and the likely secondary structure of the transcript (b). FIGURE 2. Predicted RNA secondary structure for HAR1F. a, Section of the multiple alignment of HAR1F in various amniote species. The multiple alignment is annotated with the secondary structure (fold) shown in panel b. Matching round parentheses indicate pairing bases. Square parentheses are used to indicate bases that are predicted to pair outside the region shown. Unpaired regions are shown in grey and substitutions are shown with the following colour scheme: green denotes compensatory transitions, yellow–green denotes compensatory transversions, purple denotes substitutions in unpaired regions, and red denotes non-compensated changes. b, The evolutionarily conserved parts of the RNA secondary structure of the HAR1F region as predicted using the EvoFold program (ref. 12). Substitutions are shown using the colour scheme in panel a; red bars indicate the region for which the alignment is shown in panel a. The structure is supported by substitutions on the human lineage, as well as a pair of changes in platypus (indicated by (P)). Only the helices with the compensatory substitutions can be considered to be supported by evolutionary data. Square brackets [......] mark the boundaries of the homologous region in the HAR1 secondary structure model. Nature 443,167(2006) The authors argue that regions such as HARs carry out selective regulation of protein-coding genes, which is far more important in human evolution than changes in the structure and function of those genes.

63 HUMAN-ACCELERATED CONSERVED NONCODING SEQUENCE (HACNS1)
In a narrow noncoding region (81 bp) of human Chr2, 13 substitutions that are absent in other animals have been identified. This points to rapid evolutionary change in this conserved genomic region after the human and chimpanzee lineages separated. Human-specific gain of function in HACNS1. Top: Location of HACNS1 in NCBI build 36.1 of the human genome assembly. Bottom: Sequence alignment of HACNS1 with orthologs from other vertebrate genomes; positions identical to human are shown in black. A quantitative plot of sequence conservation is shown in blue above the alignment (26–28). The location of each human-specific substitution is indicated by a vertical red line, and the depth of nonhuman evolutionary conservation at human-substituted positions is shown by a vertical yellow line that indicates whether each sequence is identical to chimpanzee and rhesus at that position. The cluster of 13 human-specific substitutions in 81 bp is also indicated. Science 321,1346(2008)

64 HUMAN-ACCELERATED CONSERVED NONCODING SEQUENCE (HACNS1)
Enhancer activity has been demonstrated for the actively evolved human HACNS1 region (13 substitutions). Enhancer variants were introduced together with the lacZ gene as a transgene into mouse embryos. Identification of human-specific substitutions contributing to the gain of function in HACNS1. (A) Alignment of HACNS1 with orthologous sequences from other vertebrate genomes, focused on an 81-bp region in the element that contains 13 human-specific substitutions. The position of each substitution is indicated by a red box above the alignment; each human-specific nucleotide is highlighted in red. Positions in the nonhuman genomes that are identical to the human sequence are displayed as dots. (B) Expression pattern of a synthetic enhancer in which the 13 human-specific substitutions (red box) are introduced into the orthologous 1.2-kb chimpanzee sequence background (black bar). (C) Expression pattern of a synthetic enhancer obtained by reversion of these substitutions (black box) in the human sequence (red bar) to the nucleotide states in chimpanzee and rhesus. (D) Number of embryos transgenic for each synthetic enhancer that show full, partial, or no expression in the limb at E11.5. Science 321,1346(2008)

65 APPARENT POSITIVE SELECTION (THE SMG6 GENE EXAMPLE)
Only a small share of the total substitution frequency is found in the human lineage: about 1.6% of the sequence change in proteins has occurred since the human and chimpanzee lineages separated, in contrast to the 15% observed in the SMG6 gene. Figure labels: numbers give the amino acid substitution frequency (average per site) in 1000 compared proteins; 25 of 29 are AT→GC. FIGURE 1. Identification of a gene segment showing unusually rapid evolution. This example comes from the study by Galtier et al. (ref. 5), who identified human-specific accelerated change in DNA sequence by comparing the proportional number of changes seen in a given gene segment (or gene) with that of a reference set from humans and other primates. a, The reference set drawn from 1,000 genes. Considering that changes have occurred in the human-specific lineage, compared with the sum of all branch lengths across the tree (0.185), we conclude that only 1.6% of all sequence change happened after the split between humans and chimps in the human lineage. b, Tree resulting from sequence data for part of the gene encoding the SMG6 protein. Here, 15% of all the sequence change has occurred in the human lineage, implying human-specific acceleration of rates. Of 29 changes, 25 are AT→GC, implicating biased gene conversion as the cause. Numbers indicate the average number of amino-acid changes per site. (Figure adapted from ref. 5.) The increased variability of individual genomic regions (the so-called HARs, human accelerated regions) may be associated not only with adaptive selection but also with regions of elevated recombination frequency in the genome. Where homologous but non-identical regions pair, repair occurs with a marked AT→GC bias (BGC, biased gene conversion). This indicates that positive selection at the protein level is not at work. Such changes can be disadvantageous to the organism and have to be compensated by additional mutations. Nature 457,543(2009)
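The two numbers on this slide can be put side by side in a short calculation. The sketch below computes how strongly the human-lineage share in SMG6 (15%) exceeds the reference share (1.6%), and how likely 25 or more AT→GC changes out of 29 would be if both directions were equally probable. The 50:50 null model and the function names are assumptions made for illustration; they are not the analysis of Nature 457,543(2009).

```python
import math


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def acceleration(observed_share, reference_share):
    """Ratio of the human-lineage share of change in one gene to the reference share."""
    return observed_share / reference_share


if __name__ == "__main__":
    # Assumed null: AT->GC and GC->AT changes are equally likely (p = 0.5).
    print(binomial_tail(25, 29, 0.5))
    print(acceleration(0.15, 0.016))
```

A very small tail probability under this assumed null says only that the substitution pattern is strongly biased toward AT→GC. That bias is what points to biased gene conversion instead of selection on the protein.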

66 DELETION OF THE ANDROGEN RECEPTOR (AR) GENE ENHANCER
A large region of human ChrX has been deleted that contains an AR gene enhancer sequence conserved in other animals. Figure label: tested for enhancer activity with lacZ. Figure 2: Transgenic analysis of a chimpanzee and mouse AR enhancer region missing in humans. Top panel: 1.1 Mb region of the chimpanzee X chromosome. The red bar shows the position of a 60.7-kb human deletion removing a well-conserved chimpanzee enhancer between the AR and OPHN1 genes. Bottom panel: multiple species comparison of the deleted region, showing sequences aligned between chimpanzee and other mammals. Blue and orange bars represent chimpanzee and mouse sequences tested for enhancer activity in transgenic mice. The chimpanzee sequence drives lacZ expression. Bottom panel: Model depicting multiple conserved tissue-specific enhancers (coloured shapes) surrounding AR coding sequences (black bars) of different species. Loss of an ancestral vibrissae/penile spine enhancer in humans is correlated with corresponding loss of sensory vibrissae and penile spines. As a result of the deletion, the regulation of AR gene expression in humans has changed. Nature 471,216(2011)

67 GENE AMPLIFICATION IN THE HUMAN GENOME
Genome evolution affects not only gene expression but also the emergence of new genes or their domains. Unlike in other primates, human lineage-specific (HLS) amplification of genes or domains is observed in the human genome. One such gene is MGC8902, whose function is unclear but which is associated with neuronal expression in the brain (cortex) and with cognitive function. Figure labels: MGC8902; ribosomal protein gene (not analysed). Fig. 1. Cross-species BLAT survey of HLS cDNAs and organization of the MGC8902 gene. (A) BLAT searches were performed using full cDNA insert sequences for 140 HLS genes (5) as queries. The IMAGE clones that yielded >5 BLAT hits in the human genome are shown. BLAT hits with span sizes exceeding the size of the cDNA query were scored as potentially containing introns. Potentially "intronless" BLAT hits are shown in white. The asterisk denotes BLAT hits associated with the ribosomal protein gene RPL23AP7, which had hit totals of 150, 144, and 133 for human, chimp, and macaque, respectively. All of these were intronless. Science 313,1304(2006)

68 AMPLIFICATION OF THE SRGAP2 GENE IN THE HUMAN GENOME
The SRGAP2C gene copy plays an important role in human brain development (language, communication, etc.). Model for SRGAP2 Evolution. Schematic depicts location and orientation (blue triangles) of SRGAP2 paralogs on human chromosome 1 with putative protein products indicated above each based on cDNA sequencing. Asterisks indicate a 49 amino acid truncation of the F-BAR domain. Note that the orientation of SRGAP2D remains uncertain, as the contig containing this paralog has not yet been anchored. Arrows trace the evolutionary history of SRGAP2 duplication events. Copy number polymorphism and expression analyses suggest both paralogs at 1q21.1 (SRGAP2B and SRGAP2D) are pseudogenes, whereas the 1q32.1 (SRGAP2A) and 1p12 (SRGAP2C) paralogs are likely to encode functional proteins. Figure label: putative pseudogenes. Cell 149,912(2012)

69 POSITIVE SELECTION OF THE MICROCEPHALIN GENE
The microcephalin gene (MCPH1) regulates brain size. Mutations in this gene lead to impaired mental development but do not affect other physiological processes. The 14 exons of MCPH1 span a 236-kb region on chr8p23. Exon 8 is of particular importance: sequencing a 29-kb region in 89 individuals and comparing haplotypes identified a functionally important allele at a single position. One microcephalin haplotype, No. 49 (33%), dominates the human population. It differs from many other haplotypes by a G→C transversion (the C allele) that changes Asp→His at amino acid 314 of microcephalin. The frequency of the other 85 haplotypes is very low (0.6–6.2%). This is a clear signature of positive selection. Science 309, 1717 (2005)

70 HUMAN EVOLUTION CONTINUES TODAY
Summing all MCPH1 haplotypes that carry the C allele gives haplogroup D, with a frequency of 70%, in which haplotype No. 49 dominates. Haplogroup D arose as haplotype No. 49 spread through a large population under positive selection. This was accompanied by individual mutations, which produced minor polymorphism at the locus. Fig. 2. Frequencies of 86 inferred Microcephalin haplotypes in the 89-individual Coriell panel. Haplotypes in haplogroup D are indicated by blue-edged bars; non-D haplotypes are indicated by solid red bars. Science 309, 1717 (2005)

71 HUMAN EVOLUTION CONTINUES TODAY
Haplogroup D is thought to date to the time when rapid human development began (symbolic thinking and communication, as attested by archaeological finds). It is possible that this D haplotype, which promoted human development, and its G→C mutation originated and adapted in Eurasia. Science 309, 1717 (2005) A similar picture is seen in the evolution of another gene that stimulates brain development, ASPM, where the positively selected haplotype D with the A→G allele arose possibly only ~5800 years ago. Science 309, 1722 (2005)

72 HUMAN EVOLUTION CONTINUES TODAY
The microcephalin variant haplogroup D, which seemingly arose recently and spread rapidly through the human population under positive selection, is in fact not new. Of the two scenarios, scheme B is the more plausible. Haplogroup D may derive from an ancestral Homo lineage that split from the modern human lineage ~1.1 million years ago. At that time the human lineage lacked the D allele, and the human genome acquired it later by interbreeding with other members of the genus Homo that carried it, perhaps Neanderthals. Such rare interbreeding, by bringing in the rare D allele, could confer substantial advantages, and the D allele spread rapidly through the human population. In the same period the donor population of the D allele went extinct. Figure labels: A; B; D allele; modern humans lineage; archaic Homo lineage. Fig. 4. Schematic depiction of two demographic scenarios compatible with the observed genealogy of the microcephalin locus. In both scenarios, an ancestral population, depicted in green, was subdivided into two reproductively isolated populations. One population, depicted in red, fixes the non-D allele, whereas the other population, depicted in blue, fixes the D allele. (A) In the first scenario, the blue population went through a severe bottleneck that dramatically reduced genetic diversity. It then expanded and merged with the other population. (B) In the second scenario, a rare interbreeding event occurred between the two populations, bringing a copy of the D allele from the blue into the red population. This copy subsequently amplified to high frequency under positive selective pressure. The first scenario depends on demography only and does not require selection. This scenario should therefore affect all sites in the genome. The second scenario requires the action of positive selection on the introgressed allele and is therefore not expected to have a genome-wide effect. The observation that the genealogy of microcephalin is not representative of the genome is consistent with the second scenario. PNAS 103,18178(2006)

73 SYNTHETIC BIOLOGY: TARGETED MODIFICATION OF LIFE FORMS

74 METHOD FOR SYNTHESIZING THE 583 kb Mycoplasma genitalium GENOME
Fig. 3.  Assembly of cassettes by in vitro recombination. (A) Diagram of steps in the in vitro recombination reaction, using the assembly of cassettes 66 to 69 as an example. (B) BAC vector is prepared for the assembly reaction by PCR amplification using primers as illustrated. The linear amplification product, after gel purification, is included in the assembly reaction of (A), such that the desired assembly is circular DNA containing the four cassettes and the BAC DNA as depicted in (C). Science 319,1215(2008)

75 SYNTHESIS OF THE 1078 kb Mycoplasma mycoides GENOME
Fig. 1 The assembly of a synthetic M. mycoides genome in yeast. A synthetic M. mycoides genome was assembled from 1078 overlapping DNA cassettes in three steps. In the first step, 1080-bp cassettes (orange arrows), produced from overlapping synthetic oligonucleotides, were recombined in sets of 10 to produce 109 ~10-kb assemblies (blue arrows). These were then recombined in sets of 10 to produce 11 ~100-kb assemblies (green arrows). In the final stage of assembly, these 11 fragments were recombined into the complete genome (red circle). With the exception of two constructs that were enzymatically pieced together in vitro (27) (white arrows), assemblies were carried out by in vivo homologous recombination in yeast. Major variations from the natural genome are shown as yellow circles. These include four watermarked regions (WM1 to WM4), a 4-kb region that was intentionally deleted (94D), and elements for growth in yeast and genome transplantation. In addition, there are 20 locations with nucleotide polymorphisms (asterisks). Coordinates of the genome are relative to the first nucleotide of the natural M. mycoides sequence. The designed sequence is 1,077,947 bp. The locations of the Asc I and BssH II restriction sites are shown. Cassettes 1 and were unnecessary and removed from the assembly strategy (11). Cassette 2 overlaps cassette 1104, and cassette 799 overlaps cassette 811. Science 329,52(2010)
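The staged joining of overlapping cassettes described in this caption can be illustrated with a toy model. The sketch below is not the authors' procedure: it simply treats each cassette as a string and joins neighbours that share an exact terminal overlap, then repeats the joining in sets. The function names, the toy sequence, and the overlap length are all assumptions made for illustration.

```python
from typing import List


def join_by_overlap(left: str, right: str, overlap: int) -> str:
    """Join two fragments whose ends share exactly `overlap` designed bases."""
    if overlap < 1 or overlap > min(len(left), len(right)):
        raise ValueError("overlap must fit inside both fragments")
    if left[-overlap:] != right[:overlap]:
        raise ValueError("fragments do not share the designed overlap")
    # Keep the shared region once, then append the rest of the right fragment.
    return left + right[overlap:]


def assemble(fragments: List[str], overlap: int) -> str:
    """Join an ordered list of overlapping fragments into one sequence."""
    if not fragments:
        raise ValueError("no fragments to assemble")
    assembled = fragments[0]
    for fragment in fragments[1:]:
        assembled = join_by_overlap(assembled, fragment, overlap)
    return assembled


def assemble_in_sets(fragments: List[str], set_size: int, overlap: int) -> List[str]:
    """One assembly stage: join consecutive fragments in sets of `set_size`.

    Neighbouring products still overlap, so the stage can be repeated.
    """
    return [
        assemble(fragments[i:i + set_size], overlap)
        for i in range(0, len(fragments), set_size)
    ]


# Hypothetical toy data: a short sequence cut into 8-base pieces overlapping by 3.
toy_genome = "ATGGCGTACGTTAGCCTAGGATCCGATTACA"
toy_cassettes = [toy_genome[i:i + 8] for i in range(0, len(toy_genome) - 3, 5)]
stage_one = assemble_in_sets(toy_cassettes, set_size=2, overlap=3)
rebuilt = assemble(stage_one, overlap=3)
```

In this model the intermediate products keep the overlaps of their outermost cassettes, which is why the same joining step can be applied again at the next stage.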

76 SYNTHESIS STRATEGY FOR THE 316,617 bp Saccharomyces cerevisiae Chr3
Fig. 2. SynIII construction. (A) BB synthesis. JHU students in the Build-A-Genome course synthesized 750-bp BBs (purple) from oligonucleotides. nt, nucleotides. (B) Assembly of minichunks. Two- to 4-kb minichunks (yellow) were assembled by homologous recombination in S. cerevisiae (table S1). Adjacent minichunks were designed to encode overlap of one BB to facilitate downstream assembly steps. Minichunks were flanked by a rare cutting restriction enzyme (RE) site, XmaI or NotI. (C) Direct replacement of native yeast chromosome III with pools of synthetic minichunks. Eleven iterative one-step assemblies and replacements of native genomic segments of yeast chromosome III were carried out by using pools of overlapping synthetic DNA minichunks (table S2), encoding alternating genetic markers (LEU2 or URA3), which enabled complete replacement of native III with synIII in yeast. Science 344,55(2014)

77 CHANGES MADE TO THE STRUCTURE OF THE SYNTHETIC YEAST Chr3
In the modified Chr3, synIII: all terminating UAG codons are replaced with UAA codons; the subtelomeric regions, introns, tRNA genes, transposons, and a few other genes are deleted; some structures are inserted that make genome evolution and reduction possible. Fig. 1. SynIII design. Representative synIII design segments for loxPsym site insertion (A and B) and stop codon TAG to TAA editing (C) are shown. Green diamonds represent loxPsym sites embedded in the 3′ untranslated region (UTR) of nonessential genes and at several other landmarks. Fuchsia circles indicate synthetic stop codons (TAG recoded to TAA). Complete maps of designed synIII chromosome with common and systematic open reading frame (ORF) names, respectively, are shown in figs. S1 and S2. Science 344, 55 (2014)
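The TAG to TAA stop codon editing mentioned in the caption can be sketched in a few lines. This is a minimal illustration, assuming the open reading frame is given as a coding-strand DNA string that starts in frame; the function name and example sequences are hypothetical and are not taken from the synIII design software.

```python
def recode_stop_tag_to_taa(orf: str) -> str:
    """Return the ORF with a terminal TAG stop codon recoded to TAA.

    Only the final in-frame codon is examined, so TAG triplets that occur
    inside the coding sequence or out of frame are left unchanged.
    """
    orf = orf.upper()
    if len(orf) < 3 or len(orf) % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    if orf[-3:] == "TAG":
        return orf[:-3] + "TAA"
    return orf


# Hypothetical example ORFs (not real yeast genes).
recoded = recode_stop_tag_to_taa("ATGAAATAG")        # ends in TAG
untouched = recode_stop_tag_to_taa("ATGCTAGCATGA")   # ends in TGA, has an out-of-frame TAG
```

Restricting the change to the last codon keeps the encoded protein identical, since both TAG and TAA terminate translation.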

78 Saccharomyces cerevisiae SYNTHETIC GENOME PROJECT
The first synthesized yeast chromosome, Chr3, with the changes made (Jef Boeke et al.) Science 343, 1426 (2014)

