SS 2008, lecture 4: Biological Sequence Analysis. V4: Genome of Arabidopsis thaliana

1 V4 Genome of Arabidopsis thaliana
Review of lecture V3:
- What are tandem repeats?
- How does one find CpG islands?
- What are the Gardiner-Garden and Frommer and the Takai-Jones parameters?
- Why do we need t-tests?
- What are the findings of Hutter et al. (2006)?

2 Arabidopsis thaliana
Arabidopsis thaliana is a small flowering plant that is widely used as a model organism in plant biology. Arabidopsis is a member of the mustard (Brassicaceae) family, which includes cultivated species such as cabbage and radish. Arabidopsis is not of major agronomic significance, but it offers important advantages for basic research in genetics and molecular biology.
Source: TAIR

3 Some useful statistics for Arabidopsis thaliana
- Small genome (114.5 Mb sequenced of 125 Mb total), sequenced in 2000.
- Extensive genetic and physical maps of all 5 chromosomes.
- A rapid life cycle (about 6 weeks from germination to mature seed).
- Prolific seed production and easy cultivation in restricted space.
- Efficient transformation methods utilizing Agrobacterium tumefaciens.
- A large number of mutant lines and genomic resources, many of which are available from Stock Centers.
- Multinational research community of academic, government and industry laboratories.
Such advantages have made Arabidopsis a model organism for studies of the cellular and molecular biology of flowering plants. TAIR collects and makes available the information arising from these efforts.
Source: TAIR

4 Arabidopsis thaliana genome sequence
Representation of the Arabidopsis chromosomes. Sequenced portions are red, telomeric and centromeric regions are light blue, heterochromatic knobs are shown in black, and the rDNA repeat regions are magenta. Left: DAPI-stained chromosomes.
- Gene density ('Genes') ranged from 38 per 100 kb to 1 per 100 kb.
- Expressed sequence tag matches ('ESTs') ranged from more than 200 per 100 kb to 1 per 100 kb.
- Transposable element densities ('TEs') ranged from 33 per 100 kb to 1 per 100 kb.
- Mitochondrial and chloroplast insertions ('MT/CP') were assigned black and green tick marks, respectively.
- Transfer RNAs and small nucleolar RNAs ('RNAs') were assigned black and red tick marks, respectively.
Nature 408, 796 (2000)

5 Arabidopsis thaliana genome sequence
The proportion of Arabidopsis proteins having related counterparts in eukaryotic genomes varies by a factor of 2 to 3 depending on the functional category.
- Only 8-23% of Arabidopsis proteins involved in transcription have related genes in other eukaryotic genomes, reflecting the independent evolution of many plant transcription factors.
- In contrast, 48-60% of genes involved in protein synthesis have counterparts in the other eukaryotic genomes, reflecting highly conserved gene functions.
- The relatively high proportion of matches between Arabidopsis and bacterial proteins in the categories 'metabolism' and 'energy' reflects both the acquisition of bacterial genes from the ancestor of the plastid and high conservation of sequences across all species.
- Finally, a comparison between unicellular and multicellular eukaryotes indicates that Arabidopsis genes involved in cellular communication and signal transduction have more counterparts in multicellular eukaryotes than in yeast, reflecting the need for sets of genes for communication in multicellular organisms.
Nature 408, 796 (2000)

6 Many genes were duplicated
Nature 408, 796 (2000)

7 Segmental duplication
Segmentally duplicated regions in the Arabidopsis genome. Individual chromosomes are depicted as horizontal grey bars (with chromosome 1 at the top); centromeres are marked in black. Coloured bands connect corresponding duplicated segments. Similarities between the rDNA repeats are excluded. Duplicated segments in reversed orientation are connected with twisted coloured bands.
Nature 408, 796 (2000)

8 Membrane channels and transporters
Transporters in the plasma and intracellular membranes of Arabidopsis are responsible for the acquisition, redistribution and compartmentalization of organic nutrients and inorganic ions, as well as for the efflux of toxic compounds and metabolic end products, and for energy and signal transduction.
Unlike animals, which use a sodium ion P-type ATPase pump to generate an electrochemical gradient across the plasma membrane, plants and fungi use a proton P-type ATPase pump to form a large membrane potential. As a consequence, plant secondary transporters are typically coupled to protons rather than to sodium.
- Almost half of the Arabidopsis channel proteins are aquaporins, which emphasizes the importance of hydraulics in a wide range of plant processes.
- Compared with other sequenced organisms, Arabidopsis has 10-fold more predicted peptide transporters, primarily of the proton-dependent oligopeptide transport (POT) family, emphasizing the importance of peptide transport or indicating that there is broader substrate specificity than previously realized.
- There are nearly 1,000 Arabidopsis genes encoding Ser/Thr protein kinases, suggesting that peptides may have an important role in plant signalling.
Nature 408, 796 (2000)

9 What is TAIR*?
- NSF-funded project begun in 1999
- Web resource for Arabidopsis data and stocks
- Literature-based manual annotation of gene function
- Genome annotation (gene structure, computational gene function)
* URL
The following slides were borrowed from a talk at the TAIR7 workshop by Eva Huala & Donghui Li.

10 Portals

11 Tools

12 Search

14 Names; Description

15 GO annotations; Expression

16 Sequences; Maps

17 Mutations; Seed lines

18-21 Seed lines; Links to other sites

22 Comments; References

27 GBrowse - coming soon

28 Overview of releases to date
- 26,819 protein coding genes
- 3,866 alternatively spliced
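To put the two counts on this slide in proportion, the share of alternatively spliced genes can be worked out directly. The short sketch below assumes that the 3,866 alternatively spliced genes are a subset of the 26,819 protein-coding genes; the slide does not state this explicitly. The variable and function names are illustrative only.

```python
# Counts as given on the slide. Treating the spliced genes as a subset
# of the protein-coding genes is an assumption, not stated on the slide.
protein_coding_genes = 26_819
alternatively_spliced = 3_866


def fraction(part: int, whole: int) -> float:
    """Return part/whole, rejecting a non-positive denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


spliced_fraction = fraction(alternatively_spliced, protein_coding_genes)
print(f"{spliced_fraction:.1%} of protein-coding genes are alternatively spliced")
```

Under that assumption, roughly one protein-coding gene in seven is annotated with more than one splice variant.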

