Genome Browsers UCSC (Santa Cruz, California) and Ensembl (EBI, UK)

Presentation transcript:

1 Genome Browsers UCSC (Santa Cruz, California) and Ensembl (EBI, UK)

2 Eukaryotic Genomes: Not only collections of genes
– Protein-coding genes
– RNA genes (rRNA, snRNA, snoRNA, miRNA, tRNA)
– Structural DNA (centromeres, telomeres)
– Regulation-related sequences (promoters, enhancers, silencers, insulators)
– Parasitic sequences (transposons)
– Pseudogenes (non-functional gene-like sequences)
– Simple sequence repeats

3 Eukaryotic Genomes: High fraction of non-coding DNA
Figure legend: blue = prokaryotes; black = unicellular eukaryotes; other colors = multicellular eukaryotes (red = vertebrates)
Source: Mattick, NRG, 2004

4 Human Genome
– 3 billion base pairs (3 Gb)
– 22 chromosome pairs plus the X and Y chromosomes
– Chromosome length varies from ~50 Mb to ~250 Mb
– About 22,000 protein-coding genes
  – compare with ~14,000 for the fruit fly and ~19,000 for the nematode C. elegans

5 Human genome
– Only 1.2% codes for proteins; 3.5-5% is under selection
– Long introns, short exons
– Large spaces between genes
– More than half consists of repetitive DNA
Source: Molecular Biology of the Cell (4th edition) (Alberts et al., 2002)
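
As a rough worked example, the percentages on this slide can be turned into base-pair counts. The sketch below assumes the rounded genome size of 3 Gb quoted on the previous slide; the function and variable names are illustrative only.

```python
# Back-of-the-envelope arithmetic using the figures quoted on the slides.
GENOME_SIZE_BP = 3_000_000_000  # ~3 Gb, rounded (assumption from slide 4)


def fraction_to_bp(percent: float, genome_size: int = GENOME_SIZE_BP) -> int:
    """Convert a percentage of the genome into a number of base pairs."""
    return round(genome_size * percent / 100)


coding_bp = fraction_to_bp(1.2)         # protein-coding fraction
selected_low_bp = fraction_to_bp(3.5)   # lower bound, under selection
selected_high_bp = fraction_to_bp(5.0)  # upper bound, under selection

print(f"Protein-coding: ~{coding_bp / 1e6:.0f} Mb")
print(f"Under selection: ~{selected_low_bp / 1e6:.0f} to {selected_high_bp / 1e6:.0f} Mb")
```

With these rounded inputs, the sequence under selection is roughly three to four times larger than the protein-coding part, which suggests that much of the selected sequence is non-coding.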

6 Variation along the genome sequence
– Nucleotide usage varies along chromosomes
  – Protein-coding regions tend to have high GC levels
– Genes are not equally distributed across the chromosomes
  – Housekeeping genes are generally found in gene-dense areas
  – Gene-poor areas tend to have many tissue-specific genes
Source: Ensembl
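
Variation in nucleotide usage along a sequence can be illustrated with a sliding-window GC calculation. This is a minimal sketch on a made-up toy sequence; the window and step sizes are arbitrary assumptions, and real analyses use windows of thousands of bases.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence (0.0 for empty input)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """GC fraction in successive windows; returns (0-based start, gc) pairs."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (made up): an AT-rich stretch followed by a GC-rich stretch.
toy = "ATATATATAT" + "GCGCGCGCGC"
for start, gc in windowed_gc(toy, window=10, step=5):
    print(start, f"{gc:.2f}")
```

On the toy input the GC fraction rises from 0 in the first window to 1 in the last, the kind of shift a GC-percent track in a genome browser makes visible at chromosome scale.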

7 Chromosome organisation
– DNA is packed in chromatin
– Active genes lie in less dense chromatin (beads-on-a-string)
– Inactive genes are often in densely packed chromatin (30-nm fiber)
– Gene regulation by changing chromatin density and by methylation/acetylation of the histones
– Limited availability of chromatin information in genome browsers (post-translational histone modifications are currently under investigation with ChIP-on-chip experiments)
Source: Lodish (4th edition)

8 Genome browsers UCSC NCBI Ensembl

9 Genome Browsing With the UCSC Genome Browser

10 UCSC Genome browser

11 Choose a species, an assembly and a gene

12 Gene search results

13 Genome browser

14 Genomic Datatypes (Tracks)

15 Transcription data are rather complicated

16 Browser → Gene record

17 Gene record

18 Gene record (2)

19 Gene record (3)

20 Gene record (4) “best hit”

21 Gene record (5)

22 Genomic elements
Genome browsers can be used to examine other things:
– Genomic sequence conservation
– Pseudogenes
– Duplications and deletions of chromosome segments (Copy Number Variations, CNVs)

23 Genomic Sequence Conservation
– Not only protein-coding parts are conserved in evolution
– Conserved non-coding genomic sequences can be involved in gene regulation (enhancers, silencers, insulators)
– With the UCSC browser one can examine genomic conservation
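
The idea of sequence conservation can be made concrete with a percent-identity calculation on a pairwise alignment. The sketch below is a toy illustration on made-up sequences. It is not how the browser's conservation tracks are produced; those come from multi-species alignments and dedicated scoring methods.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns where both sequences carry the same base.

    Columns with a gap ('-') in either sequence count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Made-up pairwise alignment of a short region from two species.
human = "ACGTACGTAC--GTTA"
mouse = "ACGTTCGTACGGGTAA"
print(f"{percent_identity(human, mouse):.1f}% identical")
```

Applying the same function to successive windows of a longer alignment would give a crude conservation profile, with peaks marking candidate conserved elements.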

24 Genomic Conservation (UCSC)

25 Pseudogenes
– Pseudogenes “look” like (are homologous to) protein-coding genes, but are non-functional
– Two types:
  – Unprocessed pseudogenes (loss of function)
  – Processed pseudogenes (mRNAs that are reverse-transcribed and inserted into the genome → they lack introns and sometimes have a poly-A tail)
– The UCSC browser contains various databases of pseudogenes:
  – Yale pseudogenes (both types of pseudogenes)
  – Vega pseudogenes (both types of pseudogenes)
  – Retroposed genes (only processed pseudogenes)
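
The two hallmarks of processed pseudogenes named above (missing introns, an occasional poly-A tail) can be written as a simple evidence check. This is a hypothetical heuristic for illustration only: the function names, the inputs (exon count of the parent gene, number of aligned blocks of the copy) and the 10-base threshold are all assumptions, not the method used by any of the listed databases.

```python
import re


def has_polya_tail(seq: str, min_len: int = 10) -> bool:
    """True if the sequence ends with at least `min_len` consecutive A's."""
    return re.search(r"A{%d,}$" % min_len, seq.upper()) is not None


def processed_pseudogene_evidence(
    parent_exons: int, copy_blocks: int, copy_seq: str
) -> list[str]:
    """Collect evidence that a gene copy is a processed pseudogene.

    parent_exons: number of exons in the functional parent gene
    copy_blocks:  number of separate blocks in which the copy aligns to the genome
    copy_seq:     genomic sequence of the copy, including its 3' end
    """
    evidence = []
    # A multi-exon parent whose copy is one uninterrupted block has lost its introns.
    if parent_exons > 1 and copy_blocks == 1:
        evidence.append("introns missing")
    # A poly-A run at the 3' end is optional supporting evidence.
    if has_polya_tail(copy_seq):
        evidence.append("poly-A tail")
    return evidence


# Made-up example: a 5-exon parent gene and an intronless copy ending in 12 A's.
print(processed_pseudogene_evidence(5, 1, "ACGTTGCA" + "A" * 12))
```

Because the poly-A tail is only sometimes present, the sketch reports it as separate evidence rather than requiring it.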

26 Pseudogenes (UCSC)

27 Copy Number Variation
– People do not only vary at the nucleotide level (SNPs); short pieces of the genome can be present in varying numbers of copies (Copy Number Polymorphisms, CNPs, or Copy Number Variants, CNVs)
– When there are genes in the CNV areas, this can lead to variation in the number of gene copies between individuals
– With the UCSC browser CNVs can be examined

28 Copy Number Variation (UCSC)

29 Finding a sequence in the genome

30 BLAT – Search page

31 BLAT - Results

32 BLAT – “Details”

33 BLAT – “Browser”
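
To make "finding a sequence in the genome" concrete, here is a minimal sketch of an exact search on both strands of a made-up target sequence. It is deliberately naive and is not BLAT: BLAT is an index-based aligner that tolerates mismatches and gaps, whereas this sketch only reports perfect matches.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_exact(query: str, target: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact occurrence of `query`
    in `target`, checking the forward ('+') and reverse ('-') strand."""
    query, target = query.upper(), target.upper()
    hits = []
    for strand, pattern in (("+", query), ("-", reverse_complement(query))):
        start = target.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = target.find(pattern, start + 1)
    return hits


# Made-up target; the query occurs once on each strand.
target = "CCGATTACAGGTTTGTAATCCC"
print(find_exact("GATTACA", target))
```

A hit on the '-' strand means the reverse complement of the query is present in the target as written, which is why BLAT results also report a strand for each match.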

34 Genome browsers UCSC Ensembl

35 Genome Browsing With the Ensembl Genome browser

36 Ensembl Genome browser

37 The Human Genome

38 MapView – Overview chromosome

39 ContigView – Zooming in (compare UCSC)

40 ContigView (2)

41 GeneView – Gene record

42 TransView - mRNA Transcript

43 TransView - mRNA Transcript (2)

44 Alternative Transcripts Source: Wikipedia

45 GeneView - Show Alternative Transcripts

46 GeneSpliceView - Alternative Transcripts

47 Single Nucleotide Polymorphisms (SNPs)
– Sequence variations within a species
– Similar to mutations, but present simultaneously in the population, and generally with little effect
– Used as genetic markers (e.g. a genetic disease can be associated with a SNP)
– Ensembl offers a nice SNP view
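
That the variants of a SNP are present simultaneously in the population can be expressed as allele frequencies at a single site. The sketch below uses a made-up sample of ten chromosomes; the function name and the data are illustrative assumptions.

```python
from collections import Counter


def allele_frequencies(alleles: list[str]) -> dict[str, float]:
    """Return the frequency of each allele observed at one genomic position."""
    counts = Counter(a.upper() for a in alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    return {allele: n / total for allele, n in counts.items()}


# Made-up sample: the same position typed on 10 chromosomes.
observed = list("AAAAAAAGGG")
freqs = allele_frequencies(observed)
minor_allele = min(freqs, key=freqs.get)

print(freqs)
print("minor allele:", minor_allele)
```

A site where only one allele is ever observed would give a single frequency of 1.0 and would not count as polymorphic in this sample.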

48 GeneView - Show SNPs

49 GeneSNPView - SNPs

50 GeneView - Show Protein

51 ProtView - Protein

52 ProtView - Protein Sequence

53 ProtView – Search proteins with the same domains

54 DomainView – Proteins with a certain domain (Interpro = SMART + PFAM + others)

55 ProtView - Find Proteins In the Same Protein Family

56 FamilyView – Alignments of homologous proteins

57 Finding Human Genes

58 Finding a human gene (2)

59 Blast

60 Blast (2)

61 UCSC vs Ensembl: Which is better?
– They contain more or less the same information
– UCSC is a bit easier to use
– Ensembl gives more detailed information and more flexible data export
– Other small differences in data (e.g. UCSC has more extensive genomic conservation data)
– Whichever you are familiar with!
