UCSC Genome Browser Dror Hollander, Gil Ast Lab, Sackler Medical School
Understanding the Genome: histone modifications, genes, GC content, promoters, repetitive elements, conservation, SNPs, nucleosome occupancy, miRNA, gene expression, non-coding RNA, secondary structure, alternative splicing, exon-intron structure. How can you examine a genomic segment while taking all of these factors into account? (DNA, RNA, protein)
Lecture Overview: UCSC Genome Browser (interface & selected tracks; detecting alternative splicing events; chromatin organization & epigenetics; BLAT; PCR) and Galaxy
UCSC Genome Browser (64 eukaryote genomes) Basic design: “the Genome Browser stacks annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information”
Genome Browsing…
Basic Genome Browser Interface
Navigation: chromosomal position (genomic coordinates); zoom; mark and drag to zoom in.
Base-level display: start codons in green, stop codons in red.
Gene tracks: UCSC Genes track, RefSeq Genes track.
Track colors: Black - feature has a corresponding entry in the Protein Data Bank (PDB); Dark blue - transcript has been reviewed or validated by either the RefSeq, SwissProt or CCDS staff; Medium blue - other RefSeq transcripts; Light blue - non-RefSeq transcripts.
Gene structure: intron, CDS, UTR, gene direction (> / <).
Basic Genome Browser Interface Configure track visualization:
Basic Genome Browser Interface “RefSeq track shows known human protein-coding and non-protein-coding genes taken from the NCBI RNA reference sequences collection (RefSeq). The data are updated daily” “The UCSC track shows gene predictions based on data from RefSeq, Genbank, CCDS and UniProt… includes both protein-coding and putative non-coding transcripts… Compared to RefSeq, this gene set has generally about 10% more protein-coding genes, approximately five times as many putative non-coding genes, and about twice as many splice variants” Let’s examine a few examples online… Click on the track name for full visualization. Examples: UCSC: show RefSeq description, GeneCards & GO annotation; RefSeq: show summary and genomic alignment; show gene with alternative ATG: ?????
Basic Genome Browser Interface A few more tracks worth mentioning: miRNA (Genes and Gene Prediction Tracks -> sno/miRNA); conservation (Comparative Genomics -> Conservation); Expression tracks; Regulation tracks (chromatin structure and modifications, DNA methylation, etc.; includes ENCODE data); RNA secondary structure (Genes and Gene Prediction Tracks -> EvoFold); SNPs (Variation and Repeats -> SNPs). Mention that additional tracks are available for the hg18 genome.
Basic Genome Browser Interface Get genomic DNA for the viewed coordinates: Convert sequence to a different genome assembly or genome: Convert browser window to an image file:
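The "get genomic DNA for the viewed coordinates" step can also be scripted. The following is a minimal sketch, assuming the public UCSC REST API at api.genome.ucsc.edu and its getData/sequence endpoint, which takes a 0-based start and a 1-based end. The function names and the choice of hg19 as the assembly are illustrative assumptions, not something the slides specify.

```python
import json
import re
import urllib.parse
import urllib.request

API_ROOT = "https://api.genome.ucsc.edu"  # assumed: public UCSC REST API


def parse_position(position):
    """Turn a browser position such as 'chr19:804,980-806,685' into
    (chrom, start, end), with 1-based inclusive coordinates as displayed."""
    match = re.fullmatch(r"(\w+):([\d,]+)-([\d,]+)", position.strip())
    if match is None:
        raise ValueError(f"not a browser position: {position!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start < 1 or end < start:
        raise ValueError(f"invalid coordinate range: {position!r}")
    return chrom, start, end


def sequence_url(genome, position):
    """Build the request URL; the API expects a 0-based start, so subtract 1."""
    chrom, start, end = parse_position(position)
    query = urllib.parse.urlencode(
        {"genome": genome, "chrom": chrom, "start": start - 1, "end": end}
    )
    return f"{API_ROOT}/getData/sequence?{query}"


def fetch_dna(genome, position):
    """Download the sequence (needs network access); the JSON reply is
    assumed to carry the bases in its 'dna' field."""
    with urllib.request.urlopen(sequence_url(genome, position)) as response:
        return json.load(response)["dna"]
```

Calling `fetch_dna("hg19", "chr19:804,980-806,685")` would request the same window one would view in the browser. Only `fetch_dna` touches the network, so the parsing and URL construction can be checked offline.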
Detecting Alternative Splicing Events Via Human mRNAs & Spliced ESTs tracks (mRNA and EST Tracks): “The mRNA track shows alignments between human mRNAs in GenBank and the genome” “…alignments between human expressed sequence tags (ESTs) in GenBank and the genome… ESTs are single-read sequences, typically about 500 bases in length” Mention the different fields in each mRNA / EST. (Figure labels: gene, DNA, mRNA)
Detecting Alternative Splicing Events Via Alt Events track (Genes and Gene Prediction Tracks) – based on UCSC genes (Figure labels: gene, DNA, cassetteExon)
Detecting Alternative Splicing Events Via Burge RNA-seq track (Expression). Click on the track name to choose tissues. (Figure labels: gene, DNA, Burge RNA-seq)
Different Alternative Splicing Types
Exon skipping (PTBP1; chr19:804,980-806,685)
Alternative 3’ splice site (PTBP1; chr19:804,980-806,685)
Alternative 5’ splice site (SLC11A2; chr12:51,401,760-51,402,755)
Intron retention (CD163; chr12:7,639,792-7,641,193)
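The four event types can be told apart by comparing the exon coordinates of two isoforms of the same gene. The sketch below is only an illustration of that logic, not how any UCSC track is computed. It assumes each isoform is a list of half-open (start, end) exon intervals on one chromosome; the function names are hypothetical.

```python
def introns(exons):
    """Gaps between consecutive exons of one isoform."""
    exons = sorted(exons)
    return [(left[1], right[0]) for left, right in zip(exons, exons[1:])]


def overlapping(exon, exons):
    """All exons in `exons` that share at least one base with `exon`."""
    return [other for other in exons if exon[0] < other[1] and other[0] < exon[1]]


def splicing_events(isoform_a, isoform_b, strand="+"):
    """Event types seen when isoform A is compared against isoform B.
    Exon skipping is reported when B skips an exon of A, and intron
    retention when B retains an intron of A; swap the arguments to
    look in the other direction."""
    a, b = sorted(isoform_a), sorted(isoform_b)
    events = set()

    # Exon skipping: an internal exon of A lies wholly inside an intron of B.
    for start, end in a[1:-1]:
        if any(i_start <= start and end <= i_end for i_start, i_end in introns(b)):
            events.add("exon skipping")

    # Intron retention: an intron of A lies wholly inside an exon of B.
    for i_start, i_end in introns(a):
        if any(start <= i_start and i_end <= end for start, end in b):
            events.add("intron retention")

    # Alternative splice sites: a one-to-one pair of overlapping exons whose
    # intron-facing boundary differs. On the + strand the left boundary of an
    # exon is a 3' splice site and the right boundary a 5' splice site.
    left_site, right_site = ("3'", "5'") if strand == "+" else ("5'", "3'")
    for exon_a in a:
        partners = overlapping(exon_a, b)
        if len(partners) != 1 or len(overlapping(partners[0], a)) != 1:
            continue  # not a one-to-one pair (for example, intron retention)
        exon_b = partners[0]
        left_is_splice_site = exon_a != a[0] and exon_b != b[0]
        right_is_splice_site = exon_a != a[-1] and exon_b != b[-1]
        if left_is_splice_site and exon_a[0] != exon_b[0]:
            events.add(f"alternative {left_site} splice site")
        if right_is_splice_site and exon_a[1] != exon_b[1]:
            events.add(f"alternative {right_site} splice site")
    return events
```

For example, comparing the made-up isoforms `[(100, 200), (300, 350), (500, 600)]` and `[(100, 200), (500, 600)]` reports exon skipping, because the middle exon of the first falls inside the single intron of the second.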
Histone Modifications
Transcription Factor Binding
DNA Methylation
BLAT BLAT = Blast-Like Alignment Tool. BLAT is designed to find sequences with 95% or greater similarity for DNA and 80% or greater for protein. BLAT query: CACGCCCAGACCTGCCTTCCGGGGACAGCCAGCCCTCGCTGGACCAGACCATGGCCGCGGCCTTCGGTGCACCTGGTATAATCTCAGCCTCTCCGTATGCAGGAGCTGGTTTCCCTC
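To make the similarity thresholds concrete, here is a small sketch that computes percent identity over an ungapped, equal-length alignment and compares it with the 95% (DNA) and 80% (protein) figures quoted above. This is not BLAT's own scoring or search algorithm; the function names are hypothetical and the mutated target is made up for illustration.

```python
def percent_identity(query, target):
    """Percentage of matching positions in an ungapped, equal-length alignment."""
    if not query or len(query) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(q == t for q, t in zip(query.upper(), target.upper()))
    return 100.0 * matches / len(query)


def within_design_range(query, target, kind="DNA"):
    """Compare identity with the thresholds from the slide:
    95% for DNA, 80% for protein."""
    threshold = 95.0 if kind == "DNA" else 80.0
    return percent_identity(query, target) >= threshold


# First 40 bases of the slide's query sequence.
query = "CACGCCCAGACCTGCCTTCCGGGGACAGCCAGCCCTCGCT"
# Illustrative target with a single mismatch at the first base.
one_mismatch = "T" + query[1:]
print(percent_identity(query, one_mismatch))
print(within_design_range(query, one_mismatch))
```

With one mismatch in 40 bases the identity is 97.5%, so the pair sits inside the range BLAT is designed for; three mismatches in 40 bases (92.5%) would fall below it.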
PCR Primers: CREB-ATF2_F: cggccaatgtagctcctcta (Tm = 60.37); CREB-ATF2_R: CTCAATGGGTGCGTCAACTA (Tm = 59.72). Output: amplicon in FASTA format, annotated with coordinates, strand and primers, together with the primer melting temperatures.
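The idea behind the PCR search can be sketched on a plain string: locate the forward primer on the + strand, locate the reverse complement of the reverse primer downstream of it, and return everything in between, primers included. This is a simplified illustration that only handles perfect matches on one strand; it is not the browser's implementation, and the template used in the usage lines is invented.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template, forward, reverse):
    """Return the product of a perfect-match PCR on the + strand of
    `template`, or None if either primer site is not found."""
    template_upper = template.upper()
    forward_site = forward.upper()
    # The reverse primer anneals to the + strand as its reverse complement.
    reverse_site = reverse_complement(reverse).upper()
    start = template_upper.find(forward_site)
    if start == -1:
        return None
    site = template_upper.find(reverse_site, start + len(forward_site))
    if site == -1:
        return None
    return template[start:site + len(reverse_site)]


forward = "cggccaatgtagctcctcta"   # CREB-ATF2_F from the slide
reverse = "CTCAATGGGTGCGTCAACTA"   # CREB-ATF2_R from the slide
# Hypothetical template: flanking bases and insert are made up.
template = "AAAA" + forward.upper() + "GATTACA" + reverse_complement(reverse) + "TTTT"
print(amplicon(template, forward, reverse))
```

A fuller version would also search the minus strand, allow mismatches, and cap the product size.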
Galaxy “Galaxy allows you to do analyses you cannot do anywhere else without the need to install or download anything. You can analyze multiple alignments, compare genomic annotations, profile metagenomic samples and much much more...”
Galaxy – What Is It Good for? Getting the best out of UCSC; operating on UCSC data; supports operations both at the interval level and at the sequence level; designed for biologists!
Galaxy – Typical Workflow
1. Extract sets of coordinates, either uploaded from your computer or taken from the UCSC Table Browser.
2. Operate on the different sets of coordinates (intersect, subtract, etc.).
3. Fetch the genomic sequences of the resulting coordinates.
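The interval operations in this workflow can be illustrated in a few lines. The sketch below assumes half-open (start, end) intervals that all lie on the same chromosome and works at the base-pair level; it is a toy version for intuition, not Galaxy's implementation, and the function names and coordinates are hypothetical.

```python
def intersect(first, second):
    """Overlapping pieces between two sets of intervals."""
    pieces = []
    for a_start, a_end in first:
        for b_start, b_end in second:
            start, end = max(a_start, b_start), min(a_end, b_end)
            if start < end:  # keep only non-empty overlaps
                pieces.append((start, end))
    return sorted(pieces)


def subtract(first, second):
    """Parts of `first` that are not covered by any interval in `second`."""
    remaining = []
    for start, end in first:
        cursor = start  # leftmost base of this interval not yet accounted for
        for b_start, b_end in sorted(second):
            if b_end <= cursor or b_start >= end:
                continue  # no overlap with the part still under consideration
            if b_start > cursor:
                remaining.append((cursor, b_start))
            cursor = max(cursor, b_end)
        if cursor < end:
            remaining.append((cursor, end))
    return sorted(remaining)


exons = [(100, 200), (300, 400)]   # made-up coordinates
repeats = [(150, 350)]             # made-up coordinates
print(intersect(exons, repeats))
print(subtract(exons, repeats))
```

Here `intersect` keeps the exon bases that fall inside a repeat and `subtract` keeps the exon bases that do not; the resulting coordinates are what the final "fetch sequences" step would operate on.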