STAT115 STAT215 BIO512 BIST298 Introduction to Computational Biology and Bioinformatics Spring 2016 Xiaole Shirley Liu.


The Protein Sequence and Structure Wave
1955: Sanger sequenced bovine insulin
1970: Needleman-Wunsch algorithm (Smith-Waterman followed in 1981)
1973: PDB
1990: BLAST
1994: BLOCKS database
1994-: CASP
1997-: Proteomics

The Microarray Wave
Microarray contains hundreds to millions of tiny probes
Simultaneously detect how much each gene is expressed

ALL vs AML (Golub et al., Science 1999)

ALL vs AML

“Microarrays” Today
Infer the expression value of all the genes from 1000 probes
High throughput drug screen

The DNA Sequencing Wave
1953: DNA structure
1972: Recombinant DNA
1977: Sanger sequencing
1985: PCR
1988: NCBI
1990: BLAST

Sequencing in the 1970s

The Human Genome Race
Human Genome Project:
– Originally
– Boosted by technology improvement and automation
– Competition from Celera

Human Genome Sequencing
Clone-by-clone and whole-genome shotgun

The Human Genome Race
Human Genome Project:
– Originally
– Boosted by technology improvement and automation
– Competition from Celera
Informatics essential for both the public and private sequencing efforts
– Sequence assembly and gene prediction
– Working draft finished simultaneously spring 2000

Sequencing in 2001

Sequencing in 2007

Sequencing Today
Personal genome sequencing
HiSeq X
– 900GB data / flow cell in < 3 days, 10 * 30X human genomes, at ~$1500 / sample
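The HiSeq X figures above can be sanity-checked with a little arithmetic. The sketch below assumes a haploid human genome of about 3.1 billion bases (a figure not given on the slide) and reads "900GB" as roughly 900 billion sequenced bases. The function names are illustrative only.

```python
# Back-of-the-envelope check of the quoted flow-cell throughput.
# Assumption (not from the slides): one human genome is ~3.1e9 bases.
HUMAN_GENOME_BASES = 3.1e9


def bases_needed(n_genomes, coverage, genome_size=HUMAN_GENOME_BASES):
    """Total bases required to sequence n_genomes at the given mean coverage."""
    return n_genomes * coverage * genome_size


def mean_coverage(total_bases, n_genomes, genome_size=HUMAN_GENOME_BASES):
    """Mean coverage per genome when total_bases are split evenly across genomes."""
    return total_bases / (n_genomes * genome_size)


if __name__ == "__main__":
    needed = bases_needed(n_genomes=10, coverage=30)
    print(f"10 genomes at 30X need about {needed / 1e9:.0f} billion bases")
    cov = mean_coverage(total_bases=900e9, n_genomes=10)
    print(f"900 billion bases over 10 genomes gives about {cov:.1f}X each")
```

Under these assumptions the two numbers on the slide agree to within a few percent: ten genomes at 30X need slightly more than 900 billion bases.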

Personalized Disease Susceptibility Test and Treatment

Break

Big Data Challenges

All biology is becoming computational, much the same way it has become molecular … Otherwise “low input, high throughput and no output science” --- Sydney Brenner, 2002 Nobel Prize

Bioinformatics and Computational Biology
Interdisciplinary
– Statistics, Biology, Computer Science
Applied
– From freshmen to postdocs
– Useful training for many
– The more you practice, the better you get
Moves with technology development

Is This Class for Me?
Computer:
– R and Python
Biology:
– Molecular biology, genomics
Statistics:
– Hypothesis testing, distributions, intuition

Class Information
Course website:
– Video recording, slides, reading online
– Office hours, auditing
– Background: CS, Stats, Biology
Roughly 6 modules (HW each)
– Transcriptomes (microarrays and RNA-seq)
– Gene regulation (transcriptional & epigenetic regulation)
– Human genetics and disease (GWAS / cancer)

Class Information
Teaching Fellows: Zhirui Hu, Zack McCaw
Labs:
– Wed 6 – 8pm, Science Center B09
– Thur 6 – 8pm, HCSPH (HSPH) Kresge LL6
– Next Wed: Odyssey account and LINUX tutorial!

HW and Grading
Discussion on Canvas by HW
Submission on Canvas by HW
HW: 6 * 15 (STAT115) or 6 * 20 (graduate)
Quiz for each module: 6
Final exams: 20
Class participation: 5 (extra)
Algorithm videos: 5 (extra)
Late days

Break

Gene Expression Microarrays

Expression Microarrays
Grow cells under a certain condition, collect the mRNA population, and label it
Microarray has high-density (thousands to millions) sequence-specific probes with a known location for each gene/RNA
Sample is hybridized to microarray probes by DNA (A-T, G-C) base pairing; non-specific binding is washed away
Measure sample mRNA value by checking labeled signals at each probe location
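The base-pairing step can be illustrated with a toy model: a probe hybridizes wherever the labeled sample contains the probe's reverse complement. The sketch below is a deliberate simplification. It assumes exact matching only and ignores real hybridization thermodynamics, and the function names and sequences are made up for illustration.

```python
# Toy model of probe-target hybridization by A-T / G-C base pairing.
# Assumption: only perfect, full-length matches count as binding.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A<->T, G<->C)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hybridization_sites(probe, target):
    """Return 0-based positions in target where the probe pairs perfectly."""
    site = reverse_complement(probe)
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)  # allow overlapping sites
    return positions


if __name__ == "__main__":
    probe = "ATGC"                # hypothetical probe sequence
    sample = "TTGCATAAGCAT"       # hypothetical labeled sample fragment
    print(reverse_complement(probe))
    print(hybridization_sites(probe, sample))
```

A probe with no complementary site in the sample returns an empty list, which corresponds to the material that is washed away as non-specific binding.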

Affymetrix GeneChip Arrays

Labeled Samples Hybridize to DNA Probes on GeneChip

Shining Laser Light Causes Tagged Fragments to Glow

Perfect Match (PM) vs MisMatch (MM) (control for cross hybridization)
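A minimal sketch of the PM/MM idea, under two assumptions not stated on the slide: the MM probe is the PM probe with its middle base swapped for its complement (the 13th base of a 25-mer), and a crude expression estimate is the average PM minus MM difference over a probe set. The function names and signal values are hypothetical.

```python
# Sketch of perfect-match (PM) vs mismatch (MM) probes.
# Assumptions: MM differs from PM only at the middle base, and the
# summary statistic is a plain average of PM - MM differences.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def mismatch_probe(pm):
    """Build an MM probe by complementing the middle base of the PM probe."""
    mid = len(pm) // 2  # index 12 (the 13th base) for a 25-mer
    return pm[:mid] + COMPLEMENT[pm[mid]] + pm[mid + 1:]


def average_difference(pm_signals, mm_signals):
    """Average of PM - MM across the probe pairs of one probe set."""
    diffs = [pm - mm for pm, mm in zip(pm_signals, mm_signals)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    pm = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-mer PM probe
    print(pm)
    print(mismatch_probe(pm))
    # Hypothetical intensities for three PM/MM probe pairs.
    print(average_difference([100, 200, 300], [40, 50, 60]))
```

Subtracting the MM signal is meant to remove the part of the PM signal that comes from cross hybridization rather than from the intended transcript.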

NimbleGen Arrays

Agilent Arrays

Microarrays
Array comparison:
– # probes / array, # probes / gene, probe length
– Flexibility vs data reuse
Why do we bother learning about microarrays now?
– RNA-seq is probably more cost effective now
– The amount of useful public data
– The data analysis techniques