STAT115 STAT215 BIO512 BIST298 Introduction to Computational Biology and Bioinformatics Spring 2015 Xiaole Shirley Liu Please Fill Out Student Sign In.

1 STAT115 STAT215 BIO512 BIST298 Introduction to Computational Biology and Bioinformatics Spring 2015
Xiaole Shirley Liu Please Fill Out Student Sign In Sheet

2 Bioinformatics and Computational Biology
Interdisciplinary: Statistics, Biology, Computer Science
Applied
From freshman to postdocs
Useful training for many
The more you practice, the better you get
Moves with technology development

3 The Protein Sequence and Structure Wave
1955: Sanger sequenced bovine insulin
1971: PDB
1981: Smith-Waterman algorithm
1990: BLAST
1994: BLOCKS database
1994-: CASP
1997-: Proteomics

4 The Microarray Wave
A microarray contains hundreds to millions of tiny probes
Simultaneously detects how much each gene is expressed

5 ALL vs AML
Golub et al., Science 1999

6 ALL vs AML

7 “Microarrays” Today
Infer the expression value of all the genes from 1000 probes
High throughput drug screen

8 The DNA Sequencing Wave
1953: DNA structure
1972: Recombinant DNA
1977: Sanger sequencing
1985: PCR
1988: NCBI
1990: BLAST

9 Sequencing in the 1970s

10 The Human Genome Race
Human Genome Project: 1990-2003
Originally
Boosted by technology improvement and automation
Competition from Celera

11 Human Genome Sequencing
Clone-by-clone and whole-genome shotgun

12 The Human Genome Race
Human Genome Project: 1990-2003
Originally
Boosted by technology improvement and automation
Competition from Celera
Informatics was essential for both the public and private sequencing efforts
  Sequence assembly and gene prediction
Working draft finished simultaneously in spring 2000

13 Sequencing in 2001

14 Sequencing in 2007

15 Sequencing Today
Personal genome sequencing
HiSeq X
900 Gb (gigabases) of data / flow cell in < 3 days, 10 * 30X human genomes, at ~$1.5-2K / sample
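
The "10 * 30X human genomes" figure follows from simple arithmetic, sketched below. The sketch reads the 900 as gigabases of sequence and assumes a human genome of roughly 3 gigabases; that genome size and the function name are illustrative assumptions, not values taken from the slide.

```python
def genomes_per_flow_cell(yield_gb: float, coverage: float,
                          genome_size_gb: float = 3.0) -> float:
    """Estimate how many genomes one flow cell can cover.

    yield_gb       -- sequence output of one flow cell, in gigabases
    coverage       -- target average depth (30 means 30X)
    genome_size_gb -- assumed genome size in gigabases (about 3 for human)
    """
    bases_needed_per_genome = coverage * genome_size_gb
    return yield_gb / bases_needed_per_genome


# 900 / (30 * 3) = 10 genomes per flow cell
print(genomes_per_flow_cell(900, 30))
```

Halving the target coverage doubles the number of genomes per flow cell, which is why depth is usually the first parameter to negotiate in a sequencing budget.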

16 Personalized Disease Susceptibility Test and Treatment

17 Big Data Challenges

18 Sydney Brenner, 2002 Nobel Prize
All biology is becoming computational, much the same way it has become molecular … Otherwise “low input, high throughput and no output science”

20 Class Information
Course website:
Video recording / slides online
Office hours, auditing
Background: CS, Stats, Biology
Roughly 3 modules (2 HW each)
  Transcriptome (microarrays and RNA-seq)
  Gene regulation (transcriptional & epigenetic regulation)
  Human genetics and disease (GWAS / cancer)

21 Class Information
Teaching Fellows
  Yang Li
  Stephanie Chan
Labs:
  Wed 6-8pm, Science Center B09
  Tue 6-8pm, HSPH Kresge 209, Boston
First Lab: Fri 1/30 3-5pm (Odyssey)!

22 HW and Grading
Discussion forum: stat115.slack.com
Submission
HW: 6 * 10 or 6 * 12
Final exams: 20
Class participation: 20
Algorithm videos: 5
Lecture notes: extra 5 points
Late days

24 Gene Expression Microarrays

25 Expression Microarrays
Grow cells under a certain condition, collect the mRNA population, and label it
A microarray has high-density (thousands to millions) sequence-specific probes with a known location for each gene/RNA
The sample is hybridized to the microarray probes by DNA (A-T, G-C) base pairing; non-specific binding is washed away
Measure the sample mRNA level by reading the labeled signal at each probe location
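
The base-pairing step can be sketched in a few lines. This is a toy model that assumes perfect complementarity and a DNA alphabet; real hybridization tolerates some mismatches and depends on temperature and salt. The sequences and function names are made up for illustration.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the strand that would pair with `seq`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def probe_binds(probe: str, target: str) -> bool:
    """True if the probe is perfectly complementary to some stretch of the target."""
    return reverse_complement(probe) in target


target = "ATGGCCATTGTAATGGGCC"  # hypothetical labeled sample fragment
print(probe_binds("ACAATGGC", target))  # complementary to GCCATTGT in the target
print(probe_binds("TTTTTTTT", target))  # no AAAAAAAA stretch in the target
```

Because each probe's sequence and array position are known, a signal at that position can be attributed to the matching gene or RNA.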

26 Affymetrix GeneChip Arrays

27 Labeled Samples Hybridize to DNA Probes on GeneChip

28 Shining Laser Light Causes Tagged Fragments to Glow

29 Perfect Match (PM) vs MisMatch (MM) (control for cross hybridization)
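
One simple way to use the MM control is to subtract each MM intensity from its paired PM intensity and average over the probe pairs for a gene. The sketch below shows that "average difference" idea with made-up intensities; it is an illustration under those assumptions, not necessarily the summarization method used later in the course, and the function name is hypothetical.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of (PM - MM) over the probe pairs of one probe set.

    pm -- perfect-match intensities
    mm -- mismatch intensities, paired with pm by position
    """
    if len(pm) != len(mm):
        raise ValueError("PM and MM must contain the same number of probes")
    # Each MM value estimates the cross-hybridization signal of its PM partner.
    return mean([p - m for p, m in zip(pm, mm)])


pm = [1200.0, 950.0, 1430.0, 800.0]  # made-up PM intensities
mm = [300.0, 410.0, 380.0, 350.0]    # made-up MM intensities
print(average_difference(pm, mm))    # (900 + 540 + 1050 + 450) / 4 = 735.0
```

A known weakness of plain subtraction is that MM can exceed PM for some probes, giving negative values, which is one reason later methods treat the MM signal differently.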

30 NimbleGen Arrays

31 Agilent Arrays

32 Microarrays
Array comparison:
  # probes / array, # probes / gene, probe length
  Flexibility vs data reuse
Why do we bother learning about microarrays now?
  RNA-seq is probably preferred in new expression experiments
  The amount of useful public data
  The data analysis techniques

