Genomics: READING genome sequences (carry out dideoxy sequencing), ASSEMBLY of the sequence (connect seqs. to make whole chromosomes), ANNOTATION of the sequence

3 Genomics:
- READING genome sequences: carry out dideoxy sequencing
- ASSEMBLY of the sequence: connect seqs. to make whole chromosomes
- ANNOTATION of the sequence: find the genes!
For Bioinformatics, start with:

4 The Human Genome E. coli Genome

5 Reading: Shotgun DNA Sequencing of whole genome (WGS)
DNA target sample -> SHEAR -> LIGATE & CLONE (vector) -> SEQUENCE (primer) -> reads

6 Reading to Assembly:
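A minimal sketch of the step from reads to assembly, assuming error-free reads taken from one strand and a greedy merge of the largest suffix/prefix overlap. The function names, the toy reads, and the minimum overlap of 3 bases are illustrative assumptions, not taken from the slides; real assemblers also handle sequencing errors, both strands, and repeats.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if no overlap of at least min_len bases exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_len:
                        best_len, best_i, best_j = n, i, j
        if best_len == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


# Hypothetical reads sheared from the toy sequence ATGGCGTACGTTAGC
toy_reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
print(greedy_assemble(toy_reads))
```

When a repeat is longer than a read, several overlaps look equally good and a greedy merge can join the wrong neighbours, so this sketch is only reasonable for small, repeat-poor targets.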

7 Assembly: The challenge of eukaryotic genomes
- E. coli Genome: 4 million bp
- The Human Genome: 3 billion bp; 50% of genome is repeat sequences!

8 Assembly of sequence of each chromosome from end to end (END, Jan 14; begin)

9 Genomics:
- READING genome sequences: robotically do dideoxy-dye data collection
- ASSEMBLY of the sequence: whole genome shotgun OR ordered clones
- ANNOTATION of the sequence: find the genes!
Annotation:

11 Genomics:
- READING genome sequences
- ASSEMBLY of the sequence
- ANNOTATION of the sequence: find the genes!
Annotation:
1. ab initio
2. by evidence
(10/1/5)

12 Annotation: ORFs are MOST of prokaryotic genome
- For bacterial genomes, ab initio is adequate.
- ab initio: “from the beginning” (Hebrew יש מאין, “something from nothing”), from first principles…

13 Annotation: ab initio – finding ORFs
- 85-88% of the nucleotides are associated with coding sequence in the bacterial genomes that have been completely sequenced.
- Example: in Escherichia coli there are 4288 genes that have an average of 950 bp of coding sequence and are separated by an average of just 118 bp.
- So first, to find genes in prokaryotic DNA, search for ORFs!
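A minimal sketch of an ORF search, assuming an ORF runs from the first in-frame ATG to the next in-frame stop codon and that all six reading frames are scanned. The function names and the tuple layout are illustrative assumptions; the 60-codon default echoes the threshold quoted on the "beyond ORFs" slide.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 60) -> list[tuple[str, int, int, int]]:
    """Scan all six reading frames for ATG...stop open reading frames.

    Returns (strand, start, end, n_codons) tuples. Coordinates are 0-based,
    end-exclusive, and relative to the strand that was scanned; n_codons
    counts the start codon but not the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None  # look for the next ORF in this frame
    return orfs


# Hypothetical toy sequence: one short ORF (ATG + 5 codons + TAA)
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```

On real bacterial sequence the length threshold matters: short ORFs occur by chance in every frame, so a cutoff trades missed small genes against false positives.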

18 Annotation: ab initio – beyond ORFs
- Prokaryotes have short, simple promoters that are easy to recognize.
- Transcriptional terminators often consist of short inverted repeats followed by a run of Ts.
- Therefore, programs that find prokaryotic genes search for:
  - ORFs 60 or more codons long, and codon usage
  - promoters at the 5' end
  - terminators at the 3' end
  - homology to known genes from other prokaryotes
  - Shine-Dalgarno sequences
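One of the signals listed above can be sketched briefly: keeping only start codons that have a Shine-Dalgarno sequence just upstream. The motif AGGAGG, the 15-base window, and the function names are assumptions made for illustration; real gene finders score weaker matches and the spacing to the start codon rather than requiring an exact hit.

```python
SD_MOTIF = "AGGAGG"  # assumed consensus core, not taken from the slides


def has_shine_dalgarno(seq: str, start: int, motif: str = SD_MOTIF,
                       window: int = 15) -> bool:
    """True if motif lies entirely within the window bases upstream of start."""
    upstream = seq[max(0, start - window):start]
    return motif in upstream


def starts_with_sd(seq: str, window: int = 15) -> list[int]:
    """Positions of ATG codons that have an upstream Shine-Dalgarno match."""
    return [
        i for i in range(len(seq) - 2)
        if seq[i:i + 3] == "ATG" and has_shine_dalgarno(seq, i, window=window)
    ]


# Hypothetical toy sequence: the first ATG has an upstream motif, the second does not
toy = "TTT" + "AGGAGG" + "TTTTTT" + "ATG" + "AAACCC" + "ATG" + "GGG"
print(starts_with_sd(toy))
```

Combining a filter like this with an ORF length cutoff is one way to choose between several candidate start codons in the same reading frame.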

19 Annotation: ab initio – automated
Prokaryotic gene finder examples:
- Glimmer: Interpolated Markov Model method
- GrailII: Neural Network method
(See BioInfo text – Fig 8.8)
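To give a feel for the Markov-model idea, here is a sketch that uses a plain fixed-order Markov chain, which is a simplification and not Glimmer's interpolated model (an IMM blends several context lengths). It trains one model on known coding sequence and one on non-coding sequence, then scores a candidate by log-odds. The order of 2, the pseudocount, the training strings, and all names are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict

BASES = "ACGT"


def train_markov(seqs: list[str], order: int = 2) -> dict:
    """Count how often each base follows each context of length order."""
    counts = defaultdict(Counter)
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    return counts


def prob(counts: dict, context: str, base: str, pseudocount: float = 1.0) -> float:
    """Smoothed P(base | context); unseen contexts fall back to 1/4."""
    seen = counts.get(context, Counter())
    total = sum(seen.values()) + pseudocount * len(BASES)
    return (seen[base] + pseudocount) / total


def log_odds(seq: str, coding: dict, noncoding: dict, order: int = 2) -> float:
    """Sum of log(P_coding / P_noncoding); positive means 'looks coding'."""
    score = 0.0
    for i in range(order, len(seq)):
        ctx, base = seq[i - order:i], seq[i]
        score += math.log(prob(coding, ctx, base) / prob(noncoding, ctx, base))
    return score


# Hypothetical, tiny training sets; real models use many known genes
coding_model = train_markov(["ATGGCGGCGGCGGCG" * 3])
noncoding_model = train_markov(["ATATATATTATATAAT" * 3])
print(log_odds("GCGGCGGCG", coding_model, noncoding_model))
print(log_odds("TATATATA", coding_model, noncoding_model))
```

A fixed order forces one choice of context length for the whole genome; interpolating between orders is meant to use long contexts where training data is plentiful and short ones where it is sparse.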

20 Annotation: results

22 Multicellular eukaryotes Done too 10/1/5

25 Annotation: 2 ways to annotate eukaryotic genomes:
- ab initio gene finders: work on basic biological principles:
  - Open reading frames
  - Codon usage
  - Consensus splice sites
  - Met start codons
  - …
- Genes based on previous knowledge: EVIDENCE
  - cDNA sequence of the gene's message
  - cDNA of a closely related gene's message sequence
  - Protein sequence of the known gene: the same gene's, the same gene's from another species, a related gene's protein…
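The evidence route can be sketched as placing a cDNA back onto the genomic sequence: the stretches that match are exons, and the gaps between them are introns. This is a minimal sketch assuming an error-free cDNA from the same strand and exact matching; the function names, the minimum exon length, and the toy sequences are illustrative assumptions, and real spliced aligners tolerate mismatches. The GT...AG check at the end ties the result back to the consensus splice sites used by ab initio finders.

```python
def map_cdna_to_genome(genome: str, cdna: str, min_exon: int = 4) -> list[tuple[int, int]]:
    """Greedily place a cDNA on a genomic sequence as a chain of exact matches.

    Returns exon coordinates (0-based, end-exclusive) on the genome.
    """
    exons = []
    c_pos, g_pos = 0, 0
    while c_pos < len(cdna):
        # Try the longest remaining piece of cDNA first, then shorter ones.
        for length in range(len(cdna) - c_pos, min_exon - 1, -1):
            hit = genome.find(cdna[c_pos:c_pos + length], g_pos)
            if hit != -1:
                exons.append((hit, hit + length))
                c_pos += length
                g_pos = hit + length
                break
        else:
            raise ValueError("cDNA could not be placed on the genomic sequence")
    return exons


def introns_follow_gt_ag(genome: str, exons: list[tuple[int, int]]) -> bool:
    """Check that every inferred intron starts with GT and ends with AG."""
    return all(
        genome[left_end:left_end + 2] == "GT" and genome[right_start - 2:right_start] == "AG"
        for (_, left_end), (right_start, _) in zip(exons, exons[1:])
    )


# Hypothetical three-exon gene with two introns
exon1, exon2, exon3 = "ATGGCCAAG", "CTGGACGAA", "TCCTGA"
toy_genome = "CCC" + exon1 + "GTAAGTTTTTCAG" + exon2 + "GTATGTCCCCTAG" + exon3 + "TTT"
toy_exons = map_cdna_to_genome(toy_genome, exon1 + exon2 + exon3)
print(toy_exons, introns_follow_gt_ag(toy_genome, toy_exons))
```

Greedy longest-match placement can shift an exon boundary by a base or two when the intron happens to begin with the same bases as the next exon, which is one reason splice-site consensus is checked alongside the alignment.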

26 Homology-based exon predictions; consensus gene structure (both strands); start and stop site predictions; splice site predictions; computational exon predictions; tracking information; unique identifiers

27 Automatically generated annotation

28 A zebrafish hit shows a gene model protein encoded by a 6 exon gene. This gene structure (intron/exon) is seen in other species, as is the protein size. The proteins, if corresponding to MSP in S. gal., must be heavily glycosylated (likely). At least some have a signal peptide.

29 The zebrafish hit can be viewed at higher resolution, and…

30 The zebrafish hit can be viewed down to nucleotide resolution

31 Genomics:
- READING genome sequences: carry out dideoxy sequencing, 700 bp each read, MAX
- ASSEMBLY of the sequence: connect seqs. to make whole chromosomes
- ANNOTATION of the sequence
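The 700 bp read limit sets how many reads a genome needs. A back-of-the-envelope sketch, assuming reads land uniformly at random: the number of reads is coverage x genome size / read length, and the expected fraction of bases never sequenced is roughly e^(-coverage). The 8x coverage value and the function names are assumptions chosen for illustration; the genome sizes are the rounded figures from the assembly slide.

```python
import math


def reads_needed(genome_bp: int, read_bp: int = 700, coverage: float = 8.0) -> int:
    """Reads required so that each base is sequenced coverage times on average."""
    return math.ceil(coverage * genome_bp / read_bp)


def expected_uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases hit by no read, for uniformly random reads."""
    return math.exp(-coverage)


for name, size in (("E. coli", 4_000_000), ("human", 3_000_000_000)):
    print(name, reads_needed(size), "reads at 8x")
print("fraction uncovered at 8x:", expected_uncovered_fraction(8.0))
```

Under these assumptions the bacterial genome needs tens of thousands of reads and the human genome tens of millions, before any allowance for failed reads or repeats.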

32 Genomics:
- READING genome sequences: carry out dideoxy sequencing
- ASSEMBLY of the sequence: connect seqs. to make whole chromosomes
- ANNOTATION of the sequence: find the genes!

34 Annotation: cDNAs & ESTs (Expressed Sequence Tags)
RNA target sample -> cDNA library -> SEQUENCE (primer) -> end reads (mates)
Each cDNA provides sequence from the two ends: two ESTs

37 Who Gets Sequenced? Models, Pathogens, Agriculturals

43 Array analysis: see animation from Griffiths

47 Protein Structure Database: see Swiss-pdb viewer

51 RNA for ALL C. elegans genes

54 Projects to systematically knock out (or pseudo-knockout) every gene, in order to establish phenotype of each gene -> function of each gene
- RNAi for every C. elegans gene too!
- results on the web

