Welcome to Introduction to Bioinformatics. Friday, 12 September. Introduction to Scenario 2: Finding biologically important sites in DNA (How to avoid being fooled by imposters?); Scenario 1: Genomic comparisons; Problem Sets 1M and 1P
Scenario 2: Finding biologically important sites in DNA
You: A typical grad student
Your object of study: Cyanobacteria
Your object of study: Cyanobacteria. How do they do it? Critical position in food web: CO2 → sugar; N2 → ammonia; H2O → electrons
[Figure: N2 fixation in cyanobacteria. Labels: heterocysts, CO2, sucrose, O2, N2, NH3. Matveyev and Elhai (unpublished)]
Differentiation in cyanobacteria: -NH3 → ? ? ? ? ? → Heterocysts
How do bacteria respond to the environment? From gene to protein: DNA → RNA → protein → response to environment
How do bacteria respond to the environment? From gene to protein: RNA Pol at P (promoter); DNA → RNA → protein
How do cyanobacteria respond to NH3? From gene to protein. [Figure: high N vs. low N. Labels: NH3, α-ketoglutarate, glutamine; DNA-binding protein NtcA; binding site; RNA Pol; P (promoter); DNA. High N: no RNA]
How do cyanobacteria respond to NH3? From gene to protein. [Figure: low N. Labels: α-ketoglutarate, NtcA, binding site, RNA Pol, P (promoter); DNA → RNA → protein]
Differentiation in cyanobacteria: -NH3 → activates NtcA (Nitrogen Control) → ? ? ? → Heterocysts
Differentiation in cyanobacteria: What does NtcA bind to? GTA…(8)…TAC …(20-24)… TAnnnT → mRNA. Herrero et al (2001) J Bacteriol 183
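The consensus on this slide can be written as a regular expression. A minimal sketch, assuming that `n` stands for any base, that the numbers in parentheses are spacer lengths, and that case does not matter. The function name `find_ntca_sites` and the test sequence `toy` are made up for illustration.

```python
import re

# GTA, 8 of any base, TAC, then 20-24 of any base, then TAnnnT.
# The whole pattern sits in a lookahead so overlapping candidates are found.
# The lazy {20,24}? reports the shortest acceptable spacer for each start.
NTCA_PROMOTER = re.compile(
    r"(?=(GTA[ACGT]{8}TAC)([ACGT]{20,24}?)(TA[ACGT]{3}T))",
    re.IGNORECASE,
)

def find_ntca_sites(seq):
    """Return (start, site, spacer_length, box) for each candidate in seq."""
    hits = []
    for m in NTCA_PROMOTER.finditer(seq):
        hits.append((m.start(), m.group(1), len(m.group(2)), m.group(3)))
    return hits

# Hypothetical test sequence: a site, a 22-nt spacer, and a TAnnnT box.
toy = "cccc" + "GTA" + "ccccgggg" + "TAC" + "g" * 22 + "TAcgcT" + "cccc"
for hit in find_ntca_sites(toy):
    print(hit)
```

A match to the pattern is only a candidate; the pattern says nothing about whether NtcA actually binds there.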
Differentiation in cyanobacteria: Integration of signals through HetR. Signals: -N → NtcA; level of PatS; level of HetN; position in cell cycle; ??? HetQ. HetR (master regulator) → genes needed for differentiation. Strategy: PCR out hetQ; random mutagenesis; look for effects on HetR expression/activity
Differentiation in cyanobacteria: Find primers to PCR out hetQ
cctatctccgccctatggcgatttgggcaatatatttgatgattggttag   ...hypothetical protein
ttgtcagttgtcagacgtagtagcgcgtctagtctaatgtgttgttatat
tatttgctactagaaatgaggagagggttatttttctcactgcttcccaa
ttctatgagaatataaaattttccttaagtttctcatggcaataatggaa
aaaaccgaccattctgatgaataagtccggttttttccaaaaaatatttt
tgctttttcgctttatttatctatatttccaagttttagtacatcggtga
ggggtgacaactatcttgccaatattgtcgttattgttaggttgctatcg
gaaaaaatctgtaacatgagatacacaatagcatttatatttgctttagt
atctctctcttgggtgggattctgcctgcaatttaaaaaccagtgttaac
aattttcggctttattttccgggagttaaatcaaccaagggaaaatgtaa
ctaatgtttaaatatcttcggatacacacaaagtaaaaccaatttttaca
gatgtcgatgttgctcacattttttagaaatattactaaattaaaaatgt
tattaaatttatgttcatagagaaccttttccaaataaaaaaataatttt
cctgatgttttaagaaaattactgttgttataaattaaaggtgattcaac
aaaatatagatagttctttcaataactatctacttttaccattaagtgaa
cttactcatgaataatcaacaggaattaaaaataaagttcatgaatactg
gttaaagattcagtaaagtttgaggaaataccggaataaatttccaccca
aatatgattttttaaaagatacattggcagtacattaaaatgccgatgtt
agataaatttgccttcatagctgttatctatttgctcagaactaagccaa
gagtttacacaccaaacagaaattaaactatgaatccctcttcgtcgtta   hetQ...
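A rough sketch of what picking primer candidates could look like. It takes the first 20 bases of the region as a forward primer and the reverse complement of the last 20 as a reverse primer, and estimates melting temperature with the Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a rule of thumb for short oligos. The primer length of 20, the function names, and the choice of the first and last lines of the sequence on this slide as primer sites are all assumptions for illustration; real primer design also checks hairpins, primer dimers, and specificity.

```python
# Complement table for both lower- and uppercase bases.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def wallace_tm(primer):
    """Wallace rule estimate of melting temperature in degrees C."""
    s = primer.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc

# First and last 50-nt lines of the sequence on the slide.
first_line = "cctatctccgccctatggcgatttgggcaatatatttgatgattggttag"
last_line = "gagtttacacaccaaacagaaattaaactatgaatccctcttcgtcgtta"

PRIMER_LENGTH = 20  # assumed length, not from the slide
forward = first_line[:PRIMER_LENGTH]
reverse = reverse_complement(last_line[-PRIMER_LENGTH:])

for name, primer in (("forward", forward), ("reverse", reverse)):
    print(name, primer, "Tm ~", wallace_tm(primer), "C")
```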
Differentiation in cyanobacteria: Find primers to PCR out hetQ
GTA…(8)…TAC (match shown in uppercase)
ttgtcagttgtcagacgtagtagcgcgtctagtctaatgtgttgttatat
tatttgctactagaaatgaggagagggttatttttctcactgcttcccaa
ttctatgagaatataaaattttccttaagtttctcatggcaataatggaa
aaaaccgaccattctgatgaataagtccggttttttccaaaaaatatttt
tgctttttcgctttatttatctatatttccaagttttagtacatcggtga
ggggtgacaactatcttgccaatattgtcgttattgttaggttgctatcg
gaaaaaatcTGTAacatgagaTACAcaatagcatttatatttgctttagt
atctctctcttgggtgggattctgcctgcaatttaaaaaccagtgttaac
aattttcggctttattttccgggagttaaatcaaccaagggaaaatgtaa
ctaatgtttaaatatcttcggatacacacaaagtaaaaccaatttttaca
gatgtcgatgttgctcacattttttagaaatattactaaattaaaaatgt
tattaaatttatgttcatagagaaccttttccaaataaaaaaataatttt
cctgatgttttaagaaaattactgttgttataaattaaaggtgattcaac
aaaatatagatagttctttcaataactatctacttttaccattaagtgaa
cttactcatgaataatcaacaggaattaaaaataaagttcatgaatactg
gttaaagattcagtaaagtttgaggaaataccggaataaatttccaccca
aatatgattttttaaaagatacattggcagtacattaaaatgccgatgtt
agataaatttgccttcatagctgttatctatttgctcagaactaagccaa
gagtttacacaccaaacagaaattaaactatgaatccctcttcgtcgtta   hetQ...
Differentiation in cyanobacteria
GTA…(8)…TAC: NtcA binding site
…(20-24)…TAnnnT: Promoter
ttctatgagaatataaaattttccttaagtttct
aaaaccgaccattctgatgaataagtccggtttt
tgctttttcgctttatttatctatatttccaagt
ggggtgacaactatcttgccaatattgtcgttat
gaaaaaatctGTAacatgagaTACacaatagcat
ttatatttgcttTAgtaTctctctcttgggtggg
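The spacing in this excerpt can be checked by counting rather than by eye. A small sketch, assuming the last two lines of the excerpt are contiguous (they are in the full upstream sequence shown earlier); the function name `spacer_lengths` is hypothetical. For each GTA…(8)…TAC it reports every TAnnnT that starts 20 to 24 bases downstream.

```python
import re

# Last two lines of the excerpt, joined (assumed contiguous).
excerpt = ("gaaaaaatctGTAacatgagaTACacaatagcat"
           "ttatatttgcttTAgtaTctctctcttgggtggg")

CORE = re.compile(r"(?=(GTA[ACGT]{8}TAC))", re.IGNORECASE)
BOX = re.compile(r"TA[ACGT]{3}T", re.IGNORECASE)

def spacer_lengths(seq, min_gap=20, max_gap=24):
    """List (core_start, gap, box) for each TAnnnT that begins
    min_gap..max_gap bases after the end of a GTA-N8-TAC core."""
    results = []
    for core in CORE.finditer(seq):
        core_end = core.start() + len(core.group(1))
        for gap in range(min_gap, max_gap + 1):
            box = BOX.match(seq, core_end + gap)  # anchored at that offset
            if box:
                results.append((core.start(), gap, box.group(0)))
    return results

print(spacer_lengths(excerpt))
```

On this excerpt the only candidate has a gap of 22 bases between TAC and TAgtaT, inside the 20-24 window.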
Differentiation in cyanobacteria: Integration of signals through HetR. Signals: -N → NtcA; level of PatS; level of HetN; position in cell cycle; ??? HetQ. HetR (master regulator) → genes needed for differentiation. Stockholm
How to proceed? Choice #1: Publish; grant proposals; build a career. Likely result: Reviewers trash the manuscript as too speculative
How to proceed? Choice #2: Forget about it; back to PCR. Likely result: Sometimes miss a spectacular finding
How to proceed? Choice #3: Forget about PCR; do backbreaking NtcA binding studies. Likely result: Might demonstrate binding of NtcA. Risky; may lose many months
How to proceed? Choice #4: Determine whether the site is likely to be real. How? High school math approach: N! / (a! (N-a)!)
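One way to read the high school math approach: treat every position as an independent trial with probability p of starting a GTA…(8)…TAC match, and use the binomial distribution, whose coefficient is the N! / (a! (N-a)!) on the slide. The slide does not define N and a; reading them as the number of positions and the number of matches is an assumption. The uniform base frequencies and the 1000-nt length below are illustrative, and real positions are not strictly independent.

```python
from math import comb

def site_probability(freqs, pattern="GTA" + "N" * 8 + "TAC"):
    """Probability that a random position starts a match to pattern,
    assuming independent bases drawn with the given frequencies."""
    p = 1.0
    for base in pattern:
        if base != "N":  # N matches any base, so it contributes a factor of 1
            p *= freqs[base]
    return p

def binomial(n, a, p):
    """P(exactly a matches in n independent positions)."""
    return comb(n, a) * p**a * (1 - p) ** (n - a)

# Illustrative assumptions: equal base frequencies, 1000-nt upstream region.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
n_positions = 1000 - 14 + 1  # possible start positions for a 14-nt pattern
p = site_probability(uniform)  # (1/4)**6, six fixed bases

print("p per position:", p)
print("expected matches:", n_positions * p)
print("P(at least one):", 1 - binomial(n_positions, 0, p))
```

Swapping in measured base frequencies for an AT-rich genome changes p, which is one reason the simple calculation can mislead.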
How to proceed? Choice #4: Determine whether the site is likely to be real. How? BIOINFORMATICS: simulation; exhaustive pattern search
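A minimal sketch of the simulation idea, under the assumption that shuffling is an acceptable model of randomness: shuffle the sequence many times, which keeps its base composition but destroys any real signal, and ask how often GTA…(8)…TAC turns up anyway. The function names, the number of trials, and the use of two 50-nt lines from the slide as the observed sequence are choices made for illustration.

```python
import random
import re

CORE = re.compile(r"(?=GTA[ACGT]{8}TAC)")

def count_sites(seq):
    """Number of (possibly overlapping) GTA-N8-TAC matches in seq."""
    return sum(1 for _ in CORE.finditer(seq.upper()))

def simulate(seq, trials=1000, seed=0):
    """Fraction of shuffled copies of seq that contain at least one site."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    bases = list(seq)
    with_site = 0
    for _ in range(trials):
        rng.shuffle(bases)
        if count_sites("".join(bases)) > 0:
            with_site += 1
    return with_site / trials

# Two consecutive 50-nt lines from the slide, containing the candidate site.
observed = ("gaaaaaatctgtaacatgagatacacaatagcatttatatttgctttagt"
            "atctctctcttgggtgggattctgcctgcaatttaaaaaccagtgttaac")

print("sites in observed sequence:", count_sites(observed))
print("fraction of shuffles with a site:", simulate(observed))
```

A large fraction would suggest the site could easily be an imposter; a small one makes it worth a closer look, though it still proves nothing about binding.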
End Scenario 2 Story
Molecular Biology: Regulatory protein and binding sites
Bioinformatics: Simulations; Nature of randomness
Programming: Loops and arrays
do {
    foreach my $problem (@problems) {   # @problems: hypothetical list, not given on the slide
        if ($problem =~ /PS1(M-1|M-5|M-9|P-3|P-5)/) {
            Do_problem_now($problem);
        }
    }
} while ($time_permits);