
1 Challenges for computer science as a part of Systems Biology
Benno Schwikowski, Institute for Systems Biology, Seattle, WA

2 Math and Computer Science Challenges (Benno Schwikowski)
Towards integrative models
Dimensions: Species, Conditions/time, Genes
DNA
- Sequence
- Genomic locus
- Domain content
- Intron/exon structure
- Regulatory motifs
- Chemical modifications
- SNPs
- Splice variants
- Accessibility
- Variation
mRNA
- Abundance
- Regulatory information
- Initiation/termination signals
Protein
- Abundance
- State
- Localization
- 3D structure
- Functional characterization
- Half-life
- Active sites
- Biochemical function
- Cellular role
Protein interaction
- Interaction partner
- Direct/indirect
- Affinity
- Effect

3 Challenge: Integrative models
…Across genes and proteins: Many genes involved (e.g., multifactorial diseases)
…Across model systems: Lack of experimental platforms in target system
…Across levels of biological organization (e.g., gene regulatory processes involving phosphorylation)
…Across experiments: Robustness against errors in mass spectrometry, mRNA measurements
…Across timescales

4 Challenge: Capturing evolutionary constraints
Levels of biological organization: DNA, RNA, Proteins, Modules, Organelles, Cells, Organs, Individuals, Populations, Ecologies
"Nothing in biology makes sense except in the light of evolution." Theodosius Dobzhansky

5 Challenge: Which tools and experiments to use

6 Challenge: Choosing experiments
Machine Learning: Determine the most likely classification/parameterization on the basis of a randomly sampled dataset.
Active Learning: Allow an algorithm to query selected data points, using the results of previous queries.
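
The contrast between the two settings can be illustrated with a toy problem. The sketch below is not from the talk: it assumes a hypothetical one-dimensional "experiment" (`oracle`) that reports whether a value lies at or above a hidden threshold, and compares a learner that labels randomly sampled points with one that picks each query from the results of the previous ones (here, simple bisection). All names and the threshold value are illustrative assumptions.

```python
import random

def oracle(x, threshold=0.62):
    """Hypothetical experiment: True if x is at or above the hidden threshold."""
    return x >= threshold

def passive_estimate(n_queries, rng):
    """Passive learning: label randomly sampled points, then fit the threshold."""
    xs = [rng.random() for _ in range(n_queries)]
    lo = max((x for x in xs if not oracle(x)), default=0.0)  # largest negative
    hi = min((x for x in xs if oracle(x)), default=1.0)      # smallest positive
    return (lo + hi) / 2

def active_estimate(n_queries):
    """Active learning: each query point depends on the earlier answers."""
    lo, hi = 0.0, 1.0
    for _ in range(n_queries):
        mid = (lo + hi) / 2
        if oracle(mid):
            hi = mid   # threshold is at or below mid
        else:
            lo = mid   # threshold is above mid
    return (lo + hi) / 2

if __name__ == "__main__":
    rng = random.Random(0)
    print("passive, 10 queries:", passive_estimate(10, rng))
    print("active,  10 queries:", active_estimate(10))
```

In this toy setting the active learner halves the uncertainty interval with every query, so ten queries localize the threshold to within about 0.001, whereas the accuracy of the passive learner depends on where the random samples happen to fall.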

7 Challenge: Relations between system variables can be quite complex
Yuh, Bolouri, Davidson, Science, 1998

8 Challenge: Relations between system variables can be quite complex
Yuh, Bolouri, Davidson, Science, 1998

9 Challenge: Develop models that allow extremely efficient algorithms
AGTCGTACGTGAC...
AGTAGACGTGCCG...
ACGTGAGATACGT...
GAACGGAGTACGT...
TCGTGACGGTGAT...

10 CLUSTALW (1.74) multiple sequence alignment

Cotton    ACGGTT-TCCATTGGATGA---AATGAGATAAGAT---CACTGTGC---TTCTTCCACGTG--GCAGGTTGCCAAAGATA-------AGGCTTTACCATT
Pea       GTTTTT-TCAGTTAGCTTA---GTGGGCATCTTA----CACGTGGC---ATTATTATCCTA--TT-GGTGGCTAATGATA-------AGG--TTAGCACA
Tobacco   TAGGAT-GAGATAAGATTA---CTGAGGTGCTTTA---CACGTGGC---ACCTCCATTGTG--GT-GACTTAAATGAAGA-------ATGGCTTAGCACC
Ice-plant TCCCAT-ACATTGACATAT---ATGGCCCGCCTGCGGCAACAAAAA---AACTAAAGGATA--GCTAGTTGCTACTACAATTC--CCATAACTCACCACC
Turnip    ATTCAT-ATAAATAGAAGG---TCCGCGAACATTG--AAATGTAGATCATGCGTCAGAATT--GTCCTCTCTTAATAGGA-------A-------GGAGC
Wheat     TATGAT-AAAATGAAATAT---TTTGCCCAGCCA-----ACTCAGTCGCATCCTCGGACAA--TTTGTTATCAAGGAACTCAC--CCAAAAACAAGCAAA
Duckweed  TCGGAT-GGGGGGGCATGAACACTTGCAATCATT-----TCATGACTCATTTCTGAACATGT-GCCCTTGGCAACGTGTAGACTGCCAACATTAATTAAA
Larch     TAACAT-ATGATATAACAC---CGGGCACACATTCCTAAACAAAGAGTGATTTCAAATATATCGTTAATTACGACTAACAAAA--TGAAAGTACAAGACC

Cotton    CAAGAAAAGTTTCCACCCTC------TTTGTGGTCATAATG-GTT-GTAATGTC-ATCTGATTT----AGGATCCAACGTCACCCTTTCTCCCA-----A
Pea       C---AAAACTTTTCAATCT-------TGTGTGGTTAATATG-ACT-GCAAAGTTTATCATTTTC----ACAATCCAACAA-ACTGGTTCT---------A
Tobacco   AAAAATAATTTTCCAACCTTT---CATGTGTGGATATTAAG-ATTTGTATAATGTATCAAGAACC-ACATAATCCAATGGTTAGCTTTATTCCAAGATGA
Ice-plant ATCACACATTCTTCCATTTCATCCCCTTTTTCTTGGATGAG-ATAAGATATGGGTTCCTGCCAC----GTGGCACCATACCATGGTTTGTTA-ACGATAA
Turnip    CAAAAGCATTGGCTCAAGTTG-----AGACGAGTAACCATACACATTCATACGTTTTCTTACAAG-ATAAGATAAGATAATGTTATTTCT---------A
Wheat     GCTAGAAAAAGGTTGTGTGGCAGCCACCTAATGACATGAAGGACT-GAAATTTCCAGCACACACA-A-TGTATCCGACGGCAATGCTTCTTC--------
Duckweed  ATATAATATTAGAAAAAAATC-----TCCCATAGTATTTAGTATTTACCAAAAGTCACACGACCA-CTAGACTCCAATTTACCCAAATCACTAACCAATT
Larch     TTCTCGTATAAGGCCACCA-------TTGGTAGACACGTAGTATGCTAAATATGCACCACACACA-CTATCAGATATGGTAGTGGGATCTG--ACGGTCA

Cotton    ACCAATCTCT---AAATGTT----GTGAGCT---TAG-GCCAAATTT-TATGACTATA--TAT----AGGGGATTGCACC----AAGGCAGTG-ACACTA
Pea       GGCAGTGGCC---AACTAC--------------------CACAATTT-TAAGACCATAA-TAT----TGGAAATAGAA------AAATCAAT--ACATTA
Tobacco   GGGGGTTGTT---GATTTTT----GTCCGTTAGATAT-GCGAAATATGTAAAACCTTAT-CAT----TATATATAGAG------TGGTGGGCA-ACGATG
Ice-plant GGCTCTTAATCAAAAGTTTTAGGTGTGAATTTAGTTT-GATGAGTTTTAAGGTCCTTAT-TATA---TATAGGAAGGGGG----TGCTATGGA-GCAAGG
Turnip    CACCTTTCTTTAATCCTGTGGCAGTTAACGACGATATCATGAAATCTTGATCCTTCGAT-CATTAGGGCTTCATACCTCT----TGCGCTTCTCACTATA
Wheat     CACTGATCCGGAGAAGATAAGGAAACGAGGCAACCAGCGAACGTGAGCCATCCCAACCA-CATCTGTACCAAAGAAACGG----GGCTATATATACCGTG
Duckweed  TTAGGTTGAATGGAAAATAG---AACGCAATAATGTCCGACATATTTCCTATATTTCCG-TTTTTCGAGAGAAGGCCTGTGTACCGATAAGGATGTAATC
Larch     CGCTTCTCCTCTGGAGTTATCCGATTGTAATCCTTGCAGTCCAATTTCTCTGGTCTGGC-CCA----ACCTTAGAGATTG----GGGCTTATA-TCTATA

Cotton    T-TAAGGGATCAGTGAGAC-TCTTTTGTATAACTGTAGCAT--ATAGTAC
Pea       TATAAAGCAAGTTTTAGTA-CAAGCTTTGCAATTCAACCAC--A-AGAAC
Tobacco   CATAGACCATCTTGGAAGT-TTAAAGGGAAAAAAGGAAAAG--GGAGAAA
Ice-plant TCCTCATCAAAAGGGAAGTGTTTTTTCTCTAACTATATTACTAAGAGTAC
Larch     TCTTCTTCACAC---AATCCATTTGTGTAGAGCCGCTGGAAGGTAAATCA
Turnip    TATAGATAACCA---AAGCAATAGACAGACAAGTAAGTTAAG-AGAAAAG
Wheat     GTGACCCGGCAATGGGGTCCTCAACTGTAGCCGGCATCCTCCTCTCCTCC
Duckweed  CATGGGGCGACG---CAGTGTGTGGAGGAGCAGGCTCAGTCTCCTTCTCG

11 Challenge: Developing models that allow extremely efficient algorithms
AGTCGTACGTGAC...
AGTAGACGTGCCG...
ACGTGAGATACGT...
GAACGGAGTACGT...
TCGTGACGGTGAT...
ACGG, ACGT
Parsimony score: 1
J. Comp Biol. 2002

12 An Exact Algorithm (generalizing Sankoff and Rousseau 1975)
$W_u[s]$ = best parsimony score for the subtree rooted at node $u$, if $u$ is labeled with string $s$.
$$W_u[s] = \sum_{v:\ \text{child of } u} \ \min_{t} \left( W_v[t] + d(s, t) \right)$$
Each table has $4^k$ entries.
Sequences at the leaves:
AGTCGTACGTG
ACGGGACGTGC
ACGTGAGATAC
GAACGGAGTAC
TCGTGACGGTG
Example table entries shown at the tree nodes (figure):
… ACGG: 2, ACGT: 1 …
… ACGG: 0, ACGT: 2 …
… ACGG: 1, ACGT: 1 …
… ACGG: +∞, ACGT: 0 …
… ACGG: 1, ACGT: 0 …
… ACGG: 0, ACGT: +∞ …
… ACGG: ∞, ACGT: 0 …
J. Comp Biol. 2002
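
The recurrence can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the published implementation: the tree is a nested tuple whose leaves are sequence strings, $d$ is taken to be Hamming distance, a leaf table holds 0 for every length-$k$ substring of its sequence and infinity otherwise, and all $4^k$ strings are enumerated naively. The function names and the example tree are hypothetical.

```python
from itertools import product
from math import inf

ALPHABET = "ACGT"

def hamming(s, t):
    """Number of positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(s, t))

def table(node, k):
    """Return W_u as a dict mapping each of the 4^k strings s to W_u[s]."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    if isinstance(node, str):
        # Leaf: score 0 for strings occurring in the sequence, infinity otherwise.
        present = {node[i:i + k] for i in range(len(node) - k + 1)}
        return {s: (0 if s in present else inf) for s in kmers}
    # Internal node: sum over children v of min over t of W_v[t] + d(s, t).
    child_tables = [table(child, k) for child in node]
    return {
        s: sum(min(w[t] + hamming(s, t) for t in kmers) for w in child_tables)
        for s in kmers
    }

def best_parsimony(tree, k):
    """Best root score and the root labels that achieve it."""
    w = table(tree, k)
    best = min(w.values())
    return best, sorted(s for s, score in w.items() if score == best)

if __name__ == "__main__":
    # Hypothetical three-leaf tree, chosen only to keep the example small.
    tree = (("ACGT", "ACGG"), "TACGT")
    print(best_parsimony(tree, 4))
```

Each internal table costs on the order of $4^{2k}$ distance evaluations per child in this naive form, which is why the approach is practical only for short motifs.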

13 What are good challenges to tackle?
- Biological/medical questions asked
- Experimental technologies to acquire a lot of relevant data
- Available datasets with a formalized notion of “data quality”

14 Memory complexity: $O(k \cdot 4^{2k})$ per node
Time complexity: total time $O(n k (4^{2k} + l))$
where $n$ = number of species, $l$ = average sequence length, $k$ = motif length
J. Comp Biol. 2002

15 Technology-based challenges: Universal DNA Tag Systems
Existing applications in high-throughput technologies
- Universal DNA arrays
- Padlock probes
- LYNX mRNA technology

16 Formalization
Define: weight(A/T) = 1, weight(C/G) = 2
weight(AACTTG) = 1+1+2+1+1+2 = 8
melting temperature(AACTTG) = 2·weight
l-u code problem: Given two integers, l < u, find the largest set of tags such that
- Each tag has weight ≥ u
- Each string of weight ≥ l occurs at most once
J. Comp Biol. 2000 & 2003
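
The definitions above can be checked mechanically. The sketch below is an illustration only and makes one interpretive assumption: "occurs at most once" is read as "appears at most once as a substring across all tags". It verifies whether a given tag set satisfies the two conditions; it does not search for the largest such set. The function names and example tags are hypothetical.

```python
from collections import Counter

# Weights from the slide: A/T count 1, C/G count 2.
WEIGHT = {"A": 1, "T": 1, "C": 2, "G": 2}

def weight(s):
    """Sum of base weights; the slide relates melting temperature to 2 * weight."""
    return sum(WEIGHT[base] for base in s)

def is_lu_code(tags, l, u):
    """Check both conditions of the l-u code problem for a candidate tag set."""
    # Condition 1: every tag has weight at least u.
    if any(weight(tag) < u for tag in tags):
        return False
    # Condition 2: every substring of weight at least l occurs at most once
    # over all tags (assumed reading of "occurs at most once").
    counts = Counter(
        tag[i:j]
        for tag in tags
        for i in range(len(tag))
        for j in range(i + 1, len(tag) + 1)
        if weight(tag[i:j]) >= l
    )
    return all(c == 1 for c in counts.values())

if __name__ == "__main__":
    print(weight("AACTTG"))
    print(is_lu_code(["AACTTG", "CCGATA"], l=4, u=8))
    print(is_lu_code(["AACTTG", "GGAACT"], l=4, u=8))  # both contain AAC
```

The second tag set fails because the substring AAC, which has weight 4, appears in both tags.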

17 Challenge: Visualization
Andrea Weston et al. @ ISB & Cytoscape

18 Challenge: Visualization
Cytoscape, pre-release 2.0

19 A computer scientist’s perspective
“Biology is so digital, and incredibly complicated […] I can't be as confident about computer science as I can about biology. Biology easily has 500 years of exciting problems to work on, it's at that level.”
Donald Knuth, 7 Dec 1993

