Modeling molecular evolution Jodi Schwarz and Marc Smith Vassar College Biol/CS353 Bioinformatics.

1 Modeling molecular evolution Jodi Schwarz and Marc Smith Vassar College Biol/CS353 Bioinformatics

2 Team-taught Biol and CompSci course
7 students:
– CS experience: 3 yes, 4 no
– Bio experience: 5 yes, 2 no
Project-based course; no exams
Worked in Biol/CS pairs on projects
I3U near end of course; last project before independent research projects

3 Common approach for all projects
Biological question
Algorithm design
– Step-by-step approach to complete a task or solve the problem
Implementation
– The actual programming “script” that will carry out the steps of the algorithm
Evaluation of implementation and algorithm
Revision or augmentation

4 I3U: added an experimental component to our basic approach
Previous projects focused on pattern finding, mining whole genome data
Goal of I3U:
– Model a biological/evolutionary process
– Test the model with empirical data
– Perform computational experiments

5 Model molecular evolution
Step 1: Model the effect of random vs targeted nucleotide substitutions on a protein sequence
– What do we mean by random?
– Determine the similarity of the original protein sequence to the “evolved” sequence
Step 2: Assess the real nt diversity at positions 1, 2, 3 of codons in real homologs (HSP70)
– Construct alignment of homologs and determine nt diversity at each position
Evaluate the models using the empirical data
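Step 1 can be sketched roughly as follows. This is an illustrative Python sketch, not the course's Perl starter code, which the slides do not show. All function names are hypothetical. It assumes that "random" means sites are drawn uniformly with replacement, that "targeted" means substitutions are restricted to third codon positions, and that similarity is the fraction of identical amino acids between the "ancestral" and "evolved" proteins.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def substitute(dna, n_subs, rng, positions=None):
    """Apply n_subs point substitutions (assumption: sites drawn with replacement).

    positions restricts the candidate sites; None means any site may be hit.
    """
    seq = list(dna)
    sites = list(range(len(seq))) if positions is None else positions
    for _ in range(n_subs):
        i = rng.choice(sites)
        # Always change the base, so every substitution is a real change.
        seq[i] = rng.choice([b for b in BASES if b != seq[i]])
    return "".join(seq)


def identity(a, b):
    """Fraction of positions at which two equal-length proteins agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def run_model(dna, n_subs, runs=100, targeted=False, seed=0):
    """Mean protein identity between ancestral and evolved sequences over many runs."""
    rng = random.Random(seed)
    third_positions = list(range(2, len(dna), 3)) if targeted else None
    ancestral = translate(dna)
    scores = [
        identity(ancestral, translate(substitute(dna, n_subs, rng, third_positions)))
        for _ in range(runs)
    ]
    return sum(scores) / runs
```

Calling `run_model(seq, n_subs)` and `run_model(seq, n_subs, targeted=True)` on the same sequence gives one mean similarity per model, which is the comparison the slide describes. The choice to sample sites with replacement is only one possible answer to "What do we mean by random?".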

6 Learning goals
CS students: To apply their knowledge of data structures and algorithms to a biological domain
Biology students: To apply their knowledge of the biology to design algorithms
For the collaboration:
– To become familiar with modeling a biological process: a simple model must be constructed and tested first
– To test the model using empirical data

7 Assessment
Assignments
– Alignment assignment
– 2 Perl scripts
  Model random vs targeted substitution pattern
  Determine the codon nt diversity in HSP70 genes
– Output from the 2 Perl scripts
  Raw output
  Graphs summarizing data
Observation
– Collaboration
– Critical thinking

8 Example student results
Effect of random vs targeted substitutions on a protein sequence (compared the “ancestral” sequence to the “evolved” sequence), 100 runs
[Figure panels: random substitutions; substitutions targeted to 3rd codon position]

9 Example student results of empirical data
Average diversity by nucleotide position within codons:
Codon position 1: 1.50
Codon position 2: 1.29
Codon position 3: 2.32
Most variation occurs in position 3
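The slide does not define "diversity". One reading consistent with values between 1 and 4 is the number of distinct nucleotides observed in each alignment column, averaged separately over codon positions 1, 2 and 3. The Python sketch below follows that assumed definition. The function name is hypothetical, and the alignment is assumed to start at codon position 1 and stay in frame.

```python
def codon_position_diversity(alignment):
    """Average number of distinct nucleotides per column, grouped by codon position.

    alignment: list of equal-length aligned nucleotide strings, assumed in frame.
    Gap ('-') and ambiguous ('N') characters are ignored when counting.
    Returns [mean for position 1, mean for position 2, mean for position 3].
    """
    totals = [0, 0, 0]
    counts = [0, 0, 0]
    for col in range(len(alignment[0])):
        observed = {seq[col].upper() for seq in alignment} - {"-", "N"}
        if not observed:
            continue  # column contains only gaps or ambiguous bases
        totals[col % 3] += len(observed)
        counts[col % 3] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]
```

For the toy alignment `["ATGGCT", "ATGGCC", "ATGGCA"]` the only varying column is a third codon position, so the third value is the largest, mirroring the pattern reported on the slide.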

10 Collaboration across disciplines
How we tried to teach collaboration:
– We defined the meaning of collaboration
  CS students do not need to become biologists and vice versa
  Each person contributes a different set of expertise
  Learning how to speak each other’s language
  Communication
– We modeled it
  Overt reliance on each other’s expertise
  Spontaneous discussions
– Giving students lots of experience collaborating: several shifts in pairs over the semester

11 Assessment of collaboration
Attitude: reluctant vs eager
At beginning (self) vs. during project (experience)
Gradational Assessment of Collaboration
Score  Self       Experience
0      reluctant  avoided
1      eager      problems
2      reluctant  positive
3      eager      positive
Student  Score
A        0
B        1
C        2
D        3
E        3
F        3
G        3
Team score  Team
2           A+C
4           B+F
6           E+G
3           D worked alone

12 Likert Scale (1-5)
1 how a genomics approach crosses levels of biological organization
2 how genomic-level science is conducted
3 how computational approaches are deployed to answer genomic questions
4 how to find potential functional/evolutionary patterns in DNA/protein sequence
5 independently use bioinformatic tools to address biological/genomic questions
6 examine the output of a bioinformatic analysis and relate it to a biological question
7 provide one or more clear examples of how genomics uses an interdisciplinary approach
Most improvement: questions that are explicitly bioinformatic
Least: questions that are more broadly about genomics (CS)

13 What worked well
Overall approach was great: question, algorithm, implementation, analysis, iteration
Use of starter code allowed students to
– Undertake much more sophisticated projects
– See examples of more advanced algorithm/code
Encountering unanticipated results and problems
– Gaps in alignments not in groups of 3
– Spontaneous discussions leading to AHA moments
Students enjoyed the modeling process
– One student’s final project focused on modeling molecular evolution
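The "gaps not in groups of 3" problem matters because a gap run whose length is not a multiple of 3 shifts the reading frame, so later columns no longer map cleanly to codon positions 1, 2 and 3. A minimal Python sketch of how such gaps might be detected is shown below. The function name is hypothetical and `-` is assumed to be the gap character.

```python
import re


def frame_breaking_gaps(aligned_seq):
    """Return (start, length) for each gap run whose length is not a multiple of 3."""
    return [
        (m.start(), len(m.group()))
        for m in re.finditer(r"-+", aligned_seq)
        if len(m.group()) % 3 != 0
    ]
```

Running this on each row of an alignment before computing per-position diversity would flag the sequences that need manual inspection or realignment.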

14 What didn’t work as well
Some collaborations are not successful
Ran out of time: insufficient analysis and reflection
For the I3U: Assessment strategy not well developed
– Can we retroactively extract more informative assessment?

15

16 Assessing biology knowledge
Algorithm development
– Ability to help partner understand the difference between mutation and selection
– Ability to recognize assumptions of model
– Ability to use the empirical data to evaluate model

17 Assessing the CS
Variables
– Abstraction: representing information as data
– Types of data: predefined, atomic, aggregate
– Scope: declaration, initialization, mutation
Algorithms
– Control flow: unconditional, conditional, repetition
– Input/Output and regex (pattern matching)
– Top-down design: subroutines
– To reuse or not to reuse (code)?
Incremental development / experimentation
Elegance: readability and maintainability

18 Biological question
– What pattern of nucleotide substitution occurs in protein-coding genes?
Algorithm
– What do we know about mutation, nt/AA sequences?
– Assumptions
Implementation
– Instructors provided “starter code”
– Students read and ran the code to see what it did
– Pairs discussed how to add to and refine it, and did so
Evaluation
– Analyze the CS: Did it run and did it do the job we asked?
– Analyze the biology: Did it accurately represent the biological process?
Testing the models against empirical evidence
– Aligned HSP70 genes and evaluated the pattern of substitution
Which model most closely matched the biology?

