CS273 Algorithms for Structure and Motion in Biology Instructors: Serafim Batzoglou and Jean-Claude Latombe Teaching Assistant: Sam Gross | serafim | latombe.


1 CS273 Algorithms for Structure and Motion in Biology. Instructors: Serafim Batzoglou and Jean-Claude Latombe. Teaching Assistant: Sam Gross. Emails: | serafim | latombe | ssgross | @ cs.stanford.edu. Spring 2006 – http://www.stanford.edu/class/cs273/

2 Need a Scribe!!

3 Range of Bio-CS Interaction: enormous range over space and time
- Biological scales: Gene, Molecules, Cells, Tissue/Organs, Body system
- Computational topics: Sequence alignment; Molecular structures, similarities and motions; Simulation of cell interaction; Soft-tissue simulation and surgical training; Robotic surgery
- Figure label: CS273

4 Focus on Proteins
- Proteins are the workhorses of all living organisms
- They perform many vital functions, e.g.:
  - Catalysis of reactions
  - Transport of molecules
  - Building blocks of muscles
  - Storage of energy
  - Transmission of signals
  - Defense against intruders

5 Proteins are also of great interest from a computational viewpoint
- They are large molecules (a few hundred to several thousand atoms)
- They are made of building blocks (amino acids) drawn from a small “library” of 20 amino acids
- They have an unusual kinematic structure: a long serial linkage (backbone) with short side-chains

6 Proteins are associated with many challenging problems
- Predict folded structures and motion pathways
- Understand why some proteins misfold or partially fold, causing diseases such as cystic fibrosis, Parkinson's, and Creutzfeldt-Jakob (mad cow)
- Find structural similarities among proteins and classify proteins
- Find functional structural motifs in proteins
- Predict how proteins bind to other proteins and to smaller molecules
- Design new drugs
- Engineer and design proteins and protein-like structures (polymers)

7 Central Dogma of Molecular Biology

8 transcription translation

9 Protein Sequence
- Long sequence of amino acids (dozens to thousands), also called residues
- Dictionary of 20 amino acids (several billion years old)
[Figure: backbone chain of N and O atoms, with residue i-1 labeled]

10 Protein Sequence: Peptide bond (partial double bond character)
[Figure: backbone chain of N and O atoms with the peptide bond marked]

11 Central Dogma of Molecular Biology Physiological conditions: aqueous solution, 37°C, pH 7, atmospheric pressure

12 Levels of Protein Structures. Quaternary: hemoglobin (4 polypeptide chains)

13 Mostly α-helices; Mostly β-sheets; Mixed

14 Folding: Unfolded (denatured) state → Intermediate states → Folded (native) state. Many pathways

15-19 http://www-shakh.harvard.edu/ProFold2.html How (we think) a protein folds... ΔG = ΔH - TΔS
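
The relation on the folding slides, ΔG = ΔH - TΔS, can be made concrete with a short calculation. The sketch below is only illustrative: the enthalpy and entropy values are hypothetical round numbers chosen for the example, not measurements of any real protein, and the function name `gibbs_free_energy` is an assumption. It shows that folding can be favorable (ΔG < 0) at body temperature and unfavorable at a higher temperature.

```python
def gibbs_free_energy(delta_h, temperature, delta_s):
    """Return dG = dH - T * dS.

    Units assumed here: dH in kJ/mol, T in kelvin, dS in kJ/(mol*K).
    """
    return delta_h - temperature * delta_s


# Hypothetical values for the unfolded -> folded transition (illustration only).
delta_h = -200.0   # kJ/mol: folding releases heat (favorable enthalpy)
delta_s = -0.6     # kJ/(mol*K): the chain loses conformational entropy
body_temp = 310.15  # K, i.e. 37 degrees C as on the physiological-conditions slide

dg_body = gibbs_free_energy(delta_h, body_temp, delta_s)
dg_hot = gibbs_free_energy(delta_h, 350.0, delta_s)

# dG changes sign at T = dH / dS for these assumed values.
melting_temp = delta_h / delta_s

print(f"dG at 310.15 K: {dg_body:.2f} kJ/mol")
print(f"dG at 350.00 K: {dg_hot:.2f} kJ/mol")
print(f"dG = 0 at about {melting_temp:.1f} K")
```

With these assumed numbers the enthalpy and entropy terms nearly cancel, which is one way to see why the folded state is only marginally stable under the model used here.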

20 Motion of Proteins in Folded State HIV-1 protease

21 Structural variability of the overall ensemble of native ubiquitin structures [Shehu, Kavraki, Clementi, 2005]

22 Amylosucrase Flexible Loop Loop 7

23 Central Dogma of Molecular Biology

24 Binding Inhibitor binding to HIV protease Protein-protein binding Ligand-protein binding

25 Binding of Pyruvate to LDH (reduction of pyruvate to lactate)
[Figure: pyruvate in the lactate dehydrogenase environment, with NADH (nicotinamide adenine dinucleotide, coenzyme), the loop, and residues GLN-101, ARG-106, ASP-166, ARG-169, HIS-193, ASP-195, THR-245 labeled]

26 What is CS273 about?
- Algorithms and computational schemes for molecular biology problems
- Molecular biology seen by computer scientists

27 The Shock of Two Cultures
- y = f(x)
- Biologists like experiments, specifics, and classifications. They would rather know many (x_i, y_i), i.e., facts, and classify them, than know f
- Computer scientists like simulation, abstractions, and general algorithms. They want to know f, the explanation of the facts, and efficient ways to compute it, but rarely care about any particular (x_i, y_i)
- One challenge of Computational Biology is to fuse these two cultures

28 Two Views of a BioComputation Class
- Where IT resources for biology are available and how to use them
- How to design efficient data structures and algorithms for biology

29 Main Ideas Behind CS273
1. The information is in the sequence
   - Sequence → Structure (shape) → Function
   - Sequence similarity → Structural/functional similarity
   - Sequences are related by evolution

30 Main Ideas Behind CS273
1. The information is in the sequence
   - Sequence → Structure (shape) → Function
   - Sequence similarity → Structural/functional similarity
   - Sequences are related by evolution
2. Biomolecules move and bind to achieve their functions
   - Deformation → folded structures of proteins
   - Motion + deformation → multi-molecule complexes
   - One cannot just “jump” from sequence to function
[Figures: Protein folding; Ligand-protein binding]

31 Sequence → Structure → Function; sequence similarity, structure similarity

32 Main Ideas Behind CS273
1. The information is in the sequence
   - Sequence → Structure (shape) → Function
   - Sequence similarity → Structural/functional similarity
   - Sequences are related by evolution
2. Biomolecules move and bind to achieve their functions
   - Deformation → folded structures of proteins
   - Motion + deformation → multi-molecule complexes
   - One cannot just “jump” from sequence to function
CS273 is about algorithms for sequence, structure and motion
   - Finding sequence and shape similarities
   - Relating structure to function
   - Extracting structure from experimental data
   - Computing and analyzing motion pathways
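
To give a flavor of "finding sequence similarities", here is a minimal sketch of global alignment scoring by dynamic programming (the Needleman-Wunsch recurrence). The slides do not name a specific algorithm at this point, so the choice of technique, the function name `global_alignment_score`, and the scoring values (match +1, mismatch -1, gap -2) are all assumptions made for illustration; real protein comparison typically uses substitution matrices rather than a flat match/mismatch score.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of strings a and b.

    Uses the Needleman-Wunsch recurrence with a linear gap penalty.
    Only two rows of the table are kept, so memory is O(len(b)).
    """
    # prev[j] = best score aligning the prefix of a processed so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against the empty prefix of b
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap.
            up = prev[j] + gap
            # Align b[j-1] with a gap.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Short made-up peptide strings in one-letter amino-acid code.
print(global_alignment_score("ACDE", "ACDE"))
print(global_alignment_score("ACDE", "ACE"))
```

The running time is proportional to the product of the two sequence lengths, which is acceptable for pairs of proteins but motivates faster heuristics when searching large databases.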

33 Vision Underlying CS273
- Goal of computational biology: low-cost, high-bandwidth in-silico biology
- Requirements: reliable models, efficient algorithms
- Algorithmic efficiency by exploiting properties of molecules and processes:
  - Proteins are long kinematic chains
  - Atoms cannot bunch up together
  - Forces have relatively short ranges
- Computational Biology is more than applying computers to biological problems or mimicking nature (e.g., performing MD simulation)
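
One way to see how "atoms cannot bunch up" and "forces have relatively short ranges" translate into algorithmic efficiency is a uniform-grid neighbor search. The sketch below is an illustration under stated assumptions, not the method used in the course: the function name `close_pairs`, the coordinates, and the cutoff are hypothetical. Atoms are binned into cubic cells whose side equals the cutoff, so each atom only needs to be compared with atoms in its own cell and the 26 adjacent cells instead of with every other atom.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def close_pairs(atoms, cutoff):
    """Return sorted index pairs (i, j), i < j, of atoms closer than cutoff.

    atoms is a list of (x, y, z) tuples; cutoff is a positive distance.
    """
    # Bin each atom into a cubic cell whose side length equals the cutoff.
    cells = defaultdict(list)
    for idx, (x, y, z) in enumerate(atoms):
        key = (floor(x / cutoff), floor(y / cutoff), floor(z / cutoff))
        cells[key].append(idx)

    pairs = []
    for (cx, cy, cz), members in cells.items():
        # Any partner within the cutoff lies in this cell or an adjacent one.
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            neighbors = cells.get((cx + dx, cy + dy, cz + dz), ())
            for i in members:
                for j in neighbors:
                    # i < j reports each pair once, from the cell that holds i.
                    if i < j and dist(atoms[i], atoms[j]) < cutoff:
                        pairs.append((i, j))
    return sorted(pairs)


# Hypothetical coordinates (arbitrary units), for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (0.5, 0.5, 0.0)]
print(close_pairs(atoms, cutoff=1.5))
```

Because atoms have a minimum separation, the number of atoms per cell is bounded, so the total work grows roughly linearly with the number of atoms rather than quadratically as in an all-pairs check.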

34 Tentative Schedule
 1  April 5   Introduction
 2  April 10  Protein geometric and kinematic models
 3  April 12  Conformational space
 4  April 17  Inverse kinematics and applications
 5  April 19  Sequence similarity
 6  April 24  Sequence similarity
 7  April 26  Sequence similarity
 8  May 1     Structure comparison
 9  May 3     Structure comparison
10  May 8     Protein phylogeny, clustering, and classification
11  May 10    Protein phylogeny, clustering, and classification
12  May 15    Energy maintenance
13  May 17    Energy maintenance
14  May 22    Structure prediction
15  May 24    Roadmap methods
16  May 31    Structure prediction
17  June 5    Structure prediction
18  June 7    TBA
19  June 12   Project presentations (2 hours)

35 Instructors and TAs
- Instructors: Serafim Batzoglou, Jean-Claude Latombe
- TA: Sam Gross
- Emails: | serafim | latombe | ssgross | @ cs.stanford.edu
- Class website: http://cs273.stanford.edu

36 Expected Work
- Regular attendance at lectures and active participation
- Class scribing (assignments will depend on the number of students)
- Exciting programming project: http://www.stanford.edu/class/cs273/project/project.html
  - Structure prediction
  - Clustering and distance metrics
  - Protein design
  - Something else

37 Questions?

